aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access
Commit message (Collapse)AuthorAge
* Fix LOCK TABLE to eliminate the race condition that could make it give weirdTom Lane2009-05-12
| | | | | | | | | errors when tables are concurrently dropped. To do this we must take lock on each relation before we check its privileges. The old code was trying to do that the other way around, which is a bit pointless when there are lots of other commands that lock relations before checking privileges. I did keep it checking each relation's privilege before locking the next relation, which is a detail that ALTER TABLE isn't too picky about.
* Request XLOG switch before writing checkpoint in pg_start_backup(). OtherwiseHeikki Linnakangas2009-05-07
| | | | | | | | | | | | | | | | | | you can end up with an unrecoverable backup if you start a new base backup right after finishing archive recovery. In that scenario, the redo pointer of the checkpoint that pg_start_backup() writes points to the XLOG segment where the timeline-changing end-of-archive-recovery checkpoint is. The beginning of that segment contains pages with the old timeline ID, and we don't accept that in recovery unless we find a history file covering the old timeline ID. If you omit pg_xlog from the base backup and clear the archive directory before starting the backup, there will be no such history file available. The bug is present in all versions since PITR was introduced in 8.0, but I'm back-patching only back to 8.2. Earlier versions didn't have XLOG switch records, making this fix unfeasible. Given the lack of reports until now, it doesn't seem worthwhile to spend more effort to fix 8.0 and 8.1. Per report and suggestion by Mikael Krantz
* Insert CHECK_FOR_INTERRUPTS() calls into btree and hash index scans at theTom Lane2009-05-05
| | | | | | | | | | | | | | | | | | | points where we step right or left to the next page. This should ensure reasonable response time to a query cancel request during an unsuccessful index scan, as seen in recent gripe from Marc Cousin. It's a bit trickier than it might seem at first glance, because CHECK_FOR_INTERRUPTS() is a no-op if executed while holding a buffer lock. So we have to do it just at the point where we've dropped one page lock and not yet acquired the next. Remove CHECK_FOR_INTERRUPTS calls at the top level of btgetbitmap and hashgetbitmap, since they're pointless given the added checks. I think that GIST is okay already --- at least, there's a CHECK_FOR_INTERRUPTS at a plausible-looking place in gistnext(). I don't claim to know GIN well enough to try to poke it for this, if indeed it has a problem at all. This is a pre-existing issue, but in view of the lack of prior complaints I'm not going to risk back-patching.
* Update comment for _bt_relandgetbuf.Tom Lane2009-05-05
|
* Change the default value of max_prepared_transactions to zero, and addTom Lane2009-04-23
| | | | | | | | | | | | | | | | | | | | documentation warnings against setting it nonzero unless active use of prepared transactions is intended and a suitable transaction manager has been installed. This should help to prevent the type of scenario we've seen several times now where a prepared transaction is forgotten and eventually causes severe maintenance problems (or even anti-wraparound shutdown). The only real reason we had the default be nonzero in the first place was to support regression testing of the feature. To still be able to do that, tweak pg_regress to force a nonzero value during "make check". Since we cannot force a nonzero value in "make installcheck", add a variant regression test "expected" file that shows the results that will be obtained when max_prepared_transactions is zero. Also, extend the HINT messages for transaction wraparound warnings to mention the possibility that old prepared transactions are causing the problem. All per today's discussion.
* After archive recovery, mark the last WAL segment from the parent timelineHeikki Linnakangas2009-04-22
| | | | | | | ready for archival. It was marked at the next checkpoint anyway, but waiting for the next checkpoint is an unnecessary delay. Fujii Masao
* Add an optional parameter to pg_start_backup() that specifies whether to doTom Lane2009-04-07
| | | | | | the checkpoint in immediate or lazy mode. This is to address complaints that pg_start_backup() takes a long time even when there's no need to minimize its I/O consumption.
* Fix 'all at one page bug' in picksplit method of R-tree emulation. Add defenseTeodor Sigaev2009-04-06
| | | | from buggy user-defined picksplit to GiST.
* Fix infinite loop while checking of partial match in pending list.Teodor Sigaev2009-04-05
| | | | | Improve comments. Now GIN-indexable operators should be strict. Per Tom's questions/suggestions.
* Remove the recently added node types ReloptElem and OptionDefElem in favorTom Lane2009-04-04
| | | | | | of adding optional namespace and action fields to DefElem. Having three node types that do essentially the same thing bloats the code and leads to errors of confusion, such as in yesterday's bug report from Khee Chin.
* Disallow setting fillfactor for TOAST tables.Alvaro Herrera2009-04-04
| | | | | | | | | | | | | | To implement this without almost duplicating the reloption table, treat relopt_kind as a bitmask instead of an integer value. This decreases the range of allowed values, but it's not clear that there's need for that much values anyway. This patch also makes heap_reloptions explicitly a no-op for relation kinds other than heap and TOAST tables. Patch by ITAGAKI Takahiro with minor edits from me. (In particular I removed the bit about adding relation kind to an error message, which I intend to commit separately.)
* Revert DTrace patch from Robert LorBruce Momjian2009-04-02
|
* Add support for additional DTrace probes.Bruce Momjian2009-04-02
| | | | Robert Lor
* Fix an oversight in the support for storing/retrieving "minimal tuples" inTom Lane2009-03-30
| | | | | | | | | | | | | | | | | | | | | TupleTableSlots. We have functions for retrieving a minimal tuple from a slot after storing a regular tuple in it, or vice versa; but these were implemented by converting the internal storage from one format to the other. The problem with that is it invalidates any pass-by-reference Datums that were already fetched from the slot, since they'll be pointing into the just-freed version of the tuple. The known problem cases involve fetching both a whole-row variable and a pass-by-reference value from a slot that is fed from a tuplestore or tuplesort object. The added regression tests illustrate some simple cases, but there may be other failure scenarios traceable to the same bug. Note that the added tests probably only fail on unpatched code if it's built with --enable-cassert; otherwise the bug leads to fetching from freed memory, which will not have been overwritten without additional conditions. Fix by allowing a slot to contain both formats simultaneously; which turns out not to complicate the logic much at all, if anything it seems less contorted than before. Back-patch to 8.2, where minimal tuples were introduced.
* Adjust the APIs for GIN opclass support functions to allow the extractQuery()Tom Lane2009-03-25
| | | | | | | | | | | | | | method to pass extra data to the consistent() and comparePartial() methods. This is the core infrastructure needed to support the soon-to-appear contrib/btree_gin module. The APIs are still upward compatible with the definitions used in 8.3 and before, although *not* with the previous 8.4devel function definitions. catversion bump for changes in pg_proc entries (although these are just cosmetic, since GIN doesn't actually look at the function signature before calling it...) Teodor Sigaev and Oleg Bartunov
* Install a search tree depth limit in GIN bulk-insert operations, to preventTom Lane2009-03-24
| | | | | | | | | | | | them from degrading badly when the input is sorted or nearly so. In this scenario the tree is unbalanced to the point of becoming a mere linked list, so insertions become O(N^2). The easiest and most safely back-patchable solution is to stop growing the tree sooner, ie limit the growth of N. We might later consider a rebalancing tree algorithm, but it's not clear that the benefit would be worth the cost and complexity. Per report from Sergey Burladyan and an earlier complaint from Heikki. Back-patch to 8.2; older versions didn't have GIN indexes.
* Implement "fastupdate" support for GIN indexes, in which we try to accumulateTom Lane2009-03-24
| | | | | | | | | | | | | | | | | | multiple index entries in a holding area before adding them to the main index structure. This helps because bulk insert is (usually) significantly faster than retail insert for GIN. This patch also removes GIN support for amgettuple-style index scans. The API defined for amgettuple is difficult to support with fastupdate, and the previously committed partial-match feature didn't really work with it either. We might eventually figure a way to put back amgettuple support, but it won't happen for 8.4. catversion bumped because of change in GIN's pg_am entry, and because the format of GIN indexes changed on-disk (there's a metapage now, and possibly a pending list). Teodor Sigaev
* Const-ify the parse table passed to fillRelOptions. The previous codingTom Lane2009-03-23
| | | | meant it had to be built on-the-fly at each entry to default_reloptions.
* Code review for dtrace probes added (so far) to 8.4. Adjust placement ofTom Lane2009-03-11
| | | | | | | some bufmgr probes, take out redundant and memory-leak-inducing path arguments to smgr__md__read__done and smgr__md__write__done, fix bogus attempt to recalculate space used in sort__done, clean up formatting in places where I'm not sure pgindent will do a nice job by itself.
* Reload config file in startup process on SIGHUP.Heikki Linnakangas2009-03-04
| | | | Fujii Masao
* Reduce the maximum value of vacuum_cost_delay and autovacuum_vacuum_cost_delayTom Lane2009-02-28
| | | | | | | to 100ms (from 1000). This still seems to be comfortably larger than the useful range of the parameter, and it should help discourage people from picking uselessly large values. Tweak the documentation to recommend small values, too. Per discussion of a couple weeks ago.
* Change the signaling of end-of-recovery. Startup process now indicates endHeikki Linnakangas2009-02-23
| | | | | of recovery by exiting with exit code 0, like in previous releases. Per Tom's suggestion.
* Start background writer during archive recovery. Background writer now performsHeikki Linnakangas2009-02-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | its usual buffer cleaning duties during archive recovery, and it's responsible for performing restartpoints. This requires some changes in postmaster. When the startup process has done all the initialization and is ready to start WAL redo, it signals the postmaster to launch the background writer. The postmaster is signaled again when the point in recovery is reached where we know that the database is in consistent state. Postmaster isn't interested in that at the moment, but that's the point where we could let other backends in to perform read-only queries. The postmaster is signaled third time when the recovery has ended, so that postmaster knows that it's safe to start accepting connections. The startup process now traps SIGTERM, and performs a "clean" shutdown. If you do a fast shutdown during recovery, a shutdown restartpoint is performed, like a shutdown checkpoint, and postmaster kills the processes cleanly. You still have to continue the recovery at next startup, though. Currently, the background writer is only launched during archive recovery. We could launch it during crash recovery as well, but it seems better to keep that codepath as simple as possible, for the sake of robustness. And it couldn't do any restartpoints during crash recovery anyway, so it wouldn't be that useful. log_restartpoints is gone. Use log_checkpoints instead. This is yet to be documented. This whole operation is a pre-requisite for Hot Standby, but has some value of its own whether the hot standby patch makes 8.4 or not. Simon Riggs, with lots of modifications by me.
* Adopt Bob Jenkins' improved hash function for hash_any(). This changes theTom Lane2009-02-09
| | | | | | contents of hash indexes (again), so bump catversion. Kenneth Marshall
* Update autovacuum to use reloptions instead of a system catalog, forAlvaro Herrera2009-02-09
| | | | | | | | | per-table overrides of parameters. This removes a whole class of problems related to misusing the catalog, and perhaps more importantly, gives us pg_dump support for the parameters. Based on a patch by Euler Taveira de Oliveira, heavily reworked by me.
* Fix obsolete comment. Zdenek KotalaHeikki Linnakangas2009-02-07
|
* Allow reloption names to have qualifiers, initially supporting a TOASTAlvaro Herrera2009-02-02
| | | | | | | | qualifier, and add support for this in pg_dump. This allows TOAST tables to have user-defined fillfactor, and will also enable us to move the autovacuum parameters to reloptions without taking away the possibility of setting values for TOAST tables.
* Allow extracting and parsing of reloptions from a bare pg_class tuple, andAlvaro Herrera2009-01-26
| | | | | | refactor the relcache code that used to do that. This allows other callers (particularly autovacuum) to do the same without necessarily having to open and lock a table.
* Put back fast-path for the case that there's no backup blocks inHeikki Linnakangas2009-01-23
| | | | | RestoreBkpBlocks. Went missing in my recent refactoring patch, as pointed out by Simon's hot standby patch.
* Support column-level privileges, as required by SQL standard.Tom Lane2009-01-22
| | | | Stephen Frost, with help from KaiGai Kohei and others
* Add a new option to RestoreBkpBlocks() to indicate if a cleanup lock shouldHeikki Linnakangas2009-01-20
| | | | | | | | | be used instead of the normal exclusive lock, and make WAL redo functions responsible for calling RestoreBkpBlocks(). They know better what kind of a lock they need. At the moment, this just moves things around with no functional change, but makes the hot standby patch that's under review cleaner.
* Simplify the writing of amoptions routines by introducing a convenienceAlvaro Herrera2009-01-12
| | | | | | | | | | | | fillRelOptions routine that stores the parsed values in the struct using a table-based approach. Per Tom suggestion. Also remove the "continue" in HANDLE_*_RELOPTION macros, which were useless and in spirit they were assuming too much of how the macros were going to be used. (Note that these macros are now unused, but the intention is to introduce some usage in a future autovacuum patch, which is why they weren't completely removed.) Also, do not call the string validation routine when not validating. It seems less error-prone this way, per commentary on the amoptions SGML docs.
* Re-enable the old code in xlog.c that tried to use posix_fadvise(), so thatTom Lane2009-01-11
| | | | | | | we can get some buildfarm feedback about whether that function is still problematic. (Note that the planned async-preread patch will not really prove anything one way or the other in buildfarm testing, since it will be inactive with default GUC settings.)
* Revise the TIDBitmap API to support multiple concurrent iterations over aTom Lane2009-01-10
| | | | | | bitmap. This is extracted from Greg Stark's posix_fadvise patch; it seems worth committing separately, since it's potentially useful independently of posix_fadvise.
* A couple further reloptions improvements, per KaiGai Kohei: add a validationAlvaro Herrera2009-01-08
| | | | | | function to the string type and add a couple of macros for string handling. In passing, fix an off-by-one bug of mine.
* Fix string reloption handling, per KaiGai Kohei.Alvaro Herrera2009-01-06
|
* Suppress compiler warning in a different way, per Alvaro.Bruce Momjian2009-01-06
|
* Supress compiler warning.Bruce Momjian2009-01-06
|
* Change the reloptions machinery to use a table-based parser, and provideAlvaro Herrera2009-01-05
| | | | | | | | a more complete framework for writing custom option processing routines by user-defined access methods. Catalog version bumped due to the general API changes, which are going to affect user-defined "amoptions" routines.
* Update copyright for 2009.Bruce Momjian2009-01-01
|
* Change the name of dtrace wal tracepoints:Bruce Momjian2008-12-24
| | | | | | TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DIRTY Robert Lor
* The attached patch contains a couple of fixes in the existing probes andBruce Momjian2008-12-17
| | | | | | | | | | | | | includes a few new ones. - Fixed compilation errors on OS X for probes that use typedefs - Fixed a number of probes to pass ForkNumber per the relation forks patch - The new probes are those that were taken out from the previous submitted patch and required simple fixes. Will submit the other probes that may require more discussion in a separate patch. Robert Lor
* Make heap_update() set newtup->t_tableOid correctly, for consistency withTom Lane2008-12-16
| | | | | | | | | | the other major heapam.c functions. The only known consequence of this omission is that UPDATE RETURNING failed to return the correct value for "tableoid", as per report from KaiGai Kohei. Back-patch to 8.2. Arguably it's wrong all the way back; but without evidence of visible breakage before RETURNING was added, I'll desist from patching the older branches.
* To reduce confusion over whether VACUUM FULL is needed for anti-wraparoundTom Lane2008-12-11
| | | | | | vacuuming (it's not), say "database-wide VACUUM" instead of "full-database VACUUM" in the relevant hint messages. Also, document the permissions needed to do this. Per today's discussion.
* Revert SIGUSR1 multiplexing patch, per Tom's objection.Heikki Linnakangas2008-12-09
|
* Provide support for multiplexing SIGUSR1 signal. The upcoming synchronousHeikki Linnakangas2008-12-09
| | | | | | replication patch needs a signal, but we've already used SIGUSR1 and SIGUSR2 in normal backends. This patch allows reusing SIGUSR1 for that, and for other purposes too if the need arises.
* MAPSIZE macro needs to use MAXALIGN(SizeOfPageHeaderData) instead ofHeikki Linnakangas2008-12-06
| | | | | SizeOfPageHeaderData, like PageGetContents does. Per report by Pavan Deolasee.
* Fix a couple of snapshot management bugs in the new ResourceOwner world:Alvaro Herrera2008-12-04
| | | | | | | | | | | non-writable large objects need to have their snapshots registered on the transaction resowner, not the current portal's, because it must persist until the large object is closed (which the portal does not). Also, ensure that the serializable snapshot is recorded by the transaction resource owner too, even when a subtransaction has changed the current resource owner before serializable is taken. Per bug reports from Pavan Deolasee.
* Initialize GISTScanOpaque->qual_ok even if there is no conditions.Teodor Sigaev2008-12-04
|
* Introduce visibility map. The visibility map is a bitmap with one bit perHeikki Linnakangas2008-12-03
| | | | | | | | | | | | | | | | | | heap page, where a set bit indicates that all tuples on the page are visible to all transactions, and the page therefore doesn't need vacuuming. It is stored in a new relation fork. Lazy vacuum uses the visibility map to skip pages that don't need vacuuming. Vacuum is also responsible for setting the bits in the map. In the future, this can hopefully be used to implement index-only-scans, but we can't currently guarantee that the visibility map is always 100% up-to-date. In addition to the visibility map, there's a new PD_ALL_VISIBLE flag on each heap page, also indicating that all tuples on the page are visible to all transactions. It's important that this flag is kept up-to-date. It is also used to skip visibility tests in sequential scans, which gives a small performance gain on seqscans.