aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access
Commit message (Collapse)AuthorAge
* Force VACUUM to recalculate oldestXmin even when we haven't changed ourTom Lane2009-09-01
| | | | | | own database's datfrozenxid, if the current value is old enough to be forcing autovacuums or warning messages. This ensures that a bogus value is replaced as soon as possible. Per a comment from Heikki.
* Actually, we need to bump the format identifier on twophase filesTom Lane2009-09-01
| | | | because of readjustment of 2PC rmgr IDs for flatfile removal.
* Remove flatfiles.c, which is now obsolete.Alvaro Herrera2009-09-01
| | | | | | Recent commits have removed the various uses it was supporting. It was a performance bottleneck, according to bug report #4919 by Lauris Ulmanis; seems it slowed down user creation after a billion users.
* Track the current XID wrap limit (or more accurately, the oldest unfrozenTom Lane2009-08-31
| | | | | | | | | | | | | | | XID) in checkpoint records. This eliminates the need to recompute the value from scratch during database startup, which is one of the two remaining reasons for the flatfile code to exist. It should also simplify life for hot-standby operation. To avoid bloating the checkpoint records unreasonably, I switched from tracking the oldest database by name to tracking it by OID. This turns out to save cycles in general (everywhere but the warning-generating paths, which we hardly care about) and also helps us deal with the case that the oldest database got dropped instead of being vacuumed. The prior coding might go for a long time without updating the wrap limit in that case, which is bad because it might result in a lot of useless autovacuum activity.
* Fix handling of autovacuum reloptions.Alvaro Herrera2009-08-27
| | | | | | | | In the original coding, setting a single reloption would cause default values to be used for all the other reloptions. This is a problem particularly for autovacuum reloptions. Itagaki Takahiro
* In the checkpoint written at the end of archive recovery, the WAL page headerHeikki Linnakangas2009-08-27
| | | | | | | | | | was incorrectly initialized with timeline ID 0. That rendered the WAL page unrecoverable, making a subsequent archive recovery stop at that point. ThisTimeLineID needs to be initialized before calling AdvanceXLInsertBuffer(). This fixes bug #5011 reported by James Bardin. Backpatch to 8.4, as the bug was introduced by the changes to use of bgwriter for writing the end-of-archive-recovery checkpoint. Patch by Tom Lane.
* Fix a violation of WAL coding rules in the recent patch to include anTom Lane2009-08-24
| | | | | | | | | | | "all tuples visible" flag in heap page headers. The flag update *must* be applied before calling XLogInsert, but heap_update and the tuple moving routines in VACUUM FULL were ignoring this rule. A crash and replay could therefore leave the flag incorrectly set, causing rows to appear visible in seqscans when they should not be. This might explain recent reports of data corruption from Jeff Ross and others. In passing, do a bit of editorialization on comments in visibilitymap.c.
* Department of marginal improvements: teach tupconvert.c to avoid doing aTom Lane2009-08-17
| | | | | | | physical conversion when there are dropped columns in the same places in the input and output tupdescs. This avoids possible performance loss from the recent patch to improve dropped-column handling, in some cases where the old code would have worked.
* Allow backends to start up without use of the flat-file copy of pg_database.Tom Lane2009-08-12
| | | | | | | | | | | | | | | | | | | | | | To make this work in the base case, pg_database now has a nailed-in-cache relation descriptor that is initialized using hardwired knowledge in relcache.c. This means pg_database is added to the set of relations that need to have a Schema_pg_xxx macro maintained in pg_attribute.h. When this path is taken, we'll have to do a seqscan of pg_database to find the row we need. In the normal case, we are able to do an indexscan to find the database's row by name. This is made possible by storing a global relcache init file that describes only the shared catalogs and their indexes (and therefore is usable by all backends in any database). A new backend loads this cache file, finds its database OID after an indexscan on pg_database, and then loads the local relcache init file for that database. This change should effectively eliminate number of databases as a factor in backend startup time, even with large numbers of databases. However, the real reason for doing it is as a first step towards getting rid of the flat files altogether. There are still several other sub-projects to be tackled before that can happen.
* Document that LocalSetXLogInsertAllowed can be re-executed.Tom Lane2009-08-08
| | | | Per comment from Simon.
* rm_cleanup functions need to be allowed to write WAL entries. This oversightTom Lane2009-08-07
| | | | | appears to explain the recent reports of "PANIC: cannot make new WAL entries during recovery".
* Improve plpgsql's ability to cope with rowtypes containing dropped columns,Tom Lane2009-08-06
| | | | | | | | | | | by supporting conversions in places that used to demand exact rowtype match. Since this issue is certain to come up elsewhere (in fact, already has, in ExecEvalConvertRowtype), factor out the support code into new core functions for tuple conversion. I chose to put these in a new source file since heaptuple.c is already overly long. Heavily revised version of a patch by Pavel Stehule.
* Add ALTER TABLE ... ALTER COLUMN ... SET STATISTICS DISTINCTTom Lane2009-08-02
| | | | Robert Haas
* Department of second thoughts: let's show the exact key during unique indexTom Lane2009-08-01
| | | | | build failures, too. Refactor a bit more since that error message isn't spelled the same.
* Improve unique-constraint-violation error messages to include the exactTom Lane2009-08-01
| | | | | | | | | values being complained of. In passing, also remove the arbitrary length limitation in the similar error detail message for foreign key violations. Itagaki Takahiro
* Support deferrable uniqueness constraints.Tom Lane2009-07-29
| | | | | | | | | | The current implementation fires an AFTER ROW trigger for each tuple that looks like it might be non-unique according to the index contents at the time of insertion. This works well as long as there aren't many conflicts, but won't scale to massive unique-key reassignments. Improving that case is a TODO item. Dean Rasheed
* Tweak TOAST code so that columns marked with MAIN storage strategy areTom Lane2009-07-22
| | | | | | | | not forced out-of-line unless that is necessary to make the row fit on a page. Previously, they were forced out-of-line if needed to get the row down to the default target size (1/4th page). Kevin Grittner
* Make backend header files C++ safePeter Eisentraut2009-07-16
| | | | | | | | | | | This alters various incidental uses of C++ key words to use other similar identifiers, so that a C++ compiler won't choke outright. You still (probably) need extern "C" { }; around the inclusion of backend headers. based on a patch by Kurt Harriman <harriman@acm.org> Also add a script cpluspluscheck to check for C++ compatibility in the future. As of right now, this passes without error for me.
* Cleanup and code review for the patch that made bgwriter active duringTom Lane2009-06-26
| | | | | | | | | | | | | archive recovery. Invent a separate state variable and inquiry function for XLogInsertAllowed() to clarify some tests and make the management of writing the end-of-recovery checkpoint less klugy. Fix several places that were incorrectly testing InRecovery when they should be looking at RecoveryInProgress or XLogInsertAllowed (because they will now be executed in the bgwriter not startup process). Clarify handling of bad LSNs passed to XLogFlush during recovery. Use a spinlock for setting/testing SharedRecoveryInProgress. Improve quite a lot of comments. Heikki and Tom
* Fix some serious bugs in archive recovery, now that bgwriter is activeHeikki Linnakangas2009-06-25
| | | | | | | | | | | | | | | | | | | | during it: When bgwriter is active, the startup process can't perform mdsync() correctly because it won't see the fsync requests accumulated in bgwriter's private pendingOpsTable. Therefore make bgwriter responsible for the end-of-recovery checkpoint as well, when it's active. When bgwriter is active (= archive recovery), the startup process must not accumulate fsync requests to its own pendingOpsTable, since bgwriter won't see them there when it performs restartpoints. Make startup process drop its pendingOpsTable when bgwriter is launched to avoid that. Update minimum recovery point one last time when leaving archive recovery. It won't be updated by the end-of-recovery checkpoint because XLogFlush() sees us as out of recovery already. This fixes bug #4879 reported by Fujii Masao.
* The code to unlink dropped relations in FinishPreparedTransaction() wasHeikki Linnakangas2009-06-25
| | | | | | acting like runs inside WAL recovery, but it doesn't. I must've copy-pasted this from a redo-function in the relation forks patch. Noticed by Tom Lane while he was looking through callers of smgrdounlink().
* Correct grammar in picksplit debug messagesPeter Eisentraut2009-06-24
|
* Fix a few errors in comments. Patch by Fujii Masao, plus the one inHeikki Linnakangas2009-06-18
| | | | visibilitymap.c by me.
* 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef listBruce Momjian2009-06-11
| | | | provided by Andrew.
* Improve capitalization and punctuation in recently added GiST message.Peter Eisentraut2009-06-10
|
* Keep rs_startblock the same during heap_rescan, so that a rescan of a SeqScanTom Lane2009-06-10
| | | | | | | | | | | | | | node starts from the same place as the first scan did. This avoids surprising behavior of scrollable and WITH HOLD cursors, as seen in Mark Kirkwood's bug report of yesterday. It's not entirely clear whether a rescan should be forced to drop out of the syncscan mode, but for the moment I left the code behaving the same on that point. Any change there would only be a performance and not a correctness issue, anyway. Back-patch to 8.3, since the unstable behavior was created by the syncscan patch.
* Improve the IndexVacuumInfo/IndexBulkDeleteResult API to allow somewhat saneTom Lane2009-06-06
| | | | | | | | | | | | | | | | | | | behavior in cases where we don't know the heap tuple count accurately; in particular partial vacuum, but this also makes the API a bit more useful for ANALYZE. This patch adds "estimated_count" flags to both structs so that an approximate count can be flagged as such, and adjusts the logic so that approximate counts are not used for updating pg_class.reltuples. This fixes my previous complaint that VACUUM was putting ridiculous values into pg_class.reltuples for indexes. The actual impact of that bug is limited, because the planner only pays attention to reltuples for an index if the index is partial; which probably explains why beta testers hadn't noticed a degradation in plan quality from it. But it needs to be fixed. The whole thing is a bit messy and should be redesigned in future, because reltuples now has the potential to drift quite far away from reality when a long period elapses with no non-partial vacuums. But this is as good as it's going to get for 8.4.
* Fix a serious bug introduced into GIN in 8.4: now that MergeItemPointers()Tom Lane2009-06-06
| | | | | | | | | is supposed to remove duplicate heap TIDs, we have to be sure to reduce the tuple size and posting-item count accordingly in addItemPointersToTuple(). Failing to do so resulted in the effective injection of garbage TIDs into the index contents, ie, whatever happened to be in the memory palloc'd for the new tuple. I'm not sure that this fully explains the index corruption reported by Tatsuo Ishii, but the test case I'm using no longer fails.
* Only recycle normal files in pg_xlog as WAL segments. pg_standby createsHeikki Linnakangas2009-06-02
| | | | | | | | symbolic links with the -l option, and as Fujii Masao pointed out we ended up overwriting files in the archive directory before this patch. Patch by Aidan Van Dyk, Fujii Masao and me. Backpatch to 8.3, where pg_standby was introduced.
* When archiving is enabled, rotate the last WAL segment at shutdown so thatHeikki Linnakangas2009-05-28
| | | | | | all transactions are archived. Original patch by Guillaume Smet.
* Use more-portable coding for the check on handing out the last availableTom Lane2009-05-24
| | | | relopt_kind value in add_reloption_kind(). Per Zdenek Kotala.
* Fix bug #4814 (wrong subscript in consistent-function call), and add someTom Lane2009-05-19
| | | | minimal regression test coverage for matchPartialInPendingList().
* Fix all the server-side SIGQUIT handlers (grumble ... why so many identicalTom Lane2009-05-15
| | | | | | | copies?) to ensure they really don't run proc_exit/shmem_exit callbacks, as was intended. I broke this behavior recently by installing atexit callbacks without thinking about the one case where we truly don't want to run those callback functions. Noted in an example from Dave Page.
* Include recovery_end_command in recovery.conf.sample.Tom Lane2009-05-14
| | | | Per suggestion of Jaime Casanova.
* Improve a couple of comments.Tom Lane2009-05-14
|
* Add recovery_end_command option to recovery.conf. recovery_end_commandHeikki Linnakangas2009-05-14
| | | | | | | | | | | | | | is run at the end of archive recovery, providing a chance to do external cleanup. Modify pg_standby so that it no longer removes the trigger file, that is to be done using the recovery_end_command now. Provide a "smart" failover mode in pg_standby, where we don't fail over immediately, but only after recovering all unapplied WAL from the archive. That gives you zero data loss assuming all WAL was archived before failover, which is what most users of pg_standby actually want. recovery_end_command by Simon Riggs, pg_standby changes by Fujii Masao and myself.
* Rewrite xml.c's memory management (yet again). Give up on the idea ofTom Lane2009-05-13
| | | | | | | | | | | | | | | | | | | | | | redirecting libxml's allocations into a Postgres context. Instead, just let it use malloc directly, and add PG_TRY blocks as needed to be sure we release libxml data structures in error recovery code paths. This is ugly but seems much more likely to play nicely with third-party uses of libxml, as seen in recent trouble reports about using Perl XML facilities in pl/perl and bug #4774 about contrib/xml2. I left the code for allocation redirection in place, but it's only built/used if you #define USE_LIBXMLCONTEXT. This is because I found it useful to corral libxml's allocations in a palloc context when hunting for libxml memory leaks, and we're surely going to have more of those in the future with this type of approach. But we don't want it turned on in a normal build because it breaks exactly what we need to fix. I have not re-indented most of the code sections that are now wrapped by PG_TRY(); that's for ease of review. pg_indent will fix it. This is a pre-existing bug in 8.3, but I don't dare back-patch this change until it's gotten a reasonable amount of field testing.
* Fix LOCK TABLE to eliminate the race condition that could make it give weirdTom Lane2009-05-12
| | | | | | | | | errors when tables are concurrently dropped. To do this we must take lock on each relation before we check its privileges. The old code was trying to do that the other way around, which is a bit pointless when there are lots of other commands that lock relations before checking privileges. I did keep it checking each relation's privilege before locking the next relation, which is a detail that ALTER TABLE isn't too picky about.
* Request XLOG switch before writing checkpoint in pg_start_backup(). OtherwiseHeikki Linnakangas2009-05-07
| | | | | | | | | | | | | | | | | | you can end up with an unrecoverable backup if you start a new base backup right after finishing archive recovery. In that scenario, the redo pointer of the checkpoint that pg_start_backup() writes points to the XLOG segment where the timeline-changing end-of-archive-recovery checkpoint is. The beginning of that segment contains pages with the old timeline ID, and we don't accept that in recovery unless we find a history file covering the old timeline ID. If you omit pg_xlog from the base backup and clear the archive directory before starting the backup, there will be no such history file available. The bug is present in all versions since PITR was introduced in 8.0, but I'm back-patching only back to 8.2. Earlier versions didn't have XLOG switch records, making this fix unfeasible. Given the lack of reports until now, it doesn't seem worthwhile to spend more effort to fix 8.0 and 8.1. Per report and suggestion by Mikael Krantz
* Insert CHECK_FOR_INTERRUPTS() calls into btree and hash index scans at theTom Lane2009-05-05
| | | | | | | | | | | | | | | | | | | points where we step right or left to the next page. This should ensure reasonable response time to a query cancel request during an unsuccessful index scan, as seen in recent gripe from Marc Cousin. It's a bit trickier than it might seem at first glance, because CHECK_FOR_INTERRUPTS() is a no-op if executed while holding a buffer lock. So we have to do it just at the point where we've dropped one page lock and not yet acquired the next. Remove CHECK_FOR_INTERRUPTS calls at the top level of btgetbitmap and hashgetbitmap, since they're pointless given the added checks. I think that GIST is okay already --- at least, there's a CHECK_FOR_INTERRUPTS at a plausible-looking place in gistnext(). I don't claim to know GIN well enough to try to poke it for this, if indeed it has a problem at all. This is a pre-existing issue, but in view of the lack of prior complaints I'm not going to risk back-patching.
* Update comment for _bt_relandgetbuf.Tom Lane2009-05-05
|
* Change the default value of max_prepared_transactions to zero, and addTom Lane2009-04-23
| | | | | | | | | | | | | | | | | | | | documentation warnings against setting it nonzero unless active use of prepared transactions is intended and a suitable transaction manager has been installed. This should help to prevent the type of scenario we've seen several times now where a prepared transaction is forgotten and eventually causes severe maintenance problems (or even anti-wraparound shutdown). The only real reason we had the default be nonzero in the first place was to support regression testing of the feature. To still be able to do that, tweak pg_regress to force a nonzero value during "make check". Since we cannot force a nonzero value in "make installcheck", add a variant regression test "expected" file that shows the results that will be obtained when max_prepared_transactions is zero. Also, extend the HINT messages for transaction wraparound warnings to mention the possibility that old prepared transactions are causing the problem. All per today's discussion.
* After archive recovery, mark the last WAL segment from the parent timelineHeikki Linnakangas2009-04-22
| | | | | | | ready for archival. It was marked at the next checkpoint anyway, but waiting for the next checkpoint is an unnecessary delay. Fujii Masao
* Add an optional parameter to pg_start_backup() that specifies whether to doTom Lane2009-04-07
| | | | | | the checkpoint in immediate or lazy mode. This is to address complaints that pg_start_backup() takes a long time even when there's no need to minimize its I/O consumption.
* Fix 'all at one page bug' in picksplit method of R-tree emulation. Add defenseTeodor Sigaev2009-04-06
| | | | from buggy user-defined picksplit to GiST.
* Fix infinite loop while checking of partial match in pending list.Teodor Sigaev2009-04-05
| | | | | Improve comments. Now GIN-indexable operators should be strict. Per Tom's questions/suggestions.
* Remove the recently added node types ReloptElem and OptionDefElem in favorTom Lane2009-04-04
| | | | | | of adding optional namespace and action fields to DefElem. Having three node types that do essentially the same thing bloats the code and leads to errors of confusion, such as in yesterday's bug report from Khee Chin.
* Disallow setting fillfactor for TOAST tables.Alvaro Herrera2009-04-04
| | | | | | | | | | | | | | To implement this without almost duplicating the reloption table, treat relopt_kind as a bitmask instead of an integer value. This decreases the range of allowed values, but it's not clear that there's need for that much values anyway. This patch also makes heap_reloptions explicitly a no-op for relation kinds other than heap and TOAST tables. Patch by ITAGAKI Takahiro with minor edits from me. (In particular I removed the bit about adding relation kind to an error message, which I intend to commit separately.)
* Revert DTrace patch from Robert LorBruce Momjian2009-04-02
|
* Add support for additional DTrace probes.Bruce Momjian2009-04-02
| | | | Robert Lor