postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
*	Fix pg_extension_config_dump() to handle update cases more sanely.	Tom Lane	2012-12-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If pg_extension_config_dump() is executed again for a table already listed in the extension's extconfig, the code was blindly making a new array entry. This does not seem useful. Fix it to replace the existing array entry instead, so that it's possible for extension update scripts to alter the filter conditions for configuration tables. In addition, teach ALTER EXTENSION DROP TABLE to check for an extconfig entry for the target table, and remove it if present. This is not a 100% solution because it's allowed for an extension update script to just summarily DROP a member table, and that code path doesn't go through ExecAlterExtensionContentsStmt. We could probably make that case clean things up if we had to, but it would involve sticking a very ugly wart somewhere in the guts of dependency.c. Since on the whole it seems quite unlikely that extension updates would want to remove pre-existing configuration tables, making the case possible with an explicit command seems sufficient. Per bug #7756 from Regina Obe. Back-patch to 9.1 where extensions were introduced.
*	Fix recycling of WAL segments after changing recovery target timeline.	Heikki Linnakangas	2012-12-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After the recovery target timeline is changed, we would still recycle and preallocate WAL segments on the old target timeline. Those WAL segments created for the old timeline are a waste of space, although otherwise harmless. The problem is that when installing a recycled WAL segment as a future one, ThisTimeLineID is used to construct the filename. ThisTimeLineID is initialized in the checkpointer process to the recovery target timeline at startup, but it was not updated when the startup process chooses a new target timeline (recovery_target_timeline='latest'). To fix, always update ThisTimeLineID before recycling WAL segments at a restartpoint. This still leaves a small window where we might install WAL segments under wrong timeline ID, if the target timeline is changed just as we're about to start recycling. Also, when we're not on the target timeline yet, but still replaying some older timeline, we'll install WAL segments to the newer timeline anyway and they will still go wasted. We'll just live with the waste in that situation. Commit to 9.2 and 9.1. Older versions didn't change recovery target timeline after startup, and for master, I'll commit a slightly different variant of this.
*	Check if we've reached end-of-backup point also if no redo is required.	Heikki Linnakangas	2012-12-19
\| \| \| \| \| \| \| \| \| \| \| \| \|	If you restored from a backup taken from a standby, and the last record in the backup is the checkpoint record, ie. there is no redo required except for the checkpoint record, we would fail to notice that we've reached the end-of-backup point, and the database is consistent. The result was an error "WAL ends before end of online backup". To fix, move the have-we-reached-end-of-backup check into CheckRecoveryConsistency(), which is already responsible for similar checks with minRecoveryPoint, and is called in the right places. Backpatch to 9.2, this check and bug did not exist before that.
*	Ignore libedit/libreadline while probing for standard functions.	Tom Lane	2012-12-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some versions of libedit expose bogus definitions of setproctitle(), optreset, and perhaps other symbols that we don't want configure to pick up on. There was a previous report of similar problems with strlcpy(), which we addressed in commit 59cf88da91bc88978b05275ebd94ac2d980c4047, but the problem has evidently grown in scope since then. In hopes of not having to deal with it again in future, rearrange configure's tests for supplied functions so that we ignore libedit/libreadline except when probing specifically for functions we expect them to provide. Per report from Christoph Berg, though this is slightly more aggressive than his proposed patch.
*	Fix failure to ignore leftover temp tables after a server crash.	Tom Lane	2012-12-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During crash recovery, we remove disk files belonging to temporary tables, but the system catalog entries for such tables are intentionally not cleaned up right away. Instead, the first backend that uses a temp schema is expected to clean out any leftover objects therein. This approach requires that we be careful to ignore leftover temp tables (since any actual access attempt would fail), even if their BackendId matches our session, if we have not yet established use of the session's corresponding temp schema. That worked fine in the past, but was broken by commit debcec7dc31a992703911a9953e299c8d730c778 which incorrectly removed the rd_islocaltemp relcache flag. Put it back, and undo various changes that substituted tests like "rel->rd_backend == MyBackendId" for use of a state-aware flag. Per trouble report from Heikki Linnakangas. Back-patch to 9.1 where the erroneous change was made. In the back branches, be careful to add rd_islocaltemp in a spot in the struct that was alignment padding before, so as not to break existing add-on code.
*	Fix filling of postmaster.pid in bootstrap/standalone mode.	Tom Lane	2012-12-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We failed to ever fill the sixth line (LISTEN_ADDR), which caused the attempt to fill the seventh line (SHMEM_KEY) to fail, so that the shared memory key never got added to the file in standalone mode. This has been broken since we added more content to our lock files in 9.1. To fix, tweak the logic in CreateLockFile to add an empty LISTEN_ADDR line in standalone mode. This is a tad grotty, but since that function already knows almost everything there is to know about the contents of lock files, it doesn't seem that it's any better to hack it elsewhere. It's not clear how significant this bug really is, since a standalone backend should never have any children and thus it seems not critical to be able to check the nattch count of the shmem segment externally. But I'm going to back-patch the fix anyway. This problem had escaped notice because of an ancient (and in hindsight pretty dubious) decision to suppress LOG-level messages by default in standalone mode; so that the elog(LOG) complaint in AddToDataDirLockFile that should have warned of the problem didn't do anything. Fixing that is material for a separate patch though.
*	Properly copy fmgroids.h after clean on Win32	Magnus Hagander	2012-12-16
\| \| \| \|	Craig Ringer
*	In multi-insert, don't go into infinite loop on a huge tuple and fillfactor.	Heikki Linnakangas	2012-12-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a tuple is larger than page size minus space reserved for fillfactor, heap_multi_insert would never find a page that it fits in and repeatedly ask for a new page from RelationGetBufferForTuple. If a tuple is too large to fit on any page, taking fillfactor into account, RelationGetBufferForTuple will always expand the relation. In a normal insert, heap_insert will accept that and put the tuple on the new page. heap_multi_insert, however, does a fillfactor check of its own, and doesn't accept the newly-extended page RelationGetBufferForTuple returns, even though there is no other choice to make the tuple fit. Fix that by making the logic in heap_multi_insert more like the heap_insert logic. The first tuple is always put on the page RelationGetBufferForTuple gives us, and the fillfactor check is only applied to the subsequent tuples. Report from David Gould, although I didn't use his patch.
*	Add defenses against integer overflow in dynahash numbuckets calculations.	Tom Lane	2012-12-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The dynahash code requires the number of buckets in a hash table to fit in an int; but since we calculate the desired hash table size dynamically, there are various scenarios where we might calculate too large a value. The resulting overflow can lead to infinite loops, division-by-zero crashes, etc. I (tgl) had previously installed some defenses against that in commit 299d1716525c659f0e02840e31fbe4dea3, but that covered only one call path. Moreover it worked by limiting the request size to work_mem, but in a 64-bit machine it's possible to set work_mem high enough that the problem appears anyway. So let's fix the problem at the root by installing limits in the dynahash.c functions themselves. Trouble report and patch by Jeff Davis.
*	Fix pg_upgrade for invalid indexes	Bruce Momjian	2012-12-11
\| \| \| \| \| \| \| \| \|	All versions of pg_upgrade upgraded invalid indexes caused by CREATE INDEX CONCURRENTLY failures and marked them as valid. The patch adds a check to all pg_upgrade versions and throws an error during upgrade or --check. Backpatch to 9.2, 9.1, 9.0. Patch slightly adjusted.
*	Consistency check should compare last record replayed, not last record read.	Heikki Linnakangas	2012-12-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	EndRecPtr is the last record that we've read, but not necessarily yet replayed. CheckRecoveryConsistency should compare minRecoveryPoint with the last replayed record instead. This caused recovery to think it's reached consistency too early. Now that we do the check in CheckRecoveryConsistency correctly, we have to move the call of that function to after redoing a record. The current place, after reading a record but before replaying it, is wrong. In particular, if there are no more records after the one ending at minRecoveryPoint, we don't enter hot standby until one extra record is generated and read by the standby, and CheckRecoveryConsistency is called. These two bugs conspired to make the code appear to work correctly, except for the small window between reading the last record that reaches minRecoveryPoint, and replaying it. In the passing, rename recoveryLastRecPtr, which is the last record replayed, to lastReplayedEndRecPtr. This makes it slightly less confusing with replayEndRecPtr, which is the last record read that we're about to replay. Original report from Kyotaro HORIGUCHI, further diagnosis by Fujii Masao. Backpatch to 9.0, where Hot Standby subtly changed the test from "minRecoveryPoint < EndRecPtr" to "minRecoveryPoint <= EndRecPtr". The former works because where the test is performed, we have always read one more record than we've replayed.
*	Add mode where contrib installcheck runs each module in a separately named ↵	Andrew Dunstan	2012-12-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	database. Normally each module is tested in a database named contrib_regression, which is dropped and recreated at the beginhning of each pg_regress run. This new mode, enabled by adding USE_MODULE_DB=1 to the make command line, runs most modules in a database with the module name embedded in it. This will make testing pg_upgrade on clusters with the contrib modules a lot easier. Second attempt at this, this time accomodating make versions older than 3.82. Still to be done: adapt to the MSVC build system. Backpatch to 9.0, which is the earliest version it is reasonably possible to test upgrading from.
*	Fix pg_upgrade -O/-o options	Bruce Momjian	2012-12-10
\| \| \| \| \| \| \|	Fix previous commit that added synchronous_commit=off, but broke -O/-o due to missing space in argument passing. Backpatch to 9.2.
*	Update minimum recovery point on truncation.	Heikki Linnakangas	2012-12-10
\| \| \| \| \| \| \| \| \|	If a file is truncated, we must update minRecoveryPoint. Once a file is truncated, there's no going back; it would not be safe to stop recovery at a point earlier than that anymore. Per report from Kyotaro HORIGUCHI. Backpatch to 8.4. Before that, minRecoveryPoint was not updated during recovery at all.
*	Fix assorted bugs in privileges-for-types patch.	Tom Lane	2012-12-09
\| \| \| \| \| \| \| \|	Commit 729205571e81b4767efc42ad7beb53663e08d1ff added privileges on data types, but there were a number of oversights. The implementation of default privileges for types missed a few places, and pg_dump was utterly innocent of the whole concept. Per bug #7741 from Nathan Alden, and subsequent wider investigation.
*	Update iso.org page link	Peter Eisentraut	2012-12-08
\| \| \| \|	The old one is responding with 404.
*	Fix intermittent crash in DROP INDEX CONCURRENTLY.	Tom Lane	2012-12-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When deleteOneObject closes and reopens the pg_depend relation, we must see to it that the relcache pointer held by the calling function (typically performMultipleDeletions) is updated. Usually the relcache entry is retained so that the pointer value doesn't change, which is why the problem had escaped notice ... but after a cache flush event there's no guarantee that the same memory will be reassigned. To fix, change the recursive functions' APIs so that we pass around a "Relation *" not just "Relation". Per investigation of occasional buildfarm failures. This is trivial to reproduce with -DCLOBBER_CACHE_ALWAYS, which points up the sad lack of any buildfarm member running that way on a regular basis.
*	Ensure recovery pause feature doesn't pause unless users can connect.	Tom Lane	2012-12-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we're not in hot standby mode, then there's no way for users to connect to reset the recoveryPause flag, so we shouldn't pause. The code was aware of this but the test to see if pausing was safe was seriously inadequate: it wasn't paying attention to reachedConsistency, and besides what it was testing was that we could legally enter hot standby, not that we have done so. Get rid of that in favor of checking LocalHotStandbyActive, which because of the coding in CheckRecoveryConsistency is tantamount to checking that we have told the postmaster to enter hot standby. Also, move the recoveryPausesHere() call that reacts to asynchronous recoveryPause requests so that it's not in the middle of application of a WAL record. I put it next to the recoveryStopsHere() call --- in future those are going to need to interact significantly, so this seems like a good waystation. Also, don't bother trying to read another WAL record if we've already decided not to continue recovery. This was no big deal when the code was written originally, but now that reading a record might entail actions like fetching an archive file, it seems a bit silly to do it like that. Per report from Jeff Janes and subsequent discussion. The pause feature needs quite a lot more work, but this gets rid of some indisputable bugs, and seems safe enough to back-patch.
*	Must not reach consistency before XLOG_BACKUP_RECORD	Simon Riggs	2012-12-05
\| \| \| \| \| \| \| \| \| \|	When waiting for an XLOG_BACKUP_RECORD the minRecoveryPoint will be incorrect, so we must not declare recovery as consistent before we have seen the record. Major bug allowing recovery to end too early in some cases, allowing people to see inconsistent db. This patch to HEAD and 9.2, other fix required for 9.1 and 9.0 Simon Riggs and Andres Freund, bug report by Jeff Janes
*	Include isinf.o in libecpg if isinf() is not available on the system.	Michael Meskes	2012-12-04
\| \| \| \|	Patch done by Jiang Guiqing <jianggq@cn.fujitsu.com>.
*	Stamp 9.2.2.REL9_2_2	Tom Lane	2012-12-03
\|
*	Update release notes for 9.2.2, 9.1.7, 9.0.11, 8.4.15, 8.3.22.	Tom Lane	2012-12-03
\|
*	Revert "Add mode where contrib installcheck runs each module in a separately ↵	Andrew Dunstan	2012-12-03
\| \| \| \| \| \|	named database." This reverts commit 30248be9635e844918c2873ae7bb598f8a487e49.
*	Avoid holding vmbuffer pin after VACUUM.	Simon Riggs	2012-12-03
\| \| \| \| \| \| \| \| \| \| \| \|	During VACUUM if we pause to perform a cycle of index cleanup we drop the vmbuffer pin, so we should do the same thing when heap scan completes. This avoids holding vmbuffer pin across the main index cleanup in VACUUM, which could be minutes or hours longer than necessary for correctness. Bug report and suggested fix from Pavan Deolasee
*	Fix documentation of path(polygon) function.	Tom Lane	2012-12-03
\| \| \| \| \| \| \|	Obviously, this returns type "path", but somebody made a copy-and-pasteo long ago. Dagfinn Ilmari Mannsåker
*	Translation updates	Peter Eisentraut	2012-12-03
\|
*	Add mode where contrib installcheck runs each module in a separately named ↵	Andrew Dunstan	2012-12-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	database. Normally each module is tested in aq database named contrib_regression, which is dropped and recreated at the beginhning of each pg_regress run. This mode, enabled by adding USE_MODULE_DB=1 to the make command line, runs most modules in a database with the module name embedded in it. This will make testing pg_upgrade on clusters with the contrib modules a lot easier. Still to be done: adapt to the MSVC build system. Backpatch to 9.0, which is the earliest version it is reasonably possible to test upgrading from.
*	Update time zone data files to tzdata release 2012j.	Tom Lane	2012-12-02
\| \| \| \| \|	DST law changes in Cuba, Israel, Jordan, Libya, Palestine, Western Samoa, and portions of Brazil.
*	Recommend triggers, not rules, in the CREATE VIEW reference page.	Tom Lane	2012-12-02
\| \| \| \| \| \|	We've generally recommended use of INSTEAD triggers over rules since that feature was added; but this old text in the CREATE VIEW reference page didn't get the memo. Noted by Thomas Kellerer.
*	Don't advance checkPoint.nextXid near the end of a checkpoint sequence.	Tom Lane	2012-12-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit c11130690d6dca64267201a169cfb38c1adec5ef in favor of actually fixing the problem: namely, that we should never have been modifying the checkpoint record's nextXid at this point to begin with. The nextXid should match the state as of the checkpoint's logical WAL position (ie the redo point), not the state as of its physical position. It's especially bogus to advance it in some wal_levels and not others. In any case there is no need for the checkpoint record to carry the same nextXid shown in the XLOG_RUNNING_XACTS record just emitted by LogStandbySnapshot, as any replay operation will already have adopted that value as current. This fixes bug #7710 from Tarvi Pillessaar, and probably also explains bug #6291 from Daniel Farina, in that if a checkpoint were in progress at the instant of XID wraparound, the epoch bump would be lost as reported. (And, of course, these days there's at least a 50-50 chance of a checkpoint being in progress at any given instant.) Diagnosed by me and independently by Andres Freund. Back-patch to all branches supporting hot standby.
*	XidEpoch++ if wraparound during checkpoint.	Simon Riggs	2012-12-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If wal_level = hot_standby we update the checkpoint nextxid, though in the case where a wraparound occurred half-way through a checkpoint we would neglect updating the epoch also. Updating the nextxid is arguably the wrong thing to do, but changing that may introduce subtle bugs into hot standby startup, while updating the value doesn't cause any known bugs yet. Minimal fix now to HEAD and backbranches, wider fix later in HEAD. Bug reported in #6291 by Daniel Farina and slightly differently in Cause analysis and recommended fixes from Tom Lane and Andres Freund. Applied patch is minimal version of Andres Freund's work.
*	Fix psql crash while parsing SQL file whose encoding is different from	Tatsuo Ishii	2012-12-02
\| \| \| \| \| \|	client encoding and the client encoding is not safe one. Such an example is, file encoding is UTF-8 and client encoding SJIS. Patch contributed by Jiang Guiqing.
*	Prevent passing gmake's environment variables down through pg_regress.	Tom Lane	2012-12-01
\| \| \| \| \| \| \| \| \| \| \|	When we do "make install" to create a temp installation, we don't want that instance of make to try to communicate with any instance of make that might be calling us. This is known to cause problems if the upper make has a -jN flag, and in principle could cause problems even without that. Unset the relevant environment variables to prevent such issues. Andres Freund
*	Make sure sharedir/extension/ directory is created when needed.	Tom Lane	2012-12-01
\| \| \| \| \| \| \| \| \|	The previous coding worked as long as MODULEDIR wasn't set explicitly, because we create sharedir/$(datamoduledir) and the default value of that is "extension". But if some other value is specified for MODULEDIR then the installation directory needed for the control file wasn't made. Cédric Villemain
*	doc: Fix broken links to DocBook wiki	Peter Eisentraut	2012-12-01
\|
*	Change test ExceptionalCondition to return void	Alvaro Herrera	2012-11-30
\| \| \| \|	Commit 81107282a changed it in assert.c, but overlooked this other file.
*	Take buffer lock while inspecting btree index pages in contrib/pageinspect.	Tom Lane	2012-11-30
\| \| \| \|	It's not safe to examine a shared buffer without any lock.
*	Add missing buffer lock acquisition in GetTupleForTrigger().	Tom Lane	2012-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \|	If we had not been holding buffer pin continuously since the tuple was initially fetched by the UPDATE or DELETE query, it would be possible for VACUUM or a page-prune operation to move the tuple while we're trying to copy it. This would result in a garbage "old" tuple value being passed to an AFTER ROW UPDATE or AFTER ROW DELETE trigger. The preconditions for this are somewhat improbable, and the timing constraints are very tight; so it's not so surprising that this hasn't been reported from the field, even though the bug has been there a long time. Problem found by Andres Freund. Back-patch to all active branches.
*	Clean environment for pg_upgrade test.	Andrew Dunstan	2012-11-30
\| \| \| \| \|	This removes exisiting PG settings from the environment for pg_upgrade tests, just like pg_regress does.
*	Produce a more useful error message for over-length Unix socket paths.	Tom Lane	2012-11-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The length of a socket path name is constrained by the size of struct sockaddr_un, and there's not a lot we can do about it since that is a kernel API. However, it would be a good thing if we produced an intelligible error message when the user specifies a socket path that's too long --- and getaddrinfo's standard API is too impoverished to do this in the natural way. So insert explicit tests at the places where we construct a socket path name. Now you'll get an error that makes sense and even tells you what the limit is, rather than something generic like "Non-recoverable failure in name resolution". Per trouble report from Jeremy Drake and a fix idea from Andrew Dunstan.
*	Cleanup VirtualXact at end of Hot Standby	Simon Riggs	2012-11-29
\| \| \| \|	Resolves bug 7572 reported by Daniele Varrazzo
*	Correctly init fast path fields on PGPROC	Simon Riggs	2012-11-29
\|
*	When processing nested structure pointer variables ecpg always expected an	Michael Meskes	2012-11-29
\| \| \| \| \| \|	array datatype which of course is wrong. Applied patch by Muhammad Usama <m.usama@gmail.com> to fix this.
*	Fix assorted bugs in CREATE/DROP INDEX CONCURRENTLY.	Tom Lane	2012-11-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 8cb53654dbdb4c386369eb988062d0bbb6de725e, which introduced DROP INDEX CONCURRENTLY, managed to break CREATE INDEX CONCURRENTLY via a poor choice of catalog state representation. The pg_index state for an index that's reached the final pre-drop stage was the same as the state for an index just created by CREATE INDEX CONCURRENTLY. This meant that the (necessary) change to make RelationGetIndexList ignore about-to-die indexes also made it ignore freshly-created indexes; which is catastrophic because the latter do need to be considered in HOT-safety decisions. Failure to do so leads to incorrect index entries and subsequently wrong results from queries depending on the concurrently-created index. To fix, make the final state be indisvalid = true and indisready = false, which is otherwise nonsensical. This is pretty ugly but we can't add another column without forcing initdb, and it's too late for that in 9.2. (There's a cleaner fix in HEAD.) In addition, change CREATE/DROP INDEX CONCURRENTLY so that the pg_index flag changes they make without exclusive lock on the index are made via heap_inplace_update() rather than a normal transactional update. The latter is not very safe because moving the pg_index tuple could result in concurrent SnapshotNow scans finding it twice or not at all, thus possibly resulting in index corruption. This is a pre-existing bug in CREATE INDEX CONCURRENTLY, which was copied into the DROP code. In addition, fix various places in the code that ought to check to make sure that the indexes they are manipulating are valid and/or ready as appropriate. These represent bugs that have existed since 8.2, since a failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalid index behind, and we ought not try to do anything that might fail with such an index. Also fix RelationReloadIndexInfo to ensure it copies all the pg_index columns that are allowed to change after initial creation. Previously we could have been left with stale values of some fields in an index relcache entry. It's not clear whether this actually had any user-visible consequences, but it's at least a bug waiting to happen. In addition, do some code and docs review for DROP INDEX CONCURRENTLY; some cosmetic code cleanup but mostly addition and revision of comments. Portions of this need to be back-patched even further, but I'll work on that separately. Problem reported by Amit Kapila, diagnosis by Pavan Deolasee, fix by Tom Lane and Andres Freund.
*	If we don't have a backup-end-location, don't claim we've reached it.	Heikki Linnakangas	2012-11-28
\| \| \| \| \| \| \| \|	This was apparently a typo, which caused recovery to think that it immediately reached the end of backup, and allowed the database to start up too early. Reported by Jeff Janes. Backpatch to 9.2, where this code was introduced.
*	Revert patch for taking fewer snapshots.	Tom Lane	2012-11-26
\| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit d573e239f03506920938bf0be56c868d9c3416da, "Take fewer snapshots". While that seemed like a good idea at the time, it caused execution to use a snapshot that had been acquired before locking any of the tables mentioned in the query. This created user-visible anomalies that were not present in any prior release of Postgres, as reported by Tomas Vondra. While this whole area could do with a redesign (since there are related cases that have anomalies anyway), it doesn't seem likely that any future patch would be reasonably back-patchable; and we don't want 9.2 to exhibit a behavior that's subtly unlike either past or future releases. Hence, revert to prior code while we rethink the problem.
*	Fix SELECT DISTINCT with index-optimized MIN/MAX on inheritance trees.	Tom Lane	2012-11-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In a query such as "SELECT DISTINCT min(x) FROM tab", the DISTINCT is pretty useless (there being only one output row), but nonetheless it shouldn't fail. But it could fail if "tab" is an inheritance parent, because planagg.c's code for fixing up equivalence classes after making the index-optimized MIN/MAX transformation wasn't prepared to find child-table versions of the aggregate expression. The least ugly fix seems to be to add an option to mutate_eclass_expressions() to skip child-table equivalence class members, which aren't used anymore at this stage of planning so it's not really necessary to fix them. Since child members are ignored in many cases already, it seems plausible for mutate_eclass_expressions() to have an option to ignore them too. Per bug #7703 from Maxim Boguk. Back-patch to 9.1. Although the same code exists before that, it cannot encounter child-table aggregates AFAICS, because the index optimization transformation cannot succeed on inheritance trees before 9.1 (for lack of MergeAppend).
*	pg_stat_replication.sync_state was displayed incorrectly at page boundary.	Heikki Linnakangas	2012-11-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	XLogRecPtrIsInvalid() only checks the xrecoff field, which is correct when checking if a WAL record could legally begin at the given position, but WAL sending can legally be paused at a page boundary, in which case xrecoff is 0. Use XLByteEQ(..., InvalidXLogRecPtr) instead, which checks that both xlogid and xrecoff are 0. 9.3 doesn't have this problem because XLogRecPtr is now a single 64-bit integer, so XLogRecPtrIsInvalid() does the right thing. Apply to 9.2, and 9.1 where pg_stat_replication view was introduced. Kyotaro HORIGUCHI, reviewed by Fujii Masao.
*	Fix pg_resetxlog to use correct path to postmaster.pid.	Tom Lane	2012-11-22
\| \| \| \| \| \| \| \| \| \| \|	Since we've already chdir'd into the data directory, the file should be referenced as just "postmaster.pid", without prefixing the directory path. This is harmless in the normal case where an absolute PGDATA path is used, but quite dangerous if a relative path is specified, since the program might then fail to notice an active postmaster. Reported by Hari Babu. This got broken in my commit eb5949d190e80360386113fde0f05854f0c9824d, so patch all active versions.
*	Avoid bogus "out-of-sequence timeline ID" errors in standby-mode.	Heikki Linnakangas	2012-11-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When startup process opens a WAL segment after replaying part of it, it validates the first page on the WAL segment, even though the page it's really interested in later in the file. As part of the validation, it checks that the TLI on the page header is >= the TLI it saw on the last page it read. If the segment contains a timeline switch, and we have already replayed it, and then re-open the WAL segment (because of streaming replication got disconnected and reconnected, for example), the TLI check will fail when the first page is validated. Fix that by relaxing the TLI check when re-opening a WAL segment. Backpatch to 9.0. Earlier versions had the same code, but before standby mode was introduced in 9.0, recovery never tried to re-read a segment after partially replaying it. Reported by Amit Kapila, while testing a new feature.