path: root/src/backend
...
* Reject nonzero day fields in AT TIME ZONE INTERVAL functions. (Tom Lane, 2013-01-31)
  It's not sensible for an interval that's used as a time zone value to be larger than a day. When we changed the interval type to contain a separate day field, check_timezone() was adjusted to reject nonzero day values, but timetz_izone(), timestamp_izone(), and timestamptz_izone() evidently were overlooked.

  While at it, make the error messages for these three cases consistent.
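
  A minimal SQL sketch of the behavior described above (the error wording is approximate, not quoted from the code):

      SELECT TIMESTAMP '2013-01-31 12:00' AT TIME ZONE INTERVAL '-5 hours';
      -- accepted: the zone offset fits within a day

      SELECT TIMESTAMP '2013-01-31 12:00' AT TIME ZONE INTERVAL '1 day 2 hours';
      -- ERROR (wording approximate): interval time zone must not include days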
* Fix plpgsql's reporting of plan-time errors in possibly-simple expressions. (Tom Lane, 2013-01-30)
  exec_simple_check_plan and exec_eval_simple_expr attempted to call GetCachedPlan directly. This meant that if an error was thrown during planning, the resulting context traceback would not include the line normally contributed by _SPI_error_callback. This is already inconsistent, but just to be really odd, a re-execution of the very same expression *would* show the additional context line, because we'd already have cached the plan and marked the expression as non-simple.

  The problem is easy to demonstrate in 9.2 and HEAD because planning of a cached plan doesn't occur at all until GetCachedPlan is done. In earlier versions, it could only be an issue if initial planning had succeeded, then a replan was forced (already somewhat improbable for a simple expression), and the replan attempt failed. Since the issue is mainly cosmetic in older branches anyway, it doesn't seem worth the risk of trying to fix it there. It is worth fixing in 9.2 since the instability of the context printout can affect the results of GET STACKED DIAGNOSTICS, as per a recent discussion on pgsql-novice.

  To fix, introduce a SPI function that wraps GetCachedPlan while installing the correct callback function. Use this instead of calling GetCachedPlan directly from plpgsql. Also introduce a wrapper function for extracting a SPI plan's CachedPlanSource list. This lets us stop including spi_priv.h in pl_exec.c, which was never a very good idea from a modularity standpoint.

  In passing, fix a similar inconsistency that could occur in SPI_cursor_open, which was also calling GetCachedPlan without setting up a context callback.
* Fix grammar for subscripting or field selection from a sub-SELECT result. (Tom Lane, 2013-01-30)
  Such cases should work, but the grammar failed to accept them because of our ancient precedence hacks to convince bison that extra parentheses around a sub-SELECT in an expression are unambiguous. (Formally, they *are* ambiguous, but we don't especially care whether they're treated as part of the sub-SELECT or part of the expression. Bison cares, though.) Fix by adding a redundant-looking production for this case.

  This is a fine example of why fixing shift/reduce conflicts via precedence declarations is more dangerous than it looks: you can easily cause the parser to reject cases that should work. This has been wrong since commit 3db4056e22b0c6b2adc92543baf8408d2894fe91 or maybe before, and apparently some people have been working around it by inserting no-op casts. That method introduces a dump/reload hazard, as illustrated in bug #7838 from Jan Mate. Hence, back-patch to all active branches.
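
  For illustration, the kind of query the new production is meant to accept, next to the no-op-cast workaround people had been using (examples are illustrative, not taken from the bug report):

      SELECT (SELECT ARRAY[1,2,3])[2];            -- subscripting a sub-SELECT result
      SELECT ((SELECT ARRAY[1,2,3])::int[])[2];   -- old workaround: redundant cast, with a dump/reload hazard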
* Provide database object names as separate fields in error messages. (Tom Lane, 2013-01-29)
  This patch addresses the problem that applications currently have to extract object names from possibly-localized textual error messages, if they want to know for example which index caused a UNIQUE_VIOLATION failure. It adds new error message fields to the wire protocol, which can carry the name of a table, table column, data type, or constraint associated with the error. (Since the protocol spec has always instructed clients to ignore unrecognized field types, this should not create any compatibility problem.)

  Support for providing these new fields has been added to just a limited set of error reports (mainly, those in the "integrity constraint violation" SQLSTATE class), but we will doubtless add them to more calls in future.

  Pavel Stehule, reviewed and extensively revised by Peter Geoghegan, with additional hacking by Tom Lane.
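
  A hedged PL/pgSQL sketch of consuming these fields on the server side; it assumes the companion GET STACKED DIAGNOSTICS items (CONSTRAINT_NAME, TABLE_NAME) that expose them, and a hypothetical table orders with a unique id column:

      DO $$
      DECLARE
        v_constraint text;
        v_table      text;
      BEGIN
        INSERT INTO orders (id) VALUES (1);
        INSERT INTO orders (id) VALUES (1);   -- duplicate key
      EXCEPTION WHEN unique_violation THEN
        GET STACKED DIAGNOSTICS v_constraint = CONSTRAINT_NAME,
                                v_table      = TABLE_NAME;
        RAISE NOTICE 'constraint % violated on table %', v_constraint, v_table;
      END $$;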
* Skip truncating ON COMMIT DELETE ROWS temp tables, if the transaction hasn't touched any temporary tables. (Heikki Linnakangas, 2013-01-29)
  We could try harder, and keep track of whether we've inserted to any temp tables, rather than accessed them, and which temp tables have been inserted to. But this is dead simple, and already covers many interesting scenarios.
* Fast promote mode skips checkpoint at end of recovery. (Simon Riggs, 2013-01-29)
  pg_ctl promote -m fast will skip the checkpoint at end of recovery so that we can achieve very fast failover when the apply delay is low. Write new WAL record XLOG_END_OF_RECOVERY to allow us to switch timeline correctly for downstream log readers. If we skip the synchronous end-of-recovery checkpoint, we request a normal spread checkpoint so that the window of re-recovery is low.

  Simon Riggs and Kyotaro Horiguchi, with input from Fujii Masao. Review by Heikki Linnakangas
* REASSIGN OWNED: handle shared objects, too (Alvaro Herrera, 2013-01-28)
  Give away ownership of shared objects (databases, tablespaces) along with local objects, per original code intention. Try to make the documentation clearer, too.

  Per discussion about DROP OWNED's brokenness, in bug #7748.

  This is not backpatched because it'd require some refactoring of the ALTER/SET OWNER code for databases and tablespaces.
* DROP OWNED: don't try to drop tablespaces/databases (Alvaro Herrera, 2013-01-28)
  My "fix" for bugs #7578 and #6116 on DROP OWNED at fe3b5eb08a1 not only misstated that it applied to REASSIGN OWNED (which it did not affect), but it also failed to fix the problems fully, because I didn't test the case of owned shared objects. Thus I created a new bug, reported by Thomas Kellerer as #7748, which would cause DROP OWNED to fail with a not-for-user-consumption error message. The code would attempt to drop the database, which not only fails to work because the underlying code does not support that, but is a pretty dangerous and undesirable thing to be doing as well.

  This patch fixes that bug by having DROP OWNED only attempt to process shared objects when grants on them are found, ignoring ownership. Backpatch to 8.3, which is as far as the previous bug was backpatched.
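
  A short SQL sketch of how the two commands now divide the work (role names hypothetical):

      REASSIGN OWNED BY olduser TO newuser;  -- transfers ownership, now including shared objects such as databases
      DROP OWNED BY olduser;                 -- drops remaining owned local objects and revokes the role's grants
      DROP ROLE olduser;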
* Make LATERAL implicit for functions in FROM. (Tom Lane, 2013-01-26)
  The SQL standard does not have general functions-in-FROM, but it does allow UNNEST() there (see the <collection derived table> production), and the semantics of that are defined to include lateral references. So spec compliance requires allowing lateral references within UNNEST() even without an explicit LATERAL keyword. Rather than making UNNEST() a special case, it seems best to extend this flexibility to any function-in-FROM. We'll still allow LATERAL to be written explicitly for clarity's sake, but it's now a noise word in this context.

  In theory this change could result in a change in behavior of existing queries, by allowing what had been an outer reference in a function-in-FROM to be captured by an earlier FROM-item at the same level. However, all pre-9.3 PG releases have a bug that causes them to match variable references to earlier FROM-items in preference to outer references (and then throw an error). So no previously-working query could contain the type of ambiguity that would risk a change of behavior.

  Per a suggestion from Andrew Gierth, though I didn't use his patch.
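
  For illustration, a query that now works without spelling out LATERAL (table and column names hypothetical):

      SELECT t.id, u.elem
      FROM   tbl AS t, unnest(t.arr) AS u(elem);   -- implicit lateral reference to t.arr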
* Update comments in new DROP IF EXISTS code; commit message update (Bruce Momjian, 2013-01-26)
  DROP IF EXISTS with a missing schema in commit 7e2322dff30c04d90c0602d2b5ae24b4881db88b applies not only to tables, but to DROP IF EXISTS with missing schemas for indexes, views, sequences, and foreign tables. Yeah!
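
  Illustrative statements now covered by that behavior (the schema and objects need not exist; each is skipped with a notice rather than failing):

      DROP TABLE IF EXISTS no_such_schema.no_such_table;
      DROP VIEW IF EXISTS no_such_schema.no_such_view;
      DROP SEQUENCE IF EXISTS no_such_schema.no_such_sequence;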
* Update LookupExplicitNamespace() comments; commit message update (Bruce Momjian, 2013-01-26)
  Also, commit 7e2322dff30c04d90c0602d2b5ae24b4881db88b affected DROP TABLE IF EXISTS, not CREATE TABLE IF EXISTS.
* Issue ERROR if FREEZE mode can't be honored by COPY (Bruce Momjian, 2013-01-26)
  Previously, a FREEZE request that could not be honored was silently ignored. This also issues an appropriate error message based on the cause of the failure, per suggestion from Tom. Additional regression test case added.
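
  A rough SQL sketch of the rule (file path and table names hypothetical; exact error wording not shown):

      BEGIN;
      CREATE TABLE fresh (i int);
      COPY fresh FROM '/tmp/data.csv' (FREEZE);        -- honored: table created in this transaction
      COMMIT;

      COPY preexisting FROM '/tmp/data.csv' (FREEZE);  -- now raises an ERROR instead of silently ignoring FREEZE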
* Allow CREATE TABLE IF NOT EXISTS to succeed if the schema is nonexistent (Bruce Momjian, 2013-01-26)
  Previously, CREATE TABLE IF NOT EXISTS threw an error if the schema was nonexistent. This was done by passing 'missing_ok' to the function that looks up the schema oid.
* Change plan caching to honor, not resist, changes in search_path. (Tom Lane, 2013-01-25)
  In the initial implementation of plan caching, we saved the active search_path when a plan was first cached, then reinstalled that path anytime we needed to reparse or replan. The idea of that was to try to reselect the same referenced objects, in somewhat the same way that views continue to refer to the same objects in the face of schema or name changes. Of course, that analogy doesn't bear close inspection, since holding the search_path fixed doesn't cope with object drops or renames. Moreover sticking with the old path seems to create more surprises than it avoids. So instead of doing that, consider that the cached plan depends on search_path, and force reparse/replan if the active search_path is different than it was when we last saved the plan.

  This gets us fairly close to having "transparency" of plan caching, in the sense that the cached statement acts the same as if you'd just resubmitted the original query text for another execution. There are still some corner cases where this fails though: a new object added in the search path schema(s) might capture a reference in the query text, but we'd not realize that and force a reparse. We might try to fix that in the future, but for the moment it looks too expensive and complicated.
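
  An illustrative sketch of the new behavior (schema and table names hypothetical):

      CREATE SCHEMA a; CREATE TABLE a.t (v text);
      CREATE SCHEMA b; CREATE TABLE b.t (v text);

      SET search_path = a;
      PREPARE q AS SELECT * FROM t;   -- plan initially resolves t as a.t
      SET search_path = b;
      EXECUTE q;                      -- replanned for the new path, so this now reads b.t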
* Add some randomness to the choice of which GiST page to insert to. (Heikki Linnakangas, 2013-01-25)
  When descending the tree for an insert, and there are multiple equally good pages we could insert to, make the choice at random. Previously, we would always choose the tuple with lowest offset number. That meant that when two non-leaf pages overlap - in the extreme case they might have exactly the same key - all but the first such page went unused. That wasn't optimal for space usage; if you deleted some tuples from the non-first pages, the space would never be reused.

  With this patch, the other pages are sometimes chosen too, although there's still a heavy bias towards low-offset tuples, so that we don't lose cache locality when doing a lot of inserts with similar keys.

  Original idea by Alexander Korotkov, although this patch version was written by me and copy-edited by Tom Lane.
* Fix concat() and format() to handle VARIADIC-labeled arguments correctly. (Tom Lane, 2013-01-25)
  Previously, the VARIADIC labeling was effectively ignored, but now these functions act as though the array elements had all been given as separate arguments.

  Pavel Stehule
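
  For illustration, with this fix the VARIADIC spelling behaves the same as passing the elements separately:

      SELECT concat(VARIADIC ARRAY['a', 'b', 'c']);                 -- 'abc', same as concat('a', 'b', 'c')
      SELECT format('%s, %s!', VARIADIC ARRAY['Hello', 'world']);   -- 'Hello, world!'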
* Fix SPI documentation for new handling of ExecutorRun's count parameter. (Tom Lane, 2013-01-24)
  Since 9.0, the count parameter has only limited the number of tuples actually returned by the executor. It doesn't affect the behavior of INSERT/UPDATE/DELETE unless RETURNING is specified, because without RETURNING, the ModifyTable plan node doesn't return control to execMain.c for each tuple. And we only check the limit at the top level.

  While this behavioral change was unintentional at the time, discussion of bug #6572 led us to the conclusion that we prefer the new behavior anyway, and so we should just adjust the docs to match rather than change the code. Accordingly, do that. Back-patch as far as 9.0 so that the docs match the code in each branch.
* Fix rare missing cancellations in Hot Standby. (Simon Riggs, 2013-01-24)
  The machinery around XLOG_HEAP2_CLEANUP_INFO failed to correctly pass through the necessary information on latestRemovedXid, avoiding cancellations in some infrequent concurrent update/cleanup scenarios. Backpatchable fix to 9.0.

  Detailed bug report and fix by Noah Misch, backpatchable version by me.
* Also fix rotation of csvlog on Windows. (Heikki Linnakangas, 2013-01-24)
  Backpatch to 9.2, like the previous fix.
* Fix failure to rotate postmaster log file for size reasons on Windows. (Tom Lane, 2013-01-23)
  When we eliminated "unnecessary" wakeups of the syslogger process, we broke size-based logfile rotation on Windows, because on that platform data transfer is done in a separate thread. While non-Windows platforms would recheck the output file size after every log message, Windows only did so when the control thread woke up for some other reason, which might be quite infrequent.

  Per bug #7814 from Tsunezumi. Back-patch to 9.2 where the problem was introduced.

  Jeff Janes
* Improve concurrency of foreign key locking (Alvaro Herrera, 2013-01-23)
  This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety.

  Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch.

  The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers.

  Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish.

  Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, too (there was previously a single WAL record for a locked tuple; now there are as many records as there are updated copies of the tuple).

  With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple, and later aborting, caused the weaker lock to be lost has been fixed.

  Many new spec files were added for the isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests.

  There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time on it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund.

  This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids:
    AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
    1290721684-sup-3951@alvh.no-ip.org
    1294953201-sup-2099@alvh.no-ip.org
    1320343602-sup-2290@alvh.no-ip.org
    1339690386-sup-8927@alvh.no-ip.org
    4FE5FF020200002500048A3D@gw.wicourts.gov
    4FEAB90A0200002500048B7D@gw.wicourts.gov
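
  A two-session SQL sketch of the new tuple lock modes (table and column names hypothetical):

      -- session 1: what a foreign-key check now takes on the referenced row
      BEGIN;
      SELECT * FROM parent WHERE id = 1 FOR KEY SHARE;

      -- session 2: an UPDATE that leaves the key column alone takes FOR NO KEY UPDATE,
      -- so it is not blocked by session 1
      UPDATE parent SET note = 'updated' WHERE id = 1;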
* Fix more issues with cascading replication and timeline switches. (Heikki Linnakangas, 2013-01-23)
  When a standby server follows the master using WAL archive, and it chooses a new timeline (recovery_target_timeline='latest'), it only fetches the timeline history file for the chosen target timeline, not any other history files that might be missing from pg_xlog. For example, if the current timeline is 2, and we choose 4 as the new recovery target timeline, the history file for timeline 3 is not fetched, even if it's part of this server's history. That's enough for the standby itself - the history file for timeline 4 includes timeline 3 as well - but if a cascading standby server wants to recover to timeline 3, it needs the history file. To fix, when a new recovery target timeline is chosen, try to copy any missing history files from the archive to pg_xlog between the old and new target timeline.

  A second similar issue was with the WAL files. When a standby recovers from archive, and it reaches a segment that contains a switch to a new timeline, recovery fetches only the WAL file labelled with the new timeline's ID. The file from the new timeline contains a copy of the WAL from the old timeline up to the point where the switch happened, and recovery recovers it from the new file. But in streaming replication, walsender only tries to read it from the old timeline's file. To fix, change walsender to read it from the new file, so that it behaves the same as recovery in that sense, and doesn't try to open the possibly nonexistent file with the old timeline's ID.
* Fix a few small bugs in yesterday's event trigger patch. (Robert Haas, 2013-01-22)
  Dimitri Fontaine
* Add infrastructure for storing a VARIADIC ANY function's VARIADIC flag. (Tom Lane, 2013-01-21)
  Originally we didn't bother to mark FuncExprs with any indication whether VARIADIC had been given in the source text, because there didn't seem to be any need for it at runtime. However, because we cannot fold a VARIADIC ANY function's arguments into an array (since they're not necessarily all the same type), we do actually need that information at runtime if VARIADIC ANY functions are to respond unsurprisingly to use of the VARIADIC keyword. Add the missing field, and also fix ruleutils.c so that VARIADIC ANY function calls are dumped properly.

  Extracted from a larger patch that also fixes concat() and format() (the only two extant VARIADIC ANY functions) to behave properly when VARIADIC is specified. This portion seems appropriate to review and commit separately.

  Pavel Stehule
* Add ddl_command_end support for event triggers. (Robert Haas, 2013-01-21)
  Dimitri Fontaine, with slight changes by me
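
  A minimal sketch of a ddl_command_end trigger (function and trigger names hypothetical):

      CREATE FUNCTION note_ddl_end() RETURNS event_trigger
      LANGUAGE plpgsql AS $$
      BEGIN
        RAISE NOTICE 'DDL command finished: %', tg_tag;
      END
      $$;

      CREATE EVENT TRIGGER track_ddl_end
        ON ddl_command_end
        EXECUTE PROCEDURE note_ddl_end();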
* Refactor ALTER some-obj RENAME implementation (Alvaro Herrera, 2013-01-21)
  Remove duplicate implementations of catalog munging and miscellaneous privilege checks. Instead rely on already existing data in objectaddress.c to do the work.

  Author: KaiGai Kohei, changes by me
  Reviewed by: Robert Haas, Álvaro Herrera, Dimitri Fontaine
* Fix error-checking typo in check_TSCurrentConfig(). (Tom Lane, 2013-01-20)
  The code failed to detect an out-of-memory failure.

  Xi Wang
* Fix an O(N^2) performance issue for sessions modifying many relations. (Tom Lane, 2013-01-20)
  AtEOXact_RelationCache() scanned the entire relation cache at the end of any transaction that created a new relation or assigned a new relfilenode. Thus, clients such as pg_restore had an O(N^2) performance problem that would start to be noticeable after creating 10000 or so tables. Since typically only a small number of relcache entries need any cleanup, we can fix this by keeping a small list of their OIDs and doing hash_searches for them. We fall back to the full-table scan if the list overflows.

  Ideally, the maximum list length would be set at the point where N hash_searches would cost just less than the full-table scan. Some quick experimentation says that point might be around 50-100; I (tgl) conservatively set MAX_EOXACT_LIST = 32. For the case that we're worried about here, which is short single-statement transactions, it's unlikely there would ever be more than about a dozen list entries anyway; so it's probably not worth being too tense about the value.

  We could avoid the hash_searches by instead keeping the target relcache entries linked into a list, but that would be noticeably more complicated and bug-prone because of the need to maintain such a list in the face of relcache entry drops. Since a relcache entry can only need such cleanup after a somewhat-heavyweight filesystem operation, trying to save a hash_search per cleanup doesn't seem very useful anyway --- it's the scan over all the not-needing-cleanup entries that we wish to avoid here.

  Jeff Janes, reviewed and tweaked a bit by Tom Lane
* Protect against SnapshotNow race conditions in pg_tablespace scans. (Tom Lane, 2013-01-18)
  Use of SnapshotNow is known to expose us to race conditions if the tuple(s) being sought could be updated by concurrently-committing transactions. CREATE DATABASE and DROP DATABASE are particularly exposed because they do heavyweight filesystem operations during their scans of pg_tablespace, so that the scans run for a very long time compared to most. Furthermore, the potential consequences of a missed or twice-visited row are nastier than average:

  * createdb() could fail with a bogus "file already exists" error, or silently fail to copy one or more tablespace's worth of files into the new database.

  * remove_dbtablespaces() could miss one or more tablespaces, thus failing to free filesystem space for the dropped database.

  * check_db_file_conflict() could likewise miss a tablespace, leading to an OID conflict that could result in data loss either immediately or in future operations. (This seems of very low probability, though, since a duplicate database OID would be unlikely to start with.)

  Hence, it seems worth fixing these three places to use MVCC snapshots, even though this will someday be superseded by a generic solution to SnapshotNow race conditions. Back-patch to all active branches.

  Stephen Frost and Tom Lane
* Unbreak lock conflict detection for Hot Standby. (Robert Haas, 2013-01-18)
  This got broken in the original fast-path locking patch, because I failed to account for the fact that the Hot Standby startup process might take a strong relation lock on a relation in a database to which it is not bound, and confused MyDatabaseId with the database ID of the relation being locked.

  Report and diagnosis by Andres Freund. Final form of patch by me.
* Fix off-by-one bug in xlog reading logic (Alvaro Herrera, 2013-01-18)
  Bug reported by Michael Paquier
  Author: Andres Freund
* Now that START_REPLICATION returns the next timeline's ID after reaching end of timeline, take advantage of that in walreceiver. (Heikki Linnakangas, 2013-01-18)
  Startup process is still in control of choosing the target timeline, by scanning the timeline history files present in pg_xlog, but walreceiver now uses the next timeline's ID to fetch its history file immediately after it has finished streaming the old timeline. Before, the standby would first try to restart streaming on the old timeline, which fetches the missing timeline history file as a side-effect, and only then restart from the new timeline. This patch eliminates the extra iteration, which speeds up the timeline switch and reduces the noise in the log caused by the extra restart on the old timeline.
* Use the right timeline when beginning to stream from master. (Heikki Linnakangas, 2013-01-18)
  The xlogreader refactoring broke the logic to decide which timeline to start streaming from. XLogPageRead() uses the timeline history to check which timeline the requested WAL position falls into. However, after the refactoring, XLogPageRead() is always first called with the first page in the segment, to verify the segment header, and only then with the actual WAL position we're interested in. That first read of the segment's header made XLogPageRead() always start streaming from the old timeline containing the segment header, not the timeline containing the actual record, if there was a timeline switch within the segment.

  I thought I fixed this yesterday, but that fix was too narrow and only fixed this for the corner-case that the timeline switch happened in the first page of the segment. To fix this more robustly, pass explicitly the position of the record we're actually interested in to XLogPageRead, and use that to decide which timeline to read from, rather than deduce it from the page and offset.

  Per report from Fujii Masao.
* When xlogreader asks the callback function to read a page, make sure we get a large enough part of the page to include the beginning of the next record we're interested in. (Heikki Linnakangas, 2013-01-17)
  The XLogPageRead callback uses the requested length to decide which timeline to stream WAL from, and if the first call is short, and the page contains a timeline switch, we'll repeatedly try to stream that page from the old timeline, and never get across the timeline switch.
* I added a result set to START_STREAMING command, but neglected walreceiver. (Heikki Linnakangas, 2013-01-17)
  The patch to allow pg_receivexlog to switch timeline added a result set after copy has ended in START_STREAMING command, to return the next timeline's ID to the client. But walreceiver didn't get the memo, and threw an error on the unexpected result set. Fix.
* Accelerate end-of-transaction dropping of relations (Alvaro Herrera, 2013-01-17)
  When relations are dropped, at end of transaction we need to remove the files and clean the buffer pool of buffers containing pages of those relations. Previously we would scan the buffer pool once per relation to clean up buffers. When there are many relations to drop, the repeated scans make this process slow; so we now instead pass a list of relations to drop and scan the pool once, checking each buffer against the passed list.

  When the number of relations is larger than a threshold (which as of this patch is being set to 20 relations) we sort the array before starting, and bsearch the array; when it's smaller, we simply scan the array linearly each time, because that's faster. The exact optimal threshold value depends on many factors, but the difference is not likely to be significant enough to justify making it user-settable.

  This has been measured to be a significant win (a 15x win when dropping 100,000 relations; an extreme case, but reportedly a real one).

  Author: Tomas Vondra, some tweaks by me
  Reviewed by: Robert Haas, Shigeru Hanada, Andres Freund, Álvaro Herrera
* Make pg_receivexlog and pg_basebackup -X stream work across timeline switches. (Heikki Linnakangas, 2013-01-17)
  This mirrors the changes done earlier to the server in standby mode. When receivelog reaches the end of a timeline, as reported by the server, it fetches the timeline history file of the next timeline, and restarts streaming from the new timeline by issuing a new START_STREAMING command.

  When pg_receivexlog crosses a timeline, it leaves the .partial suffix on the last segment on the old timeline. This helps you to tell apart a partial segment left in the directory because of a timeline switch, and a completed segment. If you just follow a single server, it won't make a difference, but it can be significant in more complicated scenarios where new WAL is still generated on the old timeline.

  This includes two small changes to the streaming replication protocol: First, when you reach the end of timeline while streaming, the server now sends the TLI of the next timeline in the server's history to the client. pg_receivexlog uses that as the next timeline, so that it doesn't need to parse the timeline history file like a standby server does. Second, when BASE_BACKUP command sends the begin and end WAL positions, it now also sends the timeline IDs corresponding to the positions.
* Improve memory space management in tuplesort and tuplestore. (Tom Lane, 2013-01-17)
  The code originally just doubled the size of the tuple-pointer array so long as that would fit in allowedMem. This could result in failing to use as much as half of allowedMem, if (as is typical) the last doubling attempt didn't quite fit. Worse, we might double the array size but be unable to use most of the added slots, because there was no room left within the allowedMem limit for tuples the slots should point to.

  To fix, double only so long as we've used less than half of allowedMem in total. Then do one more array enlargement, but scale it based on total memory consumption so far. This will work nicely as long as the average tuple size is reasonably stable, and in any case should be better than the old method.

  This change will result in large sort operations consuming a larger fraction of work_mem than they typically did in the past. The release notes should mention that users may want to revisit their work_mem settings, if they'd tuned those settings based on the old behavior of sorting.

  Jeff Janes, reviewed by Peter Geoghegan and Robert Haas
* Fix a couple of error-handling bugs in the xlogreader patch. (Heikki Linnakangas, 2013-01-17)
  XLogReadRecord should reset its state on every error, to make sure it re-reads the page on next call. It was inconsistent in that some errors did that, but some did not.

  In ReadRecord(), don't give up on an error if we're in standby mode. The loop was set up to retry, but the checks within the loop broke out of the loop on any error.

  Andres Freund, with some tweaking by me.
* Make GiST indexes on-disk compatible with 9.2 again. (Heikki Linnakangas, 2013-01-17)
  The patch that turned XLogRecPtr into a uint64 inadvertently changed the on-disk format of GiST indexes, because the NSN field in the GiST page opaque is an XLogRecPtr. That breaks pg_upgrade. Revert the format of that field back to the two-field struct that XLogRecPtr was before. This is the same we did to LSNs in the page header to avoid changing on-disk format.

  Bump catversion, as this invalidates any existing GiST indexes built on 9.3devel.
* Base the default SSL ciphers on DEFAULT instead of ALL (Magnus Hagander, 2013-01-17)
  It's better to start from what the OpenSSL people consider a good default and then remove insecure things (low encryption, exportable encryption and md5 at this point) from that, instead of starting from everything that exists and removing from that. We trust the OpenSSL people to make good choices about what the default is.
* Split out XLog reading as an independent facility (Alvaro Herrera, 2013-01-16)
  This new facility can not only be used by xlog.c to carry out crash recovery, but also by external programs. By supplying a function to read XLog pages from somewhere, all the WAL reading can be used for completely different purposes.

  For the standard backend use, the behavior should be pretty much the same as previously. As for non-backend programs, a hypothetical pg_xlogdump program is now closer to reality, but some more backend support is still necessary.

  This patch was originally submitted by Andres Freund in a different form, but Heikki Linnakangas opted for and authored another design of the concept. Andres has advanced the patch since Heikki's initial version. Review and some (mostly cosmetic) changes by me.
* Rework order of checks in ALTER / SET SCHEMA (Alvaro Herrera, 2013-01-15)
  When attempting to move an object into the schema in which it already was, for most object classes we were correctly complaining about exactly that ("object is already in schema"); but for some other object classes, such as functions, we were instead complaining of a name collision ("object already exists in schema"). The latter is wrong and misleading, per complaint from Robert Haas in CA+TgmoZ0+gNf7RDKRc3u5rHXffP=QjqPZKGxb4BsPz65k7qnHQ@mail.gmail.com

  To fix, refactor the way these checks are done. As a bonus, the resulting code is smaller and can also share some code with Rename cases.

  While at it, remove use of getObjectDescriptionOids() in error messages. These are normally disallowed because of translatability considerations, but this one had slipped through since 9.1. (Not sure that this is worth backpatching, though, as it would create some untranslated messages in back branches.)

  This is loosely based on a patch by KaiGai Kohei, heavily reworked by me.
* Fix hash_update_hash_key() to handle same-bucket case correctly. (Tom Lane, 2013-01-14)
  Original coding would corrupt the hashtable if the item being updated was at the end of its bucket chain and the new hash key hashed to that same bucket.

  Diagnosis and fix by Heikki Linnakangas.
* Return value of lseek() can be negative on failure. (Heikki Linnakangas, 2013-01-15)
  Because the return value of lseek() was assigned to an unsigned size_t variable, we'd fail to notice an error return code -1. Compiler gave a warning about this.

  Andres Freund
* Fix obsolete SQL syntax in comment. (Tom Lane, 2013-01-14)
  This was legal back in the days of add_missing_from, though perhaps never good style. It's not legal anymore ...

  Jan Urbański
* Reject out-of-range dates in to_date(). (Tom Lane, 2013-01-14)
  Dates outside the supported range could be entered, but would not print reasonably, and operations such as conversion to timestamp wouldn't behave sanely either. Since this has the potential to result in undumpable table data, it seems worth back-patching.

  Hitoshi Harada
* Remove spurious space (Alvaro Herrera, 2013-01-14)
  Andres Freund
* Prevent very-low-probability PANIC during PREPARE TRANSACTION. (Tom Lane, 2013-01-13)
  The code in PostPrepare_Locks supposed that it could reassign locks to the prepared transaction's dummy PGPROC by deleting the PROCLOCK table entries and immediately creating new ones. This was safe when that code was written, but since we invented partitioning of the shared lock table, it's not safe --- another process could steal away the PROCLOCK entry in the short interval when it's on the freelist. Then, if we were otherwise out of shared memory, PostPrepare_Locks would have to PANIC, since it's too late to back out of the PREPARE at that point.

  Fix by inventing a dynahash.c function to atomically update a hashtable entry's key. (This might possibly have other uses in future.)

  This is an ancient bug that in principle we ought to back-patch, but the odds of someone hitting it in the field seem really tiny, because (a) the risk window is small, and (b) nobody runs servers with maxed-out lock tables for long, because they'll be getting non-PANIC out-of-memory errors anyway. So fixing it in HEAD seems sufficient, at least until the new code has gotten some testing.
* Make spelling more uniform (Peter Eisentraut, 2013-01-13)