postgresql - postgresql mirror

	Commit message	Author	Age
*	Remove silly completion for "DELETE FROM tabname ...".	Tom Lane	2015-12-20
\| \| \| \| \| \|	psql offered USING, WHERE, and SET in this context, but SET is not a valid possibility here. Seems to have been a thinko in commit f5ab0a14ea83eb6c which added DELETE's USING option.
*	Teach psql's tab completion to consider the entire input string.	Tom Lane	2015-12-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Up to now, the tab completion logic has only examined the last few words of the current input line; "last few" being originally as few as four words, but lately up to nine words. Furthermore, it only looked at what libreadline considers the current line of input, which made it rather myopic if you split your command across lines. This was tolerable, sort of, so long as the match patterns were only designed to consider the last few words of input; but with the recent addition of HeadMatches() and Matches() matching rules, we really have to do better if we want those to behave sanely. Hence, change the code to break the entire line down into words, and to include any previous lines in the command buffer along with the active readline input buffer. This will be a little bit slower than the previous coding, but some measurements say that even a query of several thousand characters can be parsed in a hundred or so microseconds on modern machines; so it's really not going to be significant for interactive tab completion. To reduce the cost some, I arranged to avoid the per-word malloc calls that used to occur: all the words are now kept in one malloc'd buffer.
*	psql: Review of new help output strings	Peter Eisentraut	2015-12-20
*	Add missing COSTS OFF to EXPLAIN commands in rowsecurity.sql.	Tom Lane	2015-12-19
\| \| \| \| \| \| \| \|	Commit e5e11c8cc added a bunch of EXPLAIN statements without COSTS OFF to the regression tests. This is contrary to project policy since it results in unnecessary platform dependencies in the output (it's just luck that we didn't get buildfarm failures from it). Per gripe from Mike Wilson.
*	Adopt a more compact, less error-prone notation for tab completion code.	Tom Lane	2015-12-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Replace tests like else if (pg_strcasecmp(prev4_wd, "CREATE") == 0 && pg_strcasecmp(prev3_wd, "TRIGGER") == 0 && (pg_strcasecmp(prev_wd, "BEFORE") == 0 \|\| pg_strcasecmp(prev_wd, "AFTER") == 0)) with new notation like this: else if (TailMatches4("CREATE", "TRIGGER", MatchAny, "BEFORE\|AFTER")) In addition, provide some macros COMPLETE_WITH_LISTn() to reduce the amount of clutter needed to specify a small number of predetermined completion alternatives. This makes the code substantially more compact: tab-complete.c gets over a thousand lines shorter in this patch, despite the addition of a couple of hundred lines of infrastructure for the new notations. The new way of specifying match rules seems a whole lot more readable and less error-prone, too. There's a lot more that could be done now to make matching faster and more reliable; for example I suspect that most of the TailMatches() rules should now be Matches() rules. That would allow them to be skipped after a single integer comparison if there aren't the right number of words on the line, and it would reduce the risk of unintended matches. But for now, (mostly) refrain from reworking any match rules in favor of just converting what we've got into the new notation. Thomas Munro, reviewed by Michael Paquier, some adjustments by me
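As a rough illustration of the idea behind the new notation (not the actual tab-complete.c code), the sketch below matches the trailing words of an input line against patterns that may list alternatives separated by `|`. The names `MATCH_ANY`, `word_matches`, and `tail_matches` are hypothetical stand-ins chosen for this example, and the array-based interface is an assumption; the real macros take a fixed number of arguments.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical marker meaning "any word matches here". */
#define MATCH_ANY NULL

/* Case-insensitively compare word against the first len bytes of alt. */
static bool
alt_equals(const char *alt, size_t len, const char *word)
{
    if (strlen(word) != len)
        return false;
    for (size_t i = 0; i < len; i++)
    {
        if (tolower((unsigned char) alt[i]) != tolower((unsigned char) word[i]))
            return false;
    }
    return true;
}

/* True if word matches pattern; pattern may list alternatives as "A|B". */
static bool
word_matches(const char *pattern, const char *word)
{
    assert(word != NULL);
    if (pattern == MATCH_ANY)
        return true;
    for (;;)
    {
        const char *bar = strchr(pattern, '|');
        size_t      len = bar ? (size_t) (bar - pattern) : strlen(pattern);

        if (alt_equals(pattern, len, word))
            return true;
        if (bar == NULL)
            return false;
        pattern = bar + 1;      /* try the next alternative */
    }
}

/* True if the last npatterns words match the patterns, in order. */
static bool
tail_matches(const char *const *words, size_t nwords,
             const char *const *patterns, size_t npatterns)
{
    if (nwords < npatterns)
        return false;
    for (size_t i = 0; i < npatterns; i++)
    {
        if (!word_matches(patterns[i], words[nwords - npatterns + i]))
            return false;
    }
    return true;
}
```

With the words `create trigger t1 after` and the patterns `"CREATE", "TRIGGER", MATCH_ANY, "BEFORE|AFTER"`, `tail_matches` returns true, which mirrors the trigger example in the message above.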
*	Fix tab completion for ALTER ... TABLESPACE ... OWNED BY.	Andres Freund	2015-12-19
\| \| \| \| \| \| \| \| \| \| \| \|	Previously the completion used the wrong word to match 'BY'. This was introduced brokenly, in b2de2a. While at it, also add completion of IN TABLESPACE ... OWNED BY and fix comments referencing nonexistent syntax. Reported-By: Michael Paquier Author: Michael Paquier and Andres Freund Discussion: CAB7nPqSHDdSwsJqX0d2XzjqOHr==HdWiubCi4L=Zs7YFTUne8w@mail.gmail.com Backpatch: 9.4, like the commit introducing the bug
*	Revert 9246af6799819847faa33baf441251003acbb8fe because	Teodor Sigaev	2015-12-18
\| \| \| \|	I miss too much. Patch is returned to commitfest process.
*	pgbench: Change terminology from "threshold" to "parameter".	Robert Haas	2015-12-18
\| \| \| \| \| \| \| \| \| \| \| \|	Per a recommendation from Tomas Vondra, it's more helpful to refer to the value that determines how skewed a Gaussian or exponential distribution is as a parameter rather than a threshold. Since it's not quite too late to get this right in 9.5, where it was introduced, back-patch this. Most of the patch changes only comments and documentation, but a few pgbench messages are altered to match. Fabien Coelho, reviewed by Michael Paquier and by me.
*	Remove duplicate word.	Robert Haas	2015-12-18
\| \| \| \|	Kyotaro Horiguchi
*	Fix TupleQueueReaderNext not to ignore its nowait argument.	Robert Haas	2015-12-18
\| \| \| \| \| \|	This was a silly goof on my (rhaas's) part. Report and fix by Rushabh Lathia.
*	Fix copy-and-paste error in logical decoding callback.	Robert Haas	2015-12-18
\| \| \| \| \| \| \|	This could result in the error context misidentifying where the error actually occurred. Craig Ringer
*	Fix typo in comment.	Robert Haas	2015-12-18
\| \| \| \|	Amit Langote
*	Allow to omit boundaries in array subscript	Teodor Sigaev	2015-12-18
\| \| \| \| \| \| \|	Allow omitting the lower bound, the upper bound, or both in an array subscript when selecting a slice of an array. Author: YUriy Zhuravlev
*	Remove unreferenced function declarations.	Tom Lane	2015-12-17
\| \| \| \| \| \|	datapagemap_create() and datapagemap_destroy() were declared extern, but they don't actually exist anywhere. Per YUriy Zhuravlev and Michael Paquier.
*	Use just one standalone-backend session for initdb's post-bootstrap steps.	Tom Lane	2015-12-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, each subroutine in initdb fired up its own standalone backend session. Over time we'd grown as many as fifteen of these sessions, and the cumulative startup and shutdown work for them was getting pretty noticeable. Combining things so that all these steps share a single backend session cuts a good 10% off the total runtime of initdb, more if you're not fsync'ing. The main stumbling block to doing this before was that some of the sessions were run with -j and some not. The improved definition of -j mode implemented by my previous commit makes it possible to fix that by running all the post-bootstrap steps with -j; we just have to use double instead of single newlines to end command strings. (This is only absolutely necessary around the VACUUM and CREATE DATABASE steps, since those can't be run in a transaction block. But it seems best to make them all use double newlines so that the commands remain separate for error-reporting purposes.) A minor disadvantage is that since initdb can't tell how much of its output the backend has executed, we can no longer have the per-step progress reporting initdb used to print. But things are fast enough nowadays that that's not really all that useful anyway. In passing, add more const decoration to some of the static arrays in initdb.c.
*	Adjust behavior of single-user -j mode for better initdb error reporting.	Tom Lane	2015-12-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, -j caused the entire input file to be read in and executed as a single command string. That's undesirable, not least because any error causes the entire file to be regurgitated as the "failing query". Some experimentation suggests a better rule: end the command string when we see a semicolon immediately followed by two newlines, ie, an empty line after a query. This serves nicely to break up the existing examples such as information_schema.sql and system_views.sql. A limitation is that it's no longer possible to write such a sequence within a string literal or multiline comment in a file meant to be read with -j; but there are no instances of such a problem within the data currently used by initdb. (If someone does make such a mistake in future, it'll be obvious because they'll get an unterminated-literal or unterminated-comment syntax error.) Other than that, there shouldn't be any negative consequences; you're not forced to end statements that way, it's just a better idea in most cases. In passing, remove src/include/tcop/tcopdebug.h, which is dead code because it's not included anywhere, and hasn't been for more than ten years. One of the debug-support symbols it purported to describe has been unreferenced for at least the same amount of time, and the other is removed by this commit on the grounds that it was useless: forcing -j mode all the time would have broken initdb. The lack of complaints about that, or about the missing inclusion, shows that no one has tried to use TCOP_DONTUSENEWLINE in many years.
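The splitting rule described above (a semicolon immediately followed by two newlines ends a command) can be sketched as follows. This is only an illustration under that stated rule, not the backend's actual input-reading code; `next_command_length` and `count_commands` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the length of the first command in buf, including its terminating
 * semicolon. If no ";\n\n" terminator is found, the rest of the buffer is
 * treated as one command.
 */
static size_t
next_command_length(const char *buf)
{
    const char *end;

    assert(buf != NULL);
    end = strstr(buf, ";\n\n");
    if (end == NULL)
        return strlen(buf);
    return (size_t) (end - buf) + 1;    /* keep the semicolon */
}

/* Count the commands in a script, skipping the newlines between them. */
static size_t
count_commands(const char *script)
{
    size_t      count = 0;

    while (*script != '\0')
    {
        script += next_command_length(script);
        count++;
        while (*script == '\n')
            script++;
    }
    return count;
}
```

Because the scan is purely textual, a `;` followed by an empty line inside a string literal also ends the command, which is the limitation the message points out.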
*	Fix improper initialization order for readline.	Tom Lane	2015-12-17
\| \| \| \| \| \| \| \| \|	Turns out we must set rl_basic_word_break_characters before we call rl_initialize() the first time, because it will quietly copy that value elsewhere --- but only on the first call. (Love these undocumented dependencies.) I broke this yesterday in commit 2ec477dc8108339d; like that commit, back-patch to all active branches. Per report from Pavel Stehule.
*	Rework internals of changing a type's ownership	Alvaro Herrera	2015-12-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is necessary so that REASSIGN OWNED does the right thing with composite types, to wit, that it also alters ownership of the type's pg_class entry -- previously, the pg_class entry remained owned by the original user, which later caused other failures such as the new owner's inability to use ALTER TYPE to rename an attribute of the affected composite. Also, if the original owner is later dropped, the pg_class entry becomes owned by a nonexistent user, which is bogus. To fix, create a new routine AlterTypeOwner_oid which knows whether to pass the request to ATExecChangeOwner or deal with it directly, and use that in shdepReassignOwner rather than calling AlterTypeOwnerInternal directly. AlterTypeOwnerInternal is now simpler in that it only modifies the pg_type entry and recurses to handle a possible array type; higher-level tasks are handled by either AlterTypeOwner directly or AlterTypeOwner_oid. I took the opportunity to add a few more objects to the test rig for REASSIGN OWNED, so that more cases are exercised. Additional ones could be added for superuser-only-ownable objects (such as FDWs and event triggers) but I didn't want to push my luck by adding a new superuser to the tests on a backpatchable bug fix. Per bug #13666 reported by Chris Pacejo. Backpatch to 9.5. (I would back-patch this all the way back, except that it doesn't apply cleanly in 9.4 and earlier because 59367fdf9 wasn't backpatched. If we decide that we need this in earlier branches too, we should backpatch both.)
*	Cope with Readline's failure to track SIGWINCH events outside of input.	Tom Lane	2015-12-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It emerges that libreadline doesn't notice terminal window size change events unless they occur while collecting input. This is easy to stumble over if you resize the window while using a pager to look at query output, but it can be demonstrated without any pager involvement. The symptom is that queries exceeding one line are misdisplayed during subsequent input cycles, because libreadline has the wrong idea of the screen dimensions. The safest, simplest way to fix this is to call rl_reset_screen_size() just before calling readline(). That causes an extra ioctl(TIOCGWINSZ) for every command; but since it only happens when reading from a tty, the performance impact should be negligible. A more valid objection is that this still leaves a tiny window during entry to readline() wherein delivery of SIGWINCH will be missed; but the practical consequences of that are probably negligible. In any case, there doesn't seem to be any good way to avoid the race, since readline exposes no functions that seem safe to call from a generic signal handler --- rl_reset_screen_size() certainly isn't. It turns out that we also need an explicit rl_initialize() call, else rl_reset_screen_size() dumps core when called before the first readline() call. rl_reset_screen_size() is not present in old versions of libreadline, so we need a configure test for that. (rl_initialize() is present at least back to readline 4.0, so we won't bother with a test for it.) We would need a configure test anyway since libedit's emulation of libreadline doesn't currently include such a function. Fortunately, libedit seems not to have any corresponding bug. Merlin Moncure, adjusted a bit by me
*	Speed up CREATE INDEX CONCURRENTLY's TID sort.	Robert Haas	2015-12-16
\| \| \| \| \| \| \| \| \|	Encode TIDs as 64-bit integers to speed up comparisons. This seems to speed things up on all platforms, but is even more beneficial when 8-byte integers are passed by value. Peter Geoghegan. Design suggestions and review by Tom Lane. Review also by Simon Riggs and by me.
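A minimal sketch of the encoding idea, assuming a simplified tuple identifier made of a 32-bit block number and a 16-bit offset. The `SimpleTid` struct and the function names are hypothetical and are not the definitions used in the PostgreSQL source. Packing the block number into the high bits means that plain integer comparison orders TIDs by block first and offset second.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simplified tuple identifier: block number plus offset. */
typedef struct
{
    uint32_t    block;
    uint16_t    offset;
} SimpleTid;

/* Pack block (high bits) and offset (low 16 bits) into one integer. */
static inline int64_t
tid_encode(SimpleTid tid)
{
    return (int64_t) (((uint64_t) tid.block << 16) | tid.offset);
}

/* Recover the block number and offset from an encoded value. */
static inline SimpleTid
tid_decode(int64_t encoded)
{
    SimpleTid   tid;

    assert(encoded >= 0);
    tid.block = (uint32_t) ((uint64_t) encoded >> 16);
    tid.offset = (uint16_t) (encoded & 0xFFFF);
    return tid;
}

/* Three-way comparison on encoded values: a single integer compare. */
static inline int
tid_compare_encoded(int64_t a, int64_t b)
{
    return (a > b) - (a < b);
}
```

The largest encoded value is 2^48 - 1, so the result always fits in a signed 64-bit integer without touching the sign bit.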
*	Mark CHECK constraints declared NOT VALID valid if created with table.	Robert Haas	2015-12-16
\| \| \| \| \| \| \| \|	FOREIGN KEY constraints have behaved this way for a long time, but for some reason the behavior of CHECK constraints has been inconsistent up until now. Amit Langote and Amul Sul, with assorted tweaks by me.
*	Teach mdnblocks() not to create zero-length files.	Robert Haas	2015-12-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's entirely surprising that mdnblocks() has the side effect of creating new files on disk, so let's make it not do that. One consequence of the old behavior is that, if running on a damaged cluster that is missing a file, mdnblocks() can recreate the file and allow a subsequent _mdfd_getseg() for a higher segment to succeed. This happens because, while mdnblocks() stops when it finds a segment that is shorter than 1GB, _mdfd_getseg() has no such check, and thus the empty file created by mdnblocks() can allow it to continue its traversal and find higher-numbered segments which remain. It might be a good idea for _mdfd_getseg() to actually verify that each segment it finds is exactly 1GB before proceeding to the next one, but that would involve some additional system calls, so for now I'm just doing this much. Patch by me, per off-list analysis by Kevin Grittner and Rahila Syed. Review by Andres Freund.
*	Move buffer I/O and content LWLocks out of the main tranche.	Robert Haas	2015-12-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the content lock directly into the BufferDesc, so that locking and pinning a buffer touches only one cache line rather than two. Adjust the definition of BufferDesc slightly so that this doesn't make the BufferDesc any larger than one cache line (at least on platforms where a spinlock is only 1 or 2 bytes). We can't fit the I/O locks into the BufferDesc and stay within one cache line, so move those to a completely separate tranche. This leaves a relatively limited number of LWLocks in the main tranche, so increase the padding of those remaining locks to a full cache line, rather than allowing adjacent locks to share a cache line, hopefully reducing false sharing. Performance testing shows that these changes make little difference on laptop-class machines, but help significantly on larger servers, especially those with more than 2 sockets. Andres Freund, originally based on an earlier patch by Simon Riggs. Review and cosmetic adjustments (including heavy rewriting of the comments) by me.
*	Provide a way to predefine LWLock tranche IDs.	Robert Haas	2015-12-15
\| \| \| \| \| \| \| \| \| \| \|	It's a bit cumbersome to use LWLockNewTrancheId(), because the returned value needs to be shared between backends so that each backend can call LWLockRegisterTranche() with the correct ID. So, for built-in tranches, use a hard-coded value instead. This is motivated by an upcoming patch adding further built-in tranches. Andres Freund and Robert Haas
*	Collect the global OR of hasRowSecurity flags for plancache	Stephen Frost	2015-12-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We carry around information about whether a given query has row security to allow the plancache to use that information to invalidate a planned query in the event that the environment changes. Previously, the flag of one of the subqueries was simply being copied into place to indicate if the query overall included RLS components. That's wrong as we need the global OR of all subqueries. Fix by changing the code to match how fireRIRules works, which results in OR'ing all of the flags. Noted by Tom. Back-patch to 9.5 where RLS was introduced.
*	Add missing cleanup logic in pg_rewind/t/005_same_timeline.pl test.	Tom Lane	2015-12-14
\| \| \| \|	Per Michael Paquier
*	Add missing CHECK_FOR_INTERRUPTS in lseg_inside_poly	Alvaro Herrera	2015-12-14
\| \| \| \| \| \| \| \| \|	Apparently, there are bugs in this code that cause it to loop endlessly. That bug still needs more research, but in the meantime it's clear that the loop is missing a check for interrupts so that it can be cancelled in a timely fashion. Backpatch to 9.1 -- this has been missing since 49475aab8d0d.
*	Remove xmlparse(document '') test	Kevin Grittner	2015-12-14
\| \| \| \| \| \| \| \| \| \| \| \|	This one test was behaving differently between the ubuntu fix for CVE-2015-7499 and the base "expected" file. It's not worth having yet another version of the expected file for this test, so drop it. Perhaps at some point when all distros have settled down to the same behavior on this test, it can be restored. Problem found by me on libxml2 (2.9.1+dfsg1-3ubuntu4.6). Solution suggested by Tom Lane. Backpatch to 9.5, where the test was added.
*	Fix out-of-memory error handling in ParameterDescription message processing.	Heikki Linnakangas	2015-12-14
\| \| \| \| \| \| \| \| \| \| \| \|	If libpq ran out of memory while constructing the result set, it would hang, waiting for more data from the server, which might never arrive. To fix, distinguish between out-of-memory error and not-enough-data cases, and give a proper error message back to the client on OOM. There are still similar issues in handling COPY start messages, but let's handle that as a separate patch. Michael Paquier, Amit Kapila and me. Backpatch to all supported versions.
*	Fix bug in SetOffsetVacuumLimit() triggered by find_multixact_start() failure.	Andres Freund	2015-12-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, if find_multixact_start() failed, SetOffsetVacuumLimit() would install 0 into MultiXactState->offsetStopLimit if it previously succeeded. Luckily, there are no known cases where find_multixact_start() will return an error in 9.5 and above. But if it were to happen, for example due to filesystem permission issues, it'd be somewhat bad: GetNewMultiXactId() could continue allocating mxids even if close to a wraparound, or it could erroneously stop allocating mxids, even if no wraparound is looming. The wrong value would be corrected the next time SetOffsetVacuumLimit() is called, or by a restart. Reported-By: Noah Misch, although this is not his preferred fix Discussion: 20151210140450.GA22278@alap3.anarazel.de Backpatch: 9.5, where the bug was introduced as part of 4f627f
*	Correct statement to actually be the intended assert statement.	Andres Freund	2015-12-14
\| \| \| \| \| \| \| \| \|	e3f4cfc7 introduced a LWLockHeldByMe() call, without the corresponding Assert() surrounding it. Spotted by Coverity. Backpatch: 9.1+, like the previous commit
*	Code and docs review for multiple -c and -f options in psql.	Tom Lane	2015-12-13
\| \| \| \| \| \| \| \| \| \| \|	Commit d5563d7df94488bf drew complaints from Coverity, which quite correctly complained that one copy of each -c or -f string was being leaked. What's more, simple_action_list_append was allocating enough space for still a third copy of each string as part of the SimpleActionListCell, even though that coding method had been superseded by a separate strdup operation. There were some other minor coding infelicities too. The documentation needed more work as well, eg it forgot to explain that -c causes psql not to accept any interactive input.
*	Consistently set all fields in pg_stat_replication to null instead of 0	Magnus Hagander	2015-12-13
\| \| \| \| \| \|	Previously the "sent" field would be set to 0 and all other xlog pointers be set to NULL if there were no valid values (such as when in a backup sending walsender).
*	Properly initialize write, flush and replay locations in walsender slots	Magnus Hagander	2015-12-13
\| \| \| \| \| \| \| \| \|	These would leak random xlog positions if a walsender used for backup would reuse a walsender slot previously used by a replication walsender. In passing also fix a couple of cases where the xlog pointer is directly compared to zero instead of using XLogRecPtrIsInvalid, noted by Michael Paquier.
*	Fix ALTER TABLE ... SET TABLESPACE for unlogged relations.	Andres Freund	2015-12-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Changing the tablespace of an unlogged relation did not WAL log the creation and content of the init fork. Thus, after a standby is promoted, unlogged relations cannot be accessed anymore, with errors like: ERROR: 58P01: could not open file "pg_tblspc/...": No such file or directory Additionally the init fork was not synced to disk, independent of the configured wal_level, a relatively small durability risk. Investigation of that problem also brought to light that, even for permanent relations, the creation of !main forks was not WAL logged, i.e. no XLOG_SMGR_CREATE records were emitted. That mostly turns out not to be a problem, because these files were created when the actual relation data is copied; nonexistent files are not treated as an error condition during replay. But that doesn't work for empty files, and generally feels a bit haphazard. Luckily, outside init and main forks, empty forks don't occur often or are not a problem. Add the required WAL logging and syncing to disk. Reported-By: Michael Paquier Author: Michael Paquier and Andres Freund Discussion: 20151210163230.GA11331@alap3.anarazel.de Backpatch: 9.1, where unlogged relations were introduced
*	Add an expected-file to match behavior of latest libxml2.	Tom Lane	2015-12-11
\| \| \| \| \| \| \| \| \| \| \|	Recent releases of libxml2 do not provide error context reports for errors detected at the very end of the input string. This appears to be a bug, or at least an infelicity, introduced by the fix for libxml2's CVE-2015-7499. We can hope that this behavioral change will get undone before too long; but the security patch is likely to spread a lot faster/further than any follow-on cleanup, which means this behavior is likely to be present in the wild for some time to come. As a stopgap, add a variant regression test expected-file that matches what you get with a libxml2 that acts this way.
*	pg_rewind: Don't error if the two clusters are already on the same timeline	Peter Eisentraut	2015-12-11
\| \| \| \| \|	This previously resulted in an error and a nonzero exit status, but after discussion this should rather be a noop with a zero exit status.
*	Fix REASSIGN OWNED for foreign user mappings	Alvaro Herrera	2015-12-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	As reported in bug #13809 by Alexander Ashurkov, the code for REASSIGN OWNED hadn't gotten word about user mappings. Deal with them in the same way default ACLs do, which is to ignore them altogether; they are handled just fine by DROP OWNED. The other foreign object cases are already handled correctly by both commands. Also add a REASSIGN OWNED statement to foreign_data test to exercise the foreign data objects. (The changes are just before the "cleanup" phase, so it shouldn't remove any existing live test.) Reported by Alexander Ashurkov, then independently by Jaime Casanova.
*	Handle policies during DROP OWNED BY	Stephen Frost	2015-12-11
\| \| \| \| \| \| \| \| \| \|	DROP OWNED BY handled GRANT-based ACLs but was not removing roles from policies. Fix that by having DROP OWNED BY remove the role specified from the list of roles the policy (or policies) apply to, or the entire policy (or policies) if it only applied to the role specified. As with ACLs, the DROP OWNED BY caller must have permission to modify the policy or a WARNING is thrown and no change is made to the policy.
*	Get rid of the planner's LateralJoinInfo data structure.	Tom Lane	2015-12-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I originally modeled this data structure on SpecialJoinInfo, but after commit acfcd45cacb6df23 that looks like a pretty poor decision. All we really need is relid sets identifying laterally-referenced rels; and most of the time, what we want to know about includes indirect lateral references, a case the LateralJoinInfo data was unsuited to compute with any efficiency. The previous commit redefined RelOptInfo.lateral_relids as the transitive closure of lateral references, so that it easily supports checking indirect references. For the places where we really do want just direct references, add a new RelOptInfo field direct_lateral_relids, which is easily set up as a copy of lateral_relids before we perform the transitive closure calculation. Then we can just drop lateral_info_list and LateralJoinInfo and the supporting code. This makes the planner's handling of lateral references noticeably more efficient, and shorter too. Such a change can't be back-patched into stable branches for fear of breaking extensions that might be looking at the planner's data structures; but it seems not too late to push it into 9.5, so I've done so.
*	Handle dependencies properly in ALTER POLICY	Stephen Frost	2015-12-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	ALTER POLICY hadn't fully considered partial policy alteration (eg: change just the roles on the policy, or just change one of the expressions) when rebuilding the dependencies. Instead, it would happily remove all dependencies which existed for the policy and then only recreate the dependencies for the objects referred to in the specific ALTER POLICY command. Correct that by extracting and building the dependencies for all objects referenced by the policy, regardless of whether they were provided as part of the ALTER POLICY command or were already in place as part of the pre-existing policy.
*	Still more fixes for planner's handling of LATERAL references.	Tom Lane	2015-12-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	More fuzz testing by Andreas Seltenreich exposed that the planner did not cope well with chains of lateral references. If relation X references Y laterally, and Y references Z laterally, then we will have to scan X on the inside of a nestloop with Z, so for all intents and purposes X is laterally dependent on Z too. The planner did not understand this and would generate intermediate joins that could not be used. While that was usually harmless except for wasting some planning cycles, under the right circumstances it would lead to "failed to build any N-way joins" or "could not devise a query plan" planner failures. To fix that, convert the existing per-relation lateral_relids and lateral_referencers relid sets into their transitive closures; that is, they now show all relations on which a rel is directly or indirectly laterally dependent. This not only fixes the chained-reference problem but allows some of the relevant tests to be made substantially simpler and faster, since they can be reduced to simple bitmap manipulations instead of searches of the LateralJoinInfo list. Also, when a PlaceHolderVar that is due to be evaluated at a join contains lateral references, we should treat those references as indirect lateral dependencies of each of the join's base relations. This prevents us from trying to join any individual base relations to the lateral reference source before the join is formed, which again cannot work. Andreas' testing also exposed another oversight in the "dangerous PlaceHolderVar" test added in commit 85e5e222b1dd02f1. Simply rejecting unsafe join paths in joinpath.c is insufficient, because in some cases we will end up rejecting all possible paths for a particular join, again leading to "could not devise a query plan" failures. The restriction has to be known also to join_is_legal and its cohort functions, so that they will not select a join for which that will happen. I chose to move the supporting logic into joinrels.c where the latter functions are. Back-patch to 9.3 where LATERAL support was introduced.
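The transitive-closure step described above can be sketched with plain bitmasks. This is an assumption-laden illustration, not planner code: relations are numbered 0 to 31, each relation's set of lateral references is one `uint32_t`, and `close_lateral_relids` is a hypothetical name. The planner itself works on its own relid set type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Expand each relation's direct lateral references into the set of all
 * relations it depends on directly or indirectly. Bit j of lateral[i]
 * means "relation i laterally references relation j".
 */
static void
close_lateral_relids(uint32_t *lateral, int nrels)
{
    bool        changed = true;

    assert(nrels >= 0 && nrels <= 32);
    while (changed)
    {
        changed = false;
        for (int i = 0; i < nrels; i++)
        {
            uint32_t    merged = lateral[i];

            /* Absorb the references of every relation i already points to. */
            for (int j = 0; j < nrels; j++)
            {
                if (lateral[i] & (UINT32_C(1) << j))
                    merged |= lateral[j];
            }
            if (merged != lateral[i])
            {
                lateral[i] = merged;
                changed = true;
            }
        }
    }
}
```

For the X, Y, Z chain in the message (X references Y, Y references Z), X's set grows from {Y} to {Y, Z}, after which "is X laterally dependent on Z" is a single bit test.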
*	Fix commit timestamp initialization	Alvaro Herrera	2015-12-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This module needs explicit initialization in order to replay WAL records in recovery, but we had broken this recently following changes to make other (stranger) scenarios work correctly. To fix, rework the initialization sequence so that it always takes place before WAL replay commences for both master and standby. I could have gone for a more localized fix that just added a "startup" call for the master server, but it seemed better to restructure the existing callers as well so that the whole thing made more sense. As a drawback, there is more control logic in xlog.c now than previously, but doing otherwise meant passing down the ControlFile flag, which seemed uglier as a whole. This also meant adding a check to not re-execute ActivateCommitTs if it had already been called. Reported by Fujii Masao. Backpatch to 9.5.
*	Improve some messages	Peter Eisentraut	2015-12-10
*	Improve ALTER POLICY tab completion.	Robert Haas	2015-12-10
\| \| \| \| \| \| \| \| \|	Complete "ALTER POLICY" with a policy name, as we do for DROP POLICY. And, complete "ALTER POLICY polname ON" with a table name that has such a policy, as we do for DROP POLICY, rather than with any table name at all. Masahiko Sawada
*	Fix ON CONFLICT UPDATE bug breaking AFTER UPDATE triggers.	Andres Freund	2015-12-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ExecOnConflictUpdate() passed t_ctid of the to-be-updated tuple to ExecUpdate(). That's problematic primarily because of two reasons: First and foremost t_ctid could point to a different tuple. Secondly, and that's what triggered the complaint by Stanislav, t_ctid is changed by heap_update() to point to the new tuple version. The behavior of AFTER UPDATE triggers was therefore broken, with NEW.* and OLD.* tuples spuriously identical within AFTER UPDATE triggers. To fix both issues, pass a pointer to t_self of an on-stack HeapTuple instead. Fixing this bug led to one change in regression tests, which previously failed due to the first issue mentioned above. There's a reasonable expectation that the test fails, as it updates one row repeatedly within one INSERT ... ON CONFLICT statement. That is only possible if the second update is triggered via ON CONFLICT ... SET, ON CONFLICT ... WHERE, or by a WITH CHECK expression, as those are executed after ExecOnConflictUpdate() does a visibility check. That could easily be prohibited, but given it's allowed for plain UPDATEs and a rare corner case, it doesn't seem worthwhile. Reported-By: Stanislav Grozev Author: Andres Freund and Peter Geoghegan Discussion: CAA78GVqy1+LisN-8DygekD_Ldfy=BJLarSpjGhytOsgkpMavfQ@mail.gmail.com Backpatch: 9.5, where ON CONFLICT was introduced
*	Fix bug leading to restoring unlogged relations from empty files.	Andres Freund	2015-12-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At the end of crash recovery, unlogged relations are reset to the empty state, using their init fork as the template. The init fork is copied to the main fork without going through shared buffers. Unfortunately WAL replay so far has not necessarily flushed writes from shared buffers to disk at that point. In normal crash recovery, and before the introduction of 'fast promotions' in fd4ced523 / 9.3, the END_OF_RECOVERY checkpoint flushes the buffers out in time. But with fast promotions that's not the case anymore. To fix, force WAL writes targeting the init fork to be flushed immediately (using the new FlushOneBuffer() function). In 9.5+ that flush can centrally be triggered from the code dealing with restoring full page writes (XLogReadBufferForRedoExtended), in earlier releases that responsibility is in the hands of XLOG_HEAP_NEWPAGE's replay function. Backpatch to 9.1, even if this currently is only known to trigger in 9.3+. Flushing earlier is more robust, and it is advantageous to keep the branches similar. Typical symptoms of this bug are errors like 'ERROR: index "..." contains unexpected zero page at block 0' shortly after promoting a node. Reported-By: Thom Brown Author: Andres Freund and Michael Paquier Discussion: 20150326175024.GJ451@alap3.anarazel.de Backpatch: 9.1-
*	Accept flex > 2.5.x on Windows, too.	Tom Lane	2015-12-10
\| \| \| \| \| \| \|	Commit 32f15d05c fixed this in configure, but missed the similar check in the MSVC scripts. Michael Paquier, per report from Victor Wagner
*	Allow EXPLAIN (ANALYZE, VERBOSE) to display per-worker statistics.	Robert Haas	2015-12-09
\| \| \| \| \| \| \| \| \| \| \|	The original parallel sequential scan commit included only very limited changes to the EXPLAIN output. Aggregated totals from all workers were displayed, but there was no way to see what each individual worker did or to distinguish the effort made by the workers from the effort made by the leader. Per a gripe by Thom Brown (and maybe others). Patch by me, reviewed by Amit Kapila.
*	Improve performance in freeing memory contexts	Kevin Grittner	2015-12-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The singly linked list of memory contexts could result in O(N^2) performance to free a set of contexts if they were not freed in reverse order of creation. In many cases the reverse order was used, but there were some significant exceptions that caused real-world performance problems. Rather than requiring all callers to care about the order in which contexts were freed, and hunting down and changing all existing cases where the wrong order was used, we add one pointer per memory context so that the implementation details are not so visible. Jan Wieck
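To illustrate why one extra pointer per context removes the quadratic behavior, the sketch below keeps siblings in a doubly linked list so that any child can be unlinked without walking the list. The `Context` struct and its field and function names are hypothetical, chosen for this example and not taken from the PostgreSQL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down memory context node. */
typedef struct Context
{
    struct Context *parent;
    struct Context *firstchild;
    struct Context *prevchild;  /* the extra back pointer */
    struct Context *nextchild;
} Context;

/* Insert child at the head of parent's child list. */
static void
context_link(Context *parent, Context *child)
{
    child->parent = parent;
    child->prevchild = NULL;
    child->nextchild = parent->firstchild;
    if (parent->firstchild != NULL)
        parent->firstchild->prevchild = child;
    parent->firstchild = child;
}

/* Remove child from its parent's list in constant time. */
static void
context_unlink(Context *child)
{
    Context    *parent = child->parent;

    assert(parent != NULL);
    if (child->prevchild != NULL)
        child->prevchild->nextchild = child->nextchild;
    else
        parent->firstchild = child->nextchild;
    if (child->nextchild != NULL)
        child->nextchild->prevchild = child->prevchild;
    child->parent = NULL;
    child->prevchild = NULL;
    child->nextchild = NULL;
}
```

With only a forward pointer, unlinking the oldest child means scanning from `firstchild` to find its predecessor, so freeing N children in creation order costs on the order of N^2 steps. With `prevchild`, each unlink touches only the node's immediate neighbors regardless of order.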