path: root/src/backend
...
* Fix subtly-wrong volatility checking in BeginCopyFrom(). (Tom Lane, 2013-11-08)

  contain_volatile_functions() is best applied to the output of expression_planner(), not its input, so that insertion of function default arguments and constant-folding have been done. (See comments at CheckMutability, for instance.) It's perhaps unlikely that anyone will notice a difference in practice, but still we should do it properly.

  In passing, change variable type from Node* to Expr* to reduce the net number of casts needed.

  Noted while perusing uses of contain_volatile_functions().

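  A hedged sketch of the corrected call order, using the function names the commit message itself mentions (an illustration, not a compilable backend excerpt; rawExpr is a hypothetical input):

      /* Preprocess first, so that default arguments are inserted and
       * constants folded; only then test the *result* for volatility. */
      Expr *expr = (Expr *) rawExpr;
      expr = expression_planner(expr);
      if (contain_volatile_functions((Node *) expr))
      {
          /* e.g. the expression must be re-evaluated for each row */
      }
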
* Be more robust when strerror() doesn't give a useful result. (Tom Lane, 2013-11-07)

  Back-patch commits 8e68816cc2567642c6fcca4eaac66c25e0ae5ced and 8dace66e0735ca39b779922d02c24ea2686e6521 into the stable branches. Buildfarm testing revealed no great portability surprises, and it seems useful to have this robustness improvement in all branches.

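  A standalone sketch of the general idea, assuming nothing about the back-patched commits' exact code: fall back to printing the raw errno when strerror() returns NULL or an empty string.

      #include <errno.h>
      #include <stdio.h>
      #include <string.h>

      static const char *robust_strerror(int errnum, char *buf, size_t buflen)
      {
          const char *msg = strerror(errnum);

          if (msg == NULL || *msg == '\0')
          {
              /* strerror() gave us nothing useful; report the number */
              snprintf(buf, buflen, "unrecognized error %d", errnum);
              msg = buf;
          }
          return msg;
      }

      int main(void)
      {
          char buf[64];

          printf("%s\n", robust_strerror(ENOENT, buf, sizeof(buf)));
          printf("%s\n", robust_strerror(999999, buf, sizeof(buf)));
          return 0;
      }
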
* Prevent display of dropped columns in row constraint violation messages. (Tom Lane, 2013-11-07)

  ExecBuildSlotValueDescription() printed "null" for each dropped column in a row being complained of by ExecConstraints(). This has some sanity in terms of the underlying implementation, but is of course pretty surprising to users.

  To fix, we must pass the target relation's descriptor to ExecBuildSlotValueDescription(), because the slot descriptor it had been using doesn't get labeled with attisdropped markers. Per bug #8408 from Maxim Boguk.

  Back-patch to 9.2 where the feature of printing row values in NOT NULL and CHECK constraint violation messages was introduced.

  Michael Paquier and Tom Lane

* Fix generation of MergeAppend plans for optimized min/max on expressions. (Tom Lane, 2013-11-07)

  Before jamming a desired targetlist into a plan node, one really ought to make sure the plan node can handle projections, and insert a buffering Result plan node if not. planagg.c forgot to do this, which is a hangover from the days when it only dealt with IndexScan plan types. MergeAppend doesn't project though, not to mention that it gets unhappy if you remove its possibly-resjunk sort columns.

  The code accidentally failed to fail for cases in which the min/max argument was a simple Var, because the new targetlist would be equivalent to the original "flat" tlist anyway. For any more complex case, it's been broken since 9.1 where we introduced the ability to optimize min/max using MergeAppend, as reported by Raphael Bauduin. Fix by duplicating the logic from grouping_planner that decides whether we need a Result node.

  In 9.2 and 9.1, this requires back-porting the tlist_same_exprs() function introduced in commit 4387cf956b9eb13aad569634e0c4df081d76e2e3, else we'd uselessly add a Result node in cases that worked before. It's rather tempting to back-patch that whole commit so that we can avoid extra Result nodes in mainline cases too; but I'll refrain, since that code hasn't really seen all that much field testing yet.

* Support default arguments and named-argument notation for window functions. (Tom Lane, 2013-11-06)

  These things didn't work because the planner omitted to do the necessary preprocessing of a WindowFunc's argument list. Add the few dozen lines of code needed to handle that.

  Although this sounds like a feature addition, it's really a bug fix because the default-argument case was likely to crash previously, due to lack of checking of the number of supplied arguments in the built-in window functions. It's not a security issue because there's no way for a non-superuser to create a window function definition with defaults that refers to a built-in C function, but nonetheless people might be annoyed that it crashes rather than producing a useful error message. So back-patch as far as the patch applies easily, which turns out to be 9.2. I'll put a band-aid in earlier versions as a separate patch.

  (Note that these features still don't work for aggregates, and fixing that case will be harder since we represent aggregate arg lists as target lists not bare expression lists. There's no crash risk though because CREATE AGGREGATE doesn't accept defaults, and we reject named-argument notation when parsing an aggregate call.)

* Improve the error message given for modifying a window with frame clause. (Tom Lane, 2013-11-05)

  For rather inscrutable reasons, SQL:2008 disallows copying-and-modifying a window definition that has any explicit framing clause. The error message we gave for this only made sense if the referencing window definition itself contains an explicit framing clause, which it might well not. Moreover, in the context of an OVER clause it's not exactly obvious that "OVER (windowname)" implies copy-and-modify while "OVER windowname" does not. This has led to multiple complaints, eg bug #5199 from Iliya Krapchatov.

  Change to a hopefully more intelligible error message, and in the case where we have just "OVER (windowname)", add a HINT suggesting that omitting the parentheses will fix it. Also improve the related documentation. Back-patch to all supported branches.

* Prevent memory leaks from accumulating across printtup() calls. (Tom Lane, 2013-11-03)

  Historically, printtup() has assumed that it could prevent memory leakage by pfree'ing the string result of each output function and manually managing detoasting of toasted values. This amounts to assuming that datatype output functions never leak any memory internally; an assumption we've already decided to be bogus elsewhere, for example in COPY OUT. range_out in particular is known to leak multiple kilobytes per call, as noted in bug #8573 from Godfried Vanluffelen.

  While we could go in and fix that leak, it wouldn't be very notationally convenient, and in any case there have been and undoubtedly will again be other leaks in other output functions. So what seems like the best solution is to run the output functions in a temporary memory context that can be reset after each row, as we're doing in COPY OUT. Some quick experimentation suggests this is actually a tad faster than the retail pfree's anyway.

  This patch fixes all the variants of printtup, except for debugtup() which is used in standalone mode. It doesn't seem worth worrying about query-lifespan leaks in standalone mode, and fixing that case would be a bit tedious since debugtup() doesn't currently have any startup or shutdown functions.

  While at it, remove manual detoast management from several other output-function call sites that had copied it from printtup(). This doesn't make a lot of difference right now, but in view of recent discussions about supporting "non-flattened" Datums, we're going to want that code gone eventually anyway.

  Back-patch to 9.2 where range_out was introduced. We might eventually decide to back-patch this further, but in the absence of known major leaks in older output functions, I'll refrain for now.

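  A minimal sketch of the per-row reset pattern described above, using the backend's MemoryContext API (a fragment, with the row loop and the output-function calls reduced to comments):

      MemoryContext scratch =
          AllocSetContextCreate(CurrentMemoryContext,
                                "output scratch",
                                ALLOCSET_DEFAULT_MINSIZE,
                                ALLOCSET_DEFAULT_INITSIZE,
                                ALLOCSET_DEFAULT_MAXSIZE);

      while (have_more_rows)           /* hypothetical loop condition */
      {
          MemoryContext old = MemoryContextSwitchTo(scratch);

          /* ... run the datatype output functions here; anything they
           * leak internally lands in scratch, not the caller's
           * context ... */

          MemoryContextSwitchTo(old);
          MemoryContextReset(scratch);  /* free the whole row's garbage */
      }
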
* Retry after buffer locking failure during SPGiST index creation. (Tom Lane, 2013-11-02)

  The original coding thought this case was impossible, but it can happen if the bgwriter or checkpointer processes decide to write out an index page while creation is still proceeding, leading to a bogus "unexpected spgdoinsert() failure" error. Problem reported by Jonathan S. Katz.

  Teodor Sigaev

* Ensure all files created for a single BufFile have the same resource owner. (Tom Lane, 2013-11-01)

  Callers expect that they only have to set the right resource owner when creating a BufFile, not during subsequent operations on it. While we could insist this be fixed at the caller level, it seems more sensible for the BufFile to take care of it. Without this, some temp files belonging to a BufFile can go away too soon, eg at the end of a subtransaction, leading to errors or crashes.

  Reported and fixed by Andres Freund. Back-patch to all active branches.

* Fix some odd behaviors when using a SQL-style simple GMT offset timezone. (Tom Lane, 2013-11-01)

  Formerly, when using a SQL-spec timezone setting with a fixed GMT offset (called a "brute force" timezone in the code), the session_timezone variable was not updated to match the nominal timezone; rather, all code was expected to ignore session_timezone if HasCTZSet was true. This is of course obviously fragile, though a search of the code finds only timeofday() failing to honor the rule.

  A bigger problem was that DetermineTimeZoneOffset() supposed that if its pg_tz parameter was pointer-equal to session_timezone, then HasCTZSet should override the parameter. This would cause datetime input containing an explicit zone name to be treated as referencing the brute-force zone instead, if the zone name happened to match the session timezone that had prevailed before installing the brute-force zone setting (as reported in bug #8572). The same malady could affect AT TIME ZONE operators.

  To fix, set up session_timezone so that it matches the brute-force zone specification, which we can do using the POSIX timezone definition syntax "<abbrev>offset", and get rid of the bogus lookaside check in DetermineTimeZoneOffset(). Aside from fixing the erroneous behavior in datetime parsing and AT TIME ZONE, this will cause the timeofday() function to print its result in the user-requested time zone rather than some previously-set zone. It might also affect results in third-party extensions, if there are any that make use of session_timezone without considering HasCTZSet, but in all cases the new behavior should be saner than before.

  Back-patch to all supported branches.

* Prevent using strncpy with src == dest in TupleDescInitEntry. (Tom Lane, 2013-10-28)

  The C and POSIX standards state that strncpy's behavior is undefined when source and destination areas overlap. While it remains dubious whether any implementations really misbehave when the pointers are exactly equal, some platforms are now starting to force the issue by complaining when an undefined call occurs. (In particular OS X 10.9 has been seen to dump core here, though the exact set of circumstances needed to trigger that remain elusive. Similar behavior can be expected to be optional on Linux and other platforms in the near future.) So tweak the code to explicitly do nothing when nothing need be done.

  Back-patch to all active branches. In HEAD, this also lets us get rid of an exception in valgrind.supp.

  Per discussion of a report from Matthias Schmitt.

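  A standalone sketch of the defensive pattern (the buffer size and names are illustrative): since strncpy() with overlapping, here identical, source and destination is undefined behavior, simply skip the call when there is nothing to do.

      #include <stdio.h>
      #include <string.h>

      #define NAMELEN 64

      static void set_name(char *dst, const char *src)
      {
          if (dst != src)              /* nothing need be done if equal */
          {
              strncpy(dst, src, NAMELEN - 1);
              dst[NAMELEN - 1] = '\0';
          }
      }

      int main(void)
      {
          char name[NAMELEN] = "attname";

          set_name(name, name);        /* safe no-op instead of UB */
          set_name(name, "newname");
          puts(name);
          return 0;
      }
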
* Fix two bugs in setting the vm bit of empty pages. (Heikki Linnakangas, 2013-10-23)

  Use a critical section when setting the all-visible flag on an empty page, and WAL-logging it. log_newpage_buffer() contains an assertion that it must be called inside a critical section, and it's the right thing to do when modifying a buffer anyway. Also, the page should be marked dirty before calling log_newpage_buffer(), per the comment in log_newpage_buffer() and src/backend/access/transam/README.

  Patch by Andres Freund, in response to my report. Backpatch to 9.2, like the patch that introduced these bugs (a6370fd9).

* Fix bugs in SSI tuple locking. (Heikki Linnakangas, 2013-10-08)

  1. In heap_hot_search_buffer(), the PredicateLockTuple() call is passed the wrong offset number. heapTuple->t_self is set to the tid of the first tuple in the chain that's visited, not the one actually being read.

  2. CheckForSerializableConflictIn() uses the tuple's t_ctid field instead of t_self to check for existing predicate locks on the tuple. If the tuple was updated, but the updater rolled back, t_ctid points to the aborted dead tuple.

  Reported by Hannu Krosing. Backpatch to 9.1.

* Translation updates (Peter Eisentraut, 2013-10-07)

* Eliminate xmin from hash tag for predicate locks on heap tuples. (Kevin Grittner, 2013-10-07)

  If a tuple was frozen while its predicate locks mattered, read-write dependencies could be missed, resulting in failure to detect conflicts which could lead to anomalies in committed serializable transactions.

  This field was added to the tag when we still thought that it was necessary to carry locks forward to a new version of an updated row. That was later proven to be unnecessary, which allowed simplification of the code, but elimination of xmin from the tag was missed at the time.

  Per report and analysis by Heikki Linnakangas. Backpatch to 9.1.

* Fix snapshot leak if lo_open called on non-existent object. (Heikki Linnakangas, 2013-09-30)

  lo_open registers the currently active snapshot, and checks if the large object exists after that. Normally, snapshots registered by lo_open are unregistered at end of transaction when the lo descriptor is closed, but if we error out before the lo descriptor is added to the list of open descriptors, it is leaked. Fix by moving the snapshot registration to after checking if the large object exists.

  Reported by Pavel Stehule. Backpatch to 8.4. The snapshot registration system was introduced in 8.4, so prior versions are not affected (and not supported, anyway).

* Fix spurious warning after vacuuming a page on a table with no indexes. (Heikki Linnakangas, 2013-09-26)

  There is a rare race condition, when a transaction that inserted a tuple aborts while vacuum is processing the page containing the inserted tuple. Vacuum prunes the page first, which normally removes any dead tuples, but if the inserting transaction aborts right after that, the loop after pruning will see a dead tuple and remove it instead. That's OK, but if the page is on a table with no indexes, and the page becomes completely empty after removing the dead tuple (or tuples) on it, it will be immediately marked as all-visible. That's OK, but the sanity check in vacuum would throw a warning because it thinks that the page contains dead tuples and was nevertheless marked as all-visible, even though it just vacuumed away the dead tuples and so it doesn't actually contain any.

  Spotted this while reading the code. It's difficult to hit the race condition otherwise, but can be done by putting a breakpoint after the heap_page_prune() call.

  Backpatch all the way to 8.4, where this code first appeared.

* Plug memory leak in range_cmp function. (Heikki Linnakangas, 2013-09-25)

  B-tree operators are not allowed to leak memory into the current memory context. Range_cmp leaked detoasted copies of the arguments. That caused a quick out-of-memory error when creating an index on a range column.

  Reported by Marian Krucina, bug #8468.

* Fix pgindent comment breakage (Alvaro Herrera, 2013-09-24)

* Ignore interrupts during quickdie(). (Noah Misch, 2013-09-11)

  Once the administrator has called for an immediate shutdown or a backend crash has triggered a reinitialization, no mere SIGINT or SIGTERM should change that course. Such derailment remains possible when the signal arrives before quickdie() blocks signals. That being a narrow race affecting most PostgreSQL signal handlers in some way, leave it for another patch. Back-patch this to all supported versions.

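  A generic, standalone sketch of the technique (an illustration only, not the backend's actual quickdie() code): once the handler is committed to exiting, block further SIGINT/SIGTERM so they cannot derail it.

      #include <signal.h>
      #include <unistd.h>

      static void fatal_handler(int signo)
      {
          sigset_t block;

          (void) signo;
          sigemptyset(&block);
          sigaddset(&block, SIGINT);
          sigaddset(&block, SIGTERM);
          sigprocmask(SIG_BLOCK, &block, NULL); /* no further derailment */

          /* ... only async-signal-safe reporting/cleanup here ... */
          _exit(2);
      }

      int main(void)
      {
          signal(SIGTERM, fatal_handler);
          pause();                      /* wait for a signal */
          return 0;
      }
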
* Don't fail for bad GUCs in CREATE FUNCTION with check_function_bodies off. (Tom Lane, 2013-09-03)

  The previous coding attempted to activate all the GUC settings specified in SET clauses, so that the function validator could operate in the GUC environment expected by the function body. However, this is problematic when restoring a dump, since the SET clauses might refer to database objects that don't exist yet. We already have the parameter check_function_bodies that's meant to prevent forward references in function definitions from breaking dumps, so let's change CREATE FUNCTION to not install the SET values if check_function_bodies is off.

  Authors of function validators were already advised not to make any "context sensitive" checks when check_function_bodies is off, if indeed they're checking anything at all in that mode. But extend the documentation to point out the GUC issue in particular.

  (Note that we still check the SET clauses to some extent; the behavior with !check_function_bodies is now approximately equivalent to what ALTER DATABASE/ROLE have been doing for awhile with context-dependent GUCs.)

  This problem can be demonstrated in all active branches, so back-patch all the way.

* Account better for planning cost when choosing whether to use custom plans. (Tom Lane, 2013-08-24)

  The previous coding in plancache.c essentially used 10% of the estimated runtime as its cost estimate for planning. This can be pretty bogus, especially when the estimated runtime is very small, such as in a simple expression plan created by plpgsql, or a simple INSERT ... VALUES.

  While we don't have a really good handle on how planning time compares to runtime, it seems reasonable to use an estimate based on the number of relations referenced in the query, with a rather large multiplier. This patch uses 1000 * cpu_operator_cost * (nrelations + 1), so that even a trivial query will be charged 1000 * cpu_operator_cost for planning. This should address the problem reported by Marc Cousin and others that 9.2 and up prefer custom plans in cases where the planning time greatly exceeds what can be saved.

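  A worked example of the new charge, plugging in the default cpu_operator_cost of 0.0025 (the variable names are illustrative):

      #include <stdio.h>

      int main(void)
      {
          double cpu_operator_cost = 0.0025;   /* default GUC value */

          for (int nrelations = 0; nrelations <= 3; nrelations++)
              printf("%d relation(s): planning cost = %.2f\n",
                     nrelations,
                     1000.0 * cpu_operator_cost * (nrelations + 1));
          /* 0 -> 2.50, 1 -> 5.00, 2 -> 7.50, 3 -> 10.00 */
          return 0;
      }
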
* Don't crash when pg_xlog is empty and pg_basebackup -x is used (Magnus Hagander, 2013-08-24)

  The backup will not work in this case (without a log archive, and that's the whole point of -x); this patch just changes it to throw an error instead of crashing when this happens.

  Noticed and diagnosed by TAKATSUKA Haruka

* In locate_grouping_columns(), don't expect an exact match of Var typmods. (Tom Lane, 2013-08-23)

  It's possible that inlining of SQL functions (or perhaps other changes?) has exposed typmod information not known at parse time. In such cases, Vars generated by query_planner might have valid typmod values while the original grouping columns only have typmod -1. This isn't a semantic problem since the behavior of grouping only depends on type not typmod, but it breaks locate_grouping_columns' use of tlist_member to locate the matching entry in query_planner's result tlist.

  We can fix this without an excessive amount of new code or complexity by relying on the fact that locate_grouping_columns only gets called when make_subplanTargetList has set need_tlist_eval == false, and that can only happen if all the grouping columns are simple Vars. Therefore we only need to search the sub_tlist for a matching Var, and we can reasonably define a "match" as being a match of the Var identity fields varno/varattno/varlevelsup. The code still Asserts that vartype matches, but ignores vartypmod.

  Per bug #8393 from Evan Martin. The added regression test case is basically the same as his example.

  This has been broken for a very long time, so back-patch to all supported branches.

* Make sure float4in/float8in accept all standard spellings of "infinity". (Tom Lane, 2013-08-03)

  The C99 and POSIX standards require strtod() to accept all these spellings (case-insensitively): "inf", "+inf", "-inf", "infinity", "+infinity", "-infinity". However, pre-C99 systems might accept only some or none of these, and apparently Windows still doesn't accept "inf". To avoid surprising cross-platform behavioral differences, manually check for each of these spellings if strtod() fails. We were previously handling just "infinity" and "-infinity" that way, but since C99 is most of the world now, it seems likely that applications are expecting all these spellings to work.

  Per bug #8355 from Basil Peace. It turns out this fix won't actually resolve his problem, because Python isn't being this careful; but that doesn't mean we shouldn't be.

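  A standalone sketch of the fallback, assuming a POSIX strcasecmp() (the function name parse_double is hypothetical):

      #include <math.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <strings.h>

      static int parse_double(const char *s, double *out)
      {
          char *end;

          *out = strtod(s, &end);
          if (end != s && *end == '\0')
              return 1;                 /* strtod handled it */

          /* Manual check for the standard infinity spellings. */
          if (!strcasecmp(s, "inf") || !strcasecmp(s, "+inf") ||
              !strcasecmp(s, "infinity") || !strcasecmp(s, "+infinity"))
          {
              *out = INFINITY;
              return 1;
          }
          if (!strcasecmp(s, "-inf") || !strcasecmp(s, "-infinity"))
          {
              *out = -INFINITY;
              return 1;
          }
          return 0;
      }

      int main(void)
      {
          const char *tests[] = {"1.5", "inf", "-Infinity", "bogus"};
          double d = 0;

          for (int i = 0; i < 4; i++)
          {
              int ok = parse_double(tests[i], &d);

              printf("%-10s -> %s (%g)\n", tests[i], ok ? "ok" : "error", d);
          }
          return 0;
      }
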
* Fix old visibility bug in HeapTupleSatisfiesDirty (Alvaro Herrera, 2013-08-02)

  If a tuple is locked but not updated by a concurrent transaction, HeapTupleSatisfiesDirty would return that transaction's Xid in xmax, causing callers to wait on it, when it is not necessary (in fact, if the other transaction had used a multixact instead of a plain Xid to mark the tuple, HeapTupleSatisfiesDirty would have behaved differently and *not* returned the Xmax).

  This bug was introduced in commit 3f7fbf85dc5b42, dated December 1998, so it's almost 15 years old now. However, it's hard to see this misbehave, because before we had NOWAIT the only consequence of this was that transactions would wait for slightly more time than necessary; so it's not surprising that this hasn't been reported yet.

  Craig Ringer and Andres Freund

* Fix regexp_matches() handling of zero-length matches. (Tom Lane, 2013-07-31)

  We'd find the same match twice if it was of zero length and not immediately adjacent to the previous match. replace_text_regexp() got similar cases right, so adjust this search logic to match that. Note that even though the regexp_split_to_xxx() functions share this code, they did not display equivalent misbehavior, because the second match would be considered degenerate and ignored.

  Jeevan Chalke, with some cosmetic changes by me.

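  A standalone sketch of the search-loop rule at issue, written against POSIX <regex.h> rather than the backend's regex code: after a zero-length match, resume the search one character later; otherwise resume at the match end, so the same empty match is never found twice.

      #include <regex.h>
      #include <stdio.h>
      #include <string.h>

      int main(void)
      {
          regex_t re;
          regmatch_t m;
          const char *s = "foobar";
          size_t len = strlen(s), off = 0;

          if (regcomp(&re, "o*", REG_EXTENDED) != 0)
              return 1;
          while (off <= len &&
                 regexec(&re, s + off, 1, &m, off > 0 ? REG_NOTBOL : 0) == 0)
          {
              size_t mstart = off + (size_t) m.rm_so;
              size_t mend = off + (size_t) m.rm_eo;

              printf("match [%zu,%zu): \"%.*s\"\n",
                     mstart, mend, (int) (mend - mstart), s + mstart);
              /* The key rule: step past zero-length matches. */
              off = (mend == mstart) ? mstart + 1 : mend;
          }
          regfree(&re);
          return 0;
      }
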
* Restore REINDEX constraint validation. (Noah Misch, 2013-07-30)

  Refactoring as part of commit 8ceb24568054232696dddc1166a8563bc78c900a had the unintended effect of making REINDEX TABLE and REINDEX DATABASE no longer validate constraints enforced by the indexes in question; REINDEX INDEX still did so. Indexes marked invalid remained so, and constraint violations arising from data corruption went undetected.

  Back-patch to 9.0, like the causative commit.

* Fix booltestsel() for case where we have NULL stats but not MCV stats. (Tom Lane, 2013-07-24)

  In a boolean column that contains mostly nulls, ANALYZE might not find enough non-null values to populate the most-common-values stats, but it would still create a pg_statistic entry with stanullfrac set. The logic in booltestsel() for this situation did the wrong thing for "col IS NOT TRUE" and "col IS NOT FALSE" tests, forgetting that null values would satisfy these tests (so that the true selectivity would be close to one, not close to zero).

  Per bug #8274. Fix by Andrew Gierth, some comment-smithing by me.

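  Worked numbers for the situation described (the fractions are made up for illustration):

      #include <stdio.h>

      int main(void)
      {
          double nullfrac   = 0.90;                        /* stanullfrac */
          double freq_true  = 0.04;
          double freq_false = 1.0 - nullfrac - freq_true;  /* 0.06 */

          /* NULL satisfies IS NOT TRUE and IS NOT FALSE: */
          printf("IS TRUE      -> %.2f\n", freq_true);        /* 0.04 */
          printf("IS NOT TRUE  -> %.2f\n", 1.0 - freq_true);  /* 0.96, not 0.06 */
          printf("IS NOT FALSE -> %.2f\n", 1.0 - freq_false); /* 0.94, not 0.04 */
          return 0;
      }
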
* Change post-rewriter representation of dropped columns in joinaliasvars. (Tom Lane, 2013-07-23)

  It's possible to drop a column from an input table of a JOIN clause in a view, if that column is nowhere actually referenced in the view. But it will still be there in the JOIN clause's joinaliasvars list. We used to replace such entries with NULL Const nodes, which is handy for generation of RowExpr expansion of a whole-row reference to the view. The trouble with that is that it can't be distinguished from the situation after subquery pull-up of a constant subquery output expression below the JOIN. Instead, replace such joinaliasvars with null pointers (empty expression trees), which can't be confused with pulled-up expressions. expandRTE() still emits the old convention, though, for convenience of RowExpr generation and to reduce the risk of breaking extension code.

  In HEAD and 9.3, this patch also fixes a problem with some new code in ruleutils.c that was failing to cope with implicitly-casted joinaliasvars entries, as per recent report from Feike Steenbergen. That oversight was because of an inadequate description of the data structure in parsenodes.h, which I've now corrected. There were some pre-existing oversights of the same ilk elsewhere, which I believe are now all fixed.

* Fix regex match failures for backrefs combined with non-greedy quantifiers. (Tom Lane, 2013-07-18)

  An ancient logic error in cfindloop() could cause the regex engine to fail to find matches that begin later than the start of the string. This function is only used when the regex pattern contains a back reference, and so far as we can tell the error is only reachable if the pattern is non-greedy (i.e. its first quantifier uses the ? modifier). Furthermore, the actual match must begin after some potential match that satisfies the DFA but then fails the back-reference's match test.

  Reported and fixed by Jeevan Chalke, with cosmetic adjustments by me.

* Ensure 64bit arithmetic when calculating tapeSpace (Stephen Frost, 2013-07-14)

  In tuplesort.c:inittapes(), we calculate tapeSpace by first figuring out how many 'tapes' we can use (maxTapes) and then multiplying the result by the tape buffer overhead for each. Unfortunately, when we are on a system with an 8-byte long, we allow work_mem to be larger than 2GB and that allows maxTapes to be large enough that the 32bit arithmetic can overflow when multiplied against the buffer overhead.

  When this overflow happens, we end up adding the overflow to the amount of space available, causing the amount of memory allocated to be larger than work_mem.

  Note that to reach this point, you have to set work_mem to at least 24GB and be sorting a set which is at least that size. Given that a user who can set work_mem to 24GB could also set it even higher, if they were looking to run the system out of memory, this isn't considered a security issue.

  This overflow risk was found by the Coverity scanner.

  Back-patch to all supported branches, as this issue has existed since before 8.4.

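  A standalone illustration of the overflow (the numbers are made up but in the ballpark the commit describes, work_mem around 24GB; the per-tape overhead value is hypothetical):

      #include <stdint.h>
      #include <stdio.h>

      int main(void)
      {
          int maxTapes = 400000;
          int per_tape_overhead = 32768;

          /* The 32-bit product wraps around (computed unsigned here so
           * the wraparound is well-defined for the demo): */
          int32_t wrong = (int32_t)
              ((uint32_t) maxTapes * (uint32_t) per_tape_overhead);

          /* Widening one operand first keeps the product in 64 bits: */
          int64_t right = (int64_t) maxTapes * per_tape_overhead;

          printf("32-bit: %d bytes\n", (int) wrong);          /* 222298112 */
          printf("64-bit: %lld bytes\n", (long long) right);  /* 13107200000 */
          return 0;
      }
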
* Fix planning of parameterized appendrel paths with expensive join quals. (Tom Lane, 2013-07-07)

  The code in set_append_rel_pathlist() for building parameterized paths for append relations (inheritance and UNION ALL combinations) supposed that the cheapest regular path for a child relation would still be cheapest when reparameterized. Which might not be the case, particularly if the added join conditions are expensive to compute, as in a recent example from Jeff Janes. Fix it to compare child path costs *after* reparameterizing. We can short-circuit that if the cheapest pre-existing path is already parameterized correctly, which seems likely to be true often enough to be worth checking for.

  Back-patch to 9.2 where parameterized paths were introduced.

* Support clean switchover. (Fujii Masao, 2013-06-26)

  In replication, when we shut down the master, walsender tries to send all the outstanding WAL records to the standby, and then to exit. This basically means that all the WAL records are fully synced between two servers after the clean shutdown of the master. So, after promoting the standby to new master, we can restart the stopped master as new standby without the need for a fresh backup from new master.

  But there was one problem so far: though walsender tries to send all the outstanding WAL records, it doesn't wait for them to be replicated to the standby. Then, before receiving all the WAL records, walreceiver can detect the closure of connection and exit. We cannot guarantee that there is no missing WAL in the standby after clean shutdown of the master. In this case, backup from new master is required when restarting the stopped master as new standby.

  This patch fixes this problem. It just changes walsender so that it waits for all the outstanding WAL records to be replicated to the standby before closing the replication connection.

  Per discussion, this is a fix that needs to get backpatched rather than a new feature. So, back-patch to 9.1 where enough infrastructure for this exists.

  Patch by me, reviewed by Andres Freund.

* Ensure no xid gaps during Hot Standby startup (Simon Riggs, 2013-06-23)

  In some cases with higher numbers of subtransactions it was possible for us to incorrectly initialize subtrans, leading to complaints of missing pages.

  Bug report by Sergey Konoplev. Analysis and fix by Andres Freund.

* Avoid deadlocks during insertion into SP-GiST indexes. (Tom Lane, 2013-06-14)

  SP-GiST's original scheme for avoiding deadlocks during concurrent index insertions doesn't work, as per report from Hailong Li, and there isn't any evident way to make it work completely. We could possibly lock individual inner tuples instead of their whole pages, but preliminary experimentation suggests that the performance penalty would be huge. Instead, if we fail to get a buffer lock while descending the tree, just restart the tree descent altogether. We keep the old tuple positioning rules, though, in hopes of reducing the number of cases where this can happen.

  Teodor Sigaev, somewhat edited by Tom Lane

* Only install a portal's ResourceOwner if it actually has one. (Tom Lane, 2013-06-13)

  In most scenarios a portal without a ResourceOwner is dead and not subject to any further execution, but a portal for a cursor WITH HOLD remains in existence with no ResourceOwner after the creating transaction is over. In this situation, if we attempt to "execute" the portal directly to fetch data from it, we were setting CurrentResourceOwner to NULL, leading to a segfault if the datatype output code did anything that required a resource owner (such as trying to fetch system catalog entries that weren't already cached). The case appears to be impossible to provoke with stock libpq, but psqlODBC at least is able to cause it when working with held cursors.

  Simplest fix is to just skip the assignment to CurrentResourceOwner, so that any resources used by the data output operations will be managed by the transaction-level resource owner instead. For consistency I changed all the places that install a portal's resowner as current, even though some of them are probably not reachable with a held cursor's portal.

  Per report from Joshua Berry (with thanks to Hiroshi Inoue for developing a self-contained test case). Back-patch to all supported versions.

* Fix cache flush hazard in cache_record_field_properties(). (Tom Lane, 2013-06-11)

  We need to increment the refcount on the composite type's cached tuple descriptor while we do lookups of its column types. Otherwise a cache flush could occur and release the tuple descriptor before we're done with it. This fails reliably with -DCLOBBER_CACHE_ALWAYS, but the odds of a failure in a production build seem rather low (since the pfree'd descriptor typically wouldn't get scribbled on immediately). That may explain the lack of any previous reports. Buildfarm issue noted by Christian Ullrich.

  Back-patch to 9.1 where the bogus code was added.

* Remove unnecessary restrictions about RowExprs in transformAExprIn(). (Tom Lane, 2013-06-09)

  When the existing code here was written, it made sense to special-case RowExprs because that was the only way that we could handle row comparisons at all. Now that we have record_eq() and arrays of composites, the generic logic for "scalar" types will in fact work on RowExprs too, so there's no reason to throw error for combinations of RowExprs and other ways of forming composite values, nor to ignore the possibility of using a ScalarArrayOpExpr. But keep using the old logic when comparing two RowExprs, for consistency with the main transformAExprOp() logic. (This allows some cases with not-quite-identical rowtypes to succeed, so we might get push-back if we removed it.)

  Per bug #8198 from Rafal Rzepecki. Back-patch to all supported branches, since this works fine as far back as 8.4.

  Rafal Rzepecki and Tom Lane

* Remove ALTER DEFAULT PRIVILEGES' requirement of schema CREATE permissions. (Tom Lane, 2013-06-09)

  Per discussion, this restriction isn't needed for any real security reason, and it seems to confuse people more often than it helps them. It could also result in some database states being unrestorable. So just drop it.

  Back-patch to 9.0, where ALTER DEFAULT PRIVILEGES was introduced.

* Remove fixed limit on the number of concurrent AllocateFile() requests. (Tom Lane, 2013-06-09)

  AllocateFile(), AllocateDir(), and some sister routines share a small array for remembering requests, so that the files can be closed on transaction failure. Previously that array had a fixed size, MAX_ALLOCATED_DESCS (32). While historically that had seemed sufficient, Steve Toutant pointed out that this meant you couldn't scan more than 32 file_fdw foreign tables in one query, because file_fdw depends on the COPY code which uses AllocateFile(). There are probably other cases, or will be in the future, where this nonconfigurable limit impedes users.

  We can't completely remove any such limit, at least not without a lot of work, since each such request requires a kernel file descriptor and most platforms limit the number we can have. (In principle we could "virtualize" these descriptors, as fd.c already does for the main VFD pool, but not without an additional layer of overhead and a lot of notational impact on the calling code.) But we can at least let the array size be configurable. Hence, change the code to allow up to max_safe_fds/2 allocated file requests. On modern platforms this should allow several hundred concurrent file_fdw scans, or more if one increases the value of max_files_per_process. To go much further than that, we'd need to do some more work on the data structure, since the current code for closing requests has potentially O(N^2) runtime; but it should still be all right for request counts in this range.

  Back-patch to 9.1 where contrib/file_fdw was introduced.

* Don't downcase non-ascii identifier chars in multi-byte encodings. (Andrew Dunstan, 2013-06-08)

  Long-standing code has called tolower() on identifier character bytes with the high bit set. This is clearly an error and produces junk output when the encoding is multi-byte. This patch therefore restricts this activity to cases where there is a character with the high bit set AND the encoding is single-byte.

  There have been numerous gripes about this, most recently from Martin Schäfer.

  Backpatch to all live releases.

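  A minimal standalone sketch of the corrected rule (names are illustrative): ASCII letters are always downcased, but bytes with the high bit set are only downcased when the encoding is single-byte, since in a multi-byte encoding they are fragments of a larger character.

      #include <ctype.h>
      #include <stdio.h>

      static unsigned char
      downcase_byte(unsigned char ch, int encoding_is_single_byte)
      {
          if (ch >= 'A' && ch <= 'Z')
              return (unsigned char) (ch + ('a' - 'A'));
          if (ch >= 0x80 && encoding_is_single_byte)
              return (unsigned char) tolower(ch);  /* locale-dependent */
          return ch;                /* leave multi-byte fragments alone */
      }

      int main(void)
      {
          printf("%c\n", downcase_byte('Q', 0));       /* q */
          printf("0x%02X\n", downcase_byte(0xC3, 0));  /* 0xC3, untouched */
          return 0;
      }
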
* Fix typo in comment. (Heikki Linnakangas, 2013-06-06)

* Ensure that XLOG_HEAP2_VISIBLE always targets an initialized page. (Robert Haas, 2013-06-06)

  Andres Freund

* Backport log_newpage_buffer. (Robert Haas, 2013-06-06)

  Andres' fix for XLOG_HEAP2_VISIBLE on uninitialized pages requires this.

* Prevent pushing down WHERE clauses into unsafe UNION/INTERSECT nests. (Tom Lane, 2013-06-05)

  The planner is aware that it mustn't push down upper-level quals into subqueries if the quals reference subquery output columns that contain set-returning functions or volatile functions, or are non-DISTINCT outputs of a DISTINCT ON subquery. However, it missed making this check when there were one or more levels of UNION or INTERSECT above the dangerous expression. This could lead to "set-valued function called in context that cannot accept a set" errors, as seen in bug #8213 from Eric Soroos, or to silently wrong answers in the other cases.

  To fix, refactor the checks so that we make the column-is-unsafe checks during subquery_is_pushdown_safe(), which already has to recursively inspect all arms of a set-operation tree. This makes qual_is_pushdown_safe() considerably simpler, at the cost that we will spend some cycles checking output columns that possibly aren't referenced in any upper qual. But the cases where this code gets executed at all are already nontrivial queries, so it's unlikely anybody will notice any slowdown of planning.

  This has been broken since commit 05f916e6add9726bf4ee046e4060c1b03c9961f2, which makes the bug over ten years old. A bit surprising nobody noticed it before now.

* Put analyze_keyword back in explain_option_name production. (Tom Lane, 2013-06-05)

  In commit 2c92edad48796119c83d7dbe6c33425d1924626d, I broke "EXPLAIN (ANALYZE)" syntax, because I mistakenly thought that ANALYZE/ANALYSE were only partially reserved and thus would be included in NonReservedWord; but actually they're fully reserved so they still need to be called out here.

  A nicer solution would be to demote these words to type_func_name_keyword status (they can't be less than that because of "VACUUM [ANALYZE] ColId"). While that works fine so far as the core grammar is concerned, it breaks ECPG's grammar for reasons I don't have time to isolate at the moment. So do this for the time being.

  Per report from Kevin Grittner. Back-patch to 9.0, like the previous commit.

* Provide better message when CREATE EXTENSION can't find a target schema. (Tom Lane, 2013-06-04)

  The new message (and SQLSTATE) matches the corresponding error cases in namespace.c. This was thought to be a "can't happen" case when extension.c was written, so we didn't think hard about how to report it. But it definitely can happen in 9.2 and later, since we no longer require search_path to contain any valid schema names. It's probably also possible in 9.1 if search_path came from a noninteractive source. So, back-patch to all releases containing this code.

  Per report from Sean Chittenden, though this isn't exactly his patch.

* Fix memory leak in LogStandbySnapshot(). (Tom Lane, 2013-06-04)

  The array allocated by GetRunningTransactionLocks() needs to be pfree'd when we're done with it. Otherwise we leak some memory during each checkpoint, if wal_level = hot_standby. This manifests as memory bloat in the checkpointer process, or in bgwriter in versions before we made the checkpointer separate.

  Reported and fixed by Naoya Anzai. Back-patch to 9.0 where the issue was introduced.

  In passing, improve comments for GetRunningTransactionLocks(), and add an Assert that we didn't overrun the palloc'd array.

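  A hedged sketch of the shape of the fix (the function name follows the commit message; the result type and the surrounding WAL-record construction are assumptions, elided to comments):

      int nlocks;
      xl_standby_lock *locks = GetRunningTransactionLocks(&nlocks);

      /* ... copy the lock entries into the WAL record being built ... */

      /* The palloc'd array must be freed explicitly; the checkpointer is
       * a long-lived process, so this would otherwise leak once per
       * checkpoint cycle. */
      pfree(locks);
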
* Allow type_func_name_keywords in some places where they weren't before. (Tom Lane, 2013-06-02)

  This change makes type_func_name_keywords less reserved than they were before, by allowing them for role names, language names, EXPLAIN and COPY options, and SET values for GUCs; which are all places where few if any actual keywords could appear instead, so no new ambiguities are introduced.

  The main driver for this change is to allow "COPY ... (FORMAT BINARY)" to work without quoting the word "binary". That is an inconsistency that has been complained of repeatedly over the years (at least by Pavel Golub, Kurt Lidl, and Simon Riggs); but we hadn't thought of any non-ugly solution until now.

  Back-patch to 9.0 where the COPY (FORMAT BINARY) syntax was introduced.