postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
*	Fix whitespace	Peter Eisentraut	2017-08-15
\|
*	Handle elog(FATAL) during ROLLBACK more robustly.	Tom Lane	2017-08-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Stress testing by Andreas Seltenreich disclosed longstanding problems that occur if a FATAL exit (e.g. due to receipt of SIGTERM) occurs while we are trying to execute a ROLLBACK of an already-failed transaction. In such a case, xact.c is in TBLOCK_ABORT state, so that AbortOutOfAnyTransaction would skip AbortTransaction and go straight to CleanupTransaction. This led to an assert failure in an assert-enabled build (due to the ROLLBACK's portal still having a cleanup hook) or without assertions, to a FATAL exit complaining about "cannot drop active portal". The latter's not disastrous, perhaps, but it's messy enough to want to improve it. We don't really want to run all of AbortTransaction in this code path. The minimum required to clean up the open portal safely is to do AtAbort_Memory and AtAbort_Portals. It seems like a good idea to do AtAbort_Memory unconditionally, to be entirely sure that we are starting with a safe CurrentMemoryContext. That means that if the main loop in AbortOutOfAnyTransaction does nothing, we need an extra step at the bottom to restore CurrentMemoryContext = TopMemoryContext, which I chose to do by invoking AtCleanup_Memory. This'll result in calling AtCleanup_Memory twice in many of the paths through this function, but that seems harmless and reasonably inexpensive. The original motivation for the assertion in AtCleanup_Portals was that we wanted to be sure that any user-defined code executed as a consequence of the cleanup hook runs during AbortTransaction not CleanupTransaction. That still seems like a valid concern, and now that we've seen one case of the assertion firing --- which means that exactly that would have happened in a production build --- let's replace the Assert with a runtime check. If we see the cleanup hook still set, we'll emit a WARNING and just drop the hook unexecuted. This has been like this a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/877ey7bmun.fsf@ansel.ydns.eu
*	Remove AtEOXact_CatCache().	Tom Lane	2017-08-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The sole useful effect of this function, to check that no catcache entries have positive refcounts at transaction end, has really been obsolete since we introduced ResourceOwners in PG 8.1. We reduced the checks to assertions years ago, so that the function was a complete no-op in production builds. There have been previous discussions about removing it entirely, but consensus up to now was that it had some small value as a cross-check for bugs in the ResourceOwner logic. However, it now emerges that it's possible to trigger these assertions if you hit an assert-enabled backend with SIGTERM during a call to SearchCatCacheList, because that function temporarily increases the refcounts of entries it's intending to add to a catcache list construct. In a normal ERROR scenario, the extra refcounts are cleaned up by SearchCatCacheList's PG_CATCH block; but in a FATAL exit we do a transaction abort and exit without ever executing PG_CATCH handlers. There's a case to be made that this is a generic hazard and we should consider restructuring elog(FATAL) handling so that pending PG_CATCH handlers do get run. That's pretty scary though: it could easily create more problems than it solves. Preliminary stress testing by Andreas Seltenreich suggests that there are not many live problems of this ilk, so we rejected that idea. There are more-localized ways to fix the problem; the most principled one would be to use PG_ENSURE_ERROR_CLEANUP instead of plain PG_TRY. But adding cycles to SearchCatCacheList isn't very appealing. We could also weaken the assertions in AtEOXact_CatCache in some more or less ad-hoc way, but that just makes its raison d'etre even less compelling. In the end, the most reasonable solution seems to be to just remove AtEOXact_CatCache altogether, on the grounds that it's not worth trying to fix it. It hasn't found any bugs for us in many years. Per report from Jeevan Chalke. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAM2+6=VEE30YtRQCZX7_sCFsEpoUkFBV1gZazL70fqLn8rcvBA@mail.gmail.com
*	Fix handling of container types in find_composite_type_dependencies.	Tom Lane	2017-08-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	find_composite_type_dependencies correctly found columns that are of the specified type, and columns that are of arrays of that type, but not columns that are domains or ranges over the given type, its array type, etc. The most general way to handle this seems to be to assume that any type that is directly dependent on the specified type can be treated as a container type, and processed recursively (allowing us to handle nested cases such as ranges over domains over arrays ...). Since a type's array type already has such a dependency, we can drop the existing special case for the array type. The very similar logic in get_rels_with_domain was likewise a few bricks shy of a load, as it supposed that a directly dependent type could only be a sub-domain. This is already wrong for ranges over domains, and it'll someday be wrong for arrays over domains. Add test cases illustrating the problems, and back-patch to all supported branches. Discussion: https://postgr.es/m/15268.1502309024@sss.pgh.pa.us
*	Fix datumSerialize infrastructure to not crash on non-varlena data.	Tom Lane	2017-08-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 1efc7e538 did a poor job of emulating existing logic for touching Datums that might be expanded-object pointers. It didn't check for typlen being -1 first, which meant it could crash on fixed-length pass-by-ref values, and probably on cstring values as well. It also didn't use DatumGetPointer before VARATT_IS_EXTERNAL_EXPANDED, which while currently harmless is not according to documentation nor prevailing style. I also think the lack of any explanation as to why datumSerialize makes these particular nonobvious choices is pretty awful, so fix that. Per report from Jarred Ward. Back-patch to 9.6 where this code came in. Discussion: https://postgr.es/m/6F61E6D2-2F5E-4794-9479-A429BE1CEA4B@simple.com
*	Reword some unclear comments	Alvaro Herrera	2017-08-08
\|
*	Translation updates	Peter Eisentraut	2017-08-07
\| \| \| \| \|	Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: d81b8e4ab322171b7ea691c01513ede1cf398404
*	Require update permission for the large object written by lo_put().	Tom Lane	2017-08-07
\| \| \| \| \| \| \| \| \| \|	lo_put() surely should require UPDATE permission, the same as lowrite(), but it failed to check for that, as reported by Chapman Flack. Oversight in commit c50b7c09d; backpatch to 9.4 where that was introduced. Tom Lane and Michael Paquier Security: CVE-2017-7548
*	Again match pg_user_mappings to information_schema.user_mapping_options.	Noah Misch	2017-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 3eefc51053f250837c3115c12f8119d16881a2d7 claimed to make pg_user_mappings enforce the qualifications user_mapping_options had been enforcing, but its removal of a longstanding restriction left them distinct when the current user is the subject of a mapping yet has no server privileges. user_mapping_options emits no rows for such a mapping, but pg_user_mappings includes full umoptions. Change pg_user_mappings to show null for umoptions. Back-patch to 9.2, like the above commit. Reviewed by Tom Lane. Reported by Jeff Janes. Security: CVE-2017-7547
*	Don't allow logging in with empty password.	Heikki Linnakangas	2017-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some authentication methods allowed it, others did not. In the client-side, libpq does not even try to authenticate with an empty password, which makes using empty passwords hazardous: an administrator might think that an account with an empty password cannot be used to log in, because psql doesn't allow it, and not realize that a different client would in fact allow it. To clear that confusion and to be be consistent, disallow empty passwords in all authentication methods. All the authentication methods that used plaintext authentication over the wire, except for BSD authentication, already checked that the password received from the user was not empty. To avoid forgetting it in the future again, move the check to the recv_password_packet function. That only forbids using an empty password with plaintext authentication, however. MD5 and SCRAM need a different fix: * In stable branches, check that the MD5 hash stored for the user does not not correspond to an empty string. This adds some overhead to MD5 authentication, because the server needs to compute an extra MD5 hash, but it is not noticeable in practice. * In HEAD, modify CREATE and ALTER ROLE to clear the password if an empty string, or a password hash that corresponds to an empty string, is specified. The user-visible behavior is the same as in the stable branches, the user cannot log in, but it seems better to stop the empty password from entering the system in the first place. Secondly, it is fairly expensive to check that a SCRAM hash doesn't correspond to an empty string, because computing a SCRAM hash is much more expensive than an MD5 hash by design, so better avoid doing that on every authentication. We could clear the password on CREATE/ALTER ROLE also in stable branches, but we would still need to check at authentication time, because even if we prevent empty passwords from being stored in pg_authid, there might be existing ones there already. Reported by Jeroen van der Ham, Ben de Graaff and Jelte Fennema. Security: CVE-2017-7546
*	Fix thinko introduced in 2bef06d516460 et al.	Andres Freund	2017-08-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The callers for GetOldestSafeDecodingTransactionId() all inverted the argument for the argument introduced in 2bef06d516460. Luckily this appears to be inconsequential for the moment, as we wait for concurrent in-progress transaction when assembling a snapshot. Additionally this could only make a difference when adding a second logical slot, because only a pre-existing slot could cause an issue by lowering the returned xid dangerously much. Reported-By: Antonin Houska Discussion: https://postgr.es/m/32704.1496993134@localhost Backport: 9.4-, where 2bef06d516460 was backpatched to.
*	Disallow SSL session tickets.	Tom Lane	2017-08-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't actually support session tickets, since we do not create an SSL session identifier. But it seems that OpenSSL will issue a session ticket on-demand anyway, which will then fail when used. This results in reconnection failures when using ticket-aware client-side SSL libraries (such as the Npgsql .NET driver), as reported by Shay Rojansky. To fix, just tell OpenSSL not to issue tickets. At some point in the far future, we might consider enabling tickets instead. But the security implications of that aren't entirely clear; and besides it would have little benefit except for very short-lived database connections, which is Something We're Bad At anyhow. It would take a lot of other work to get to a point where that would really be an exciting thing to do. While at it, also tell OpenSSL not to use a session cache. This doesn't really do anything, since a backend would never populate the cache anyway, but it might gain some micro-efficiencies and/or reduce security exposures. Patch by me, per discussion with Heikki Linnakangas and Shay Rojansky. Back-patch to all supported versions. Discussion: https://postgr.es/m/CADT4RqBU8N-csyZuzaook-c795dt22Zcwg1aHWB6tfVdAkodZA@mail.gmail.com
*	Add missing ALTER USER variants	Peter Eisentraut	2017-08-03
\| \| \| \| \| \| \|	ALTER USER ... SET did not support all the syntax variants of ALTER ROLE ... SET. Reported-by: Pavel Golub <pavel@microolap.com>
*	Allow a foreign table CHECK constraint to be initially NOT VALID.	Robert Haas	2017-08-03
\| \| \| \| \| \| \| \| \| \| \| \|	For a table, the constraint can be considered validated immediately, because the table must be empty. But for a foreign table this is not necessarily the case. Fixes a bug in commit f27a6b15e6566fba7748d0d9a3fc5bcfd52c4a1b. Amit Langote, with some changes by me. Discussion: http://postgr.es/m/d2b7419f-4a71-cf86-cc99-bfd0f359a1ea@lab.ntt.co.jp
*	Remove broken and useless entry-count printing in HASH_DEBUG code.	Tom Lane	2017-08-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	init_htab(), with #define HASH_DEBUG, prints a bunch of hashtable parameters. It used to also print nentries, but commit 44ca4022f changed that to "hash_get_num_entries(hctl)", which is wrong (the parameter should be "hashp"). Rather than correct the coding, though, let's just remove that field from the printout. The table must be empty, since we just finished building it, so expensively calculating the number of entries is rather pointless. Moreover hash_get_num_entries makes assumptions (about not needing locks) which we could do without in debugging code. Noted by Choi Doo-Won in bug #14764. Back-patch to 9.6 where the faulty code was introduced. Discussion: https://postgr.es/m/20170802032353.8424.12274@wrigleys.postgresql.org
*	Add missing comment in postgresql.conf.	Tatsuo Ishii	2017-07-31
\| \| \| \| \| \| \|	current_source requires to restart server to reflect the new value. Per Yugo Nagata and Masahiko Sawada. Back patched to 9.2 and beyond.
*	Add missing comment in postgresql.conf.	Tatsuo Ishii	2017-07-31
\| \| \| \| \| \| \|	dynamic_shared_memory_type requires to restart server to reflect the new value. Per Yugo Nagata and Masahiko Sawada. Back pached to 9.4 and beyond.
*	Fix concurrent locking of tuple update chain	Alvaro Herrera	2017-07-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If several sessions are concurrently locking a tuple update chain with nonconflicting lock modes using an old snapshot, and they all succeed, it may happen that some of them fail because of restarting the loop (due to a concurrent Xmax change) and getting an error in the subsequent pass while trying to obtain a tuple lock that they already have in some tuple version. This can only happen with very high concurrency (where a row is being both updated and FK-checked by multiple transactions concurrently), but it's been observed in the field and can have unpleasant consequences such as an FK check failing to see a tuple that definitely exists: ERROR: insert or update on table "child_table" violates foreign key constraint "fk_constraint_name" DETAIL: Key (keyid)=(123456) is not present in table "parent_table". (where the key is observably present in the table). Discussion: https://postgr.es/m/20170714210011.r25mrff4nxjhmf3g@alvherre.pgsql
*	Fix race condition in predicate-lock init code in EXEC_BACKEND builds.	Tom Lane	2017-07-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Trading a little too heavily on letting the code path be the same whether we were creating shared data structures or only attaching to them, InitPredicateLocks() inserted the "scratch" PredicateLockTargetHash entry unconditionally. This is just wrong if we're in a postmaster child, which would only reach this code in EXEC_BACKEND builds. Most of the time, the hash_search(HASH_ENTER) call would simply report that the entry already existed, causing no visible effect since the code did not bother to check for that possibility. However, if this happened while some other backend had transiently removed the "scratch" entry, then that other backend's eventual RestoreScratchTarget would suffer an assert failure; this appears to be the explanation for a recent failure on buildfarm member culicidae. In non-assert builds, there would be no visible consequences there either. But nonetheless this is a pretty bad bug for EXEC_BACKEND builds, for two reasons: 1. Each new backend would perform the hash_search(HASH_ENTER) call without holding any lock that would prevent concurrent access to the PredicateLockTargetHash hash table. This creates a low but certainly nonzero risk of corruption of that hash table. 2. In the event that the race condition occurred, by reinserting the scratch entry too soon, we were defeating the entire purpose of the scratch entry, namely to guarantee that transaction commit could move hash table entries around with no risk of out-of-memory failure. The odds of an actual OOM failure are quite low, but not zero, and if it did happen it would again result in corruption of the hash table. The user-visible symptoms of such corruption are a little hard to predict, but would presumably amount to misbehavior of SERIALIZABLE transactions that'd require a crash or postmaster restart to fix. To fix, just skip the hash insertion if IsUnderPostmaster. I also inserted a bunch of assertions that the expected things happen depending on whether IsUnderPostmaster is true. That might be overkill, since most comparable code in other functions isn't quite that paranoid, but once burnt twice shy. In passing, also move a couple of lines to places where they seemed to make more sense. Diagnosis of problem by Thomas Munro, patch by me. Back-patch to all supported branches. Discussion: https://postgr.es/m/10593.1500670709@sss.pgh.pa.us
*	When WCOs are present, disable direct foreign table modification.	Robert Haas	2017-07-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the user modifies a view that has CHECK OPTIONs and this gets translated into a modification to an underlying relation which happens to be a foreign table, the check options should be enforced. In the normal code path, that was happening properly, but it was not working properly for "direct" modification because the whole operation gets pushed to the remote side in that case and we never have an option to enforce the constraint against individual tuples. Fix by disabling direct modification when there is a need to enforce CHECK OPTIONs. Etsuro Fujita, reviewed by Kyotaro Horiguchi and by me. Discussion: http://postgr.es/m/f8a48f54-6f02-9c8a-5250-9791603171ee@lab.ntt.co.jp
*	Ensure that pg_get_ruledef()'s output matches pg_get_viewdef()'s.	Tom Lane	2017-07-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Various cases involving renaming of view columns are handled by having make_viewdef pass down the view's current relation tupledesc to get_query_def, which then takes care to use the column names from the tupledesc for the output column names of the SELECT. For some reason though, we'd missed teaching make_ruledef to do similarly when it is printing an ON SELECT rule, even though this is exactly the same case. The results from pg_get_ruledef would then be different and arguably wrong. In particular, this breaks pre-v10 versions of pg_dump, which in some situations would define views by means of emitting a CREATE RULE ... ON SELECT command. Third-party tools might not be happy either. In passing, clean up some crufty code in make_viewdef; we'd apparently modernized the equivalent code in make_ruledef somewhere along the way, and missed this copy. Per report from Gilles Darold. Back-patch to all supported versions. Discussion: https://postgr.es/m/ec05659a-40ff-4510-fc45-ca9d965d0838@dalibo.com
*	Fix typo in comment	Alvaro Herrera	2017-07-21
\| \| \| \| \| \| \|	Commit fd31cd265138 renamed the variable to skipping_blocks, but forgot to update this comment. Noticed while inspecting code.
*	Fix double shared memory allocation.	Teodor Sigaev	2017-07-21
\| \| \| \| \| \| \| \| \| \| \|	SLRU buffer lwlocks are allocated twice by oversight in commit fe702a7b3f9f2bc5bf6d173166d7d55226af82c8 where that locks were moved to separate tranche. The bug doesn't have user-visible effects except small overspending of shared memory. Backpatch to 9.6 where it was introduced. Alexander Korotkov with small editorization by me.
*	Fix dumping of outer joins with empty qual lists.	Tom Lane	2017-07-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Normally, a JoinExpr would have empty "quals" only if it came from CROSS JOIN syntax. However, it's possible to get to this state by specifying NATURAL JOIN between two tables with no common column names, and there might be other ways too. The code previously printed no ON clause if "quals" was empty; that's right for CROSS JOIN but syntactically invalid if it's some type of outer join. Fix by printing ON TRUE in that case. This got broken by commit 2ffa740be, which stopped using NATURAL JOIN syntax in ruleutils output due to its brittleness in the face of column renamings. Back-patch to 9.3 where that commit appeared. Per report from Tushar Ahuja. Discussion: https://postgr.es/m/98b283cd-6dda-5d3f-f8ac-87db8c76a3da@enterprisedb.com
*	Fix dumping of FUNCTION RTEs that contain non-function-call expressions.	Tom Lane	2017-07-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The grammar will only accept something syntactically similar to a function call in a function-in-FROM expression. However, there are various ways to input something that ruleutils.c won't deparse that way, potentially leading to a view or rule that fails dump/reload. Fix by inserting a dummy CAST around anything that isn't going to deparse as a function (which is one of the ways to get something like that in there in the first place). In HEAD, also make use of the infrastructure added by this to avoid emitting unnecessary parentheses in CREATE INDEX deparsing. I did not change that in back branches, thinking that people might find it to be unexpected/unnecessary behavioral change. In HEAD, also fix incorrect logic for when to add extra parens to partition key expressions. Somebody apparently thought they could get away with simpler logic than pg_get_indexdef_worker has, but they were wrong --- a counterexample is PARTITION BY LIST ((a[1])). Ignoring the prettyprint flag for partition expressions isn't exactly a nice solution anyway. This has been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/10477.1499970459@sss.pgh.pa.us
*	Fix race between GetNewTransactionId and GetOldestActiveTransactionId.	Heikki Linnakangas	2017-07-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The race condition goes like this: 1. GetNewTransactionId advances nextXid e.g. from 100 to 101 2. GetOldestActiveTransactionId reads the new nextXid, 101 3. GetOldestActiveTransactionId loops through the proc array. There are no active XIDs there, so it returns 101 as the oldest active XID. 4. GetNewTransactionid stores XID 100 to MyPgXact->xid So, GetOldestActiveTransactionId returned XID 101, even though 100 only just started and is surely still running. This would be hard to hit in practice, and even harder to spot any ill effect if it happens. GetOldestActiveTransactionId is only used when creating a checkpoint in a master server, and the race condition can only happen on an online checkpoint, as there are no backends running during a shutdown checkpoint. The oldestActiveXid value of an online checkpoint is only used when starting up a hot standby server, to determine the starting point where pg_subtrans is initialized from. For the race condition to happen, there must be no other XIDs in the proc array that would hold back the oldest-active XID value, which means that the missed XID must be a top transaction's XID. However, pg_subtrans is not used for top XIDs, so I believe an off-by-one error is in fact inconsequential. Nevertheless, let's fix it, as it's clearly wrong and the fix is simple. This has been wrong ever since hot standby was introduced, so backport to all supported versions. Discussion: https://www.postgresql.org/message-id/e7258662-82b6-7a45-56d4-99b337a32bf7@iki.fi
*	Fix ruleutils.c for domain-over-array cases, too.	Tom Lane	2017-07-12
\| \| \| \| \| \| \| \| \| \| \| \| \|	Further investigation shows that ruleutils isn't quite up to speed either for cases where we have a domain-over-array: it needs to be prepared to look past a CoerceToDomain at the top level of field and element assignments, else it decompiles them incorrectly. Potentially this would result in failure to dump/reload a rule, if it looked like the one in the new test case. (I also added a test for EXPLAIN; that output isn't broken, but clearly we need more test coverage here.) Like commit b1cb32fb6, this bug is reachable in cases we already support, so back-patch all the way.
*	Reduce memory usage of tsvector type analyze function.	Heikki Linnakangas	2017-07-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	compute_tsvector_stats() detoasted and kept in memory every tsvector value in the sample, but that can be a lot of memory. The original bug report described a case using over 10 gigabytes, with statistics target of 10000 (the maximum). To fix, allocate a separate copy of just the lexemes that we keep around, and free the detoasted tsvector values as we go. This adds some palloc/pfree overhead, when you have a lot of distinct lexemes in the sample, but it's better than running out of memory. Fixes bug #14654 reported by James C. Reviewed by Tom Lane. Backport to all supported versions. Discussion: https://www.postgresql.org/message-id/20170514200602.1451.46797@wrigleys.postgresql.org
*	Avoid integer overflow while sifting-up a heap in tuplesort.c.	Tom Lane	2017-07-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the number of tuples in the heap exceeds approximately INT_MAX/2, this loop's calculation "2*i+1" could overflow, resulting in a crash. Fix it by using unsigned int rather than int for the relevant local variables; that shouldn't cost anything extra on any popular hardware. Per bug #14722 from Sergey Koposov. Original patch by Sergey Koposov, modified by me per a suggestion from Heikki Linnakangas to use unsigned int not int64. Back-patch to 9.4, where tuplesort.c grew the ability to sort as many as INT_MAX tuples in-memory (commit 263865a48). Discussion: https://postgr.es/m/20170629161637.1478.93109@wrigleys.postgresql.org
*	Fix variable and type name in comment.	Heikki Linnakangas	2017-07-12
\| \| \| \| \| \|	Kyotaro Horiguchi Discussion: https://www.postgresql.org/message-id/20170711.163441.241981736.horiguchi.kyotaro@lab.ntt.co.jp
*	Fix ordering of operations in SyncRepWakeQueue to avoid assertion failure.	Heikki Linnakangas	2017-07-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 14e8803f1 removed the locking in SyncRepWaitForLSN, but that introduced a race condition, where SyncRepWaitForLSN might see syncRepState already set to SYNC_REP_WAIT_COMPLETE, but the process was not yet removed from the queue. That tripped the assertion, that the process should no longer be in the uqeue. Reorder the operations in SyncRepWakeQueue to remove the process from the queue first, and update syncRepState only after that, and add a memory barrier in between to make sure the operations are made visible to other processes in that order. Fixes bug #14721 reported by Const Zhang. Analysis and fix by Thomas Munro. Backpatch down to 9.5, where the locking was removed. Discussion: https://www.postgresql.org/message-id/20170629023623.1480.26508%40wrigleys.postgresql.org
*	Fix multiple assignments to a column of a domain type.	Tom Lane	2017-07-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We allow INSERT and UPDATE commands to assign to the same column more than once, as long as the assignments are to subfields or elements rather than the whole column. However, this failed when the target column was a domain over array rather than plain array. Fix by teaching process_matched_tle() to look through CoerceToDomain nodes, and add relevant test cases. Also add a group of test cases exercising domains over array of composite. It's doubtless accidental that CREATE DOMAIN allows this case while not allowing straight domain over composite; but it does, so we'd better make sure we don't break it. (I could not find any documentation mentioning either side of that, so no doc changes.) It's been like this for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/4206.1499798337@sss.pgh.pa.us
*	On Windows, retry process creation if we fail to reserve shared memory.	Tom Lane	2017-07-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We've heard occasional reports of backend launch failing because pgwin32_ReserveSharedMemoryRegion() fails, indicating that something has already used that address space in the child process. It's not very clear what, given that we disable ASLR in Windows builds, but suspicion falls on antivirus products. It'd be better if we didn't have to disable ASLR, anyway. So let's try to ameliorate the problem by retrying the process launch after such a failure, up to 100 times. Patch by me, based on previous work by Amit Kapila and others. This is a longstanding issue, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAA4eK1+R6hSx6t_yvwtx+NRzneVp+MRqXAdGJZChcau8Uij-8g@mail.gmail.com
*	Fix typo	Alvaro Herrera	2017-07-07
\| \| \| \|	Noticed while reviewing code.
*	Fix potential data corruption during freeze	Teodor Sigaev	2017-07-06
\| \| \| \| \| \| \|	Fix oversight in 3b97e6823b94 bug fix. Bitwise AND is used instead of OR and it cleans all bits in t_infomask heap tuple field. Backpatch to 9.3
*	Treat clean shutdown of an SSL connection same as the non-SSL case.	Heikki Linnakangas	2017-07-03
\| \| \| \| \| \| \| \| \| \|	If the client closes an SSL connection, treat it the same as EOF on a non-SSL connection. In particular, don't write a message in the log about that. Michael Paquier. Discussion: https://www.postgresql.org/message-id/CAB7nPqSfyVV42Q2acFo%3DvrvF2gxoZAMJLAPq3S3KkjhZAYi7aw@mail.gmail.com
*	Fix walsender to exit promptly if client requests shutdown.	Tom Lane	2017-06-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's possible for WalSndWaitForWal to be asked to wait for WAL that doesn't exist yet. That's fine, in fact it's the normal situation if we're caught up; but when the client requests shutdown we should not keep waiting. The previous coding could wait indefinitely if the source server was idle. In passing, improve the rather weak comments in this area, and slightly rearrange some related code for better readability. Back-patch to 9.4 where this code was introduced. Discussion: https://postgr.es/m/14154.1498781234@sss.pgh.pa.us
*	Second try at fixing tcp_keepalives_idle option on Solaris.	Tom Lane	2017-06-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Buildfarm evidence shows that TCP_KEEPALIVE_THRESHOLD doesn't exist after all on Solaris < 11. This means we need to take positive action to prevent the TCP_KEEPALIVE code path from being taken on that platform. I've chosen to limit it with "&& defined(__darwin__)", since it's unclear that anyone else would follow Apple's precedent of spelling the symbol that way. Also, follow a suggestion from Michael Paquier of eliminating code duplication by defining a couple of intermediate symbols for the socket option. In passing, make some effort to reduce the number of translatable messages by replacing "setsockopt(foo) failed" with "setsockopt(%s) failed", etc, throughout the affected files. And update relevant documentation so that it doesn't claim to provide an exhaustive list of the possible socket option names. Like the previous commit (f0256c774), back-patch to all supported branches. Discussion: https://postgr.es/m/20170627163757.25161.528@wrigleys.postgresql.org
*	Support tcp_keepalives_idle option on Solaris.	Tom Lane	2017-06-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Turns out that the socket option for this is named TCP_KEEPALIVE_THRESHOLD, at least according to the tcp(7P) man page for Solaris 11. (But since that text refers to "SunOS", it's likely pretty ancient.) It appears that the symbol TCP_KEEPALIVE does get defined on that platform, but it doesn't seem to represent a valid protocol-level socket option. This leads to bleats in the postmaster log, and no tcp_keepalives_idle functionality. Per bug #14720 from Andrey Lizenko, as well as an earlier report from Dhiraj Chawla that nobody had followed up on. The issue's been there since we added the TCP_KEEPALIVE code path in commit 5acd417c8, so back-patch to all supported branches. Discussion: https://postgr.es/m/20170627163757.25161.528@wrigleys.postgresql.org
*	Re-allow SRFs and window functions within sub-selects within aggregates.	Tom Lane	2017-06-27
\| \| \| \| \| \| \| \| \| \| \|	check_agg_arguments_walker threw an error upon seeing a SRF or window function, but that is too aggressive: if the function is within a sub-select then it's perfectly fine. I broke the SRF case in commit 0436f6bde by copying the logic for window functions ... but that was broken too, and had been since commit eaccfded9. Repair both cases in HEAD, and the window function case back to 9.3. 9.2 gets this right.
*	Don't lose walreceiver start requests due to race condition in postmaster.	Tom Lane	2017-06-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a walreceiver dies, the startup process will notice that and send a PMSIGNAL_START_WALRECEIVER signal to the postmaster, asking for a new walreceiver to be launched. There's a race condition, which at least in HEAD is very easy to hit, whereby the postmaster might see that signal before it processes the SIGCHLD from the walreceiver process. In that situation, sigusr1_handler() just dropped the start request on the floor, reasoning that it must be redundant. Eventually, after 10 seconds (WALRCV_STARTUP_TIMEOUT), the startup process would make a fresh request --- but that's a long time if the connection could have been re-established almost immediately. Fix it by setting a state flag inside the postmaster that we won't clear until we do launch a walreceiver. In cases where that results in an extra walreceiver launch, it's up to the walreceiver to realize it's unwanted and go away --- but we have, and need, that logic anyway for the opposite race case. I came across this through investigating unexpected delays in the src/test/recovery TAP tests: it manifests there in test cases where a master server is stopped and restarted while leaving streaming slaves active. This logic has been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/21344.1498494720@sss.pgh.pa.us
*	Ignore old stats file timestamps when starting the stats collector.	Tom Lane	2017-06-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The stats collector disregards inquiry messages that bear a cutoff_time before when it last wrote the relevant stats file. That's fine, but at startup when it reads the "permanent" stats files, it absorbed their timestamps as if they were the times at which the corresponding temporary stats files had been written. In reality, of course, there's no data out there at all. This led to disregarding inquiry messages soon after startup if the postmaster had been shut down and restarted within less than PGSTAT_STAT_INTERVAL; which is a pretty common scenario, both for testing and in the field. Requesting backends would hang for 10 seconds and then report failure to read statistics, unless they got bailed out by some other backend coming along and making a newer request within that interval. I came across this through investigating unexpected delays in the src/test/recovery TAP tests: it manifests there because the autovacuum launcher hangs for 10 seconds when it can't get statistics at startup, thus preventing a second shutdown from occurring promptly. We might want to do some things in the autovac code to make it less prone to getting stuck that way, but this change is a good bug fix regardless. In passing, also fix pgstat_read_statsfiles() to ensure that it re-zeroes its global stats variables if they are corrupted by a short read from the stats file. (Other reads in that function go into temp variables, so that the issue doesn't arise.) This has been broken since we created the separation between permanent and temporary stats files in 8.4, so back-patch to all supported branches. Discussion: https://postgr.es/m/16860.1498442626@sss.pgh.pa.us
*	Minor code review for parse_phrase_operator().	Tom Lane	2017-06-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix its header comment, which described the old behavior of the <N> phrase distance operator; we missed updating that in commit 028350f61. Also, reset errno before strtol() call, to defend against the possibility that it was already ERANGE at entry. (The lack of complaints says that it generally isn't, but this is at least a latent bug.) Very minor stylistic improvements as well. Victor Drobny noted the obsolete comment, I noted the errno issue. Back-patch to 9.6 where this code was added, just in case the errno issue is a live bug in some cases. Discussion: https://postgr.es/m/2b5382fdff9b1f79d5eb2c99c4d2cbe2@postgrespro.ru
*	Fix typo in comment	Alvaro Herrera	2017-06-22
\| \| \| \| \| \| \| \|	Once upon a time, WAL pointers could be NULL, but no longer. We talk about "valid" now. Reported-by: Amit Langote Discussion: https://postgr.es/m/33e9617d-27f1-eee8-3311-e27af98eaf2b@lab.ntt.co.jp
*	Fix possibility of creating a "phantom" segment after promotion.	Andres Freund	2017-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When promoting a standby just after a XLOG_SWITCH record was replayed, and next segment(s) are already are locally available (via walsender, restore_command + trigger/recovery target), that segment could accidentally be recycled onto the past of the new timeline. Later checkpointer would create a .ready file for it, assuming there was an error during creation, and it would get archived. That causes trouble if another standby is later brought up from a basebackup from before the timeline creation, because it would try to read the segment, because XLogFileReadAnyTLI just tries all possible timelines, which doesn't have valid contents. Thus replay would fail. The problem, if already occurred, can be fixed by removing the segment and/or having restore_command filter it out. The reason for the creation of such "phantom" segments was, that after an XLOG_SWITCH record the EndOfLog variable points to the beginning of the next segment, and RemoveXlogFile() used XLByteToPrevSeg(). Normally RemoveXlogFile() doing so is harmless, because the last segment will still exist preventing InstallXLogFileSegment() from causing harm, but just after promotion there's no previous segment on the new timeline. Fix that by using XLByteToSeg() instead of XLByteToPrevSeg(). Author: Andres Freund Reported-By: Greg Burek Discussion: https://postgr.es/m/20170619073026.zcwpe6mydsaz5ygd@alap3.anarazel.de Backpatch: 9.2-, bug older than all supported versions
*	Fix typo in comment.	Heikki Linnakangas	2017-06-21
\| \| \| \|	Etsuro Fujita
*	Avoid regressions in foreign-key-based selectivity estimates.	Tom Lane	2017-06-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	David Rowley found that the "use the smallest per-column selectivity" heuristic applied in some cases by get_foreign_key_join_selectivity() was badly off if the FK columns are independent, producing estimates much worse than we got before that code was added in 9.6. One case where that heuristic was used was for LEFT and FULL outer joins with the referenced rel on the outside of the join. But we should not really need to special-case those here. eqjoinsel() never has had such a special case; the correction is applied by calc_joinrel_size_estimate() instead. Let's just estimate such cases like inner joins and rely on that later adjustment. (I think there was something of a thinko here, in that the comments seem to be thinking about the selectivity as defined for semi/anti joins; but that shouldn't apply to left/full joins.) Add a regression test exercising such a case to show that this is sane in at least some cases. The other case where we used that heuristic was for SEMI/ANTI outer joins, either if the referenced rel was on the outside, or if it was on the inside but was part of a join within the RHS. In either case, the FK doesn't give us a lot of traction towards estimating the selectivity. To ensure that we don't have regressions from what happened before 9.6, let's punt by ignoring the FK in such cases and applying the traditional selectivity calculation. (We might be able to improve on that later, but for now I just want to be sure it's not worse than 9.5.) Report and patch by David Rowley, simplified a bit by me. Back-patch to 9.6 where this code was added. Discussion: https://postgr.es/m/CAKJS1f8NO8oCDcxrteohG6O72uU1saEVT9qX=R8pENr5QWerXw@mail.gmail.com
*	Fix leaking of small spilled subtransactions during logical decoding.	Andres Freund	2017-06-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When, during logical decoding, a transaction gets too big, it's contents get spilled to disk. Not just the top-transaction gets spilled, but also all of its subtransactions, even if they're not that large themselves. Unfortunately we didn't clean up such small spilled subtransactions from disk. Fix that, by keeping better track of whether a transaction has been spilled to disk. Author: Andres Freund Reported-By: Dmitriy Sarafannikov, Fabrízio de Royes Mello Discussion: https://postgr.es/m/1457621358.355011041@f382.i.mail.ru https://postgr.es/m/CAFcNs+qNMhNYii4nxpO6gqsndiyxNDYV0S=JNq0v_sEE+9PHXg@mail.gmail.com Backpatch: 9.4-, where logical decoding was introduced
*	Fix dependency, when changing a function's argument/return type.	Heikki Linnakangas	2017-06-16
\| \| \| \| \| \| \| \| \| \| \| \|	When a new base type is created using the old-style procedure of first creating the input/output functions with "opaque" in place of the base type, the "opaque" argument/return type is changed to the final base type, on CREATE TYPE. However, we did not create a pg_depend record when doing that, so the functions were left not depending on the type. Fixes bug #14706, reported by Karen Huddleston. Discussion: https://www.postgresql.org/message-id/20170614232259.1424.82774@wrigleys.postgresql.org
*	Fix low-probability leaks of PGresult objects in the backend.	Tom Lane	2017-06-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We had three occurrences of essentially the same coding pattern wherein we tried to retrieve a query result from a libpq connection without blocking. In the case where PQconsumeInput failed (typically indicating a lost connection), all three loops simply gave up and returned, forgetting to clear any previously-collected PGresult object. Since those are malloc'd not palloc'd, the oversight results in a process-lifespan memory leak. One instance, in libpqwalreceiver, is of little significance because the walreceiver process would just quit anyway if its connection fails. But we might as well fix it. The other two instances, in postgres_fdw, are somewhat more worrisome because at least in principle the scenario could be repeated, allowing the amount of memory leaked to build up to something worth worrying about. Moreover, in these cases the loops contain CHECK_FOR_INTERRUPTS calls, as well as other calls that could potentially elog(ERROR), providing another way to exit without having cleared the PGresult. Here we need to add PG_TRY logic similar to what exists in quite a few other places in postgres_fdw. Coverity noted the libpqwalreceiver bug; I found the other two cases by checking all calls of PQconsumeInput. Back-patch to all supported versions as appropriate (9.2 lacks postgres_fdw, so this is really quite unexciting for that branch). Discussion: https://postgr.es/m/22620.1497486981@sss.pgh.pa.us