postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
...
*	Silence assorted "variable may be used uninitialized" warnings.	Tom Lane	2018-02-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All of these are false positives, but in each case a fair amount of analysis is needed to see that, and it's not too surprising that not all compilers are smart enough. (In particular, in the logtape.c case, a compiler lacking the knowledge provided by the Assert would almost surely complain, so that this warning will be seen in any non-assert build.) Some of these are of long standing while others are pretty recent, but it only seems worth fixing them in HEAD. Jaime Casanova, tweaked a bit by me Discussion: https://postgr.es/m/CAJGNTeMcYAMJdPAom52dppLMtF-UnEZi0dooj==75OEv1EoBZA@mail.gmail.com
*	Update out-of-date comment in StartupXLOG.	Robert Haas	2018-02-07
\| \| \| \| \| \| \| \| \|	Commit 4b0d28de06b28e57c540fca458e4853854fbeaf8 should have updated this comment, but did not. Thomas Munro Discussion: http://postgr.es/m/CAEepm=0iJ8aqQcF9ij2KerAkuHF3SwrVTzjMdm1H4w++nfBf9A@mail.gmail.com
*	Support all SQL:2011 options for window frame clauses.	Tom Lane	2018-02-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING" frame boundaries in window functions. We'd punted on that back in the original patch to add window functions, because it was not clear how to do it in a reasonably data-type-extensible fashion. That problem is resolved here by adding the ability for btree operator classes to provide an "in_range" support function that defines how to add or subtract the RANGE offset value. Factoring it this way also allows the operator class to avoid overflow problems near the ends of the datatype's range, if it wishes to expend effort on that. (In the committed patch, the integer opclasses handle that issue, but it did not seem worth the trouble to avoid overflow failures for datetime types.) The patch includes in_range support for the integer_ops opfamily (int2/int4/int8) as well as the standard datetime types. Support for other numeric types has been requested, but that seems like suitable material for a follow-on patch. In addition, the patch adds GROUPS mode which counts the offset in ORDER-BY peer groups rather than rows, and it adds the frame_exclusion options specified by SQL:2011. As far as I can see, we are now fully up to spec on window framing options. Existing behaviors remain unchanged, except that I changed the errcode for a couple of existing error reports to meet the SQL spec's expectation that negative "offset" values should be reported as SQLSTATE 22013. Internally and in relevant parts of the documentation, we now consistently use the terminology "offset PRECEDING/FOLLOWING" rather than "value PRECEDING/FOLLOWING", since the term "value" is confusingly vague. Oliver Ford, reviewed and whacked around some by me Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
*	Doc: move info for btree opclass implementors into main documentation.	Tom Lane	2018-02-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Up to now, useful info for writing a new btree opclass has been buried in the backend's nbtree/README file. Let's move it into the SGML docs, in preparation for extending it with info about "in_range" functions in the upcoming window RANGE patch. To do this, I chose to create a new chapter for btree indexes in Part VII (Internals), parallel to the chapters that exist for the newer index AMs. This is a pretty short chapter as-is. At some point somebody might care to flesh it out with more detail about btree internals, but that is beyond the scope of my ambition for today. Discussion: https://postgr.es/m/23141.1517874668@sss.pgh.pa.us
*	Be more wary about shm_toc_lookup failure.	Tom Lane	2018-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 445dbd82a basically missed the point of commit d46633506, which was that we shouldn't allow shm_toc_lookup() failure to lead to a core dump or assertion crash, because the odds of such a failure should never be considered negligible. It's correct that we can't expect the PARALLEL_KEY_ERROR_QUEUE TOC entry to be there if we have no workers. But if we have no workers, we're not going to do anything in this function with the lookup result anyway, so let's just skip it. That lets the code use the easy-to-prove-safe noError=false case, rather than anything requiring effort to review. Back-patch to v10, like the previous commit. Discussion: https://postgr.es/m/3647.1517601675@sss.pgh.pa.us
*	Support parallel btree index builds.	Robert Haas	2018-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To make this work, tuplesort.c and logtape.c must also support parallelism, so this patch adds that infrastructure and then applies it to the particular case of parallel btree index builds. Testing to date shows that this can often be 2-3x faster than a serial index build. The model for deciding how many workers to use is fairly primitive at present, but it's better than not having the feature. We can refine it as we get more experience. Peter Geoghegan with some help from Rushabh Lathia. While Heikki Linnakangas is not an author of this patch, he wrote other patches without which this feature would not have been possible, and therefore the release notes should possibly credit him as an author of this feature. Reviewed by Claudio Freire, Heikki Linnakangas, Thomas Munro, Tels, Amit Kapila, me. Discussion: http://postgr.es/m/CAM3SWZQKM=Pzc=CAHzRixKjp2eO5Q0Jg1SoFQqeXFQ647JiwqQ@mail.gmail.com Discussion: http://postgr.es/m/CAH2-Wz=AxWqDoVvGU7dq856S4r6sJAj6DBn7VMtigkB33N5eyg@mail.gmail.com
*	Add new function WaitForParallelWorkersToAttach.	Robert Haas	2018-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Once this function has been called, we know that all workers have started and attached to their error queues -- so if any of them subsequently exit uncleanly, we'll be sure to throw an ERROR promptly. Otherwise, users of the ParallelContext machinery must be careful not to wait forever for a worker that has failed to start. Parallel query manages to work without needing this for reasons explained in new comments added by this patch, but it's a useful primitive for other parallel operations, such as the pending patch to make creating a btree index run in parallel. Amit Kapila, revised by me. Additional review by Peter Geoghegan. Discussion: http://postgr.es/m/CAA4eK1+e2MzyouF5bg=OtyhDSX+=Ao=3htN=T-r_6s3gCtKFiw@mail.gmail.com
*	Fix possible failure to mark hash metapage dirty.	Robert Haas	2018-02-01
\| \| \| \| \| \| \|	Report and suggested fix by Lixian Zou. Amit Kapila put it in the form of a patch and reviewed. Discussion: http://postgr.es/m/151739848647.1239.12528851873396651946@wrigleys.postgresql.org
*	Avoid unnecessary use of pg_strcasecmp for already-downcased identifiers.	Tom Lane	2018-01-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have a lot of code in which option names, which from the user's viewpoint are logically keywords, are passed through the grammar as plain identifiers, and then matched to string literals during command execution. This approach avoids making words into lexer keywords unnecessarily. Some places matched these strings using plain strcmp, some using pg_strcasecmp. But the latter should be unnecessary since identifiers would have been downcased on their way through the parser. Aside from any efficiency concerns (probably not a big factor), the lack of consistency in this area creates a hazard of subtle bugs due to different places coming to different conclusions about whether two option names are the same or different. Hence, standardize on using strcmp() to match any option names that are expected to have been fed through the parser. This does create a user-visible behavioral change, which is that while formerly all of these would work: alter table foo set (fillfactor = 50); alter table foo set (FillFactor = 50); alter table foo set ("fillfactor" = 50); alter table foo set ("FillFactor" = 50); now the last case will fail because that double-quoted identifier is different from the others. However, none of our documentation says that you can use a quoted identifier in such contexts at all, and we should discourage doing so since it would break if we ever decide to parse such constructs as true lexer keywords rather than poor man's substitutes. So this shouldn't create a significant compatibility issue for users. Daniel Gustafsson, reviewed by Michael Paquier, small changes by me Discussion: https://postgr.es/m/29405B24-564E-476B-98C0-677A29805B84@yesql.se
*	Update obsolete sentence in README.parallel.	Robert Haas	2018-01-23
\| \| \| \| \| \|	Since 9.6, heavyweight locking is not an abstract and unhandled concern of the parallel machinery, but rather something to which we have a specific approach.
*	Report an ERROR if a parallel worker fails to start properly.	Robert Haas	2018-01-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 28724fd90d2f85a0573a8107b48abad062a86d83 fixed things so that if a background worker fails to start due to fork() failure or because it is terminated before startup succeeds, BGWH_STOPPED will be reported. However, that only helps if the code that uses the background worker machinery notices the change in status, and the code in parallel.c did not. To fix that, do two things. First, make sure that when a worker exits, it triggers the leader to read from error queues. That way, if a worker which has attached to an error queue exits uncleanly, the leader is sure to throw some error, either the contents of the ErrorResponse sent by the worker, or "lost connection to parallel worker" if it exited without sending one. To cover the case where the worker never starts up in the first place or exits before attaching to the error queue, the ParallelContext now keeps track of which workers have sent at least one message via the error queue. A worker which sends no messages by the time the parallel operation finishes will be checked to see whether it exited before attaching to the error queue; if so, a new error message, "parallel worker failed to initialize", will be reported. If not, we'll continue to wait until it either starts up and exits cleanly, starts up and exits uncleanly, or fails to start, and then take the appropriate action. Patch by me, reviewed by Amit Kapila. Discussion: http://postgr.es/m/CA+TgmoYnBgXgdTu6wk5YPdWhmgabYc9nY_pFLq=tB=FSLYkD8Q@mail.gmail.com
*	Replace AclObjectKind with ObjectType	Peter Eisentraut	2018-01-19
\| \| \| \| \| \| \| \| \|	AclObjectKind was basically just another enumeration for object types, and we already have a preferred one for that. It's only used in aclcheck_error. By using ObjectType instead, we can also give some more precise error messages, for example "index" instead of "relation". Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
*	Local partitioned indexes	Alvaro Herrera	2018-01-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When CREATE INDEX is run on a partitioned table, create catalog entries for an index on the partitioned table (which is just a placeholder since the table proper has no data of its own), and recurse to create actual indexes on the existing partitions; create them in future partitions also. As a convenience gadget, if the new index definition matches some existing index in partitions, these are picked up and used instead of creating new ones. Whichever way these indexes come about, they become attached to the index on the parent table and are dropped alongside it, and cannot be dropped on isolation unless they are detached first. To support pg_dump'ing these indexes, add commands CREATE INDEX ON ONLY <table> (which creates the index on the parent partitioned table, without recursing) and ALTER INDEX ATTACH PARTITION (which is used after the indexes have been created individually on each partition, to attach them to the parent index). These reconstruct prior database state exactly. Reviewed-by: (in alphabetical order) Peter Eisentraut, Robert Haas, Amit Langote, Jesper Pedersen, Simon Riggs, David Rowley Discussion: https://postgr.es/m/20171113170646.gzweigyrgg6pwsg4@alvherre.pgsql
*	Transfer state pertaining to pending REINDEX operations to workers.	Robert Haas	2018-01-19
\| \| \| \| \| \| \| \| \| \| \| \|	This will allow the pending patch for parallel CREATE INDEX to work on system catalogs, and to provide the same level of protection against use of user indexes while they are being rebuilt that we have for non-parallel CREATE INDEX. Patch by me, reviewed by Peter Geoghegan. Discussion: http://postgr.es/m/CA+TgmoYN-YQU9JsGQcqFLovZ-C+Xgp1_xhJQad=cunGG-_p5gg@mail.gmail.com Discussion: http://postgr.es/m/CAH2-Wzkv4UNkXYhqQRqk-u9rS7h5c-4cCW+EqQ8K_WSeS43aZg@mail.gmail.com
*	Change some bogus PageGetLSN calls to BufferGetLSNAtomic	Alvaro Herrera	2018-01-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As src/backend/access/transam/README says, PageGetLSN may only be called by processes holding either exclusive lock on buffer, or a shared lock on buffer plus buffer header lock. Therefore any place that only holds a shared buffer lock must use BufferGetLSNAtomic instead of PageGetLSN, which internally obtains buffer header lock prior to reading the LSN. A few callsites failed to comply with this rule. This was detected by running all tests under a new (not committed) assertion that verifies PageGetLSN locking contract. All but one of the callsites that failed the assertion are fixed by this patch. Remaining callsites were inspected manually and determined not to need any change. The exception (unfixed callsite) is in TestForOldSnapshot, which only has a Page argument, making it impossible to access the corresponding Buffer from it. Fixing that seems a much larger patch that will have to be done separately; and that's just as well, since it was only introduced in 9.6 and other bugs are much older. Some of these bugs are ancient; backpatch all the way back to 9.3. Authors: Jacob Champion, Asim Praveen, Ashwin Agrawal Reviewed-by: Michaël Paquier Discussion: https://postgr.es/m/CABAq_6GXgQDVu3u12mK9O5Xt5abBZWQ0V40LZCE+oUf95XyNFg@mail.gmail.com
*	Add TIMELINE to backup_label file	Simon Riggs	2018-01-06
\| \| \| \| \| \| \|	Allows new test to confirm timelines match Author: Michael Paquier Reviewed-by: David Steele
*	Clean up tupdesc.c for recent changes.	Tom Lane	2018-01-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TupleDescCopy needs to have the same effects as CreateTupleDescCopy in that, since it doesn't copy constraints, it should clear the per-attribute fields associated with them. Oversight in commit cc5f81366. Since TupleDescCopy has already established the presumption that it can just flat-copy the entire attribute array in one go, propagate that approach into CreateTupleDescCopy and CreateTupleDescCopyConstr. (I'm suspicious that this would lead to valgrind complaints if we had any trailing padding in the struct, but we do not, and anyway fixing that seems like a job for a separate commit.) Add some better comments. Thomas Munro, reviewed by Vik Fearing, some additional hacking by me Discussion: https://postgr.es/m/CAEepm=0NvOGZ8B6GbQyQe2C_c2m3LKJ9w=8OMBaYRLgZ_Gw6Nw@mail.gmail.com
*	Update copyright for 2018	Bruce Momjian	2018-01-02
\| \| \| \|	Backpatch-through: certain files through 9.3
*	Don't cast between GinNullCategory and bool	Peter Eisentraut	2018-01-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The original idea was that we could use an isNull-style bool array directly as a GinNullCategory array. However, the existing code already acknowledges that that doesn't really work, because of the possibility that bool as currently defined can have arbitrary bit patterns for true values. So it has to loop through the nullFlags array to set each bool value to an acceptable value. But if we are looping through the whole array anyway, we might as well build a proper GinNullCategory array instead and abandon the type casting. That makes the code much safer in case bool is ever changed to something else. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
*	Extend near-wraparound hints to include replication slots	Simon Riggs	2017-12-29
\| \| \| \| \|	Author: Feike Steenbergen Reviewed-by: Michael Paquier
*	Add optional compression method to SP-GiST	Teodor Sigaev	2017-12-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch allows to have different types of column and value stored in leaf tuples of SP-GiST. The main application of feature is to transform complex column type to simple indexed type or for truncating too long value, transformation could be lossy. Simple example: polygons are converted to their bounding boxes, this opclass follows. Authors: me, Heikki Linnakangas, Alexander Korotkov, Nikita Glukhov Reviewed-By: all authors + Darafei Praliaskouski Discussions: https://www.postgresql.org/message-id/5447B3FF.2080406@sigaev.ru https://www.postgresql.org/message-id/flat/54907069.1030506@sigaev.ru#54907069.1030506@sigaev.ru
*	Adjust assertion in GetCurrentCommandId.	Robert Haas	2017-12-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	currentCommandIdUsed is only used to skip redundant increments of the command counter, and CommandCounterIncrement() is categorically denied under parallelism anyway. Therefore, it's OK for GetCurrentCommandId() to mark the counter value used, as long as it happens in the leader, not a worker. Prior to commit e9baa5e9fa147e00a2466ab2c40eb99c8a700824, the slightly incorrect check didn't matter, but now it does. A test case added by commit 1804284042e659e7d16904e7bbb0ad546394b6a3 uncovered the problem by accident; it caused failures with force_parallel_mode=on/regress. Report and review by Andres Freund. Patch by me. Discussion: http://postgr.es/m/20171221143106.5lhtygohvmazli3x@alap3.anarazel.de
*	Cancel CV sleep during subtransaction abort.	Robert Haas	2017-12-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Generally, error recovery paths that need to do things like LWLockReleaseAll and pgstat_report_wait_end also need to call ConditionVariableCancelSleep, but AbortSubTransaction was missed. Since subtransaction abort might destroy up the DSM segment that contains the ConditionVariable stored in cv_sleep_target, this can result in a crash for anything using condition variables. Reported and diagnosed by Andres Freund. Discussion: http://postgr.es/m/20171221110048.rxk6464azzl5t2fi@alap3.anarazel.de
*	Fix bug in cancellation of non-exclusive backup to avoid assertion failure.	Fujii Masao	2017-12-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously an assertion failure occurred when pg_stop_backup() for non-exclusive backup was aborted while it's waiting for WAL files to be archived. This assertion failure happened in do_pg_abort_backup() which was called when a non-exclusive backup was canceled. do_pg_abort_backup() assumes that there is at least one non-exclusive backup running when it's called. But pg_stop_backup() can be canceled even after it marks the end of non-exclusive backup (e.g., during waiting for WAL archiving). This broke the assumption that do_pg_abort_backup() relies on, and which caused an assertion failure. This commit changes do_pg_abort_backup() so that it does nothing when non-exclusive backup has been already marked as completed. That is, the asssumption is also changed, and do_pg_abort_backup() now can handle even the case where it's called when there is no running backup. Backpatch to 9.6 where SQL-callable non-exclusive backup was added. Author: Masahiko Sawada and Michael Paquier Reviewed-By: Robert Haas and Fujii Masao Discussion: https://www.postgresql.org/message-id/CAD21AoD2L1Fu2c==gnVASMyFAAaq3y-AQ2uEVj-zTCGFFjvmDg@mail.gmail.com
*	Perform a lot more sanity checks when freezing tuples.	Andres Freund	2017-12-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The previous commit has shown that the sanity checks around freezing aren't strong enough. Strengthening them seems especially important because the existance of the bug has caused corruption that we don't want to make even worse during future vacuum cycles. The errors are emitted with ereport rather than elog, despite being "should never happen" messages, so a proper error code is emitted. To avoid superflous translations, mark messages as internal. Author: Andres Freund and Alvaro Herrera Reviewed-By: Alvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de Backpatch: 9.3-
*	Fix parallel index scan hang with deleted or half-dead pages.	Robert Haas	2017-12-13
\| \| \| \| \| \| \| \| \| \|	The previous coding forgot to release the scan before seizing it again, leading to a lockup. Report by Patrick Hemmer. Diagnosis by Thomas Munro. Patch by Amit Kapila. Discussion: http://postgr.es/m/CAEepm=2xZUcOGP9V0O_G0=2P2wwXwPrkF=upWTCJSisUxMnuSg@mail.gmail.com
*	Rethink MemoryContext creation to improve performance.	Tom Lane	2017-12-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes a number of interrelated changes to reduce the overhead involved in creating/deleting memory contexts. The key ideas are: * Include the AllocSetContext header of an aset.c context in its first malloc request, rather than allocating it separately in TopMemoryContext. This means that we now always create an initial or "keeper" block in an aset, even if it never receives any allocation requests. * Create freelists in which we can save and recycle recently-destroyed asets (this idea is due to Robert Haas). * In the common case where the name of a context is a constant string, just store a pointer to it in the context header, rather than copying the string. The first change eliminates a palloc/pfree cycle per context, and also avoids bloat in TopMemoryContext, at the price that creating a context now involves a malloc/free cycle even if the context never receives any allocations. That would be a loser for some common usage patterns, but recycling short-lived contexts via the freelist eliminates that pain. Avoiding copying constant strings not only saves strlen() and strcpy() overhead, but is an essential part of the freelist optimization because it makes the context header size constant. Currently we make no attempt to use the freelist for contexts with non-constant names. (Perhaps someday we'll need to think harder about that, but in current usage, most contexts with custom names are long-lived anyway.) The freelist management in this initial commit is pretty simplistic, and we might want to refine it later --- but in common workloads that will never matter because the freelists will never get full anyway. To create a context with a non-constant name, one is now required to call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME option. AllocSetContextCreate becomes a wrapper macro, and it includes a test that will complain about non-string-literal context name parameters on gcc and similar compilers. An unfortunate side effect of making AllocSetContextCreate a macro is that one is now required to use the size parameter abstraction macros (ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of writing out individual size parameters no longer works unless you switch to AllocSetContextCreateExtended. Internally to the memory-context-related modules, the context creation APIs are simplified, removing the rather baroque original design whereby a context-type module called mcxt.c which then called back into the context-type module. That saved a bit of code duplication, but not much, and it prevented context-type modules from exercising control over the allocation of context headers. In passing, I converted the test-and-elog validation of aset size parameters into Asserts to save a few more cycles. The original thought was that callers might compute size parameters on the fly, but in practice nobody does that, so it's useless to expend cycles on checking those numbers in production builds. Also, mark the memory context method-pointer structs "const", just for cleanliness. Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
*	Fix mistake in comment	Peter Eisentraut	2017-12-08
\| \| \| \|	Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
*	Simplify do_pg_start_backup's API by opening pg_tblspc internally.	Tom Lane	2017-12-04
\| \| \| \| \| \| \| \| \| \| \| \| \|	do_pg_start_backup() expects its callers to pass in an open DIR pointer for the pg_tblspc directory, but there's no apparent advantage in that. It complicates the callers without adding any flexibility, and there's no robustness advantage, since we surely have to be prepared for errors during the scan of pg_tblspc anyway. In fact, by holding an extra kernel resource during operations like the preliminary checkpoint, we might be making things a fraction more failure-prone not less. Hence, remove that argument and open the directory just for the duration of the actual scan. Discussion: https://postgr.es/m/28752.1512413887@sss.pgh.pa.us
*	Clean up assorted messiness around AllocateDir() usage.	Tom Lane	2017-12-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes a couple of low-probability bugs that could lead to reporting an irrelevant errno value (and hence possibly a wrong SQLSTATE) concerning directory-open or file-open failures. It also fixes places where we took shortcuts in reporting such errors, either by using elog instead of ereport or by using ereport but forgetting to specify an errcode. And it eliminates a lot of just plain redundant error-handling code. In service of all this, export fd.c's formerly-static function ReadDirExtended, so that external callers can make use of the coding pattern dir = AllocateDir(path); while ((de = ReadDirExtended(dir, path, LOG)) != NULL) if they'd like to treat directory-open failures as mere LOG conditions rather than errors. Also fix FreeDir to be a no-op if we reach it with dir == NULL, as such a coding pattern would cause. Then, remove code at many call sites that was throwing an error or log message for AllocateDir failure, as ReadDir or ReadDirExtended can handle that job just fine. Aside from being a net code savings, this gets rid of a lot of not-quite-up-to-snuff reports, as mentioned above. (In some places these changes result in replacing a custom error message such as "could not open tablespace directory" with more generic wording "could not open directory", but it was agreed that the custom wording buys little as long as we report the directory name.) In some other call sites where we can't just remove code, change the error reports to be fully project-style-compliant. Also reorder code in restoreTwoPhaseData that was acquiring a lock between AllocateDir and ReadDir; in the unlikely but surely not impossible case that LWLockAcquire changes errno, AllocateDir failures would be misreported. There is no great value in opening the directory before acquiring TwoPhaseStateLock, so just do it in the other order. Also fix CheckXLogRemoved to guarantee that it preserves errno, as quite a number of call sites are implicitly assuming. (Again, it's unlikely but I think not impossible that errno could change during a SpinLockAcquire. If so, this function was broken for its own purposes as well as breaking callers.) And change a few places that were using not-per-project-style messages, such as "could not read directory" when "could not open directory" is more correct. Back-patch the exporting of ReadDirExtended, in case we have occasion to back-patch some fix that makes use of it; it's not needed right now but surely making it global is pretty harmless. Also back-patch the restoreTwoPhaseData and CheckXLogRemoved fixes. The rest of this is essentially cosmetic and need not get back-patched. Michael Paquier, with a bit of additional work by me Discussion: https://postgr.es/m/CAB7nPqRpOCxjiirHmebEFhXVTK7V5Jvw4bz82p7Oimtsm3TyZA@mail.gmail.com
*	Update typedefs.list and re-run pgindent	Robert Haas	2017-11-29
\| \| \| \|	Discussion: http://postgr.es/m/CA+TgmoaA9=1RWKtBWpDaj+sF3Stgc8sHgf5z=KGtbjwPLQVDMA@mail.gmail.com
*	Fix ReinitializeParallelDSM to tolerate finding no error queues.	Robert Haas	2017-11-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit d4663350646ca0c069a36d906155a0f7e3372eb7 changed things so that shm_toc_lookup would fail with an error rather than silently returning NULL in the hope that such failures would be reported in a useful way rather than via a system crash. However, it overlooked the fact that the lookup of PARALLEL_KEY_ERROR_QUEUE in ReinitializeParallelDSM is expected to fail when no DSM segment was created in the first place; in that case, we end up with a backend-private memory segment that still contains an entry for PARALLEL_KEY_FIXED but no others. Consequently a benign failure to initialize parallelism can escalate into an elog(ERROR); repair. Discussion: http://postgr.es/m/CA+Tgmob8LFw55DzH1QEREpBEA9RJ_W_amhBFCVZ6WMwUhVpOqg@mail.gmail.com
*	Fix typo.	Robert Haas	2017-11-28
\| \| \| \| \| \|	Masahiko Sawada Discussion: http://postgr.es/m/CAD21AoCq_QG6UEo6yb-purmhLQjDLsCA2_B+8Wh3ah9P2-XmrQ@mail.gmail.com
*	Pad XLogReaderState's per-buffer data_bufsz more aggressively.	Simon Riggs	2017-11-27
\| \| \| \| \| \| \| \| \| \| \| \|	Originally, we palloc'd this buffer just barely big enough to hold the largest xlog backup block seen so far. We now MAXALIGN the palloc size. The original coding could result in many repeated palloc cycles, in the worst case where we see a series ofgradually larger xlog records. We ameliorate that similarly to 8735978e7aebfbc499843630131c18d1f7346c79 by imposing a minimum buffer size of BLCKSZ. Discussion: https://postgr.es/m/E1eHa4J-0006hI-Q8@gemulon.postgresql.org
*	Pad XLogReaderState's main_data buffer more aggressively.	Tom Lane	2017-11-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Originally, we palloc'd this buffer just barely big enough to hold the largest xlog record seen so far. It turns out that that can result in valgrind complaints, because some compilers will emit code that assumes it can safely fetch padding bytes at the end of a struct, and those padding bytes were unallocated so far as aset.c was concerned. We can fix that by MAXALIGN'ing the palloc request size, ensuring that it is big enough to include any possible padding that might've been omitted from the on-disk record. An additional objection to the original coding is that it could result in many repeated palloc cycles, in the worst case where we see a series of gradually larger xlog records. We can ameliorate that cheaply by imposing a minimum buffer size that's large enough for most xlog records. BLCKSZ/2 was chosen after a bit of discussion. In passing, remove an obsolete comment in struct xl_heap_new_cid that the combocid field is free due to alignment considerations. Perhaps that was true at some point, but it's not now. Back-patch to 9.5 where this code came in. Discussion: https://postgr.es/m/E1eHa4J-0006hI-Q8@gemulon.postgresql.org
*	Parameter toast_tuple_target controls TOAST for new rows	Simon Riggs	2017-11-20
\| \| \| \| \| \| \| \| \| \| \| \|	Specifies the point at which we try to move long column values into TOAST tables. No effect on existing rows. Discussion: https://postgr.es/m/CANP8+jKsVmw6CX6YP9z7zqkTzcKV1+Uzr3XjKcZW=2Ya00OyQQ@mail.gmail.com Author: Simon Riggs <simon@2ndQudrant.com> Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndQuadrant.com>
*	Fix broken cleanup interlock for GIN pending list.	Robert Haas	2017-11-16
\| \| \| \| \| \| \| \| \| \| \| \| \|	The pending list must (for correctness) always be cleaned up by vacuum, and should (for the avoidance of surprising behavior) always be cleaned up by an explicit call to gin_clean_pending_list, but cleanup is optional when inserting. The old logic got this backward: cleanup was forced if (stats == NULL), but that's going to be false when vacuuming and true for inserts. Masahiko Sawada, reviewed by me. Discussion: http://postgr.es/m/CAD21AoBLUSyiYKnTYtSAbC+F=XDjiaBrOUEGK+zUXdQ8owfPKw@mail.gmail.com
*	Add some const decorations to prototypes	Peter Eisentraut	2017-11-10
\| \| \| \|	Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
*	Change TRUE/FALSE to true/false	Peter Eisentraut	2017-11-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
*	Remove secondary checkpoint	Simon Riggs	2017-11-07
\| \| \| \| \| \| \| \| \| \|	Previously server reserved WAL for last two checkpoints, which used too much disk space for small servers. Bumps PG_CONTROL_VERSION Author: Simon Riggs <simon@2ndQuadrant.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
*	Flag index metapages as standard-format in xlog.c calls.	Tom Lane	2017-11-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	btree, hash, and bloom indexes all set up their metapages in standard format (that is, with pd_lower and pd_upper correctly delimiting the unused area); but they mostly didn't inform the xlog routines of this. When calling log_newpage[_buffer], this is bad because it loses the opportunity to compress unused data out of the WAL record. When calling XLogRegisterBuffer, it's not such a performance problem because all of these call sites also use REGBUF_WILL_INIT, preventing an FPI image from being written. But it's still a good idea to provide the flag when relevant, because that aids WAL consistency checking. This completes the project of getting all the in-core index AMs to handle their metapage WAL operations similarly. Amit Kapila, reviewed by Michael Paquier Discussion: https://postgr.es/m/0d273805-0e9e-ec1a-cb84-d4da400b8f85@lab.ntt.co.jp
*	Fix thinkos in BRIN summarization	Alvaro Herrera	2017-11-03
\| \| \| \| \| \|	The previous commit contained a thinko that made a single-range summarization request process from there to end of table. Fix by setting the correct end range point. Per buildfarm.
*	Fix BRIN summarization concurrent with extension	Alvaro Herrera	2017-11-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a process is extending a table concurrently with some BRIN summarization process, it is possible for the latter to miss pages added by the former because the number of pages is computed ahead of time. Fix by determining a fresh relation size after inserting the placeholder tuple: any process that further extends the table concurrently will update the placeholder tuple, while previous pages will be processed by the heap scan. Reported-by: Tomas Vondra Reviewed-by: Tom Lane Author: Álvaro Herrera Discussion: https://postgr.es/m/083d996a-4a8a-0e13-800a-851dd09ad8cc@2ndquadrant.com Backpatch-to: 9.5
*	Set the metapage's pd_lower correctly in brin, gin, and spgist indexes.	Tom Lane	2017-11-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, these index types left the pd_lower field set to the default SizeOfPageHeaderData, which is really a lie because it ought to point past whatever space is being used for metadata. The coding accidentally failed to fail because we never told xlog.c that the metapage is of standard format --- but that's not very good, because it impedes WAL consistency checking, and in some cases prevents compression of full-page images. To fix, ensure that we set pd_lower correctly, not only when creating a metapage but whenever we write it out (these apparently redundant steps are needed to cope with pg_upgrade'd indexes that don't yet contain the right value). This allows telling xlog.c that the page is of standard format. The WAL consistency check mask functions are made to mask only if pd_lower appears valid, which I think is likely unnecessary complication, since any metapage appearing in a v11 WAL stream should contain valid pd_lower. But it doesn't cost much to be paranoid. Amit Langote, reviewed by Michael Paquier and Amit Kapila Discussion: https://postgr.es/m/0d273805-0e9e-ec1a-cb84-d4da400b8f85@lab.ntt.co.jp
*	Fix corner-case errors in brin_doupdate().	Tom Lane	2017-11-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In some cases the BRIN code releases lock on an index page, and later re-acquires lock and tries to check that the tuple it was working on is still there. That check was a couple bricks shy of a load. It didn't consider that the page might have turned into a "revmap" page. (The samepage code path doesn't call brin_getinsertbuffer(), so it isn't protected by the checks for revmap status there.) It also didn't check whether the tuple offset was now off the end of the linepointer array. Since commit 24992c6db the latter case is pretty common, but at least in principle it could have occurred before that. The net result is that concurrent updates of a BRIN index could fail with errors like "invalid index offnum" or "inconsistent range map". Per report from Tomas Vondra. Back-patch to 9.5, since this code is substantially the same in all versions containing BRIN. Discussion: https://postgr.es/m/10d2b9f9-f427-03b8-8ad9-6af4ecacbee9@2ndquadrant.com
*	Revert bogus fixes of HOT-freezing bug	Alvaro Herrera	2017-11-02
\| \| \| \| \| \| \| \| \| \|	It turns out we misdiagnosed what the real problem was. Revert the previous changes, because they may have worse consequences going forward. A better fix is forthcoming. The simplistic test case is kept, though disabled. Discussion: https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de
*	Fix problems with the "role" GUC and parallel query.	Robert Haas	2017-10-29
\| \| \| \| \| \| \| \| \| \|	Without this fix, dropping a role can sometimes result in parallel query failures in sessions that have used "SET ROLE" to assume the dropped role, even if that setting isn't active any more. Report by Pavan Deolasee. Patch by Amit Kapila, reviewed by me. Discussion: http://postgr.es/m/CABOikdOomRcZsLsLK+Z+qENM1zxyaWnAvFh3MJZzZnnKiF+REg@mail.gmail.com
*	Replace remaining uses of pq_sendint with pq_sendint{8,16,32}.	Andres Freund	2017-10-11
\| \| \| \| \| \| \|	pq_sendint() remains, so extension code doesn't unnecessarily break. Author: Andres Freund Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de
*	Improve performance of SendRowDescriptionMessage.	Andres Freund	2017-10-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There's three categories of changes leading to better performance: - Splitting the per-attribute part of SendRowDescriptionMessage into a v2 and a v3 version allows avoiding branches for every attribute. - Preallocating the size of the buffer to be big enough for all attributes and then using pq_write* avoids unnecessary buffer size checks & resizing. - Reusing a persistently allocated StringInfo for all SendRowDescriptionMessage() invocations avoids repeated allocations & reallocations. Author: Andres Freund Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de
*	Use one stringbuffer for all rows printed in printtup.c.	Andres Freund	2017-10-11
\| \| \| \| \| \| \| \| \| \|	This avoids newly allocating, and then possibly growing, the stringbuffer for every row. For wide rows this can substantially reduce memory allocator overhead, at the price of not immediately reducing memory usage after outputting an especially wide row. Author: Andres Freund Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de