Commit message (Author, Age)
...
* Disable WAL-skipping optimization for COPY on views and foreign tables (Michael Paquier, 2018-12-23)
  COPY can skip writing WAL when loading data into a table that was created in the same transaction as the one loading the data. However, this cannot work on views or foreign tables, as it would result in trying to flush relation files which do not exist. So disable the optimization so that commands work the same way with any configuration of wal_level.
  Tests are added to cover the different cases. They need wal_level set to minimal for the problem to show up, and that is not the default configuration.
  Reported-by: Luis M. Carril, Etsuro Fujita
  Author: Amit Langote, Michael Paquier
  Reviewed-by: Etsuro Fujita
  Discussion: https://postgr.es/m/15552-c64aa14c5c22f63c@postgresql.org
  Backpatch-through: 10, where support for COPY on views has been added, while v11 has added support for COPY on foreign tables.
* Fix ancient compiler warnings and typos in !HAVE_SYMLINK code (Peter Eisentraut, 2018-12-22)
  This has never been correct since this code was introduced.
* Check for conflicting queries during replay of gistvacuumpage() (Alexander Korotkov, 2018-12-21)
  013ebc0a7b implements so-called GiST microvacuum. That is, gistgettuple() marks index tuples as dead when kill_prior_tuple is set. Later, when a new tuple insertion claims page space, those dead index tuples are physically deleted from the page. When this deletion is replayed on a standby, it might conflict with read-only queries. But 013ebc0a7b doesn't handle this. That may lead to the disappearance of some tuples from read-only snapshots on the standby.
  This commit implements resolution of conflicts between replay of GiST microvacuum and standby queries. On master we implement a new WAL record type, XLOG_GIST_DELETE, which comprises the necessary information. On stable releases we have to be tricky to keep WAL compatibility: the information required for conflict processing is just appended to the data of the XLOG_GIST_PAGE_UPDATE record, so a PostgreSQL version that doesn't know about conflict processing will just ignore it.
  Reported-by: Andres Freund
  Diagnosed-by: Andres Freund
  Discussion: https://postgr.es/m/20181212224524.scafnlyjindmrbe6%40alap3.anarazel.de
  Author: Alexander Korotkov
  Backpatch-through: 9.6
* Fix lock level used for partition when detaching it (Alvaro Herrera, 2018-12-20)
  For probably bogus reasons, we acquire only AccessShareLock on the partition when we try to detach it from its parent partitioned table. This can cause ugly things to happen if another transaction is doing any sort of DDL to the partition concurrently.
  Upgrade that lock to ShareUpdateExclusiveLock, which per discussion seems to be the minimum needed.
  Reported by Robert Haas.
  Discussion: https://postgr.es/m/CA+TgmoYruJQ+2qnFLtF1xQtr71pdwgfxy3Ziy-TxV28M6pEmyA@mail.gmail.com
* Doc: fix ancient mistake in search_path documentation. (Tom Lane, 2018-12-20)
  "$user" in a search_path string is replaced by CURRENT_USER not SESSION_USER. (It actually was SESSION_USER in the initial implementation, but we changed it shortly later, and evidently forgot to fix the docs to match.)
  Noted by antonov@stdpr.ru
  Discussion: https://postgr.es/m/159151fb45d490c8d31ea9707e9ba99d@stdpr.ru
* DETACH PARTITION: hold locks on indexes until end of transaction (Alvaro Herrera, 2018-12-20)
  When a partition is detached from its parent, we acquire locks on all attached indexes to also detach them ... but we release those locks immediately. This is a violation of the policy of keeping locks on user objects to the end of the transaction. Bug introduced in 8b08f7d4820f.
  It's unclear that there are any ill effects possible, but it's clearly wrong nonetheless. It's likely that bad behavior *is* possible, but mostly because the relation that the index is for is only locked with AccessShareLock, which is an older bug that shall be fixed separately.
  While touching that line of code, close the index opened with index_open() using index_close() instead of relation_close(). No difference in practice, but let's be consistent.
  Unearthed by Robert Haas.
  Discussion: https://postgr.es/m/CA+TgmoYruJQ+2qnFLtF1xQtr71pdwgfxy3Ziy-TxV28M6pEmyA@mail.gmail.com
* Fix ADD IF NOT EXISTS used in conjunction with ALTER TABLE ONLY (Greg Stark, 2018-12-19)
  The flag for IF NOT EXISTS was only being passed down in the normal recursing case. It's been this way since originally added in 9.6 in commit 2cd40adb85, so backpatch back to 9.6.
* Doc: fix incorrect example of collecting arguments with fmgr macros. (Tom Lane, 2018-12-19)
  Thinko in commit f66912b0a. Back-patch to v10, as that was.
  Discussion: https://postgr.es/m/154522283371.15419.15167411691473730460@wrigleys.postgresql.org
* Correct obsolete nbtree recovery comments. (Peter Geoghegan, 2018-12-18)
  Commit 40dae7ec537, which made the handling of interrupted nbtree page splits more robust, removed an nbtree-specific end-of-recovery cleanup step. This meant that it was no longer possible to complete an interrupted page split during recovery. However, a reference to recovery as a reason for using a NULL stack while inserting into a parent page was missed. Remove the reference.
  Remove a similar obsolete reference to recovery that was introduced much more recently, as part of the btree fastpath optimization enhancement that made it into Postgres 11 (commit 2b272734, and follow-up commits).
  Backpatch: 11-, where the fastpath optimization was introduced.
* Doc: fix typo in "Generic File Access Functions" section. (Tatsuo Ishii, 2018-12-19)
  Issue reported by me and fix by Tom Lane.
  Discussion: https://postgr.es/m/20181219.080458.1434575730369741406.t-ishii%40sraoss.co.jp
* Fix ancient thinko in mergejoin cost estimation. (Tom Lane, 2018-12-18)
  "rescanratio" was computed as 1 + rescanned-tuples / total-inner-tuples, which is sensible if it's to be multiplied by total-inner-tuples or a cost value corresponding to scanning all the inner tuples. But in reality it was (mostly) multiplied by inner_rows or a related cost, numbers that take into account the possibility of stopping short of scanning the whole inner relation thanks to a limited key range in the outer relation.
  This'd still make sense if we could expect that stopping short would result in a proportional decrease in the number of tuples that have to be rescanned. It does not, however. The argument that establishes the validity of our estimate for that number is independent of whether we scan all of the inner relation or stop short, and experimentation also shows that stopping short doesn't reduce the number of rescanned tuples. So the correct calculation is 1 + rescanned-tuples / inner_rows, and we should be sure to multiply that by inner_rows or a corresponding cost value.
  Most of the time this doesn't make much difference, but if we have both a high rescan rate (due to lots of duplicate values) and an outer key range much smaller than the inner key range, then the error can be significant, leading to a large underestimate of the cost associated with rescanning.
  Per report from Vijaykumar Jain. This thinko appears to go all the way back to the introduction of the rescan estimation logic in commit 70fba7043, so back-patch to all supported branches.
  Discussion: https://postgr.es/m/CAE7uO5hMb_TZYJcZmLAgO6iD68AkEK6qCe7i=vZUkCpoKns+EQ@mail.gmail.com
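To make the arithmetic in the mergejoin fix concrete, here is a minimal C sketch of the two ratio formulas. It is not the planner's code: the function names are hypothetical and the sample figures in the note below are invented purely for illustration.

```c
#include <assert.h>

/*
 * Old (incorrect) formula: the rescan ratio is taken relative to ALL
 * inner tuples, even though the result is later multiplied by inner_rows.
 * Hypothetical name, for illustration only.
 */
double
rescan_ratio_old(double rescanned_tuples, double total_inner_tuples)
{
    assert(total_inner_tuples > 0);
    return 1.0 + rescanned_tuples / total_inner_tuples;
}

/*
 * Corrected formula: the ratio is taken relative to inner_rows, the
 * number of inner tuples actually expected to be scanned, so that
 * multiplying by inner_rows gives inner_rows + rescanned_tuples.
 */
double
rescan_ratio_fixed(double rescanned_tuples, double inner_rows)
{
    assert(inner_rows > 0);
    return 1.0 + rescanned_tuples / inner_rows;
}

/* Estimated number of inner tuple fetches, given a rescan ratio. */
double
estimated_inner_fetches(double inner_rows, double ratio)
{
    return inner_rows * ratio;
}
```

With, say, 1,000,000 inner tuples, an inner_rows estimate of 10,000 and 50,000 rescanned tuples, the old formula yields roughly 10,500 fetches while the corrected one yields 60,000, which is the kind of underestimate the commit message describes.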
* Update project link of pgBadger in documentation (Michael Paquier, 2018-12-18)
  The project has moved to a new place.
  Reported-by: Peter Neave
  Discussion: https://postgr.es/m/154474118231.5066.16352227860913505754@wrigleys.postgresql.org
* Include ALTER INDEX SET STATISTICS in pg_dump (Michael Paquier, 2018-12-18)
  Commit 5b6d13e introduced a new grammar pattern for ALTER INDEX SET STATISTICS that can use column numbers on top of the existing column names, but forgot to add support for the feature in pg_dump. As a result, statistics defined on index columns were missing from dumps, potentially causing silent planning problems after a subsequent restore.
  pg_dump ought not to use column names in what it generates, as these are automatically generated by the server and could conflict with real relation attributes with matching patterns. "expr" and "exprN", with N incremented automatically after the creation of the first one, are used as default attribute names for index expressions, and that could easily match what is defined in other relations, causing the dumps to fail if some of those attributes are renamed at some point. So to avoid any problems, the new grammar with column numbers gets used.
  Reported-by: Ronan Dunklau
  Author: Michael Paquier
  Reviewed-by: Tom Lane, Adrien Nayrat, Amul Sul
  Discussion: https://postgr.es/m/CAARsnT3UQ4V=yDNW468w8RqHfYiY9mpn2r_c5UkBJ97NAApUEw@mail.gmail.com
  Backpatch-through: 11, where the new syntax has been introduced.
* Clarify runtime pruning in EXPLAIN (Alvaro Herrera, 2018-12-17)
  Author: Amit Langote
  Reviewed-by: David Rowley
  Discussion: https://postgr.es/m/002dec69-9afb-b621-5630-235eceafe0bd@lab.ntt.co.jp
* Remove extra semicolons. (Amit Kapila, 2018-12-17)
  Reported-by: David Rowley
  Author: David Rowley
  Reviewed-by: Amit Kapila
  Backpatch-through: 10
  Discussion: https://postgr.es/m/CAKJS1f8EneeYyzzvdjahVZ6gbAHFkHbSFB5m_C0Y6TUJs9Dgdg@mail.gmail.com
* Fix use-after-free bug when renaming constraints (Michael Paquier, 2018-12-17)
  This is an oversight from recent commit b13fd344. While at it, tweak the previous test with a better name for the renamed primary key.
  Detected by buildfarm member prion, which forces relation cache release with -DRELCACHE_FORCE_RELEASE. Back-patch down to 9.4 as the previous commit.
* Make constraint rename issue relcache invalidation on target relation (Michael Paquier, 2018-12-17)
  When a constraint gets renamed, it may have associated with it a target relation (for example domain constraints don't have one). Not invalidating the target relation cache when issuing the renaming can result in issues with subsequent commands that refer to the old constraint name using the relation cache, causing various failures. One pattern spotted was using CREATE TABLE LIKE after a constraint renaming.
  Reported-by: Stuart <sfbarbee@gmail.com>
  Author: Amit Langote
  Reviewed-by: Michael Paquier
  Discussion: https://postgr.es/m/2047094.V130LYfLq4@station53.ousa.org
* Make error handling in parallel pg_upgrade less bogus. (Tom Lane, 2018-12-16)
  reap_child() basically ignored the possibility of either an error in waitpid() itself or a child process failure on signal. We don't really need to do more than report and crash hard, but proceeding as though nothing is wrong is definitely Not Acceptable. The error report for nonzero child exit status was pretty off-point, as well.
  Noted while fooling around with child-process failure detection logic elsewhere. It's been like this a long time, so back-patch to all supported branches.
* Improve detection of child-process SIGPIPE failures. (Tom Lane, 2018-12-16)
  Commit ffa4cbd62 added logic to detect SIGPIPE failure of a COPY child process, but it only worked correctly if the SIGPIPE occurred in the immediate child process. Depending on the shell in use and the complexity of the shell command string, we might instead get back an exit code of 128 + SIGPIPE, representing a shell error exit reporting SIGPIPE in the child process.
  We could just hack up ClosePipeToProgram() to add the extra case, but it seems like this is a fairly general issue deserving a more general and better-documented solution. I chose to add a couple of functions in src/common/wait_error.c, which is a natural place to know about wait-result encodings, that will test for either a specific child-process signal type or any child-process signal failure. Then, adjust other places that were doing ad-hoc tests of this type to use the common functions.
  In RestoreArchivedFile, this fixes a race condition affecting whether the process will report an error or just silently proc_exit(1): before, that depended on whether the intermediate shell got SIGTERM'd itself or reported a child process failing on SIGTERM.
  Like the previous patch, back-patch to v10; we could go further but there seems no real need to.
  Per report from Erik Rijkers.
  Discussion: https://postgr.es/m/f3683f87ab1701bea5d86a7742b22432@xs4all.nl
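The two cases described in the SIGPIPE commit (death by signal in the immediate child versus a shell exiting with 128 + signal number) can be sketched in C as follows. This is only an illustration under assumptions: the helper names are hypothetical and are not claimed to match the functions actually added to src/common/wait_error.c, and the "exit code above 128" test for any signal is a heuristic, since a program may legitimately exit with such a code.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: does this wait status indicate that the child
 * failed because of signal "signum"?  Covers both the direct case
 * (the child itself was killed by the signal) and the indirect case
 * (an intermediate shell exited with 128 + signum).
 */
bool
status_is_signal(int wait_status, int signum)
{
    assert(signum > 0);
    if (WIFSIGNALED(wait_status) && WTERMSIG(wait_status) == signum)
        return true;
    if (WIFEXITED(wait_status) && WEXITSTATUS(wait_status) == 128 + signum)
        return true;
    return false;
}

/*
 * Hypothetical helper: does this wait status indicate failure on any
 * signal?  Treating exit codes above 128 as "shell reported a signal"
 * is a heuristic assumption.
 */
bool
status_is_any_signal(int wait_status)
{
    if (WIFSIGNALED(wait_status))
        return true;
    if (WIFEXITED(wait_status) && WEXITSTATUS(wait_status) > 128)
        return true;
    return false;
}

/* Run a shell command and report whether it failed on SIGPIPE. */
bool
command_failed_on_sigpipe(const char *command)
{
    int status = system(command);

    return status != -1 && status_is_signal(status, SIGPIPE);
}
```

Callers that previously checked only WIFSIGNALED() would miss the shell-mediated case; routing every such check through one helper keeps the wait-status encoding knowledge in a single place, which is the design point the commit message makes.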
* Fix bogus logic for skipping unnecessary partcollation dependencies. (Tom Lane, 2018-12-13)
  The idea here is to not call recordDependencyOn for the default collation, since we know that's pinned. But what the code actually did was to record the partition key's dependency on the opclass twice, instead.
  Evidently introduced by sloppy coding in commit 2186b608b. Back-patch to v10 where that came in.
* Prevent GIN deleted pages from being reclaimed too early (Alexander Korotkov, 2018-12-13)
  When GIN vacuum deletes a posting tree page, it assumes that no concurrent searchers can access it, thanks to ginStepRight() locking two pages at once. However, since 9.4 searches can skip parts of posting trees when descending from the root. That leads to the risk that a page is deleted and reclaimed before a concurrent search can access it.
  This commit removes that risk by waiting for every transaction that might still reference the page to finish. Due to binary compatibility we can't change GinPageOpaqueData to store the corresponding transaction id. Instead we reuse the page header's pd_prune_xid field, which is unused in index pages.
  Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com
  Author: Andrey Borodin, Alexander Korotkov
  Reviewed-by: Alexander Korotkov
  Backpatch-through: 9.4
* Prevent deadlock in ginRedoDeletePage() (Alexander Korotkov, 2018-12-13)
  On a standby, ginRedoDeletePage() can work concurrently with read-only queries. Those queries can traverse the posting tree in two ways:
  1) Using rightlinks by ginStepRight(), which locks the next page before unlocking its left sibling.
  2) Using downlinks by ginFindLeafPage(), which locks at most one page at a time.
  The original lock order was: page, parent, left sibling. That lock order can deadlock with ginStepRight(). In order to prevent deadlock, this commit changes the lock order to: left sibling, page, parent. Note that the position of the parent in the locking order seems insignificant, because we only lock one page at a time while traversing downlinks.
  Reported-by: Chen Huajun
  Diagnosed-by: Chen Huajun, Peter Geoghegan, Andrey Borodin
  Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com
  Author: Alexander Korotkov
  Backpatch-through: 9.4
* Fix deadlock in GIN vacuum introduced by 218f51584d5 (Alexander Korotkov, 2018-12-13)
  Before 218f51584d5, if a posting tree page was about to be deleted, the whole posting tree was locked by LockBufferForCleanup() on the root, preventing all concurrent inserts. 218f51584d5 reduced locking to the subtree containing the page to be deleted. However, due to concurrent parent splits, an inserter doesn't always hold pins on all the pages constituting the path from the root to the target leaf page. That could cause a deadlock between the GIN vacuum process and a GIN inserter, and we didn't find a non-invasive way to fix this.
  This commit reverts VACUUM behavior to lock the whole posting tree before deleting any page. However, we keep another useful change by 218f51584d5: the tree is locked only if there are pages to be deleted.
  Reported-by: Chen Huajun
  Diagnosed-by: Chen Huajun, Andrey Borodin, Peter Geoghegan
  Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com
  Author: Alexander Korotkov, based on ideas from Andrey Borodin and Peter Geoghegan
  Reviewed-by: Andrey Borodin
  Backpatch-through: 10
* Repair bogus EPQ plans generated for postgres_fdw foreign joins. (Tom Lane, 2018-12-12)
  postgres_fdw's postgresGetForeignPlan() assumes without checking that the outer_plan it's given for a join relation must have a NestLoop, MergeJoin, or HashJoin node at the top. That's been wrong at least since commit 4bbf6edfb (which could cause insertion of a Sort node on top) and it seems like a pretty unsafe thing to Just Assume even without that.
  Through blind good fortune, this doesn't seem to have any worse consequences today than strange EXPLAIN output, but it's clearly trouble waiting to happen.
  To fix, test the node type explicitly before touching Join-specific fields, and avoid jamming the new tlist into a node type that can't do projection. Export a new support function from createplan.c to avoid building low-level knowledge about the latter into FDWs.
  Back-patch to 9.6 where the faulty coding was added. Note that the associated regression test cases don't show any changes before v11, apparently because the tests back-patched with 4bbf6edfb don't actually exercise the problem case before then (there's no top-level Sort in those plans).
  Discussion: https://postgr.es/m/8946.1544644803@sss.pgh.pa.us
* Repair bogus handling of multi-assignment Params in upper plan levels. (Tom Lane, 2018-12-12)
  Our support for multiple-set-clauses in UPDATE assumes that the Params referencing a MULTIEXPR_SUBLINK SubPlan will appear before that SubPlan in the targetlist of the plan node that calculates the updated row. (Yeah, it's a hack...) In some PG branches it's possible that a Result node gets inserted between the primary calculation of the update tlist and the ModifyTable node. setrefs.c did the wrong thing in this case and left the upper-level Params as Params, causing a crash at runtime. What it should do is replace them with "outer" Vars referencing the child plan node's output.
  That's a result of careless ordering of operations in fix_upper_expr_mutator, so we can fix it just by reordering the code. Fix fix_join_expr_mutator similarly for consistency, even though join nodes could never appear in such a context. (In general, it seems likely to be a bit cheaper to use Vars than Params in such situations anyway, so this patch might offer a tiny performance improvement.)
  The hazard extends back to 9.5 where the MULTIEXPR_SUBLINK stuff was introduced, so back-patch that far. However, this may be a live bug only in 9.6.x and 10.x, as the other branches don't seem to want to calculate the final tlist below the Result node. (That plan shape change between branches might be a mini-bug in itself, but I'm not really interested in digging into the reasons for that right now. Still, add a regression test memorializing what we expect there, so we'll notice if it changes again.)
  Per bug report from Eduards Bezverhijs.
  Discussion: https://postgr.es/m/b6cd572a-3e44-8785-75e9-c512a5a17a73@tieto.com
* Fix test_rls_hooks to assign expression collations properly. (Tom Lane, 2018-12-11)
  This module overlooked this necessary fixup step on the results of transformWhereClause(). It accidentally worked anyway, because the constructed expression involved type "name" which is not collatable, but it fell over while I was experimenting with changing "name" to be collatable.
  Back-patch, not because there's any live bug here in back branches, but because somebody might use this code as a model for some real application and then not understand why it doesn't work.
* Doc: improve documentation about ALTER LARGE OBJECT requirements. (Tom Lane, 2018-12-11)
  Unlike other ALTER ref pages, this one neglected to mention that ALTER OWNER requires being a member of the new owning role.
  Per bug #15546 from Stefan Kadow.
  Discussion: https://postgr.es/m/15546-0558c75fd2025e7c@postgresql.org
* Raise some timeouts to 180s, in test code. (Noah Misch, 2018-12-10)
  Slow runs of buildfarm members chipmunk, hornet and mandrill saw the shorter timeouts expire. The 180s timeout in poll_query_until has been trouble-free since 2a0f89cd717ce6d49cdc47850577823682167e87 introduced it two years ago, so use 180s more widely. Back-patch to 9.6, where the first of these timeouts was introduced.
  Reviewed by Michael Paquier.
  Discussion: https://postgr.es/m/20181209001601.GC2973271@rfd.leadboat.com
* Add stack depth checks to key recursive functions in backend/nodes/*.c. (Tom Lane, 2018-12-10)
  Although copyfuncs.c has a check_stack_depth call in its recursion, equalfuncs.c, outfuncs.c, and readfuncs.c lacked one. This seems unwise. Likewise fix planstate_tree_walker(), in branches where that exists.
  Discussion: https://postgr.es/m/30253.1544286631@sss.pgh.pa.us
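The idea of guarding a recursive tree walker can be sketched in C as below. Note the substitution: this sketch uses an explicit depth counter with an arbitrary limit, which is a simpler stand-in and not how check_stack_depth itself is implemented; the Node type, MAX_DEPTH and count_nodes are all hypothetical names made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary illustrative limit on nesting depth (an assumption). */
#define MAX_DEPTH 1000

/* Hypothetical binary tree node standing in for a parse/plan node. */
typedef struct Node
{
    struct Node *left;
    struct Node *right;
} Node;

/*
 * Recursively count the nodes of a tree.  Returns false, instead of
 * recursing further and risking a stack overrun, if the tree is nested
 * more deeply than MAX_DEPTH.
 */
bool
count_nodes(const Node *node, int depth, size_t *count)
{
    assert(count != NULL);
    if (node == NULL)
        return true;
    if (depth > MAX_DEPTH)
        return false;           /* the depth check: bail out cleanly */
    (*count)++;
    return count_nodes(node->left, depth + 1, count) &&
           count_nodes(node->right, depth + 1, count);
}
```

The point mirrored from the commit is that every recursive entry point needs its own check; a guard in one walker does nothing for a sibling walker that recurses over the same deeply nested structure.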
* Make TupleDescInitBuiltinEntry throw error for unsupported types. (Tom Lane, 2018-12-10)
  Previously, it would just pass back a partially-uninitialized tupdesc, which doesn't seem like a safe or useful behavior.
  Backpatch to v10 where this code came in.
  Discussion: https://postgr.es/m/30830.1544384975@sss.pgh.pa.us
* Fix misapplication of pgstat_count_truncate to wrong relation. (Tom Lane, 2018-12-07)
  The stanza of ExecuteTruncate[Guts] that truncates a target table's toast relation re-used the loop local variable "rel" to reference the toast rel. This was safe enough when written, but commit d42358efb added code below that that supposed "rel" still pointed to the parent table. Therefore, the stats counter update was applied to the wrong relcache entry (the toast rel not the user rel); and if we were unlucky and that relcache entry had been flushed during reindex_relation, very bad things could ensue.
  (I'm surprised that CLOBBER_CACHE_ALWAYS testing hasn't found this. I'm even more surprised that the problem wasn't detected during the development of d42358efb; it must not have been tested in any case with a toast table, as the incorrect stats counts are very obvious.)
  To fix, replace use of "rel" in that code branch with a more local variable. Adjust test cases added by d42358efb so that some of them use tables with toast tables.
  Per bug #15540 from Pan Bian. Back-patch to 9.5 where d42358efb came in.
  Discussion: https://postgr.es/m/15540-01078812338195c0@postgresql.org
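The variable-reuse hazard described in the pgstat_count_truncate fix can be reduced to a few lines of C. This is a simplified model under assumptions, not the ExecuteTruncate code: the Rel struct, its fields and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a relation with an optional toast relation. */
typedef struct Rel
{
    const char *name;
    struct Rel *toast;          /* NULL if there is no toast relation */
    int         stats_truncates;/* per-relation truncate counter */
} Rel;

/* Buggy shape: "rel" is re-pointed at the toast relation mid-loop. */
void
truncate_all_buggy(Rel **rels, int nrels)
{
    for (int i = 0; i < nrels; i++)
    {
        Rel *rel = rels[i];

        assert(rel != NULL);
        if (rel->toast != NULL)
            rel = rel->toast;   /* reuse of the loop variable */

        /* Code added later assumes "rel" is still the parent table. */
        rel->stats_truncates++; /* counted against the toast relation */
    }
}

/* Fixed shape: the toast relation gets its own, more local variable. */
void
truncate_all_fixed(Rel **rels, int nrels)
{
    for (int i = 0; i < nrels; i++)
    {
        Rel *rel = rels[i];

        assert(rel != NULL);
        if (rel->toast != NULL)
        {
            Rel *toastrel = rel->toast;

            (void) toastrel;    /* toast-specific work would go here */
        }
        rel->stats_truncates++; /* counted against the parent table */
    }
}
```

In the buggy version the counter of a table with a toast relation never moves, which matches the commit's remark that the incorrect stats counts are obvious once a test uses such a table.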
* Clean up sloppy coding in publicationcmds.c's OpenTableList(). (Tom Lane, 2018-12-07)
  Remove dead code (which would be incorrect if it weren't dead), per report from Pan Bian. Add a CHECK_FOR_INTERRUPTS in the inner loop over child relations, because there's little point in having one in the outer loop if there's not one here too. Minor stylistic adjustments and comment improvements.
  Seems to be aboriginal to this code (cf commit 665d1fad9). Back-patch to v10 where that came in, not because any of this is significant, but just to keep the branches looking similar.
  Discussion: https://postgr.es/m/15539-06d00ef6b1e2e1bb@postgresql.org
* Doc: make cross-reference to format() function more specific. (Tom Lane, 2018-12-07)
  Jeff Janes
  Discussion: https://postgr.es/m/CAMkU=1w7Tn2M9BhK+rt8Shtz1AkU+ty7By8gj5C==z65=U4vyQ@mail.gmail.com
* Improve our response to invalid format strings, and detect more cases. (Tom Lane, 2018-12-06)
  Places that are testing for *printf failure ought to include the format string in their error reports, since bad-format-string is one of the more likely causes of such failure. This both makes it easier to find and repair the mistake, and provides at least some useful info to the user who stumbles across such a problem.
  Also, tighten snprintf.c to report EINVAL for an invalid flag or final character in a format %-spec (including the case where the %-spec is missing a final character altogether). This seems like better project policy, and it also allows removing an instruction or two from the hot code path.
  Back-patch the error reporting change in pvsnprintf, since it should be harmless and may be helpful; but not the snprintf.c change.
  Per discussion of bug #15511 from Ertuğrul Kahveci, which reported an invalid translated format string. These changes don't fix that error, but they should improve matters next time we make such a mistake.
  Discussion: https://postgr.es/m/15511-1d8b6a0bc874112f@postgresql.org
* Improve planner stats documentation (Stephen Frost, 2018-12-06)
  It was pointed out that in the planner stats documentation under Extended Statistics, one of the sentences was a bit awkward. Improve that by rewording it slightly.
  Discussion: https://postgr.es/m/154409976780.14137.2785644488950047100@wrigleys.postgresql.org
* Don't mark partitioned indexes invalid unnecessarily (Alvaro Herrera, 2018-12-05)
  When an index is created on a partitioned table using ONLY (don't recurse to partitions), it gets marked invalid until index partitions are attached for each table partition. But there's no reason to do this if there are no partitions ... and moreover, there's no way to get the index to become valid afterwards, because all partitions that get created/attached get their own index partition already attached to the parent index, so there's no chance to do ALTER INDEX ... ATTACH PARTITION that would make the parent index valid.
  Fix by not marking the index as invalid to begin with.
  This is very similar to 9139aa19423b, but the pg_dump aspect does not appear to be relevant until we add FKs that can point to PKs on partitioned tables. (I tried to cause the pg_upgrade test to break by leaving some of these bogus tables around, but wasn't able to.)
  Making this change means that an index that was supposed to be invalid in the insert_conflict regression test is no longer invalid; reorder the DDL so that the test continues to verify the behavior we want it to.
  Author: Álvaro Herrera
  Reviewed-by: Amit Langote
  Discussion: https://postgr.es/m/20181203225019.2vvdef2ybnkxt364@alvherre.pgsql
* Fix invalid value of synchronous_commit in description of flush_lag (Michael Paquier, 2018-12-05)
  "remote_flush" has never been a valid user-facing value, but "on" is.
  Author: Maksim Milyutin
  Discussion: https://postgr.es/m/27b3b80c-3615-2d76-02c5-44566b53136c@gmail.com
* Fix various checksum check problems for pg_verify_checksums and base backups (Michael Paquier, 2018-11-30)
  Three issues are fixed in this patch:
  - Base backups forgot to ignore files specific to EXEC_BACKEND, leading to spurious warnings when checksums are enabled, per analysis from me.
  - pg_verify_checksums forgot about files specific to EXEC_BACKEND, leading to failures of the tool on any such build, particularly Windows. This error was originally found by newly-introduced TAP tests in various buildfarm members using EXEC_BACKEND.
  - pg_verify_checksums forgot to account for temporary files and temporary paths, which could be valid relation files, without checksums, per report from Andres Freund. More tests are added to cover this case.
  A new test case which emulates corruption for a file in a different tablespace is added, coming from Michael Banck, while I have coded the main code and refactored the test code.
  Author: Michael Banck, Michael Paquier
  Reviewed-by: Stephen Frost, David Steele
  Discussion: https://postgr.es/m/20181021134206.GA14282@paquier.xyz
* Switch pg_verify_checksums back to a blacklist (Michael Paquier, 2018-11-30)
  This basically reverts commit d55241af705667d4503638e3f77d3689fd6be31, leaving in place a portion of the regression tests still adapted with empty relation files and corrupted cases.
  The reverted approach also proved unable to properly check relation files located in a non-default tablespace path.
  Per discussion with various folks, including Stephen Frost, David Steele, Andres Freund, Michael Banck and myself.
  Reported-by: Michael Banck
  Discussion: https://postgr.es/m/20181021134206.GA14282@paquier.xyz
  Backpatch-through: 11
* Document handling of invalid/ambiguous timestamp input near DST boundaries. (Tom Lane, 2018-11-29)
  The source code comments documented this, but the user-facing docs, not so much. Add a section to Appendix B that discusses it.
  In passing, improve a couple other things in Appendix B --- notably, a long-obsolete claim that time zone abbreviations are looked up in a fixed table.
  Per bug #15527 from Michael Davidson.
  Discussion: https://postgr.es/m/15527-f1be0b4dc99ebbe7@postgresql.org
* Ensure static libraries have correct mod time even if ranlib messes it up. (Tom Lane, 2018-11-29)
  In at least Apple's version of ranlib, the output file is updated to have a mod time equal to the max of the timestamps of its components, and that data only has seconds precision. On a filesystem with sub-second file timestamp precision --- say, APFS --- this can result in the finished static library appearing older than its input files, which causes useless rebuilds and possible outright failures in parallel makes.
  We've only seen this reported in the field from people using Apple's ranlib with a non-Apple make, because Apple's make doesn't know about sub-second timestamps either so it doesn't decide rebuilds are needed. But Apple's ranlib presumably shares code with at least some BSDen, so it's not that unlikely that the same problem could arise elsewhere.
  To fix, just "touch" the output file after ranlib finishes. We seem to need this in only one place. There are other calls of ranlib in our makefiles, but they are working on intermediate files whose timestamps are not actually important, or else on an installed static library for which sub-second timestamp precision is unlikely to matter either. (Also, so far as I can tell, Apple's ranlib doesn't mess up the file timestamp in the latter usage anyhow.)
  In passing, change "ranlib" to "$(RANLIB)" in one place that was bypassing the make macro for no good reason.
  Per bug #15525 from Jack Kelly (via Alyssa Ross). Back-patch to all supported branches.
  Discussion: https://postgr.es/m/15525-a30da084f17a1faa@postgresql.org
* Fix minor typo in dsa.c. (Thomas Munro, 2018-11-29)
  Author: Takeshi Ideriha
  Discussion: https://postgr.es/m/4E72940DA2BF16479384A86D54D0988A6F3BF22D%40G01JPEXMBKW04
* Fix handling of synchronous replication for stopping WAL senders (Michael Paquier, 2018-11-29)
  This fixes an oversight from c6c3334, which forgot that if a subset of WAL senders are stopping and in a sync state, other WAL senders could still be waiting for a WAL position to be synced while committing a transaction. However, the subset of stopping senders would not release waiters, potentially breaking synchronous replication guarantees. This commit makes sure that even stopping WAL senders are able to release waiters and are tracked properly.
  On 9.4, this can also trigger an assertion failure when setting, for example, max_wal_senders to 1, where a WAL sender is not able to find itself in synchronous state when the instance stops.
  Reported-by: Paul Guo
  Author: Paul Guo, Michael Paquier
  Discussion: https://postgr.es/m/CAEET0ZEv8VFqT3C-cQm6byOB4r4VYWcef1J21dOX-gcVhCSpmA@mail.gmail.com
  Backpatch-through: 9.4
* Have BufFileSize() ereport() on FileSize() failure. (Peter Geoghegan, 2018-11-28)
  Move the responsibility for checking for and reporting a failure from the only current BufFileSize() caller, logtape.c, to BufFileSize() itself. Code within buffile.c is generally responsible for interfacing with fd.c to report irrecoverable failures. This seems like a convention that's worth sticking to.
  Reorganizing things this way makes it easy to make the error message raised in the event of BufFileSize() failure descriptive of the underlying problem. We're now clear on the distinction between temporary file name and BufFile name, and can show errno, confident that its value actually relates to the error being reported. In passing, an existing, similar buffile.c ereport() + errcode_for_file_access() site is changed to follow the same conventions.
  The API of the function BufFileSize() is changed by this commit, despite already being in a stable release (Postgres 11). This seems acceptable, since the BufFileSize() ABI was changed by commit aa551830421, which hasn't made it into a point release yet. Besides, it's difficult to imagine a third party BufFileSize() caller not just raising an error anyway, since BufFile state should be considered corrupt when BufFileSize() fails.
  Per complaint from Tom Lane.
  Discussion: https://postgr.es/m/26974.1540826748@sss.pgh.pa.us
  Backpatch: 11-, where shared BufFiles were introduced.
* C comment: remove extra '*'  Bruce Momjian  2018-11-28

  Reported-by: Etsuro Fujita
  Discussion: https://postgr.es/m/5BFE34DE.1080404@lab.ntt.co.jp
  Author: Etsuro Fujita
  Backpatch-through: 10
* Don't set PAM_RHOST for Unix sockets.  Thomas Munro  2018-11-28

  Since commit 2f1d2b7a we have set PAM_RHOST to "[local]" for Unix
  sockets. This caused Linux PAM's libaudit integration to make DNS
  requests for that name. It's not exactly clear what value PAM_RHOST
  should have in that case, but it seems clear that we shouldn't set it
  to an unresolvable name, so don't do that.

  Back-patch to 9.6. Bug #15520.

  Author: Thomas Munro
  Reviewed-by: Peter Eisentraut
  Reported-by: Albert Schabhuetl
  Discussion: https://postgr.es/m/15520-4c266f986998e1c5%40postgresql.org
* Do not decode TOAST data for table rewrites  Tomas Vondra  2018-11-28

  During table rewrites (VACUUM FULL and CLUSTER), the main heap is
  logged using XLOG / FPI records, and thus (correctly) ignored in
  decoding. But the associated TOAST table is WAL-logged as plain INSERT
  records, and so was logically decoded and passed to reorder buffer.

  That has severe consequences with TOAST tables of non-trivial size.
  Firstly, reorder buffer has to keep all those changes, possibly
  spilling them to a file, incurring I/O costs and disk space. Secondly,
  ReorderBufferCommit() was stashing all those TOAST chunks into a hash
  table, which got discarded only after processing the row from the main
  heap. But as the main heap is not decoded for rewrites, this never
  happened, so all the TOAST data accumulated in memory, resulting
  either in excessive memory consumption or OOM.

  The fix is simple, as commit e9edc1ba already introduced
  infrastructure (namely HEAP_INSERT_NO_LOGICAL flag) to skip logical
  decoding of TOAST tables, but it only applied it to system tables. So
  simply use it for all TOAST data in raw_heap_insert().

  That would however solve only the memory consumption issue - the TOAST
  changes would still be decoded and added to the reorder buffer, and
  spilled to disk (although without TOAST tuple data, so much smaller).
  But we can solve that by tweaking DecodeInsert() to just ignore such
  INSERT records altogether, using XLH_INSERT_CONTAINS_NEW_TUPLE flag,
  instead of skipping them later in ReorderBufferCommit().

  Review: Masahiko Sawada
  Discussion: https://www.postgresql.org/message-id/flat/1a17c643-e9af-3dba-486b-fbe31bc1823a%402ndquadrant.com
  Backpatch: 9.4-, where logical decoding was introduced
* Fix jit compilation bug on wide tables.  Andres Freund  2018-11-27

  The function generated to perform JIT compiled tuple deforming failed
  when HeapTupleHeader's t_hoff was bigger than a signed int8. I'd
  failed to realize that LLVM's getelementptr would treat an int8 index
  argument as signed, rather than unsigned. That means that a hoff
  larger than 127 would result in a negative offset being applied. Fix
  that by widening the index to 32bit.

  Add a testcase with a wide table. Don't drop it, as it seems useful to
  verify other tools deal properly with wide tables.

  Thanks to Justin Pryzby for both reporting a bug and then reducing it
  to a reproducible testcase!

  Reported-By: Justin Pryzby
  Author: Andres Freund
  Discussion: https://postgr.es/m/20181115223959.GB10913@telsasoft.com
  Backpatch: 11, just as jit compilation was
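  The signedness problem described in the commit above can be
  illustrated in plain C. This is only a sketch, not the PostgreSQL or
  LLVM code: the two helper functions and their names are hypothetical,
  and they merely model what happens when an 8-bit offset is sign-extended
  versus zero-extended before being used as an index.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical model of the bug: the 8-bit header offset is interpreted
 * as a signed two's-complement value, so anything above 127 turns into
 * a negative offset.
 */
int32_t offset_as_signed_int8(uint8_t t_hoff)
{
    return (t_hoff >= 128) ? (int32_t) t_hoff - 256 : (int32_t) t_hoff;
}

/*
 * Hypothetical model of the fix: zero-extend the 8-bit value to 32 bits
 * first, so the offset stays non-negative for every possible input.
 */
int32_t offset_zero_extended(uint8_t t_hoff)
{
    return (int32_t) (uint32_t) t_hoff;
}

/* Print both interpretations side by side for one header offset. */
void print_offsets(uint8_t t_hoff)
{
    printf("t_hoff=%u signed=%d widened=%d\n",
           (unsigned) t_hoff,
           (int) offset_as_signed_int8(t_hoff),
           (int) offset_zero_extended(t_hoff));
}
```

  Calling print_offsets(200) from a main() shows a signed interpretation
  of -56 against a widened value of 200, while values up to 127 are the
  same under both interpretations, which is why only wide tables hit the
  problem.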
* Fix ac218aa4f6 to work on versions before 9.5.  Andres Freund  2018-11-26

  Unfortunately ac218aa4f6 missed the fact that a reference to
  'pg_catalog.regnamespace'::regtype wouldn't work before that type is
  known. Fix that, by replacing the regtype usage with a join to
  pg_type.

  Reported-By: Tom Lane
  Author: Andres Freund
  Discussion: https://postgr.es/m/8863.1543297423@sss.pgh.pa.us
  Backpatch: 9.5-, like ac218aa4f6
* Update pg_upgrade test for reg* to include regrole and regnamespace.  Andres Freund  2018-11-26

  When the regrole (0c90f6769) and regnamespace (cb9fa802b) types were
  added in 9.5, pg_upgrade's check for reg* types wasn't updated. While
  regrole currently is safe, regnamespace is not.

  It seems unlikely that anybody uses regnamespace inside catalog tables
  across a pg_upgrade, but the tests should be correct nevertheless.

  While at it, reorder the types checked in the query to be
  alphabetical. Otherwise it's annoying to compare existing and tested
  for types.

  Author: Andres Freund
  Discussion: https://postgr.es/m/037e152a-cb25-3bcb-4f35-bdc9988f8204@2ndQuadrant.com
  Backpatch: 9.5-, as regrole/regnamespace