postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
...
*	Fix handling of partitioned index in RelationGetNumberOfBlocksInFork()	Peter Eisentraut	2021-08-26
\| \| \| \| \| \| \| \| \| \| \| \| \|	Since a partitioned index doesn't have storage, getting the number of blocks from it will not give sensible results. Existing callers already check that they don't call it that way, so there doesn't appear to be a live problem. But for correctness, handle RELKIND_PARTITIONED_INDEX together with the other non-storage relkinds. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://www.postgresql.org/message-id/1d3a5fbe-f48b-8bea-80da-9a5c4244aef9@enterprisedb.com
*	Remove redundant test.	Tom Lane	2021-08-25
\| \| \| \| \| \| \| \| \| \|	The condition "context_start < context_end" is strictly weaker than "context_end - context_start >= 50", so we don't need both. Oversight in commit ffd3944ab, noted by tanghy.fnst. In passing, line-wrap a nearby test to make it more readable. Discussion: https://postgr.es/m/OS0PR01MB61137C4054774F44E3A9DC89FBC69@OS0PR01MB6113.jpnprd01.prod.outlook.com
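The implication claimed above can be checked with a quick brute-force sketch. The function names and the range of test values are illustrative assumptions, not taken from the PostgreSQL source.

```python
# Brute-force check that (context_end - context_start >= 50) implies
# (context_start < context_end), so testing both conditions is redundant.
def strong(context_start, context_end):
    return context_end - context_start >= 50


def weak(context_start, context_end):
    return context_start < context_end


# Collect every pair where the stronger test passes but the weaker one fails.
counterexamples = [
    (s, e)
    for s in range(-100, 101)
    for e in range(-100, 101)
    if strong(s, e) and not weak(s, e)
]
print(counterexamples)
```

An empty list means the weaker test never rejects anything the stronger test accepts, which is why one of the two can be dropped.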
*	Fix broken snapshot handling in parallel workers.	Robert Haas	2021-08-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pengchengliu reported an assertion failure in a parallel worker while performing a parallel scan using an overflowed snapshot. The proximate cause is that TransactionXmin was set to an incorrect value. The underlying cause is incorrect snapshot handling in parallel.c. In particular, InitializeParallelDSM() was unconditionally calling GetTransactionSnapshot(), because I (rhaas) mistakenly thought that was always retrieving an existing snapshot whereas, at isolation levels less than REPEATABLE READ, it's actually taking a new one. So instead do this only at higher isolation levels where there actually is a single snapshot for the whole transaction. By itself, this is not a sufficient fix, because we still need to guarantee that TransactionXmin gets set properly in the workers. The easiest way to do that seems to be to install the leader's active snapshot as the transaction snapshot if the leader did not serialize a transaction snapshot. This doesn't affect the results of future GetTransactionSnapshot() calls since those have to take a new snapshot anyway; what we care about is the side effect of setting TransactionXmin. Report by Pengchengliu. Patch by Greg Nancarrow, except for some comment text which I supplied. Discussion: https://postgr.es/m/002f01d748ac$eaa781a0$bff684e0$@tju.edu.cn
*	Fix toast rewrites in logical decoding.	Amit Kapila	2021-08-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 325f2ec555 introduced pg_class.relwrite to skip operations on tables created as part of a heap rewrite during DDL. It links such transient heaps to the original relation OID via this new field in pg_class but forgot to do anything about toast tables. So, logical decoding was not able to skip operations on internally created toast tables. This leads to an error when we tried to decode the WAL for the next operation for which it appeared that there is a toast data where actually it didn't have any toast data. To fix this, we set pg_class.relwrite for internally created toast tables as well which allowed skipping operations on them during logical decoding. Author: Bertrand Drouvot Reviewed-by: David Zhang, Amit Kapila Backpatch-through: 11, where it was introduced Discussion: https://postgr.es/m/b5146fb1-ad9e-7d6e-f980-98ed68744a7c@amazon.com
*	Avoid using ambiguous word "positive" in error message.	Fujii Masao	2021-08-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are two identical error messages about the valid value of the modulus for hash partitions in the PostgreSQL source code. Commit 0e1275fb07 improved only one of them so that the ambiguous word "positive" was avoided there, and forgot to improve the other. This commit improves the other, which would reduce translator burden. Back-patch to v11 where the error message exists. Author: Kyotaro Horiguchi Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/20210819.170315.1413060634876301811.horikyota.ntt@gmail.com
*	Improve error message about valid value for distance in phrase operator.	Fujii Masao	2021-08-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The distance in phrase operator must be an integer value between zero and MAXENTRYPOS inclusive. But previously the error message about its valid value included the information about its upper limit but not lower limit (i.e., zero). This commit improves the error message so that it also includes the information about its lower limit. Back-patch to v9.6 where full-text phrase search was supported. Author: Kyotaro Horiguchi Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/20210819.170315.1413060634876301811.horikyota.ntt@gmail.com
*	Fix regexp misbehavior with capturing parens inside "{0}".	Tom Lane	2021-08-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Regexps like "(.){0}...\1" drew an "invalid backreference number". That's not unreasonable on its face, since the capture group will never be matched if it's iterated zero times. However, other engines such as Perl's don't complain about this, nor do we throw an error for related cases such as "(.)\|\1", even though that backref can never succeed either. Also, if the zero-iterations case happens at runtime rather than compile time --- say, "(x)*...\1" when there's no "x" to be found --- that's not an error, we just deem the backref to not match. Making this even less defensible, no error was thrown for nested cases such as "((.)){0}...\2"; and to add insult to injury, those cases could result in assertion failures instead. (It seems that nothing especially bad happened in non-assert builds, though.) Let's just fix it so that no error is thrown and instead the backref is deemed to never match, so that compile-time detection of no iterations behaves the same as run-time detection. Per report from Mark Dilger. This appears to be an aboriginal error in Spencer's library, so back-patch to all supported versions. Pre-v14, it turns out to also be necessary to back-patch one aspect of commits cb76fbd7e/00116dee5, namely to create capture-node subREs with the begin/end states of their subexpressions, not the current lp/rp of the outer parseqatom invocation. Otherwise delsub complains that we're trying to disconnect a state from itself. This is a bit scary but code examination shows that it's safe: in the pre-v14 code, if we want to wrap iteration around the subexpression, the first thing we do is overwrite the atom's begin/end fields with new states. So the bogus values didn't survive long enough to be used for anything, except if no iteration is required, in which case it doesn't matter. 
Discussion: https://postgr.es/m/A099E4A8-4377-4C64-A98C-3DEDDC075502@enterprisedb.com
*	Fix Alter Subscription's Add/Drop Publication behavior.	Amit Kapila	2021-08-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current refresh behavior tries to just refresh added/dropped publications but that leads to removing wrong tables from subscription. We can't refresh just the dropped publication because it is quite possible that some of the tables are removed from publication by that time and now those will remain as part of the subscription. Also, there is a chance that the tables that were part of the publication being dropped are also part of another publication, so we can't remove those. So, we decided that by default, add/drop commands will also act like REFRESH PUBLICATION which means they will refresh all the publications. We can keep the old behavior for "add publication" but it is better to be consistent with "drop publication". Author: Hou Zhijie Reviewed-by: Masahiko Sawada, Amit Kapila Backpatch-through: 14, where it was introduced Discussion: https://postgr.es/m/OS0PR01MB5716935D4C2CC85A6143073F94EF9@OS0PR01MB5716.jpnprd01.prod.outlook.com
*	Prevent regexp back-refs from sometimes matching when they shouldn't.	Tom Lane	2021-08-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The recursion in cdissect() was careless about clearing match data for capturing parentheses after rejecting a partial match. This could allow a later back-reference to succeed when by rights it should fail for lack of a defined referent. To fix, think a little more rigorously about what the contract between different levels of cdissect's recursion needs to be. With the right spec, we can fix this using fewer rather than more resets of the match data; the key decision being that a failed sub-match is now explicitly responsible for clearing any matches it may have set. There are enough other cross-checks and optimizations in the code that it's not especially easy to exhibit this problem; usually, the match will fail as-expected. Plus, regexps that are even potentially vulnerable are most likely user errors, since there's just not much point in writing a back-ref that doesn't always have a referent. These facts perhaps explain why the issue hasn't been detected, even though it's almost certainly a couple of decades old. Discussion: https://postgr.es/m/151435.1629733387@sss.pgh.pa.us
*	Avoid creating archive status ".ready" files too early	Alvaro Herrera	2021-08-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	WAL records may span multiple segments, but XLogWrite() does not wait for the entire record to be written out to disk before creating archive status files. Instead, as soon as the last WAL page of the segment is written, the archive status file is created, and the archiver may process it. If PostgreSQL crashes before it is able to write and flush the rest of the record (in the next WAL segment), the wrong version of the first segment file lingers in the archive, which causes operations such as point-in-time restores to fail. To fix this, keep track of records that span across segments and ensure that segments are only marked ready-for-archival once such records have been completely written to disk. This has always been wrong, so backpatch all the way back. Author: Nathan Bossart <bossartn@amazon.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Ryo Matsumura <matsumura.ryo@fujitsu.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CBDDFA01-6E40-46BB-9F98-9340F4379505@amazon.com
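The rule described above (a segment becomes archivable only once any record that starts in it and continues into the next segment is fully on disk) can be sketched as follows. This is a toy model under stated assumptions: `SEGMENT_SIZE`, `segment_of`, and `ready_segments` are hypothetical names, LSNs are plain integers, and the real fix tracks boundaries inside the server rather than recomputing them from a record list.

```python
SEGMENT_SIZE = 16  # toy segment size in bytes (assumption; real segments are far larger)


def segment_of(lsn):
    """Segment number containing byte position lsn."""
    return lsn // SEGMENT_SIZE


def ready_segments(records, flushed_upto):
    """Return the segment numbers that are safe to hand to the archiver.

    records: list of (start_lsn, end_lsn) pairs, end exclusive.
    flushed_upto: every byte before this LSN is durably on disk.
    """
    required = {}  # segment -> LSN that must be flushed before archiving it
    for start, end in records:
        first_seg = segment_of(start)
        last_seg = segment_of(end - 1)
        for seg in range(first_seg, last_seg + 1):
            seg_end = (seg + 1) * SEGMENT_SIZE
            required.setdefault(seg, seg_end)
            if last_seg > seg:
                # The record continues past this segment: wait for its last byte.
                required[seg] = max(required[seg], end)
    return sorted(seg for seg, need in required.items() if need <= flushed_upto)


# The second record starts in segment 0 and ends in segment 1.
records = [(0, 10), (10, 20)]
print(ready_segments(records, 16))  # segment 0 is full, but the record is not flushed
print(ready_segments(records, 20))  # the spanning record is now complete
```

With `flushed_upto` at 16 the last page of segment 0 is written, yet the segment is held back because the spanning record ends at 20; that is the window in which the old behavior could archive a segment that a crash would later invalidate.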
*	Improve defaults shown in postgresql.conf.sample and pg_settings	Bruce Momjian	2021-08-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, these showed unlikely default values. The new default value 128MB (since PG 10) is not always accurate since initdb tries several increasing values, but it is likely to be accurate. Reported-by: Zhangjie <zhangjie2@fujitsu.com> Discussion: https://postgr.es/m/TYWPR01MB7678772FD8640C404F1DC882F9079@TYWPR01MB7678.jpnprd01.prod.outlook.com Author: Zhangjie Backpatch-through: master
*	Fix backup manifests to generate correct WAL-Ranges across timelines	Michael Paquier	2021-08-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In a backup manifest, WAL-Ranges stores the range of WAL that is required for the backup to be valid. pg_verifybackup would then internally use pg_waldump for the checks based on this data. When the timeline where the backup started was more than 1 with a history file looked at for the manifest data generation, the calculation of the WAL range for the first timeline to check was incorrect. The previous logic used as start LSN the start position of the first timeline, but it needs to use the start LSN of the backup. This would cause failures with pg_verifybackup, or any tools making use of the backup manifests. This commit adds a test based on a logic using a self-promoted node, making it rather cheap. Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/20210818.143031.1867083699202617521.horikyota.ntt@gmail.com Backpatch-through: 13
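A minimal sketch of the corrected range calculation, assuming integer LSNs and a simplified timeline history. The function `wal_ranges` and its parameters are hypothetical; only the manifest keys `Timeline`, `Start-LSN`, and `End-LSN` come from the backup manifest format.

```python
def wal_ranges(timeline_history, backup_start_tli, backup_start_lsn,
               backup_end_tli, backup_end_lsn):
    """Compute per-timeline WAL ranges needed by a backup.

    timeline_history: list of (tli, begin_lsn, end_lsn), oldest first;
    end_lsn is the switch point to the next timeline (None for the latest).
    """
    ranges = []
    for tli, begin, end in timeline_history:
        if tli < backup_start_tli or tli > backup_end_tli:
            continue
        # The first timeline's range starts at the backup's start LSN,
        # not at the position where that timeline itself began.
        start = backup_start_lsn if tli == backup_start_tli else begin
        stop = backup_end_lsn if tli == backup_end_tli else end
        ranges.append({"Timeline": tli, "Start-LSN": start, "End-LSN": stop})
    return ranges


history = [(1, 0, 100), (2, 100, 250), (3, 250, None)]
# Backup starts on timeline 2 at LSN 180 and ends on timeline 3 at LSN 300.
print(wal_ranges(history, 2, 180, 3, 300))
```

Using `begin` (100) instead of `backup_start_lsn` (180) for timeline 2 would reproduce the old behavior, asking a verifier to read WAL that the backup never required.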
*	Allow parallel DISTINCT	David Rowley	2021-08-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We've supported parallel aggregation since e06a38965. At the time, we didn't quite get around to also adding parallel DISTINCT. So, let's do that now. This is implemented by introducing a two-phase DISTINCT. Phase 1 is performed on parallel workers, rows are made distinct there either by hashing or by sort/unique. The results from the parallel workers are combined and the final distinct phase is performed serially to get rid of any duplicate rows that appear due to combining rows for each of the parallel workers. Author: David Rowley Reviewed-by: Zhihong Yu Discussion: https://postgr.es/m/CAApHDvrjRxVKwQN0he79xS+9wyotFXL=RmoWqGGO2N45Farpgw@mail.gmail.com
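The two-phase scheme can be illustrated with a small sketch. The function names and the sample data are assumptions for illustration; the planner's actual paths (hashing or sort/unique, combined under a gather step) are considerably more involved.

```python
def partial_distinct(rows):
    """Phase 1, run per worker: de-duplicate that worker's rows by hashing."""
    return set(rows)


def final_distinct(partial_results):
    """Phase 2, run serially: remove duplicates that span workers."""
    combined = []
    for partial in partial_results:
        combined.extend(partial)
    # sorted() is only here to make the output deterministic.
    return sorted(set(combined))


worker_inputs = [[1, 2, 2, 3], [3, 3, 4], [1, 4, 5]]
partials = [partial_distinct(rows) for rows in worker_inputs]
result = final_distinct(partials)
print(result)  # [1, 2, 3, 4, 5]
```

The final phase is still needed because values such as 1, 3, and 4 appear in more than one worker's partial result.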
*	Improve error messages about misuse of SELECT INTO.	Tom Lane	2021-08-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Improve two places in plpgsql, and one in spi.c, where an error message would confusingly tell you that you couldn't use a SELECT query, when what you had written was a SELECT query. The actual problem is that you can't use SELECT ... INTO in these contexts, but the messages failed to make that apparent. Special-case SELECT INTO to make these errors more helpful. Also, fix the same spots in plpgsql, as well as several messages in exec_eval_expr(), to not quote the entire complained-of query or expression in the primary error message. That behavior very easily led to violating our message style guideline about keeping the primary error message short and single-line. Also, since the important part of the message was after the inserted text, it could make the real problem very hard to see. We can report the query or expression as the first line of errcontext instead. Per complaint from Roger Mason. Back-patch to v14, since (a) some of these messages are new in v14 and (b) v14's translatable strings are still somewhat in flux. The problem's older than that of course, but I'm hesitant to change the behavior further back. Discussion: https://postgr.es/m/1914708.1629474624@sss.pgh.pa.us
*	Fix performance bug in regexp's citerdissect/creviterdissect.	Tom Lane	2021-08-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After detecting a sub-match "dissect" failure (i.e., a backref match failure) in the i'th sub-match of an iteration node, we should proceed by adjusting the attempted length of the i'th submatch. As coded, though, these functions changed the attempted length of the last sub-match, and only after exhausting all possibilities for that would they back up to adjust the next-to-last sub-match, and then the second-from-last, etc; all of which is wasted effort, since only changing the start or length of the i'th sub-match can possibly make it succeed. This oversight creates the possibility for exponentially bad performance. Fortunately the problem is masked in most cases by optimizations or constraints applied elsewhere; which explains why we'd not noticed it before. But it is possible to reach the problem with fairly simple, if contrived, regexps. Oversight in my commit 173e29aa5. That's pretty ancient now, so back-patch to all supported branches. Discussion: https://postgr.es/m/1808998.1629412269@sss.pgh.pa.us
*	Avoid trying to lock OLD/NEW in a rule with FOR UPDATE.	Tom Lane	2021-08-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	transformLockingClause neglected to exclude the pseudo-RTEs for OLD/NEW when processing a rule's query. This led to odd errors or even crashes later on. This bug is very ancient, but it's not terribly surprising that nobody noticed, since the use-case for SELECT FOR UPDATE in a non-view rule is somewhere between thin and non-existent. Still, crashing is not OK. Per bug #17151 from Zhiyong Wu. Thanks to Masahiko Sawada for analysis of the problem. Discussion: https://postgr.es/m/17151-c03a3e6e4ec9aadb@postgresql.org
*	Unset MyBEEntry, making elog.c's call to pgstat_get_my_query_id() safe.	Andres Freund	2021-08-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously log messages late during shutdown could end up using either another backend's PgBackendStatus (multi user) or segfault (single user) because pgstat_get_my_query_id()'s check for !MyBEEntry didn't filter out use after pgstat_beshutdown_hook(). This became a bug in 4f0b0966c86, but was a bit fishy before. But given there's no known problematic cases before 14, it doesn't seem worth backpatching further. Also fixes a wrong filename in a comment, introduced in e1025044. Reported-By: Andres Freund <andres@anarazel.de> Reviewed-By: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://postgr.es/m/Julien Rouhaud <rjuju123@gmail.com> Backpatch: 14-
*	Rename LOGICAL_REP_MSG_STREAM_END to LOGICAL_REP_MSG_STREAM_STOP.	Amit Kapila	2021-08-19
\| \| \| \| \| \| \| \| \| \|	In the code, most places used the term "Stream Stop" for the logical stream message. This commit improves consistency by renaming LogicalRepMsgType "LOGICAL_REP_MSG_STREAM_END" to "LOGICAL_REP_MSG_STREAM_STOP". Author: Masahiko Sawada Reviewed-by: Hou Zhijie, Amit Kapila Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
*	Revert refactoring of hex code to src/common/	Michael Paquier	2021-08-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a combined revert of the following commits: - c3826f8, a refactoring piece that moved the hex decoding code to src/common/. This code was cleaned up by aef8948, as it originally included no overflow checks in the same way as the base64 routines in src/common/ used by SCRAM, making it unsafe for its purpose. - aef8948, a more advanced refactoring of the hex encoding/decoding code to src/common/ that added sanity checks on the result buffer for hex decoding and encoding. As reported by Hans Buschmann, those overflow checks are expensive, and it is possible to see a performance drop in the decoding/encoding of bytea or LOs the longer they are. Simple SQLs working on large bytea values show a clear difference in perf profile. - ccf4e27, a cleanup made possible by aef8948. The reverts of all those commits bring the performance of hex decoding and encoding back to what it was in ~13. For now and post-beta3, this is the simplest option. Reported-by: Hans Buschmann Discussion: https://postgr.es/m/1629039545467.80333@nidsa.net Backpatch-through: 14
*	Fix check_agg_arguments' examination of aggregate FILTER clauses.	Tom Lane	2021-08-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recursion into the FILTER clause was mis-implemented, such that a relevant Var or Aggref at the very top of the FILTER clause would be ignored. (Of course, that'd have to be a plain boolean Var or boolean-returning aggregate.) The consequence would be mis-identification of the correct semantic level of the aggregate, which could lead to not-per-spec query behavior. If the FILTER expression is an aggregate, this could also lead to failure to issue an expected "aggregate function calls cannot be nested" error, which would likely result in a core dump later on, since the planner and executor aren't expecting such cases to appear. The root cause is that commit b560ec1b0 blindly copied some code that assumed it's recursing into a List, and thus didn't examine the top-level node. To forestall questions about why this call doesn't look like the others, as well as possible future copy-and-paste mistakes, let's change all three check_agg_arguments_walker calls in check_agg_arguments, even though only the one for the filter clause is really broken. Per bug #17152 from Zhiyong Wu. This has been wrong since we implemented FILTER, so back-patch to all supported versions. (Testing suggests that pre-v11 branches manage to avoid crashing in the bad-Aggref case, thanks to "redundant" checks in ExecInitAgg. But I'm not sure how thorough that protection is, and anyway the wrong-behavior issue remains, so fix 9.6 and 10 too.) Discussion: https://postgr.es/m/17152-c7f906cc1a88e61b@postgresql.org
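The root cause described above, a walk that starts with the children of a node and so never examines the node itself, can be shown with a toy tree. The `Node` class and function names are hypothetical stand-ins, not the parser's real data structures.

```python
class Node:
    def __init__(self, kind, children=()):
        self.kind = kind
        self.children = list(children)


def walk_children(node, visit):
    """Visit only the descendants of node; fine when node is a plain list
    wrapper, wrong when node itself may be a Var or Aggref."""
    for child in node.children:
        walker(child, visit)


def walker(node, visit):
    """Visit node itself, then recurse into its children."""
    visit(node)
    walk_children(node, visit)


def find_kinds(start, entry):
    seen = []
    entry(start, lambda n: seen.append(n.kind))
    return seen


# A FILTER clause whose top-level node is itself an aggregate.
filter_clause = Node("Aggref", [Node("Var")])
print(find_kinds(filter_clause, walk_children))  # ['Var']
print(find_kinds(filter_clause, walker))         # ['Aggref', 'Var']
```

Entering through `walk_children` misses the top-level `Aggref`, which corresponds to the missed nested-aggregate check; entering through `walker` examines every node.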
*	Prevent ALTER TYPE/DOMAIN/OPERATOR from changing extension membership.	Tom Lane	2021-08-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If recordDependencyOnCurrentExtension is invoked on a pre-existing, free-standing object during an extension update script, that object will become owned by the extension. In our current code this is possible in three cases: * Replacing a "shell" type or operator. * CREATE OR REPLACE overwriting an existing object. * ALTER TYPE SET, ALTER DOMAIN SET, and ALTER OPERATOR SET. The first of these cases is intentional behavior, as noted by the existing comments for GenerateTypeDependencies. It seems like appropriate behavior for CREATE OR REPLACE too; at least, the obvious alternatives are not better. However, the fact that it happens during ALTER is an artifact of trying to share code (GenerateTypeDependencies and makeOperatorDependencies) between the CREATE and ALTER cases. Since an extension script would be unlikely to ALTER an object that didn't already belong to the extension, this behavior is not very troubling for the direct target object ... but ALTER TYPE SET will recurse to dependent domains, and it is very uncool for those to become owned by the extension if they were not already. Let's fix this by redefining the ALTER cases to never change extension membership, full stop. We could minimize the behavioral change by only changing the behavior when ALTER TYPE SET is recursing to a domain, but that would complicate the code and it does not seem like a better definition. Per bug #17144 from Alex Kozhemyakin. Back-patch to v13 where ALTER TYPE SET was added. (The other cases are older, but since they only affect the directly-named object, there's not enough of a problem to justify changing the behavior further back.) Discussion: https://postgr.es/m/17144-e67d7a8f049de9af@postgresql.org
*	Improve regex compiler's arc moving/copying logic.	Tom Lane	2021-08-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The functions moveins(), copyins(), moveouts(), copyouts() are required to preserve the invariant that there are no duplicate arcs in the regex's NFA. Spencer's original implementation of them was O(N^2) since it checked separately for a match to each source arc. In commit 579840ca0 I improved that by adding sort/merge logic to be used if more than a few arcs are to be moved/copied. However, I now realize that that missed a bet. At many call sites, the target state is newly made and cannot have any existing in-arcs (respectively out-arcs) that could be duplicates. So spending any cycles at all on checking for duplicates is wasted effort; in these cases we can just blindly move/copy all the source arcs. Add code paths to do that. It turns out that for copyins()/copyouts(), all the call sites have this property, making all the "improved" logic in them flat out unreachable. Perhaps we'll need the full capability again someday, so I just #ifdef'd those paths out rather than removing them entirely. In passing, add a few test cases to improve code coverage in this area as well as in regc_locale.c/regc_pg_locale.c. Discussion: https://postgr.es/m/810272.1629064063@sss.pgh.pa.us
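The difference between the checked and the blind copy can be sketched as below. Arcs are modelled as plain tuples and both function names are hypothetical; the regex library's real arcs live in linked lists on NFA states.

```python
def copy_arcs_checked(source, target):
    """General case: combine arcs without creating duplicates.

    Sorting both sides allows one merge pass instead of comparing every
    source arc against every target arc. Assumes neither input contains
    duplicates of its own (the NFA invariant).
    """
    src = sorted(source)
    dst = sorted(target)
    merged = []
    i = j = 0
    while i < len(src) and j < len(dst):
        if src[i] < dst[j]:
            merged.append(src[i])
            i += 1
        elif src[i] > dst[j]:
            merged.append(dst[j])
            j += 1
        else:
            # Same arc on both sides: keep a single copy.
            merged.append(src[i])
            i += 1
            j += 1
    merged.extend(src[i:])
    merged.extend(dst[j:])
    return merged


def copy_arcs_blind(source, target):
    """Fast path: valid only when the target state has no arcs yet."""
    assert not target, "fast path requires a freshly made state"
    return list(source)


a1, a2, a3 = ("PLAIN", 1, 5), ("PLAIN", 2, 5), ("PLAIN", 3, 5)
print(copy_arcs_checked([a1, a2], [a2, a3]))
print(copy_arcs_blind([a1, a2], []))
```

When the target is newly created, the blind path produces the same result as the checked path while doing no comparisons at all.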
*	Set type identifier on BIO	Daniel Gustafsson	2021-08-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In OpenSSL there are two types of BIO's (I/O abstractions): source/sink and filters. A source/sink BIO is a source and/or sink of data, ie one acting on a socket or a file. A filter BIO takes a stream of input from another BIO and transforms it. In order for BIO_find_type() to be able to traverse the chain of BIO's and correctly find all BIO's of a certain type they shall have the type bit set accordingly, source/sink BIO's (what PostgreSQL implements) use BIO_TYPE_SOURCE_SINK and filter BIO's use BIO_TYPE_FILTER. In addition to these, file descriptor based BIO's should have the descriptor bit set, BIO_TYPE_DESCRIPTOR. The PostgreSQL implementation didn't set the type bits, which went unnoticed for a long time as it's only really relevant for code auditing the OpenSSL installation, or doing similar tasks. It is required by the API though, so this fixes it. Backpatch through 9.6 as this has been wrong for a long time. Author: Itamar Gafni Discussion: https://postgr.es/m/SN6PR06MB39665EC10C34BB20956AE4578AF39@SN6PR06MB3966.namprd06.prod.outlook.com Backpatch-through: 9.6
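Why the missing type bits matter can be sketched with plain bit arithmetic. The constant values below are believed to match OpenSSL's `bio.h` but should be treated as assumptions, the lookup is a simplified model of how a class-based search over a BIO chain behaves, and the index value `0x81` is made up for illustration.

```python
# Class bits (assumed values, see lead-in).
BIO_TYPE_DESCRIPTOR = 0x0100
BIO_TYPE_FILTER = 0x0200
BIO_TYPE_SOURCE_SINK = 0x0400


def find_type(chain, wanted):
    """Return the first BIO type in chain matching wanted, else None.

    If wanted has no low byte set it names a class, so any BIO with one of
    those class bits matches; otherwise an exact match is required.
    """
    low = wanted & 0xFF
    for bio_type in chain:
        if low == 0:
            if bio_type & wanted:
                return bio_type
        elif bio_type == wanted:
            return bio_type
    return None


untyped = [0x81]  # custom BIO carrying only its index, no class bits
typed = [0x81 | BIO_TYPE_SOURCE_SINK | BIO_TYPE_DESCRIPTOR]

print(find_type(untyped, BIO_TYPE_SOURCE_SINK))  # None
print(hex(find_type(typed, BIO_TYPE_SOURCE_SINK)))
```

A search by class cannot see a BIO whose type carries no class bits, which is consistent with the problem going unnoticed except by code that audits the BIO chain.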
*	Revert analyze support for partitioned tables	Alvaro Herrera	2021-08-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts the following commits: 1b5617eb844cd2470a334c1d2eec66cf9b39c41a Describe (auto-)analyze behavior for partitioned tables 0e69f705cc1a3df273b38c9883fb5765991e04fe Set pg_class.reltuples for partitioned tables 41badeaba8beee7648ebe7923a41c04f1f3cb302 Document ANALYZE storage parameters for partitioned tables 0827e8af70f4653ba17ed773f123a60eadd9f9c9 autovacuum: handle analyze for partitioned tables There are efficiency issues in this code when handling databases with large numbers of partitions, and it doesn't look like there is any trivial way to handle those. There are some other issues as well. It's now too late in the cycle for nontrivial fixes, so we'll have to let Postgres 14 users continue to manually ANALYZE their partitioned tables, and hopefully we can fix the issues for Postgres 15. I kept [most of] be280cdad298 ("Don't reset relhasindex for partitioned tables on ANALYZE") because while we added it due to 0827e8af70f4, it is a good bugfix in its own right, since it affects manual analyze as well as autovacuum-induced analyze, and there's no reason to revert it. I retained the addition of relkind 'p' to tables included by pg_stat_user_tables, because reverting that would require a catversion bump. Also, in pg14 only, I keep a struct member that was added to PgStat_TabStatEntry to avoid breaking compatibility with existing stat files. Backpatch to 14. Discussion: https://postgr.es/m/20210722205458.f2bug3z6qzxzpx2s@alap3.anarazel.de
*	Reduce memory consumption for pending invalidation messages.	Tom Lane	2021-08-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The existing data structures in inval.c are fairly inefficient for the common case of a command or subtransaction that registers a small number of cache invalidation events. While this doesn't matter if we commit right away, it can build up to a lot of bloat in a transaction that contains many DDL operations. By making a few more assumptions about the expected use-case, we can switch to a representation using densely-packed arrays. Although this eliminates some data-copying, it doesn't seem to make much difference time-wise. But the space consumption decreases substantially. Patch by me; thanks to Nathan Bossart for review. Discussion: https://postgr.es/m/2380555.1622395376@sss.pgh.pa.us
*	Emit namespace in the post-copy errmsg	Daniel Gustafsson	2021-08-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	During a VACUUM or CLUSTER command, the initial output emits a fully qualified relation path with namespace. The post-action errmsg only emitted the relation name however, which may lead to hard to parse output when using multiple jobs with vacuumdb as the output from different jobs may be interleaved. Include the full path in the post-action errmsg to be consistent with the initial errmsg. Author: Mike Fiedler <miketheman@gmail.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/CAMerE0oz+8G-aORZL_BJcPxnBqewZAvND4bSUysjz+r-oT1BxQ@mail.gmail.com
*	Refresh apply delay on reload of recovery_min_apply_delay at recovery	Michael Paquier	2021-08-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit ensures that the wait interval in the replay delay loop waiting for an amount of time defined by recovery_min_apply_delay is correctly handled on reload, recalculating the delay if this GUC value is updated, based on the timestamp of the commit record being replayed. The previous behavior would be problematic for example with replay still waiting even if the delay got reduced or just cancelled. If the apply delay was increased to a larger value, the wait would have just respected the old value set, finishing earlier. Author: Soumyadeep Chakraborty, Ashwin Agrawal Reviewed-by: Kyotaro Horiguchi, Michael Paquier Discussion: https://postgr.es/m/CAE-ML+93zfr-HLN8OuxF0BjpWJ17O5dv1eMvSE5jsj9jpnAXZA@mail.gmail.com Backpatch-through: 9.6
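The corrected loop can be sketched with a simulated clock. Everything here is a simplified assumption: `replay_wait` and `get_delay` are hypothetical names, time is an integer, and `get_delay(now)` stands in for reading recovery_min_apply_delay after a possible configuration reload.

```python
def replay_wait(commit_ts, get_delay, start_time, sleep_step=1):
    """Return the simulated time at which replay of a commit record proceeds."""
    now = start_time
    while True:
        # Recompute the target on every wakeup so that a reloaded delay
        # takes effect, always relative to the record's commit timestamp.
        delay_until = commit_ts + get_delay(now)
        if now >= delay_until:
            return now
        now += sleep_step


# Delay reduced from 10 to 3 at time 2: replay proceeds at 3, not 10.
print(replay_wait(0, lambda now: 10 if now < 2 else 3, 0))
# Delay increased from 5 to 8 at time 2: replay waits until 8, not 5.
print(replay_wait(0, lambda now: 5 if now < 2 else 8, 0))
```

Computing `delay_until` once before the loop would model the previous behavior, where the wait kept honoring whatever value was in effect when it began.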
*	Un-break s_lock_test.	Tom Lane	2021-08-13
\| \| \| \| \| \| \| \| \| \|	Commit 80abbeba2 evidently didn't bother checking this code. Also, list the generated executable in .gitignore (so it's been a REALLY long time since anyone tried this). Noted while trying out RISC-V spinlock patch. Given that this has been broken for 5 years and nobody noticed, it's likely not worth back-patching.
*	Remove support for background workers without BGWORKER_SHMEM_ACCESS.	Andres Freund	2021-08-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Background workers without shared memory access have been broken on EXEC_BACKEND / windows builds since shortly after background workers were introduced, without that being reported. Clearly they are not commonly used. The problem is that bgworker startup requires being attached to shared memory in EXEC_BACKEND child processes. StartBackgroundWorker() detaches from shared memory for unconnected workers, but at that point we already have initialized subsystems referencing shared memory. Fixing this problem is not entirely trivial, so removing the option to not be connected to shared memory seems the best way forward. In most use cases the advantages of being connected to shared memory far outweigh the disadvantages. As there have been no reports about this issue so far, we have decided that it is not worth trying to address the problem in the back branches. Per discussion with Alvaro Herrera, Robert Haas and Tom Lane. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20210802065116.j763tz3vz4egqy3w@alap3.anarazel.de
*	Fix typo.	Andres Freund	2021-08-13
\| \| \| \| \|	Reported-By: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/YRIlNQhLNfx555Nx@paquier.xyz
*	Make EXEC_BACKEND more convenient on macOS.	Thomas Munro	2021-08-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's hard to disable ASLR on current macOS releases, for testing with -DEXEC_BACKEND. You could already set the environment variable PG_SHMEM_ADDR to something not likely to collide with mappings created earlier in process startup. Let's also provide a default value that works on current releases and architectures, for developer convenience. As noted in the pre-existing comment, this is a horrible hack, but -DEXEC_BACKEND is only used by Unix-based PostgreSQL developers for testing some otherwise Windows-only code paths, so it seems excusable. Back-patch to all supported branches. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/20210806032944.m4tz7j2w47mant26%40alap3.anarazel.de
*	Use appropriate tuple descriptor in FDW batching	Tomas Vondra	2021-08-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The FDW batching code was using the same tuple descriptor both for all slots (regular and plan slots), but that's incorrect - the subplan may use a different descriptor. Currently this is benign, because batching is used only for INSERTs, and in that case the descriptors always match. But that would change if we allow batching UPDATEs. Fix by copying the appropriate tuple descriptor. Backpatch to 14, where the FDW batching was implemented. Author: Amit Langote Backpatch-through: 14, where FDW batching was added Discussion: https://postgr.es/m/CA%2BHiwqEWd5B0-e-RvixGGUrNvGkjH2s4m95%3DJcwUnyV%3Df0rAKQ%40mail.gmail.com
*	Fix grammar mistake in hash index README	John Naylor	2021-08-12
\| \| \| \| \| \|	Dilip Kumar Discussion: https://www.postgresql.org/message-id/CAFiTN-tjZbuY6vy7kZZ6xO%2BD4mVcO5wOPB5KiwJ3AHhpytd8fg%40mail.gmail.com
*	Avoid unnecessary shared invalidations in ROLLBACK PREPARED	Michael Paquier	2021-08-12
\| \| \| \| \| \| \| \| \| \|	The performance gain is minimal, but this makes the logic more consistent with AtEOXact_Inval(). No other invalidation is needed in this case as PREPARE already takes care of sending any local ones. Author: Liu Huailing Reviewed-by: Tom Lane, Michael Paquier Discussion: https://postgr.es/m/OSZPR01MB6215AA84D71EF2B3D354CF86BE139@OSZPR01MB6215.jpnprd01.prod.outlook.com
*	Fix segfault during EvalPlanQual with mix of local and foreign partitions.	Heikki Linnakangas	2021-08-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's not sensible to re-evaluate a direct-modify Foreign Update or Delete during EvalPlanQual. However, ExecInitForeignScan() can still get called if a table mixes local and foreign partitions. EvalPlanQualStart() left the es_result_relations array uninitialized in the child EPQ EState, but ExecInitForeignScan() still expected to find it. That caused a segfault. Fix by skipping the es_result_relations lookup during EvalPlanQual processing. To make things a bit more robust, also skip the BeginDirectModify calls, and add a runtime check that ExecForeignScan() is not called on direct-modify foreign scans during EvalPlanQual processing. This is new in v14, commit 1375422c782. Before that, EvalPlanQualStart() copied the whole ResultRelInfo array to the EPQ EState. Backpatch to v14. Report and diagnosis by Andrey Lepikhov. Discussion: https://www.postgresql.org/message-id/cb2b808d-cbaa-4772-76ee-c8809bafcf3d%40postgrespro.ru
*	Add call to object access hook at the end of table rewrite in ALTER TABLE	Michael Paquier	2021-08-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ALTER TABLE .. SET {LOGGED,UNLOGGED,ACCESS METHOD} would never do a table-level object access hook, which was inconsistent with SET TABLESPACE. Note that contrary to SET TABLESPACE, the no-op case is left off for those commands as this requires tracking if commands have been called, but they may not execute a physical rewrite. Another thing worth noting is that the physical file swap at the end of a rewrite does a couple of access calls for internal objects created for the swap operation (internal objects are for example skipped by the tests of sepgsql), but this does not trigger the hook for the table on which the operation is done. f41872d, that added support for SET LOGGED/UNLOGGED in ALTER TABLE, visibly forgot to consider that. Based on what I checked, two regression tests of sepgsql in ddl.sql are going to log more information with this test, something that buildfarm member rhinoceros will tell soon enough. I am not completely sure of their format though, so these are not refreshed yet. This is arguably a bug, but no backpatch is done as this could cause a behavior change for anybody using object access hooks. Reported-by: Jeff Davis Discussion: https://postgr.es/m/YQJKV29/1a60uG68@paquier.xyz
*	Let regexp_replace() make use of REG_NOSUB when feasible.	Tom Lane	2021-08-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the replacement string doesn't contain \1...\9, then we don't need sub-match locations, so we can use the REG_NOSUB optimization here too. There's already a pre-scan of the replacement string to look for backslashes, so extend that to check for digits, and refactor to allow that to happen before we compile the regexp. While at it, try to speed up the pre-scan by using memchr() instead of a handwritten loop. It's likely that this is lost in the noise compared to the regexp processing proper, but maybe not. In any case, this coding is shorter. Also, add some test cases to improve the poor coverage of appendStringInfoRegexpSubstr(). Discussion: https://postgr.es/m/3534632.1628536485@sss.pgh.pa.us
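The pre-scan described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name replacement_has_backref() is hypothetical, and it assumes that a backslash followed by any character other than 1-9 is an escape whose second character must be skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the PostgreSQL function): report whether a
 * replacement string of length len contains \1 ... \9.  If it does not,
 * the caller would not need sub-match locations.
 */
bool
replacement_has_backref(const char *s, size_t len)
{
    const char *p = s;
    const char *end = s + len;

    assert(s != NULL);

    while (p < end)
    {
        /* Jump to the next backslash instead of testing every byte. */
        p = memchr(p, '\\', (size_t) (end - p));
        if (p == NULL)
            break;
        p++;                    /* step past the backslash */
        if (p >= end)
            break;              /* trailing backslash, nothing follows */
        if (*p >= '1' && *p <= '9')
            return true;
        p++;                    /* skip the escaped character */
    }
    return false;
}
```

Skipping the character after each backslash keeps a doubled backslash followed by a digit from being misread as a back-reference.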
*	Fix bogus assertion in BootstrapModeMain().	Andres Freund	2021-08-09
\| \| \| \| \| \| \|	The assertion was always true, as written, thanks to me "simplifying" it before commit. Per coverity and Tom Lane.
*	Avoid determining regexp subexpression matches, when possible.	Tom Lane	2021-08-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Identifying the precise match locations for parenthesized subexpressions is a fairly expensive task given the way our regexp engine works, both at regexp compile time (where we must create an optimized NFA for each parenthesized subexpression) and at runtime (where determining exact match locations requires laborious search). Up to now we've made little attempt to optimize this situation. This patch identifies cases where we know at compile time that we won't need to know subexpression match locations, and teaches the regexp compiler to not bother creating per-subexpression regexps for parenthesis pairs that are not referenced by backrefs elsewhere in the regexp. (To preserve semantics, we obviously still have to pin down the match locations of backref references.) Users could have obtained the same results before this by being careful to write "non capturing" parentheses wherever possible, but few people bother with that. Discussion: https://postgr.es/m/2219936.1628115334@sss.pgh.pa.us
*	Use ExplainPropertyInteger for queryid in EXPLAIN	David Rowley	2021-08-09
\| \| \| \| \| \| \| \| \| \| \|	This saves a few lines of code. Also add a comment to mention why we use ExplainPropertyInteger instead of ExplainPropertyUInteger given that queryid is a uint64 type. Author: David Rowley Reviewed-by: Julien Rouhaud Discussion: https://postgr.es/m/CAApHDvqhSLYpSU_EqUdN39w9Uvb8ogmHV7_3YhJ0S3aScGBjsg@mail.gmail.com Backpatch-through: 14, where this code was originally added
*	Remove some unnecessary casts in format arguments	Peter Eisentraut	2021-08-08
\| \| \| \| \| \|	We can use %zd or %zu directly, no need to cast to int. Conversely, some code was casting away from int when it could be using %d directly.
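As a small illustration of the point (a sketch, not code from the commit; format_sizes() is a hypothetical name and ssize_t assumes a POSIX system), size_t and ssize_t values can be passed straight to %zu and %zd without casting:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>          /* ssize_t (POSIX) */

/*
 * Hypothetical example: format a size_t and an ssize_t directly.
 * Returns what snprintf() returns, i.e. the length of the full message.
 */
int
format_sizes(char *buf, size_t buflen, size_t nbytes, ssize_t nread)
{
    assert(buf != NULL);

    /* %zu matches size_t and %zd matches ssize_t, so no (int) casts. */
    return snprintf(buf, buflen, "wrote %zu bytes, read %zd", nbytes, nread);
}
```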
*	Check the size in COPY_POINTER_FIELD	Peter Eisentraut	2021-08-08
\| \| \| \| \| \|	instead of making each caller do it. Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce@enterprisedb.com
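A rough sketch of the idea, assuming a simplified node type: DemoNode and copy_demo_node() are hypothetical, and malloc() stands in for the backend's allocator. The zero-size check sits inside the macro, so callers no longer repeat it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a node with a variable-length array field. */
typedef struct DemoNode
{
    int         numCols;
    int        *colIdx;
} DemoNode;

/* Copy a pointer field; the size check lives here, not in each caller. */
#define COPY_POINTER_FIELD(fldname, sz) \
    do { \
        size_t _size = (sz); \
        if (_size > 0) \
        { \
            newnode->fldname = malloc(_size); \
            assert(newnode->fldname != NULL); \
            memcpy(newnode->fldname, from->fldname, _size); \
        } \
    } while (0)

DemoNode *
copy_demo_node(const DemoNode *from)
{
    /* calloc leaves colIdx NULL when there is nothing to copy */
    DemoNode   *newnode = calloc(1, sizeof(DemoNode));

    assert(newnode != NULL);
    newnode->numCols = from->numCols;
    COPY_POINTER_FIELD(colIdx, from->numCols * sizeof(int));
    return newnode;
}
```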
*	Change NestPath node to contain JoinPath node	Peter Eisentraut	2021-08-08
\| \| \| \| \| \| \|	This makes the structure of all JoinPath-derived nodes the same, independent of whether they have additional fields. Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce@enterprisedb.com
*	Change SeqScan node to contain Scan node	Peter Eisentraut	2021-08-08
\| \| \| \| \| \| \|	This makes the structure of all Scan-derived nodes the same, independent of whether they have additional fields. Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce@enterprisedb.com
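The layout this describes can be sketched with heavily simplified, hypothetical versions of the structs (the real definitions carry many more fields). Because every Scan-derived node starts with an embedded Scan, a pointer to any of them can be read through a Scan pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; field lists are illustrative assumptions. */
typedef enum NodeTag { T_Invalid, T_SeqScan, T_IndexScan } NodeTag;

typedef struct Plan { NodeTag type; } Plan;

typedef struct Scan
{
    Plan        plan;
    unsigned    scanrelid;
} Scan;

/* A wrapper struct even though it adds no fields of its own. */
typedef struct SeqScan
{
    Scan        scan;
} SeqScan;

typedef struct IndexScan
{
    Scan        scan;
    unsigned    indexid;
} IndexScan;

/* Works for any node whose first member is a Scan. */
unsigned
scan_relid(const Scan *node)
{
    assert(node != NULL);
    return node->scanrelid;
}
```

With this shape, code that accesses `node->scan.scanrelid` reads the same for SeqScan as for IndexScan, which appears to be the uniformity the commit is after.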
*	Rethink regexp engine's backref-related compilation state.	Tom Lane	2021-08-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I had committer's remorse almost immediately after pushing cb76fbd7e, upon finding that removing capturing subexpressions' subREs from the data structure broke my proposed patch for REG_NOSUB optimization. Revert that data structure change. Instead, address the concern about not changing capturing subREs' endpoints by not changing the endpoints. We don't need to, because the point of that bit was just to ensure that the atom has endpoints distinct from the outer state pair that we're stringing the branch between. We already made suitable states in the parenthesized-subexpression case, so the additional ones were just useless overhead. This seems more understandable than Spencer's original coding, and it ought to be a shade faster too by saving a few state creations and arc changes. (I actually see a couple percent improvement on Jacobson's web corpus, though that's barely above the noise floor so I wouldn't put much stock in that result.) Also, fix the logic added by ea1268f63 to ensure that the subRE recorded in v->subs[subno] is exactly the one with capno == subno. Spencer's original coding recorded the child subRE of the capture node, which is okay so far as having the right endpoint states is concerned, but as of cb76fbd7e the capturing subRE itself always has those endpoints too. I think the inconsistency is confusing for the REG_NOSUB optimization. As before, backpatch to v14. Discussion: https://postgr.es/m/0203588E-E609-43AF-9F4F-902854231EE7@enterprisedb.com
*	Make regexp engine's backref-related compilation state more bulletproof.	Tom Lane	2021-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Up to now, we remembered the definition of a capturing parenthesis subexpression by storing a pointer to the associated subRE node. That was okay before, because that subRE didn't get modified anymore while parsing the rest of the regexp. However, in the wake of commit ea1268f63, that's no longer true: the outer invocation of parseqatom() feels free to scribble on that subRE. This seems to work anyway, because the states we jam into the child atom in the "prepare a general-purpose state skeleton" stanza aren't really semantically different from the original endpoints of the child atom. But that would be mighty easy to break, and it's definitely not how things worked before. Between this and the issue fixed in the prior commit, it seems best to get rid of this dependence on subRE nodes entirely. We don't need the whole child subRE for future backrefs, only its starting and ending NFA states; so let's just store pointers to those. Also, in the corner case where we make an extra subRE to handle immediately-nested capturing parentheses, it seems like it'd be smart to have the extra subRE have the same begin/end states as the original child subRE does (s/s2 not lp/rp). I think that linking it from lp to rp might actually be semantically wrong, though since Spencer's original code did it that way, I'm not totally certain. Using s/s2 is certainly not wrong, in any case. Per report from Mark Dilger. Back-patch to v14 where the problematic patches came in. Discussion: https://postgr.es/m/0203588E-E609-43AF-9F4F-902854231EE7@enterprisedb.com
*	Fix use-after-free issue in regexp engine.	Tom Lane	2021-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit cebc1d34e taught parseqatom() to optimize cases where a branch contains only one, "messy", atom by getting rid of excess subRE nodes. The way we really should do that is to keep the subRE built for the "messy" child atom; but to avoid changing parseqatom's nominal API, I made it delete that node after copying its fields to the outer subRE made by parsebranch(). It seems that that actually worked at the time; but it became dangerous after ea1268f63, because that later commit allowed the lower invocation of parse() to return a subRE that was also pointed to by some v->subs[] entry. This meant we could wind up with a dangling pointer in v->subs[], allowing a later backref to misbehave, but only if that subRE struct had been reused in between. So the damage seems confined to cases like '((...))...(...\2'. To fix, do what I should have done before and modify parseqatom's API to make it possible for it to remove the caller's subRE instead of the callee's. That's safer because we know that subRE isn't complete yet, so noplace else will have a pointer to it. Per report from Mark Dilger. Back-patch to v14 where the problematic patches came in. Discussion: https://postgr.es/m/0203588E-E609-43AF-9F4F-902854231EE7@enterprisedb.com
*	Move temporary file cleanup to before_shmem_exit().	Andres Freund	2021-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As reported by a few OSX buildfarm animals, there exists at least one path where temporary files exist during AtProcExit_Files() processing. As temporary file cleanup causes pgstat reporting, the assertions added in ee3f8d3d3ae caused failures. This is not an OSX-specific issue; we were just lucky that timing on OSX reliably triggered the problem. The known way to cause this is a FATAL error during perform_base_backup() with a MANIFEST used - adding an elog(FATAL) after InitializeBackupManifest() reliably reproduces the problem in isolation. The problem is that the temporary file created in InitializeBackupManifest() is not cleaned up via resource owner cleanup, as WalSndResourceCleanup() currently is only used for non-FATAL errors. That then makes it possible to reach AtProcExit_Files() with existing temporary files, causing the assertion failure. To fix this problem, move temporary file cleanup to a before_shmem_exit() hook and add assertions ensuring that no temporary files are created before / after temporary file management has been initialized / shut down. The cleanest way to do so seems to be to split fd.c initialization into two, one for plain file access and one for temporary file access. Right now there's no need to perform further fd.c cleanup during process exit, so I just renamed AtProcExit_Files() to BeforeShmemExit_Files(). Alternatively we could perform another pass through the files to check that no temporary files exist, but the added assertions seem to provide enough protection against that. It might turn out that the assertions added in ee3f8d3d3ae will cause too much noise - in that case we'll have to downgrade them to a WARNING, at least temporarily. This commit is not necessarily the best approach to address this issue, but it should resolve the buildfarm failures. We can revise later. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20210807190131.2bm24acbebl4wl6i@alap3.anarazel.de
*	Really fix the ambiguity in REFRESH MATERIALIZED VIEW CONCURRENTLY.	Tom Lane	2021-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than trying to pick table aliases that won't conflict with any possible user-defined matview column name, adjust the queries' syntax so that the aliases are only used in places where they can't be mistaken for column names. Mostly this consists of writing "alias." not just "alias", which adds clarity for humans as well as machines. We do have the issue that "SELECT alias." acts differently from "SELECT alias", but we can use the same hack ruleutils.c uses for whole-row variables in SELECT lists: write "alias.*::compositetype". We might as well revert to the original aliases after doing this; they're a bit easier to read. Like 75d66d10e, back-patch to all supported branches. Discussion: https://postgr.es/m/2488325.1628261320@sss.pgh.pa.us
*	Message style improvements	Peter Eisentraut	2021-08-07
\|