path: root/src/backend
* Remove still more useless assignments.  (Tom Lane, 2020-09-04)

  Fix some more things scan-build pointed to as dead stores. In some of these
  cases, rearranging the code a little leads to more readable code IMO. It's
  all cosmetic, though.

  Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com
* Fix bogus MaxAllocSize check in logtape.c.  (Jeff Davis, 2020-09-04)

  Reported-by: Peter Geoghegan
  Discussion: https://postgr.es/m/CAH2-Wz=NZPZc3-fkdmvu=w2itx0PiB-G6QpxHXZOjuvFAzPdZw@mail.gmail.com
  Backpatch-through: 13
* Report expected contrecord length on mismatch.  (Alvaro Herrera, 2020-09-04)

  When reading a WAL record fails because the continuation record(s) are not
  of the proper length, report what length was expected, for clarity.

  Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
  Discussion: https://postgr.es/m/20200903212152.GA15319@alvherre.pgsql
* Remove some more useless assignments.  (Tom Lane, 2020-09-04)

  Found with clang's scan-build tool. It also whines about a lot of other dead
  stores that we should *not* change IMO, either as a matter of style or
  future-proofing. But these places seem like clear oversights.

  Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com
* Fix inline marking introduced in commit 464824323e.  (Amit Kapila, 2020-09-04)

  Forgot to add the inline marking to the changes_filename() declaration. In
  passing, add inline marking for the similar function subxact_filename().

  Reported-By: Nathan Bossart
  Discussion: https://postgr.es/m/E98FBE8F-B878-480D-A728-A60C6EED3047@amazon.com
* Remove redundant initializations.  (Bruce Momjian, 2020-09-03)

  Reported-by: Ranier Vilela
  Author: Ranier Vilela
  Backpatch-through: master
  Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com
* Remove variable "concurrent" from ReindexStmt.  (Michael Paquier, 2020-09-04)

  This node already handles multiple options using a bitmask, so having a
  separate boolean flag is not necessary. This simplifies the code a bit, with
  fewer arguments to pass to the reindex routines, by replacing the boolean
  with an equivalent bitmask value.

  Reviewed-by: Julien Rouhaud
  Discussion: https://postgr.es/m/20200902110326.GA14963@paquier.xyz
* Remove arbitrary restrictions on password length.  (Tom Lane, 2020-09-03)

  This patch started out with the goal of harmonizing various arbitrary limits
  on password length, but after a while a better idea emerged: let's just get
  rid of those fixed limits.

  recv_password_packet() has an arbitrary limit on the packet size, which we
  don't really need, so just drop it. (Note that this doesn't really affect
  anything for MD5 or SCRAM password verification, since those will hash the
  user's password to something shorter anyway. It does matter for auth methods
  that require a cleartext password.) Likewise remove the arbitrary error
  condition in pg_saslprep().

  The remaining limits are mostly in client-side code that prompts for
  passwords. To improve those, refactor simple_prompt() so that it allocates
  its own result buffer that can be made as big as necessary. Actually, it
  proves best to make a separate routine pg_get_line() that has essentially
  the semantics of fgets(), except that it allocates a suitable result buffer
  and hence will never return a truncated line. (pg_get_line has a lot of
  potential applications to replace randomly-sized fgets buffers elsewhere,
  but I'll leave that for another patch.)

  I built pg_get_line() atop stringinfo.c, which requires moving that code to
  src/common/; but that seems fine since it was a poor fit for src/port/
  anyway.

  This patch is mostly mine, but it owes a good deal to Nathan Bossart, who
  pressed for a solution to the password length problem and created a
  predecessor patch. Also thanks to Peter Eisentraut and Stephen Frost for
  ideas and discussion.

  Discussion: https://postgr.es/m/09512C4F-8CB9-4021-B455-EF4C4F0D55A0@amazon.com
* Avoid lockup of a parallel worker when reporting a long error message.  (Tom Lane, 2020-09-03)

  Because sigsetjmp() will restore the initial state with signals blocked, the
  code path in bgworker.c for reporting an error and exiting would execute
  that way. Usually this is fairly harmless; but if a parallel worker had an
  error message exceeding the shared-memory communication buffer size (16K)
  it would lock up, because it would wait for a resume-sending signal from its
  parallel leader which it would never detect.

  To fix, just unblock signals at the appropriate point.

  This can be shown to fail back to 9.6. The lack of parallel query
  infrastructure makes it difficult to provide a simple test case for 9.5; but
  I'm pretty sure the issue exists in some form there as well, so apply the
  code change there too.

  Vignesh C, reviewed by Bharath Rupireddy, Robert Haas, and myself

  Discussion: https://postgr.es/m/CALDaNm1d1hHPZUg3xU4XjtWBOLCrA+-2cJcLpw-cePZ=GgDVfA@mail.gmail.com
* Allow records to span multiple lines in pg_hba.conf and pg_ident.conf.  (Tom Lane, 2020-09-03)

  A backslash at the end of a line now causes the next line to be appended to
  the current one (effectively, the backslash and newline are discarded). This
  allows long HBA entries to be created without legibility problems.

  While we're here, get rid of the former hard-wired length limit on
  pg_hba.conf lines, by using an expansible StringInfo buffer instead of a
  fixed-size local variable.

  Since the same code is used to read the ident map file, these changes apply
  there as well.

  Fabien Coelho, reviewed by Justin Pryzby and David Zhang

  Discussion: https://postgr.es/m/alpine.DEB.2.21.2003251906140.15243@pseudo
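  A minimal sketch of the new continuation syntax (the database, user, and
  network names here are hypothetical): a trailing backslash splices the next
  physical line onto the current logical record.

      # pg_hba.conf: one logical entry split across two physical lines
      host    mydb    myuser    10.0.0.0/8 \
              scram-sha-256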
* Add support for streaming to built-in logical replication.  (Amit Kapila, 2020-09-03)

  To add support for streaming of in-progress transactions into the built-in
  logical replication, we need to do three things:

  * Extend the logical replication protocol to identify in-progress
    transactions, and allow adding additional bits of information (e.g. the
    XID of subtransactions).

  * Modify the output plugin (pgoutput) to implement the new stream API
    callbacks, by leveraging the extended replication protocol.

  * Modify the replication apply worker to properly handle streamed
    in-progress transactions by spilling the data to disk and then replaying
    them on commit.

  We however must explicitly disable streaming replication during replication
  slot creation, even if the plugin supports it. We don't need to replicate
  the changes accumulated during this phase, and moreover we don't have a
  replication connection open, so we have nowhere to send the data anyway.

  Author: Tomas Vondra, Dilip Kumar and Amit Kapila
  Reviewed-by: Amit Kapila, Kuntal Ghosh and Ajin Cherian
  Tested-by: Neha Sharma, Mahendra Singh Thalor and Ajin Cherian
  Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
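  A hedged usage sketch: on the subscriber side, streaming of large
  in-progress transactions is opted into per subscription (the subscription,
  connection, and publication names are hypothetical; the option name is as
  it appears in PostgreSQL 14).

      CREATE SUBSCRIPTION mysub
        CONNECTION 'host=publisher dbname=postgres'
        PUBLICATION mypub
        WITH (streaming = on);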
* Add string_to_table() function.  (Tom Lane, 2020-09-02)

  This splits a string at occurrences of a delimiter. It is exactly like
  string_to_array() except for producing a set of values instead of an array
  of values. Thus, the relationship of these two functions is the same as
  between regexp_split_to_table() and regexp_split_to_array().

  Although the same results could be had from unnest(string_to_array()), this
  is somewhat faster than that, and anyway it seems reasonable to have it for
  symmetry with the regexp functions.

  Pavel Stehule, reviewed by Peter Smith

  Discussion: https://postgr.es/m/CAFj8pRD8HOpjq2TqeTBhSo_QkzjLOhXzGCpKJ4nCs7Y9SQkuPw@mail.gmail.com
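  For illustration, the new function next to its array-producing sibling:

      SELECT * FROM string_to_table('one,two,three', ',');
      -- three rows: one / two / three

      SELECT string_to_array('one,two,three', ',');
      -- one row: {one,two,three}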
* Avoid unnecessary acquisition of SyncRepLock at transaction commit time.  (Fujii Masao, 2020-09-02)

  In the SyncRepWaitForLSN() routine called at transaction commit time,
  SyncRepLock is necessary to atomically both check the shared
  sync_standbys_defined flag and operate on the sync replication wait-queue.
  On the other hand, when the flag is false, the lock is not necessary because
  the wait-queue is not touched. But due to the changes by commit 48c9f49265,
  previously the lock was taken whatever the flag was. This could cause
  unnecessary performance overhead at every transaction commit. Therefore this
  commit avoids that unnecessary acquisition of SyncRepLock.

  Author: Fujii Masao
  Reviewed-by: Asim Praveen, Masahiko Sawada
  Discussion: https://postgr.es/m/20200406050332.nsscfqjzk2d57zyx@alap3.anarazel.de
* Improve handling of dropped relations for REINDEX DATABASE/SCHEMA/SYSTEM.  (Michael Paquier, 2020-09-02)

  When multiple relations are reindexed, a scan of pg_class is done first to
  build the list of relations to work on. However, the REINDEX logic has never
  checked whether a listed relation still exists when beginning the work on
  it, causing, for example, sudden cache lookup failures.

  This commit adds safeguards against dropped relations for REINDEX, similarly
  to VACUUM or CLUSTER, where we try to open the relation and ignore it if it
  is missing. A new option is added to the REINDEX routines to control whether
  a missing relation is OK to ignore or not.

  An isolation test, based on REINDEX SCHEMA, is added for the concurrent and
  non-concurrent cases.

  Author: Michael Paquier
  Reviewed-by: Anastasia Lubennikova
  Discussion: https://postgr.es/m/20200813043805.GE11663@paquier.xyz
* Set cutoff xmin more aggressively when vacuuming a temporary table.  (Tom Lane, 2020-09-01)

  Since other sessions aren't allowed to look into a temporary table of our
  own session, we do not need to worry about the global xmin horizon when
  setting the vacuum XID cutoff. Indeed, if we're not inside a transaction
  block, we may set oldestXmin to be the next XID, because there cannot be any
  in-doubt tuples in a temp table, nor any tuples that are dead but still
  visible to some snapshot of our transaction. (VACUUM, of course, is never
  inside a transaction block; but we need to test that because CLUSTER shares
  the same code.)

  This approach allows us to always clean out a temp table completely during
  VACUUM, independently of concurrent activity. Aside from being useful in its
  own right, that simplifies building reproducible test cases.

  Discussion: https://postgr.es/m/3490536.1598629609@sss.pgh.pa.us
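  A small sketch of the effect (table and column names are hypothetical):

      CREATE TEMP TABLE scratch AS
        SELECT g FROM generate_series(1, 1000) AS g;
      DELETE FROM scratch;
      VACUUM scratch;  -- run outside a transaction block: all dead rows are
                       -- reclaimed even if other sessions hold old snapshots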
* Raise error on concurrent drop of partitioned index.  (Alvaro Herrera, 2020-09-01)

  We were already raising an error for DROP INDEX CONCURRENTLY on a
  partitioned table, albeit a different and confusing one:

    ERROR:  DROP INDEX CONCURRENTLY must be first action in transaction

  Change that to throw a more comprehensible error:

    ERROR:  cannot drop partitioned index "%s" concurrently

  Michael Paquier authored the test case for indexes on temporary partitioned
  tables.

  Backpatch to 11, where indexes on partitioned tables were added.

  Reported-by: Jan Mussler <jan.mussler@zalando.de>
  Reviewed-by: Michael Paquier <michael@paquier.xyz>
  Discussion: https://postgr.es/m/16594-d2956ca909585067@postgresql.org
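  An illustrative reproduction (object names are hypothetical):

      CREATE TABLE parted (a int) PARTITION BY RANGE (a);
      CREATE INDEX parted_a_idx ON parted (a);
      DROP INDEX CONCURRENTLY parted_a_idx;
      -- ERROR:  cannot drop partitioned index "parted_a_idx" concurrently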
* Fix the SharedFileSetUnregister API.  (Amit Kapila, 2020-09-01)

  Commit 808e13b282 introduced a few APIs to extend the existing BufFile
  interface. In SharedFileSetDeleteOnProcExit, it tries to delete the list
  element while traversing the list with the 'foreach' construct, which makes
  the behavior of the list traversal unpredictable.

  Author: Amit Kapila
  Reviewed-by: Dilip Kumar
  Tested-by: Dilip Kumar and Neha Sharma
  Discussion: https://postgr.es/m/CAA4eK1JhLatVcQ2OvwA_3s0ih6Hx9+kZbq107cXVsSWWukH7vA@mail.gmail.com
* Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE.  (Tom Lane, 2020-08-30)

  Historically, we've considered the state with relpages and reltuples both
  zero as indicating that we do not know the table's tuple density. This is
  problematic because it's impossible to distinguish "never yet vacuumed" from
  "vacuumed and seen to be empty". In particular, a user cannot use VACUUM or
  ANALYZE to override the planner's normal heuristic that an empty table
  should not be believed to be empty, because it is probably about to get
  populated. That heuristic is a good safety measure, so I don't care to
  abandon it, but there should be a way to override it if the table is indeed
  intended to stay empty.

  Hence, represent the initial state of ignorance by setting reltuples to -1
  (relpages is still set to zero), and apply the minimum-ten-pages heuristic
  only when reltuples is still -1. If the table is empty, VACUUM or ANALYZE
  (but not CREATE INDEX) will override that to reltuples = relpages = 0, and
  then we'll plan on that basis.

  This requires a bunch of fiddly little changes, but we can get rid of some
  ugly kluges that were formerly needed to maintain the old definition.

  One notable point is that FDWs' GetForeignRelSize methods will see
  baserel->tuples = -1 when no ANALYZE has been done on the foreign table.
  That seems like a net improvement, since those methods were formerly also in
  the dark about what baserel->tuples = 0 really meant. Still, it is an API
  change.

  I bumped catversion because code predating this change would get confused by
  seeing reltuples = -1.

  Discussion: https://postgr.es/m/F02298E0-6EF4-49A1-BCB6-C484794D9ACC@thebuild.com
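  The new states are easy to observe (table name is hypothetical):

      CREATE TABLE fresh (a int);
      SELECT reltuples FROM pg_class WHERE relname = 'fresh';  -- -1: unknown
      VACUUM fresh;
      SELECT reltuples FROM pg_class WHERE relname = 'fresh';  -- 0: known empty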
* Reset indisreplident for an invalid index in DROP INDEX CONCURRENTLY.  (Michael Paquier, 2020-08-30)

  A failure when concurrently dropping an index used in a replica identity
  could leave in pg_index an index marked as !indisvalid and indisreplident.
  Reindexing this index would switch indisvalid back to true, and if the
  replica identity of the parent relation was switched to use a different
  index, it would be possible to finish with more than one index marked as
  indisreplident. If that were to happen, this could mess up the relation
  cache, as an incorrect index could be used for the replica identity.

  Indexes marked as invalid are discarded as candidates for the replica
  identity, as of RelationGetIndexList(), so similarly to what is done with
  indisclustered, resetting indisreplident when the index is marked as invalid
  keeps things consistent. REINDEX CONCURRENTLY's swapping already resets the
  flag for the old index, while the new index inherits the value of the old
  to-be-dropped index, so only DROP INDEX was an issue.

  Even if this is a bug, the sequence able to reproduce a problem requires a
  failure while running DROP INDEX CONCURRENTLY, something unlikely to happen
  in the field, so no backpatch is done.

  Author: Michael Paquier
  Reviewed-by: Dmitry Dolgov
  Discussion: https://postgr.es/m/20200827025721.GN2017@paquier.xyz
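  A quick consistency check suggested by the fix: with indisreplident now
  reset alongside indisvalid, this query should never return a row whose
  indisvalid is false.

      SELECT indexrelid::regclass, indisvalid, indisreplident
      FROM pg_index
      WHERE indisreplident;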
* Fix code for re-finding scan position in a multicolumn GIN index.  (Tom Lane, 2020-08-27)

  collectMatchBitmap() needs to re-find the index tuple it was previously
  looking at, after transiently dropping lock on the index page it's on. The
  tuple should still exist and be at its prior position or somewhere to the
  right of that, since ginvacuum never removes tuples but concurrent
  insertions could add one. However, there was a thinko in that logic, to the
  effect of expecting any inserted tuples to have the same index "attnum" as
  what we'd been scanning. Since there's no physical separation of tuples with
  different attnums, it's not terribly hard to devise scenarios where this
  fails, leading to transient "lost saved point in index" errors. (While I've
  duplicated this with manual testing, it seems impossible to make a
  reproducible test case with our available testing technology.)

  Fix by just continuing the scan when the attnum doesn't match. While here,
  improve the error message used if we do fail, so that it matches the wording
  used in btree for a similar case.

  collectMatchBitmap()'s posting-tree code path was previously not exercised
  at all by our regression tests. While I can't make a regression test that
  exhibits the bug, I can at least improve the code coverage here, so do that.
  The test case I made for this is an extension of one added by 4b754d6c1, so
  it only works in HEAD and v13; didn't seem worth trying hard to back-patch
  it.

  Per bug #16595 from Jesse Kinkead. This has been broken since multicolumn
  capability was added to GIN (commit 27cb66fdf), so back-patch to all
  supported branches.

  Discussion: https://postgr.es/m/16595-633118be8eef9ce2@postgresql.org
* Fix comment in procarray.c.  (Michael Paquier, 2020-08-27)

  The description of GlobalVisDataRels was missing, GlobalVisCatalogRels being
  mentioned instead.

  Author: Jim Nasby
  Discussion: https://postgr.es/m/8e06c883-2858-1fd4-07c5-560c28b08dcd@amazon.com
* Suppress compiler warning in non-cassert builds.  (Tom Lane, 2020-08-26)

  Oversight in 808e13b28, reported by Bruce Momjian.

  Discussion: https://postgr.es/m/20200826160251.GB21909@momjian.us
* Add additional information in the vacuum error context.  (Amit Kapila, 2020-08-26)

  The additional information added is an offset number for heap operations.
  This will help us find the exact tuple that caused the error.

  Author: Mahendra Singh Thalor and Amit Kapila
  Reviewed-by: Sawada Masahiko, Justin Pryzby and Amit Kapila
  Discussion: https://postgr.es/m/CAKYtNApK488TDF4bMbw+1QH8HJf9cxdNDXquhU50TK5iv_FtCQ@mail.gmail.com
* Extend the BufFile interface.  (Amit Kapila, 2020-08-26)

  Allow BufFile to support temporary files that can be used by a single
  backend when the corresponding files need to survive across transactions and
  need to be opened and closed multiple times. Such files need to be created
  as a member of a SharedFileSet.

  Additionally, this commit implements the interface for BufFileTruncate to
  allow files to be truncated up to a particular offset, and extends the
  BufFileSeek API to support the SEEK_END case. This also adds an option to
  provide a mode while opening the shared BufFiles, instead of always opening
  in read-only mode.

  These enhancements in the BufFile interface are required for the upcoming
  patch to allow the replication apply worker to handle streamed in-progress
  transactions.

  Author: Dilip Kumar, Amit Kapila
  Reviewed-by: Amit Kapila
  Tested-by: Neha Sharma
  Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
* Move code for pg_backend_memory_contexts from mmgr/mcxt.c to adt/mcxtfuncs.c.  (Fujii Masao, 2020-08-26)

  Previously the code for pg_backend_memory_contexts was in
  src/backend/utils/mmgr/mcxt.c. This commit moves it to
  src/backend/utils/adt/mcxtfuncs.c so that mcxt.c contains basically only the
  low-level interface for memory contexts.

  Author: Atsushi Torikoshi
  Reviewed-by: Michael Paquier, Fujii Masao
  Discussion: https://postgr.es/m/20200819135545.GC19121@paquier.xyz
* Prevent non-superusers from reading pg_backend_memory_contexts, by default.  (Fujii Masao, 2020-08-26)

  The pg_backend_memory_contexts view contains some internal information about
  memory contexts. Since exposing it to all users by default may cause
  security issues, this commit allows only superusers to read this view by
  default, as we do for the pg_shmem_allocations view.

  Bump catalog version.

  Author: Atsushi Torikoshi
  Reviewed-by: Michael Paquier, Fujii Masao
  Discussion: https://postgr.es/m/1414992.1597849297@sss.pgh.pa.us
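  Presumably, as with other superuser-restricted views, a superuser can still
  open up access selectively (monitoring_role is a hypothetical role):

      GRANT SELECT ON pg_backend_memory_contexts TO monitoring_role;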
* Fixup some misusages of bms_num_members().  (David Rowley, 2020-08-26)

  It's a bit inefficient to test if a Bitmapset is empty by counting all the
  members and seeing if that number is zero. It's much better just to use
  bms_is_empty(). Likewise for checking if there are at least two members:
  just use bms_membership(), which does not need to do anything more after
  finding two members.

  Reviewed-by: Tomas Vondra
  Discussion: https://postgr.es/m/CAApHDvpvwm_QjbDOb5xga%2BKmX9XkN9xQavNGm3SvDbVnCYOerQ%40mail.gmail.com
* Improve the vacuum error context phase information.  (Amit Kapila, 2020-08-24)

  We were displaying the wrong phase information for 'info' messages in the
  index cleanup phase because we were switching to the previous phase a bit
  early. We were also not displaying context information for the heap phase
  unless the block number is valid, which is fine for error cases but, for
  messages at 'info' or lower error levels, appears inconsistent with the
  index phase information.

  Reported-by: Sawada Masahiko
  Author: Sawada Masahiko
  Reviewed-by: Amit Kapila
  Backpatch-through: 13, where it was introduced
  Discussion: https://postgr.es/m/CA+fd4k4HcbhPnCs7paRTw1K-AHin8y4xKomB9Ru0ATw0UeTy2w@mail.gmail.com
* Avoid pushing quals down into sub-queries that have grouping sets.  (Tom Lane, 2020-08-22)

  The trouble with doing this is that an apparently-constant subquery output
  column isn't really constant if it is a grouping column that appears in only
  some of the grouping sets. A qual using such a column would be subject to
  incorrect const-folding after push-down, as seen in bug #16585 from Paul
  Sivash.

  To fix, just disable qual pushdown altogether if the sub-query has nonempty
  groupingSets. While we could imagine far less restrictive solutions, there
  is not much point in working harder right now, because subquery_planner()
  won't move HAVING clauses to WHERE within such a subquery. If the qual stays
  in HAVING it's not going to be a lot more useful than if we'd kept it at the
  outer level.

  Having said that, this restriction could be removed if we used a parsetree
  representation that distinguished such outputs from actual constants, which
  is something I hope to do in future. Hence, make the patch a minimal
  addition rather than integrating it more tightly (e.g. by renumbering the
  existing items in subquery_is_pushdown_safe's comment).

  Back-patch to 9.5 where grouping sets were introduced.

  Discussion: https://postgr.es/m/16585-9d8c340d23ade8c1@postgresql.org
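  An illustrative query shape (the table is hypothetical): the planner now
  keeps the outer qual above the subquery, since "a" is NULL in the (b)
  grouping set and thus is not a genuine constant inside it.

      SELECT a, n
      FROM (SELECT a, b, count(*) AS n
            FROM tab
            GROUP BY GROUPING SETS ((a), (b))) AS ss
      WHERE a = 1;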
* Fix ALTER TABLE's scheduling rules for AT_AddConstraint subcommands.  (Tom Lane, 2020-08-22)

  Commit 1281a5c90 rearranged the logic in this area rather drastically, and
  it broke the case of adding a foreign key constraint in the same ALTER that
  adds the pkey or unique constraint it depends on. While self-referential
  fkeys are surely a pretty niche case, this used to work so we shouldn't
  break it.

  To fix, reorganize the scheduling rules in ATParseTransformCmd so that a
  transformed AT_AddConstraint subcommand will be delayed into a later pass in
  all cases, not only when it's been spit out as a side-effect of parsing some
  other command type.

  Also tweak the logic so that we won't run ATParseTransformCmd twice while
  doing this. It seems to work even without that, but it's surely wasting
  cycles to do so.

  Per bug #16589 from Jeremy Evans. Back-patch to v13 where the new code was
  introduced.

  Discussion: https://postgr.es/m/16589-31c8d981ca503896@postgresql.org
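  The repaired case, sketched with hypothetical names: a self-referential
  foreign key added in the same ALTER as the primary key it depends on.

      CREATE TABLE node (id int, parent int);
      ALTER TABLE node
        ADD PRIMARY KEY (id),
        ADD FOREIGN KEY (parent) REFERENCES node (id);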
* Fix handling of CREATE TABLE LIKE with inheritance.  (Tom Lane, 2020-08-21)

  If a CREATE TABLE command uses both LIKE and traditional inheritance, Vars
  in CHECK constraints and expression indexes that are absorbed from a LIKE
  parent table tended to get mis-numbered, resulting in wrong answers and/or
  bizarre error messages (though probably not any actual crashes, thanks to
  validation occurring in the executor). In v12 and up, the same could happen
  to Vars in GENERATED expressions, even in cases with no LIKE clause but
  multiple traditional-inheritance parents.

  The cause of the problem for LIKE is that parse_utilcmd.c supposed it could
  renumber such Vars correctly during transformCreateStmt(), which it cannot
  since we have not yet accounted for columns added via inheritance. Fix that
  by postponing processing of LIKE INCLUDING CONSTRAINTS, DEFAULTS, GENERATED,
  INDEXES till after we've performed DefineRelation().

  The error with GENERATED and multiple inheritance is a simple oversight in
  MergeAttributes(); it knows it has to renumber Vars in inherited CHECK
  constraints, but forgot to apply the same processing to inherited GENERATED
  expressions (a/k/a defaults).

  Per bug #16272 from Tom Gottfried. The non-GENERATED variants of the issue
  are ancient, presumably dating right back to the addition of CREATE TABLE
  LIKE; hence back-patch to all supported branches.

  Discussion: https://postgr.es/m/16272-6e32da020e9a9381@postgresql.org
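  A sketch of the formerly-broken shape (hypothetical tables): the inherited
  column "p" shifts the attribute numbers of the columns behind the LIKE'd
  CHECK constraint, which must be renumbered accordingly.

      CREATE TABLE parent (p int);
      CREATE TABLE src (a int, b int CHECK (b > a));
      CREATE TABLE child (LIKE src INCLUDING CONSTRAINTS) INHERITS (parent);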
* Rework EXPLAIN for planner's buffer usage.  (Fujii Masao, 2020-08-21)

  Commit ce77abe63c allowed EXPLAIN (BUFFERS) to report the information on
  buffer usage during the planning phase. However, three issues were reported
  regarding this feature.

  (1) Previously, the EXPLAIN option BUFFERS required ANALYZE, so the query
  had to be actually executed by specifying ANALYZE even when we wanted to see
  only the planner's buffer usage. This was inconvenient, especially when the
  query was a write query like DELETE.

  (2) EXPLAIN included the planner's buffer usage in the summary information,
  so the SUMMARY option had to be enabled to report it. Also this format was
  confusing.

  (3) The output structure for planning information was not consistent between
  the TEXT format and the others. For example, a "Planning" tag was output in
  JSON format, but not in TEXT format.

  For (1), this commit allows us to perform EXPLAIN (BUFFERS) without ANALYZE
  to report the planner's buffer usage. For (2), this commit changed EXPLAIN
  output so that the planner's buffer usage is reported before the summary
  information. For (3), this commit made the output structure for planning
  information more consistent between the formats.

  Back-patch to v13 where the planner's buffer usage was allowed to be
  reported in EXPLAIN.

  Reported-by: Pierre Giraud, David Rowley
  Author: Fujii Masao
  Reviewed-by: David Rowley, Julien Rouhaud, Pierre Giraud
  Discussion: https://postgr.es/m/07b226e6-fa49-687f-b110-b7c37572f69e@dalibo.com
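  A usage sketch for issue (1): the statement below is only planned, never
  executed, and the planner's buffer counts appear under the "Planning"
  section of the output (table name hypothetical).

      EXPLAIN (BUFFERS) DELETE FROM big_table WHERE id = 42;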
* Fix typos in comments.  (Fujii Masao, 2020-08-21)

  Author: Masahiko Sawada
  Reviewed-by: Fujii Masao
  Discussion: https://postgr.es/m/CA+fd4k4m9hFSrRLB3etPWO5_v5=MujVZWRtz63q+55hM0Dz25Q@mail.gmail.com
* Fix a few typos in JIT comments and README.  (David Rowley, 2020-08-21)

  Reviewed-by: Abhijit Menon-Sen
  Reviewed-by: Andres Freund
  Discussion: https://postgr.es/m/CAApHDvobgmCs6CohqhKTUf7D8vffoZXQTCBTERo9gbOeZmvLTw%40mail.gmail.com
  Backpatch-through: 11, where JIT was added
* Revert "Make vacuum a bit more verbose to debug BF failure."  (Andres Freund, 2020-08-20)

  This reverts commit 49967da65aec970fcda123acc681f1df5d70bfc6. Enough time
  has passed that we can be confident that 07f32fcd23a resolved the issue.
  Therefore we can remove the temporary debugging aids.

  Author: Andres Freund <andres@anarazel.de>
  Discussion: https://postgr.es/m/E1k7tGP-0005V0-5k@gemulon.postgresql.org
* Acquire ProcArrayLock exclusively in ProcArrayClearTransaction.  (Andres Freund, 2020-08-19)

  This corrects an oversight by me in 20729324078, which made
  ProcArrayClearTransaction() increment xactCompletionCount. That requires an
  exclusive lock, obviously.

  There are other approaches that avoid the exclusive acquisition, but given
  that a 2PC commit is fairly heavyweight, it doesn't seem worth doing so.
  I've not been able to measure a performance difference, unsurprisingly. I
  did add a comment documenting that we could do so, should it ever become a
  bottleneck.

  Reported-By: Tom Lane <tgl@sss.pgh.pa.us>
  Author: Andres Freund <andres@anarazel.de>
  Discussion: https://postgr.es/m/1355915.1597794204@sss.pgh.pa.us
* Suppress unnecessary RelabelType nodes in yet more cases.  (Tom Lane, 2020-08-19)

  Commit a477bfc1d fixed eval_const_expressions() to ensure that it didn't
  generate unnecessary RelabelType nodes, but I failed to notice that some
  other places in the planner had the same issue. Really noplace in the
  planner should be using plain makeRelabelType(), for fear of generating
  expressions that should be equal() to semantically equivalent trees, but
  aren't.

  An example is that because canonicalize_ec_expression() failed to be careful
  about this, we could end up with an equivalence class containing both a
  plain Const, and a Const-with-RelabelType representing exactly the same
  value. So far as I can tell this led to no visible misbehavior, but we did
  waste a bunch of cycles generating and evaluating "Const =
  Const-with-RelabelType" to prove such entries are redundant.

  Hence, move the support function added by a477bfc1d to where it can be more
  generally useful, and use it in the places where planner code previously
  used makeRelabelType.

  Back-patch to v12, like the previous patch. While I have no concrete
  evidence of any real misbehavior here, it's certainly possible that I
  overlooked a case where equivalent expressions that aren't equal() could
  cause a user-visible problem. In any case carrying extra RelabelType nodes
  through planning to execution isn't very desirable.

  Discussion: https://postgr.es/m/1311836.1597781384@sss.pgh.pa.us
* Add pg_backend_memory_contexts system view.  (Fujii Masao, 2020-08-19)

  This view displays the usage of all the memory contexts of the server
  process attached to the current session. This information is useful for
  investigating the cause of backend-local memory bloat.

  This information can also be collected by calling
  MemoryContextStats(TopMemoryContext) via a debugger. But this technique
  cannot be used in some environments because no debugger is available there.
  Also, it outputs lots of text messages that are not easy to analyze. So the
  pg_backend_memory_contexts view allows us to access backend-local memory
  context information more easily.

  Bump catalog version.

  Author: Atsushi Torikoshi, Fujii Masao
  Reviewed-by: Tatsuhito Kasahara, Andres Freund, Daniel Gustafsson, Robert Haas, Michael Paquier
  Discussion: https://postgr.es/m/72a656e0f71d0860161e0b3f67e4d771@oss.nttdata.com
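  A typical diagnostic query against the new view, listing the current
  backend's largest memory contexts:

      SELECT name, parent, total_bytes, used_bytes
      FROM pg_backend_memory_contexts
      ORDER BY total_bytes DESC
      LIMIT 5;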
* Fix race condition in snapshot caching when 2PC is used.  (Andres Freund, 2020-08-18)

  When preparing a transaction, xactCompletionCount needs to be incremented,
  even though the transaction has not committed yet. Otherwise the snapshot
  used within the transaction can get reused outside of the prepared
  transaction. As GetSnapshotData() does not include the current xid when
  building a snapshot, reuse would not be correct.

  Somewhat surprisingly, the regression tests only rarely show incorrect
  results without the fix. The reason for that is that often the snapshot's
  xmax will be >= the backend xid, yielding a snapshot that is correct,
  despite the bug.

  I'm working on a reliable test for the bug, but it seems worth seeing
  whether this fixes all the BF failures while I do.

  Author: Andres Freund <andres@anarazel.de>
  Discussion: https://postgr.es/m/E1k7tGP-0005V0-5k@gemulon.postgresql.org
* snapshot scalability: cache snapshots using an xact completion counter.  (Andres Freund, 2020-08-17)

  Previous commits made it faster/more scalable to compute snapshots. But not
  building a snapshot is still faster. Now that GetSnapshotData() does not
  maintain RecentGlobal* anymore, that is actually not too hard:

  This commit introduces xactCompletionCount, which tracks the number of
  top-level transactions with xids (i.e. which may have modified the database)
  that completed in some form since the start of the server.

  We can avoid rebuilding the snapshot's contents whenever the current
  xactCompletionCount is the same as it was when the snapshot was originally
  built. Currently this check happens while holding ProcArrayLock. While it's
  likely possible to perform the check without acquiring ProcArrayLock, it
  seems better to do that separately / later, as some careful analysis is
  required. Even with the lock this is a significant win on its own.

  On a smaller two socket machine this gains another ~1.03x; on a larger
  machine the effect is roughly double (earlier patch version tested though).
  If we were able to safely avoid the lock there'd be another significant gain
  on top of that.

  Author: Andres Freund <andres@anarazel.de>
  Reviewed-By: Robert Haas <robertmhaas@gmail.com>
  Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
  Reviewed-By: David Rowley <dgrowleyml@gmail.com>
  Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de
* Mark commit and abort WAL records with XLR_SPECIAL_REL_UPDATE.  (Heikki Linnakangas, 2020-08-17)

  If a commit or abort record includes "dropped relfilenodes", then replaying
  the record will remove data files. That is surely a "special rel update",
  but the records were not marked as such. Fix that, teach pg_rewind to expect
  and ignore them, and add a test case to cover it.

  It's always been like this, but no backporting, for fear of breaking
  existing applications. If an application parsed the WAL but was not handling
  commit/abort records, it would stop working. That might be a good thing if
  it really needed to handle the dropped rels, but it will be caught when the
  application is updated to work with PostgreSQL v14 anyway.

  Discussion: https://www.postgresql.org/message-id/07b33e2c-46a6-86a1-5f9e-a7da73fddb95%40iki.fi
  Reviewed-by: Amit Kapila, Michael Paquier
* Make xact.h usable in frontend.  (Heikki Linnakangas, 2020-08-17)

  xact.h included utils/datetime.h, which cannot be used in the frontend (it
  includes fmgr.h, which needs Datum). But xact.h only needs the definition of
  TimestampTz from it, which is available directly in datatype/timestamp.h.
  Change xact.h to include that instead of utils/datetime.h, so that it can be
  used in client programs.
* Fix use of wrong index in ComputeXidHorizons().  (Andres Freund, 2020-08-16)

  This bug, recently introduced in 941697c3c1a, at least led to vacuum failing
  because it found tuples inserted by a running transaction, but below the
  freeze limit. The freeze limit in turn is directly affected by the
  aforementioned bug.

  Thanks to Tom Lane for figuring out how to make the bug reproducible.

  We should add a few more assertions to make sure this type of bug isn't as
  hard to notice, but it's not yet clear how to best do so.

  Co-Diagnosed-By: Tom Lane <tgl@sss.pgh.pa.us>
  Author: Andres Freund <andres@anarazel.de>
  Discussion: https://postgr.es/m/1013484.1597609043@sss.pgh.pa.us
* Make vacuum a bit more verbose to debug BF failure.  (Andres Freund, 2020-08-16)

  This is temporary. While possibly some more error checking / debugging in
  this path would be a good thing, it'll not look exactly like this.

  Discussion: https://postgr.es/m/20200816181604.l54m6kss5ntd6xow@alap3.anarazel.de
* Correct several behavior descriptions in comments.  (Noah Misch, 2020-08-15)

  Reuse cautionary language from src/test/ssl/README in
  src/test/kerberos/README. SLRUs have had access to six-character segment
  names since commit 73c986adde5d73a5e2555da9b5c8facedb146dcd, and recovery
  stopped calling HeapTupleHeaderAdvanceLatestRemovedXid() in commit
  558a9165e081d1936573e5a7d576f5febd7fb55a. The other corrections are more
  self-evident.
* Prevent concurrent SimpleLruTruncate() for any given SLRU.  (Noah Misch, 2020-08-15)

  The SimpleLruTruncate() header comment states the new coding rule. To
  achieve this, add locktype "frozenid" and two LWLocks. This closes a rare
  opportunity for data loss, which manifested as "apparent wraparound" or
  "could not access status of transaction" errors. Data loss is more likely in
  pg_multixact, due to released branches' thin margin between multiStopLimit
  and multiWrapLimit. If a user's physical replication primary logged
  ": apparent wraparound" messages, the user should rebuild standbys of that
  primary regardless of symptoms. At less risk is a cluster having emitted
  "not accepting commands" errors or "must be vacuumed" warnings at some
  point. One can test a cluster for this data loss by running VACUUM FREEZE in
  every database.

  Back-patch to 9.5 (all supported versions).

  Discussion: https://postgr.es/m/20190218073103.GA1434723@rfd.leadboat.com
* Remove obsolete HAVE_BUGGY_SOLARIS_STRTOD.  (Peter Eisentraut, 2020-08-15)

  Fixed more than 10 years ago.

  Reviewed-by: Noah Misch <noah@leadboat.com>
  Discussion: https://www.postgresql.org/message-id/flat/aa266ede-baaa-f4e6-06cf-5b1737610e9a%402ndquadrant.com
* Be more careful about the shape of hashable subplan clauses.  (Tom Lane, 2020-08-14)

  nodeSubplan.c expects that the testexpr for a hashable ANY SubPlan has the
  form of one or more OpExprs whose LHS is an expression of the outer query's,
  while the RHS is an expression over Params representing output columns of
  the subquery. However, the planner only went as far as verifying that the
  clauses were all binary OpExprs. This works 99.99% of the time, because the
  clauses have the right shape when emitted by the parser --- but it's
  possible for function inlining to break that, as reported by PegoraroF10.

  To fix, teach the planner to check that the LHS and RHS contain the right
  things, or more accurately don't contain the wrong things. Given that this
  has been broken for years without anyone noticing, it seems sufficient to
  just give up hashing when it happens, rather than go to the trouble of
  commuting the clauses back again (which wouldn't necessarily work anyway).

  While poking at that, I also noticed that nodeSubplan.c had a baked-in
  assumption that the number of hash clauses is identical to the number of
  subquery output columns. Again, that's fine as far as parser output goes,
  but it's not hard to break it via function inlining. There seems little
  reason for that assumption though --- AFAICS, the only thing it's buying us
  is not having to store the number of hash clauses explicitly. Adding code to
  the planner to reject such cases would take more code than getting
  nodeSubplan.c to cope, so I fixed it that way.

  This has been broken for as long as we've had hashable SubPlans, so
  back-patch to all supported branches.

  Discussion: https://postgr.es/m/1549209182255-0.post@n3.nabble.com
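  For orientation, the typical hashable shape as the parser emits it (tables
  are hypothetical): the subquery's output is hashed, and each outer row
  probes it with a testexpr of the form "t.a = $1".

      SELECT *
      FROM outer_tab AS t
      WHERE t.a IN (SELECT s.b FROM inner_tab AS s);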
* snapshot scalability: Move subxact info to ProcGlobal, remove PGXACT.  (Andres Freund, 2020-08-14)

  Similar to the previous changes, this increases the chance that data
  frequently needed by GetSnapshotData() stays in L2 cache. In many workloads
  subtransactions are very rare, and this makes the check for that
  considerably cheaper.

  As this removes the last member of PGXACT, there is no need to keep it
  around anymore.

  On a larger 2 socket machine this and the two preceding commits result in a
  ~1.07x performance increase in read-only pgbench. For read-heavy mixed r/w
  workloads without row level contention, I see about 1.1x.

  Author: Andres Freund <andres@anarazel.de>
  Reviewed-By: Robert Haas <robertmhaas@gmail.com>
  Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
  Reviewed-By: David Rowley <dgrowleyml@gmail.com>
  Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de
* snapshot scalability: Move PGXACT->vacuumFlags to ProcGlobal->vacuumFlags.  (Andres Freund, 2020-08-14)

  Similar to the previous commit, this increases the chance that data
  frequently needed by GetSnapshotData() stays in L2 cache. As we now take
  care to not unnecessarily write to ProcGlobal->vacuumFlags, there should be
  very few modifications to the ProcGlobal->vacuumFlags array.

  Author: Andres Freund <andres@anarazel.de>
  Reviewed-By: Robert Haas <robertmhaas@gmail.com>
  Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
  Reviewed-By: David Rowley <dgrowleyml@gmail.com>
  Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de