aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access
Commit message (Collapse)AuthorAge
* Add LSN location in some error messages related to WAL pagesMichael Paquier2022-12-05
| | | | | | | | | | | | | | The error messages reported during any failures while reading or validating the header of a WAL currently includes only the offset of the page but not the compiled LSN referring to the page, requiring an extra step to compile it if looking at the surroundings with pg_waldump or similar. Adding this information costs a bit in translation, but also eases debugging. Author: Bharath Rupireddy Reviewed-by: Álvaro Herrera, Kyotaro Horiguchi, Maxim Orlov, Michael Paquier Discussion: https://postgr.es/m/CALj2ACWV=FCddsxcGbVOA=cvPyMr75YCFbSQT6g4KDj=gcJK4g@mail.gmail.com
* Generalize ri_RootToPartitionMap to use for non-partition childrenAlvaro Herrera2022-12-02
| | | | | | | | | | | | | | | | | | | | | | | ri_RootToPartitionMap is currently only initialized for tuple routing target partitions, though a future commit will need the ability to use it even for the non-partition child tables, so make adjustments to the decouple it from the partitioning code. Also, make it lazily initialized via ExecGetRootToChildMap(), making that function its preferred access path. Existing third-party code accessing it directly should no longer do so; consequently, it's been renamed to ri_RootToChildMap, which also makes it consistent with ri_ChildToRootMap. ExecGetRootToChildMap() houses the logic of setting the map appropriately depending on whether a given child relation is partition or not. To support this, also add a separate entry point for TupleConversionMap creation that receives an AttrMap. No new code here, just split an existing function in two. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqEYUhDXSK5BTvG_xk=eaAEJCD4GS3C6uH7ybBvv+Z_Tmg@mail.gmail.com
* Fix memory leak for hashing with nondeterministic collations.Jeff Davis2022-12-01
| | | | | | | Backpatch through 12, where nondeterministic collations were introduced (5e1963fb76). Backpatch-through: 12
* Improve heuristics for compressing the KnownAssignedXids array.Tom Lane2022-11-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, we'd compress only when the active range of array entries reached Max(4 * PROCARRAY_MAXPROCS, 2 * pArray->numKnownAssignedXids). If max_connections is large, the first term could result in not compressing for a long time, resulting in much wastage of cycles in hot-standby backends scanning the array to take snapshots. Get rid of that term, and just bound it to 2 * pArray->numKnownAssignedXids. That however creates the opposite risk, that we might spend too much effort compressing. Hence, consider compressing only once every 128 commit records. (This frequency was chosen by benchmarking. While we only tried one benchmark scenario, the results seem stable over a fairly wide range of frequencies.) Also, force compression when processing RecoveryInfo WAL records (which should be infrequent); the old code could perform compression then, but would do so only after the same array-range check as for the transaction-commit path. Also, opportunistically run compression if the startup process is about to wait for WAL, though not oftener than once a second. This should prevent cases where we waste lots of time by leaving the array not-compressed for long intervals due to low WAL traffic. Lastly, add a simple check to keep us from uselessly compressing when the array storage is already compact. Back-patch, as the performance problem is worse in pre-v14 branches than in HEAD. Simon Riggs and Michail Nikolaev, with help from Tom Lane and Andres Freund. Discussion: https://postgr.es/m/CALdSSPgahNUD_=pB_j=1zSnDBaiOtqVfzo8Ejt5J_k7qZiU1Tw@mail.gmail.com
* Add 'missing_ok' argument to build_attrmap_by_nameAlvaro Herrera2022-11-29
| | | | | | | | | | | | When it's given as true, return a 0 in the position of the missing column rather than raising an error. This is currently unused, but it allows us to reimplement column permission checking in a subsequent commit. It seems worth breaking into a separate commit because it affects unrelated code. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqFfiai=qBxPDTjaio_ZcaqUKh+FC=prESrB8ogZgFNNNQ@mail.gmail.com
* Remove promote_trigger_file.Thomas Munro2022-11-29
| | | | | | | | | | | | | | | | | | | Previously, an idle startup (recovery) process would wake up every 5 seconds to have a chance to poll for promote_trigger_file, even if that GUC was not configured. That promotion triggering mechanism was effectively superseded by pg_ctl promote and pg_promote() a long time ago. There probably aren't many users left and it's very easy to change to the modern mechanisms, so we agreed to remove the feature. This is part of a campaign to reduce wakeups on idle systems. Author: Simon Riggs <simon.riggs@enterprisedb.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Ian Lawrence Barwick <barwick@gmail.com> Discussion: https://postgr.es/m/CANbhV-FsjnzVOQGBpQ589%3DnWuL1Ex0Ykn74Nh1hEjp2usZSR5g%40mail.gmail.com
* Improve indenting in _hash_pgaddtupDavid Rowley2022-11-25
| | | | | | | | | The Assert added in d09dbeb9b came out rather ugly after having run pgindent on that code. Here we adjust things to use some local variables so that the Assert remains within the 80-character margin. Author: Ted Yu Discussion: https://postgr.es/m/CALte62wLSir1=x93Jf0xZvHaO009FEJfhVMFwnaR8q=csPP8kQ@mail.gmail.com
* Make multixact error message more explicitAlvaro Herrera2022-11-24
| | | | | | | | | | | | | | There are recent reports involving a very old error message that we have no history of hitting -- perhaps a recently introduced bug. Improve the error message in an attempt to improve our chances of investigating the bug. Per reports from Dimos Stamatakis and Bob Krier. Backpatch to 11. Discussion: https://postgr.es/m/CO2PR0801MB2310579F65529380A4E5EDC0E20A9@CO2PR0801MB2310.namprd08.prod.outlook.com Discussion: https://postgr.es/m/17518-04e368df5ad7f2ee@postgresql.org
* Speedup hash index builds by skipping needless binary searchesDavid Rowley2022-11-24
| | | | | | | | | | | | | | | | | When building hash indexes using the spool method, tuples are added to the index page in hashkey order. Because of this, we can safely skip performing the binary search on the existing tuples on the page to find the location to insert the tuple based on its hashkey value. For this case, we can just always put the tuple at the end of the item array as the tuples will always arrive in hashkey order. Testing has shown that this can improve hash index build speeds by 5-15% with a unique set of integer values. Author: Simon Riggs Reviewed-by: David Rowley Tested-by: David Zhang, Tomas Vondra Discussion: https://postgr.es/m/CANbhV-GBc5JoG0AneUGPZZW3o4OK5LjBGeKe_icpC3R1McrZWQ@mail.gmail.com
* Simplify vacuum_set_xid_limits() signature.Peter Geoghegan2022-11-23
| | | | | | | | | | | | Pass VACUUM parameters (VacuumParams state) to vacuum_set_xid_limits() directly, rather than passing most individual VacuumParams fields as separate arguments. Also make vacuum_set_xid_limits() output parameter symbol names match those used by its vacuumlazy.c caller. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-Wz=TE7gW5DgSahDkf0UEZigFGAoHNNN6EvSrdzC=Kn+hrA@mail.gmail.com
* Don't test HEAP_XMAX_INVALID when freezing xmax.Peter Geoghegan2022-11-23
| | | | | | | | | | | | | | | | | | | | We shouldn't ever need to rely on whether HEAP_XMAX_INVALID is set in t_infomask when considering whether or not an xmax should be deemed already frozen, since that status flag is just a hint. The only acceptable representation for an "xmax_already_frozen" raw xmax field is the transaction ID value zero (also known as InvalidTransactionId). Adjust code that superficially appeared to rely on HEAP_XMAX_INVALID to make the rule about xmax_already_frozen clear. Also avoid needlessly rereading the tuple's raw xmax. Oversight in bugfix commit d2599ecf. There is no evidence that this ever led to incorrect behavior, so no backpatch. The worst consequence of this bug was that VACUUM could hypothetically fail to notice and report on certain kinds of corruption, which seems fairly benign. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-Wzkh3DMCDRPfhZxj9xCq9v3WmzvmbiCpf1dNKUBPadhCbQ@mail.gmail.com
* lwlock: Fix quadratic behavior with very long wait listsAndres Freund2022-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Until now LWLockDequeueSelf() sequentially searched the list of waiters to see if the current proc is still is on the list of waiters, or has already been removed. In extreme workloads, where the wait lists are very long, this leads to a quadratic behavior. #backends iterating over a list #backends long. Additionally, the likelihood of needing to call LWLockDequeueSelf() in the first place also increases with the increased length of the wait queue, as it becomes more likely that a lock is released while waiting for the wait list lock, which is held for longer during lock release. Due to the exponential back-off in perform_spin_delay() this is surprisingly hard to detect. We should make that easier, e.g. by adding a wait event around the pg_usleep() - but that's a separate patch. The fix is simple - track whether a proc is currently waiting in the wait list or already removed but waiting to be woken up in PGPROC->lwWaiting. In some workloads with a lot of clients contending for a small number of lwlocks (e.g. WALWriteLock), the fix can substantially increase throughput. As the quadratic behavior arguably is a bug, we might want to decide to backpatch this fix in the future. Author: Andres Freund <andres@anarazel.de> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/20221027165914.2hofzp4cvutj6gin@awork3.anarazel.de Discussion: https://postgr.es/m/CALj2ACXktNbG=K8Xi7PSqbofTZozavhaxjatVc14iYaLu4Maag@mail.gmail.com
* Standardize rmgrdesc recovery conflict XID output.Peter Geoghegan2022-11-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Standardize on the name snapshotConflictHorizon for all XID fields from WAL records that generate recovery conflicts when in hot standby mode. This supersedes the previous latestRemovedXid naming convention. The new naming convention places emphasis on how the values are actually used by REDO routines. How the values are generated during original execution (details of which vary by record type) is deemphasized. Users of tools like pg_waldump can now grep for snapshotConflictHorizon to see all potential sources of recovery conflicts in a standardized way, without necessarily having to consider which specific record types might be involved. Also bring a couple of WAL record types that didn't follow any kind of naming convention into line. These are heapam's VISIBLE record type and SP-GiST's VACUUM_REDIRECT record type. Now every WAL record whose REDO routine calls ResolveRecoveryConflictWithSnapshot() passes through the snapshotConflictHorizon field from its WAL record. This is follow-up work to the refactoring from commit 9e540599 that made FREEZE_PAGE WAL records use a standard snapshotConflictHorizon style XID cutoff. No bump in XLOG_PAGE_MAGIC, since the underlying format of affected WAL records doesn't change. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAH2-Wzm2CQUmViUq7Opgk=McVREHSOorYaAjR1ZpLYkRN7_dPw@mail.gmail.com
* Use correct type name in comments about freezing.Peter Geoghegan2022-11-17
| | | | Oversight in commit 9e540599, which added freeze plan deduplication.
* Variable renaming in preparation for refactoringPeter Eisentraut2022-11-16
| | | | | | | | Rename page -> block and dp -> page where appropriate. The old naming mixed up block and page in confusing ways. Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_YSOnhKsDyFcqJsKtBSrd32DP-jjXmv7hL0BPD-z0TGXQ@mail.gmail.com
* Remove useless castsPeter Eisentraut2022-11-16
| | | | | Maybe these are left from when PageGetItem() was a macro, but now they are clearly useless.
* Turn HeapKeyTest macro into inline functionPeter Eisentraut2022-11-16
| | | | | | | | It is easier to read as a function. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_YSOnhKsDyFcqJsKtBSrd32DP-jjXmv7hL0BPD-z0TGXQ@mail.gmail.com
* Mark argument of RegisterCustomRmgr() as const.Jeff Davis2022-11-15
|
* Deduplicate freeze plans in freeze WAL records.Peter Geoghegan2022-11-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make heapam WAL records that describe freezing performed by VACUUM more space efficient by storing each distinct "freeze plan" once, alongside an array of associated page offset numbers (one per freeze plan). The freeze plans required for most heap pages tend to naturally have a great deal of redundancy, so this technique is very effective in practice. It often leads to freeze WAL records that are less than 20% of the size of equivalent WAL records generated using the previous approach. The freeze plan concept was introduced by commit 3b97e6823b, which fixed bugs in VACUUM's handling of MultiXacts. We retain the concept of freeze plans, but go back to using page offset number arrays. There is no loss of generality here because deduplication is an additive process that gets applied mechanically when FREEZE_PAGE WAL records are built. More than anything else, freeze plan deduplication is an optimization that reduces the marginal cost of freezing additional tuples on pages that will need to have at least one or two tuples frozen in any case. Ongoing work that adds page-level freezing to VACUUM will take full advantage of the improved cost profile through batching. Also refactor some of the details surrounding recovery conflicts needed to REDO freeze records in passing: make original execution responsible for generating a standard latestRemovedXid cutoff, rather than working backwards to get the same cutoff in the REDO routine. Bugfix commit 66fbcb0d2e did it the other way around, which is equivalent but obscures what's going on. Also rename the cutoff field from the WAL record/struct (rename the field cutoff_xid to latestRemovedXid to match similar WAL records). Processing of conflicts by REDO routines is already completely uniform, so tools like pg_waldump should present the information driving the process uniformly. There are two remaining WAL record types that still don't quite follow this convention (heapam's VISIBLE record type and SP-GiST's VACUUM_REDIRECT record type). They can be brought into line by later work that totally standardizes how the cutoffs are presented. Bump XLOG_PAGE_MAGIC. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-By: Nathan Bossart <nathandbossart@gmail.com> Reviewed-By: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/CAH2-Wz=XytErMnb8FAyFd+OQEbiipB0Q2FmFdXrggPL4VBnRYQ@mail.gmail.com
* Fix cleanup lock acquisition in SPLIT_ALLOCATE_PAGE replay.Amit Kapila2022-11-14
| | | | | | | | | | | | | | | | During XLOG_HASH_SPLIT_ALLOCATE_PAGE replay, we were checking for a cleanup lock on the new bucket page after acquiring an exclusive lock on it and raising a PANIC error on failure. However, it is quite possible that checkpointer can acquire the pin on the same page before acquiring a lock on it, and then the replay will lead to an error. So instead, directly acquire the cleanup lock on the new bucket page during XLOG_HASH_SPLIT_ALLOCATE_PAGE replay operation. Reported-by: Andres Freund Author: Robert Haas Reviewed-By: Amit Kapila, Andres Freund, Vignesh C Backpatch-through: 11 Discussion: https://postgr.es/m/20220810022617.fvjkjiauaykwrbse@awork3.anarazel.de
* Refactor aclcheck functionsPeter Eisentraut2022-11-13
| | | | | | | | | | | | | | | | | | Instead of dozens of mostly-duplicate pg_foo_aclcheck() functions, write one common function object_aclcheck() that can handle almost all of them. We already have all the information we need, such as which system catalog corresponds to which catalog table and which column is the ACL column. There are a few pg_foo_aclcheck() that don't work via the generic function and have special APIs, so those stay as is. I also changed most pg_foo_aclmask() functions to static functions, since they are not used outside of aclchk.c. Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Antonin Houska <ah@cybertec.at> Discussion: https://www.postgresql.org/message-id/flat/95c30f96-4060-2f48-98b5-a4392d3b6066@enterprisedb.com
* Refactor ownercheck functionsPeter Eisentraut2022-11-13
| | | | | | | | | | | | Instead of dozens of mostly-duplicate pg_foo_ownercheck() functions, write one common function object_ownercheck() that can handle almost all of them. We already have all the information we need, such as which system catalog corresponds to which catalog table and which column is the owner column. Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Antonin Houska <ah@cybertec.at> Discussion: https://www.postgresql.org/message-id/flat/95c30f96-4060-2f48-98b5-a4392d3b6066@enterprisedb.com
* Document WAL rules related to PD_ALL_VISIBLE in README.Jeff Davis2022-11-12
| | | | | | | Also improve comments. Discussion: https://postgr.es/m/a50005c1c537f89bb359057fd70e66bb83bce969.camel@j-davis.com Reviewed-by: Peter Geoghegan
* Fix theoretical torn page hazard.Jeff Davis2022-11-11
| | | | | | | | | | | | | | | | | | | | | | | | The original report was concerned with a possible inconsistency between the heap and the visibility map, which I was unable to confirm. The concern has been retracted. However, there did seem to be a torn page hazard when using checksums. By not setting the heap page LSN during redo, the protections of minRecoveryPoint were bypassed. Fixed, along with a misleading comment. It may have been impossible to hit this problem in practice, because it would require a page tear between the checksum and the flags, so I am marking this as a theoretical risk. But, as discussed, it did violate expectations about the page LSN, so it may have other consequences. Backpatch to all supported versions. Reported-by: Konstantin Knizhnik Reviewed-by: Konstantin Knizhnik Discussion: https://postgr.es/m/fed17dac-8cb8-4f5b-d462-1bb4908c029e@garret.ru Backpatch-through: 11
* Remove obsolete comments and code from prior to f8f4227976.Jeff Davis2022-11-11
| | | | | | | | XLogReadBufferForRedo() and XLogReadBufferForRedoExtended() only return BLK_NEEDS_REDO if the record LSN is greater than the page LSN, so the redo routine doesn't need to do the LSN check again. Discussion: https://postgr.es/m/0c37b80e62b1f3007d5a6d1292bd8fa0c275627a.camel@j-davis.com
* Fix comment of SimpleLruInit() in slru.cMichael Paquier2022-11-10
| | | | | | | | | | sync_handler was not mentioned in the comment block of the function. Oversight in dee663f. Author: Aleksander Alekseev Discussion: https://postgr.es/m/CAJ7c6TPUd9BwNY47TtMxaijLHSbyHNdhu=kvbGnvO_bi+oC6_Q@mail.gmail.com Backpatch-through: 14
* Report a more useful error for reloptions on a partitioned table.Tom Lane2022-11-09
| | | | | | | | | | | | Previously, trying to set storage parameters on a partitioned table always led to "unrecognized parameter foo", because the code expected there might be some valid parameters; but there aren't any. The docs make clear that it's intended that there never will be any, so let's replace this useless search with a more to-the-point message. Simon Riggs and Karina Litskevich Discussion: https://postgr.es/m/CANbhV-H=eZ9kTR9mUgKGK0Qv9uXP=U+dQg3rinQHfTdFMhBA2A@mail.gmail.com
* Doc: add comments about PreventInTransactionBlock/IsInTransactionBlock.Tom Lane2022-11-09
| | | | | | | | | | | | | Add a little to the header comments for these functions to make it clearer what guarantees about commit behavior are provided to callers. (See commit f92944137 for context.) Although this is only a comment change, it's really documentation aimed at authors of extensions, so it seems appropriate to back-patch. Yugo Nagata and Tom Lane, per further discussion of bug #17434. Discussion: https://postgr.es/m/17434-d9f7a064ce2a88a3@postgresql.org
* Introduce pg_pwrite_zeros() in fileutils.cMichael Paquier2022-11-08
| | | | | | | | | | | | | | | | | | | | | This routine is designed to write zeros to a file using vectored I/O, for a size given by its caller, being useful when it comes to initializing a file with a final size already known. XLogFileInitInternal() in xlog.c is changed to use this new routine when initializing WAL segments with zeros (wal_init_zero enabled). Note that the aligned buffers used for the vectored I/O writes have a size of XLOG_BLCKSZ, and not BLCKSZ anymore, as pg_pwrite_zeros() relies on PGAlignedBlock while xlog.c originally used PGAlignedXLogBlock. This routine will be used in a follow-up patch to do the pre-padding of WAL segments for pg_receivewal and pg_basebackup when these are not compressed. Author: Bharath Rupireddy Reviewed-by: Nathan Bossart, Andres Freund, Thomas Munro, Michael Paquier Discussion: https://www.postgresql.org/message-id/CALj2ACUq7nAb7%3DbJNbK3yYmp-SZhJcXFR_pLk8un6XgDzDF3OA%40mail.gmail.com
* Remove redundant breaks in HeapTupleSatisfiesVisibilityAndres Freund2022-11-05
| | | | | Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/CAAKRu_ZJg_N7zHtWP+JoSY_hrce4+GKioL137Y2c2En-kuXQ7g@mail.gmail.com
* Remove unneeded includes of <sys/stat.h>Michael Paquier2022-11-05
| | | | | | | | | | Since bfb9dfd, none of the files updated in this commit have any stat() calls, so these inclusions are not necessary, for the same reasons as 233cf6e. Per discussion with John Naylor. Discussion: https://postgr.es/m/CAFBsxsGGGX7KD6RxbNoSJzuSc8Gz3hOxcfhTOMLB_hJcm68dKQ@mail.gmail.com
* Improve the description of XLOG_RUNNING_XACTS.Amit Kapila2022-11-02
| | | | | | | | | | | | | | Previously, the description of XLOG_RUNNING_XACTS showed only top-transaction XIDs and whether subtransactions overflowed. This commit improves it to show individual subtransaction XIDs. This also improves the description of overflowed subtransactions. This additional information can be helpful for testing and debugging purposes. Author: Masahiko Sawada Reviewd by: Fujii Masao, Kyotaro Horiguchi, Ashutosh Bapat, Bharath Rupireddy Discussion: https://postgr.es/m/CAD21AoAqvaE+XEeXHHPdAGQPcCoGXxuoeutq_nWhUSQvTt5+tA@mail.gmail.com
* Add doubly linked count list implementationDavid Rowley2022-11-02
| | | | | | | | | | | | | | | | | | | | | | | | | | We have various requirements when using a dlist_head to keep track of the number of items in the list. This, traditionally, has been done by maintaining a counter variable in the calling code. Here we tidy this up by adding "dclist", which is very similar to dlist but also keeps track of the number of items stored in the list. Callers may use the new dclist_count() function when they need to know how many items are stored. Obtaining the count is an O(1) operation. For simplicity reasons, dclist and dlist both use dlist_node as their node type and dlist_iter/dlist_mutable_iter as their iterator type. dclists have all of the same functionality as dlists except there is no function named dclist_delete(). To remove an item from a list dclist_delete_from() must be used. This requires knowing which dclist the given item is stored in. Additionally, here we also convert some dlists where additional code exists to keep track of the number of items stored and to make these use dclists instead. Author: David Rowley Reviewed-by: Bharath Rupireddy, Aleksander Alekseev Discussion: https://postgr.es/m/CAApHDvrtVxr+FXEX0VbViCFKDGxA3tWDgw9oFewNXCJMmwLjLg@mail.gmail.com
* Clean up some inconsistencies with GUC declarationsMichael Paquier2022-10-31
| | | | | | | | | | | | | | | | | | | | This is similar to 7d25958, and this commit takes care of all the remaining inconsistencies between the initial value used in the C variable associated to a GUC and its default value stored in the GUC tables (as of pg_settings.boot_val). Some of the initial values of the GUCs updated rely on a compile-time default. These are refactored so as the GUC table and its C declaration use the same values. This makes everything consistent with other places, backend_flush_after, bgwriter_flush_after, port, checkpoint_flush_after doing so already, for example. Extracted from a larger patch by Peter Smith. The spots updated in the modules are from me. Author: Peter Smith, Michael Paquier Reviewed-by: Nathan Bossart, Tom Lane, Justin Pryzby Discussion: https://postgr.es/m/CAHut+PtHE0XSfjjRQ6D4v7+dqzCw=d+1a64ujra4EX8aoc_Z+w@mail.gmail.com
* Remove AssertArg and AssertStatePeter Eisentraut2022-10-28
| | | | | | | | | These don't offer anything over plain Assert, and their usage had already been declared obsolescent. Author: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/20221009210148.GA900071@nathanxps13
* Allow nodeSort to perform Datum sorts for byref typesDavid Rowley2022-10-28
| | | | | | | | | | | | | | | | | | | | | | | | Here we add a new 'copy' parameter to tuplesort_getdatum so that we can instruct the function not to datumCopy() byref Datums before returning. Similar to 91e9e89dc, this can provide significant performance improvements in nodeSort when sorting by a single byref column and the sort's targetlist contains only that column. This allows us to re-enable Datum sorts for byref types which was disabled in 3a5817695 due to a reported memory leak. Additionally, here we slightly optimize DISTINCT aggregates so that we no longer perform any datumCopy() when we find the current value not to be distinct from the previous value. Previously the code would always take a copy of the most recent Datum and pfree the previous value, even when the values were the same. Testing shows a small but noticeable performance increase when aggregate transitions are skipped due to the current transition value being the same as the prior one. Author: David Rowley Discussion: https://postgr.es/m/CAApHDvqS6wC5U==k9Hd26E4EQXH3QR67-T4=Q1rQ36NGvjfVSg@mail.gmail.com Discussion: https://postgr.es/m/CAApHDvqHonfe9G1cVaKeHbDx70R_zCrM3qP2AGXpGrieSKGnhA@mail.gmail.com
* Update outdated comment for TransactionIdSetTreeStatusHeikki Linnakangas2022-10-25
| | | | | | | | | Commit 06da3c570f changed the way subtransactions are marked as SUBCOMMITTED, but the example it included actually documented the old way. Update it. Author: Japin Li Discussion: https://www.postgresql.org/message-id/MEYP282MB16690BC96DFBE08CC857E1E3B6319%40MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
* Fix recently added incorrect assertionAlvaro Herrera2022-10-24
| | | | | | | | | | Commit df3737a651f4 added an incorrect assertion about the preconditions for invoking the backup cleanup callback: it misfires at session end in case a backup completes successfully. Fix it, using coding from Michaël Paquier. Also add some tests for the various cases. Reported by Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20221021.161038.1277961198945653224.horikyota.ntt@gmail.com
* Improve memory handling across SQL-callable backup functionsMichael Paquier2022-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since pg_backup_start() and pg_backup_stop() exist, the tablespace map data and the backup state data (backup_label string until 7d70809) have been allocated in the TopMemoryContext. This approach would cause memory leaks in the session calling these functions if failures happen before pg_backup_stop() ends, leaking more memory on repeated failures. Both things need little memory so that would not be really noticeable for most users, except perhaps connection poolers with long-lived connections able to trigger backup failures with these functions. This commit improves the logic in this area by not allocating anymore the backup-related data that needs to travel across the SQL-callable backup functions in TopMemoryContext, by using instead a dedicated memory context child of TopMemoryContext. The memory context is created in pg_backup_start() and deleted when finishing pg_backup_stop(). In the event of an in-flight failure, this memory context gets reset in the follow-up pg_backup_start() call, so as we are sure that only one run worth of data is leaked at any time. Some cleanup was already done for the backup data on a follow-up call of pg_backup_start(), but using a memory context makes the whole simpler. BASE_BACKUP commands are executed in isolation, relying on the memory context created for replication commands, hence these do not need such an extra logic. Author: Bharath Rupireddy Reviewed-by: Robert Haas, Alvaro Herrera, Cary Huang, Michael Paquier Discussion: https://postgr.es/m/CALj2ACXqvfKF2B0beQ=aJMdWnpNohmBPsRg=EDQj_6y1t2O8mQ@mail.gmail.com
* Get rid of XLogCtlInsert->forcePageWritesAlvaro Herrera2022-10-19
| | | | | | | | | After commit 39969e2a1e4d, ->forcePageWrites is no longer very interesting: we can just test whether runningBackups is different from 0. This simplifies some code, so do away with it. Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/39969e2a1e4d7f5a37f3ef37d53bbfe171e7d77a
* Remove pg_backup_start_callback and reuse similar codeAlvaro Herrera2022-10-19
| | | | | | | | | | | | | | | | | | | | | We had two copies of almost identical logic to revert shared memory state when a running backup aborts; we can remove pg_backup_start_callback if we adapt do_pg_abort_backup so that it can be used for this purpose too. However, in order for this to work, we have to repurpose the flag passed to do_pg_abort_backup. It used to indicate whether to throw a warning (and the only caller always passed true). It now indicates whether the callback is being called at start time (in which case the session backup state is known not to have been set to RUNNING yet, so action is always taken) or shmem time (in which case action is only taken if the session backup state is RUNNING). Thus the meaning of the flag is no longer superfluous, but it's actually quite critical to get right. I (Álvaro) chose to change the polarity and the code flow re. the flag from what Bharath submitted, for coding clarity. Co-authored-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://www.postgresql.org/message-id/20221013111330.564fk5tkwe3ha77l%40alvherre.pgsql
* Rename SetSingleFuncCall() to InitMaterializedSRF()Michael Paquier2022-10-18
| | | | | | | | | | | | | | | | | | Per discussion, the existing routine name able to initialize a SRF function with materialize mode is unpopular, so rename it. Equally, the flags of this function are renamed, as of: - SRF_SINGLE_USE_EXPECTED -> MAT_SRF_USE_EXPECTED_DESC - SRF_SINGLE_BLESS -> MAT_SRF_BLESS The previous function and flags introduced in 9e98583 are kept around for compatibility purposes, so as any extension code already compiled with v15 continues to work as-is. The declarations introduced here for compatibility will be removed from HEAD in a follow-up commit. The new names have been suggested by Andres Freund and Melanie Plageman. Discussion: https://postgr.es/m/20221013194820.ciktb2sbbpw7cljm@awork3.anarazel.de Backpatch-through: 15
* Have GetCurrentTransactionStopTimestamp() set xactStopTimestamp if unsetAndres Freund2022-10-14
| | | | | | | | | | | | | | | | Previously GetCurrentTransactionStopTimestamp() computed a new timestamp whenever xactStopTimestamp was unset and xactStopTimestamp was only set when a commit or abort record was written. An upcoming patch will add additional calls to GetCurrentTransactionStopTimestamp() from pgstats. To avoid computing timestamps multiple times, set xactStopTimestamp in GetCurrentTransactionStopTimestamp() if not already set. Author: Dave Page <dpage@pgadmin.org> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Vik Fearing <vik@postgresfriends.org> Discussion: https://postgr.es/m/20220906155325.an3xesq5o3fq36gt%40awork3.anarazel.de
* Fix ordering issue with WAL operations in GIN fast insert pathMichael Paquier2022-10-13
| | | | | | | | | | | | | | | | | | Contrary to what is documented in src/backend/access/transam/README, ginHeapTupleFastInsert() had a few ordering issues with the way it does its WAL operations when inserting items in its fast path. First, when using a separate list, XLogBeginInsert() was being always called before START_CRIT_SECTION(), and in this case a second thing was wrong when merging lists, as an exclusive lock was taken on the tail page *before* calling XLogBeginInsert(). Finally, when inserting items into a tail page, the order of XLogBeginInsert() and START_CRIT_SECTION() was reversed. This commit addresses all these issues by moving the calls of XLogBeginInsert() after all the pages logged are locked and pinned, within a critical section. Author: Matthias van de Meent, Zhang Mingli Discussion: https://postgr.es/m/CAEze2WhL8uLMqynnnCu1LAPwxD5RKEo0nHV+eXGg_N6ELU88HQ@mail.gmail.com
* Simplify some maths in xlogreader.cMichael Paquier2022-10-12
| | | | | | | | | | | | An LSN was calculated from a segment number, a segment size and a position offset, matching exactly the LSN given by the caller of XLogReaderValidatePageHeader(). This change removes the extra LSN calculation, relying only on the LSN given by the function caller instead. Author: Bharath Rupireddy Reviewed-by: Richard Guo, Álvaro Herrera, Kyotaro Horiguchi Discussion: https://postgr.es/m/CALj2ACXuh4Ms9j9sxMYdtHEe=5sFcyrs-GAHyADu_A_G71kZTg@mail.gmail.com
* Remove unnecessary semicolons after goto labelsJohn Naylor2022-10-10
| | | | | | | | | | | According to the C standard, a label must followed by a statement. If there was ever a time we needed an empty statement here, it was a long time ago. Japin Li Reviewed by Julien Rouhaud Discussion: https://www.postgresql.org/message-id/MEYP282MB16690F40189A4F060B41D56DB65E9%40MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
* Remove unnecessary uses of Abs()Peter Eisentraut2022-10-07
| | | | | | | | Use C standard abs() or fabs() instead. Reviewed-by: Zhang Mingli <zmlpostgres@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/4beb42b5-216b-bce8-d452-d924d5794c63%40enterprisedb.com
* Fix comment in xlogprefetcher.cMichael Paquier2022-10-06
| | | | | Author: Sho Kato Discussion: https://postgr.es/m/TYCPR01MB684954052EC534A3261B29249F5C9@TYCPR01MB6849.jpnprd01.prod.outlook.com
* Rename shadowed local variablesDavid Rowley2022-10-05
| | | | | | | | | | | | In a similar effort to f01592f91, here we mostly rename shadowed local variables to remove the warnings produced when compiling with -Wshadow=compatible-local. This fixes 63 warnings and leaves just 5. Author: Justin Pryzby, David Rowley Reviewed-by: Justin Pryzby Discussion https://postgr.es/m/20220817145434.GC26426%40telsasoft.com
* Cleanup useless assignments and checksMichael Paquier2022-10-04
| | | | | | | | | | | | | | This cleans up a couple of areas: - Remove XLogSegNo calculation for the last WAL segment in backup in xlog.c (7d70809 has moved this logic entirely to xlogbackup.c when building the contents of the backup history file). - Remove check on log_min_duration in analyze.c, as it is already true where this code path is reached. - Simplify call to find_option() in guc.c. Author: Ranier Vilela Reviewed-by: Masahiko Sawada Discussion: https://postgr.es/m/CAEudQArCDQQiPiFR16=yu9k5s2tp4tgEe1U1ZbkW4ofx81AWWQ@mail.gmail.com