postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
*	VACUUM VERBOSE: Don't report "pages removed".	Peter Geoghegan	2021-08-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	It doesn't make any sense to report this information, since VACUUM VERBOSE reports on heap relation truncation directly. This was an oversight in commit 7ab96cf6, which made VACUUM VERBOSE output a little more consistent with nearby autovacuum-specific log output. Adjust comments that describe how this is supposed to work in passing. Also bring truncation-related VACUUM VERBOSE output in line with the convention established for VACUUM VERBOSE output by commit f4f4a649. Author: Peter Geoghegan <pg@bowt.ie> Backpatch: 14-, where VACUUM VERBOSE's output changed.
*	Report tuple address in data-corruption error message	Alvaro Herrera	2021-08-30
\| \| \| \| \| \| \| \| \| \| \|	Most data-corruption reports mention the location of the problem, but this one failed to. Add it. Backpatch all the way back. In 12 and older, also assign the ERRCODE_DATA_CORRUPTED error code as was done in commit fd6ec93bf890 for 13 and later. Discussion: https://postgr.es/m/202108191637.oqyzrdtnheir@alvherre.pgsql
*	Count SP-GiST index scans in pg_stat statistics.	Tom Lane	2021-08-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Somehow, spgist overlooked the need to call pgstat_count_index_scan(). Hence, pg_stat_all_indexes.idx_scan and equivalent columns never became nonzero for an SP-GiST index, although the related per-tuple counters worked fine. This fix works a bit differently from other index AMs, in that the counter increment occurs in spgrescan not spggettuple/spggetbitmap. It looks like this won't make the user-visible semantics noticeably different, so I won't go to the trouble of introducing an is-this- the-first-call flag just to make the counter bumps happen in the same places. Per bug #17163 from Christian Quest. Back-patch to all supported versions. Discussion: https://postgr.es/m/17163-b8c5cc88322a5e92@postgresql.org
*	track_io_timing logging: Don't special case 0 ms.	Peter Geoghegan	2021-08-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adjust track_io_timing related logging code added by commit 94d13d474d. Make it consistent with other nearby autovacuum and autoanalyze logging code by removing logic that suppressed zero millisecond outputs. log_autovacuum_min_duration log output now reliably shows "read:" and "write:" millisecond-based values in its report (when track_io_timing is enabled). Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Stephen Frost <sfrost@snowman.net> Discussion: https://postgr.es/m/CAH2-WznW0FNxSVQMSRazAMYNfZ6DR_gr5WE78hc6E1CBkkJpzw@mail.gmail.com Backpatch: 14-, where the track_io_timing logging was introduced.
*	Reorder log_autovacuum_min_duration log output.	Peter Geoghegan	2021-08-27
\| \| \| \| \| \| \| \| \| \|	This order seems more natural. It starts with details that are particular to heap and index data structures, and ends with system-level costs incurred during the autovacuum worker's VACUUM/ANALYZE operation. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-WzkzxK6ahA9xxsOftRtBX_R0swuHZsvo4QUbak1Bz7hb7Q@mail.gmail.com Backpatch: 14-, which enhanced the log output in various ways.
*	Fix broken snapshot handling in parallel workers.	Robert Haas	2021-08-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pengchengliu reported an assertion failure in a parallel woker while performing a parallel scan using an overflowed snapshot. The proximate cause is that TransactionXmin was set to an incorrect value. The underlying cause is incorrect snapshot handling in parallel.c. In particular, InitializeParallelDSM() was unconditionally calling GetTransactionSnapshot(), because I (rhaas) mistakenly thought that was always retrieving an existing snapshot whereas, at isolation levels less than REPEATABLE READ, it's actually taking a new one. So instead do this only at higher isolation levels where there actually is a single snapshot for the whole transaction. By itself, this is not a sufficient fix, because we still need to guarantee that TransactionXmin gets set properly in the workers. The easiest way to do that seems to be to install the leader's active snapshot as the transaction snapshot if the leader did not serialize a transaction snapshot. This doesn't affect the results of future GetTrasnactionSnapshot() calls since those have to take a new snapshot anyway; what we care about is the side effect of setting TransactionXmin. Report by Pengchengliu. Patch by Greg Nancarrow, except for some comment text which I supplied. Discussion: https://postgr.es/m/002f01d748ac$eaa781a0$bff684e0$@tju.edu.cn
*	Avoid creating archive status ".ready" files too early	Alvaro Herrera	2021-08-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	WAL records may span multiple segments, but XLogWrite() does not wait for the entire record to be written out to disk before creating archive status files. Instead, as soon as the last WAL page of the segment is written, the archive status file is created, and the archiver may process it. If PostgreSQL crashes before it is able to write and flush the rest of the record (in the next WAL segment), the wrong version of the first segment file lingers in the archive, which causes operations such as point-in-time restores to fail. To fix this, keep track of records that span across segments and ensure that segments are only marked ready-for-archival once such records have been completely written to disk. This has always been wrong, so backpatch all the way back. Author: Nathan Bossart <bossartn@amazon.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Ryo Matsumura <matsumura.ryo@fujitsu.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CBDDFA01-6E40-46BB-9F98-9340F4379505@amazon.com
*	Revert analyze support for partitioned tables	Alvaro Herrera	2021-08-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts the following commits: 1b5617eb844cd2470a334c1d2eec66cf9b39c41a Describe (auto-)analyze behavior for partitioned tables 0e69f705cc1a3df273b38c9883fb5765991e04fe Set pg_class.reltuples for partitioned tables 41badeaba8beee7648ebe7923a41c04f1f3cb302 Document ANALYZE storage parameters for partitioned tables 0827e8af70f4653ba17ed773f123a60eadd9f9c9 autovacuum: handle analyze for partitioned tables There are efficiency issues in this code when handling databases with large numbers of partitions, and it doesn't look like there isn't any trivial way to handle those. There are some other issues as well. It's now too late in the cycle for nontrivial fixes, so we'll have to let Postgres 14 users continue to manually deal with ANALYZE their partitioned tables, and hopefully we can fix the issues for Postgres 15. I kept [most of] be280cdad298 ("Don't reset relhasindex for partitioned tables on ANALYZE") because while we added it due to 0827e8af70f4, it is a good bugfix in its own right, since it affects manual analyze as well as autovacuum-induced analyze, and there's no reason to revert it. I retained the addition of relkind 'p' to tables included by pg_stat_user_tables, because reverting that would require a catversion bump. Also, in pg14 only, I keep a struct member that was added to PgStat_TabStatEntry to avoid breaking compatibility with existing stat files. Backpatch to 14. Discussion: https://postgr.es/m/20210722205458.f2bug3z6qzxzpx2s@alap3.anarazel.de
*	Refresh apply delay on reload of recovery_min_apply_delay at recovery	Michael Paquier	2021-08-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit ensures that the wait interval in the replay delay loop waiting for an amount of time defined by recovery_min_apply_delay is correctly handled on reload, recalculating the delay if this GUC value is updated, based on the timestamp of the commit record being replayed. The previous behavior would be problematic for example with replay still waiting even if the delay got reduced or just cancelled. If the apply delay was increased to a larger value, the wait would have just respected the old value set, finishing earlier. Author: Soumyadeep Chakraborty, Ashwin Agrawal Reviewed-by: Kyotaro Horiguchi, Michael Paquier Discussion: https://postgr.es/m/CAE-ML+93zfr-HLN8OuxF0BjpWJ17O5dv1eMvSE5jsj9jpnAXZA@mail.gmail.com Backpatch-through: 9.6
*	Message style improvements	Peter Eisentraut	2021-08-07
\|
*	Make vacuum_index_cleanup reloption RELOPT_TYPE_ENUM.	Peter Geoghegan	2021-08-03
\| \| \| \| \| \| \| \| \| \|	Oversight in commit 3499df0d, which generalized the reloption as a way of giving users a way to consistently avoid VACUUM's index bypass optimization. Per off-list report from Nikolay Shaplov. Backpatch: 14-, where index cleanup reloption was extended.
*	Update obsolete comment that still referred to CheckpointLock	Heikki Linnakangas	2021-07-30
\| \| \| \| \| \|	CheckpointLock was removed in commit d18e75664a, and commit ce197e91d0 updated a leftover comment in CreateCheckPoint, but there was another copy of it in CreateRestartPoint still.
*	Close yet another race condition in replication slot test code	Alvaro Herrera	2021-07-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Buildfarm shows that this test has a further failure mode when a checkpoint starts earlier than expected, so we detect a "checkpoint completed" line that's not the one we want. Change the config to try and prevent this. Per buildfarm While at it, update one comment that was forgotten in commit d18e75664a2f. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20210729.162038.534808353849568395.horikyota.ntt@gmail.com
*	Update minimum recovery point on truncation during WAL replay of abort record.	Fujii Masao	2021-07-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a file is truncated, we must update minRecoveryPoint. Once a file is truncated, there's no going back; it would not be safe to stop recovery at a point earlier than that anymore. Commit 7bffc9b7bf changed xact_redo_commit() so that it updates minRecoveryPoint on truncation, but forgot to change xact_redo_abort(). Back-patch to all supported versions. Reported-by: mengjuan.cmj@alibaba-inc.com Author: Fujii Masao Reviewed-by: Heikki Linnakangas Discussion: https://postgr.es/m/b029fce3-4fac-4265-968e-16f36ff4d075.mengjuan.cmj@alibaba-inc.com
*	Advance old-segment horizon properly after slot invalidation	Alvaro Herrera	2021-07-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When some slots are invalidated due to the max_slot_wal_keep_size limit, the old segment horizon should move forward to stay within the limit. However, in commit c6550776394e we forgot to call KeepLogSeg again to recompute the horizon after invalidating replication slots. In cases where other slots remained, the limits would be recomputed eventually for other reasons, but if all slots were invalidated, the limits would not move at all afterwards. Repair. Backpatch to 13 where the feature was introduced. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reported-by: Marcin Krupowicz <mk@071.ovh> Discussion: https://postgr.es/m/17103-004130e8f27782c9@postgresql.org
*	Remove dead assignment to local variable.	Heikki Linnakangas	2021-07-12
\| \| \| \| \| \| \| \| \| \| \| \|	This should have been removed in commit 7e30c186da, which split the loop into two. Only the first loop uses the 'from' variable; updating it in the second loop is bogus. It was never read after the first loop, so this was harmless and surely optimized away by the compiler, but let's be tidy. Backpatch to all supported versions. Author: Ranier Vilela Discussion: https://www.postgresql.org/message-id/CAEudQAoWq%2BAL3BnELHu7gms2GN07k-np6yLbukGaxJ1vY-zeiQ%40mail.gmail.com
*	Fix incorrect PITR message for transaction ROLLBACK PREPARED	Michael Paquier	2021-06-30
\| \| \| \| \| \| \| \| \| \| \|	Reaching PITR on such a transaction would cause the generation of a LOG message mentioning a transaction committed, not aborted. Oversight in 4f1b890. Author: Simon Riggs Discussion: https://postgr.es/m/CANbhV-GJ6KijeCgdOrxqMCQ+C8QiK657EMhCy4csjrPcEUFv_Q@mail.gmail.com Backpatch-through: 9.6
*	Pre branch pgindent / pgperltidy run	Andrew Dunstan	2021-06-28
\| \| \| \| \|	Along the way make a slight adjustment to src/include/utils/queryjumble.h to avoid an unused typedef.
*	Message style improvements	Peter Eisentraut	2021-06-28
\|
*	Fix race condition in TransactionGroupUpdateXidStatus().	Amit Kapila	2021-06-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we cannot immediately acquire XactSLRULock in exclusive mode at commit time, we add ourselves to a list of processes that need their XIDs status update. We do this if the clog page where we need to update the current transaction status is the same as the group leader's clog page, otherwise, we allow the caller to clear it by itself. Now, when we can't add ourselves to any group, we were not clearing the current proc if it has already become a member of some group which was leading to an assertion failure when the same proc was assigned to another backend after the current backend exits. Reported-by: Alexander Lakhin Bug: 17072 Author: Amit Kapila Tested-By: Alexander Lakhin Backpatch-through: 11, where it was introduced Discussion: https://postgr.es/m/17072-2f8764857ef2c92a@postgresql.org
*	Remove redundant variable pageSize in gistinitpage	Peter Eisentraut	2021-06-25
\| \| \| \| \| \| \| \| \|	In gistinitpage, pageSize variable looks redundant, instead just pass BLCKSZ. This will be consistent with its peers BloomInitPage, brin_page_init and SpGistInitPage. Author: Bharath Rupireddy <bharath.rupireddy@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/CALj2ACWj=V1k5591eeZK2sOg2FYuBUp6azFO8tMkBtGfXf8PMQ@mail.gmail.com
*	Remove overzealous VACUUM failsafe assertions.	Peter Geoghegan	2021-06-20
\| \| \| \| \| \| \| \| \| \| \|	The failsafe can trigger when index processing is already disabled. This can happen when VACUUM's INDEX_CLEANUP parameter is "off" and the failsafe happens to trigger. Remove assertions that assume that index processing is directly tied to the failsafe. Oversight in commit c242baa4, which made it possible for the failsafe to trigger in a two-pass strategy VACUUM that has yet to make its first call to lazy_vacuum_all_indexes().
*	Support disabling index bypassing by VACUUM.	Peter Geoghegan	2021-06-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Generalize the INDEX_CLEANUP VACUUM parameter (and the corresponding reloption): make it into a ternary style boolean parameter. It now exposes a third option, "auto". The "auto" option (which is now the default) enables the "bypass index vacuuming" optimization added by commit 1e55e7d1. "VACUUM (INDEX_CLEANUP TRUE)" is redefined to once again make VACUUM simply do any required index vacuuming, regardless of how few dead tuples are encountered during the first scan of the target heap relation (unless there are exactly zero). This gives users a way of opting out of the "bypass index vacuuming" optimization, if for whatever reason that proves necessary. It is also expected to be used by PostgreSQL developers as a testing option from time to time. "VACUUM (INDEX_CLEANUP FALSE)" does the same thing as it always has: it forcibly disables both index vacuuming and index cleanup. It's not expected to be used much in PostgreSQL 14. The failsafe mechanism added by commit 1e55e7d1 addresses the same problem in a simpler way. INDEX_CLEANUP can now be thought of as a testing and compatibility option. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-By: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/CAH2-WznrBoCST4_Gxh_G9hA8NzGUbeBGnOUC8FcXcrhqsv6OHQ@mail.gmail.com
*	Tidy up GetMultiXactIdMembers()'s behavior on error	Heikki Linnakangas	2021-06-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	One of the error paths left members uninitialized. That's not a live bug, because most callers don't look at members when the function returns -1, but let's be tidy. One caller, in heap_lock_tuple(), does "if (members != NULL) pfree(members)", but AFAICS it never passes an invalid 'multi' value so it should not reach that error case. The callers are also a bit inconsistent in their expectations. heap_lock_tuple() pfrees the 'members' array if it's not-NULL, others pfree() it if "nmembers >= 0", and others if "nmembers > 0". That's not a live bug either, because the function should never return 0, but add an Assert for that to make it more clear. I left the callers alone for now. I also moved the line where we set *nmembers. It wasn't wrong before, but I like to do that right next to the 'return' statement, to make it clear that it's always set on return. Also remove one unreachable return statement after ereport(ERROR), for brevity and for consistency with the similar if-block right after it. Author: Greg Nancarrow with the additional changes by me Backpatch-through: 9.6, all supported versions
*	Fix outdated comment that talked about seek position of WAL file.	Heikki Linnakangas	2021-06-16
\| \| \| \| \| \| \| \|	Since commit c24dcd0cfd, we have been using pg_pread() to read the WAL file, which doesn't change the seek position (unless we fall back to the implementation in src/port/pread.c). Update comment accordingly. Backpatch-through: 12, where we started to use pg_pread()
*	Remove unneeded field from VACUUM state.	Peter Geoghegan	2021-06-15
\| \| \| \| \| \| \|	Bugfix commit 5fc89376 effectively made the lock_waiter_detected field from vacuumlazy.c's global state struct into private state owned by lazy_truncate_heap(). Finish this off by replacing the struct field with a local variable.
*	Remove forced toast recompression in VACUUM FULL/CLUSTER	Michael Paquier	2021-06-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The extra checks added by the recompression of toast data introduced in bbe0a81 is proving to have a performance impact on VACUUM or CLUSTER even if no recompression is done. This is more noticeable with more toastable columns that contain non-NULL values. Improvements could be done to make those extra checks less expensive, but that's not material for 14 at this stage, and we are not sure either if the code path of VACUUM FULL/CLUSTER is adapted for this job. Per discussion with several people, including Andres Freund, Robert Haas, Álvaro Herrera, Tom Lane and myself. Discussion: https://postgr.es/m/20210527003144.xxqppojoiwurc2iz@alap3.anarazel.de
*	Report sort phase progress in parallel btree build	Alvaro Herrera	2021-06-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were already reporting it, but only after the parallel workers were finished, which is visibly much later than what happens in a serial build. With this change we report it when the leader starts its own sort phase when participating in the build (the normal case). Now this might happen a little later than when the workers start their sorting phases, but a) communicating the actual phase start from workers is likely to be a hassle, and b) the sort phase start is pretty fuzzy anyway, since sorting per se is actually initiated by tuplesort.c internally earlier than tuplesort_performsort() is called. Backpatch to pg12, where the progress reporting code for CREATE INDEX went in. Reported-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Greg Nancarrow <gregn4422@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/1128176d-1eee-55d4-37ca-e63644422adb
*	Fix an asssortment of typos in brin_minmax_multi.c and mcv.c	David Rowley	2021-06-10
\| \| \| \|	Discussion: https://postgr.es/m/CAApHDvrbyJNOPBws4RUhXghZ7+TBjtdO-rznTsqZECuowNorXg@mail.gmail.com
*	Fix corner case failure of new standby to follow new primary.	Robert Haas	2021-06-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This only happens if (1) the new standby has no WAL available locally, (2) the new standby is starting from the old timeline, (3) the promotion happened in the WAL segment from which the new standby is starting, (4) the timeline history file for the new timeline is available from the archive but the WAL files for are not (i.e. this is a race), (5) the WAL files for the new timeline are available via streaming, and (6) recovery_target_timeline='latest'. Commit ee994272ca50f70b53074f0febaec97e28f83c4e introduced this logic and was an improvement over the previous code, but it mishandled this case. If recovery_target_timeline='latest' and restore_command is set, validateRecoveryParameters() can change recoveryTargetTLI to be different from receiveTLI. If streaming is then tried afterward, expectedTLEs gets initialized with the history of the wrong timeline. It's supposed to be a list of entries explaining how to get to the target timeline, but in this case it ends up with a list of entries explaining how to get to the new standby's original timeline, which isn't right. Dilip Kumar and Robert Haas, reviewed by Kyotaro Horiguchi. Discussion: http://postgr.es/m/CAFiTN-sE-jr=LB8jQuxeqikd-Ux+jHiXyh4YDiZMPedgQKup0g@mail.gmail.com
*	Fix pg_visibility regression failure with CLOBBER_CACHE_ALWAYS	Tomas Vondra	2021-06-08
\| \| \| \| \| \| \| \| \| \| \|	Commit 8e03eb92e9 reverted a bit too much code, reintroducing one of the issues fixed by 39b66a91bd - a page might have been left partially empty after relcache invalidation. Reported-By: Tom Lane Author: Masahiko Sawada Discussion: https://postgr.es/m/822752.1623032114@sss.pgh.pa.us Discussion: https://postgr.es/m/CAD21AoA%3D%3Df2VSw3c-Cp_y%3DWLKHMKc1D6s7g3YWsCOvgaYPpJcg%40mail.gmail.com
*	Clean up some questionable usages of DatumGet* macros	David Rowley	2021-06-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This tidies up some questionable coding which made use of DatumGetPointer() for Datums being passed into functions where the parameter is expected to be a cstring. We saw no compiler warnings with the old code as the Pointer type used in DatumGetPointer() happens to be a char * rather than a void *. However, that's no excuse and we should be using the correct macro for the job. Here we also make use of OutputFunctionCall() rather than using FunctionCall1() directly to call the type's output function. OutputFunctionCall() is the standard way to do this. It casts the returned value to a cstring for us. In passing get rid of a duplicate call to strlen(). Most compilers will likely optimize away the 2nd call, but there may be some that won't. In any case, this just aligns the code to some other nearby code that already does this. Discussion: https://postgr.es/m/CAApHDvq1D=ehZ8hey8Hz67N+_Zth0GHO5wiVCfv1YcGPMXJq0A@mail.gmail.com
*	Adjust locations which have an incorrect copyright year	David Rowley	2021-06-04
\| \| \| \| \| \| \|	A few patches committed after ca3b37487 mistakenly forgot to make the copyright year 2021. Fix these. Discussion: https://postgr.es/m/CAApHDvqyLmd9P2oBQYJ=DbrV8QwyPRdmXtCTFYPE08h+ip0UJw@mail.gmail.com
*	Standardize usages of appendStringInfo and appendPQExpBuffer	David Rowley	2021-06-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix a few places that were using appendStringInfo() when they should have been using appendStringInfoString(). Also some cases of appendPQExpBuffer() that would have been better suited to use appendPQExpBufferChar(), and finally, some places that used appendPQExpBuffer() when appendPQExpBufferStr() would have suited better. There are no bugs are being fixed here. The aim is just to make the code use the most optimal function for the job. All the code being changed here is new to PG14. It makes sense to fix these before we branch for PG15. There are a few other places that we could fix, but those cases are older code so fixing those seems less worthwhile as it may cause unnecessary back-patching pain in the future. Author: Hou Zhijie Discussion: https://postgr.es/m/OS0PR01MB5716732158B1C4142C6FE375943D9@OS0PR01MB5716.jpnprd01.prod.outlook.com
*	Revert most of 39b66a91bd	Tomas Vondra	2021-06-03
\| \| \| \| \| \| \| \| \| \|	Reverts most of commit 39b66a91bd, which was found to cause significant regression for REFRESH MATERIALIZED VIEW. This means only rows inserted by heap_multi_insert will benefit from the optimization, implemented in commit 7db0cd2145. Reported-by: Masahiko Sawada Discussion: https://postgr.es/m/CAD21AoA%3D%3Df2VSw3c-Cp_y%3DWLKHMKc1D6s7g3YWsCOvgaYPpJcg%40mail.gmail.com
*	Fix VACUUM VERBOSE's LP_DEAD item pages output.	Peter Geoghegan	2021-05-27
\| \| \| \|	Oversight in commit 5100010e.
*	Rethink definition of pg_attribute.attcompression.	Tom Lane	2021-05-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Redefine '\0' (InvalidCompressionMethod) as meaning "if we need to compress, use the current setting of default_toast_compression". This allows '\0' to be a suitable default choice regardless of datatype, greatly simplifying code paths that initialize tupledescs and the like. It seems like a more user-friendly approach as well, because now the default compression choice doesn't migrate into table definitions, meaning that changing default_toast_compression is usually sufficient to flip an installation's behavior; one needn't tediously issue per-column ALTER SET COMPRESSION commands. Along the way, fix a few minor bugs and documentation issues with the per-column-compression feature. Adopt more robust APIs for SetIndexStorageProperties and GetAttributeCompression. Bump catversion because typical contents of attcompression will now be different. We could get away without doing that, but it seems better to ensure v14 installations all agree on this. (We already forced initdb for beta2, anyway.) Discussion: https://postgr.es/m/626613.1621787110@sss.pgh.pa.us
*	Fix typo in heapam.c	Michael Paquier	2021-05-26
\| \| \| \| \|	Author: Hou Zhijie Discussion: https://postgr.es/m/OS0PR01MB571612191738540B27A8DE5894249@OS0PR01MB5716.jpnprd01.prod.outlook.com
*	Fix memory leak when de-toasting compressed values in VACUUM FULL/CLUSTER	Michael Paquier	2021-05-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	VACUUM FULL and CLUSTER can be used to enforce the use of the existing compression method of a toastable column if a value currently stored is compressed with a method that does not match the column's defined method. The code in charge of decompressing and recompressing toast values at rewrite left around the detoasted values, causing an accumulation of memory allocated in TopTransactionContext. When processing large relations, this could cause the system to run out of memory. The detoasted values are not needed once their tuple is rewritten, and this commit ensures that the necessary cleanup happens. Issue introduced by bbe0a81d. The comments of the area are reordered a bit while on it. Reported-by: Andres Freund Analyzed-by: Andres Freund Author: Michael Paquier Reviewed-by: Dilip Kumar Discussion: https://postgr.es/m/20210521211929.pcehg6f23icwstdb@alap3.anarazel.de
*	Consider triggering VACUUM failsafe during scan.	Peter Geoghegan	2021-05-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The wraparound failsafe mechanism added by commit 1e55e7d1 handled the one-pass strategy case (i.e. the "table has no indexes" case) by adding a dedicated failsafe check. This made up for the fact that the usual one-pass checks inside lazy_vacuum_all_indexes() cannot ever be reached during a one-pass strategy VACUUM. This approach failed to account for two-pass VACUUMs that opt out of index vacuuming up-front. The INDEX_CLEANUP off case in the only case that works like that. Fix this by performing a failsafe check every 4GB during the first scan of the heap, regardless of the details of the VACUUM. This eliminates the special case, and will make the failsafe trigger more reliably. Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Andres Freund <andres@anarazel.de> Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/20210424002921.pb3t7h6frupdqnkp@alap3.anarazel.de
*	Re-order pg_attribute columns to eliminate some padding space.	Tom Lane	2021-05-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that attcompression is just a char, there's a lot of wasted padding space after it. Move it into the group of char-wide columns to save a net of 4 bytes per pg_attribute entry. While we're at it, swap the order of attstorage and attalign to make for a more logical grouping of these columns. Also re-order actions in related code to match the new field ordering. This patch also fixes one outright bug: equalTupleDescs() failed to compare attcompression. That could, for example, cause relcache reload to fail to adopt a new value following a change. Michael Paquier and Tom Lane, per a gripe from Andres Freund. Discussion: https://postgr.es/m/20210517204803.iyk5wwvwgtjcmc5w@alap3.anarazel.de
*	Avoid detoasting failure after COMMIT inside a plpgsql FOR loop.	Tom Lane	2021-05-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	exec_for_query() normally tries to prefetch a few rows at a time from the query being iterated over, so as to reduce executor entry/exit overhead. Unfortunately this is unsafe if we have COMMIT or ROLLBACK within the loop, because there might be TOAST references in the data that we prefetched but haven't yet examined. Immediately after the COMMIT/ROLLBACK, we have no snapshots in the session, meaning that VACUUM is at liberty to remove recently-deleted TOAST rows. This was originally reported as a case triggering the "no known snapshots" error in init_toast_snapshot(), but even if you miss hitting that, you can get "missing toast chunk", as illustrated by the added isolation test case. To fix, just disable prefetching in non-atomic contexts. Maybe there will be performance complaints prompting us to work harder later, but it's not clear at the moment that this really costs much, and I doubt we'd want to back-patch any complicated fix. In passing, adjust that error message in init_toast_snapshot() to be a little clearer about the likely cause of the problem. Patch by me, based on earlier investigation by Konstantin Knizhnik. Per bug #15990 from Andreas Wicht. Back-patch to v11 where intra-procedure COMMIT was added. Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
*	Make standby promotion reset the recovery pause state to 'not paused'.	Fujii Masao	2021-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a promotion is triggered while recovery is paused, the paused state ends and promotion continues. But previously in that case pg_get_wal_replay_pause_state() returned 'paused' wrongly while a promotion was ongoing. This commit changes a standby promotion so that it marks the recovery pause state as 'not paused' when it's triggered, to fix the issue. Author: Fujii Masao Reviewed-by: Dilip Kumar, Kyotaro Horiguchi Discussion: https://postgr.es/m/f706876c-4894-0ba5-6f4d-79803eeea21b@oss.nttdata.com
*	Harden nbtree deduplication posting split code.	Peter Geoghegan	2021-05-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a defensive "can't happen" error to code that handles nbtree posting list splits (promote an existing assertion). This avoids a segfault in the event of an insertion of a newitem that is somehow identical to an existing non-pivot tuple in the index. An nbtree index should never have two index tuples with identical TIDs. This scenario is not particular unlikely in the event of any kind of corruption that leaves the index in an inconsistent state relative to the heap relation that is indexed. There are two known reports of preventable hard crashes. Doing nothing seems unacceptable given the general expectation that nbtree will cope reasonably well with corrupt data. Discussion: https://postgr.es/m/CAH2-Wz=Jr_d-dOYEEmwz0-ifojVNWho01eAqewfQXgKfoe114w@mail.gmail.com Backpatch: 13-, where nbtree deduplication was introduced.
*	Prevent infinite insertion loops in spgdoinsert().	Tom Lane	2021-05-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Formerly we just relied on operator classes that assert longValuesOK to eventually shorten the leaf value enough to fit on an index page. That fails since the introduction of INCLUDE-column support (commit 09c1c6ab4), because the INCLUDE columns might alone take up more than a page, meaning no amount of leaf-datum compaction will get the job done. At least with spgtextproc.c, that leads to an infinite loop, since spgtextproc.c won't throw an error for not being able to shorten the leaf datum anymore. To fix without breaking cases that would otherwise work, add logic to spgdoinsert() to verify that the leaf tuple size is decreasing after each "choose" step. Some opclasses might not decrease the size on every single cycle, and in any case, alignment roundoff of the tuple size could obscure small gains. Therefore, allow up to 10 cycles without additional savings before throwing an error. (Perhaps this number will need adjustment, but it seems quite generous right now.) As long as we've developed this logic, let's back-patch it. The back branches don't have INCLUDE columns to worry about, but this seems like a good defense against possible bugs in operator classes. We already know that an infinite loop here is pretty unpleasant, so having a defense seems to outweigh the risk of breaking things. (Note that spgtextproc.c is actually the only known opclass with longValuesOK support, so that this is all moot for known non-core opclasses anyway.) Per report from Dilip Kumar. Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com
*	Fix query-cancel handling in spgdoinsert().	Tom Lane	2021-05-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Knowing that a buggy opclass could cause an infinite insertion loop, spgdoinsert() intended to allow its loop to be interrupted by query cancel. However, that never actually worked, because in iterations after the first, we'd be holding buffer lock(s) which would cause InterruptHoldoffCount to be positive, preventing servicing of the interrupt. To fix, check if an interrupt is pending, and if so fall out of the insertion loop and service the interrupt after we've released the buffers. If it was indeed a query cancel, that's the end of the matter. If it was a non-canceling interrupt reason, make use of the existing provision to retry the whole insertion. (This isn't as wasteful as it might seem, since any upper-level index tuples we already created should be usable in the next attempt.) While there's no known instance of such a bug in existing release branches, it still seems like a good idea to back-patch this to all supported branches, since the behavior is fairly nasty if a loop does happen --- not only is it uncancelable, but it will quickly consume memory to the point of an OOM failure. In any case, this code is certainly not working as intended. Per report from Dilip Kumar. Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com
*	Fix autovacuum log output heap truncation issue.	Peter Geoghegan	2021-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The percentage of blocks from the table value reported by autovacuum log output (following commit 5100010ee4d) should never exceed 100% because it describes the state of the table back when lazy_vacuum() was called. The value could nevertheless exceed 100% in the event of heap relation truncation. We failed to compensate for how truncation affects rel_pages. Fix the faulty accounting by using the original rel_pages value instead of the current/final rel_pages value. Reported-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20210423204306.5osfpkt2ggaedyvy@alap3.anarazel.de
*	Initial pgindent and pgperltidy run for v14.	Tom Lane	2021-05-12
\| \| \| \| \| \| \| \|	Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.
*	Refactor some error messages for easier translation	Peter Eisentraut	2021-05-12
\|
*	Change data type of counters in BufferUsage and WalUsage from long to int64.	Fujii Masao	2021-05-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously long was used as the data type for some counters in BufferUsage and WalUsage. But long is only four byte, e.g., on Windows, and it's entirely possible to wrap a four byte counter. For example, emitting more than four billion WAL records in one transaction isn't actually particularly rare. To avoid the overflows of those counters, this commit changes the data type of them from long to int64. Suggested-by: Andres Freund Author: Masahiro Ikeda Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/20201221211650.k7b53tcnadrciqjo@alap3.anarazel.de Discussion: https://postgr.es/m/af0964ac-7080-1984-dc23-513754987716@oss.nttdata.com