postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
...
*	plpython: Reject Python 2 during build configuration.	Andres Freund	2022-02-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Python 2.7 went EOL 2020-01-01 and the support for Python 2 requires a fair bit of infrastructure. Therefore we are removing Python 2 support in plpython. This patch just rejects Python 2 during configure / mkvcbuild.pl. Future commits will remove the code and infrastructure for Python 2 support and adjust more of the documentation. This way we can see the buildfarm state after the removal sooner and we can be sure that failures are due to desupporting Python 2, rather than caused by infrastructure cleanup. Reviewed-By: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/20211031184548.g4sxfe47n2kyi55r@alap3.anarazel.de
*	Increase hash_mem_multiplier default to 2.0.	Peter Geoghegan	2022-02-16
\| \| \| \| \| \| \| \| \| \| \| \|	Double the default setting for hash_mem_multiplier, from 1.0 to 2.0. This setting makes hash-based executor nodes use twice the usual work_mem limit. The PostgreSQL 15 release notes should have a compatibility note about this change. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-Wzndc_ROk6CY-bC6p9O53q974Y0Ey4WX8jcPbuTZYM4Q3A@mail.gmail.com
*	Avoid VACUUM reltuples distortion.	Peter Geoghegan	2022-02-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a heuristic that avoids distortion in the pg_class.reltuples estimates used by VACUUM. Without the heuristic, successive manually run VACUUM commands (run against a table that is never modified after initial bulk loading) will scan the same page in each VACUUM operation. Eventually pg_class.reltuples may reach the point where one single heap page is accidentally considered highly representative of the entire table. This is likely to be completely wrong, since the last heap page typically has fewer tuples than average for the table. It's not obvious that this was a problem prior to commit 44fa8488, which made vacuumlazy.c consistently scan the last heap page (even when it is all-visible in the visibility map). It seems possible that there were more subtle variants of the same problem that went unnoticed for quite some time, though. Commit 44fa8488 simplified certain aspects of when and how relation truncation was considered, but it did not introduce the "scan the last page" behavior. Essentially the same behavior was introduced much earlier, in commit e8429082. It was conditioned on whether or not truncation looked promising towards the end of the initial heap pass by VACUUM until recently, which was at least somewhat protective. That doesn't seem like something that we should be relying on, though. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-WzkNKORurux459M64mR63Aw4Jq7MBRVcX=CvALqN3A88WA@mail.gmail.com
*	Remove all traces of tuplestore_donestoring() in the C code	Michael Paquier	2022-02-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This routine is a no-op since dd04e95 from 2003, with a macro kept around for compatibility purposes. This has led to the same code patterns being copy-pasted around for no effect, sometimes in confusing ways like in pg_logical_slot_get_changes_guts() from logical.c where the code was actually incorrect. This issue has been discussed on two different threads recently, so rather than living with this legacy, remove any uses of this routine in the C code to simplify things. The compatibility macro is kept to avoid breaking any out-of-core modules that depend on it. Reported-by: Tatsuhito Kasahara, Justin Pryzby Author: Tatsuhito Kasahara Discussion: https://postgr.es/m/20211217200419.GQ17618@telsasoft.com Discussion: https://postgr.es/m/CAP0=ZVJeeYfAeRfmzqAF2Lumdiv4S4FewyBnZd4DPTrsSQKJKw@mail.gmail.com
*	Fix bogus log message when starting from a cleanly shut down state.	Heikki Linnakangas	2022-02-16
\| \| \| \| \| \| \| \| \| \| \|	In commit 70e81861fa to split xlog.c, I moved the startup code that updates the state in the control file and prints out the "database system was not properly shut down" message to the log, but I accidentally removed the "if (InRecovery)" check around it. As a result, that message was printed even if the system was cleanly shut down, also during 'initdb'. Discussion: https://www.postgresql.org/message-id/3357075.1645031062@sss.pgh.pa.us
*	Add missing TYPEALIGN macros	John Naylor	2022-02-16
\| \| \| \| \| \| \| \|	A couple call sites still had hard-coded characters. Amul Sul Discussion: https://www.postgresql.org/message-id/CAAJ_b94Y35MWB3PJoCbc_O-_Q4%2B-9DHKhWtAwboEyx8wm4mqcA%40mail.gmail.com
*	Fix read beyond buffer bug introduced by the split xlog.c patch.	Heikki Linnakangas	2022-02-16
\| \| \| \| \| \| \| \| \| \| \| \|	FinishWalRecovery() copied the valid part of the last WAL block into a palloc'd buffer, and the code in StartupXLOG() copied it to the WAL buffer. But the memcpy in StartupXLOG() copied a full 8kB block, not just the valid part, i.e. it copied from beyond the end of the buffer. The invalid part was cleared immediately afterwards, so as long as the memory was allocated and didn't segfault, it didn't do any harm, but it can definitely segfault. Discussion: https://www.postgresql.org/message-id/efc12e32-5af2-3485-5b1d-5af9f707491a@iki.fi
*	Reject trailing junk after numeric literals	Peter Eisentraut	2022-02-16
\| \| \| \| \| \| \| \| \| \| \|	After this, the PostgreSQL lexers no longer accept numeric literals with trailing non-digits, such as 123abc, which would be scanned as two tokens: 123 and abc. This is undocumented and surprising, and it might also interfere with some extended numeric literal syntax being contemplated for the future. Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com
*	Split xlog.c into xlog.c and xlogrecovery.c.	Heikki Linnakangas	2022-02-16
\| \| \| \| \| \| \| \| \| \| \|	This moves the functions related to performing WAL recovery into the new xlogrecovery.c source file, leaving xlog.c responsible for maintaining the WAL buffers, coordinating the startup and switch from recovery to normal operations, and other miscellaneous stuff that have always been in xlog.c. Reviewed-by: Andres Freund, Kyotaro Horiguchi, Robert Haas Discussion: https://www.postgresql.org/message-id/a31f27b4-a31d-f976-6217-2b03be646ffa%40iki.fi
*	Move code around in StartupXLOG().	Heikki Linnakangas	2022-02-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is in preparation for the next commit, which will split off recovery-related code from xlog.c into a new source file. This is the order that things will happen with the next commit, and the point of this commit is to make these ordering changes more explicit, while the next commit mechanically moves the source code to the new file. To aid review, I added "BEGIN/END function" comments to mark which blocks of code are moved to which functions in the next commit. They will be gone in the next commit. Reviewed-by: Andres Freund, Kyotaro Horiguchi, Robert Haas Discussion: https://www.postgresql.org/message-id/a31f27b4-a31d-f976-6217-2b03be646ffa%40iki.fi
*	Refactor setting XLP_FIRST_IS_OVERWRITE_CONTRECORD.	Heikki Linnakangas	2022-02-16
\| \| \| \| \| \| \| \| \| \|	Set it directly in CreateOverwriteContrecordRecord(). That way, AdvanceXLInsertBuffer() doesn't need the missingContrecPtr global variable. This is in preparation for splitting xlog.c into multiple files. Reviewed-by: Robert Haas Discussion: https://www.postgresql.org/message-id/a462d79c-cb5a-47cc-e9ac-616b5003965f%40iki.fi
*	Run pgindent on xlog.c.	Heikki Linnakangas	2022-02-16
\| \| \| \| \| \| \|	To tidy up after some recent refactorings in xlog.c. These would be fixed by the pgindent run we do at the end of the development cycle, but I want to clean these up now as I'm about to do some more big refactorings on xlog.c.
*	Add TAP test to automate the equivalent of check_guc, take two	Michael Paquier	2022-02-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	src/backend/utils/misc/check_guc is a script that cross-checks the consistency of the GUCs with postgresql.conf.sample, making sure that its format is in line with what guc.c has. It has never been run automatically, and has rotten over the years, creating a lot of false positives as per a report from Justin Pryzby. d10e41d has introduced a SQL function to publish the most relevant flags associated to a GUC, with tests added in the main regression test suite to make sure that we avoid most of the inconsistencies in the GUC settings, based on recent reports, but there was nothing able to cross-check postgresql.conf.sample with the contents of guc.c. This commit adds a TAP test that covers the remaining gap. It emulates the most relevant checks that check_guc did, so as any format mistakes are detected in postgresql.conf.sample at development stage, with the following checks: - Check that parameters marked as NOT_IN_SAMPLE are not in the sample file. - Check that there are no dead entries in postgresql.conf.sample for parameters not marked as NOT_IN_SAMPLE. - Check that no parameters are missing from the sample file if listed in guc.c without NOT_IN_SAMPLE. The idea of building a list of the GUCs by parsing the sample file comes from Justin, and he wrote the regex used in the patch to find all the GUCs (this same formatting rule basically applies for the last 20~ years or so). In order to test this patch, I have played with manual modifications of postgresql.conf.sample and guc.c, making sure that we detect problems with the GUC rules and the sample file format. The test is located in src/test/modules/test_misc, which is the best location I could think about for such sanity checks, rather than the main regression test suite (src/test/regress) to avoid a new type of dependency with the source tree. The first attempt of this patch was b0a55f4, where the location of postgresql.conf.sample was retrieved using pg_config --sharedir. This has proven to be an issue for distributions that patch pg_config to enforce the installation paths at some wanted location (like Debian), that may not exist when the test is run, hence causing a failure. Instead of that, as per a suggestion from Andres Freund, rely on the fact that the test is always executed from its directory in the source tree and use a relative path to find the sample file. This works for the CI, VPATH builds and on Windows, and tests like the recovery one added in f47ed79 rely on that already. Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/Yf9YGSwPiMu0c7fP@paquier.xyz
*	Fix race condition in 028_pitr_timelines.pl test, add note to docs.	Heikki Linnakangas	2022-02-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The 028_pitr_timelines.pl test would sometimes hang, waiting for a WAL segment that was just filled up to be archived. It was because the test used 'pg_stat_archiver.last_archived_wal' to check if a file was archived, but the order that WAL files are archived when a standby is promoted is not fully deterministic, and 'last_archived_wal' tracks the last segment that was archived, not the highest-numbered WAL segment. Because of that, if the archiver archived segment 3, and then 2, 'last_archived_wal' say 2, and the test query would think that 3 has not been archived yet. Normally, WAL files are marked ready for archival in order, and the archiver process will process them in order, so that issue doesn't arise. We have used the same query on 'last_archived_wal' in a few other tests with no problem. But when a standby is promoted, things are a bit chaotic. After promotion, the server will try to archive all the WAL segments from the old timeline that are in pg_wal, as well as the history file and any new WAL segments on the new timeline. The end-of-recovery checkpoint will create the .ready files for all the WAL files on the old timeline, but at the same time, the new timeline is opened up for business. A file from the new timeline can therefore be archived before the files from the old timeline have been marked as ready for archival. It turns out that we don't really need to wait for the archival in this particular test, because the standby server is about to be stopped, and stopping a server will wait for the end-of-recovery checkpoint and all WAL archivals to finish, anyway. So we can just remove it from the test. Add a note to the docs on 'pg_stat_archiver' view that files can be archived out of order. Reviewed-by: Tom Lane Discussion: https://www.postgresql.org/message-id/3186114.1644960507@sss.pgh.pa.us
*	Update "don't truncate with failsafe" rationale.	Peter Geoghegan	2022-02-15
\| \| \| \| \| \| \| \| \| \|	There is a very good (though non-obvious) reason to avoid relation truncation during a VACUUM that has triggered the failsafe mechanism, which was missed before now. Update related comments, so this isn't forgotten. Reported-By: John Naylor <john.naylor@enterprisedb.com> Discussion: https://postgr.es/m/CAFBsxsFiMPxQ-dHZ8tOgktn=+ffeJT3+GinZ4zdOGbmAnCYadA@mail.gmail.com
*	Ensure that length argument of memcmp() isn't seen as negative.	Tom Lane	2022-02-15
\| \| \| \| \| \| \| \| \| \|	I think this will shut up a weird warning from buildfarm member serinus. Perhaps it'd be better to change tsCompareString's length arguments to unsigned, but that seems more invasive than is justified. Part of a general push to remove off-the-beaten-track warnings where we can easily do so.
*	Ensure that the argument of shmdt(2) is declared "void *".	Tom Lane	2022-02-15
\| \| \| \| \| \| \| \| \|	Our gcc-on-Solaris buildfarm members emit "incompatible pointer type" warnings in places where it's not. This is a bit odd, since AFAICT Solaris follows the POSIX spec in declaring shmdt's argument as "const void *", and you'd think any pointer argument would satisfy that. But whatever. Part of a general push to remove off-the-beaten-track warnings where we can easily do so.
*	Reject change of output-column collation in CREATE OR REPLACE VIEW.	Tom Lane	2022-02-15
\| \| \| \| \| \| \| \| \| \| \| \|	checkViewTupleDesc() didn't get the memo that it should verify same attcollation along with same type/typmod. (A quick scan did not find other similar oversights.) Per bug #17404 from Pierre-Aurélien Georges. On another day I might've back-patched this, but today I'm feeling paranoid about unnecessary behavioral changes in back branches. Discussion: https://postgr.es/m/17404-8a4a270ef30a6709@postgresql.org
*	Ensure that STDERR is empty in connect_ok tests	Daniel Gustafsson	2022-02-15
\| \| \| \| \| \| \| \| \| \| \|	Connections performed via connect_ok() in TAP tests should not write anything to STDERR. Author: Jacob Champion <pchampion@vmware.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/9D4FFB61-392B-4A2C-B7E4-911797B4AC14@yesql.se Discussion: https://postgr.es/m/ec146256e31afa0542f9fa970ec258c5f1a5f98.camel@vmware.com
*	Add more logging to new 028_pitr_timelines.pl test.	Heikki Linnakangas	2022-02-15
\| \| \| \| \| \| \| \|	The test has failed a couple of times on buildfarm member 'hoverfly'. It gets stuck waiting for the standby to archive 000000020000000000000003 WAL segment. I don't understand why, but with DEBUG1, we will get messages in the log whenever a segment is archived, which hopefully will give a clue the next time it happens.
*	Remove IS_AF_UNIX macro	Peter Eisentraut	2022-02-15
\| \| \| \| \| \| \| \| \| \| \| \|	The AF_UNIX macro was being used unprotected by HAVE_UNIX_SOCKETS, apparently since 2008. So the redirection through IS_AF_UNIX() is apparently no longer necessary. (More generally, all supported platforms are now HAVE_UNIX_SOCKETS, but even if there were a new platform in the future, it seems plausible that it would define the AF_UNIX symbol even without kernel support.) So remove the IS_AF_UNIX() macro and make the code a bit more consistent. Discussion: https://www.postgresql.org/message-id/flat/f2d26815-9832-e333-d52d-72fbc0ade896%40enterprisedb.com
*	Add test case for trailing junk after numeric literals	Peter Eisentraut	2022-02-15
\| \| \| \| \| \| \| \| \| \|	PostgreSQL currently accepts numeric literals with trailing non-digits, such as 123abc where the abc is treated as the next token. This may be a bit surprising. This commit adds test cases for this; subsequent commits intend to change this behavior. Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com
*	Remove pg_atoi()	Peter Eisentraut	2022-02-15
\| \| \| \| \| \| \| \| \|	The last caller was int2vectorin(), and having such a general function for one user didn't seem useful, so just put the required parts inline and remove the function. Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com
*	Remove command checks in tests of pg_basebackup and pg_receivewal	Michael Paquier	2022-02-15
\| \| \| \| \| \| \| \| \| \| \| \|	The TAP tests of those commands have been checking if commands of "gzip" and "lz4" existed by launching them with an extra --version. Based on the buildfarm, this is not required for "gzip" as the command always exists. Since 1d084fb, "lz4" has a ./configure check doing the same thing. Reported-by: Andres Freund Discussion: https://postgr.es/m/20220212220643.ozuvq2k4cjkcnr2v@alap3.anarazel.de Discussion: https://postgr.es/m/Ygm2ADakjlqGc2Ro@paquier.xyz
*	Fix thinko with subdirectories generated by pg_upgrade for internal files	Michael Paquier	2022-02-15
\| \| \| \| \| \| \| \| \| \| \| \| \|	38bfae3 has mixed the "dump/" and "log/" subdirectories generated in "pg_upgrade_output.d/", causing the internal dump files to be generated in "log/" and the log files to be in "dump/", but the opposite should be done. This was not directly an issue for pg_upgrade runs, as the internal dump files were still picked up at the location of their creation, but the newest version of the buildfarm client would have reported the dump files instead of the log files on failures of pg_upgrade. Issue spotted while testing the TAP tests of pg_upgrade.
*	Move replication slot release to before_shmem_exit().	Andres Freund	2022-02-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, replication slots were released in ProcKill() on error, resulting in reporting replication slot drop of ephemeral slots after the stats subsystem was already shut down. To fix this problem, move replication slot release to a before_shmem_exit() hook that is called before the stats collector shuts down. There wasn't really a good reason for the slot handling to be in ProcKill() anyway. Patch by Masahiko Sawada, with very minor polishing by me. I, Andres, wrote a test for dropping slots during process exit, but there may be some OS dependent issues around the number of times FATAL error messages are displayed due to a still debated libpq issue. So that test will be committed separately / later. Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-By: Andres Freund <andres@anarazel.de> Author: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAD21AoDAeEpAbZEyYJsPZJUmSPaRicVSBObaL7sPaofnKz+9zg@mail.gmail.com
*	Remove one use of pg_atoi()	Peter Eisentraut	2022-02-14
\| \| \| \| \| \| \|	There was no real need to use this here instead of a simpler API. Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com
*	Move scanint8() to numutils.c	Peter Eisentraut	2022-02-14
\| \| \| \| \| \| \| \| \| \| \|	Move scanint8() to numutils.c and rename to pg_strtoint64(). We already have a "16" and "32" version of that, and the code inside the functions was aligned, so this move makes all three versions consistent. The API is also changed to no longer provide the errorOK case. Users that need the error checking can use strtoi64(). Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com
*	Suppress integer-overflow compiler warning for inconsistent sun_len.	Tom Lane	2022-02-14
\| \| \| \| \| \| \| \| \| \| \| \|	On AIX 7.1, struct sockaddr_un is declared to be 1025 bytes long, but the sun_len field that should hold the length is only a byte. Clamp the value we try to store to ensure it will fit in the field. (This coding might need adjustment if there are any machines out there where sun_len is as wide as size_t; but a preliminary survey suggests there's not, so let's keep it simple.) Discussion: https://postgr.es/m/2781112.1644819528@sss.pgh.pa.us
*	Add test case for an archive recovery corner case.	Heikki Linnakangas	2022-02-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While I was working on a patch to refactor things around xlog.c, I mixed up EndOfLogTLI and replayTLI at the end of recovery. As a result, if you recovered to a point with a lower-numbered timeline in a WAL segment that has a higher TLI in the filename, the end-of-recovery WAL record was created with invalid PrevTimeLineId. I noticed that while self-reviewing, but no tests failed. So add a test to cover that corner case. Thanks to Amul Sul who also submitted a test case for the same corner case, although this patch is different from that. Reviewed-by: Amul Sul, Michael Paquier Discussion: https://www.postgresql.org/message-id/52bc9ccd-8591-431b-0086-15d9acf25a3f@iki.fi Discussion: https://www.postgresql.org/message-id/CAAJ_b94Vjt5cXGza_1MkjLQWciNdEemsmiWuQj0d%3DM7JfjAa1g%40mail.gmail.com
*	Add missing node support functions	Peter Eisentraut	2022-02-14
\| \| \| \|	forgotten in 37851a8b83d3d57ca48736093b10aa5f3bc0c177
*	Database-level collation version tracking	Peter Eisentraut	2022-02-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds to database objects the same version tracking that collation objects have. There is a new pg_database column datcollversion that stores the version, a new function pg_database_collation_actual_version() to get the version from the operating system, and a new subcommand ALTER DATABASE ... REFRESH COLLATION VERSION. This was not originally added together with pg_collation.collversion, since originally version tracking was only supported for ICU, and ICU on a database-level is not currently supported. But we now have version tracking for glibc (since PG13), FreeBSD (since PG14), and Windows (since PG13), so this is useful to have now. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/f0ff3190-29a3-5b39-a179-fa32eee57db6%40enterprisedb.com
*	Improve correlation names in sanity tests	Peter Eisentraut	2022-02-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some of the queries in the "sanity" tests in the regression test suite (opr_sanity, type_sanity) are very confusing. One main stumbling block is that for some probably ancient reason many of the older queries are written with correlation names p1, p2, etc. independent of the name of the catalog. This one is a good example: SELECT p1.oid, p1.oprname, p2.oid, p2.proname FROM pg_operator AS p1, pg_proc AS p2 <-- HERE WHERE p1.oprcode = p2.oid AND p1.oprkind = 'l' AND (p2.pronargs != 1 OR NOT binary_coercible(p2.prorettype, p1.oprresult) OR NOT binary_coercible(p1.oprright, p2.proargtypes[0]) OR p1.oprleft != 0); This is better written as SELECT o1.oid, o1.oprname, p1.oid, p1.proname FROM pg_operator AS o1, pg_proc AS p1 WHERE o1.oprcode = p1.oid AND o1.oprkind = 'l' AND (p1.pronargs != 1 OR NOT binary_coercible(p1.prorettype, o1.oprresult) OR NOT binary_coercible(o1.oprright, p1.proargtypes[0]) OR o1.oprleft != 0); This patch cleans up all the queries in this manner. (As in the above case, I kept the digits like o1 and p1 even in cases where only one of each letter is used in a query. This is mainly to keep the style consistent.) Discussion: https://www.postgresql.org/message-id/flat/c538308b-319c-8784-e250-1284d12d5411%40enterprisedb.com
*	Use WL_SOCKET_CLOSED for client_connection_check_interval.	Thomas Munro	2022-02-14
\| \| \| \| \| \| \| \| \| \| \| \|	Previously we used poll() directly to check for a POLLRDHUP event. Instead, use the WaitEventSet API to poll the socket for WL_SOCKET_CLOSED, which knows how to detect this condition on many more operating systems. Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Reviewed-by: Maksim Milyutin <milyutinma@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/77def86b27e41f0efcba411460e929ae%40postgrespro.ru
*	Add WL_SOCKET_CLOSED for socket shutdown events.	Thomas Munro	2022-02-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Provide a way for WaitEventSet to report that the remote peer has shut down its socket, independently of whether there is any buffered data remaining to be read. This works only on systems where the kernel exposes that information, namely: * WAIT_USE_POLL builds using POLLRDHUP, if available * WAIT_USE_EPOLL builds using EPOLLRDHUP * WAIT_USE_KQUEUE builds using EV_EOF Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Reviewed-by: Maksim Milyutin <milyutinma@gmail.com> Discussion: https://postgr.es/m/77def86b27e41f0efcba411460e929ae%40postgrespro.ru
*	WAL log unchanged toasted replica identity key attributes.	Amit Kapila	2022-02-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, during UPDATE, the unchanged replica identity key attributes are not logged separately because they are getting logged as part of the new tuple. But if they are stored externally then the untoasted values are not getting logged as part of the new tuple and logical replication won't be able to replicate such UPDATEs. So we need to log such attributes as part of the old_key_tuple during UPDATE. Reported-by: Haiying Tang Author: Dilip Kumar and Amit Kapila Reviewed-by: Alvaro Herrera, Haiying Tang, Andres Freund Backpatch-through: 10 Discussion: https://postgr.es/m/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com
*	Track LLVM 15 changes.	Thomas Munro	2022-02-14
\| \| \| \| \|	This isn't an API change, it's just a missing #include that we got away with before. Per buildfarm animal seawasp.
*	Correct Makefile dependencies for catalog scripts	John Naylor	2022-02-14
\| \| \| \| \| \|	At some point, Gen_fmgrtab.pl stopped needing the value of defined symbols from access/transam.h, while genbki.pl starting doing so. The Makefiles didn't get the memo, so update the relevant dependencies.
*	Add ./configure check for "lz4" command	Michael Paquier	2022-02-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some environments may compile with --with-lz4 while the command "lz4" goes missing, causing two failures in the TAP tests of pg_verifybackup (008_untar.pl and 010_client_untar.pl) as the code assumed that the command always existed with a hardcoded value in src/Makefile.global. Rather than this method, this adds a ./configure check based on PGAC_PATH_PROGS() to find automatically the command and get an absolute path to it. Both tests need to be adjusted for the case where the command does not exist, actually, as Makefile.global would set now LZ4 to an empty value in this case. The TAP tests of pg_receivewal already do that. Per report from buildfarm member copperhead, as an effect of dab2984. The origin of the failure is actually babbbb5 that did not centralize the check for the existence of a "lz4" command at ./configure to shave a few cycles. Note that one just needs to tweak an environment to move "lz4" out of the way to reproduce the problem, which is what I did to test this change. Per discussion with Robert Haas, Tom Lane, Andres Freund and myself. Discussion: https://postgr.es/m/Ygc51WVAFGocSu4h@paquier.xyz
*	Fix memory leak in IndexScan node with reordering	Alexander Korotkov	2022-02-14
\| \| \| \| \| \| \| \| \| \|	Fix ExecReScanIndexScan() to free the referenced tuples while emptying the priority queue. Backpatch to all supported versions. Discussion: https://postgr.es/m/CAHqSB9gECMENBQmpbv5rvmT3HTaORmMK3Ukg73DsX5H7EJV7jw%40mail.gmail.com Author: Aliaksandr Kalenik Reviewed-by: Tom Lane, Alexander Korotkov Backpatch-through: 10
*	Make origin data initialization consistent other fields in 2PC header	Michael Paquier	2022-02-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As of 1eb6d65, the origin data is optionally stored in a 2PC file header, with the data filled in EndPrepare() even in the default case where there is no origin data to add. This was inconsistent with all the other fields of TwoPhaseFileHeader which are initialized in StartPrepare(), so move the initialization of origin_lsn and origin_timestamp there instead. The effect of missing the initialization at this early stage is only cosmetic based on the current logic of the code, but could have led to issues in the long-term, and it is more consistent done this way. Reported-by: Ranier Vilela Discussion: https://postgr.es/m/CAEudQAooECJ+gU_RZB-yhioPOV94R4ucoHAf68PiJhLpgpVpBw@mail.gmail.com
*	Fix misuse of "const" qualifier.	Tom Lane	2022-02-13
\| \| \| \| \| \| \| \| \| \| \|	"const foo " is quite different from "foo const". This code was evidently trying to avoid casting away const from the arguments, but entirely failed to do so. Per study of some buildfarm warnings from anole (which unfortunately are mostly ignorable, since it seems not to understand "restrict" very well). I'm surprised though that nothing else has complained.
*	Remove REGRESS_OUTPUTDIR environment variable.	Thomas Munro	2022-02-14
\| \| \| \| \| \| \| \|	Andres Freund points out that the tmp_check path is already available as perl variable PostgreSQL::Test::Utils::tmp_check, so we can drop the new environment variable introduced by commit f47ed79cc. Discussion: https://postgr.es/m/20220213052955.dh7lheehit7bsemf%40alap3.anarazel.de
*	Silence minor compiler warnings.	Tom Lane	2022-02-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Depending on compiler version and optimization level, we might get a complaint that lazy_scan_heap's "freespace" is used uninitialized. Compilers not aware that ereport(ERROR) doesn't return complained about bbsink_lz4_new(). Assigning "-1" to a uint64 value has unportable results; fortunately, the value of xlogreadsegno is unimportant when xlogreadfd is -1. (It looks to me like there is no need for xlogreadsegno to be static in the first place, but I didn't venture to change that.)
*	Move libpq's write_failed mechanism down to pqsecure_raw_write().	Tom Lane	2022-02-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 1f39a1c06 implemented write-failure postponement in pqSendSome, which is above SSL/GSS processing. However, we've now seen failures indicating that (some versions of?) OpenSSL have a tendency to report write failures prematurely too. Hence, move the primary responsibility for postponing write failures down to pqsecure_raw_write(), below SSL/GSS processing. pqSendSome now sets write_failed only in corner cases where we'd lost the connection already. A side-effect of this change is that errors detected in the SSL/GSS layer itself will be reported immediately (as if they were read errors) rather than being postponed like write errors. That's reverting an effect of 1f39a1c06, and I think it's fine: if there's not a socket-level error, it's hard to be sure whether an OpenSSL error ought to be considered a read or write failure anyway. Another important point is that write-failure postponement is now effective during connection setup. OpenSSL's misbehavior of this sort occurs during SSL_connect(), so that's a change we want. Per bug #17391 from Nazir Bilal Yavuz. Possibly this should be back-patched, but I think it prudent to let it age awhile in HEAD first. Discussion: https://postgr.es/m/17391-304f81bcf724b58b@postgresql.org
*	Fix thinko in PQisBusy().	Tom Lane	2022-02-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In commit 1f39a1c06 I made PQisBusy consider conn->write_failed, but that is now looking like complete brain fade. In the first place, the logic is quite wrong: it ought to be like "and not" rather than "or". This meant that once we'd gotten into a write_failed state, PQisBusy would always return true, probably causing the calling application to iterate its loop until PQconsumeInput returns a hard failure thanks to connection loss. That's not what we want: the intended behavior is to return an error PGresult, which the application probably has much cleaner support for. But in the second place, checking write_failed here seems like the wrong thing anyway. The idea of the write_failed mechanism is to postpone handling of a write failure until we've read all we can from the server; so that flag should not interfere with input-processing behavior. (Compare 7247e243a.) What we should check for is status = CONNECTION_BAD, ie, socket already closed. (Most places that close the socket don't touch asyncStatus, but they do reset status.) This primarily ensures that if PQisBusy() returns true then there is an open socket, which is assumed by several call sites in our own code, and probably other applications too. While at it, fix a nearby thinko in libpq's my_sock_write: we should only consult errno for res < 0, not res == 0. This is harmless since pqsecure_raw_write would force errno to zero in such a case, but it still could confuse readers. Noted by Andres Freund. Backpatch to v12 where 1f39a1c06 came in. Discussion: https://postgr.es/m/20220211011025.ek7exh6owpzjyudn@alap3.anarazel.de
*	Revert "Add TAP test to automate the equivalent of check_guc"	Michael Paquier	2022-02-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit b0a55f4, to remove for now the TAP test that did the equivalent of check_guc. The test has been using pg_config --sharedir to find the location of postgresql.conf.sample. While the buildfarm and normal build environments rather liked that, this proves to be an issue for Debian where pg_config is patched to not be relocatable, causing the test to fail. Rather than relying on pg_config, we'd better find the sample file based on its location from the source directory. However, this is also an issue as a TAP test only offers the build directory as of TESTDIR in the environment context, so this would fail with VPATH builds. Perhaps the source path could be provided additionally when running the TAP tests. Or perhaps we may be able to get away by just switching to a SQL approach, by using PG_ABS_SRCDIR but this is going to require some extra loops to get the sample file from the correct path in src/backend/. In any case, this needs more thoughts, so just revert the test case until something better is done about this relocation problem. Reported-by: Christopher Berg Discussion: https://postgr.es/m/YgYw25OXV5men8Fj@msg.df7cb.de
*	Consolidate VACUUM xid cutoff logic.	Peter Geoghegan	2022-02-11
\| \| \| \| \| \| \| \| \|	Push the logic for determining whether or not a VACUUM operation will be aggressive down into vacuum_set_xid_limits(). This makes the function's signature significantly simpler, and seems clearer overall. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-WzkymFbz6D_vL+jmqSn_5q1wsFvFrE+37yLgL_Rkfd6Gzg@mail.gmail.com
*	Add VACUUM instrumentation for scanned pages, relfrozenxid.	Peter Geoghegan	2022-02-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Report on scanned pages within VACUUM VERBOSE and autovacuum logging. These are pages that were physically examined during the VACUUM operation. Note that this can include a small number of pages that were marked all-visible in the visibility map by some earlier VACUUM operation. VACUUM won't skip all-visible pages that aren't part of a range of all-visible pages that's at least 32 blocks in length (partly to avoid missing out on opportunities to advance relfrozenxid during non-aggressive VACUUMs). Commit 44fa8488 simplified the definition of scanned pages. It became the complement of the pages (of those pages from rel_pages) that were skipped using the visibility map. And so scanned pages precisely indicates how effective the visibility map was at saving work. (Before now we displayed the number of pages skipped via the visibility map when happened to be frozen pages, but not when they were merely all-visible, which was less useful to users.) Rename the user-visible OldestXmin output field to "removal cutoff", and show some supplementary information: how far behind the cutoff is (number of XIDs behind) by the time the VACUUM operation finished. This will help users to figure out what's _not_ working in extreme cases where VACUUM is fundamentally unable to remove dead tuples or freeze older tuples (e.g., due to a leaked replication slot). Also report when relfrozenxid is advanced by VACUUM in output that immediately follows "removal cutoff". This structure is intended to highlight the relationship between the new relfrozenxid value for the table, and the VACUUM operation's removal cutoff. Finally, add instrumentation of "missed dead tuples", and the number of pages that had at least one such tuple. These are fully DEAD (not just RECENTLY_DEAD) tuples with storage that could not be pruned due to failure to acquire a cleanup lock on a heap page. This is a replacement for the "skipped due to pin" instrumentation removed by commit 44fa8488. It shows more details than before for pages where failing to get a cleanup lock actually resulted in VACUUM missing out on useful work, but usually shows nothing at all instead (the mere fact that we couldn't get a cleanup lock is usually of no consequence whatsoever now). Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAH2-Wznp=c=Opj8Z7RMR3G=ec3_JfGYMN_YvmCEjoPCHzWbx0g@mail.gmail.com
*	Simplify lazy_scan_heap's handling of scanned pages.	Peter Geoghegan	2022-02-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Redefine a scanned page as any heap page that actually gets pinned by VACUUM's first pass over the heap, regardless of whether or not the page was cleanup locked. Although it's fundamentally impossible to prune a heap page without a cleanup lock (since we cannot safely defragment the page), we can do just about everything else. The only notable further exception is freezing tuples, though even that is arguably a consequence of not being able to prune (not a separate issue). VACUUM now does as much of the same processing as possible for pages that could not be cleanup locked. Any failure to do specific required processing is treated as a special case exception, which will be rare in practice. We now collect any preexisting LP_DEAD items (left behind by earlier opportunistic pruning) in the dead_items array for these heap pages, and count their tuples in the usual way. Steps used to decide if we'll attempt relation truncation are performed in the usual way for no-cleanup-lock scanned pages, too. Although eliminating these special cases is intrinsically useful, it's even more useful as an enabler of further simplifications. The only essential difference between aggressive and non-aggressive is that only aggressive is _guaranteed_ to be able to advance relfrozenxid up to FreezeLimit. Advancing relfrozenxid is always useful, but before now non-aggressive VACUUMs threw away the opportunity to do so whenever a cleanup lock could not be acquired on any page, no matter what the details were. This was very pessimistic. It isn't actually necessary to "behave aggressively" to maintain the ability to advance relfrozenxid when a cleanup lock isn't immediately available (most of the time). The non-aggressive case will now make sure that it isn't safe to advance relfrozenxid (without waiting) using only a share lock. It will usually notice that there are no tuples that need to be frozen anyway, just like in the aggressive case -- and so it no longer wastes an opportunity to advance relfrozenxid over nothing. (The non-aggressive case still won't wait for a cleanup lock when there really are tuples on the page that need to be frozen, since that really would amount to "behaving aggressively".) VACUUM currently has a tendency to set heap pages to all-visible in the visibility map before it freezes all of the tuples on the page. Only a subsequent aggressive VACUUM will visit these pages to freeze their tuples, usually only when the tuple XIDs are much older than the vacuum_freeze_min_age GUC (FreezeLimit cutoff) is supposed to allow. And so non-aggressive VACUUMs are still far less likely to be able to advance relfrozenxid in practice, even with the enhancements from this commit. This remaining issue will be addressed by future work that overhauls the criteria for freezing tuples. Once that's in place, almost every VACUUM operation will be able to advance relfrozenxid in practice. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAH2-Wznp=c=Opj8Z7RMR3G=ec3_JfGYMN_YvmCEjoPCHzWbx0g@mail.gmail.com