aboutsummaryrefslogtreecommitdiff
path: root/src/test/perl/PostgresNode.pm
Commit message (Collapse)AuthorAge
* Fix perl warning from commit 9b4eafcaf4Andrew Dunstan2022-11-23
| | | | | | per gripe from Andres Freund and Tom Lane Backpatch to all live branches.
* Prevent port collisions between concurrent TAP testsAndrew Dunstan2022-11-22
| | | | | | | | | | | | | | | | | | | | | | | | Currently there is a race condition where if concurrent TAP tests both test that they can open a port they will assume that it is free and use it, causing one of them to fail. To prevent this we record a reservation using an exclusive lock, and any TAP test that discovers a reservation checks to see if the reserving process is still alive, and looks for another free port if it is. Ports are reserved in a directory set by the environment setting PG_TEST_PORT_DIR, or if that doesn't exist a subdirectory of the top build directory as set by Makefile.global, or its own tmp_check directory. The prove_check recipe in Makefile.global.in is extended to export top_builddir to the TAP tests. This was already exported by the prove_installcheck recipes. Per complaint from Andres Freund Backpatched from 9b4eafcaf4 to all live branches Discussion: https://postgr.es/m/20221002164931.d57hlutrcz4d2zi7@awork3.anarazel.de
* Back-Patch "Add wait_for_subscription_sync for TAP tests."Amit Kapila2022-08-12
| | | | | | | | | | | | | | | This was originally done in commit 0c20dd33db for 16 only, to eliminate duplicate code and as an infrastructure that makes it easier to write future tests. However, it has been suggested that it would be good to back-patch this testing infrastructure to aid future tests in back-branches. Backpatch to all supported versions. Author: Masahiko Sawada Reviewed by: Amit Kapila, Shi yu Discussion: https://postgr.es/m/CAD21AoC-fvAkaKHa4t1urupwL8xbAcWRePeETvshvy80f6WV1A@mail.gmail.com Discussion: https://postgr.es/m/E1oJBIf-0006sw-SA@gemulon.postgresql.org
* For PostgreSQL::Test compatibility, alias entire package symbol tables.Noah Misch2022-06-25
| | | | | | | | | | | | Remove the need to edit back-branch-specific code sites when back-patching the addition of a PostgreSQL::Test::Utils symbol. Replace per-symbol, incomplete alias lists. Give old and new package names the same EXPORT and EXPORT_OK semantics. Back-patch to v10 (all supported versions). Reviewed by Andrew Dunstan. Discussion: https://postgr.es/m/20220622072144.GD4167527@rfd.leadboat.com
* Backpatch addition of wait_for_log(), pump_until().Andres Freund2022-05-02
| | | | | | | | These were originally introduced in a2ab9c06ea1 and a2ab9c06ea1, as they are needed by a about-to-be-backpatched test. Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de Backpatch: 10-14
* Support new perl module namespace in stable branchesAndrew Dunstan2022-04-21
| | | | | | | | | | | | | | | | | | | Commit b3b4d8e68a moved our perl test modules to a better namespace structure, but this has made life hard for people wishing to backpatch improvements in the TAP tests. Here we alleviate much of that difficulty by implementing the new module names on top of the old modules, mostly by using a little perl typeglob aliasing magic, so that we don't have a dual maintenance burden. This should work both for the case where a new test is backpatched and the case where a fix to an existing test that uses the new namespace is backpatched. Reviewed by Michael Paquier Per complaint from Andres Freund Discussion: https://postgr.es/m/20220418141530.nfxtkohefvwnzncl@alap3.anarazel.de Applied to branches 10 through 14
* Harden TAP tests that intentionally corrupt page checksums.Tom Lane2022-03-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous method for doing that was to write zeroes into a predetermined set of page locations. However, there's a roughly 1-in-64K chance that the existing checksum will match by chance, and yesterday several buildfarm animals started to reproducibly see that, resulting in test failures because no checksum mismatch was reported. Since the checksum includes the page LSN, test success depends on the length of the installation's WAL history, which is affected by (at least) the initial catalog contents, the set of locales installed on the system, and the length of the pathname of the test directory. Sooner or later we were going to hit a chance match, and today is that day. Harden these tests by specifically inverting the checksum field and leaving all else alone, thereby guaranteeing that the checksum is incorrect. In passing, fix places that were using seek() to set up for syswrite(), a combination that the Perl docs very explicitly warn against. We've probably escaped problems because no regular buffered I/O is done on these filehandles; but if it ever breaks, we wouldn't deserve or get much sympathy. Although we've only seen problems in HEAD, now that we recognize the environmental dependencies it seems like it might be just a matter of time until someone manages to hit this in back-branch testing. Hence, back-patch to v11 where we started doing this kind of test. Discussion: https://postgr.es/m/3192026.1648185780@sss.pgh.pa.us
* Introduce PG_TEST_TIMEOUT_DEFAULT for TAP suite non-elapsing timeouts.Noah Misch2022-03-04
| | | | | | | | | | | | Slow hosts may avoid load-induced, spurious failures by setting environment variable PG_TEST_TIMEOUT_DEFAULT to some number of seconds greater than 180. Developers may see faster failures by setting that environment variable to some lesser number of seconds. In tests, write $PostgreSQL::Test::Utils::timeout_default wherever the convention has been to write 180. This change raises the default for some briefer timeouts. Back-patch to v10 (all supported versions). Discussion: https://postgr.es/m/20220218052842.GA3627003@rfd.leadboat.com
* Remove most msys special processing in TAP testsAndrew Dunstan2022-02-20
| | | | | | | | | | Following migration of Windows buildfarm members running TAP tests to use of ucrt64 perl for those tests, special processing for msys perl is no longer necessary and so is removed. Backpatch to release 10 Discussion: https://postgr.es/m/c65a8781-77ac-ea95-d185-6db291e1baeb@dunslane.net
* Remove PostgreSQL::Test::Utils::perl2host completelyAndrew Dunstan2022-02-20
| | | | | | | | | | | Commit f1ac4a74de disabled this processing, and as nothing has broken (as expected) here we proceed to remove the routine and adjust all the call sites. Backpatch to release 10 Discussion: https://postgr.es/m/0ba775a2-8aa0-0d56-d780-69427cf6f33d@dunslane.net Discussion: https://postgr.es/m/20220125023609.5ohu3nslxgoygihl@alap3.anarazel.de
* Tighten TAP tests' tracking of postmaster state some more.Tom Lane2022-01-20
| | | | | | | | | | | | | | | | | | | Commits 6c4a8903b et al. had a couple of deficiencies: * The logic I added to Cluster::start to see if a PID file is present could be fooled by a stale PID file left over from a previous postmaster. To fix, if we're not sure whether we expect to find a running postmaster or not, validate the PID using "kill 0". * 017_shm.pl has a loop in which it just issues repeated Cluster::start calls; this will fail if some invocation fails but leaves self->_pid set. Per buildfarm results, the above fix is not enough to make this safe: we might have "validated" a PID for a postmaster that exits immediately after we look. Hence, match each failed start call with a stop call that will get us back to the self->_pid == undef state. Add a fail_ok option to Cluster::stop to make this work. Discussion: https://postgr.es/m/CA+hUKGKV6fOHvfiPt8=dOKzvswjAyLoFoJF1iQXMNpi7+hD1JQ@mail.gmail.com
* TAP tests: check for postmaster.pid anyway when "pg_ctl start" fails.Tom Lane2022-01-19
| | | | | | | | | | | | "pg_ctl start" might start a new postmaster and then return failure anyway, for example if PGCTLTIMEOUT is exceeded. If there is a postmaster there, it's still incumbent on us to shut it down at script end, so check for the PID file even though we are about to fail. This has been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/647439.1642622744@sss.pgh.pa.us
* Improve contrib/amcheck's tests for CREATE INDEX CONCURRENTLY.Tom Lane2021-10-28
| | | | | | | | | | | | | | | | | | | | | | | | Commits fdd965d07 and 3cd9c3b92 tested CREATE INDEX CONCURRENTLY by launching two separate pgbench runs concurrently. This was needed so that only a single client thread would run CREATE INDEX CONCURRENTLY, avoiding deadlock between two CICs. However, there's a better way, which is to use an advisory lock to prevent concurrent CICs. That's better in part because the test code is shorter and more readable, but mostly because it automatically scales things to launch an appropriate number of CICs relative to the number of INSERT transactions. As committed, typically half to three-quarters of the CIC transactions were pointless because the INSERT transactions had already stopped. In passing, remove background_pgbench, which was added to support these tests and isn't needed anymore. We can always put it back if we find a use for it later. Back-patch to v12; older pgbench versions lack the conditional-execution features needed for this method. Tom Lane and Andrey Borodin Discussion: https://postgr.es/m/139687.1635277318@sss.pgh.pa.us
* Fix CREATE INDEX CONCURRENTLY for the newest prepared transactions.Noah Misch2021-10-23
| | | | | | | | | | | | | | | | | | | The purpose of commit 8a54e12a38d1545d249f1402f66c8cde2837d97c was to fix this, and it sufficed when the PREPARE TRANSACTION completed before the CIC looked for lock conflicts. Otherwise, things still broke. As before, in a cluster having used CIC while having enabled prepared transactions, queries that use the resulting index can silently fail to find rows. It may be necessary to reindex to recover from past occurrences; REINDEX CONCURRENTLY suffices. Fix this for future index builds by making CIC wait for arbitrarily-recent prepared transactions and for ordinary transactions that may yet PREPARE TRANSACTION. As part of that, have PREPARE TRANSACTION transfer locks to its dummy PGPROC before it calls ProcArrayClearTransaction(). Back-patch to 9.6 (all supported versions). Andrey Borodin, reviewed (in earlier versions) by Andres Freund. Discussion: https://postgr.es/m/01824242-AA92-4FE9-9BA7-AEBAFFEA3D0C@yandex-team.ru
* Avoid race in RelationBuildDesc() affecting CREATE INDEX CONCURRENTLY.Noah Misch2021-10-23
| | | | | | | | | | | | | | | | | CIC and REINDEX CONCURRENTLY assume backends see their catalog changes no later than each backend's next transaction start. That failed to hold when a backend absorbed a relevant invalidation in the middle of running RelationBuildDesc() on the CIC index. Queries that use the resulting index can silently fail to find rows. Fix this for future index builds by making RelationBuildDesc() loop until it finishes without accepting a relevant invalidation. It may be necessary to reindex to recover from past occurrences; REINDEX CONCURRENTLY suffices. Back-patch to 9.6 (all supported versions). Noah Misch and Andrey Borodin, reviewed (in earlier versions) by Andres Freund. Discussion: https://postgr.es/m/20210730022548.GA1940096@gust.leadboat.com
* Restore robustness of TAP tests that wait for postmaster restart.Tom Lane2021-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several TAP tests use poll_query_until() to wait for the postmaster to restart. They were checking to see if a trivial query (e.g. "SELECT 1") succeeds. However, that's problematic in the wake of commit 11e9caff8, because now that we feed said query to psql via stdin, we risk IPC::Run whining about a SIGPIPE failure if psql quits before reading the query. Hence, we can't use a nonempty query in cases where we need to wait for connection failures to stop happening. Per the precedent of commits c757a3da0 and 6d41dd045, we can pass "undef" as the query in such cases to ensure that IPC::Run has nothing to write. However, then we have to say that the expected output is empty, and this exposes a deficiency in poll_query_until: if psql fails altogether and returns empty stdout, poll_query_until will treat that as a success! That's because, contrary to its documentation, it makes no actual check for psql failure, looking neither at the exit status nor at stderr. To fix that, adjust poll_query_until to insist on empty stderr as well as a stdout match. (I experimented with checking exit status instead, but it seems that psql often does exit(1) in cases that we need to consider successes. That might be something to fix someday, but it would be a non-back-patchable behavior change.) Back-patch to v10. The test cases needing this exist only as far back as v11, but it seems wise to keep poll_query_until's behavior the same in v10, in case we back-patch another such test case in future. (9.6 does not currently need this change, because in that branch poll_query_until can't be told to accept empty stdout as a success case.) Per assorted buildfarm failures, mostly on hoverfly. Discussion: https://postgr.es/m/CAA4eK1+zM6L4QSA1XMvXY_qqWwdUmqkOS1+hWvL8QcYEBGA1Uw@mail.gmail.com
* Allow PostgresNode.pm's backup method to accept backup_options.Robert Haas2021-06-09
| | | | | | | | | | Partial back-port of commit 081876d75ea15c3bd2ee5ba64a794fd8ea46d794. A test case for a pending bug fix needs this capability, but the code on 9.6 is significantly different, so I'm only back-patching this change as far as v10. We'll have to work around the problem another way in v9.6. Discussion: http://postgr.es/m/CAFiTN-tcivNvL0Rg6rD7_CErNfE75H7+gh9WbMxjbgsattja1Q@mail.gmail.com
* In PostgresNode.pm, don't pass SQL to psql on the command lineAndrew Dunstan2021-06-03
| | | | | | | | | The Msys shell mangles certain patterns in its command line, so avoid handing arbitrary SQL to psql on the command line and instead use IPC::Run's redirection facility for stdin. This pattern is already mostly whats used, but query_poll_until() was not doing the right thing. Problem discovered on the buildfarm when a new TAP test failed on msys.
* Raise a timeout to 180s, in test 010_logical_decoding_timelines.pl.Noah Misch2021-05-31
| | | | | Per buildfarm member hornet. Also, update Pod documentation showing the lower value. Back-patch to v10, where the test first appeared.
* fix silly perl error in commit d064afc720Andrew Dunstan2021-04-21
|
* Only ever test for non-127.0.0.1 addresses on Windows in PostgresNodeAndrew Dunstan2021-04-21
| | | | | | | | | | This has been found to cause hangs where tcp usage is forced. Alexey Kodratov Discussion: https://postgr.es/m/82e271a9a11928337fcb5b5e57b423c0@postgrespro.ru Backpatch to all live branches
* Allow TestLib::slurp_file to skip contents, and use as neededAndrew Dunstan2021-04-16
| | | | | | | | | | | | | | | | | | | In order to avoid getting old logfile contents certain functions in PostgresNode were doing one of two things. On Windows it rotated the logfile and restarted the server, while elsewhere it truncated the log file. Both of these are unnecessary. We borrow from the buildfarm which does this instead: note the size of the logfile before we start, and then when fetching the logfile skip to that position before accumulating contents. This is spelled differently on Windows but the effect is the same. This is largely centralized in TestLib's slurp_file function, which has a new optional parameter, the offset to skip to before starting to reading the file. Code in the client becomes much neater. Backpatch to all live branches. Michael Paquier, slightly modified by me. Discussion: https://postgr.es/m/YHajnhcMAI3++pJL@paquier.xyz
* Use fast checkpoint in PostgresNode::backup()Alvaro Herrera2020-10-21
| | | | Should cause tests to be a bit faster
* Put back explicit setting of replication values within TAP tests.Tom Lane2020-10-01
| | | | | | | | | | | | | | Commit 151c0c5f7 neglected the possibility that a TEMP_CONFIG file would explicitly set max_wal_senders=0; as indeed buildfarm member thorntail does, so that it can test wal_level=minimal in other test suites. Hence, rather than assuming that max_wal_senders=10 will prevail if we say nothing, set it explicitly. Set max_replication_slots=10 explicitly too, just to be safe. Back-patch to v10, like the previous patch. Discussion: https://postgr.es/m/723911.1601417626@sss.pgh.pa.us
* Remove obsolete replication settings within TAP tests.Tom Lane2020-09-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PostgresNode.pm set "max_wal_senders = 5" for replication testing, but this seems to be slightly too low for our current test suite. Slower buildfarm members frequently report "number of requested standby connections exceeds max_wal_senders" failures, due to old walsenders not exiting instantaneously. Usually, the test does not fail overall because of automatic walreceiver restart, but sometimes the failure becomes visible; and in any case such retries slow down the test. That value came in with commit 89ac7004d, but was soon obsoleted by f6d6d2920, which raised the built-in default from zero to 10; so that PostgresNode.pm is actually setting it to less than the conservative built-in default. That seems pretty pointless, so let's remove the special setting and let the default prevail, in hopes of making the TAP tests more robust. Likewise, the setting "max_replication_slots = 5" is obsolete and can be removed. While here, reverse-engineer a comment about why we're choosing less-than-default values for some other settings. (Note: before v12, max_wal_senders counted against max_connections so that the latter setting also needs some fiddling with.) Back-patch to v10 where the subscription tests were added. It's likely that the older branches aren't pushing the boundaries of max_wal_senders, but I'm disinclined to spend time trying to figure out exactly when it started to be a problem. Discussion: https://postgr.es/m/723911.1601417626@sss.pgh.pa.us
* Tighten up Windows CRLF conversion in our TAP test scripts.Tom Lane2020-07-09
| | | | | | | | | | Back-patch commits 91bdf499b and ffb4cee43, so that all branches agree on when and how to do Windows CRLF conversion. This should close the referenced thread. Thanks to Andrew Dunstan for discussion/review. Discussion: https://postgr.es/m/412ae8da-76bb-640f-039a-f3513499e53d@gmx.net
* Fix walsender error cleanup codeAlvaro Herrera2020-05-15
| | | | | | | | | | | | | | | In commit 850196b610d2 I (Álvaro) failed to handle the case of walsender shutting down on an error before setting up its 'xlogreader' pointer; the error handling code dereferences the pointer, causing a crash. Fix by testing the pointer before trying to dereference it. Kyotaro authored the code fix; I adopted Nathan's test case to be used by the TAP tests and added the necessary PostgresNode change. Reported-by: Nathan Bossart <bossartn@amazon.com> Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/C04FC24E-903D-4423-B312-6910E4D846E5@amazon.com
* Initial pgindent and pgperltidy run for v13.Tom Lane2020-05-14
| | | | | | | | | | | Includes some manual cleanup of places that pgindent messed up, most of which weren't per project style anyway. Notably, it seems some people didn't absorb the style rules of commit c9d297751, because there were a bunch of new occurrences of function calls with a newline just after the left paren, all with faulty expectations about how the rest of the call would get indented.
* Allow using Unix-domain sockets on Windows in testsPeter Eisentraut2020-03-30
| | | | | | | | | | | | | | | The test suites currently don't use Unix-domain sockets on Windows. This optionally allows enabling that by setting the environment variable PG_TEST_USE_UNIX_SOCKETS. This should currently be considered experimental. In particular, pg_regress.c contains some comments that the cleanup code for Unix-domain sockets doesn't work correctly under Windows, which hasn't been an problem until now. But it's good enough for locally supervised testing of the functionality. Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/54bde68c-d134-4eb8-5bd3-8af33b72a010@2ndquadrant.com
* When a TAP file has non-zero exit status, retain temporary directories.Noah Misch2020-02-05
| | | | | | | | | | | | PostgresNode already retained base directories in such cases. Stop using $SIG{__DIE__}, which is redundant with the exit status check, in lieu of proliferating it to TestLib. Back-patch to 9.6, where commit 88802e068017bee8cea7a5502a712794e761c7b5 introduced retention on failure. Reviewed by Daniel Gustafsson. Discussion: https://postgr.es/m/20200202170155.GA3264196@rfd.leadboat.com
* Fail if recovery target is not reachedPeter Eisentraut2020-01-29
| | | | | | | | | | | | Before, if a recovery target is configured, but the archive ended before the target was reached, recovery would end and the server would promote without further notice. That was deemed to be pretty wrong. With this change, if the recovery target is not reached, it is a fatal error. Based-on-patch-by: Leif Gunnar Erlandsen <leif@lako.no> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/993736dd3f1713ec1f63fc3b653839f5@lako.no
* Add basic TAP tests for psql's tab-completion logic.Tom Lane2020-01-02
| | | | | | | | | | | | | | | | | Up to now, psql's tab-complete.c has had exactly no regression test coverage. This patch is an experimental attempt to add some. This needs Perl's IO::Pty module, which isn't installed everywhere, so the test script just skips all tests if that's not present. There may be other portability gotchas too, so I await buildfarm results with interest. So far this just covers a few very basic keyword-completion and query-driven-completion scenarios, which should be enough to let us get a feel for whether this is practical at all from a portability standpoint. If it is, there's lots more that can be done. Discussion: https://postgr.es/m/10967.1577562752@sss.pgh.pa.us
* Avoid picking already-bound TCP ports in kerberos and ldap test suites.Tom Lane2019-08-04
| | | | | | | | | | | | | | | | | | | | | | | src/test/kerberos and src/test/ldap need to run a private authentication server of the relevant type, for which they need a free TCP port. They were just picking a random port number in 48K-64K, which works except when something's already using the particular port. Notably, the probability of failure rises dramatically if one simply runs those tests in a tight loop, because each test cycle leaves behind a bunch of high ports that are transiently in TIME_WAIT state. To fix, split out the code that PostgresNode.pm already had for identifying a free TCP port number, so that it can be invoked to choose a port for the KDC or LDAP server. This isn't 100% bulletproof, since conceivably something else on the machine could grab the port between the time we check and the time we actually start the server. But that's a pretty short window, so in practice this should be good enough. Back-patch to v11 where these test suites were added. Patch by me, reviewed by Andrew Dunstan. Discussion: https://postgr.es/m/3397.1564872168@sss.pgh.pa.us
* Consolidate methods for translating a Perl path to a Windows path.Noah Misch2019-06-21
| | | | | | | | | | | | This fixes some TAP suites when using msys Perl and a builddir located in an msys mount point other than "/". For example, builddir=/c/pg exhibited the problem, since /c/pg falls in mount point "/c". Back-patch to 9.6, where tests first started to perform such translations. In back branches, offer both new and old APIs. Reviewed by Andrew Dunstan. Discussion: https://postgr.es/m/20190610045838.GA238501@rfd.leadboat.com
* Honor TEMP_CONFIG in TAP suites.Noah Misch2019-05-11
| | | | | | | | | | | | The buildfarm client uses TEMP_CONFIG to implement its extra_config setting. Except for stats_temp_directory, extra_config now applies to TAP suites; extra_config values seen in the past month are compatible with this. Back-patch to 9.6, where PostgresNode was introduced, so the buildfarm can rely on it sooner. Reviewed by Andrew Dunstan and Tom Lane. Discussion: https://postgr.es/m/20181229021950.GA3302966@rfd.leadboat.com
* Probe only 127.0.0.1 when looking for ports on Unix.Thomas Munro2019-05-08
| | | | | | | | | | | | Commit c0985099, later adjusted by commit 4ab02e81, probed 0.0.0.0 in addition to 127.0.0.1, for the benefit of Windows build farm animals. It isn't really useful on Unix systems, and turned out to be a bit inconvenient to users of some corporate firewall software. Switch back to probing just 127.0.0.1 on non-Windows systems. Back-patch to 9.6, like the earlier changes. Discussion: https://postgr.es/m/CA%2BhUKG%2B21EPwfgs4m%2BtqyRtbVqkOUvP8QQ8sWk9%2Bh55Aub1H3A%40mail.gmail.com
* Test both 0.0.0.0 and 127.0.0.x addresses to find a usable port.Noah Misch2019-04-14
| | | | | | | | | | | | | Commit c098509927f9a49ebceb301a2cb6a477ecd4ac3c changed PostgresNode::get_new_node() to probe 0.0.0.0 instead of 127.0.0.1, but the new test was less effective for Windows native Perl. This increased the failure rate of buildfarm members bowerbird and jacana. Instead, test 0.0.0.0 and concrete addresses. This restores the old level of defense, but the algorithm is still subject to its longstanding time of check to time of use race condition. Back-patch to 9.6, like the previous change. Discussion: https://postgr.es/m/GrdLgAdUK9FdyZg8VIcTDKVOkys122ZINEb3CjjoySfGj2KyPiMKTh1zqtRp0TAD7FJ27G-OBB3eplxIB5GhcQH5o8zzGZfp0MuJaXJxVxk=@yesql.se
* Switch TAP tests of pg_rewind to use non-superuser role, take twoMichael Paquier2019-04-14
| | | | | | | | | | | | | | | | | | | | | | | | | Up to now the tests of pg_rewind have been using a superuser for all its tests (which is the default of many tests actually, and something that ought to be reviewed) when involving an online source server, still it is possible to use a non-superuser role to do that as long as this role is granted permissions to execute all the source-side functions used for the rewind. This is possible since v11, and was already documented as of bfc8068. PostgresNode::init is extended so as callers of this routine can add extra options to configure the authentication of a new node, which gets used by this commit, and allows the tests to work properly on Windows where SSPI is used. This will allow to catch up easily any change in pg_rewind if the tool begins to use more backend-side functions, so as the properties introduced by v11 are kept. Per suggestion from Peter Eisentraut. Author: Michael Paquier Reviewed-by: Magnus Hagander Discussion: https://postgr.es/m/20190411041336.GM2728@paquier.xyz
* When Perl "kill(9, ...)" fails, try "pg_ctl kill".Noah Misch2019-04-13
| | | | | | | Per buildfarm member jacana, the former fails under msys Perl 5.8.8. Back-patch to 9.6, like the code in question. Discussion: https://postgr.es/m/GrdLgAdUK9FdyZg8VIcTDKVOkys122ZINEb3CjjoySfGj2KyPiMKTh1zqtRp0TAD7FJ27G-OBB3eplxIB5GhcQH5o8zzGZfp0MuJaXJxVxk=@yesql.se
* Consistently test for in-use shared memory.Noah Misch2019-04-12
| | | | | | | | | | | | | | | | | | | | | | postmaster startup scrutinizes any shared memory segment recorded in postmaster.pid, exiting if that segment matches the current data directory and has an attached process. When the postmaster.pid file was missing, a starting postmaster used weaker checks. Change to use the same checks in both scenarios. This increases the chance of a startup failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1 postmaster.pid` && rm postmaster.pid && pg_ctl -w start". A postmaster will no longer stop if shmat() of an old segment fails with EACCES. A postmaster will no longer recycle segments pertaining to other data directories. That's good for production, but it's bad for integration tests that crash a postmaster and immediately delete its data directory. Such a test now leaks a segment indefinitely. No "make check-world" test does that. win32_shmem.c already avoided all these problems. In 9.6 and later, enhance PostgresNode to facilitate testing. Back-patch to 9.4 (all supported versions). Reviewed (in earlier versions) by Daniel Gustafsson and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20190408064141.GA2016666@rfd.leadboat.com
* Revert "Consistently test for in-use shared memory."Noah Misch2019-04-05
| | | | | | | | | This reverts commits 2f932f71d9f2963bbd201129d7b971c8f5f077fd, 16ee6eaf80a40007a138b60bb5661660058d0422 and 6f0e190056fe441f7cf788ff19b62b13c94f68f3. The buildfarm has revealed several bugs. Back-patch like the original commits. Discussion: https://postgr.es/m/20190404145319.GA1720877@rfd.leadboat.com
* Consistently test for in-use shared memory.Noah Misch2019-04-03
| | | | | | | | | | | | | | | | | | | | | postmaster startup scrutinizes any shared memory segment recorded in postmaster.pid, exiting if that segment matches the current data directory and has an attached process. When the postmaster.pid file was missing, a starting postmaster used weaker checks. Change to use the same checks in both scenarios. This increases the chance of a startup failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1 postmaster.pid` && rm postmaster.pid && pg_ctl -w start". A postmaster will no longer recycle segments pertaining to other data directories. That's good for production, but it's bad for integration tests that crash a postmaster and immediately delete its data directory. Such a test now leaks a segment indefinitely. No "make check-world" test does that. win32_shmem.c already avoided all these problems. In 9.6 and later, enhance PostgresNode to facilitate testing. Back-patch to 9.4 (all supported versions). Reviewed by Daniel Gustafsson and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20130911033341.GD225735@tornado.leadboat.com
* Don't propagate PGAPPNAME through pg_ctl in testsPeter Eisentraut2019-03-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When libpq is loaded in the server (for instance, by libpqwalreceiver), it may use libpq environment variables set in the postmaster environment for connection parameter defaults. This has some confusing effects in our test suites. For example, the TAP test infrastructure sets PGAPPNAME to allow identifying clients in the server log. But this environment variable is also inherited by temporary servers started with pg_ctl and is then in turn used by libpqwalreceiver as the application_name for connecting to remote servers where it then shows up in pg_stat_replication and is relevant for things like synchronous_standby_names. Replication already has a suitable default for application_name, and overriding that accidentally then requires the individual test cases to re-override that, which is all very confusing and unnecessary. To fix, unset PGAPPNAME temporarily before running pg_ctl start or restart in the tests. More comprehensive approaches like unsetting all environment variables in pg_ctl were considered but might be too complicated to achieve portably. The now unnecessary re-overriding of application_name by test cases is also removed. Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://www.postgresql.org/message-id/flat/33383613-690e-6f1b-d5ba-4957ff40f6ce@2ndquadrant.com
* Set cluster_name for PostgresNode.pm instancesPeter Eisentraut2019-02-27
| | | | | | | | | This can help identifying test instances more easily at run time, and it also provides some minimal test coverage for the cluster_name feature. Reviewed-by: Euler Taveira <euler@timbira.com.br> Discussion: https://www.postgresql.org/message-id/flat/1257eaee-4874-e791-e83a-46720c72cac7@2ndquadrant.com
* Integrate recovery.conf into postgresql.confPeter Eisentraut2018-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | recovery.conf settings are now set in postgresql.conf (or other GUC sources). Currently, all the affected settings are PGC_POSTMASTER; this could be refined in the future case by case. Recovery is now initiated by a file recovery.signal. Standby mode is initiated by a file standby.signal. The standby_mode setting is gone. If a recovery.conf file is found, an error is issued. The trigger_file setting has been renamed to promote_trigger_file as part of the move. The documentation chapter "Recovery Configuration" has been integrated into "Server Configuration". pg_basebackup -R now appends settings to postgresql.auto.conf and creates a standby.signal file. Author: Fujii Masao <masao.fujii@gmail.com> Author: Simon Riggs <simon@2ndquadrant.com> Author: Abhijit Menon-Sen <ams@2ndquadrant.com> Author: Sergei Kornilov <sk@zsrv.org> Discussion: https://www.postgresql.org/message-id/flat/607741529606767@web3g.yandex.ru/
* Make PostgresNode.pm's poll_query_until() more chatty about failures.Tom Lane2018-10-16
| | | | | | | | | | | Reporting only the stderr is unhelpful when the problem is that the server output we're getting doesn't match what was expected. So we should report the query output too; and just for good measure, let's print the query we used and the output we expected. Back-patch to 9.5 where poll_query_until was introduced. Discussion: https://postgr.es/m/17913.1539634756@sss.pgh.pa.us
* Implement "pg_ctl logrotate" commandAlexander Korotkov2018-09-01
| | | | | | | | | | | | Currently there are two ways to trigger log rotation in logging collector process: call pg_rotate_logfile() SQL-function or send SIGUSR1 signal directly to logging collector process. However, it's nice to have more suitable way for external tools to do that, which wouldn't require SQL connection or knowledge of logging collector pid. This commit implements triggering log rotation by "pg_ctl logrotate" command. Discussion: https://postgr.es/m/20180416.115435.28153375.horiguchi.kyotaro%40lab.ntt.co.jp Author: Kyotaro Horiguchi, Alexander Kuzmenkov, Alexander Korotkov
* Make logical WAL sender report streaming state appropriatelyMichael Paquier2018-07-12
| | | | | | | | | | | | | | | | | | | | WAL senders sending logically-decoded data fail to properly report in "streaming" state when starting up, hence as long as one extra record is not replayed, such WAL senders would remain in a "catchup" state, which is inconsistent with the physical cousin. This can be easily reproduced by for example using pg_recvlogical and restarting the upstream server. The TAP tests have been slightly modified to detect the failure and strengthened so as future tests also make sure that a node is in streaming state when waiting for its catchup. Backpatch down to 9.4 where this code has been introduced. Reported-by: Sawada Masahiko Author: Simon Riggs, Sawada Masahiko Reviewed-by: Petr Jelinek, Michael Paquier, Vaishnavi Prabakaran Discussion: https://postgr.es/m/CAD21AoB2ZbCCqOx=bgKMcLrAvs1V0ZMqzs7wBTuDySezTGtMZA@mail.gmail.com
* Use $Test::Builder::Level in TAP test functionsPeter Eisentraut2018-07-01
| | | | | | | | | In TAP test functions, that is, those that produce test results, locally increment $Test::Builder::Level. This has the effect that test failures are reported at the callers location rather than somewhere in the test support libraries. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
* Replace search.cpan.org with metacpan.orgMichael Paquier2018-06-29
| | | | | | | | | | search.cpan.org has been EOL'd, with metacpan.org being the official replacement to which URLs now redirect. Update links to match the new URL. Also update links to CPAN to use https as it will redirect from http. Author: Daniel Gustafsson Discussion: https://postgr.es/m/B74C0219-6BA9-46E1-A524-5B9E8CD3BDB3@yesql.se