aboutsummaryrefslogtreecommitdiff
path: root/src/bin/pg_rewind/t
Commit message (Collapse)AuthorAge
* Run pgperltidyJoe Conway7 days
| | | | | | | This is required before the creation of a new branch. pgindent is clean, as well as is reformat-dat-files. perltidy version is v20230309, as documented in pgindent's README.
* Apply more consistent style for command options in TAP testsMichael Paquier2025-03-17
| | | | | | | | | | | | | | This commit reshapes the grammar of some commands to apply a more consistent style across the board, following rules similar to ce1b0f9da03e: - Elimination of some pointless used-once variables. - Use of long options, to self-document better the options used. - Use of fat commas to link option names and their assigned values, including redirections, so as perltidy can be tricked to put them together. Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://postgr.es/m/87jz8rzf3h.fsf@wibble.ilmari.org
* pg_rewind: Add dbname to primary_conninfo when using --write-recovery-conf.Masahiko Sawada2025-03-12
| | | | | | | | | | | | | This commit enhances pg_rewind's --write-recovery-conf option to include the dbname in the generated primary_conninfo value when specified in the --source-server option. With this modification, the rewound server can connect to the primary server without manual configuration file modifications when sync_replication_slots is enabled. Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Discussion: https://postgr.es/m/CAD21AoAkW=Ht0k9dVoBTCcqLiiZ2MXhVr+d=j2T_EZMerGrLWQ@mail.gmail.com
* Improve grammar of options for command arrays in TAP testsMichael Paquier2025-01-22
| | | | | | | | | | | | | | | | | | | | | | | This commit rewrites a good chunk of the command arrays in TAP tests with a grammar based on the following rules: - Fat commas are used between option names and their values, making it clear to both humans and perltidy that values and names are bound together. This is particularly useful for the readability of multi-line command arrays, and there are plenty of them in the TAP tests. Most of the test code is updated to use this style. Some commands used parenthesis to show the link, or attached values and options in a single string. These are updated to use fat commas instead. - Option names are switched to use their long names, making them more self-documented. Based on a suggestion by Andrew Dunstan. - Add some trailing commas after the last item in multi-line arrays, which is a common perl style. Not all the places are taken care of, but this covers a very good chunk of them. Author: Dagfinn Ilmari Mannsåker Reviewed-by: Michael Paquier, Peter Smith, Euler Taveira Discussion: https://postgr.es/m/87jzc46d8u.fsf@wibble.ilmari.org
* Update copyright for 2025Bruce Momjian2025-01-01
| | | | Backpatch-through: 13
* Fix newly introduced 010_keep_recycled_wals.plÁlvaro Herrera2024-11-21
| | | | | | | | | | | | | | | It failed to set the archive_command as it desired because of a syntax problem. Oversight in commit 90bcc7c2db1d. This bug doesn't cause the test to fail, because the test only checks pg_rewind's output messages, not the actual outcome (and the outcome in both cases is that the file is kept, not deleted). But in either case the message about the file being kept is there, so it's hard to get excited about doing much more. Reported-by: Antonin Houska <ah@cybertec.at> Author: Alexander Kukushkin <cyberdemn@gmail.com> Discussion: https://postgr.es/m/7822.1732167825@antos
* Avoid deleting critical WAL segments during pg_rewindÁlvaro Herrera2024-11-15
| | | | | | | | | | | | | | | | | | | | | | | | | Previously, in unlucky cases, it was possible for pg_rewind to remove certain WAL segments from the rewound demoted primary. In particular this happens if those files have been marked for archival (i.e., their .ready files were created) but not yet archived; the newly promoted node no longer has such files because of them having been recycled, but they are likely critical for recovery in the demoted node. If pg_rewind removes them, recovery is not possible anymore. Fix this by maintaining a hash table of files in this situation in the scan that looks for a checkpoint, which the decide_file_actions phase can consult so that it knows to preserve them. Backpatch to 14. The problem also exists in 13, but that branch was not blessed with commit eb00f1d4bf96, so this patch is difficult to apply there. Users of older releases will just have to continue to be extra careful when rewinding. Co-authored-by: Полина Бунгина (Polina Bungina) <bungina@gmail.com> Co-authored-by: Alexander Kukushkin <cyberdemn@gmail.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Discussion: https://postgr.es/m/CAAtGL4AhzmBRsEsaDdz7065T+k+BscNadfTqP1NcPmsqwA5HBw@mail.gmail.com
* Run pgperltidyMichael Paquier2024-07-01
| | | | | | | This is required before the creation of a new branch. pgindent is clean, as well as is reformat-dat-files. perltidy version is v20230309, as documented in pgindent's README.
* Skip some permissions checks on CygwinAndrew Dunstan2024-06-13
| | | | | | These are checks that are already skipped on other Windows systems. Backpatch to all live branches, as appropriate.
* Pre-beta mechanical code beautification.Tom Lane2024-05-14
| | | | | | | | | | | | | | Run pgindent, pgperltidy, and reformat-dat-files. The pgindent part of this is pretty small, consisting mainly of fixing up self-inflicted formatting damage from patches that hadn't bothered to add their new typedefs to typedefs.list. In order to keep it from making anything worse, I manually added a dozen or so typedefs that appeared in the existing typedefs.list but not in the buildfarm's list. Perhaps we should formalize that, or better find a way to get those typedefs into the automatic list. pgperltidy is as opinionated as always, and reformat-dat-files too.
* Activate perlcritic InputOutput::RequireCheckedSyscalls and fix resulting ↵Peter Eisentraut2024-03-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | warnings This checks that certain I/O-related Perl functions properly check their return value. Some parts of the PostgreSQL code had been a bit sloppy about that. The new perlcritic warnings are fixed here. I didn't design any beautiful error messages, mostly just used "or die $!", which mostly matches existing code, and also this is developer-level code, so having the system error plus source code reference should be ok. Initially, we only activate this check for a subset of what the perlcritic check would warn about. The effective list is chmod flock open read rename seek symlink system The initial set of functions is picked because most existing code already checked the return value of those, so any omissions are probably unintended, or because it seems important for test correctness. The actual perlcritic configuration is written as an exclude list. That seems better so that we are clear on what we are currently not checking. Maybe future patches want to investigate checking some of the other functions. (In principle, we might eventually want to check all of them, but since this is test and build support code, not production code, there are probably some reasonable compromises to be made.) Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/flat/88b7d4f2-46d9-4cc7-b1f7-613c90f9a76a%40eisentraut.org
* Skip .DS_Store files in server side utilsDaniel Gustafsson2024-02-13
| | | | | | | | | | | | | | | | | | The macOS Finder application creates .DS_Store files in directories when opened, which creates problems for serverside utilities which expect all files to be PostgreSQL specific files. Skip these files when encountered in pg_checksums, pg_rewind and pg_basebackup. This was extracted from a larger patchset for skipping hidden files and system files, where the concencus was to just skip these. Since this is equally likely to happen in every version, backpatch to all supported versions. Reported-by: Mark Guertin <markguertin@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Tobias Bussmann <t.bussmann@gmx.net> Discussion: https://postgr.es/m/E258CE50-AB0E-455D-8AAD-BB4FE8F882FB@gmail.com Backpatch-through: v12
* Avoid package qualification of $windows_osAndrew Dunstan2024-02-01
| | | | | | Further fallout from commit 6ee26c6a4b. To keep code in sync and avoid issues on older releases with different package names, simply use the unqualified name like many other places in our code.
* Fix 003_extrafiles.pl test for the WindowsAndrew Dunstan2024-01-30
| | | | | | | | | | | | | File::Find converts backslashes to slashes in the newer Perl versions. See: https://github.com/Perl/perl5/commit/414f14df98cb1c9a20f92c5c54948b67c09f072d So, do the same conversion for Windows before comparing paths. To support all Perl versions, always convert them on Windows regardless of the Perl's version. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Backpatch to all live branches
* Update copyright for 2024Bruce Momjian2024-01-03
| | | | | | | | Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12
* Fix typos in comments and in one isolation test.Robert Haas2024-01-02
| | | | | | | Dagfinn Ilmari Mannsåker, reviewed by Shubham Khanna. Some subtractions by me. Discussion: http://postgr.es/m/87le9fmi01.fsf@wibble.ilmari.org
* Make all Perl warnings fatalPeter Eisentraut2023-12-29
| | | | | | | | | | | | | | | | | | | | | | | | There are a lot of Perl scripts in the tree, mostly code generation and TAP tests. Occasionally, these scripts produce warnings. These are probably always mistakes on the developer side (true positives). Typical examples are warnings from genbki.pl or related when you make a mess in the catalog files during development, or warnings from tests when they massage a config file that looks different on different hosts, or mistakes during merges (e.g., duplicate subroutine definitions), or just mistakes that weren't noticed because there is a lot of output in a verbose build. This changes all warnings into fatal errors, by replacing use warnings; by use warnings FATAL => 'all'; in all Perl files. Discussion: https://www.postgresql.org/message-id/flat/06f899fd-1826-05ab-42d6-adeb1fd5e200%40eisentraut.org
* Fix pg_rewind with in-place tablespaces when source is remoteMichael Paquier2023-07-30
| | | | | | | | | | | | | | | | | | | | libpq_source.c would consider any result returned by pg_tablespace_location() as a symlink, resulting in run-time errors like that: pg_rewind: error: file "pg_tblspc/NN" is of different type in source and target In-place tablespaces are directories located in pg_tblspc/, returned as relative paths instead of absolute paths, so rely on that to make the difference with a normal tablespace and an in-place one. If the path is relative, the tablespace is handled as a directory. If the path is absolute, consider it as a symlink. In-place tablespaces are only intended for development purposes, so like 363e8f9 no backpatch is done. A test is added in pg_rewind with an in-place tablespace and some data in it. Author: Rui Zhao, Michael Paquier Discussion: https://postgr.es/m/2b79d2a8-b2d5-4bd7-a15b-31e485100980.xiyuan.zr@alibaba-inc.com
* Pre-beta mechanical code beautification.Tom Lane2023-05-19
| | | | | | | | | | | | | | | Run pgindent, pgperltidy, and reformat-dat-files. This set of diffs is a bit larger than typical. We've updated to pg_bsd_indent 2.1.2, which properly indents variable declarations that have multi-line initialization expressions (the continuation lines are now indented one tab stop). We've also updated to perltidy version 20230309 and changed some of its settings, which reduces its desire to add whitespace to lines to make assignments etc. line up. Going forward, that should make for fewer random-seeming changes to existing code. Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql
* pg_rewind: Fix determining TLI when server was just promoted.Heikki Linnakangas2023-02-23
| | | | | | | | | | | | | | | | | | | | | | | If the source server was just promoted, and it hasn't written the checkpoint record yet, pg_rewind considered the server to be still on the old timeline. Because of that, it would claim incorrectly that no rewind is required. Fix that by looking at minRecoveryPointTLI in the control file in addition to the ThisTimeLineID on the checkpoint. This has been a known issue since forever, and we had worked around it in the regression tests by issuing a checkpoint after each promotion, before running pg_rewind. But that was always quite hacky, so better to fix this properly. This doesn't add any new tests for this, but removes the previously-added workarounds from the existing tests, so that they should occasionally hit this codepath again. This is arguably a bug fix, but don't backpatch because we haven't really treated it as a bug so far. Also, the patch didn't apply cleanly to v13 and below. I'm sure sure it could be made to work on v13, but doesn't seem worth the risk and effort. Reviewed-by: Kyotaro Horiguchi, Ibrar Ahmed, Aleksander Alekseev Discussion: https://www.postgresql.org/message-id/9f568c97-87fe-a716-bd39-65299b8a60f4%40iki.fi
* Add wait_for_replay_catchup wrapper to Cluster.pmAlvaro Herrera2023-02-13
| | | | | | | This simplifies a few lines of Perl test code a bit. Author: Bertrand Drouvot Discussion: https://postgr.es/m/846724b5-0723-f4c2-8b13-75301ec7509e@gmail.com
* Update copyright for 2023Bruce Momjian2023-01-02
| | | | Backpatch-through: 11
* Pre-beta mechanical code beautification.Tom Lane2022-05-12
| | | | | Run pgindent, pgperltidy, and reformat-dat-files. I manually fixed a couple of comments that pgindent uglified.
* Fix typos and grammar in code and test commentsMichael Paquier2022-05-11
| | | | | | | | This fixes the grammar of some comments in a couple of tests (SQL and TAP), and in some C files. Author: Justin Pryzby Discussion: https://postgr.es/m/20220511020334.GH19626@telsasoft.com
* Improve frontend error logging style.Tom Lane2022-04-08
| | | | | | | | | | | | | | | | | | | | | | | | Get rid of the separate "FATAL" log level, as it was applied so inconsistently as to be meaningless. This mostly involves s/pg_log_fatal/pg_log_error/g. Create a macro pg_fatal() to handle the common use-case of pg_log_error() immediately followed by exit(1). Various modules had already invented either this or equivalent macros; standardize on pg_fatal() and apply it where possible. Invent the ability to add "detail" and "hint" messages to a frontend message, much as we have long had in the backend. Except where rewording was needed to convert existing coding to detail/hint style, I have (mostly) resisted the temptation to change existing message wording. Patch by me. Design and patch reviewed at various stages by Robert Haas, Kyotaro Horiguchi, Peter Eisentraut and Daniel Gustafsson. Discussion: https://postgr.es/m/1363732.1636496441@sss.pgh.pa.us
* Add option --config-file to pg_rewindMichael Paquier2022-04-07
| | | | | | | | | | | | | | | | | | | | | This option is useful to do a rewind with the server configuration file (aka postgresql.conf) located outside the data directory, which is something that some Linux distributions and some HA tools like to rely on. As a result, this can simplify the logic around a rewind by avoiding the copy of such files before running pg_rewind. This option affects pg_rewind when it internally starts the target cluster with some "postgres" commands, adding -c config_file=FILE to the command strings generated, when: - retrieving a restore_command using a "postgres -C" command for -c/--restore-target-wal. - forcing crash recovery once to get the cluster into a clean shutdown state. Author: Gunnar "Nick" Bluth Reviewed-by: Michael Banck, Alexander Kukushkin, Michael Paquier, Alexander Alekseev Discussion: https://postgr.es/m/7c59265d-ac50-b0aa-ca1e-65e8bd27642a@pro-open.de
* pg_rewind: Fetch small files according to new size.Daniel Gustafsson2022-04-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a race condition if a file changes in the source system after we have collected the file list. If the file becomes larger, we only fetched up to its original size. That can easily result in a truncated file. That's not a problem for relation files, files in pg_xact, etc. because any actions on them will be replayed from the WAL. However, configuration files are affected. This commit mitigates the race condition by fetching small files in whole, even if they have grown. A test is added in which an extra file copied is concurrently grown with the output of pg_rewind thus guaranteeing it to have changed in size during the operation. This is not a full fix: we still believe the original file size for files larger than 1 MB. That should be enough for configuration files, and doing more than that would require big changes to the chunking logic in libpq_source.c. This mitigates the race condition if the file is modified between the original scan of files and copying the file, but there's still a race condition if a file is changed while it's being copied. That's a much smaller window, though, and pg_basebackup has the same issue. This race can be seen with pg_auto_failover, which frequently uses ALTER SYSTEM, which updates postgresql.auto.conf. Often, pg_rewind will fail, because the postgresql.auto.conf file changed concurrently and a partial version of it was copied to the target. The partial file would fail to parse, preventing the server from starting up. Author: Heikki Linnakangas Reviewed-by: Cary Huang Discussion: https://postgr.es/m/f67feb24-5833-88cb-1020-19a4a2b83ac7%40iki.fi
* Remove more unused module imports from TAP testsDaniel Gustafsson2022-03-27
| | | | | | | | | | | | | | | | | This is a follow-up to commit 7dac61402 which removed a set of unused modules from the TAP test. The Config references in the pg_ctl and pg_rewind tests were removed in commit 1c6d46293. Fcntl ':mode' and File::stat in the pg_ctl test were added in c37b3d08c which was probably a leftover from an earlier version of the patch, as the function using these was added to another module in that commit. The Config reference in the ldap test was added in ee56c3b21 which in turn use $^O instead of interrogating Config. Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://postgr.es/m/87lewyqk45.fsf@wibble.ilmari.org
* Use test functions in pg_rewind test moduleDaniel Gustafsson2022-02-23
| | | | | | | | | | Commit 61081e75c introduced pg_rewind along with the test suite, which ensured that subroutines didn't incur more than one test to plan. Now that we no longer explicitly plan tests (since 549ec201d), we can use the usual Test::More functions. Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/AA527525-F0CC-4AA2-AF98-543CABFDAF59@yesql.se
* Remove most msys special processing in TAP testsAndrew Dunstan2022-02-20
| | | | | | | | | | Following migration of Windows buildfarm members running TAP tests to use of ucrt64 perl for those tests, special processing for msys perl is no longer necessary and so is removed. Backpatch to release 10 Discussion: https://postgr.es/m/c65a8781-77ac-ea95-d185-6db291e1baeb@dunslane.net
* Replace Test::More plans with done_testingDaniel Gustafsson2022-02-11
| | | | | | | | | | | | | | | | | | | Rather than doing manual book keeping to plan the number of tests to run in each TAP suite, conclude each run with done_testing() summing up the the number of tests that ran. This removes the need for maintaning and updating the plan count at the expense of an accurate count of remaining during the test suite runtime. This patch has been discussed a number of times, often in the context of other patches which updates tests, so a larger number of discussions can be found in the archives. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/DD399313-3D56-4666-8079-88949DAC870F@yesql.se
* Clean up TAP tests' usage of wait_for_catchup().Tom Lane2022-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By default, wait_for_catchup() waits for the replication connection to reach the primary's write LSN. That's fine, but in an apparent attempt to save one query round-trip, it was coded so that we executed pg_current_wal_lsn() again during each probe query. Thus, we presented the standby with a moving target to be reached. (While the test script itself couldn't be causing the write LSN to advance while it's blocked in wait_for_catchup(), it's plenty plausible that background activity such as autovacuum is emitting more WAL.) That could make the test take longer than necessary, and potentially it could mask bugs by allowing the standby to process more WAL than a strict interpretation of the test scenario allows. So, change wait_for_catchup() to do it "by the book", explicitly collecting the write LSN to wait for at the outset. Also, various call sites were instructing wait_for_catchup() to wait for the standby to reach the primary's insert LSN rather than its write LSN. This also seems like a bad idea. While in most test scenarios those are the same, if they are different then the inserted-but-not-yet-written WAL is not presently available to the standby. The test isn't doing anything to make it become so, so again we have the potential for unwanted test delay, perhaps even a test timeout. (Again, background activity would be needed to make this more than a hypothetical problem.) Hence, change the callers where necessary so that the wait target is always the primary's write LSN. While at it, simplify callers by making use of wait_for_catchup's default arguments wherever possible (the preceding change makes this possible in more places than it was before). And rewrite wait_for_catchup's documentation a bit. Patch by me; thanks to Julien Rouhaud for review. Discussion: https://postgr.es/m/2368336.1641843098@sss.pgh.pa.us
* Update copyright for 2022Bruce Momjian2022-01-07
| | | | Backpatch-through: 10
* Move Perl test modules to a better namespaceAndrew Dunstan2021-10-24
| | | | | | | | | | | | | | | The five modules in our TAP test framework all had names in the top level namespace. This is unwise because, even though we're not exporting them to CPAN, the names can leak, for example if they are exported by the RPM build process. We therefore move the modules to the PostgreSQL::Test namespace. In the process PostgresNode is renamed to Cluster, and TestLib is renamed to Utils. PostgresVersion becomes simply PostgreSQL::Version, to avoid possible confusion about what it's the version of. Discussion: https://postgr.es/m/aede93a4-7d92-ef26-398f-5094944c2504@dunslane.net Reviewed by Erik Rijkers and Michael Paquier
* Unify PostgresNode's new() and get_new_node() methodsAndrew Dunstan2021-07-29
| | | | | | | | | There is only one constructor now for PostgresNode, with the idiomatic name 'new'. The method is not exported by the class, and must be called as "PostgresNode->new('name',[args])". All the TAP tests that use PostgresNode are modified accordingly. Third party scripts will need adjusting, which is a fairly mechanical process (I just used a sed script).
* Initial pgindent and pgperltidy run for v14.Tom Lane2021-05-12
| | | | | | | | Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.
* Add a copyright notice to perl files lacking one.Andrew Dunstan2021-05-07
|
* Fix more race conditions in the newly-added pg_rewind test.Heikki Linnakangas2020-12-07
| | | | | | | | | | | | | | | | | | | | | | | | pg_rewind looks at the control file to check what timeline a server is on. But promotion doesn't immediately write a checkpoint, it merely writes an end-of-recovery WAL record. If pg_rewind runs immediately after promotion, before the checkpoint has completed, it will think think that the server is still on the earlier timeline. We ran into this issue a long time ago already, see commit 484a848a73f. It's a bit bogus that pg_rewind doesn't determine the timeline correctly until the end-of-recovery checkpoint has completed. We probably should fix that. But for now work around it by waiting for the checkpoint to complete before running pg_rewind, like we did in commit 484a848a73f. In the passing, tidy up the new test a little bit. Rerder the INSERTs so that the comments make more sense, remove a spurious CHECKPOINT call after pg_rewind has already run, and add --debug option, so that if this fails again, we'll have more data. Per buildfarm failure at https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=rorqual&dt=2020-12-06%2018%3A32%3A19&stg=pg_rewind-check. Backpatch to all supported versions. Discussion: https://www.postgresql.org/message-id/1713707e-e318-761c-d287-5b6a4aa807e8@iki.fi
* Fix race conditions in newly-added test.Heikki Linnakangas2020-12-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Buildfarm has been failing sporadically on the new test. I was able to reproduce this by adding a random 0-10 s delay in the walreceiver, just before it connects to the primary. There's a race condition where node_3 is promoted before it has fully caught up with node_1, leading to diverged timelines. When node_1 is later reconfigured as standby following node_3, it fails to catch up: LOG: primary server contains no more WAL on requested timeline 1 LOG: new timeline 2 forked off current database system timeline 1 before current recovery point 0/30000A0 That's the situation where you'd need to use pg_rewind, but in this case it happens already when we are just setting up the actual pg_rewind scenario we want to test, so change the test so that it waits until node_3 is connected and fully caught up before promoting it, so that you get a clean, controlled failover. Also rewrite some of the comments, for clarity. The existing comments detailed what each step in the test did, but didn't give a good overview of the situation the steps were trying to create. For reasons I don't understand, the test setup had to be written slightly differently in 9.6 and 9.5 than in later versions. The 9.5/9.6 version needed node 1 to be reinitialized from backup, whereas in later versions it could be shut down and reconfigured to be a standby. But even 9.5 should support "clean switchover", where primary makes sure that pending WAL is replicated to standby on shutdown. It would be nice to figure out what's going on there, but that's independent of pg_rewind and the scenario that this test tests. Discussion: https://www.postgresql.org/message-id/b0a3b95b-82d2-6089-6892-40570f8c5e60%40iki.fi
* Fix pg_rewind bugs when rewinding a standby server.Heikki Linnakangas2020-12-03
| | | | | | | | | | | | | If the target is a standby server, its WAL doesn't end at the last checkpoint record, but at minRecoveryPoint. We must scan all the WAL from the last common checkpoint all the way up to minRecoveryPoint for modified pages, and also consider that portion when determining whether the server needs rewinding. Backpatch to all supported versions. Author: Ian Barwick and me Discussion: https://www.postgresql.org/message-id/CABvVfJU-LDWvoz4-Yow3Ay5LZYTuPD7eSjjE4kGyNZpXC6FrVQ%40mail.gmail.com
* Make pg_rewind test case more stable.Heikki Linnakangas2020-11-20
| | | | | | | | If replication is exceptionally slow for some reason, pg_rewind might run before the test row has been replicated. Add an explicit wait for it. Reported-by: Andres Freund Discussion: https://www.postgresql.org/message-id/20201120003811.iknhqwatitw2vvxf%40alap3.anarazel.de
* Fix timing issue in pg_rewind test.Heikki Linnakangas2020-11-15
| | | | | | | | | | | The test inserts a row in primary server, waits until the insertion has been replicated to a cascaded standby, and checks that it's visible there by querying the cascaded standby. In order for that to work reliably, the test needs to wait until the insertion WAL record has been fully replayed. This should fix the occasional buildfarm failures. Diagnosis by Tom Lane. Discussion: https://www.postgresql.org/message-id/606796.1605424022@sss.pgh.pa.us
* Remove another test that doesn't work on Windows.Heikki Linnakangas2020-11-13
| | | | | | Apparently double-quotes are not allowed in filenames on Windows, either. Per buildfarm.
* Remove tests that don't work on Windows.Heikki Linnakangas2020-11-12
| | | | | | | | | | On Windows, a filename cannot contain backslashes, because a backslash is used directory separator. Remove tests I added in commit 9c4f5192f that tried to do that. We could perhaps use a SKIP block to only skip them on Windows, but I'm not sure how exactly to formulate that, so just remove the tests to make the buildfarm green again. Per buildfarm.
* Allow pg_rewind to use a standby server as the source system.Heikki Linnakangas2020-11-12
| | | | | | | | | | | | | | Using a hot standby server as the source has not been possible, because pg_rewind creates a temporary table in the source system, to hold the list of file ranges that need to be fetched. Refactor it to queue up the file fetch requests in pg_rewind's memory, so that the temporary table is no longer needed. Also update the logic to compute 'minRecoveryPoint' correctly, when the source is a standby server. Reviewed-by: Kyotaro Horiguchi, Soumyadeep Chakraborty Discussion: https://www.postgresql.org/message-id/0c5b3783-af52-3ee5-f8fa-6e794061f70d%40iki.fi
* pg_rewind: Fix thinko in parsing target WAL.Heikki Linnakangas2020-11-10
| | | | | | | | | | | It's entirely possible to see WAL for a relation that doesn't exist in the target anymore. That happens when the relation was dropped later. The refactoring in commit eb00f1d4b broke that case, by sanity-checking the file type in the target before checking the flag forwhether it exists there at all. I noticed this during manual testing. Modify the 001_basic.pl test so that it covers this case.
* Mark commit and abort WAL records with XLR_SPECIAL_REL_UPDATE.Heikki Linnakangas2020-08-17
| | | | | | | | | | | | | | | | If a commit or abort record includes "dropped relfilenodes", then replaying the record will remove data files. That is surely a "special rel update", but the records were not marked as such. Fix that, teach pg_rewind to expect and ignore them, and add a test case to cover it. It's always been like this, but no backporting for fear of breaking existing applications. If an application parsed the WAL but was not handling commit/abort records, it would stop working. That might be a good thing if it really needed to handle the dropped rels, but it will be caught when the application is updated to work with PostgreSQL v14 anyway. Discussion: https://www.postgresql.org/message-id/07b33e2c-46a6-86a1-5f9e-a7da73fddb95%40iki.fi Reviewed-by: Amit Kapila, Michael Paquier
* Rename wal_keep_segments to wal_keep_size.Fujii Masao2020-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | max_slot_wal_keep_size that was added in v13 and wal_keep_segments are the GUC parameters to specify how much WAL files to retain for the standby servers. While max_slot_wal_keep_size accepts the number of bytes of WAL files, wal_keep_segments accepts the number of WAL files. This difference of setting units between those similar parameters could be confusing to users. To alleviate this situation, this commit renames wal_keep_segments to wal_keep_size, and make users specify the WAL size in it instead of the number of WAL files. There was also the idea to rename max_slot_wal_keep_size to max_slot_wal_keep_segments, in the discussion. But we have been moving away from measuring in segments, for example, checkpoint_segments was replaced by max_wal_size. So we concluded to rename wal_keep_segments to wal_keep_size. Back-patch to v13 where max_slot_wal_keep_size was added. Author: Fujii Masao Reviewed-by: Álvaro Herrera, Kyotaro Horiguchi, David Steele Discussion: https://postgr.es/m/574b4ea3-e0f9-b175-ead2-ebea7faea855@oss.nttdata.com
* Enable almost all TAP tests involving symlinks on WindowsAndrew Dunstan2020-07-16
| | | | | | | | | | | | | | | | | | | | Windows has junction points which function as symbolic links for directories. This patch introduces a new function TestLib::dir_symlink() which creates a junction point on Windows and a standard Unix type symbolic link elsewhere. The function TestLib::perl2host is also modified, first to use cygpath where it's available (e.g. msys2) and second to allow it to succeed if the gandparent directory exists but the parent does not. Given these changes the only symlink tests that need to be skipped on Windows are those related to permissions or to use of readlink. The relevant tests for pg_basebackup and pg_rewind are therefore adjusted accordingly. Andrew Dunstan, reviewed by Peter Eisentraut and Michael Paquier. Discussion: https://postgr.es/m/c50a646c-d9bb-7c62-a4bf-8256ff6ff338@2ndquadrant.com
* Tighten up Windows CRLF conversion in our TAP test scripts.Tom Lane2020-07-08
| | | | | | | | | | | | | | | | | | The previous approach was to search-and-destroy all \r occurrences no matter what. That seems more likely to hide bugs than anything else; indeed it seems to be hiding one now. Fix things so that we only transform \r\n to \n. Side effects: must do this before, not after, chomp'ing if we're going to chomp, else we'd fail to clean up a trailing \r\n. Also, remove safe_psql's redundant repetition of what psql already did; else it might reduce \r\r\n to \n, which is exactly the scenario I'm hoping to expose. Perhaps this should be back-patched, but for now I'm content to see what happens in HEAD. Discussion: https://postgr.es/m/412ae8da-76bb-640f-039a-f3513499e53d@gmx.net