aboutsummaryrefslogtreecommitdiff
path: root/src/bin/pg_rewind/file_ops.c
Commit message (Collapse)AuthorAge
* initdb: Add --no-sync-data-files.Nathan Bossart2025-03-25
| | | | | | | | | | | | | | | | | | | | | | This new option instructs initdb to skip synchronizing any files in database directories, the database directories themselves, and the tablespace directories, i.e., everything in the base/ subdirectory and any other tablespace directories. Other files, such as those in pg_wal/ and pg_xact/, will still be synchronized unless --no-sync is also specified. --no-sync-data-files is primarily intended for internal use by tools that separately ensure the skipped files are synchronized to disk. A follow-up commit will use this to help optimize pg_upgrade's file transfer step. The --sync-method=fsync implementation of this option makes use of a new exclude_dir parameter for walkdir(). When not NULL, exclude_dir specifies a directory to skip processing. The --sync-method=syncfs implementation of this option just skips synchronizing the non-default tablespace directories. This means that initdb will still synchronize some or all of the database files, but there's not much we can do about that. Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
* Update copyright for 2025Bruce Momjian2025-01-01
| | | | Backpatch-through: 13
* Define PG_TBLSPC_DIR for path pg_tblspc/ in data folderMichael Paquier2024-09-03
| | | | | | | | | | | | | | Similarly to 2065ddf5e34c, this introduces a define for "pg_tblspc". This makes the style more consistent with the existing PG_STAT_TMP_DIR, for example. There is a difference with the other cases with the introduction of PG_TBLSPC_DIR_SLASH, required in two places for recovery and backups. Author: Bertrand Drouvot Reviewed-by: Ashutosh Bapat, Álvaro Herrera, Yugo Nagata, Michael Paquier Discussion: https://postgr.es/m/ZryVvjqS9SnV1GPP@ip-10-97-1-34.eu-west-3.compute.internal
* Update copyright for 2024Bruce Momjian2024-01-03
| | | | | | | | Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12
* Add support for syncfs() in frontend support functions.Nathan Bossart2023-09-06
| | | | | | | | | | | | | | | This commit adds support for using syncfs() in fsync_pgdata() and fsync_dir_recurse() (which have been renamed to sync_pgdata() and sync_dir_recurse()). Like recovery_init_sync_method, sync_pgdata() calls syncfs() for the data directory, each tablespace, and pg_wal (if it is a symlink). For now, all of the frontend utilities that use these support functions are hard-coded to use fsync(), but a follow-up commit will allow specifying syncfs(). Co-authored-by: Justin Pryzby Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/20210930004340.GM831%40telsasoft.com
* Update copyright for 2023Bruce Momjian2023-01-02
| | | | Backpatch-through: 11
* Replace pgwin32_is_junction() with lstat().Thomas Munro2022-08-06
| | | | | | | | | | | | | | | | | | | Now that lstat() reports junction points with S_IFLNK/S_ISLINK(), and unlink() can unlink them, there is no need for conditional code for Windows in a few places. That was expressed by testing for WIN32 or S_ISLNK, which we can now constant-fold. The coding around pgwin32_is_junction() was a bit suspect anyway, as we never checked for errors, and we also know that errors can be spuriously reported because of transient sharing violations on this OS. The lstat()-based code has handling for that. This also reverts 4fc6b6ee on master only. That was done because lstat() didn't previously work for symlinks (junction points), but now it does. Tested-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CA%2BhUKGLfOOeyZpm5ByVcAt7x5Pn-%3DxGRNCvgiUPVVzjFLtnY0w%40mail.gmail.com
* Remove configure probes for symlink/readlink, and dead code.Thomas Munro2022-08-05
| | | | | | | | | | | | | | | | | | | symlink() and readlink() are in SUSv2 and all targeted Unix systems have them. We have partial emulation on Windows. Code that raised runtime errors on systems without it has been dead for years, so we can remove that and also references to such systems in the documentation. Define HAVE_READLINK and HAVE_SYMLINK macros on Unix. Our Windows replacement functions based on junction points can't be used for relative paths or for non-directories, so the macros can be used to check for full symlink support. The places that deal with tablespaces can just use symlink functions without checking the macros. (If they did check the macros, they'd need to provide an #else branch with a runtime or compile time error, and it'd be dead code.) Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
* Update copyright for 2022Bruce Momjian2022-01-07
| | | | Backpatch-through: 10
* Update copyright for 2021Bruce Momjian2021-01-02
| | | | Backpatch-through: 9.5
* pg_rewind: Refactor the abstraction to fetch from local/libpq source.Heikki Linnakangas2020-11-04
| | | | | | | | | | | | | | | | | This makes the abstraction of a "source" server more clear, by introducing a common abstract class, borrowing the object-oriented programming term, that represents all the operations that can be done on the source server. There are two implementations of it, one for fetching via libpq, and another to fetch from a local directory. This adds some code, but makes it easier to understand what's going on. The copy_executeFileMap() and libpq_executeFileMap() functions contained basically the same logic, just calling different functions to fetch the source files. Refactor so that the common logic is in one place, in a new function called perform_rewind(). Reviewed-by: Kyotaro Horiguchi, Soumyadeep Chakraborty Discussion: https://www.postgresql.org/message-id/0c5b3783-af52-3ee5-f8fa-6e794061f70d%40iki.fi
* Refactor pg_rewind for more clear decision making.Heikki Linnakangas2020-11-04
| | | | | | | | | | | Deciding what to do with each file is now a separate step after all the necessary information has been gathered. It is more clear that way. Previously, the decision-making was divided between process_source_file() and process_target_file(), and it was a bit hard to piece together what the overall rules were. Reviewed-by: Kyotaro Horiguchi, Soumyadeep Chakraborty Discussion: https://www.postgresql.org/message-id/0c5b3783-af52-3ee5-f8fa-6e794061f70d%40iki.fi
* pg_rewind: Move syncTargetDirectory() to file_ops.cHeikki Linnakangas2020-11-04
| | | | | | | | For consistency. All the other low-level functions that operate on the target directory are in file_ops.c. Reviewed-by: Michael Paquier Discussion: https://www.postgresql.org/message-id/0c5b3783-af52-3ee5-f8fa-6e794061f70d%40iki.fi
* Update copyrights for 2020Bruce Momjian2020-01-01
| | | | Backpatch-through: update all files in master, backpatch legal files through 9.4
* Remove pg_rewind's private logging.h/logging.c files.Tom Lane2019-05-14
| | | | | | | | | | | | The existence of these files became rather confusing with the introduction of a widely-known logging.h header in commit cc8d41511. (Indeed, there's already some duplicative #includes here, perhaps betraying such confusion.) The only thing left in them, after that commit, is a progress-reporting function that's neither general-purpose nor tied in any way to other logging infrastructure. Hence, let's just move that function to pg_rewind.c, and get rid of the separate files. Discussion: https://postgr.es/m/3971.1557787914@sss.pgh.pa.us
* Unified logging system for command-line programsPeter Eisentraut2019-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This unifies the various ad hoc logging (message printing, error printing) systems used throughout the command-line programs. Features: - Program name is automatically prefixed. - Message string does not end with newline. This removes a common source of inconsistencies and omissions. - Additionally, a final newline is automatically stripped, simplifying use of PQerrorMessage() etc., another common source of mistakes. - I converted error message strings to use %m where possible. - As a result of the above several points, more translatable message strings can be shared between different components and between frontends and backend, without gratuitous punctuation or whitespace differences. - There is support for setting a "log level". This is not meant to be user-facing, but can be used internally to implement debug or verbose modes. - Lazy argument evaluation, so no significant overhead if logging at some level is disabled. - Some color in the messages, similar to gcc and clang. Set PG_COLOR=auto to try it out. Some colors are predefined, but can be customized by setting PG_COLORS. - Common files (common/, fe_utils/, etc.) can handle logging much more simply by just using one API without worrying too much about the context of the calling program, requiring callbacks, or having to pass "progname" around everywhere. - Some programs called setvbuf() to make sure that stderr is unbuffered, even on Windows. But not all programs did that. This is now done centrally. Soft goals: - Reduces vertical space use and visual complexity of error reporting in the source code. - Encourages more deliberate classification of messages. For example, in some cases it wasn't clear without analyzing the surrounding code whether a message was meant as an error or just an info. - Concepts and terms are vaguely aligned with popular logging frameworks such as log4j and Python logging. This is all just about printing stuff out. Nothing affects program flow (e.g., fatal exits). The uses are just too varied to do that. Some existing code had wrappers that do some kind of print-and-exit, and I adapted those. I tried to keep the output mostly the same, but there is a lot of historical baggage to unwind and special cases to consider, and I might not always have succeeded. One significant change is that pg_rewind used to write all error messages to stdout. That is now changed to stderr. Reviewed-by: Donald Dong <xdong@csumb.edu> Reviewed-by: Arthur Zakirov <a.zakirov@postgrespro.ru> Discussion: https://www.postgresql.org/message-id/flat/6a609b43-4f57-7348-6480-bd022f924310@2ndquadrant.com
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Rework error messages around file handlingMichael Paquier2018-07-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some error messages related to file handling are using the code path context to define their state. For example, 2PC-related errors are referring to "two-phase status files", or "relation mapping file" is used for catalog-to-filenode mapping, however those prove to be difficult to translate, and are not more helpful than just referring to the path of the file being worked on. So simplify all those error messages by just referring to files with their path used. In some cases, like the manipulation of WAL segments, the context is actually helpful so those are kept. Calls to the system function read() have also been rather inconsistent with their error handling sometimes not reporting the number of bytes read, and some other code paths trying to use an errno which has not been set. The in-core functions are using a more consistent pattern with this patch, which checks for both errno if set or if an inconsistent read is happening. So as to care about pluralization when reading an unexpected number of byte(s), "could not read: read %d of %zu" is used as error message, with %d field being the output result of read() and %zu the expected size. This simplifies the work of translators with less variations of the same message. Author: Michael Paquier Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/20180520000522.GB1603@paquier.xyz
* Refactor dir/file permissionsStephen Frost2018-04-07
| | | | | | | | | | | | | | | | | | | Consolidate directory and file create permissions for tools which work with the PG data directory by adding a new module (common/file_perm.c) that contains variables (pg_file_create_mode, pg_dir_create_mode) and constants to initialize them (0600 for files and 0700 for directories). Convert mkdir() calls in the backend to MakePGDirectory() if the original call used default permissions (always the case for regular PG directories). Add tests to make sure permissions in PGDATA are set correctly by the tools which modify the PG data directory. Authors: David Steele <david@pgmasters.net>, Adam Brightwell <adam.brightwell@crunchydata.com> Reviewed-By: Michael Paquier, with discussion amongst many others. Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net
* Fix handling of files that source server removes during pg_rewind is running.Fujii Masao2018-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | After processing the filemap to build the list of chunks that will be fetched from the source to rewing the target server, it is possible that a file which was previously processed is removed from the source. A simple example of such an occurence is a WAL segment which gets recycled on the target in-between. When the filemap is processed, files not categorized as relation files are first truncated to prepare for its full copy of which is going to be taken from the source, divided into a set of junks. However, for a recycled WAL segment, this would result in a segment which has a zero-byte size. With such an empty file, post-rewind recovery thinks that records are saved but they are actually not because of the truncation which happened when processing the filemap, resulting in data loss. In order to fix the problem, make sure that files which are found as removed on the source when receiving chunks of them are as well deleted on the target server for consistency. Back-patch to 9.5 where pg_rewind was added. Author: Tsunakawa Takayuki Reviewed-by: Michael Paquier Reported-by: Tsunakawa Takayuki Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8DAAA2%40G01JPEXMBYT05
* Update copyright for 2018Bruce Momjian2018-01-02
| | | | Backpatch-through: certain files through 9.3
* Remove useless duplicate inclusions of system header files.Tom Lane2017-02-25
| | | | | | | | | | | | | | | | c.h #includes a number of core libc header files, such as <stdio.h>. There's no point in re-including these after having read postgres.h, postgres_fe.h, or c.h; so remove code that did so. While at it, also fix some places that were ignoring our standard pattern of "include postgres[_fe].h, then system header files, then other Postgres header files". While there's not any great magic in doing it that way rather than system headers last, it's silly to have just a few files deviating from the general pattern. (But I didn't attempt to enforce this globally, only in files I was touching anyway.) I'd be the first to say that this is mostly compulsive neatnik-ism, but over time it might save enough compile cycles to be useful.
* Update copyright via script for 2017Bruce Momjian2017-01-03
|
* pg_rewind: fsync target data directory.Andres Freund2016-03-27
| | | | | | | | | | | | | | | | | Previously pg_rewind did not fsync any files. That's problematic, given that the target directory is modified. If the database was started afterwards, 2ce439f33 luckily already caused the data directory to be synced to disk at postmaster startup; reducing the scope of the problem. To fix, use initdb -S, at the end of the pg_rewind run. It doesn't seem worthwhile to duplicate the code into pg_rewind, and initdb -S is already used that way by pg_upgrade. Reported-By: Andres Freund Author: Michael Paquier, somewhat edited by me Discussion: 20160310034352.iuqgvpmg5qmnxtkz@alap3.anarazel.de CAB7nPqSytVG1o4S3S2pA1O=692ekurJ+fckW2PywEG3sNw54Ow@mail.gmail.com Backpatch: 9.5, where pg_rewind was introduced
* Update copyright for 2016Bruce Momjian2016-01-02
| | | | Backpatch certain files through 9.1
* pg_rewind: Improve some messagesPeter Eisentraut2015-10-01
| | | | | | | The output of a typical pg_rewind run contained a mix of capitalized and not-capitalized and punctuated and not-punctuated phrases for no apparent reason. Make that consistent. Also fix some problems in other messages.
* pg_rewind: Improve message wordingPeter Eisentraut2015-06-22
|
* Fix some issues in pg_rewind.Fujii Masao2015-06-11
| | | | | | | | | | | | | | * Remove invalid option character "N" from the third argument (valid option string) of getopt_long(). * Use pg_free() or pfree() to free the memory allocated by pg_malloc() or palloc() instead of always using free(). * Assume problem is no disk space if write() fails but doesn't set errno. * Fix several typos. Patch by me. Review by Michael Paquier.
* Add pg_rewind, for re-synchronizing a master server after failback.Heikki Linnakangas2015-03-23
Earlier versions of this tool were available (and still are) on github. Thanks to Michael Paquier, Alvaro Herrera, Peter Eisentraut, Amit Kapila, and Satoshi Nagayasu for review.