aboutsummaryrefslogtreecommitdiff
path: root/src/backend/storage/file/fd.c
Commit message (Collapse)AuthorAge
...
* Remove durable_rename_excl()Michael Paquier2022-07-05
| | | | | | | | | | | | A previous commit replaced all the calls to this function with durable_rename() as of dac1ff3, making it used nowhere in the tree. Using it in extension code is also risky based on the issues described in this previous commit, so let's remove it. This makes possible the removal of HAVE_WORKING_LINK. Author: Nathan Bossart Reviewed-by: Robert Haas, Kyotaro Horiguchi, Michael Paquier Discussion: https://postgr.es/m/20220407182954.GA1231544@nathanxps13
* Revert recent changes with durable_rename_excl()Michael Paquier2022-04-28
| | | | | | | | | | | | | | | This reverts commits 2c902bb and ccfbd92. Per buildfarm members kestrel, rorqual and calliphoridae, the assertions checking that a TLI history file should not exist when created by a WAL receiver have been failing, and switching to durable_rename() over durable_rename_excl() would cause the newest TLI history file to overwrite the existing one. We need to think harder about such cases, so revert the new logic for now. Note that all the failures have been reported in the test 025_stuck_on_old_timeline. Discussion: https://postgr.es/m/511362.1651116498@sss.pgh.pa.us
* Remove durable_rename_excl()Michael Paquier2022-04-28
| | | | | | | | | | ccfbd92 has replaced all existing in-core callers of this function in favor of durable_rename(). durable_rename_excl() is by nature unsafe on crashes happening at the wrong time, so just remove it. Author: Nathan Bossart Reviewed-by: Robert Haas, Kyotaro Horiguchi, Michael Paquier Discussion: https://postgr.es/m/20220407182954.GA1231544@nathanxps13
* Add missing spaces after single-line commentsDavid Rowley2022-04-14
| | | | | | | | | | | | | Only 1 of 3 of these changes appear to be handled by pgindent. That change is new to v15. The remaining two appear to be left alone by pgindent. The exact reason for that is not 100% clear to me. It seems related to the fact that it's a line that contains *only* a single line comment and no actual code. It does not seem worth investigating this in too much detail. In any case, these do not conform to our usual practices, so fix them. Author: Justin Pryzby Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
* Update copyright for 2022Bruce Momjian2022-01-07
| | | | Backpatch-through: 10
* Replace random(), pg_erand48(), etc with a better PRNG API and algorithm.Tom Lane2021-11-28
| | | | | | | | | | | | | | | | | | | Standardize on xoroshiro128** as our basic PRNG algorithm, eliminating a bunch of platform dependencies as well as fundamentally-obsolete PRNG code. In addition, this API replacement will ease replacing the algorithm again in future, should that become necessary. xoroshiro128** is a few percent slower than the drand48 family, but it can produce full-width 64-bit random values not only 48-bit, and it should be much more trustworthy. It's likely to be noticeably faster than the platform's random(), depending on which platform you are thinking about; and we can have non-global state vectors easily, unlike with random(). It is not cryptographically strong, but neither are the functions it replaces. Fabien Coelho, reviewed by Dean Rasheed, Aleksander Alekseev, and myself Discussion: https://postgr.es/m/alpine.DEB.2.22.394.2105241211230.165418@pseudo
* Report progress of startup operations that take a long time.Robert Haas2021-10-25
| | | | | | | | | | | | | | | | | | | | | | | | Users sometimes get concerned whe they start the server and it emits a few messages and then doesn't emit any more messages for a long time. Generally, what's happening is either that the system is taking a long time to apply WAL, or it's taking a long time to reset unlogged relations, or it's taking a long time to fsync the data directory, but it's not easy to tell which is the case. To fix that, add a new 'log_startup_progress_interval' setting, by default 10s. When an operation that is known to be potentially long-running takes more than this amount of time, we'll log a status update each time this interval elapses. To avoid undesirable log chatter, don't log anything about WAL replay when in standby mode. Nitin Jadhav and Robert Haas, reviewed by Amul Sul, Bharath Rupireddy, Justin Pryzby, Michael Paquier, and Álvaro Herrera. Discussion: https://postgr.es/m/CA+TgmoaHQrgDFOBwgY16XCoMtXxsrVGFB2jNCvb7-ubuEe1MGg@mail.gmail.com Discussion: https://postgr.es/m/CAMm1aWaHF7VE69572_OLQ+MgpT5RUiUDgF1x5RrtkJBLdpRj3Q@mail.gmail.com
* In count_usable_fds(), duplicate stderr not stdin.Tom Lane2021-09-02
| | | | | | | | | | | | | | | | | | | | We had a complaint that the postmaster fails to start if the invoking program closes stdin. That happens because count_usable_fds expects to be able to dup(0), and if it can't, we conclude there are no free FDs and go belly-up. So far as I can find, though, there is no other place in the server that touches stdin, and it's not unreasonable to expect that a daemon wouldn't use that file. As a simple improvement, let's dup FD 2 (stderr) instead. Unlike stdin, it *is* reasonable for us to expect that stderr be open; even if we are configured not to touch it, common libraries such as libc might try to write error messages there. Per gripe from Mario Emmenlauer. Given the lack of previous complaints, I'm not excited about pushing this into stable branches, but it seems OK to squeeze it into v14. Discussion: https://postgr.es/m/48bafc63-c30f-3962-2ded-f2e985d93e86@emmenlauer.de
* Refactor sharedfileset.c to separate out fileset implementation.Amit Kapila2021-08-30
| | | | | | | | | | | | Move fileset related implementation out of sharedfileset.c to allow its usage by backends that don't want to share filesets among different processes. After this split, fileset infrastructure is used by both sharedfileset.c and worker.c for the named temporary files that survive across transactions. Author: Dilip Kumar, based on suggestion by Andres Freund Reviewed-by: Hou Zhijie, Masahiko Sawada, Amit Kapila Discussion: https://postgr.es/m/E1mCC6U-0004Ik-Fs@gemulon.postgresql.org
* Move temporary file cleanup to before_shmem_exit().Andres Freund2021-08-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As reported by a few OSX buildfarm animals there exist at least one path where temporary files exist during AtProcExit_Files() processing. As temporary file cleanup causes pgstat reporting, the assertions added in ee3f8d3d3ae caused failures. This is not an OSX specific issue, we were just lucky that timing on OSX reliably triggered the problem. The known way to cause this is a FATAL error during perform_base_backup() with a MANIFEST used - adding an elog(FATAL) after InitializeBackupManifest() reliably reproduces the problem in isolation. The problem is that the temporary file created in InitializeBackupManifest() is not cleaned up via resource owner cleanup as WalSndResourceCleanup() currently is only used for non-FATAL errors. That then allows to reach AtProcExit_Files() with existing temporary files, causing the assertion failure. To fix this problem, move temporary file cleanup to a before_shmem_exit() hook and add assertions ensuring that no temporary files are created before / after temporary file management has been initialized / shut down. The cleanest way to do so seems to be to split fd.c initialization into two, one for plain file access and one for temporary file access. Right now there's no need to perform further fd.c cleanup during process exit, so I just renamed AtProcExit_Files() to BeforeShmemExit_Files(). Alternatively we could perform another pass through the files to check that no temporary files exist, but the added assertions seem to provide enough protection against that. It might turn out that the assertions added in ee3f8d3d3ae will cause too much noise - in that case we'll have to downgrade them to a WARNING, at least temporarily. This commit is not necessarily the best approach to address this issue, but it should resolve the buildfarm failures. We can revise later. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20210807190131.2bm24acbebl4wl6i@alap3.anarazel.de
* Don't use #if inside function-like macro arguments.Thomas Munro2021-07-20
| | | | | | | | No concrete problem reported, but in the past it's been known to cause problems on some compilers so let's avoid doing that. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/234364.1626704007%40sss.pgh.pa.us
* Adjust commit 2dbe8905 for ancient macOS.Thomas Munro2021-07-19
| | | | | | | A couple of open flags used in an assertion didn't exist in macOS 10.4. Per build farm animal prairiedog. Also add O_EXCL while here (there are a few more standard flags but they're not relevant and likely to be missing).
* Support direct I/O on macOS.Thomas Munro2021-07-19
| | | | | | | | | | | | | | | | | | Macs don't understand O_DIRECT, but they can disable caching with a separate fcntl() call. Extend the file opening functions in fd.c to handle this for us if the caller passes in PG_O_DIRECT. For now, this affects only WAL data and even then only if you set: max_wal_senders=0 wal_level=minimal This is not expected to be very useful on its own, but later proposed patches will make greater use of direct I/O, and it'll be useful for testing if developers on Macs can see the effects. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA%2BhUKG%2BADiyyHe0cun2wfT%2BSVnFVqNYPxoO6J9zcZkVO7%2BNGig%40mail.gmail.com
* Message style improvementsPeter Eisentraut2021-06-28
|
* Initial pgindent and pgperltidy run for v14.Tom Lane2021-05-12
| | | | | | | | Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.
* Fix concurrency issues with WAL segment recycling on WindowsMichael Paquier2021-03-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit is mostly a revert of aaa3aed, that switched the routine doing the internal renaming of recycled WAL segments to use on Windows a combination of CreateHardLinkA() plus unlink() instead of rename(). As reported by several users of Postgres 13, this is causing concurrency issues when manipulating WAL segments, mostly in the shape of the following error: LOG: could not rename file "pg_wal/000000XX000000YY000000ZZ": Permission denied This moves back to a logic where a single rename() (well, pgrename() for Windows) is used. This issue has proved to be hard to hit when I tested it, facing it only once with an archive_command that was not able to do its work, so it is environment-sensitive. The reporters of this issue have been able to confirm that the situation improved once we switched back to a single rename(). In order to check things, I have provided to the reporters a patched build based on 13.2 with aaa3aed reverted, to test if the error goes away, and an unpatched build of 13.2 to test if the error still showed up (just to make sure that I did not mess up my build process). Extra thanks to Fujii Masao for pointing out what looked like the culprit commit, and to all the reporters for taking the time to test what I have sent them. Reported-by: Andrus, Guy Burgess, Yaroslav Pashinsky, Thomas Trenz Reviewed-by: Tom Lane, Andres Freund Discussion: https://postgr.es/m/3861ff1e-0923-7838-e826-094cc9bef737@hot.ee Discussion: https://postgr.es/m/16874-c3eecd319e36a2bf@postgresql.org Discussion: https://postgr.es/m/095ccf8d-7f58-d928-427c-b17ace23cae6@burgess.co.nz Discussion: https://postgr.es/m/16927-67c570d968c99567%40postgresql.org Discussion: https://postgr.es/m/YFBcRbnBiPdGZvfW@paquier.xyz Backpatch-through: 13
* Provide recovery_init_sync_method=syncfs.Thomas Munro2021-03-20
| | | | | | | | | | | | | | | | | | | | | | | | | Since commit 2ce439f3 we have opened every file in the data directory and called fsync() at the start of crash recovery. This can be very slow if there are many files, leading to field complaints of systems taking minutes or even hours to begin crash recovery. Provide an alternative method, for Linux only, where we call syncfs() on every possibly different filesystem under the data directory. This is equivalent, but avoids faulting in potentially many inodes from potentially slow storage. The new mode comes with some caveats, described in the documentation, so the default value for the new setting is "fsync", preserving the older behavior. Reported-by: Michael Brown <michael.brown@discourse.org> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Paul Guo <guopa@vmware.com> Reviewed-by: Bruce Momjian <bruce@momjian.us> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: David Steele <david@pgmasters.net> Discussion: https://postgr.es/m/11bc2bb7-ecb5-3ad0-b39f-df632734cd81%40discourse.org Discussion: https://postgr.es/m/CAEET0ZHGnbXmi8yF3ywsDZvb3m9CbdsGZgfTXscQ6agcbzcZAw%40mail.gmail.com
* Remove temporary files after backend crashTomas Vondra2021-03-18
| | | | | | | | | | | | | | | | | | After a crash of a backend using temporary files, the files used to be left behind, on the basis that it might be useful for debugging. But we don't have any reports of anyone actually doing that, and it means the disk usage may grow over time due to repeated backend failures (possibly even hitting ENOSPC). So this behavior is a bit unfortunate, and fixing it required either manual cleanup (deleting files, which is error-prone) or restart of the instance (i.e. service disruption). This implements automatic cleanup of temporary files, controled by a new GUC remove_temp_files_after_crash. By default the files are removed, but it can be disabled to restore the old behavior if needed. Author: Euler Taveira Reviewed-by: Tomas Vondra, Michael Paquier, Anastasia Lubennikova, Thomas Munro Discussion: https://postgr.es/m/CAH503wDKdYzyq7U-QJqGn%3DGm6XmoK%2B6_6xTJ-Yn5WSvoHLY1Ww%40mail.gmail.com
* Move our p{read,write}v replacements into their own files.Thomas Munro2021-01-14
| | | | | | | | | | | | | macOS's ranlib issued a warning about an empty pread.o file with the previous arrangement, on systems new enough to require no replacement functions. Let's go back to using configure's AC_REPLACE_FUNCS system to build and include each .o in the library only if it's needed, which requires moving the *v() functions to their own files. Also move the _with_retry() wrapper to a more permanent home. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1283127.1610554395%40sss.pgh.pa.us
* Update copyright for 2021Bruce Momjian2021-01-02
| | | | Backpatch-through: 9.5
* Convert elog(LOG) calls to ereport() where appropriatePeter Eisentraut2020-12-04
| | | | | | | | | | | User-visible log messages should go through ereport(), so they are subject to translation. Many remaining elog(LOG) calls are really debugging calls. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://www.postgresql.org/message-id/flat/92d6f545-5102-65d8-3c87-489f71ea0a37%40enterprisedb.com
* Use truncate(2) where appropriate.Thomas Munro2020-12-01
| | | | | | | When truncating files by name, use truncate(2). Windows hasn't got it, so keep our previous coding based on ftruncate(2) as a fallback. Discussion: https://postgr.es/m/16663-fe97ccf9932fc800%40postgresql.org
* Don't Insert() a VFD entry until it's fully built.Tom Lane2020-11-16
| | | | | | | | | | | | | | | | | | | Otherwise, if FDDEBUG is enabled, the debugging output fails because it tries to read the fileName, which isn't set up yet (and should in fact always be NULL). AFAICT, this has been wrong since Berkeley. Before 96bf88d52, it would accidentally fail to crash on platforms where snprintf() is forgiving about being passed a NULL pointer for %s; but the file name intended to be included in the debug output wouldn't ever have shown up. Report and fix by Greg Nancarrow. Although this is only visibly broken in custom-made builds, it still seems worth back-patching to all supported branches, as the FDDEBUG code is pretty useless as it stands. Discussion: https://postgr.es/m/CAJcOf-cUDgm9qYtC_B6XrC6MktMPNRby2p61EtSGZKnfotMArw@mail.gmail.com
* Skip unnecessary stat() calls in walkdir().Thomas Munro2020-09-07
| | | | | | | | | | | Some kernels can tell us the type of a "dirent", so we can avoid a call to stat() or lstat() in many cases. Define a new function get_dirent_type() to contain that logic, for use by the backend and frontend versions of walkdir(), and perhaps other callers in future. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2BFzxupGGN4GpUdbzZN%2Btn6FQPHo8w0Q%2BAPH5Wz8RG%2Bww%40mail.gmail.com
* Extend the BufFile interface.Amit Kapila2020-08-26
| | | | | | | | | | | | | | | | | | | | | | Allow BufFile to support temporary files that can be used by the single backend when the corresponding files need to be survived across the transaction and need to be opened and closed multiple times. Such files need to be created as a member of a SharedFileSet. Additionally, this commit implements the interface for BufFileTruncate to allow files to be truncated up to a particular offset and extends the BufFileSeek API to support the SEEK_END case. This also adds an option to provide a mode while opening the shared BufFiles instead of always opening in read-only mode. These enhancements in BufFile interface are required for the upcoming patch to allow the replication apply worker, to handle streamed in-progress transactions. Author: Dilip Kumar, Amit Kapila Reviewed-by: Amit Kapila Tested-by: Neha Sharma Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
* Fix temporary tablespaces for shared filesets some more.Tom Lane2020-07-03
| | | | | | | | | | | | | | | | | | | | | | | | Commit ecd9e9f0b fixed the problem in the wrong place, causing unwanted side-effects on the behavior of GetNextTempTableSpace(). Instead, let's make SharedFileSetInit() responsible for subbing in the value of MyDatabaseTableSpace when the default tablespace is called for. The convention about what is in the tempTableSpaces[] array is evidently insufficiently documented, so try to improve that. It also looks like SharedFileSetInit() is doing the wrong thing in the case where temp_tablespaces is empty. It was hard-wiring use of the pg_default tablespace, but it seems like using MyDatabaseTableSpace is more consistent with what happens for other temp files. Back-patch the reversion of PrepareTempTablespaces()'s behavior to 9.5, as ecd9e9f0b was. The changes in SharedFileSetInit() go back to v11 where that was introduced. (Note there is net zero code change before v11 from these two patch sets, so nothing to release-note.) Magnus Hagander and Tom Lane Discussion: https://postgr.es/m/CABUevExg5YEsOvqMxrjoNvb3ApVyH+9jggWGKwTDFyFCVWczGQ@mail.gmail.com
* Remove HAVE_WORKING_LINKPeter Eisentraut2020-03-11
| | | | | | | | | | | | | | Previously, hard links were not used on Windows and Cygwin, but they support them just fine in currently supported OS versions, so we can use them there as well. Since all supported platforms now support hard links, we can remove the alternative code paths. Rename durable_link_or_rename() to durable_rename_excl() to make the purpose more clear without referencing the implementation details. Discussion: https://www.postgresql.org/message-id/flat/72fff73f-dc9c-4ef4-83e8-d2e60c98df48%402ndquadrant.com
* Account explicitly for long-lived FDs that are allocated outside fd.c.Tom Lane2020-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The comments in fd.c have long claimed that all file allocations should go through that module, but in reality that's not always practical. fd.c doesn't supply APIs for invoking some FD-producing syscalls like pipe() or epoll_create(); and the APIs it does supply for non-virtual FDs are mostly insistent on releasing those FDs at transaction end; and in some cases the actual open() call is in code that can't be made to use fd.c, such as libpq. This has led to a situation where, in a modern server, there are likely to be seven or so long-lived FDs per backend process that are not known to fd.c. Since NUM_RESERVED_FDS is only 10, that meant we had *very* few spare FDs if max_files_per_process is >= the system ulimit and fd.c had opened all the files it thought it safely could. The contrib/postgres_fdw regression test, in particular, could easily be made to fall over by running it under a restrictive ulimit. To improve matters, invent functions Acquire/Reserve/ReleaseExternalFD that allow outside callers to tell fd.c that they have or want to allocate a FD that's not directly managed by fd.c. Add calls to track all the fixed FDs in a standard backend session, so that we are honestly guaranteeing that NUM_RESERVED_FDS FDs remain unused below the EMFILE limit in a backend's idle state. The coding rules for these functions say that there's no need to call them in code that just allocates one FD over a fairly short interval; we can dip into NUM_RESERVED_FDS for such cases. That means that there aren't all that many places where we need to worry. But postgres_fdw and dblink must use this facility to account for long-lived FDs consumed by libpq connections. There may be other places where it's worth doing such accounting, too, but this seems like enough to solve the immediate problem. Internally to fd.c, "external" FDs are limited to max_safe_fds/3 FDs. (Callers can choose to ignore this limit, but of course it's unwise to do so except for fixed file allocations.) I also reduced the limit on "allocated" files to max_safe_fds/3 FDs (it had been max_safe_fds/2). Conceivably a smarter rule could be used here --- but in practice, on reasonable systems, max_safe_fds should be large enough that this isn't much of an issue, so KISS for now. To avoid possible regression in the number of external or allocated files that can be opened, increase FD_MINFREE and the lower limit on max_files_per_process a little bit; we now insist that the effective "ulimit -n" be at least 64. This seems like pretty clearly a bug fix, but in view of the lack of field complaints, I'll refrain from risking a back-patch. Discussion: https://postgr.es/m/E1izCmM-0005pV-Co@gemulon.postgresql.org
* Add pg_file_sync() to adminpack extension.Fujii Masao2020-01-24
| | | | | | | | | | | This function allows us to fsync the specified file or directory. It's useful, for example, when we want to sync the file that pg_file_write() writes out or that COPY TO exports the data into, for durability. Author: Fujii Masao Reviewed-By: Julien Rouhaud, Arthur Zakirov, Michael Paquier, Atsushi Torikoshi Discussion: https://www.postgresql.org/message-id/CAHGQGwGY8uzZ_k8dHRoW1zDcy1Z7=5GQ+So4ZkVy2u=nLsk=hA@mail.gmail.com
* Update copyrights for 2020Bruce Momjian2020-01-01
| | | | Backpatch-through: update all files in master, backpatch legal files through 9.4
* Add safeguards for pg_fsync() called with incorrectly-opened fdsMichael Paquier2019-11-26
| | | | | | | | | | | | | | | | | | | | | | | | On some platforms, fsync() returns EBADFD when opening a file descriptor with O_RDONLY (read-only), leading ultimately now to a PANIC to prevent data corruption. This commit adds a new sanity check in pg_fsync() based on fcntl() to make sure that we don't repeat again mistakes with incorrectly-set file descriptors so as problems are detected at an early stage. Without that, such errors could only be detected after running Postgres on a specific supported platform for the culprit code path, which could take some time before being found. b8e19b93 was a fix for such a problem, which got undetected for more than 5 years, and a586cc4b fixed another similar issue. Note that the new check added works as well when fsync=off is configured, so as all regression tests would detect problems as long as assertions are enabled. fcntl() being not available on Windows, the new checks do not happen there. Author: Michael Paquier Reviewed-by: Mark Dilger Discussion: https://postgr.es/m/20191009062640.GB21379@paquier.xyz
* Make the order of the header file includes consistent in backend modules.Amit Kapila2019-11-12
| | | | | | | | | | | Similar to commits 7e735035f2 and dddf4cdc33, this commit makes the order of header file inclusion consistent for backend modules. In the passing, removed a couple of duplicate inclusions. Author: Vignesh C Reviewed-by: Kuntal Ghosh and Amit Kapila Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
* Fix most -Wundef warningsPeter Eisentraut2019-10-19
| | | | | | | | | | | | In some cases #if was used instead of #ifdef in an inconsistent style. Cleaning this up also helps when analyzing cases like 38d8dce61fff09daae0edb6bcdd42b0c7f10ebcd where this makes a difference. There are no behavior changes here, but the change in pg_bswap.h would prevent possible accidental misuse by third-party code. Discussion: https://www.postgresql.org/message-id/flat/3b615ca5-c595-3f1d-fdf7-a429e564f614%402ndquadrant.com
* Message style fixesPeter Eisentraut2019-09-23
|
* Rearrange postmaster's startup sequence for better syslogger results.Tom Lane2019-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a second try at what commit 57431a911 tried to do, namely, launch the syslogger before we open postmaster sockets so that our messages about the sockets end up in the syslogger files. That commit fell foul of a bunch of subtle issues caused by trying to launch a postmaster child process before creating shared memory. Rather than messing with that interaction, let's postpone opening the sockets till after we launch the syslogger. This would not have been terribly safe before commit 7de19fbc0, because we relied on socket opening to detect whether any competing postmasters were using the same port number. But now that we choose IPC keys without regard to the port number, there's no interaction to worry about. Also delay creation of the external PID file (if requested) till after the sockets are open, since external code could plausibly be relying on that ordering of events. And postpone most of the work of RemovePgTempFiles() so that that potentially-slow processing still happens after we make the external PID file. We have to be a bit careful about that last though: as noted in the discussion subsequent to bug #15804, EXEC_BACKEND builds still have to clear the parameter-file temp dir before launching the syslogger. Patch by me; thanks to Michael Paquier for review/testing. Discussion: https://postgr.es/m/15804-3721117bf40fb654@postgresql.org
* Fix inconsistencies and typos in the tree, take 11Michael Paquier2019-08-19
| | | | | | | | This fixes various typos in docs and comments, and removes some orphaned definitions. Author: Alexander Lakhin Discussion: https://postgr.es/m/5da8e325-c665-da95-21e0-c8a99ea61fbf@gmail.com
* Fix inconsistencies and typos in the tree, take 10Michael Paquier2019-08-13
| | | | | | | | | This addresses some issues with unnecessary code comments, fixes various typos in docs and comments, and removes some orphaned structures and definitions. Author: Alexander Lakhin Discussion: https://postgr.es/m/9aabc775-5494-b372-8bcb-4dfc0bd37c68@gmail.com
* Fix typo in fd.cMichael Paquier2019-07-28
| | | | | | | | The frontend version of walkdir() is defined in file_utils.c, and not initdb.c. Author: Sehrope Sarkuni Discussion: https://postgr.es/m/CAH7T-artawnBt4=KODNCD8Mt2ZX4CCjJT8c=_=950xjutcRZ4Q@mail.gmail.com
* Fix inconsistencies and typos in the treeMichael Paquier2019-07-16
| | | | | | | | | | | This is numbered take 7, and addresses a set of issues around: - Fixes for typos and incorrect reference names. - Removal of unneeded comments. - Removal of unreferenced functions and structures. - Fixes regarding variable name consistency. Author: Alexander Lakhin Discussion: https://postgr.es/m/10bfd4ac-3e7c-40ab-2b2e-355ed15495e8@gmail.com
* Use consistent style for checking return from system callsPeter Eisentraut2019-07-07
| | | | | | | | | | | | | | | | | Use if (something() != 0) error ... instead of just if (something) error ... The latter is not incorrect, but it's a bit confusing and not the common style. Discussion: https://www.postgresql.org/message-id/flat/5de61b6b-8be9-7771-0048-860328efe027%402ndquadrant.com
* Fix more typos and inconsistencies in the treeMichael Paquier2019-06-17
| | | | | Author: Alexander Lakhin Discussion: https://postgr.es/m/0a5419ea-1452-a4e6-72ff-545b1a5a8076@gmail.com
* Phase 2 pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | | Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Initial pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | This is still using the 2.0 version of pg_bsd_indent. I thought it would be good to commit this separately, so as to document the differences between 2.0 and 2.1 behavior. Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
* Tighten use of OpenTransientFile and CloseTransientFileMichael Paquier2019-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | This fixes two sets of issues related to the use of transient files in the backend: 1) OpenTransientFile() has been used in some code paths with read-write flags while read-only is sufficient, so switch those calls to be read-only where necessary. These have been reported by Joe Conway. 2) When opening transient files, it is up to the caller to close the file descriptors opened. In error code paths, CloseTransientFile() gets called to clean up things before issuing an error. However in normal exit paths, a lot of callers of CloseTransientFile() never actually reported errors, which could leave a file descriptor open without knowing about it. This is an issue I complained about a couple of times, but never had the courage to write and submit a patch, so here we go. Note that one frontend code path is impacted by this commit so as an error is issued when fetching control file data, making backend and frontend to be treated consistently. Reported-by: Joe Conway, Michael Paquier Author: Michael Paquier Reviewed-by: Álvaro Herrera, Georgios Kokolatos, Joe Conway Discussion: https://postgr.es/m/20190301023338.GD1348@paquier.xyz Discussion: https://postgr.es/m/c49b69ec-e2f7-ff33-4f17-0eaa4f2cef27@joeconway.com
* Tolerate EINVAL when calling fsync() on a directory.Thomas Munro2019-02-24
| | | | | | | | | | | Previously, we tolerated EBADF as a way for the operating system to indicate that it doesn't support fsync() on a directory. Tolerate EINVAL too, for older versions of Linux CIFS. Bug #15636. Back-patch all the way. Reported-by: John Klann Discussion: https://postgr.es/m/15636-d380890dafd78fc6@postgresql.org
* Tolerate ENOSYS failure from sync_file_range().Thomas Munro2019-02-24
| | | | | | | | | | | | | | | | | | | | | One unintended consequence of commit 9ccdd7f6 was that Windows WSL users started getting a panic whenever we tried to initiate data flushing with sync_file_range(), because WSL does not implement that system call. Previously, they got a stream of periodic warnings, which was also undesirable but at least ignorable. Prevent the panic by handling ENOSYS specially and skipping the panic promotion with data_sync_elevel(). Also suppress future attempts after the first such failure so that the pre-existing problem of noisy warnings is improved. Back-patch to 9.6 (older branches were not affected in this way by 9ccdd7f6). Author: Thomas Munro and James Sewell Tested-by: James Sewell Reported-by: Bruce Klein Discussion: https://postgr.es/m/CA+mCpegfOUph2U4ZADtQT16dfbkjjYNJL1bSTWErsazaFjQW9A@mail.gmail.com
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Handle EPIPE more sanely when we close a pipe reading from a program.Tom Lane2018-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, any program launched by COPY TO/FROM PROGRAM inherited the server's setting of SIGPIPE handling, i.e. SIG_IGN. Hence, if we were doing COPY FROM PROGRAM and closed the pipe early, the child process would see EPIPE on its output file and typically would treat that as a fatal error, in turn causing the COPY to report error. Similarly, one could get a failure report from a query that didn't read all of the output from a contrib/file_fdw foreign table that uses file_fdw's PROGRAM option. To fix, ensure that child programs inherit SIG_DFL not SIG_IGN processing of SIGPIPE. This seems like an all-around better situation since if the called program wants some non-default treatment of SIGPIPE, it would expect to have to set that up for itself. Then in COPY, if it's COPY FROM PROGRAM and we stop reading short of detecting EOF, treat a SIGPIPE exit from the called program as a non-error condition. This still allows us to report an error for any case where the called program gets SIGPIPE on some other file descriptor. As coded, we won't report a SIGPIPE if we stop reading as a result of seeing an in-band EOF marker (e.g. COPY BINARY EOF marker). It's somewhat debatable whether we should complain if the called program continues to transmit data after an EOF marker. However, it seems like we should avoid throwing error in any questionable cases, especially in a back-patched fix, and anyway it would take additional code to make such an error get reported consistently. Back-patch to v10. We could go further back, since COPY FROM PROGRAM has been around awhile, but AFAICS the only way to reach this situation using core or contrib is via file_fdw, which has only supported PROGRAM sources since v10. The COPY statement per se has no feature whereby it'd stop reading without having hit EOF or an error already. Therefore, I don't see any upside to back-patching further that'd outweigh the risk of complaints about behavioral change. Per bug #15449 from Eric Cyr. Patch by me, review by Etsuro Fujita and Kyotaro Horiguchi Discussion: https://postgr.es/m/15449-1cf737dd5929450e@postgresql.org
* PANIC on fsync() failure.Thomas Munro2018-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On some operating systems, it doesn't make sense to retry fsync(), because dirty data cached by the kernel may have been dropped on write-back failure. In that case the only remaining copy of the data is in the WAL. A subsequent fsync() could appear to succeed, but not have flushed the data. That means that a future checkpoint could apparently complete successfully but have lost data. Therefore, violently prevent any future checkpoint attempts by panicking on the first fsync() failure. Note that we already did the same for WAL data; this change extends that behavior to non-temporary data files. Provide a GUC data_sync_retry to control this new behavior, for users of operating systems that don't eject dirty data, and possibly forensic/testing uses. If it is set to on and the write-back error was transient, a later checkpoint might genuinely succeed (on a system that does not throw away buffers on failure); if the error is permanent, later checkpoints will continue to fail. The GUC defaults to off, meaning that we panic. Back-patch to all supported releases. There is still a narrow window for error-loss on some operating systems: if the file is closed and later reopened and a write-back error occurs in the intervening time, but the inode has the bad luck to be evicted due to memory pressure before we reopen, we could miss the error. A later patch will address that with a scheme for keeping files with dirty data open at all times, but we judge that to be too complicated to back-patch. Author: Craig Ringer, with some adjustments by Thomas Munro Reported-by: Craig Ringer Reviewed-by: Robert Haas, Thomas Munro, Andres Freund Discussion: https://postgr.es/m/20180427222842.in2e4mibx45zdth5%40alap3.anarazel.de
* Remove set-but-unused variable.Thomas Munro2018-11-07
| | | | | | | Clean-up for commit c24dcd0c. Reported-by: Andrew Dunstan Discussion: https://postgr.es/m/2d52ff4a-5440-f6f1-7806-423b0e6370cb%402ndQuadrant.com