aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access/transam
Commit message (Collapse)AuthorAge
* Prevent concurrent SimpleLruTruncate() for any given SLRU.Noah Misch2020-08-15
| | | | | | | | | | | | | | | | | The SimpleLruTruncate() header comment states the new coding rule. To achieve this, add locktype "frozenid" and two LWLocks. This closes a rare opportunity for data loss, which manifested as "apparent wraparound" or "could not access status of transaction" errors. Data loss is more likely in pg_multixact, due to released branches' thin margin between multiStopLimit and multiWrapLimit. If a user's physical replication primary logged ": apparent wraparound" messages, the user should rebuild standbys of that primary regardless of symptoms. At less risk is a cluster having emitted "not accepting commands" errors or "must be vacuumed" warnings at some point. One can test a cluster for this data loss by running VACUUM FREEZE in every database. Back-patch to 9.5 (all supported versions). Discussion: https://postgr.es/m/20190218073103.GA1434723@rfd.leadboat.com
* Fix typo.Robert Haas2020-08-06
| | | | | Per report from Tom Lane. Previously fixed in master by commit f057980149ddccd4b862d2c6b3920ed498b0d7ec.
* Fix minor problems with non-exclusive backup cleanup.Robert Haas2020-08-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous coding imagined that it could call before_shmem_exit() when a non-exclusive backup began and then remove the previously-added handler by calling cancel_before_shmem_exit() when that backup ended. However, this only works provided that nothing else in the system has registered a before_shmem_exit() hook in the interim, because cancel_before_shmem_exit() is documented to remove a callback only if it is the latest callback registered. It also only works if nothing can ERROR out between the time that sessionBackupState is reset and the time that cancel_before_shmem_exit(), which doesn't seem to be strictly true. To fix, leave the handler installed for the lifetime of the session, arrange to install it just once, and teach it to quietly do nothing if there isn't a non-exclusive backup in process. This was originally committed to master as 303640199d0436c5e7acdf50b837a027b5726594, but I did not back-patch at the time because the consequences were minor. However, now there's been a second report of this causing trouble with a slightly different test case than the one I reported originally, so now I'm back-patching as far as v11 where JIT was introduced. Patch by me, reviewed by Kyotaro Horiguchi, Michael Paquier (who preferred a different approach, but got outvoted), Fujii Masao, and Tom Lane, and with comments by various others. New problem report from Bharath Rupireddy. Discussion: http://postgr.es/m/CA+TgmobMjnyBfNhGTKQEDbqXYE3_rXWpc4CM63fhyerNCes3mA@mail.gmail.com Discussion: http://postgr.es/m/CALj2ACWk7j4F2v2fxxYfrroOF=AdFNPr1WsV+AGtHAFQOqm_pw@mail.gmail.com
* Add parens to ConvertToXSegs macroAlvaro Herrera2020-06-24
| | | | | | | The current definition is dangerous. No bugs exist in our code at present, but backpatch to 11 nonetheless where it was introduced. Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
* Fix locking bugs that could corrupt pg_control.Thomas Munro2020-06-08
| | | | | | | | | | | | | | | | | | The redo routines for XLOG_CHECKPOINT_{ONLINE,SHUTDOWN} must acquire ControlFileLock before modifying ControlFile->checkPointCopy, or the checkpointer could write out a control file with a bad checksum. Likewise, XLogReportParameters() must acquire ControlFileLock before modifying ControlFile and calling UpdateControlFile(). Back-patch to all supported releases. Author: Nathan Bossart <bossartn@amazon.com> Author: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/70BF24D6-DC51-443F-B55A-95735803842A%40amazon.com
* Prevent archive recovery from scanning non-existent WAL files.Fujii Masao2020-05-09
| | | | | | | | | | | | | | | | | | | | | Previously when there were multiple timelines listed in the history file of the recovery target timeline, archive recovery searched all of them, starting from the newest timeline to the oldest one, to find the segment to read. That is, archive recovery had to continuously fail scanning the segment until it reached the timeline that the segment belonged to. These scans for non-existent segment could be harmful on the recovery performance especially when archival area was located on the remote storage and each scan could take a long time. To address the issue, this commit changes archive recovery so that it skips scanning the timeline that the segment to read doesn't belong to. Per discussion, back-patch to all supported versions. Author: Kyotaro Horiguchi, tweaked a bit by Fujii Masao Reviewed-by: David Steele, Pavel Suderevsky, Grigory Smolkin Discussion: https://postgr.es/m/16159-f5a34a3a04dc67e0@postgresql.org Discussion: https://postgr.es/m/20200129.120222.1476610231001551715.horikyota.ntt@gmail.com
* Report missing wait event for timeline history file.Fujii Masao2020-05-08
| | | | | | | | | | | | | | | | TimelineHistoryRead and TimelineHistoryWrite wait events are reported during waiting for a read and write of a timeline history file, respectively. However, previously, TimelineHistoryRead wait event was not reported while readTimeLineHistory() was reading a timeline history file. Also TimelineHistoryWrite was not reported while writeTimeLineHistory() was writing one line with the details of the timeline split, at the end. This commit fixes these issues. Back-patch to v10 where wait events for a timeline history file was added. Author: Masahiro Ikeda Reviewed-by: Michael Paquier, Fujii Masao Discussion: https://postgr.es/m/d11b0c910b63684424e06772eb844ab5@oss.nttdata.com
* Fix handling of WAL segments ready to be archived during crash recoveryMichael Paquier2020-04-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 78ea8b5 has fixed an issue related to the recycling of WAL segments on standbys depending on archive_mode. However, it has introduced a regression with the handling of WAL segments ready to be archived during crash recovery, causing those files to be recycled without getting archived. This commit fixes the regression by tracking in shared memory if a live cluster is either in crash recovery or archive recovery as the handling of WAL segments ready to be archived is different in both cases (those WAL segments should not be removed during crash recovery), and by using this new shared memory state to decide if a segment can be recycled or not. Previously, it was not possible to know if a cluster was in crash recovery or archive recovery as the shared state was able to track only if recovery was happening or not, leading to the problem. A set of TAP tests is added to close the gap here, making sure that WAL segments ready to be archived are correctly handled when a cluster is in archive or crash recovery with archive_mode set to "on" or "always", for both standby and primary. Reported-by: Benoît Lobréau Author: Jehan-Guillaume de Rorthais Reviewed-by: Kyotaro Horiguchi, Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/20200331172229.40ee00dc@firost Backpatch-through: 9.5
* Fix possible crash during FATAL exit from reindexing.Tom Lane2020-04-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | index.c supposed that it could just use a PG_TRY block to clean up the state associated with an active REINDEX operation. However, that code doesn't run if we do a FATAL exit --- for example, due to a SIGTERM shutdown signal --- while the REINDEX is happening. And that state does get consulted during catalog accesses, which makes it problematic if we do any catalog accesses during shutdown --- for example, to clean up any temp tables created in the session. If this combination of circumstances occurred, we could find ourselves trying to access already-freed memory. In debug builds that'd fairly reliably cause an assertion failure. In production we might often get away with it, but with some bad luck it could cause a core dump. Another possible bad outcome is an erroneous conclusion that an index-to-be-accessed is being reindexed; but it looks like that would be unlikely to have any consequences worse than failing to drop temp tables right away. (They'd still get dropped by the next session that uses that temp schema.) To fix, get rid of the use of PG_TRY here, and instead hook into the transaction abort mechanisms to clean up reindex state. Per bug #16378 from Alexander Lakhin. This has been wrong for a very long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/16378-7a70ca41b3ec2009@postgresql.org
* Revert "Skip WAL for new relfilenodes, under wal_level=minimal."Noah Misch2020-03-22
| | | | | | | | This reverts commit cb2fd7eac285b1b0a24eeb2b8ed4456b66c5a09f. Per numerous buildfarm members, it was incompatible with parallel query, and a test case assumed LP64. Back-patch to 9.5 (all supported versions). Discussion: https://postgr.es/m/20200321224920.GB1763544@rfd.leadboat.com
* Skip WAL for new relfilenodes, under wal_level=minimal.Noah Misch2020-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Until now, only selected bulk operations (e.g. COPY) did this. If a given relfilenode received both a WAL-skipping COPY and a WAL-logged operation (e.g. INSERT), recovery could lose tuples from the COPY. See src/backend/access/transam/README section "Skipping WAL for New RelFileNode" for the new coding rules. Maintainers of table access methods should examine that section. To maintain data durability, just before commit, we choose between an fsync of the relfilenode and copying its contents to WAL. A new GUC, wal_skip_threshold, guides that choice. If this change slows a workload that creates small, permanent relfilenodes under wal_level=minimal, try adjusting wal_skip_threshold. Users setting a timeout on COMMIT may need to adjust that timeout, and log_min_duration_statement analysis will reflect time consumption moving to COMMIT from commands like COPY. Internally, this requires a reliable determination of whether RollbackAndReleaseCurrentSubTransaction() would unlink a relation's current relfilenode. Introduce rd_firstRelfilenodeSubid. Amend the specification of rd_createSubid such that the field is zero when a new rel has an old rd_node. Make relcache.c retain entries for certain dropped relations until end of transaction. Back-patch to 9.5 (all supported versions). This introduces a new WAL record type, XLOG_GIST_ASSIGN_LSN, without bumping XLOG_PAGE_MAGIC. As always, update standby systems before master systems. This changes sizeof(RelationData) and sizeof(IndexStmt), breaking binary compatibility for affected extensions. (The most recent commit to affect the same class of extensions was 089e4d405d0f3b94c74a2c6a54357a84a681754b.) Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert Haas. Heikki Linnakangas and Michael Paquier implemented earlier designs that materially clarified the problem. Reviewed, in earlier designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane, Fujii Masao, and Simon Riggs. Reported by Martijn van Oosterhout. Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org
* In log_newpage_range(), heed forkNum and page_std arguments.Noah Misch2020-03-21
| | | | | | | | | The function assumed forkNum=MAIN_FORKNUM and page_std=true, ignoring the actual arguments. Existing callers passed exactly those values, so there's no live bug. Back-patch to v12, where the function first appeared, because another fix needs this. Discussion: https://postgr.es/m/20191118045434.GA1173436@rfd.leadboat.com
* Avoid assertion failure with targeted recovery in standby mode.Fujii Masao2020-03-09
| | | | | | | | | | | | | | | | | | | | | | | | At the end of recovery, standby mode is turned off to re-fetch the last valid record from archive or pg_wal. Previously, if recovery target was reached and standby mode was turned off while the current WAL source was stream, recovery could try to retrieve WAL file containing the last valid record unexpectedly from stream even though not in standby mode. This caused an assertion failure. That is, the assertion test confirms that WAL file should not be retrieved from stream if standby mode is not true. This commit moves back the current WAL source to archive if it's stream even though not in standby mode, to avoid that assertion failure. This issue doesn't cause the server to crash when built with assertion disabled. In this case, the attempt to retrieve WAL file from stream not in standby mode just fails. And then recovery tries to retrieve WAL file from archive or pg_wal. Back-patch to all supported branches. Author: Kyotaro Horiguchi Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/20200227.124830.2197604521555566121.horikyota.ntt@gmail.com
* Reimplement nullification of walsender timestampAlvaro Herrera2020-01-08
| | | | | | | | | | | | | | | Make the value null only at pg_stat_activity-output time, as suggested by Tom Lane, instead of messing with the internal state. This should appease buildfarm members with force_parallel_mode=regress, which are running parallel queries on logical replication walsenders. The fact that walsenders can run parallel queries should perhaps be studied more carefully, but for the moment let's get rid of the red blots in buildfarm. Backpatch to pg10, like the previous commit. Discussion: https://postgr.es/m/30804.1578438763@sss.pgh.pa.us
* pg_stat_activity: show NULL stmt start time for walsendersAlvaro Herrera2020-01-07
| | | | | | | | | | | | | Returning a non-NULL time is pointless, sinc a walsender is not a process that would be running normal transactions anyway, but the code was unintentionally exposing the process start time intermittently, which was not only bogus but it also confused monitoring systems looking for idle transactions. Fix by avoiding all updates in walsenders. Backpatch to 11, where walsenders started appearing in pg_stat_activity. Reported-by: Tomas Vondra Discussion: https://postgr.es/m/20191209234409.exe7osmyalwkt5j4@development
* Remove shadow variables linked to RedoRecPtr in xlog.cMichael Paquier2019-12-18
| | | | | | | | | | | | | | | This changes the routines in charge of recycling WAL segments past the last redo LSN to not use anymore "RedoRecPtr" as a local variable, which is also available in the context of the session as a static declaration, replacing it with "lastredoptr". This confusion has been introduced by d9fadbf, so backpatch down to v11 like the other commit. Thanks to Tom Lane, Robert Haas, Alvaro Herrera, Mark Dilger and Kyotaro Horiguchi for the input provided. Author: Ranier Vilela Discussion: https://postgr.es/m/MN2PR18MB2927F7B5F690065E1194B258E35D0@MN2PR18MB2927.namprd18.prod.outlook.com Backpatch-through: 11
* Change overly strict Assert in TransactionGroupUpdateXidStatus.Amit Kapila2019-12-17
| | | | | | | | | | | | | | | | | | This Assert thought that an overflowed transaction can never get registered for the group update.  But that is not true, because even when the number of children for a transaction got reduced, the overflow flag is not changed. And, for group update, we only care about the current number of children for a transaction that is being committed. Based on comments by Andres Freund, remove a redundant Assert in TransactionIdSetPageStatus as we already had a static Assert for the same condition a few lines earlier. Reported-by: Vignesh C Author: Dilip Kumar Reviewed-by: Amit Kapila Backpatch-through: 11 Discussion: https://postgr.es/m/CAFiTN-s5=uJw-Z6JC9gcqtBSjXsrHnU63PXBrA=pnBjqnkm5UA@mail.gmail.com
* Avoid assertion failure with LISTEN in a serializable transaction.Tom Lane2019-11-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If LISTEN is the only action in a serializable-mode transaction, and the session was not previously listening, and the notify queue is not empty, predicate.c reported an assertion failure. That happened because we'd acquire the transaction's initial snapshot during PreCommit_Notify, which was called *after* predicate.c expects any such snapshot to have been established. To fix, just swap the order of the PreCommit_Notify and PreCommit_CheckForSerializationFailure calls during CommitTransaction. This will imply holding the notify-insertion lock slightly longer, but the difference could only be meaningful in serializable mode, which is an expensive option anyway. It appears that this is just an assertion failure, with no consequences in non-assert builds. A snapshot used only to scan the notify queue could not have been involved in any serialization conflicts, so there would be nothing for PreCommit_CheckForSerializationFailure to do except assign it a prepareSeqNo and set the SXACT_FLAG_PREPARED flag. And given no conflicts, neither of those omissions affect the behavior of ReleasePredicateLocks. This admittedly once-over-lightly analysis is backed up by the lack of field reports of trouble. Per report from Mark Dilger. The bug is old, so back-patch to all supported branches; but the new test case only goes back to 9.6, for lack of adequate isolationtester infrastructure before that. Discussion: https://postgr.es/m/3ac7f397-4d5f-be8e-f354-440020675694@gmail.com Discussion: https://postgr.es/m/13881.1574557302@sss.pgh.pa.us
* Fix assertion failure when running pgbench -s.Fujii Masao2019-11-07
| | | | | | | | | | | | | | | | | | If there is the WAL page that the continuation WAL record just fits within (i.e., the continuation record ends just at the end of the page) and the LSN in such page is specified with -s option, previously pg_waldump caused an assertion failure. The cause of this assertion failure was that XLogFindNextRecord() that pg_waldump -s calls mistakenly handled such special WAL page. This commit changes XLogFindNextRecord() so that it can handle such WAL page correctly. Back-patch to all supported versions. Author: Andrey Lepikhov Reviewed-by: Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/99303554-5dd5-06e6-f943-b3005ccd6edd@postgrespro.ru
* Fix initialization of fake LSN for unlogged relationsMichael Paquier2019-10-27
| | | | | | | | | | | | | | 9155580 has changed the value of the first fake LSN for unlogged relations from 1 to FirstNormalUnloggedLSN (aka 1000), GiST requiring a non-zero LSN on some pages to allow an interlocking logic to work, but its value was still initialized to 1 at the beginning of recovery or after running pg_resetwal. This fixes the initialization for both code paths. Author: Takayuki Tsunakawa Reviewed-by: Dilip Kumar, Kyotaro Horiguchi, Michael Paquier Discussion: https://postgr.es/m/OSBPR01MB2503CE851940C17DE44AE3D9FE6F0@OSBPR01MB2503.jpnprd01.prod.outlook.com Backpatch-through: 12
* Fix failure of archive recovery with recovery_min_apply_delay enabled.Fujii Masao2019-10-18
| | | | | | | | | | | | | | | | | | | | | | | | recovery_min_apply_delay parameter is intended for use with streaming replication deployments. However, the document clearly explains that the parameter will be honored in all cases if it's specified. So it should take effect even if in archive recovery. But, previously, archive recovery with recovery_min_apply_delay enabled always failed, and caused assertion failure if --enable-caasert is enabled. The cause of this problem is that; the ownership of recoveryWakeupLatch that recovery_min_apply_delay uses was taken only when standby mode is requested. So unowned latch could be used in archive recovery, and which caused the failure. This commit changes recovery code so that the ownership of recoveryWakeupLatch is taken even in archive recovery. Which prevents archive recovery with recovery_min_apply_delay from failing. Back-patch to v9.4 where recovery_min_apply_delay was added. Author: Fujii Masao Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CAHGQGwEyD6HdZLfdWc+95g=VQFPR4zQL4n+yHxQgGEGjaSVheQ@mail.gmail.com
* Make crash recovery ignore recovery_min_apply_delay setting.Fujii Masao2019-10-18
| | | | | | | | | | | | | | | | | | | In v11 or before, this setting could not take effect in crash recovery because it's specified in recovery.conf and crash recovery always starts without recovery.conf. But commit 2dedf4d9a8 integrated recovery.conf into postgresql.conf and which unexpectedly allowed this setting to take effect even in crash recovery. This is definitely not good behavior. To fix the issue, this commit makes crash recovery always ignore recovery_min_apply_delay setting. Back-patch to v12 where the issue was added. Author: Fujii Masao Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CAHGQGwEyD6HdZLfdWc+95g=VQFPR4zQL4n+yHxQgGEGjaSVheQ@mail.gmail.com Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net
* Make crash recovery ignore restore_command and recovery_end_command settings.Fujii Masao2019-10-11
| | | | | | | | | | | | | | | | | | In v11 or before, those settings could not take effect in crash recovery because they are specified in recovery.conf and crash recovery always starts without recovery.conf. But commit 2dedf4d9a8 integrated recovery.conf into postgresql.conf and which unexpectedly allowed those settings to take effect even in crash recovery. This is definitely not good behavior. To fix the issue, this commit makes crash recovery always ignore restore_command and recovery_end_command settings. Back-patch to v12 where the issue was added. Author: Fujii Masao Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net
* Remove temporary WAL and history files at the end of archive recoveryMichael Paquier2019-10-02
| | | | | | | | | | | | | | | cbc55da has reworked the order of some actions at the end of archive recovery. Unfortunately this overlooked the fact that the startup process needs to remove RECOVERYXLOG (for temporary WAL segment newly recovered from archives) and RECOVERYHISTORY (for temporary history file) at this step, leaving the files around even after recovery ended. Backpatch to 9.5, like the previous commit. Author: Sawada Masahiko Reviewed-by: Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/CAD21AoBO_eDQub6zojFnWtnmutRBWvYf7=cW4Hsqj+U_R26w3Q@mail.gmail.com Backpatch-through: 9.5
* Make crash recovery ignore recovery target settings.Fujii Masao2019-09-30
| | | | | | | | | | | | | | | | | | In v11 or before, recovery target settings could not take effect in crash recovery because they are specified in recovery.conf and crash recovery always starts without recovery.conf. But commit 2dedf4d9a8 integrated recovery.conf into postgresql.conf and which unexpectedly allowed recovery target settings to take effect even in crash recovery. This is definitely not good behavior. To fix the issue, this commit makes crash recovery always ignore recovery target settings. Back-patch to v12. Author: Peter Eisentraut Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net
* Message style fixesPeter Eisentraut2019-09-23
|
* Fix behavior of AND CHAIN outside of explicit transaction blocksPeter Eisentraut2019-09-08
| | | | | | | | | | | | | When using COMMIT AND CHAIN or ROLLBACK AND CHAIN not in an explicit transaction block, the previous implementation would leave a transaction block active in the ROLLBACK case but not the COMMIT case. To fix for now, error out when using these commands not in an explicit transaction block. This restriction could be lifted if a sensible definition and implementation is found. Bug: #15977 Author: fn ln <emuser20140816@gmail.com> Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
* Make pg_promote() detect postmaster death while waiting for promotion to end.Fujii Masao2019-09-06
| | | | | | | | | | | | | | | | Previously even if postmaster died and WaitLatch() woke up with that event while pg_promote() was waiting for the standby promotion to finish, pg_promote() did nothing special and kept waiting until timeout occurred. This could cause a busy loop. This patch make pg_promote() return false immediately when postmaster dies, to avoid such a busy loop. Back-patch to v12 where pg_promote() was added. Author: Fujii Masao Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CAHGQGwEs9ROgSp+QF+YdDU+xP8W=CY1k-_Ov-d_Z3JY+to3eXA@mail.gmail.com
* Better error messages for short reads/writes in SLRUPeter Eisentraut2019-09-03
| | | | | | | | | | | | This avoids getting a Could not read from file ...: Success. for a short read or write (since errno is not set in that case). Instead, report a more specific error messages. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/5de61b6b-8be9-7771-0048-860328efe027%402ndquadrant.com
* Don't call data type input functions in GUC check hooksPeter Eisentraut2019-06-30
| | | | | | | | | | | | | | | | | | | | Instead of calling pg_lsn_in() in check_recovery_target_lsn and timestamptz_in() in check_recovery_target_time, reorganize the respective code so that we don't raise any errors in the check hooks. The previous code tried to use PG_TRY/PG_CATCH to handle errors in a way that is not safe, so now the code contains no ereport() calls and can operate safely within the GUC error handling system. Moreover, since the interpretation of the recovery_target_time string may depend on the time zone, we cannot do the final processing of that string until all the GUC processing is done. Instead, check_recovery_target_time() now does some parsing for syntax checking, but the actual conversion to a timestamptz value is done later in the recovery code that uses it. Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/20190611061115.njjwkagvxp4qujhp%40alap3.anarazel.de
* Fix more typos and inconsistencies in the treeMichael Paquier2019-06-17
| | | | | Author: Alexander Lakhin Discussion: https://postgr.es/m/0a5419ea-1452-a4e6-72ff-545b1a5a8076@gmail.com
* Update stale comments, and fix comment typos.Noah Misch2019-06-08
|
* Fix assorted inconsistencies.Amit Kapila2019-06-08
| | | | | | | | | | There were a number of issues in the recent commits which include typos, code and comments mismatch, leftover function declarations. Fix them. Reported-by: Alexander Lakhin Author: Alexander Lakhin, Amit Kapila and Amit Langote Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/ef0c0232-0c1d-3a35-63d4-0ebd06e31387@gmail.com
* Phase 2 pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | | Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Initial pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | This is still using the 2.0 version of pg_bsd_indent. I thought it would be good to commit this separately, so as to document the differences between 2.0 and 2.1 behavior. Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
* Fix comment for issue_xlog_fsync().Fujii Masao2019-05-21
| | | | | | | "segno" is the argument for the function, not "log" and "seg". Author: Antonin Houska Discussion: https://postgr.es/m/11863.1558361020@spoje.net
* Clean up the behavior and API of catalog.c's is-catalog-relation tests.Tom Lane2019-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The right way for IsCatalogRelation/Class to behave is to return true for OIDs less than FirstBootstrapObjectId (not FirstNormalObjectId), without any of the ad-hoc fooling around with schema membership. The previous code was wrong because (1) it claimed that information_schema tables were not catalog relations but their toast tables were, which is silly; and (2) if you dropped and recreated information_schema, which is a supported operation, the behavior changed. That's even sillier. With this definition, "catalog relations" are exactly the ones traceable to the postgres.bki data, which seems like what we want. With this simplification, we don't actually need access to the pg_class tuple to identify a catalog relation; we only need its OID. Hence, replace IsCatalogClass with "IsCatalogRelationOid(oid)". But keep IsCatalogRelation as a convenience function. This allows fixing some arguably-wrong semantics in contrib/sepgsql and ReindexRelationConcurrently, which were using an IsSystemNamespace test where what they really should be using is IsCatalogRelationOid. The previous coding failed to protect toast tables of system catalogs, and also was not on board with the general principle that user-created tables do not become catalogs just by virtue of being renamed into pg_catalog. We can also get rid of a messy hack in ReindexMultipleTables. While we're at it, also rename IsSystemNamespace to IsCatalogNamespace, because the previous name invited confusion with the more expansive semantics used by IsSystemRelation/Class. Also improve the comments in catalog.c. There are a few remaining places in replication-related code that are special-casing OIDs below FirstNormalObjectId. I'm inclined to think those are wrong too, and if there should be any special case it should just extend to FirstBootstrapObjectId. But first we need to debate whether a FOR ALL TABLES publication should include information_schema. Discussion: https://postgr.es/m/21697.1557092753@sss.pgh.pa.us Discussion: https://postgr.es/m/15150.1557257111@sss.pgh.pa.us
* Remove leftover reference to old "flat file" mechanism in a comment.Heikki Linnakangas2019-05-08
| | | | The flat file mechanism was removed in PostgreSQL 9.0.
* Revert "Avoid the creation of the free space map for small heap relations".Amit Kapila2019-05-07
| | | | | | | | | | | | | | | | | | | | | This feature was using a process local map to track the first few blocks in the relation. The map was reset each time we get the block with enough freespace. It was discussed that it would be better to track this map on a per-relation basis in relcache and then invalidate the same whenever vacuum frees up some space in the page or when FSM is created. The new design would be better both in terms of API design and performance. List of commits reverted, in reverse chronological order: 06c8a5090e Improve code comments in b0eaa4c51b. 13e8643bfc During pg_upgrade, conditionally skip transfer of FSMs. 6f918159a9 Add more tests for FSM. 9c32e4c350 Clear the local map when not used. 29d108cdec Update the documentation for FSM behavior.. 08ecdfe7e5 Make FSM test portable. b0eaa4c51b Avoid creation of the free space map for small heap relations. Discussion: https://postgr.es/m/20190416180452.3pm6uegx54iitbt5@alap3.anarazel.de
* Message style fixesAlvaro Herrera2019-04-30
|
* Simplify some ERROR paths clearing wait events and transient filesMichael Paquier2019-04-17
| | | | | | | | | | | | | | Transient files and wait events get normally cleaned up when seeing an exception (be it in the context of a transaction for a backend or another process like the checkpointer), hence there is little point in complicating error code paths to do this work. This shaves a bit of code, and removes some extra handling with errno which needed to be preserved during the cleanup steps done. Reported-by: Masahiko Sawada Author: Michael Paquier Reviewed-by: Tom Lane, Masahiko Sawada Discussion: https://postgr.es/m/CAD21AoDhHYVq5KkXfkaHhmjA-zJYj-e4teiRAJefvXuKJz1tKQ@mail.gmail.com
* Fix GetNewTransactionId()'s interaction with xidVacLimit.Thomas Munro2019-04-12
| | | | | | | | | | | | | | | Commit ad308058 switched to returning a FullTransactionId, but failed to load the potentially updated value in the case where xidVacLimit is reached and we release and reacquire the lock. Repair, closing bug #15727. While reviewing that commit, also fix the size computation used by EstimateTransactionStateSize() and switch to the mul_size() macro traditionally used in such expressions. Author: Thomas Munro Reported-by: Roman Zharkov Discussion: https://postgr.es/m/15727-0be246e7d852d229%40postgresql.org
* Avoid counting transaction stats for parallel worker cooperatingAmit Kapila2019-04-10
| | | | | | | | | | | | | | | | | | | | | | transaction. The transaction that is initiated by the parallel worker to cooperate with the actual transaction started by the main backend to complete the query execution should not be counted as a separate transaction. The other internal transactions started and committed by the parallel worker are still counted as separate transactions as we that is what we do in other places like autovacuum. This will partially fix the bloat in transaction stats due to additional transactions performed by parallel workers. For a complete fix, we need to decide how we want to show all the transactions that are started internally for various operations and that is a matter of separate patch. Reported-by: Haribabu Kommi Author: Haribabu Kommi Reviewed-by: Amit Kapila, Jamison Kirk and Rahila Syed Backpatch-through: 9.6 Discussion: https://postgr.es/m/CAJrrPGc9=jKXuScvNyQ+VNhO0FZk7LLAShAJRyZjnedd2D61EQ@mail.gmail.com
* Refactor the fsync queue for wider use.Thomas Munro2019-04-04
| | | | | | | | | | | | | | | | | | | | | | | Previously, md.c and checkpointer.c were tightly integrated so that fsync calls could be handed off and processed in the background. Introduce a system of callbacks and file tags, so that other modules can hand off fsync work in the same way. For now only md.c uses the new interface, but other users are being proposed. Since there may be use cases that are not strictly SMGR implementations, use a new function table for sync handlers rather than extending the traditional SMGR one. Instead of using a bitmapset of segment numbers for each RelFileNode in the checkpointer's hash table, make the segment number part of the key. This requires sending explicit "forget" requests for every segment individually when relations are dropped, but suits the file layout schemes of proposed future users better (ie sparse or high segment numbers). Author: Shawn Debnath and Thomas Munro Reviewed-by: Thomas Munro, Andres Freund Discussion: https://postgr.es/m/CAEepm=2gTANm=e3ARnJT=n0h8hf88wqmaZxk0JYkxw+b21fNrw@mail.gmail.com
* Log all statements from a sample of transactionsAlvaro Herrera2019-04-03
| | | | | | | | This is useful to obtain a view of the different transaction types in an application, regardless of the durations of the statements each runs. Author: Adrien Nayrat Reviewed-by: Masahiko Sawada, Hayato Kuroda, Andres Freund
* Generate less WAL during GiST, GIN and SP-GiST index build.Heikki Linnakangas2019-04-03
| | | | | | | | | | | | | | | | | | | | Instead of WAL-logging every modification during the build separately, first build the index without any WAL-logging, and make a separate pass through the index at the end, to write all pages to the WAL. This significantly reduces the amount of WAL generated, and is usually also faster, despite the extra I/O needed for the extra scan through the index. WAL generated this way is also faster to replay. For GiST, the LSN-NSN interlock makes this a little tricky. All pages must be marked with a valid (i.e. non-zero) LSN, so that the parent-child LSN-NSN interlock works correctly. We now use magic value 1 for that during index build. Change the fake LSN counter to begin from 1000, so that 1 is safely smaller than any real or fake LSN. 2 would've been enough for our purposes, but let's reserve a bigger range, in case we need more special values in the future. Author: Anastasia Lubennikova, Andrey V. Lepikhov Reviewed-by: Heikki Linnakangas, Dmitry Dolgov
* Add support for partial TOAST decompressionStephen Frost2019-04-02
| | | | | | | | | | | | | | | | When asked for a slice of a TOAST entry, decompress enough to return the slice instead of decompressing the entire object. For use cases where the slice is at, or near, the beginning of the entry, this avoids a lot of unnecessary decompression work. This changes the signature of pglz_decompress() by adding a boolean to indicate if it's ok for the call to finish before consuming all of the source or destination buffers. Author: Paul Ramsey Reviewed-By: Rafia Sabih, Darafei Praliaskouski, Regina Obe Discussion: https://postgr.es/m/CACowWR07EDm7Y4m2kbhN_jnys%3DBBf9A6768RyQdKm_%3DNpkcaWg%40mail.gmail.com
* Add wal_recycle and wal_init_zero GUCs.Thomas Munro2019-04-02
| | | | | | | | | | | | On at least ZFS, it can be beneficial to create new WAL files every time and not to bother zero-filling them. Since it's not clear which other filesystems might benefit from one or both of those things, add individual GUCs to control those two behaviors independently and make only very general statements in the docs. Author: Jerry Jelinek, with some adjustments by Thomas Munro Reviewed-by: Alvaro Herrera, Andres Freund, Tomas Vondra, Robert Haas and others Discussion: https://postgr.es/m/CACPQ5Fo00QR7LNAcd1ZjgoBi4y97%2BK760YABs0vQHH5dLdkkMA%40mail.gmail.com
* Unified logging system for command-line programsPeter Eisentraut2019-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This unifies the various ad hoc logging (message printing, error printing) systems used throughout the command-line programs. Features: - Program name is automatically prefixed. - Message string does not end with newline. This removes a common source of inconsistencies and omissions. - Additionally, a final newline is automatically stripped, simplifying use of PQerrorMessage() etc., another common source of mistakes. - I converted error message strings to use %m where possible. - As a result of the above several points, more translatable message strings can be shared between different components and between frontends and backend, without gratuitous punctuation or whitespace differences. - There is support for setting a "log level". This is not meant to be user-facing, but can be used internally to implement debug or verbose modes. - Lazy argument evaluation, so no significant overhead if logging at some level is disabled. - Some color in the messages, similar to gcc and clang. Set PG_COLOR=auto to try it out. Some colors are predefined, but can be customized by setting PG_COLORS. - Common files (common/, fe_utils/, etc.) can handle logging much more simply by just using one API without worrying too much about the context of the calling program, requiring callbacks, or having to pass "progname" around everywhere. - Some programs called setvbuf() to make sure that stderr is unbuffered, even on Windows. But not all programs did that. This is now done centrally. Soft goals: - Reduces vertical space use and visual complexity of error reporting in the source code. - Encourages more deliberate classification of messages. For example, in some cases it wasn't clear without analyzing the surrounding code whether a message was meant as an error or just an info. - Concepts and terms are vaguely aligned with popular logging frameworks such as log4j and Python logging. This is all just about printing stuff out. Nothing affects program flow (e.g., fatal exits). The uses are just too varied to do that. Some existing code had wrappers that do some kind of print-and-exit, and I adapted those. I tried to keep the output mostly the same, but there is a lot of historical baggage to unwind and special cases to consider, and I might not always have succeeded. One significant change is that pg_rewind used to write all error messages to stdout. That is now changed to stderr. Reviewed-by: Donald Dong <xdong@csumb.edu> Reviewed-by: Arthur Zakirov <a.zakirov@postgrespro.ru> Discussion: https://www.postgresql.org/message-id/flat/6a609b43-4f57-7348-6480-bd022f924310@2ndquadrant.com
* Warn more strongly about the dangers of exclusive backup mode.Robert Haas2019-03-29
| | | | | | | | | | | Especially, warn about the hazards of mishandling the backup_label file. Adjust a couple of server messages to be more clear about the hazards associated with removing backup_label files, too. David Steele and Robert Haas, reviewed by Laurenz Albe, Martín Marqués, Peter Eisentraut, and Magnus Hagander. Discussion: http://postgr.es/m/7d85c387-000e-16f0-e00b-50bf83c22127@pgmasters.net