aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access/transam/clog.c
Commit message (Collapse)AuthorAge
* Update copyright for 2018Bruce Momjian2018-01-02
| | | | Backpatch-through: certain files through 9.3
* Change TRUE/FALSE to true/falsePeter Eisentraut2017-11-08
| | | | | | | | | | | | | | The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Fix access-off-end-of-array in clog.c.Tom Lane2017-10-06
| | | | | | | | | | | | | Sloppy loop coding in set_status_by_pages() resulted in fetching one array element more than it should from the subxids[] array. The odds of this resulting in SIGSEGV are pretty small, but we've certainly seen that happen with similar mistakes elsewhere. While at it, we can get rid of an extra TransactionIdToPage() calculation per loop. Per report from David Binderman. Back-patch to all supported branches, since this code is quite old. Discussion: https://postgr.es/m/HE1PR0802MB2331CBA919CBFFF0C465EB429C710@HE1PR0802MB2331.eurprd08.prod.outlook.com
* Use group updates when setting transaction status in clog.Robert Haas2017-09-01
| | | | | | | | | | | | | | | Commit 0e141c0fbb211bdd23783afa731e3eef95c9ad7a introduced a mechanism to reduce contention on ProcArrayLock by having a single process clear XIDs in the procArray on behalf of multiple processes, reducing the need to hand the lock around. A previous attempt to introduce a similar mechanism for CLogControlLock in ccce90b398673d55b0387b3de66639b1b30d451b crashed and burned, but the design problem which resulted in those failures is believed to have been corrected in this version. Amit Kapila, with some cosmetic changes by me. See the previous commit message for additional credits. Discussion: http://postgr.es/m/CAA4eK1KudxzgWhuywY_X=yeSAhJMT4DwCjroV5Ay60xaeB2Eew@mail.gmail.com
* Remove incorrect assertion in clog.cRobert Haas2017-08-10
| | | | | | | | | | | We must advance the oldest XID that can be safely looked up in clog *before* truncating CLOG, and the oldest XID that can't be reused *after* truncating CLOG. This assertion, and the accompanying comment, are confused; remove them. Reported by Neha Sharma. Discussion: http://postgr.es/m/CANiYTQumC3T=UMBMd1Hor=5XWZYuCEQBioL3ug0YtNQCMMT5wQ@mail.gmail.com
* Phase 3 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Phase 2 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Post-PG 10 beta1 pgindent runBruce Momjian2017-05-17
| | | | perltidy run not included.
* Fsync directory after creating or unlinking file.Teodor Sigaev2017-03-27
| | | | | | | | | If file was created/deleted just before powerloss it's possible that file system will miss that. To prevent it, call fsync() where creating/ unlinkg file is critical. Author: Michael Paquier Reviewed-by: Ashutosh Bapat, Takayuki Tsunakawa, me
* Track the oldest XID that can be safely looked up in CLOG.Robert Haas2017-03-23
| | | | | | | | | | | | | | This provides infrastructure for looking up arbitrary, user-supplied XIDs without a risk of scary-looking failures from within the clog module. Normally, the oldest XID that can be safely looked up in CLOG is the same as the oldest XID that can reused without causing wraparound, and the latter is already tracked. However, while truncation is in progress, the values are different, so we must keep track of them separately. Craig Ringer, reviewed by Simon Riggs and by me. Discussion: http://postgr.es/m/CAMsr+YHQiWNEi0daCTboS40T+V5s_+dst3PYv_8v2wNVH+Xx4g@mail.gmail.com
* Rename "pg_clog" directory to "pg_xact".Robert Haas2017-03-17
| | | | | | | | | | | Names containing the letters "log" sometimes confuse users into believing that only non-critical data is present. It is hoped this renaming will discourage ill-considered removals of transaction status data. Michael Paquier Discussion: http://postgr.es/m/CA+Tgmoa9xFQyjRZupbdEFuwUerFTvC6HjZq1ud6GYragGDFFgA@mail.gmail.com
* Revert "Use group updates when setting transaction status in clog."Robert Haas2017-03-10
| | | | | | | | This reverts commit ccce90b398673d55b0387b3de66639b1b30d451b. This optimization is unsafe, at least, of rollbacks and rollbacks to savepoints, but I'm concerned there may be other problematic cases as well. Therefore, I've decided to revert this pending further investigation.
* Use group updates when setting transaction status in clog.Robert Haas2017-03-09
| | | | | | | | | | | | | | | | | | | | | | Commit 0e141c0fbb211bdd23783afa731e3eef95c9ad7a introduced a mechanism to reduce contention on ProcArrayLock by having a single process clear XIDs in the procArray on behalf of multiple processes, reducing the need to hand the lock around. Use a similar mechanism to reduce contention on CLogControlLock. Testing shows that this very significantly reduces the amount of time waiting for CLogControlLock on high-concurrency pgbench tests run on a large multi-socket machines; whether that translates into a TPS improvement depends on how much of that contention is simply shifted to some other lock, particularly WALWriteLock. Amit Kapila, with some cosmetic changes by me. Extensively reviewed, tested, and benchmarked over a period of about 15 months by Simon Riggs, Robert Haas, Andres Freund, Jesper Pedersen, and especially by Tomas Vondra and Dilip Kumar. Discussion: http://postgr.es/m/CAA4eK1L_snxM_JcrzEstNq9P66++F4kKFce=1r5+D1vzPofdtg@mail.gmail.com Discussion: http://postgr.es/m/CAA4eK1LyR2A+m=RBSZ6rcPEwJ=rVi1ADPSndXHZdjn56yqO6Vg@mail.gmail.com Discussion: http://postgr.es/m/91d57161-d3ea-0cc2-6066-80713e4f90d7@2ndquadrant.com
* Update copyright via script for 2017Bruce Momjian2017-01-03
|
* Increase maximum number of clog buffers.Andres Freund2016-04-08
| | | | | | | | | | | | | | | | | | | | | | Benchmarking has shown that the current number of clog buffers limits scalability. We've previously increased the number in 33aaa139, but that's not sufficient with a large number of clients. We've benchmarked the cost of increasing the limit by benchmarking worst case scenarios; testing showed that 128 buffers don't cause a regression, even in contrived scenarios, whereas 256 does There are a number of more complex patches flying around to address various clog scalability problems, but this is simple enough that we can get it into 9.6; and is beneficial even after those patches have been applied. It is a bit unsatisfactory to increase this in small steps every few releases, but a better solution seems to require a rewrite of slru.c; not something done quickly. Author: Amit Kapila and Andres Freund Discussion: CAA4eK1+-=18HOrdqtLXqOMwZDbC_15WTyHiFruz7BvVArZPaAw@mail.gmail.com
* Make all built-in lwlock tranche IDs fixed.Robert Haas2016-02-02
| | | | | | | This makes the values more stable, which seems like a good thing for anybody who needs to look at at them. Alexander Korotkov and Amit Kapila
* Update copyright for 2016Bruce Momjian2016-01-02
| | | | Backpatch certain files through 9.1
* Move each SLRU's lwlocks to a separate tranche.Robert Haas2015-11-12
| | | | | | | | | | This makes it significantly easier to identify these lwlocks in LWLOCK_STATS or Trace_lwlocks output. It's also arguably better from a modularity standpoint, since lwlock.c no longer needs to know anything about the LWLock needs of the higher-level SLRU facility. Ildus Kurbangaliev, reviewd by Álvaro Herrera and by me.
* Update copyright for 2015Bruce Momjian2015-01-06
| | | | Backpatch certain files through 9.0
* Fix typosAlvaro Herrera2014-12-03
|
* Revamp the WAL record format.Heikki Linnakangas2014-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each WAL record now carries information about the modified relation and block(s) in a standardized format. That makes it easier to write tools that need that information, like pg_rewind, prefetching the blocks to speed up recovery, etc. There's a whole new API for building WAL records, replacing the XLogRecData chains used previously. The new API consists of XLogRegister* functions, which are called for each buffer and chunk of data that is added to the record. The new API also gives more control over when a full-page image is written, by passing flags to the XLogRegisterBuffer function. This also simplifies the XLogReadBufferForRedo() calls. The function can dig the relation and block number from the WAL record, so they no longer need to be passed as arguments. For the convenience of redo routines, XLogReader now disects each WAL record after reading it, copying the main data part and the per-block data into MAXALIGNed buffers. The data chunks are not aligned within the WAL record, but the redo routines can assume that the pointers returned by XLogRecGet* functions are. Redo routines are now passed the XLogReaderState, which contains the record in the already-disected format, instead of the plain XLogRecord. The new record format also makes the fixed size XLogRecord header smaller, by removing the xl_len field. The length of the "main data" portion is now stored at the end of the WAL record, and there's a separate header after XLogRecord for it. The alignment padding at the end of XLogRecord is also removed. This compansates for the fact that the new format would otherwise be more bulky than the old format. Reviewed by Andres Freund, Amit Kapila, Michael Paquier, Alvaro Herrera, Fujii Masao.
* Move the backup-block logic from XLogInsert to a new file, xloginsert.c.Heikki Linnakangas2014-11-06
| | | | | | | | | | | | xlog.c is huge, this makes it a little bit smaller, which is nice. Functions related to putting together the WAL record are in xloginsert.c, and the lower level stuff for managing WAL buffers and such are in xlog.c. Also move the definition of XLogRecord to a separate header file. This causes churn in the #includes of all the files that write WAL records, and redo routines, but it avoids pulling in xlog.h into most places. Reviewed by Michael Paquier, Alvaro Herrera, Andres Freund and Amit Kapila.
* pgindent run for 9.4Bruce Momjian2014-05-06
| | | | | This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
* Update copyright for 2014Bruce Momjian2014-01-07
| | | | | Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.
* Fix Hot-Standby initialization of clog and subtrans.Heikki Linnakangas2013-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These bugs can cause data loss on standbys started with hot_standby=on at the moment they start to accept read only queries, by marking committed transactions as uncommited. The likelihood of such corruptions is small unless the primary has a high transaction rate. 5a031a5556ff83b8a9646892715d7fef415b83c3 fixed bugs in HS's startup logic by maintaining less state until at least STANDBY_SNAPSHOT_PENDING state was reached, missing the fact that both clog and subtrans are written to before that. This only failed to fail in common cases because the usage of ExtendCLOG in procarray.c was superflous since clog extensions are actually WAL logged. f44eedc3f0f347a856eea8590730769125964597/I then tried to fix the missing extensions of pg_subtrans due to the former commit's changes - which are not WAL logged - by performing the extensions when switching to a state > STANDBY_INITIALIZED and not performing xid assignments before that - again missing the fact that ExtendCLOG is unneccessary - but screwed up twice: Once because latestObservedXid wasn't updated anymore in that state due to the earlier commit and once by having an off-by-one error in the loop performing extensions. This means that whenever a CLOG_XACTS_PER_PAGE (32768 with default settings) boundary was crossed between the start of the checkpoint recovery started from and the first xl_running_xact record old transactions commit bits in pg_clog could be overwritten if they started and committed in that window. Fix this mess by not performing ExtendCLOG() in HS at all anymore since it's unneeded and evidently dangerous and by performing subtrans extensions even before reaching STANDBY_SNAPSHOT_PENDING. Analysis and patch by Andres Freund. Reported by Christophe Pettus. Backpatch down to 9.0, like the previous commit that caused this.
* Update copyrights for 2013Bruce Momjian2013-01-01
| | | | | Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
* Remove obsolete XLogRecPtr macrosAlvaro Herrera2012-12-28
| | | | | | | | | | | | | | | | | This gets rid of XLByteLT, XLByteLE, XLByteEQ and XLByteAdvance. These were useful for brevity when XLogRecPtrs were split in xlogid/xrecoff; but now that they are simple uint64's, they are just clutter. The only downside to making this change would be ease of backporting patches, but that has been negated by other substantive changes to the involved code anyway. The clarity of simpler expressions makes the change worthwhile. Most of the changes are mechanical, but in a couple of places, the patch author chose to invert the operator sense, making the code flow more logical (and more in line with preceding comments). Author: Andres Freund Eyeballed by Dimitri Fontaine and Alvaro Herrera
* Split out rmgr rm_desc functions into their own filesAlvaro Herrera2012-11-28
| | | | | This is necessary (but not sufficient) to have them compilable outside of a backend environment.
* Run pgindent on 9.2 source tree in preparation for first 9.3Bruce Momjian2012-06-10
| | | | commit-fest.
* Assorted comment fixes, mostly just typos, but some obsolete statements.Tom Lane2012-01-29
| | | | YAMAMOTO Takashi
* Make the number of CLOG buffers adaptive, based on shared_buffers.Robert Haas2012-01-06
| | | | | | | | | | | | Previously, this was hardcoded: we always had 8. Performance testing shows that isn't enough, especially on big SMP systems, so we allow it to scale up as high as 32 when there's adequate memory. On the flip side, when shared_buffers is very small, drop the number of CLOG buffers down to as little as 4, so that we can start the postmaster even when very little shared memory is available. Per extensive discussion with Simon Riggs, Tom Lane, and others on pgsql-hackers.
* Update copyright notices for year 2012.Bruce Momjian2012-01-01
|
* Fix timing of Startup CLOG and MultiXact during Hot StandbySimon Riggs2011-11-02
| | | | Patch by me, bug report by Chris Redekop, analysis by Florian Pflug
* Use callbacks in SlruScanDirectory for the actual actionAlvaro Herrera2011-10-04
| | | | | | | | | | | | Previously, the code assumed that the only possible action to take was to delete files behind a certain cutoff point. The async notify code was already a crock: it used a different "pagePrecedes" function for truncation than for regular operation. By allowing it to pass a callback to SlruScanDirectory it can do cleanly exactly what it needs to do. The clog.c code also had its own use for SlruScanDirectory, which is made a bit simpler with this.
* Remove unnecessary #include references, per pgrminclude script.Bruce Momjian2011-09-01
|
* Fix assorted typosAlvaro Herrera2011-05-12
|
* Stamp copyrights for year 2011.Bruce Momjian2011-01-01
|
* Avoid unnecessary public struct declaration in slru.hAlvaro Herrera2010-12-30
| | | | | | | | Instead, declare a public wrapper of the sole function using it for external callers, so that they don't have to always pass a NULL argument. Author: Kevin Grittner
* Remove cvs keywords from all files.Magnus Hagander2010-09-20
|
* Update copyright for the year 2010.Bruce Momjian2010-01-02
|
* Allow read only connections during recovery, known as Hot Standby.Simon Riggs2009-12-19
| | | | | | | | | | | | Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record. New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far. This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required. Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit. Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
* 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef listBruce Momjian2009-06-11
| | | | provided by Andrew.
* Add a new option to RestoreBkpBlocks() to indicate if a cleanup lock shouldHeikki Linnakangas2009-01-20
| | | | | | | | | be used instead of the normal exclusive lock, and make WAL redo functions responsible for calling RestoreBkpBlocks(). They know better what kind of a lock they need. At the moment, this just moves things around with no functional change, but makes the hot standby patch that's under review cleaner.
* Update copyright for 2009.Bruce Momjian2009-01-01
|
* Fix silly typo in previous commit.Alvaro Herrera2008-11-03
|
* Fix TransactionIdSetStatusBit so that it doesn't try to change a transactionAlvaro Herrera2008-11-03
| | | | | | | | from COMMITTED to SUBCOMMITTED during recovery. This wasn't previously possible, but it is now due to the recent changes on clog commit protocol for subtransactions. Simon Riggs
* Rework subtransaction commit protocol for hot standby.Alvaro Herrera2008-10-20
| | | | | | | | | | | | This patch eliminates the marking of subtransactions as SUBCOMMITTED in pg_clog during their commit; instead they remain in-progress until main transaction commit. At main transaction commit, the commit protocol is atomic-by-page instead of one transaction at a time. To avoid a race condition with some subtransactions appearing committed before others in the case where they span more than one pg_clog page, we conserve the logic that marks them subcommitted before marking the parent committed. Simon Riggs with minor help from me
* Add a few more DTrace probes to the backend.Alvaro Herrera2008-08-01
| | | | Robert Lor
* Update copyrights in source tree to 2008.Bruce Momjian2008-01-01
|
* pgindent run for 8.3.Bruce Momjian2007-11-15
|