aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils/time
Commit message (Collapse)AuthorAge
...
* Use full 64-bit XID for checking if a deleted GiST page is old enough.Heikki Linnakangas2019-07-24
| | | | | | | | | | | Otherwise, after a deleted page gets even older, it becomes unrecyclable again. B-tree has the same problem, and has had since time immemorial, but let's at least fix this in GiST, where this is new. Backpatch to v12, where GiST page deletion was introduced. Reviewed-by: Andrey Borodin Discussion: https://www.postgresql.org/message-id/835A15A5-F1B4-4446-A711-BF48357EB602%40yandex-team.ru
* Fix inconsistencies and typos in the treeMichael Paquier2019-07-22
| | | | | | | | This is numbered take 7, and addresses a set of issues with code comments, variable names and unreferenced variables. Author: Alexander Lakhin Discussion: https://postgr.es/m/dff75442-2468-f74f-568c-6006e141062f@gmail.com
* Phase 2 pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | | Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Mark correctly initial slot snapshots with MVCC type when builtMichael Paquier2019-02-20
| | | | | | | | | | | | | | | | When building an initial slot snapshot, snapshots are marked with historic MVCC snapshots as type with the marker field being set in SnapBuildBuildSnapshot() but not overriden in SnapBuildInitialSnapshot(). Existing callers of SnapBuildBuildSnapshot() do not care about the type of snapshot used, but extensions calling it actually may, as reported. While on it, mark correctly the snapshot type when importing one. This is cosmetic as the field is enforced to 0. Author: Antonin Houska Reviewed-by: Álvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/23215.1527665193@localhost Backpatch-through: 9.4
* Move remaining code from tqual.[ch] to heapam.h / heapam_visibility.c.Andres Freund2019-01-21
| | | | | | | | | | | | | | Given these routines are heap specific, and that there will be more generic visibility support in via table AM, it makes sense to move the prototypes to heapam.h (routines like HeapTupleSatisfiesVacuum will not be exposed in a generic fashion, because they are too storage specific). Similarly, the code in tqual.c is specific to heap, so moving it into access/heap/ makes sense. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
* Move generic snapshot related code from tqual.h to snapmgr.h.Andres Freund2019-01-21
| | | | | | | | | | | | | | | | The code in tqual.c is largely heap specific. Due to the upcoming pluggable storage work, it therefore makes sense to move it into access/heap/ (as the file's header notes, the tqual name isn't very good). But the various statically allocated snapshot and snapshot initialization functions are now (see previous commit) generic and do not depend on functions declared in tqual.h anymore. Therefore move. Also move XidInMVCCSnapshot as that's useful for future AMs, and already used outside of tqual.c. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
* Change snapshot type to be determined by enum rather than callback.Andres Freund2019-01-21
| | | | | | | | | | | | | | | | | | | | | | This is in preparation for allowing the same snapshot be used for different table AMs. With the current callback based approach we would need one callback for each supported AM, which clearly would not be extensible. Thus add a new Snapshot->snapshot_type field, and move the dispatch into HeapTupleSatisfiesVisibility() (which is now a function). Later work will then dispatch calls to HeapTupleSatisfiesVisibility() and other AMs visibility functions depending on the type of the table. The central SnapshotType enum also seems like a good location to centralize documentation about the intended behaviour of various types of snapshots. As tqual.h isn't included by bufmgr.h any more (as HeapTupleSatisfies* isn't referenced by TestForOldSnapshot() anymore) a few files now need to include it directly. Author: Andres Freund, loosely based on earlier work by Haribabu Kommi Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Fix misc typos, mostly in comments.Heikki Linnakangas2018-07-18
| | | | | | | | A collection of typos I happened to spot while reading code, as well as grepping for common mistakes. Backpatch to all supported versions, as applicable, to avoid conflicts when backporting other commits in the future.
* Rename TransactionChain functionsPeter Eisentraut2018-03-16
| | | | | | | | | We call this thing a "transaction block" everywhere except in a few functions, where it is mysteriously called a "transaction chain". In the SQL standard, a transaction chain is something different. So rename these functions to match the common terminology. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
* Update copyright for 2018Bruce Momjian2018-01-02
| | | | Backpatch-through: certain files through 9.3
* Fix pruning of locked and updated tuples.Andres Freund2017-12-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously it was possible that a tuple was not pruned during vacuum, even though its update xmax (i.e. the updating xid in a multixact with both key share lockers and an updater) was below the cutoff horizon. As the freezing code assumed, rightly so, that that's not supposed to happen, xmax would be preserved (as a member of a new multixact or xmax directly). That causes two problems: For one the tuple is below the xmin horizon, which can cause problems if the clog is truncated or once there's an xid wraparound. The bigger problem is that that will break HOT chains, which in turn can lead two to breakages: First, failing index lookups, which in turn can e.g lead to constraints being violated. Second, future hot prunes / vacuums can end up making invisible tuples visible again. There's other harmful scenarios. Fix the problem by recognizing that tuples can be DEAD instead of RECENTLY_DEAD, even if the multixactid has alive members, if the update_xid is below the xmin horizon. That's safe because newer versions of the tuple will contain the locking xids. A followup commit will harden the code somewhat against future similar bugs and already corrupted data. Author: Andres Freund, with changes by Alvaro Herrera Reported-By: Daniel Wood Analyzed-By: Andres Freund, Alvaro Herrera, Robert Haas, Peter Geoghegan, Daniel Wood, Yi Wen Wong, Michael Paquier Reviewed-By: Alvaro Herrera, Robert Haas, Michael Paquier Discussion: https://postgr.es/m/E5711E62-8FDF-4DCA-A888-C200BF6B5742@amazon.com https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de Backpatch: 9.3-
* Clean up assorted messiness around AllocateDir() usage.Tom Lane2017-12-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a couple of low-probability bugs that could lead to reporting an irrelevant errno value (and hence possibly a wrong SQLSTATE) concerning directory-open or file-open failures. It also fixes places where we took shortcuts in reporting such errors, either by using elog instead of ereport or by using ereport but forgetting to specify an errcode. And it eliminates a lot of just plain redundant error-handling code. In service of all this, export fd.c's formerly-static function ReadDirExtended, so that external callers can make use of the coding pattern dir = AllocateDir(path); while ((de = ReadDirExtended(dir, path, LOG)) != NULL) if they'd like to treat directory-open failures as mere LOG conditions rather than errors. Also fix FreeDir to be a no-op if we reach it with dir == NULL, as such a coding pattern would cause. Then, remove code at many call sites that was throwing an error or log message for AllocateDir failure, as ReadDir or ReadDirExtended can handle that job just fine. Aside from being a net code savings, this gets rid of a lot of not-quite-up-to-snuff reports, as mentioned above. (In some places these changes result in replacing a custom error message such as "could not open tablespace directory" with more generic wording "could not open directory", but it was agreed that the custom wording buys little as long as we report the directory name.) In some other call sites where we can't just remove code, change the error reports to be fully project-style-compliant. Also reorder code in restoreTwoPhaseData that was acquiring a lock between AllocateDir and ReadDir; in the unlikely but surely not impossible case that LWLockAcquire changes errno, AllocateDir failures would be misreported. There is no great value in opening the directory before acquiring TwoPhaseStateLock, so just do it in the other order. Also fix CheckXLogRemoved to guarantee that it preserves errno, as quite a number of call sites are implicitly assuming. (Again, it's unlikely but I think not impossible that errno could change during a SpinLockAcquire. If so, this function was broken for its own purposes as well as breaking callers.) And change a few places that were using not-per-project-style messages, such as "could not read directory" when "could not open directory" is more correct. Back-patch the exporting of ReadDirExtended, in case we have occasion to back-patch some fix that makes use of it; it's not needed right now but surely making it global is pretty harmless. Also back-patch the restoreTwoPhaseData and CheckXLogRemoved fixes. The rest of this is essentially cosmetic and need not get back-patched. Michael Paquier, with a bit of additional work by me Discussion: https://postgr.es/m/CAB7nPqRpOCxjiirHmebEFhXVTK7V5Jvw4bz82p7Oimtsm3TyZA@mail.gmail.com
* Change TRUE/FALSE to true/falsePeter Eisentraut2017-11-08
| | | | | | | | | | | | | | The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Ignore CatalogSnapshot when checking COPY FREEZE prerequisites.Noah Misch2017-11-05
| | | | | | | | | | This restores the ability, essentially lost in commit ffaa44cb559db332baeee7d25dedd74a61974203, to use COPY FREEZE under REPEATABLE READ isolation. Back-patch to 9.4, like that commit. Reviewed by Tom Lane. Discussion: https://postgr.es/m/CA+TgmoahWDm-7fperBxzU9uZ99LPMUmEpSXLTw9TmrOgzwnORw@mail.gmail.com
* Fix low-probability loss of NOTIFY messages due to XID wraparound.Tom Lane2017-10-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up to now async.c has used TransactionIdIsInProgress() to detect whether a notify message's source transaction is still running. However, that function has a quick-exit path that reports that XIDs before RecentXmin are no longer running. If a listening backend is doing nothing but listening, and not running any queries, there is nothing that will advance its value of RecentXmin. Once 2 billion transactions elapse, the RecentXmin check causes active transactions to be reported as not running. If they aren't committed yet according to CLOG, async.c decides they aborted and discards their messages. The timing for that is a bit tight but it can happen when multiple backends are sending notifies concurrently. The net symptom therefore is that a sufficiently-long-surviving listen-only backend starts to miss some fraction of NOTIFY traffic, but only under heavy load. The only function that updates RecentXmin is GetSnapshotData(). A brute-force fix would therefore be to take a snapshot before processing incoming notify messages. But that would add cycles, as well as contention for the ProcArrayLock. We can be smarter: having taken the snapshot, let's use that to check for running XIDs, and not call TransactionIdIsInProgress() at all. In this way we reduce the number of ProcArrayLock acquisitions from one per message to one per notify interrupt; that's the same under light load but should be a benefit under heavy load. Light testing says that this change is a wash performance-wise for normal loads. I looked around for other callers of TransactionIdIsInProgress() that might be at similar risk, and didn't find any; all of them are inside transactions that presumably have already taken a snapshot. Problem report and diagnosis by Marko Tiikkaja, patch by me. Back-patch to all supported branches, since it's been like this since 9.0. Discussion: https://postgr.es/m/20170926182935.14128.65278@wrigleys.postgresql.org
* Message style fixesPeter Eisentraut2017-09-11
|
* Improve performance of get_actual_variable_range with recently-dead tuples.Tom Lane2017-09-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit fccebe421, we hacked get_actual_variable_range() to scan the index with SnapshotDirty, so that if there are many uncommitted tuples at the end of the index range, it wouldn't laboriously scan through all of them looking for a live value to return. However, that didn't fix it for the case of many recently-dead tuples at the end of the index; SnapshotDirty recognizes those as committed dead and so we're back to the same problem. To improve the situation, invent a "SnapshotNonVacuumable" snapshot type and use that instead. The reason this helps is that, if the snapshot rejects a given index entry, we know that the indexscan will mark that index entry as killed. This means the next get_actual_variable_range() scan will proceed past that entry without visiting the heap, making the scan a lot faster. We may end up accepting a recently-dead tuple as being the estimated extremal value, but that doesn't seem much worse than the compromise we made before to accept not-yet-committed extremal values. The cost of the scan is still proportional to the number of dead index entries at the end of the range, so in the interval after a mass delete but before VACUUM's cleaned up the mess, it's still possible for get_actual_variable_range() to take a noticeable amount of time, if you've got enough such dead entries. But the constant factor is much much better than before, since all we need to do with each index entry is test its "killed" bit. We chose to back-patch commit fccebe421 at the time, but I'm hesitant to do so here, because this form of the problem seems to affect many fewer people. Also, even when it happens, it's less bad than the case fixed by commit fccebe421 because we don't get the contention effects from expensive TransactionIdIsInProgress tests. Dmitriy Sarafannikov, reviewed by Andrey Borodin Discussion: https://postgr.es/m/05C72CF7-B5F6-4DB9-8A09-5AC897653113@yandex.ru
* Fix typo in comment in SerializeSnapshotSimon Riggs2017-06-24
| | | | Author: Masahiko Sawada
* Revert 1f30295eab65eddaa88528876ab66e7095f4bb65Simon Riggs2017-06-24
| | | | Reported-by: Tom Lane
* Phase 3 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Phase 2 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Final pgindent run with old pg_bsd_indent (version 1.3).Tom Lane2017-06-21
| | | | | | | | | This is just to have a clean basis for comparison with the results of the new version (which will indeed end up reverting some of these changes...) Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Don't force-assign transaction id when exporting a snapshot.Andres Freund2017-06-14
| | | | | | | | | | | | | | | | | | | | | | Previously we required every exported transaction to have an xid assigned. That was used to check that the exporting transaction is still running, which in turn is needed to guarantee that that necessary rows haven't been removed in between exporting and importing the snapshot. The exported xid caused unnecessary problems with logical decoding, because slot creation has to wait for all concurrent xid to finish, which in turn serializes concurrent slot creation. It also prohibited snapshots to be exported on hot-standby replicas. Instead export the virtual transactionid, which avoids the unnecessary serialization and the inability to export snapshots on standbys. This changes the file name of the exported snapshot, but since we never documented what that one means, that seems ok. Author: Petr Jelinek, slightly editorialized by me Reviewed-By: Andres Freund Discussion: https://postgr.es/m/f598b4b8-8cd7-0d54-0939-adda763d8c34@2ndquadrant.com
* Post-PG 10 beta1 pgindent runBruce Momjian2017-05-17
| | | | perltidy run not included.
* Remove poorly worded and duplicated commentSimon Riggs2017-05-08
| | | | | | Move line of code to avoid need for duplicated comment Brought to attention by Masahiko Sawada
* Fix new warnings from GCC 7Peter Eisentraut2017-04-17
| | | | | This addresses the new warning types -Wformat-truncation -Wformat-overflow that are part of -Wall, via -Wformat, in GCC 7.
* Always SnapshotResetXmin() during ClearTransaction()Simon Riggs2017-04-06
| | | | Avoid corner cases during 2PC with 6bad580d9e678a0b604883e14d8401d469b06566
* Avoid SnapshotResetXmin() during AtEOXact_Snapshot()Simon Riggs2017-04-06
| | | | | | | | | | | | | | | | | | | For normal commits and aborts we already reset PgXact->xmin, so we can simply avoid running SnapshotResetXmin() twice. During performance tests by Alexander Korotkov, diagnosis by Andres Freund showed PgXact array as a bottleneck. After manual analysis by me of the code paths that touch those memory locations, I was able to identify extraneous code in the main transaction commit path. Avoiding touching highly contented shmem improves concurrent performance slightly on all workloads, confirmed by tests run by Ashutosh Sharma and Alexander Korotkov. Simon Riggs Discussion: CANP8+jJdXE9b+b9F8CQT-LuxxO0PBCB-SZFfMVAdp+akqo4zfg@mail.gmail.com
* Reverting 42b4b0b2413b9b472aaf2112a3bbfd80a6ab4dc5Simon Riggs2017-03-24
| | | | Buildfarm issues and other reported issues
* Avoid SnapshotResetXmin() during AtEOXact_Snapshot()Simon Riggs2017-03-24
| | | | | | | | | | For normal commits and aborts we already reset PgXact->xmin Avoiding touching highly contented shmem improves concurrent performance. Simon Riggs Discussion: CANP8+jJdXE9b+b9F8CQT-LuxxO0PBCB-SZFfMVAdp+akqo4zfg@mail.gmail.com
* Rename "pg_clog" directory to "pg_xact".Robert Haas2017-03-17
| | | | | | | | | | | Names containing the letters "log" sometimes confuse users into believing that only non-critical data is present. It is hoped this renaming will discourage ill-considered removals of transaction status data. Michael Paquier Discussion: http://postgr.es/m/CA+Tgmoa9xFQyjRZupbdEFuwUerFTvC6HjZq1ud6GYragGDFFgA@mail.gmail.com
* Handle unaligned SerializeSnapshot() buffer.Noah Misch2017-03-02
| | | | | | | | | | Likewise in RestoreSnapshot(). Do so by copying between the user buffer and a stack buffer of known alignment. Back-patch to 9.6, where this last applies cleanly. In master, the select_parallel test dies with SIGBUS on "Oracle Solaris 10 1/13 s10s_u11wos_24a SPARC", building 32-bit with gcc 4.9.2. In 9.6 and 9.5, the buffers in question happen to be sufficiently-aligned, and this change is mere insurance against future 9.6 changes or extension code compromising that.
* Consistently declare timestamp variables as TimestampTz.Tom Lane2017-02-23
| | | | | | | | | | | | | | | | | | | Twiddle the replication-related code so that its timestamp variables are declared TimestampTz, rather than the uninformative "int64" that was previously used for meant-to-be-always-integer timestamps. This resolves the int64-vs-TimestampTz declaration inconsistencies introduced by commit 7c030783a, though in the opposite direction to what was originally suggested. This required including datatype/timestamp.h in a couple more places than before. I decided it would be a good idea to slim down that header by not having it pull in <float.h> etc, as those headers are no longer at all relevant to its purpose. Unsurprisingly, a small number of .c files turn out to have been depending on those inclusions, so add them back in the .c files as needed. Discussion: https://postgr.es/m/26788.1487455319@sss.pgh.pa.us Discussion: https://postgr.es/m/27694.1487456324@sss.pgh.pa.us
* Fix typos in comments.Heikki Linnakangas2017-02-06
| | | | | | | | | Backpatch to all supported versions, where applicable, to make backpatching of future fixes go more smoothly. Josh Soref Discussion: https://www.postgresql.org/message-id/CACZqfqCf+5qRztLPgmmosr-B0Ye4srWzzw_mo4c_8_B_mtjmJQ@mail.gmail.com
* Update copyright via script for 2017Bruce Momjian2017-01-03
|
* Account for catalog snapshot in PGXACT->xmin updates.Tom Lane2016-11-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The CatalogSnapshot was not plugged into SnapshotResetXmin()'s accounting for whether MyPgXact->xmin could be cleared or advanced. In normal transactions this was masked by the fact that the transaction snapshot would be older, but during backend startup and certain utility commands it was possible to re-use the CatalogSnapshot after MyPgXact->xmin had been cleared, meaning that recently-deleted rows could be pruned even though this snapshot could still see them, causing unexpected catalog lookup failures. This effect appears to be the explanation for a recent failure on buildfarm member piculet. To fix, add the CatalogSnapshot to the RegisteredSnapshots heap whenever it is valid. In the previous logic, it was possible for the CatalogSnapshot to remain valid across waits for client input, but with this change that would mean it delays advance of global xmin in cases where it did not before. To avoid possibly causing new table-bloat problems with clients that sit idle for long intervals, add code to invalidate the CatalogSnapshot before waiting for client input. (When the backend is busy, it's unlikely that the CatalogSnapshot would be the oldest snap for very long, so we don't worry about forcing early invalidation of it otherwise.) In passing, remove the CatalogSnapshotStale flag in favor of using "CatalogSnapshot != NULL" to represent validity, as we do for the other special snapshots in snapmgr.c. And improve some obsolete comments. No regression test because I don't know a deterministic way to cause this failure. But the stress test shown in the original discussion provokes "cache lookup failed for relation 1255" within a few dozen seconds for me. Back-patch to 9.4 where MVCC catalog scans were introduced. (Note: it's quite easy to produce similar failures with the same test case in branches before 9.4. But MVCC catalog scans were supposed to fix that.) Discussion: <16447.1478818294@sss.pgh.pa.us>
* Fix deletion of speculatively inserted TOAST on conflictAndres Freund2016-08-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | INSERT .. ON CONFLICT runs a pre-check of the possible conflicting constraints before performing the actual speculative insertion. In case the inserted tuple included TOASTed columns the ON CONFLICT condition would be handled correctly in case the conflict was caught by the pre-check, but if two transactions entered the speculative insertion phase at the same time, one would have to re-try, and the code for aborting a speculative insertion did not handle deleting the speculatively inserted TOAST datums correctly. TOAST deletion would fail with "ERROR: attempted to delete invisible tuple" as we attempted to remove the TOAST tuples using simple_heap_delete which reasoned that the given tuples should not be visible to the command that wrote them. This commit updates the heap_abort_speculative() function which aborts the conflicting tuple to use itself, via toast_delete, for deleting associated TOAST datums. Like before, the inserted toast rows are not marked as being speculative. This commit also adds a isolationtester spec test, exercising the relevant code path. Unfortunately 9.5 cannot handle two waiting sessions, and thus cannot execute this test. Reported-By: Viren Negi, Oskari Saarenmaa Author: Oskari Saarenmaa, edited a bit by me Bug: #14150 Discussion: <20160519123338.12513.20271@wrigleys.postgresql.org> Backpatch: 9.5, where ON CONFLICT was introduced
* Avoid crashing in GetOldestSnapshot() if there are no known snapshots.Tom Lane2016-08-07
| | | | | | | | | | | | | The sole caller expects NULL to be returned in such a case, so make it so and document it. Per reports from Andreas Seltenreich and Regina Obe. This doesn't really fix their problem, as now their RETURNING queries will say "ERROR: no known snapshots", but in any case this function should not dump core in a reasonably-foreseeable situation. Report: <87vazemeda.fsf@credativ.de> Report: <20160807051854.1427.32414@wrigleys.postgresql.org>
* Prevent "snapshot too old" from trying to return pruned TOAST tuples.Robert Haas2016-08-03
| | | | | | | | | | | | | Previously, we tested for MVCC snapshots to see whether they were too old, but not TOAST snapshots, which can lead to complaints about missing TOAST chunks if those chunks are subject to early pruning. Ideally, the threshold lsn and timestamp for a TOAST snapshot would be that of the corresponding MVCC snapshot, but since we have no way of deciding which MVCC snapshot was used to fetch the TOAST pointer, use the oldest active or registered snapshot instead. Reported by Andres Freund, who also sketched out what the fix should look like. Patch by me, reviewed by Amit Kapila.
* Adjust spellings of forms of "cancel"Peter Eisentraut2016-07-14
|
* Fix crash bug in RestoreSnapshot.Robert Haas2016-07-01
| | | | | | | | | If serialized_snapshot->subxcnt > 0 and serialized_snapshot->xcnt == 0, the old coding would do the wrong thing and crash. This can happen on standby servers. Report by Andreas Seltenreich. Patch by Thomas Munro, reviewed by Amit Kapila and tested by Andreas Seltenreich.
* Fix handling of multixacts predating pg_upgradeAlvaro Herrera2016-06-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After pg_upgrade, it is possible that some tuples' Xmax have multixacts corresponding to the old installation; such multixacts cannot have running members anymore. In many code sites we already know not to read them and clobber them silently, but at least when VACUUM tries to freeze a multixact or determine whether one needs freezing, there's an attempt to resolve it to its member transactions by calling GetMultiXactIdMembers, and if the multixact value is "in the future" with regards to the current valid multixact range, an error like this is raised: ERROR: MultiXactId 123 has not been created yet -- apparent wraparound and vacuuming fails. Per discussion with Andrew Gierth, it is completely bogus to try to resolve multixacts coming from before a pg_upgrade, regardless of where they stand with regards to the current valid multixact range. It's possible to get from under this problem by doing SELECT FOR UPDATE of the problem tuples, but if tables are large, this is slow and tedious, so a more thorough solution is desirable. To fix, we realize that multixacts in xmax created in 9.2 and previous have a specific bit pattern that is never used in 9.3 and later (we already knew this, per comments and infomask tests sprinkled in various places, but we weren't leveraging this knowledge appropriately). Whenever the infomask of the tuple matches that bit pattern, we just ignore the multixact completely as if Xmax wasn't set; or, in the case of tuple freezing, we act as if an unwanted value is set and clobber it without decoding. This guarantees that no errors will be raised, and that the values will be progressively removed until all tables are clean. Most callers of GetMultiXactIdMembers are patched to recognize directly that the value is a removable "empty" multixact and avoid calling GetMultiXactIdMembers altogether. To avoid changing the signature of GetMultiXactIdMembers() in back branches, we keep the "allow_old" boolean flag but rename it to "from_pgupgrade"; if the flag is true, we always return an empty set instead of looking up the multixact. (I suppose we could remove the argument in the master branch, but I chose not to do so in this commit). This was broken all along, but the error-facing message appeared first because of commit 8e9a16ab8f7f and was partially fixed in a25c2b7c4db3. This fix, backpatched all the way back to 9.3, goes approximately in the same direction as a25c2b7c4db3 but should cover all cases. Bug analysis by Andrew Gierth and Álvaro Herrera. A number of public reports match this bug: https://www.postgresql.org/message-id/20140330040029.GY4582@tamriel.snowman.net https://www.postgresql.org/message-id/538F3D70.6080902@publicrelay.com https://www.postgresql.org/message-id/556439CF.7070109@pscs.co.uk https://www.postgresql.org/message-id/SG2PR06MB0760098A111C88E31BD4D96FB3540@SG2PR06MB0760.apcprd06.prod.outlook.com https://www.postgresql.org/message-id/20160615203829.5798.4594@wrigleys.postgresql.org
* Fix interaction between CREATE INDEX and "snapshot too old".Kevin Grittner2016-06-10
| | | | | | | | | | | | | | | Since indexes are created without valid LSNs, an index created while a snapshot older than old_snapshot_threshold existed could cause queries to return incorrect results when those old snapshots were used, if any relevant rows had been subject to early pruning before the index was built. Prevent usage of a newly created index until all such snapshots are released, for relations where this can happen. Questions about the interaction of "snapshot too old" with index creation were initially raised by Andres Freund. Reviewed by Robert Haas.
* pgindent run for 9.6Robert Haas2016-06-09
|
* Minor typos / copy-editing for snapmgr.cStephen Frost2016-06-07
| | | | Noticed while reviewing snapshot management.
* Add new snapshot fields to serialize/deserialize functions.Kevin Grittner2016-06-03
| | | | | | | The "snapshot too old" condition was not being recognized when using a copied snapshot, since the original timestamp and lsn were not being passed along. Noticed when testing the combination of "snapshot too old" with parallel query execution.
* Mitigate "snapshot too old" performance regression on NUMAKevin Grittner2016-05-06
| | | | | | | | | | | | Limit maintenance of time to xid mapping to once per minute. At least in the tested case this brings performance within 5% of when the feature is off, compared to several times slower without this patch. While there, fix comments and whitespace. Ants Aasma, with cosmetic adjustments suggested by Andres Freund Reviewed by Kevin Grittner and Andres Freund
* Fix hash index vs "snapshot too old" problemmsKevin Grittner2016-05-06
| | | | | | | | | | | | | | | | Hash indexes are not WAL-logged, and so do not maintain the LSN of index pages. Since the "snapshot too old" feature counts on detecting error conditions using the LSN of a table and all indexes on it, this makes it impossible to safely do early vacuuming on any table with a hash index, so add this to the tests for whether the xid used to vacuum a table can be adjusted based on old_snapshot_threshold. While at it, add a paragraph to the docs for old_snapshot_threshold which specifically mentions this and other aspects of the feature which may otherwise surprise users. Problem reported and patch reviewed by Amit Kapila
* Add a few entries to the tail of time mapping, to see old values.Kevin Grittner2016-04-29
| | | | | | | | | | | | | | | | | | Without a few entries beyond old_snapshot_threshold, the lookup would often fail, resulting in the more aggressive pruning or vacuum being skipped often enough to matter. This was very clearly shown by a python test script posted by Ants Aasma, and was likely a factor in an earlier but somewhat less clear-cut test case posted by Jeff Janes. This patch makes no change to the logic, per se -- it just makes the array of mapping entries big enough to make lookup misses based on timing much less likely. An occasional miss is still possible if a thread stalls for more than 10 minutes, but that does not create any problem with correctness of behavior. Besides, if things are so busy that a thread is stalling for more than 10 minutes, it is probably OK to skip the more aggressive cleanup at that particular point in time.