aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
...
* Fix bogus initialization of replication origin shared memory state.Tom Lane2020-05-15
| | | | | | | | | | | | | | | | | The previous coding zeroed out offsetof(ReplicationStateCtl, states) more bytes than it was entitled to, as a consequence of starting the zeroing from the wrong pointer (or, if you prefer, using the wrong calculation of how much to zero). It's unsurprising that this has not caused any reported problems, since it can be expected that the newly-allocated block is at the end of what we've used in shared memory, and we always make the shmem block substantially bigger than minimally necessary. Nonetheless, this is wrong and it could bite us someday; plus it's a dangerous model for somebody to copy. This dates back to the introduction of this code (commit 5aa235042), so back-patch to all supported branches.
* Rename assorted LWLock tranches.Tom Lane2020-05-15
| | | | | | | | | | | | | | | | Choose names that fit into the conventions for wait event names (particularly, that multi-word names are in the style MultiWordName) and hopefully convey more information to non-hacker users than the previous names did. Also rename SerializablePredicateLockListLock to SerializablePredicateListLock; the old name was long enough to cause table formatting problems, plus the double occurrence of "Lock" seems confusing/error-prone. Also change a couple of particularly opaque LWLock field names. Discussion: https://postgr.es/m/28683.1589405363@sss.pgh.pa.us
* Add comments linking pg_strftime to timestamptz_to_strAlvaro Herrera2020-05-15
|
* Avoid killing btree items that are already deadAlvaro Herrera2020-05-15
| | | | | | | | | | | | | | | | | | | | | | _bt_killitems marks btree items dead when a scan leaves the page where they live, but it does so with only share lock (to improve concurrency). This was historicall okay, since killing a dead item has no consequences. However, with the advent of data checksums and wal_log_hints, this action incurs a WAL full-page-image record of the page. Multiple concurrent processes would write the same page several times, leading to WAL bloat. The probability of this happening can be reduced by only killing items if they're not already dead, so change the code to do that. The problem could eliminated completely by having _bt_killitems upgrade to exclusive lock upon seeing a killable item, but that would reduce concurrency so it's considered a cure worse than the disease. Backpatch all the way back to 9.5, since wal_log_hints was introduced in 9.4. Author: Masahiko Sawada <masahiko.sawada@2ndquadrant.com> Discussion: https://postgr.es/m/CA+fd4k6PeRj2CkzapWNrERkja5G0-6D-YQiKfbukJV+qZGFZ_Q@mail.gmail.com
* Rename SLRU structures and associated LWLocks.Tom Lane2020-05-15
| | | | | | | | | | | | | | | | | | | | | | | | | | Originally, the names assigned to SLRUs had no purpose other than being shmem lookup keys, so not a lot of thought went into them. As of v13, though, we're exposing them in the pg_stat_slru view and the pg_stat_reset_slru function, so it seems advisable to take a bit more care. Rename them to names based on the associated on-disk storage directories (which fortunately we *did* think about, to some extent; since those are also visible to DBAs, consistency seems like a good thing). Also rename the associated LWLocks, since those names are likewise user-exposed now as wait event names. For the most part I only touched symbols used in the respective modules' SimpleLruInit() calls, not the names of other related objects. This renaming could have been taken further, and maybe someday we will do so. But for now it seems undesirable to change the names of any globally visible functions or structs, so some inconsistency is unavoidable. (But I *did* terminate "oldserxid" with prejudice, as I found that name both unreadable and not descriptive of the SLRU's contents.) Table 27.12 needs re-alphabetization now, but I'll leave that till after the other LWLock renamings I have in mind. Discussion: https://postgr.es/m/28683.1589405363@sss.pgh.pa.us
* Make COPY TO keep locks until the transaction end.Amit Kapila2020-05-15
| | | | | | | | | | | | | | | | | | COPY TO released the ACCESS SHARE lock immediately when it was done rather than holding on to it until the end of the transaction. This breaks the case where a REPEATABLE READ transaction could see an empty table if it repeats a COPY statement and somebody truncated the table in the meantime. Before 4dded12faad the lock was also released after COPY FROM, but the commit failed to notice the irregularity in COPY TO. This is old behavior but doesn't seem important enough to backpatch. Author: Laurenz Albe, based on suggestion by Robert Haas and Tom Lane Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/7bcfc39d4176faf85ab317d0c26786953646a411.camel@cybertec.at
* Remove duplicated comment block in event_trigger.cMichael Paquier2020-05-15
| | | | | | | | | | | The reasons why event triggers are disabled in standalone mode are documented in the code path of ddl_command_start, and other places checking if standalone mode is enabled or not mention to refer to the comment for ddl_command_start, except for table_rewrite that duplicated the same explanation. Reported-by: David G. Johnston Discussion: https://postgr.es/m/CAKFQuwYqHtXpvr2mBJRwH9f+Y5y1GXw3rhbaAu0Dk2MoNevsmA@mail.gmail.com
* Initial pgindent and pgperltidy run for v13.Tom Lane2020-05-14
| | | | | | | | | | | Includes some manual cleanup of places that pgindent messed up, most of which weren't per project style anyway. Notably, it seems some people didn't absorb the style rules of commit c9d297751, because there were a bunch of new occurrences of function calls with a newline just after the left paren, all with faulty expectations about how the rest of the call would get indented.
* Collect built-in LWLock tranche names statically, not dynamically.Tom Lane2020-05-14
| | | | | | | | | | | | | | | | | | There is little point in using the LWLockRegisterTranche mechanism for built-in tranche names. It wastes cycles, it creates opportunities for bugs (since failing to register a tranche name is a very hard-to-detect problem), and the lack of any centralized list of names encourages sloppy nonconformity in name choices. Moreover, since we have a centralized list of the tranches anyway in enum BuiltinTrancheIds, we're certainly not buying any flexibility in return for these disadvantages. Hence, nuke all the backend-internal LWLockRegisterTranche calls, and instead provide a const array of the builtin tranche names. (I have in mind to change a bunch of these names shortly, but this patch is just about getting them into one place.) Discussion: https://postgr.es/m/9056.1589419765@sss.pgh.pa.us
* Move check for fsync=off so that pendingOps still gets cleared.Heikki Linnakangas2020-05-14
| | | | | | | | | | | | Commit 3eb77eba5a moved the loop and refactored it, and inadvertently changed the effect of fsync=off so that it also skipped removing entries from the pendingOps table. That was not intentional, and leads to an assertion failure if you turn fsync on while the server is running and reload the config. Backpatch-through: 12- Reviewed-By: Thomas Munro Discussion: https://www.postgresql.org/message-id/3cbc7f4b-a5fa-56e9-9591-c886deb07513%40iki.fi
* Fix the MSVC build for versions 2015 and later.Amit Kapila2020-05-14
| | | | | | | | | | | | | | | | | | | Visual Studio 2015 and later versions should still be able to do the same as Visual Studio 2012, but the declaration of locale_name is missing in _locale_t, causing the code compilation to fail, hence this falls back instead on to enumerating all system locales by using EnumSystemLocalesEx to find the required locale name.  If the input argument is in Unix-style then we can get ISO Locale name directly by using GetLocaleInfoEx() with LCType as LOCALE_SNAME. In passing, change the documentation references of the now obsolete links. Note that this problem occurs only with NLS enabled builds. Author: Juan José Santamaría Flecha, Davinder Singh and Amit Kapila Reviewed-by: Ranier Vilela and Amit Kapila Backpatch-through: 9.5 Discussion: https://postgr.es/m/CAHzhFSFoJEWezR96um4-rg5W6m2Rj9Ud2CNZvV4NWc9tXV7aXQ@mail.gmail.com
* Fix async.c to not register any SLRU stats counts in the postmaster.Tom Lane2020-05-13
| | | | | | | | | | | | | | | | | | | | | Previously, AsyncShmemInit forcibly initialized the first page of the async SLRU, to save dealing with that case in asyncQueueAddEntries. But this is a poor tradeoff, since many installations do not ever use NOTIFY; for them, expending those cycles in AsyncShmemInit is a complete waste. Besides, this only saves a couple of instructions in asyncQueueAddEntries, which hardly seems likely to be measurable. The real reason to change this now, though, is that now that we track SLRU access stats, the existing code is causing the postmaster to accumulate some access counts, which then get inherited into child processes by fork(), messing up the statistics. Delaying the initialization into the first child that does a NOTIFY fixes that. Hence, we can revert f3d23d83e, which was an incorrect attempt at fixing that issue. Also, add an Assert to pgstat.c that should catch any future errors of the same sort. Discussion: https://postgr.es/m/8367.1589391884@sss.pgh.pa.us
* Dial back -Wimplicit-fallthrough to level 3Alvaro Herrera2020-05-13
| | | | | | | | | The additional pain from level 4 is excessive for the gain. Also revert all the source annotation changes to their original wordings, to avoid back-patching pain. Discussion: https://postgr.es/m/31166.1589378554@sss.pgh.pa.us
* Improve management of SLRU statistics collection.Tom Lane2020-05-13
| | | | | | | | | | | | | | | | | | | | | | | Instead of re-identifying which statistics bucket to use for a given SLRU on every counter increment, do it once during shmem initialization. This saves a fair number of cycles, and there's no real cost because we could not have a bucket assignment that varies over time or across backends anyway. Also, get rid of the ill-considered decision to let pgstat.c pry directly into SLRU's shared state; it's cleaner just to have slru.c pass the stats bucket number. In consequence of these changes, there's no longer any need to store an SLRU's LWLock tranche info in shared memory, so get rid of that, making this a net reduction in shmem consumption. (That partly reverts fe702a7b3.) This is basically code review for 28cac71bd, so I also cleaned up some comments, removed a dangling extern declaration, fixed some things that should be static and/or const, etc. Discussion: https://postgr.es/m/3618.1589313035@sss.pgh.pa.us
* Adjust walsender usage of xlogreader, simplify APIsAlvaro Herrera2020-05-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Have both physical and logical walsender share a 'xlogreader' state struct for tracking state. This replaces the existing globals sendSeg and sendCxt. * Change WALRead not to receive XLogReaderState->seg and ->segcxt as separate arguments anymore; just use the ones from 'state'. This is made possible by the above change. * have the XLogReader segment_open contract require the callbacks to install the file descriptor in the state struct themselves instead of returning it. xlogreader was already ignoring any possible failed return from the callbacks, relying solely on them never returning. (This point is not altogether excellent, as it means the callbacks have to know more of XLogReaderState; but to really improve on that we would have to pass back error info from the callbacks to xlogreader. And the complexity would not be saved but instead just transferred to the callers of WALRead, which would have to learn how to throw errors from the open_segment callback in addition of, as currently, from pg_pread.) * segment_open no longer receives the 'segcxt' as a separate argument, since it's part of the XLogReaderState argument. Per comments from Kyotaro Horiguchi. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/20200511203336.GA9913@alvherre.pgsql
* Use proper GetDatum function in pg_stat_get_slru().Fujii Masao2020-05-13
| | | | | | | | | | | This commit changes pg_stat_get_slru() so that it uses TimestampTzGetDatum() for stats_reset field because that field stores the timestamp with time zone value. Previously Int64GetDatum() was used. Author: Fujii Masao Reviewed-by: Tomas Vondra Discussion: https://postgr.es/m/b8784fe6-1401-ab35-aa14-d57b5bb8e312@oss.nttdata.com
* Initialize SLRU stats entries to zero.Fujii Masao2020-05-13
| | | | | | | | | | Previously since SLRUStats was not initialized, SLRU stats counters could begin with non-zero value. Which could lead to incorrect results in pg_stat_slru view. Author: Fujii Masao Reviewed-by: Tomas Vondra Discussion: https://postgr.es/m/976bbb73-a112-de3c-c488-b34b64609793@oss.nttdata.com
* Add -Wimplicit-fallthrough to CFLAGS and CXXFLAGSAlvaro Herrera2020-05-12
| | | | | | | | | | | | | | | Use it at level 4, a bit more restrictive than the default level, and tweak our commanding comments to FALLTHROUGH. (However, leave zic.c alone, since it's external code; to avoid the warnings that would appear there, change CFLAGS for that file in the Makefile.) Author: Julien Rouhaud <rjuju123@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/20200412081825.qyo5vwwco3fv4gdo@nol Discussion: https://postgr.es/m/flat/E1fDenm-0000C8-IJ@gemulon.postgresql.org
* Rework EXPLAIN format for incremental sortTomas Vondra2020-05-12
| | | | | | | | | | | | | | | | | | | | | The explain format used by incremental sort was somewhat inconsistent with other nodes, making it harder to parse and understand. This commit addresses that by - adding an extra space to better separate groups of values - using colons instead of equal signs to separate key/value - properly capitalizing first letter of a key - using separate lines for full and pre-sorted groups These changes were proposed by Justin Pryzby and mostly copy the final explain format used to report WAL usage. Author: Justin Pryzby Reviewed-by: James Coleman Discussion: https://postgr.es/m/20200419023625.GP26953@telsasoft.com
* Fix typos and improve incremental sort commentsTomas Vondra2020-05-12
| | | | | Author: Justin Pryzby, James Coleman Discussion: https://postgr.es/m/20200419023625.GP26953@telsasoft.com
* Remove unnecessary #include.Etsuro Fujita2020-05-12
| | | | My oversight in commit c8434d64c.
* Fix comment in xlogutils.cMichael Paquier2020-05-12
| | | | | | | | | | | The existing callers of XLogReadDetermineTimeline() performing recovery need to check a replay LSN position when determining on which timeline to read a WAL page. A portion of the comment describing this function said exactly that, while referring to a routine for fetching a write LSN, something not available in recovery. Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/20200511.101619.2043820539323292957.horikyota.ntt@gmail.com
* Adjust "root of to-be-deleted subtree" function.Peter Geoghegan2020-05-11
| | | | | | | | | | | | | | | | | | | | Restructure the function that locates the root of the to-be-deleted subtree during nbtree page deletion. Handle the conditions that make page deletion unsafe in a slightly more uniform way, and acknowledge the fact that the behavior with incomplete splits on internal pages is different (as pointed out in the nbtree README as of commit 35bc0ec7). Also invent new terminology that avoids ambiguity around which pages are about to be deleted. Consistently use the term "to-be-deleted subtree", not the ambiguous term "branch". We were calling the subtree parent page the "top parent page", but that was quite misleading. The top parent page usually refers to a page unlinked from its siblings and marked deleted (during the second stage of page deletion). There was one kind of top parent page that we merely removed a downlink from, and another kind of top parent page that we actually marked deleted. Eliminate the ambiguity by inventing a new term ("subtree parent page") that refers to the former kind of page only.
* Fix obsolete references to "XLogRead"Alvaro Herrera2020-05-11
| | | | | | | | The one in xlogreader.h was pointed out by Antonin Houska; I (Álvaro) noticed the others by grepping. Author: Antonin Houska <ah@cybertec.at> Discussion: https://postgr.es/m/28250.1589186654@antos
* Translation updatesPeter Eisentraut2020-05-11
| | | | | Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 80d8f54b3c5533ec036404bd3c3b24ff4825d037
* Remove smgrdounlink() in smgr.c from the code treeMichael Paquier2020-05-10
| | | | | | | | | | The last caller of this routine was removed in b416691, and as a wise man said one day, dead code tends to silently break. Per discussion between Fujii Masao, Peter Geoghegan, Vignesh C and me. Reported-by: Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wz=sg5H8-vG4d5UmAofdcRMpeTDt2K-NUWp4GSfhenRGAQ@mail.gmail.com
* Simplify show_incremental_sort_info a bitTomas Vondra2020-05-09
| | | | | | | | | | Incremental sort always processes at least one full group group before switching to prefix groups, so it's enough to check just the number of full groups. There was no risk of division by zero due to the extra condition, but it made the code harder to understand. Reported-by: Ranier Vilela Discussion: https://postgr.es/m/CAEudQAp+7qoS92-4V1vLChpdY3vEkLCbf+gye6P-4cirE-0z0A@mail.gmail.com
* Do no reset bounded before incremental sort rescanTomas Vondra2020-05-09
| | | | | | | | | | ExecReScanIncrementalSort was resetting bounded=false, which means the optimization would be disabled on all rescans. This happens because ExecSetTupleBound is called before the rescan, not after it. Author: James Coleman Reviewed-by: Tomas Vondra Discussion: https://postgr.es/m/20200414065336.GI1492@paquier.xyz
* Fix handling of REWIND/MARK/BACKWARD in incremental sortTomas Vondra2020-05-09
| | | | | | | | | | | | | | | | | | | | | The executor flags were not handled entirely correctly, although the bugs were mostly harmless and it was mostly comment inaccuracy. We don't need to strip any of the flags for child nodes. Incremental sort does not support backward scans of mark/restore, so MARK/BACKWARDS flags should not be possible. So we simply ensure this using an assert, and we don't bother removing them when initializing the child node. With REWIND it's a bit less clear - incremental sort does not support REWIND, but there is no way to signal this - it's legal to just ignore the flag. We however continue passing the flag to child nodes, because they might be useful to leverage that. Reported-by: Michael Paquier Author: James Coleman Reviewed-by: Tomas Vondra Discussion: https://postgr.es/m/20200414065336.GI1492@paquier.xyz
* Rework XLogReader callback systemAlvaro Herrera2020-05-08
| | | | | | | | | | | | | | | | | | | Code review for 0dc8ead46363, prompted by a bug closed by 91c40548d5f7. XLogReader's system for opening and closing segments had gotten too complicated, with callbacks being passed at both the XLogReaderAllocate level (read_page) as well as at the WALRead level (segment_open). This was confusing and hard to follow, so restructure things so that these callbacks are passed together at XLogReaderAllocate time, and add another callback to the set (segment_close) to make it a coherent whole. Also, ensure XLogReaderState is an argument to all the callbacks, so that they can grab at the ->private data if necessary. Document the whole arrangement more clearly. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20200422175754.GA19858@alvherre.pgsql
* Fix several DDL issues of generated columns versus inheritancePeter Eisentraut2020-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | Several combinations of generated columns and inheritance in CREATE TABLE were not handled correctly. Specifically: - Disallow a child column specifying a generation expression if the parent column is a generated column. The child column definition must be unadorned and the parent column's generation expression will be copied. - Prohibit a child column of a generated parent column specifying default values or identity. - Allow a child column of a not-generated parent column specifying itself as a generated column. This previously did not work, but it was possible to arrive at the state via other means (involving ALTER TABLE), so it seems sensible to support it. Add tests for each case. Also add documentation about the rules involving generated columns and inheritance. Discussion: https://www.postgresql.org/message-id/flat/15830.1575468847%40sss.pgh.pa.us https://www.postgresql.org/message-id/flat/2678bad1-048f-519a-ef24-b12962f41807%40enterprisedb.com https://www.postgresql.org/message-id/flat/CAJvUf_u4h0DxkCMCeEKAWCuzGUTnDP-G5iVmSwxLQSXn0_FWNQ%40mail.gmail.com
* Propagate ALTER TABLE ... SET STORAGE to indexesPeter Eisentraut2020-05-08
| | | | | | | | | When creating a new index, the attstorage setting of the table column is copied to regular (non-expression) index columns. But a later ALTER TABLE ... SET STORAGE is not propagated to indexes, thus creating an inconsistent and undumpable state. Discussion: https://www.postgresql.org/message-id/flat/9765d72b-37c0-06f5-e349-2a580aafd989%402ndquadrant.com
* Report missing wait event for timeline history file.Fujii Masao2020-05-08
| | | | | | | | | | | | | | | | TimelineHistoryRead and TimelineHistoryWrite wait events are reported during waiting for a read and write of a timeline history file, respectively. However, previously, TimelineHistoryRead wait event was not reported while readTimeLineHistory() was reading a timeline history file. Also TimelineHistoryWrite was not reported while writeTimeLineHistory() was writing one line with the details of the timeline split, at the end. This commit fixes these issues. Back-patch to v10 where wait events for a timeline history file was added. Author: Masahiro Ikeda Reviewed-by: Michael Paquier, Fujii Masao Discussion: https://postgr.es/m/d11b0c910b63684424e06772eb844ab5@oss.nttdata.com
* Refactor nbtree deletion INCOMPLETE_SPLIT check.Peter Geoghegan2020-05-07
| | | | | | | | | | | | | | | | | | | | Factor out code common to _bt_lock_branch_parent() and _bt_pagedel() into a new utility function. This new function is used to check that the left sibling of a deletion target page does not have the INCOMPLETE_SPLIT page flag set. If it is set then deletion is unsafe; there won't be a usable pivot tuple (with a downlink) in the parent page that points to the deletion target page. The page deletion algorithm is not prepared to deal with that. Also restructure an existing, related utility function that checks if the right sibling of the target page has the ISHALFDEAD page flag set. This organization highlights the symmetry between the two cases. The goal is to make the design of page deletion clearer. Both functions involve a sibling page with a flag that indicates that there was an interrupted operation (a page split or a page deletion) that resulted in a page pointed to by sibling pages, but not pointed to in the parent. And, both functions indicate if page deletion is unsafe due to the absence of a particular downlink in the parent page.
* Fix YA text phrase search bug.Tom Lane2020-05-07
| | | | | | | | | | | | | | | | | | | checkcondition_str() failed to report multiple matches for a prefix pattern correctly: it would dutifully merge the match positions, but then after exiting that loop, if the last prefix-matching word had had no suitable positions, it would report there were no matches. The upshot would be failing to recognize a match that the query should match. It looks like you need all of these conditions to see the bug: * a phrase search (else we don't ask for match position details) * a prefix search item (else we don't get to this code) * a weight restriction (else checkclass_str won't fail) Noted while investigating a problem report from Pavel Borisov, though this is distinct from the issue he was on about. Back-patch to 9.6 where phrase search was added.
* Heed lock protocol in DROP OWNED BYAlvaro Herrera2020-05-06
| | | | | | | | | | | | | | | We were acquiring object locks then deleting objects one by one, instead of acquiring all object locks first, ignoring those that did not exist, and then deleting all objects together. The latter is the correct protocol to use, and what this commits changes to code to do. Failing to follow that leads to "cache lookup failed for relation XYZ" error reports when DROP OWNED runs concurrently with other DDL -- for example, a session termination that removes some temp tables. Author: Álvaro Herrera Reported-by: Mithun Chicklore Yogendra (Mithun CY) Reviewed-by: Ahsan Hadi, Tom Lane Discussion: https://postgr.es/m/CADq3xVZTbzK4ZLKq+dn_vB4QafXXbmMgDP3trY-GuLnib2Ai1w@mail.gmail.com
* Fix severe memory leaks in GSSAPI encryption support.Tom Lane2020-05-05
| | | | | | | | | | | | | | | | | Both the backend and libpq leaked buffers containing encrypted data to be transmitted, so that the process size would grow roughly as the total amount of data sent. There were also far-less-critical leaks of the same sort in GSSAPI session establishment. Oversight in commit b0b39f72b, which I failed to notice while reviewing the code in 2c0cdc818. Per complaint from pmc@citylink. Back-patch to v12 where this code was introduced. Discussion: https://postgr.es/m/20200504115649.GA77072@gate.oper.dinoex.org
* Change the display of WAL usage statistics in Explain.Amit Kapila2020-05-05
| | | | | | | | | | | | | | | | In commit 33e05f89c5, we have added the option to display WAL usage statistics in Explain and auto_explain. The display format used two spaces between each field which is inconsistent with Buffer usage statistics which is using one space between each field. Change the format to make WAL usage statistics consistent with Buffer usage statistics. This commit also changed the usage of "full page writes" to "full page images" for WAL usage statistics to make it consistent with other parts of code and docs. Author: Julien Rouhaud, Amit Kapila Reviewed-by: Justin Pryzby, Kyotaro Horiguchi and Amit Kapila Discussion: https://postgr.es/m/CAB-hujrP8ZfUkvL5OYETipQwA=e3n7oqHFU=4ZLxWS_Cza3kQQ@mail.gmail.com
* Fix typo in commentAlexander Korotkov2020-05-03
| | | | Reported-by: Oleg Bartunov
* Refactor btvacuumpage().Peter Geoghegan2020-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | Remove one of the arguments to btvacuumpage(), and give up on the idea that it's a recursive function. We now use the term "backtracking" to refer to the case where an earlier block must be visited to make sure no tuples that need to be removed were missed. Advertising btvacuumpage() as a recursive function was unhelpful. In reality the function always simulates recursion with a loop (it doesn't actually call itself). This wasn't just necessary as a precaution (per the comments mentioning tail recursion), though. There is no reliable natural limit on the number of times we can backtrack. There are important behavioral difference when "recursing"/backtracking, mostly related to page deletion. We don't perform page deletion when backtracking due to the extra complexity. And when we recurse, we're not performing a physical order scan anymore, so we expect fairly different conditions to hold for the page. Structuring the code like this makes it clearer how _bt_pagedel() cooperates with btvacuumpage() and btvacuumscan() (as established in commit b0229f26 and commit 73a076b0). Author: Peter Geoghegan Reviewed-By: Masahiko Sawada Discussion: https://postgr.es/m/CAH2-WzmRGMDWiLMcb+zagG9652PboNN4Gfcq1Gc_wJL6A716MA@mail.gmail.com
* Fix GSS client to non-GSS server connectionStephen Frost2020-05-02
| | | | | | | | | | | | | | | | | | If the client is compiled with GSSAPI support and tries to start up GSS with the server, but the server is not compiled with GSSAPI support, we would mistakenly end up falling through to call ProcessStartupPacket with secure_done = true, but the client might then try to perform SSL, which the backend wouldn't understand and we'd end up failing the connection with: FATAL: unsupported frontend protocol 1234.5679: server supports 2.0 to 3.0 Fix by arranging to track ssl_done independently from gss_done, instead of trying to use the same boolean for both. Author: Andrew Gierth Discussion: https://postgr.es/m/87h82kzwqn.fsf@news-spur.riddles.org.uk Backpatch: 12-, where GSSAPI encryption was added.
* Remove superfluous memset from pgstat_recv_resetslrucounterTomas Vondra2020-05-02
| | | | | | | | | The extra memset meant pg_stat_reset_slru() always reset all the entries even when reset of a single entry was requested, but the timestamp was left uninitialized. Reported-by: Atsushi Torikoshi Discussion: https://postgr.es/m/CACZ0uYFe16pjZxQYaTn53mspyM7dgMPYL3DJLjjPw69GMCC2Ow%40mail.gmail.com
* Simplify cost_incremental_sort a bitTomas Vondra2020-05-02
| | | | | | | | | | Commit de0dc1a847 added code to cost_incremental_sort to handle varno 0. Explicitly removing the RelabelType is not really necessary, because the pull_varnos handles that just fine, which simplifies the code a bit. Author: Richard Guo Discussion: https://postgr.es/m/CAMbWs4_3_D2J5XxOuw68hvn0-gJsw9FXNSGcZka9aTymn9UJ8A%40mail.gmail.com Discussion: https://postgr.es/m/20200411214639.GK2228%40telsasoft.com
* Remove pg_xact entry from SLRU statsTomas Vondra2020-05-02
| | | | | | | The "pg_xact" entry was duplicate with "clog" and was added by mistake. Reported-by: Fujii Masao Discussion: https://postgr.es/m/20200119143707.gyinppnigokesjok@development
* Get rid of trailing semicolons in C macro definitions.Tom Lane2020-05-01
| | | | | | | | | | | | | | | | | | | | | | | Writing a trailing semicolon in a macro is almost never the right thing, because you almost always want to write a semicolon after each macro call instead. (Even if there was some reason to prefer not to, pgindent would probably make a hash of code formatted that way; so within PG the rule should basically be "don't do it".) Thus, if we have a semi inside the macro, the compiler sees "something;;". Much of the time the extra empty statement is harmless, but it could lead to mysterious syntax errors at call sites. In perhaps an overabundance of neatnik-ism, let's run around and get rid of the excess semicolons whereever possible. The only thing worse than a mysterious syntax error is a mysterious syntax error that only happens in the back branches; therefore, backpatch these changes where relevant, which is most of them because most of these mistakes are old. (The lack of reported problems shows that this is largely a hypothetical issue, but still, it could bite us in some future patch.) John Naylor and Tom Lane Discussion: https://postgr.es/m/CACPNZCs0qWTqJ2QUSGJ07B7uvAvzMb-KbG2q+oo+J3tsWN5cqw@mail.gmail.com
* Clear up issue with FSM and oldest bpto.xact.Peter Geoghegan2020-05-01
| | | | | | | | | | | | | | | | | On further reflection, code comments added by commit b0229f26 slightly misrepresented how we determine the oldest bpto.xact for the index. btvacuumpage() does not treat the bpto.xact of a page that it put in the FSM as a candidate to be the oldest deleted page (the delete-marked page that has the oldest bpto.xact XID among all pages encountered). The definition of a deleted page for the purposes of the bpto.xact calculation is different from the definition used by the bulk delete statistics. The bulk delete statistics don't distinguish between pages that were deleted by the current VACUUM, pages deleted by a previous VACUUM operation but not yet recyclable/reusable, and pages that are reusable (though reusable pages are counted separately). Backpatch: 11-, just like commit b0229f26.
* Reorder function prototypes for consistency.Peter Geoghegan2020-05-01
|
* Fix undercounting in VACUUM VERBOSE output.Peter Geoghegan2020-05-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The logic for determining how many nbtree pages in an index are deleted pages sometimes undercounted pages. Pages that were deleted by the current VACUUM operation (as opposed to some previous VACUUM operation whose deleted pages have yet to be reused) were sometimes overlooked. The final count is exposed to users through VACUUM VERBOSE's "%u index pages have been deleted" output. btvacuumpage() avoided double-counting when _bt_pagedel() deleted more than one page by assuming that only one page was deleted, and that the additional deleted pages would get picked up during a future call to btvacuumpage() by the same VACUUM operation. _bt_pagedel() can legitimately delete pages that the btvacuumscan() scan will not visit again, though, so that assumption was slightly faulty. Fix the accounting by teaching _bt_pagedel() about its caller's requirements. It now only reports on pages that it knows btvacuumscan() won't visit again (including the current btvacuumpage() page), so everything works out in the end. This bug has been around forever. Only backpatch to v11, though, to keep _bt_pagedel() is sync on the branches that have today's bugfix commit b0229f26da. Note that this commit changes the signature of _bt_pagedel(), just like commit b0229f26da. Author: Peter Geoghegan Reviewed-By: Masahiko Sawada Discussion: https://postgr.es/m/CAH2-WzkrXBcMQWAYUJMFTTvzx_r4q=pYSjDe07JnUXhe+OZnJA@mail.gmail.com Backpatch: 11-
* Fix bug in nbtree VACUUM "skip full scan" feature.Peter Geoghegan2020-05-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 857f9c36cda (which taught nbtree VACUUM to skip a scan of the index from btcleanup in situations where it doesn't seem worth it) made VACUUM maintain the oldest btpo.xact among all deleted pages for the index as a whole. It failed to handle all the details surrounding pages that are deleted by the current VACUUM operation correctly (though pages deleted by some previous VACUUM operation were processed correctly). The most immediate problem was that the special area of the page was examined without a buffer pin at one point. More fundamentally, the handling failed to account for the full range of _bt_pagedel() behaviors. For example, _bt_pagedel() sometimes deletes internal pages in passing, as part of deleting an entire subtree with btvacuumpage() caller's page as the leaf level page. The original leaf page passed to _bt_pagedel() might not be the page that it deletes first in cases where deletion can take place. It's unclear how disruptive this bug may have been, or what symptoms users might want to look out for. The issue was spotted during unrelated code review. To fix, push down the logic for maintaining the oldest btpo.xact to _bt_pagedel(). btvacuumpage() is now responsible for pages that were fully deleted by a previous VACUUM operation, while _bt_pagedel() is now responsible for pages that were deleted by the current VACUUM operation (this includes half-dead pages from a previous interrupted VACUUM operation that become fully deleted in _bt_pagedel()). Note that _bt_pagedel() should never encounter an existing deleted page. This commit theoretically breaks the ABI of a stable release by changing the signature of _bt_pagedel(). However, if any third party extension is actually affected by this, then it must already be completely broken (since there are numerous assumptions made in _bt_pagedel() that cannot be met outside of VACUUM). It seems highly unlikely that such an extension actually exists, in any case. Author: Peter Geoghegan Reviewed-By: Masahiko Sawada Discussion: https://postgr.es/m/CAH2-WzkrXBcMQWAYUJMFTTvzx_r4q=pYSjDe07JnUXhe+OZnJA@mail.gmail.com Backpatch: 11-, where the "skip full scan" feature was introduced.
* Fix AddressSanitizer use-after-scope complaint.Peter Geoghegan2020-04-30
| | | | | | | | | | | XLogRegisterBufData() does not copy data pointed to by caller's pointer argument. Oversight in commit 0d861bbb702. Author: Peter Eisentraut Reported-By: Peter Eisentraut Discussion: https://postgr.es/m/21800dbe-a13e-22f7-d423-b81db9d249f5@2ndquadrant.com