aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
* Fix locking bugs that could corrupt pg_control.Thomas Munro2020-06-08
| | | | | | | | | | | | | | | | | | The redo routines for XLOG_CHECKPOINT_{ONLINE,SHUTDOWN} must acquire ControlFileLock before modifying ControlFile->checkPointCopy, or the checkpointer could write out a control file with a bad checksum. Likewise, XLogReportParameters() must acquire ControlFileLock before modifying ControlFile and calling UpdateControlFile(). Back-patch to all supported releases. Author: Nathan Bossart <bossartn@amazon.com> Author: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/70BF24D6-DC51-443F-B55A-95735803842A%40amazon.com
* Fix instance of elog() called while holding a spinlockMichael Paquier2020-06-04
| | | | | | | | This broke the project rule to not call any complex code while a spinlock is held. Issue introduced by b89e151. Discussion: https://postgr.es/m/20200602.161518.1399689010416646074.horikyota.ntt@gmail.com Backpatch-through: 9.5
* Fix use-after-release mistake in currtid() and currtid2() for viewsMichael Paquier2020-06-01
| | | | | | | | | | This issue has been present since the introduction of this code as of a3519a2 from 2002, and has been found by buildfarm member prion that uses RELCACHE_FORCE_RELEASE via the tests introduced recently in e786be5. Discussion: https://postgr.es/m/20200601022055.GB4121@paquier.xyz Backpatch-through: 9.5
* Add CHECK_FOR_INTERRUPTS() to the repeat() functionJoe Conway2020-05-28
| | | | | | | | | | | The repeat() function loops for potentially a long time without ever checking for interrupts. This prevents, for example, a query cancel from interrupting until the work is all done. Fix by inserting a CHECK_FOR_INTERRUPTS() into the loop. Backpatch to all supported versions. Discussion: https://www.postgresql.org/message-id/flat/8692553c-7fe8-17d9-cbc1-7cddb758f4c6%40joeconway.com
* Fix comment in slot.c.Amit Kapila2020-05-18
| | | | | | | | Reported-by: Sawada Masahiko Author: Sawada Masahiko Reviewed-by: Amit Kapila Backpatch-through: 9.5 Discussion: https://postgr.es/m/CA+fd4k4Ws7M7YQ8PqSym5WB1y75dZeBTd1sZJUQdfe0KJQ-iSA@mail.gmail.com
* Fix bogus initialization of replication origin shared memory state.Tom Lane2020-05-15
| | | | | | | | | | | | | | | | | The previous coding zeroed out offsetof(ReplicationStateCtl, states) more bytes than it was entitled to, as a consequence of starting the zeroing from the wrong pointer (or, if you prefer, using the wrong calculation of how much to zero). It's unsurprising that this has not caused any reported problems, since it can be expected that the newly-allocated block is at the end of what we've used in shared memory, and we always make the shmem block substantially bigger than minimally necessary. Nonetheless, this is wrong and it could bite us someday; plus it's a dangerous model for somebody to copy. This dates back to the introduction of this code (commit 5aa235042), so back-patch to all supported branches.
* Avoid killing btree items that are already deadAlvaro Herrera2020-05-15
| | | | | | | | | | | | | | | | | | | | | | _bt_killitems marks btree items dead when a scan leaves the page where they live, but it does so with only share lock (to improve concurrency). This was historicall okay, since killing a dead item has no consequences. However, with the advent of data checksums and wal_log_hints, this action incurs a WAL full-page-image record of the page. Multiple concurrent processes would write the same page several times, leading to WAL bloat. The probability of this happening can be reduced by only killing items if they're not already dead, so change the code to do that. The problem could eliminated completely by having _bt_killitems upgrade to exclusive lock upon seeing a killable item, but that would reduce concurrency so it's considered a cure worse than the disease. Backpatch all the way back to 9.5, since wal_log_hints was introduced in 9.4. Author: Masahiko Sawada <masahiko.sawada@2ndquadrant.com> Discussion: https://postgr.es/m/CA+fd4k6PeRj2CkzapWNrERkja5G0-6D-YQiKfbukJV+qZGFZ_Q@mail.gmail.com
* Fix the MSVC build for versions 2015 and later.Amit Kapila2020-05-14
| | | | | | | | | | | | | | | | | | | Visual Studio 2015 and later versions should still be able to do the same as Visual Studio 2012, but the declaration of locale_name is missing in _locale_t, causing the code compilation to fail, hence this falls back instead on to enumerating all system locales by using EnumSystemLocalesEx to find the required locale name.  If the input argument is in Unix-style then we can get ISO Locale name directly by using GetLocaleInfoEx() with LCType as LOCALE_SNAME. In passing, change the documentation references of the now obsolete links. Note that this problem occurs only with NLS enabled builds. Author: Juan José Santamaría Flecha, Davinder Singh and Amit Kapila Reviewed-by: Ranier Vilela and Amit Kapila Backpatch-through: 9.5 Discussion: https://postgr.es/m/CAHzhFSFoJEWezR96um4-rg5W6m2Rj9Ud2CNZvV4NWc9tXV7aXQ@mail.gmail.com
* Translation updatesPeter Eisentraut2020-05-11
| | | | | Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 5c769f79a22d341c6ea364d879ce33a16b1154d7
* Prevent archive recovery from scanning non-existent WAL files.Fujii Masao2020-05-09
| | | | | | | | | | | | | | | | | | | | | Previously when there were multiple timelines listed in the history file of the recovery target timeline, archive recovery searched all of them, starting from the newest timeline to the oldest one, to find the segment to read. That is, archive recovery had to continuously fail scanning the segment until it reached the timeline that the segment belonged to. These scans for non-existent segment could be harmful on the recovery performance especially when archival area was located on the remote storage and each scan could take a long time. To address the issue, this commit changes archive recovery so that it skips scanning the timeline that the segment to read doesn't belong to. Per discussion, back-patch to all supported versions. Author: Kyotaro Horiguchi, tweaked a bit by Fujii Masao Reviewed-by: David Steele, Pavel Suderevsky, Grigory Smolkin Discussion: https://postgr.es/m/16159-f5a34a3a04dc67e0@postgresql.org Discussion: https://postgr.es/m/20200129.120222.1476610231001551715.horikyota.ntt@gmail.com
* Heed lock protocol in DROP OWNED BYAlvaro Herrera2020-05-06
| | | | | | | | | | | | | | | We were acquiring object locks then deleting objects one by one, instead of acquiring all object locks first, ignoring those that did not exist, and then deleting all objects together. The latter is the correct protocol to use, and what this commits changes to code to do. Failing to follow that leads to "cache lookup failed for relation XYZ" error reports when DROP OWNED runs concurrently with other DDL -- for example, a session termination that removes some temp tables. Author: Álvaro Herrera Reported-by: Mithun Chicklore Yogendra (Mithun CY) Reviewed-by: Ahsan Hadi, Tom Lane Discussion: https://postgr.es/m/CADq3xVZTbzK4ZLKq+dn_vB4QafXXbmMgDP3trY-GuLnib2Ai1w@mail.gmail.com
* Get rid of trailing semicolons in C macro definitions.Tom Lane2020-05-01
| | | | | | | | | | | | | | | | | | | | | | | Writing a trailing semicolon in a macro is almost never the right thing, because you almost always want to write a semicolon after each macro call instead. (Even if there was some reason to prefer not to, pgindent would probably make a hash of code formatted that way; so within PG the rule should basically be "don't do it".) Thus, if we have a semi inside the macro, the compiler sees "something;;". Much of the time the extra empty statement is harmless, but it could lead to mysterious syntax errors at call sites. In perhaps an overabundance of neatnik-ism, let's run around and get rid of the excess semicolons whereever possible. The only thing worse than a mysterious syntax error is a mysterious syntax error that only happens in the back branches; therefore, backpatch these changes where relevant, which is most of them because most of these mistakes are old. (The lack of reported problems shows that this is largely a hypothetical issue, but still, it could bite us in some future patch.) John Naylor and Tom Lane Discussion: https://postgr.es/m/CACPNZCs0qWTqJ2QUSGJ07B7uvAvzMb-KbG2q+oo+J3tsWN5cqw@mail.gmail.com
* Fix error case for CREATE ROLE ... IN ROLE.Andrew Gierth2020-04-25
| | | | | | | | | | | | | | | | | | | | | | | | CreateRole() was passing a Value node, not a RoleSpec node, for the newly-created role name when adding the role as a member of existing roles for the IN ROLE syntax. This mistake went unnoticed because the node in question is used only for error messages and is not accessed on non-error paths. In older pg versions (such as 9.5 where this was found), this results in an "unexpected node type" error in place of the real error. That node type check was removed at some point, after which the code would accidentally fail to fail on 64-bit platforms (on which accessing the Value node as if it were a RoleSpec would be mostly harmless) or give an "unexpected role type" error on 32-bit platforms. Fix the code to pass the correct node type, and add an lfirst_node assertion just in case. Per report on irc from user m1chelangelo. Backpatch all the way, because this error has been around for a long time.
* Fix handling of WAL segments ready to be archived during crash recoveryMichael Paquier2020-04-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 78ea8b5 has fixed an issue related to the recycling of WAL segments on standbys depending on archive_mode. However, it has introduced a regression with the handling of WAL segments ready to be archived during crash recovery, causing those files to be recycled without getting archived. This commit fixes the regression by tracking in shared memory if a live cluster is either in crash recovery or archive recovery as the handling of WAL segments ready to be archived is different in both cases (those WAL segments should not be removed during crash recovery), and by using this new shared memory state to decide if a segment can be recycled or not. Previously, it was not possible to know if a cluster was in crash recovery or archive recovery as the shared state was able to track only if recovery was happening or not, leading to the problem. A set of TAP tests is added to close the gap here, making sure that WAL segments ready to be archived are correctly handled when a cluster is in archive or crash recovery with archive_mode set to "on" or "always", for both standby and primary. Reported-by: Benoît Lobréau Author: Jehan-Guillaume de Rorthais Reviewed-by: Kyotaro Horiguchi, Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/20200331172229.40ee00dc@firost Backpatch-through: 9.5
* Fix possible crash during FATAL exit from reindexing.Tom Lane2020-04-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | index.c supposed that it could just use a PG_TRY block to clean up the state associated with an active REINDEX operation. However, that code doesn't run if we do a FATAL exit --- for example, due to a SIGTERM shutdown signal --- while the REINDEX is happening. And that state does get consulted during catalog accesses, which makes it problematic if we do any catalog accesses during shutdown --- for example, to clean up any temp tables created in the session. If this combination of circumstances occurred, we could find ourselves trying to access already-freed memory. In debug builds that'd fairly reliably cause an assertion failure. In production we might often get away with it, but with some bad luck it could cause a core dump. Another possible bad outcome is an erroneous conclusion that an index-to-be-accessed is being reindexed; but it looks like that would be unlikely to have any consequences worse than failing to drop temp tables right away. (They'd still get dropped by the next session that uses that temp schema.) To fix, get rid of the use of PG_TRY here, and instead hook into the transaction abort mechanisms to clean up reindex state. Per bug #16378 from Alexander Lakhin. This has been wrong for a very long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/16378-7a70ca41b3ec2009@postgresql.org
* Clear dangling pointer to avoid bogus EXPLAIN printout in a corner case.Tom Lane2020-04-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | ExecReScanHashJoin will destroy the join's hash table if it expects that the inner relation will produce different rows on rescan. Up to now it's not bothered to clear the additional pointer to that hash table that exists in the child HashState node. However, it's possible for the query to terminate without building a fresh hash table (this happens if the outer relation is found to be empty during the final rescan). So we can end with a dangling pointer to a deleted hash table. That was harmless originally, but since 9.0 EXPLAIN ANALYZE has used that pointer to print hash table statistics. In debug builds this reproducibly results in garbage statistics. In non-debug builds there's frequently no ill effects, but in principle one could get wrong EXPLAIN ANALYZE output, or perhaps even a crash if free() has released the hashtable memory back to the OS. To fix, just make sure we clear the additional pointer when destroying the hash table. In problematic cases, EXPLAIN ANALYZE will then print no hashtable statistics (reverting to its pre-9.0 behavior). This isn't ideal, but since the problem manifests only in unusual corner cases, it's hard to justify taking any risks to do better in the back branches. A follow-on patch will improve matters in HEAD. Konstantin Knizhnik and Tom Lane, per diagnosis by Thomas Munro of a trouble report from Alvaro Herrera. Discussion: https://postgr.es/m/20200323165059.GA24950@alvherre.pgsql
* Fix circle_in to accept "(x,y),r" as it's advertised to do.Tom Lane2020-04-07
| | | | | | | | | | | Our documentation describes four allowed input syntaxes for circles, but the regression tests tried only three ... with predictable consequences. Remarkably, this has been wrong since the circle datatype was added in 1997, but nobody noticed till now. David Zhang, with some help from me Discussion: https://postgr.es/m/332c47fa-d951-7574-b5cc-a8f7f7201202@highgo.ca
* Adjust bytea get_bit/set_bit to cope with bytea strings > 256MB.Tom Lane2020-04-07
| | | | | | | | | | | | | | | | | | | Since the existing bit number argument can't exceed INT32_MAX, it's not possible for these functions to manipulate bits beyond the first 256MB of a bytea value. However, it'd be good if they could do at least that much, and not fall over entirely for longer bytea values. Adjust the comparisons to be done in int64 arithmetic so that works. Also tweak the error reports to show sane values in case of overflow. Also add some test cases to improve the miserable code coverage of these functions. Apply patch to back branches only; HEAD has a better solution as of commit 26a944cf2. Extracted from a much larger patch by Movead Li Discussion: https://postgr.es/m/20200312115135445367128@highgo.ca
* Preserve clustered index after rewrites with ALTER TABLEMichael Paquier2020-04-06
| | | | | | | | | | | | | A table rewritten by ALTER TABLE would lose tracking of an index usable for CLUSTER. This setting is tracked by pg_index.indisclustered and is controlled by ALTER TABLE, so some extra work was needed to restore it properly. Note that ALTER TABLE only marks the index that can be used for clustering, and does not do the actual operation. Author: Amit Langote, Justin Pryzby Reviewed-by: Ibrar Ahmed, Michael Paquier Discussion: https://postgr.es/m/20200202161718.GI13621@telsasoft.com Backpatch-through: 9.5
* Use TransactionXmin instead of RecentGlobalXmin in heap_abort_speculative().Andres Freund2020-04-05
| | | | | | | | | | | | | | | There's a very low risk that RecentGlobalXmin could be far enough in the past to be older than relfrozenxid, or even wrapped around. Luckily the consequences of that having happened wouldn't be too bad - the page wouldn't be pruned for a while. Avoid that risk by using TransactionXmin instead. As that's announced via MyPgXact->xmin, it is protected against wrapping around (see code comments for details around relfrozenxid). Author: Andres Freund Discussion: https://postgr.es/m/20200328213023.s4eyijhdosuc4vcj@alap3.anarazel.de Backpatch: 9.5-
* Save errno across LWLockRelease() callsPeter Eisentraut2020-04-05
| | | | | | Fixup for "Drop slot's LWLock before returning from SaveSlotToPath()" Reported-by: Michael Paquier <michael@paquier.xyz>
* Fix bugs in gin_fuzzy_search_limit processing.Tom Lane2020-04-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | entryGetItem()'s three code paths each contained bugs associated with filtering the entries for gin_fuzzy_search_limit. The posting-tree path failed to advance "advancePast" after having decided to filter an item. If we ran out of items on the current page and needed to advance to the next, what would actually happen is that entryLoadMoreItems() would re-load the same page. Eventually, the random dropItem() test would accept one of the same items it'd previously rejected, and we'd move on --- but it could take awhile with small gin_fuzzy_search_limit. To add insult to injury, this case would inevitably cause entryLoadMoreItems() to decide it needed to re-descend from the root, making things even slower. The posting-list path failed to implement gin_fuzzy_search_limit filtering at all, so that all entries in the posting list would be returned. The bitmap-result path used a "gotitem" variable that it failed to update in the one place where it'd actually make a difference, ie at the one "continue" statement. I think this was unreachable in practice, because if we'd looped around then it shouldn't be the case that the entries on the new page are before advancePast. Still, the "gotitem" variable was contributing nothing to either clarity or correctness, so get rid of it. Refactor all three loops so that the termination conditions are more alike and less unreadable. The code coverage report showed that we had no coverage at all for the re-descend-from-root code path in entryLoadMoreItems(), which seems like a very bad thing, so add a test case that exercises it. We also had exactly no coverage for gin_fuzzy_search_limit, so add a simplistic test case that at least hits those code paths a little bit. Back-patch to all supported branches. Adé Heyward and Tom Lane Discussion: https://postgr.es/m/CAEknJCdS-dE1Heddptm7ay2xTbSeADbkaQ8bU2AXRCVC2LdtKQ@mail.gmail.com
* Ensure snapshot is registered within ScanPgRelation().Andres Freund2020-03-28
| | | | | | | | | | | | | | | | | | | | | In 9.4 I added support to use a historical snapshot in ScanPgRelation(), while adding logical decoding. Unfortunately a conflict with the concurrent removal of SnapshotNow was incorrectly resolved, leading to an unregistered snapshot being used. It is not correct to use an unregistered (or non-active) snapshot for anything non-trivial, because catalog invalidations can cause the snapshot to be invalidated. Luckily it seems unlikely to actively cause problems in practice, as ScanPgRelation() requires that we already have a lock on the relation, we only look for a single row, and we don't appear to rely on the result's tid to be correct. It however is clearly wrong and potential negative consequences would likely be hard to find. So it seems worth backpatching the fix, even without a concrete hazard. Discussion: https://postgr.es/m/20200229052459.wzhqnbhrriezg4v2@alap3.anarazel.de Backpatch: 9.5-
* Drop slot's LWLock before returning from SaveSlotToPath()Peter Eisentraut2020-03-26
| | | | | | | | | | | | | When SaveSlotToPath() is called with elevel=LOG, the early exits didn't release the slot's io_in_progress_lock. This could result in a walsender being stuck on the lock forever. A possible way to get into this situation is if the offending code paths are triggered in a low disk space situation. Author: Pavan Deolasee <pavan.deolasee@2ndquadrant.com> Reported-by: Craig Ringer <craig@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/56a138c5-de61-f553-7e8f-6789296de785%402ndquadrant.com
* Revert "Skip WAL for new relfilenodes, under wal_level=minimal."Noah Misch2020-03-22
| | | | | | | | This reverts commit cb2fd7eac285b1b0a24eeb2b8ed4456b66c5a09f. Per numerous buildfarm members, it was incompatible with parallel query, and a test case assumed LP64. Back-patch to 9.5 (all supported versions). Discussion: https://postgr.es/m/20200321224920.GB1763544@rfd.leadboat.com
* Skip WAL for new relfilenodes, under wal_level=minimal.Noah Misch2020-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Until now, only selected bulk operations (e.g. COPY) did this. If a given relfilenode received both a WAL-skipping COPY and a WAL-logged operation (e.g. INSERT), recovery could lose tuples from the COPY. See src/backend/access/transam/README section "Skipping WAL for New RelFileNode" for the new coding rules. Maintainers of table access methods should examine that section. To maintain data durability, just before commit, we choose between an fsync of the relfilenode and copying its contents to WAL. A new GUC, wal_skip_threshold, guides that choice. If this change slows a workload that creates small, permanent relfilenodes under wal_level=minimal, try adjusting wal_skip_threshold. Users setting a timeout on COMMIT may need to adjust that timeout, and log_min_duration_statement analysis will reflect time consumption moving to COMMIT from commands like COPY. Internally, this requires a reliable determination of whether RollbackAndReleaseCurrentSubTransaction() would unlink a relation's current relfilenode. Introduce rd_firstRelfilenodeSubid. Amend the specification of rd_createSubid such that the field is zero when a new rel has an old rd_node. Make relcache.c retain entries for certain dropped relations until end of transaction. Back-patch to 9.5 (all supported versions). This introduces a new WAL record type, XLOG_GIST_ASSIGN_LSN, without bumping XLOG_PAGE_MAGIC. As always, update standby systems before master systems. This changes sizeof(RelationData) and sizeof(IndexStmt), breaking binary compatibility for affected extensions. (The most recent commit to affect the same class of extensions was 089e4d405d0f3b94c74a2c6a54357a84a681754b.) Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert Haas. Heikki Linnakangas and Michael Paquier implemented earlier designs that materially clarified the problem. Reviewed, in earlier designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane, Fujii Masao, and Simon Riggs. Reported by Martijn van Oosterhout. Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org
* Back-patch log_newpage_range().Noah Misch2020-03-21
| | | | | | | | | | | Back-patch a subset of commit 9155580fd5fc2a0cbb23376dfca7cd21f59c2c7b to v11, v10, 9.6, and 9.5. Include the latest repairs to this function. Use a new XLOG_FPI_MULTI value instead of reusing XLOG_FPI. That way, if an older server reads WAL from this function, that server will PANIC instead of applying just one page of the record. The next commit adds a call to this function. Discussion: https://postgr.es/m/20200304.162919.898938381201316571.horikyota.ntt@gmail.com
* During heap rebuild, lock any TOAST index until end of transaction.Noah Misch2020-03-21
| | | | | | | | | | | | swap_relation_files() calls toast_get_valid_index() to find and lock this index, just before swapping with the rebuilt TOAST index. The latter function releases the lock before returning. Potential for mischief is low; a concurrent session can issue ALTER INDEX ... SET (fillfactor = ...), which is not alarming. Nonetheless, changing pg_class.relfilenode without a lock is unconventional. Back-patch to 9.5 (all supported versions), because another fix needs this. Discussion: https://postgr.es/m/20191226001521.GA1772687@rfd.leadboat.com
* Fix cosmetic blemishes involving rd_createSubid.Noah Misch2020-03-21
| | | | | | | Remove an obsolete comment from AtEOXact_cleanup(). Restore formatting of a comment in struct RelationData, mangled by the pgindent run in commit 9af4159fce6654aa0e081b00d02bca40b978745c. Back-patch to 9.5 (all supported versions), because another fix stacks on this.
* Add missing errcode() in a few ereport calls.Amit Kapila2020-03-18
| | | | | | | | | | This will allow to specifying SQLSTATE error code for the errors in the missing places. Reported-by: Sawada Masahiko Author: Sawada Masahiko Backpatch-through: 9.5 Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
* Avoid holding a directory FD open across assorted SRF calls.Tom Lane2020-03-16
| | | | | | | | | | | | | | | | | This extends the fixes made in commit 085b6b667 to other SRFs with the same bug, namely pg_logdir_ls(), pgrowlocks(), pg_timezone_names(), pg_ls_dir(), and pg_tablespace_databases(). Also adjust various comments and documentation to warn against expecting to clean up resources during a ValuePerCall SRF's final call. Back-patch to all supported branches, since these functions were all born broken. Justin Pryzby, with cosmetic tweaks by me Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
* C comment: correct commented bytes of max_cached_tuplebufsBruce Momjian2020-03-14
| | | | | | | | | | The comment said ~8MB, but it is actually ~64MB. Reported-by: Kuntal Ghosh Discussion: https://postgr.es/m/CAGz5QC+GGmHdnxp04B6wcLz2Zcd_HU+wCBrsPyOZP62-BJghig@mail.gmail.com Backpatch-through: 9.5-10
* Restructure polymorphic-type resolution in funcapi.c.Tom Lane2020-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | resolve_polymorphic_tupdesc() and resolve_polymorphic_argtypes() failed to cover the case of having to resolve anyarray given only an anyrange input. The bug was masked if anyelement was also used (as either input or output), which probably helps account for our not having noticed. While looking at this I noticed that resolve_generic_type() would produce the wrong answer if asked to make that same resolution. ISTM that resolve_generic_type() is confusingly defined and overly complex, so rather than fix it, let's just make funcapi.c do the specific lookups it requires for itself. With this change, resolve_generic_type() is not used anywhere, so remove it in HEAD. In the back branches, leave it alone (complete with bug) just in case any external code is using it. While we're here, make some other refactoring adjustments in funcapi.c with an eye to upcoming future expansion of the set of polymorphic types: * Simplify quick-exit tests by adding an overall have_polymorphic_result flag. This is about a wash now but will be a win when there are more flags. * Reduce duplication of code between resolve_polymorphic_tupdesc() and resolve_polymorphic_argtypes(). * Don't bother to validate correct matching of anynonarray or anyenum; the parser should have done that, and even if it didn't, just doing "return false" here would lead to a very confusing, off-point error message. (Really, "return false" in these two functions should only occur if the call_expr isn't supplied or we can't obtain data type info from it.) * For the same reason, throw an elog rather than "return false" if we fail to resolve a polymorphic type. The bug's been there since we added anyrange, so back-patch to all supported branches. Discussion: https://postgr.es/m/6093.1584202130@sss.pgh.pa.us
* Preserve replica identity index across ALTER TABLE rewritePeter Eisentraut2020-03-13
| | | | | | | | | | | | If an index was explicitly set as replica identity index, this setting was lost when a table was rewritten by ALTER TABLE. Because this setting is part of pg_index but actually controlled by ALTER TABLE (not part of CREATE INDEX, say), we have to do some extra work to restore it. Based-on-patch-by: Quan Zongliang <quanzongliang@gmail.com> Reviewed-by: Euler Taveira <euler.taveira@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/c70fcab2-4866-0d9f-1d01-e75e189db342@gmail.com
* Fix nextXid tracking bug on standbys (9.5-11 only).Thomas Munro2020-03-12
| | | | | | | | | | | | | | | | | | | | | | RecordKnownAssignedTransactionIds() should never move nextXid backwards. Before this commit, that could happen if some other code path had advanced it without advancing latestObservedXid. One consequence is that a well timed XLOG_CHECKPOINT_ONLINE could cause hot standby feedback messages to get confused and report an xmin from a future epoch, potentially allowing vacuum to run too soon on the primary. Repair, by making sure RecordKnownAssignedTransactionIds() can only move nextXid forwards. In release 12 and master, this was already done by commit 2fc7af5e, which consolidated similar code and straightened out this bug. Back-patch to supported releases before that. Author: Eka Palamadai <ekanatha@amazon.com> Discussion: https://postgr.es/m/98BB4805-D0A2-48E1-96F4-15014313EADC@amazon.com
* Fix bug that causes to report waiting in PS display twice, in hot standby.Fujii Masao2020-03-10
| | | | | | | | | | | | | | | | | | | | | Previously "waiting" could appear twice via PS in case of lock conflict in hot standby mode. Specifically this issue happend when the delay in WAL application determined by max_standby_archive_delay and max_standby_streaming_delay had passed but it took more than 500 msec to cancel all the conflicting transactions. Especially we can observe this easily by setting those delay parameters to -1. The cause of this issue was that WaitOnLock() and ResolveRecoveryConflictWithVirtualXIDs() added "waiting" to the process title in that case. This commit prevents ResolveRecoveryConflictWithVirtualXIDs() from reporting waiting in case of lock conflict, to fix the bug. Back-patch to all back branches. Author: Masahiko Sawada Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/CA+fd4k4mXWTwfQLS3RPwGr4xnfAEs1ysFfgYHvmmoUgv6Zxvmg@mail.gmail.com
* Avoid assertion failure with targeted recovery in standby mode.Fujii Masao2020-03-09
| | | | | | | | | | | | | | | | | | | | | | | | At the end of recovery, standby mode is turned off to re-fetch the last valid record from archive or pg_wal. Previously, if recovery target was reached and standby mode was turned off while the current WAL source was stream, recovery could try to retrieve WAL file containing the last valid record unexpectedly from stream even though not in standby mode. This caused an assertion failure. That is, the assertion test confirms that WAL file should not be retrieved from stream if standby mode is not true. This commit moves back the current WAL source to archive if it's stream even though not in standby mode, to avoid that assertion failure. This issue doesn't cause the server to crash when built with assertion disabled. In this case, the attempt to retrieve WAL file from stream not in standby mode just fails. And then recovery tries to retrieve WAL file from archive or pg_wal. Back-patch to all supported branches. Author: Kyotaro Horiguchi Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/20200227.124830.2197604521555566121.horikyota.ntt@gmail.com
* Fix mesurement of elapsed time during truncating heap in VACUUM.Fujii Masao2020-02-19
| | | | | | | | | | | | | | | | | | | | | | | | | | VACUUM may truncate heap in several batches. The activity report is logged for each batch, and contains the number of pages in the table before and after the truncation, and also the elapsed time during the truncation. Previously the elapsed time reported in each batch was the total elapsed time since starting the truncation until finishing each batch. For example, if the truncation was processed dividing into three batches, the second batch reported the accumulated time elapsed during both first and second batches. This is strange and confusing because the number of pages in the table reported together is not total. Instead, each batch should report the time elapsed during only that batch. The cause of this issue was that the resource usage snapshot was initialized only at the beginning of the truncation and was never reset later. This commit fixes the issue by changing VACUUM so that the resource usage snapshot is reset at each batch. Back-patch to all supported branches. Reported-by: Tatsuhito Kasahara Author: Tatsuhito Kasahara Reviewed-by: Masahiko Sawada, Fujii Masao Discussion: https://postgr.es/m/CAP0=ZVJsf=NvQuy+QXQZ7B=ZVLoDV_JzsVC1FRsF1G18i3zMGg@mail.gmail.com
* Stop demanding that top xact must be seen before subxact in decoding.Amit Kapila2020-02-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | Manifested as ERROR: subtransaction logged without previous top-level txn record this check forbids legit behaviours like - First xl_xact_assignment record is beyond reading, i.e. earlier restart_lsn. - After restart_lsn there is some change of a subxact. - After that, there is second xl_xact_assignment (for another subxact) revealing the relationship between top and first subxact. Such a transaction won't be streamed anyway because we hadn't seen it in full. Saying for sure whether xact of some record encountered after the snapshot was deserialized can be streamed or not requires to know whether it wrote something before deserialization point --if yes, it hasn't been seen in full and can't be decoded. Snapshot doesn't have such info, so there is no easy way to relax the check. Reported-by: Hsu, John Diagnosed-by: Arseny Sher Author: Arseny Sher, Amit Kapila Reviewed-by: Amit Kapila, Dilip Kumar Backpatch-through: 9.5 Discussion: https://postgr.es/m/AB5978B2-1772-4FEE-A245-74C91704ECB0@amazon.com
* Translation updatesPeter Eisentraut2020-02-10
| | | | | Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: cde576a776a749a424b649f24259486752c7884d
* Revert commit 606f350de9.Fujii Masao2020-02-03
| | | | | | | | | | | | This commit reverts the fix "Make inherited TRUNCATE perform access permission checks on parent table only" only in the back branches. It's not hard to imagine that there are some applications expecting the old behavior and the fix breaks their security. To avoid this compatibility problem, we decided to apply the fix only in HEAD and revert it in all supported back branches. Discussion: https://postgr.es/m/21015.1580400165@sss.pgh.pa.us
* Fix memory leak on DSM slot exhaustion.Thomas Munro2020-02-01
| | | | | | | | | | | | | | If we attempt to create a DSM segment when no slots are available, we should return the memory to the operating system. Previously we did that if the DSM_CREATE_NULL_IF_MAXSEGMENTS flag was passed in, but we didn't do it if an error was raised. Repair. Back-patch to 9.4, where DSM segments arrived. Author: Thomas Munro Reviewed-by: Robert Haas Reported-by: Julian Backes Discussion: https://postgr.es/m/CA%2BhUKGKAAoEw-R4om0d2YM4eqT1eGEi6%3DQot-3ceDR-SLiWVDw%40mail.gmail.com
* Fix CheckAttributeType's handling of collations for ranges.Tom Lane2020-01-31
| | | | | | | | | | | | | | | | | | | | Commit fc7695891 changed CheckAttributeType to recurse into ranges, but made it pass down the wrong collation (always InvalidOid, since ranges as such have no collation). This would result in guaranteed failure when considering a range type whose subtype is collatable. Embarrassingly, we lack any regression tests that would expose such a problem (but fortunately, somebody noticed before we shipped this bug in any release). Fix it to pass down the range's subtype collation property instead, and add some regression test cases to exercise collatable-subtype ranges a bit more. Back-patch to all supported branches, as the previous patch was. Report and patch by Julien Rouhaud, test cases tweaked by me Discussion: https://postgr.es/m/CAOBaU_aBWqNweiGUFX0guzBKkcfJ8mnnyyGC_KBQmO12Mj5f_A@mail.gmail.com
* Make inherited TRUNCATE perform access permission checks on parent table only.Fujii Masao2020-01-31
| | | | | | | | | | | | | | Previously, TRUNCATE command through a parent table checked the permissions on not only the parent table but also the children tables inherited from it. This was a bug and inherited queries should perform access permission checks on the parent table only. This commit fixes that bug. Back-patch to all supported branches. Author: Amit Langote Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/CAHGQGwFHdSvifhJE+-GSNqUHSfbiKxaeQQ7HGcYz6SC2n_oDcg@mail.gmail.com
* Fix an oversight in commit 4c70098ff.Tom Lane2020-01-23
| | | | | | | | | | | | | | | | | | | | I had supposed that the from_char_seq_search() call sites were all passing the constant arrays you'd expect them to pass ... but on looking closer, the one for DY format was passing the days[] array not days_short[]. This accidentally worked because the day abbreviations in English are all the same as the first three letters of the full day names. However, once we took out the "maximum comparison length" logic, it stopped working. As penance for that oversight, add regression test cases covering this, as well as every other switch case in DCH_from_char() that was not reached according to the code coverage report. Also, fold the DCH_RM and DCH_rm cases into one --- now that seq_search is case independent, there's no need to pass different comparison arrays for those cases. Back-patch, as the previous commit was.
* Clean up formatting.c's logic for matching constant strings.Tom Lane2020-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | seq_search(), which is used to match input substrings to constants such as month and day names, had a lot of bizarre and unnecessary behaviors. It was mostly possible to avert our eyes from that before, but we don't want to duplicate those behaviors in the upcoming patch to allow recognition of non-English month and day names. So it's time to clean this up. In particular: * seq_search scribbled on the input string, which is a pretty dangerous thing to do, especially in the badly underdocumented way it was done here. Fortunately the input string is a temporary copy, but that was being made three subroutine levels away, making it something easy to break accidentally. The behavior is externally visible nonetheless, in the form of odd case-folding in error reports about unrecognized month/day names. The scribbling is evidently being done to save a few calls to pg_tolower, but that's such a cheap function (at least for ASCII data) that it's pretty pointless to worry about. In HEAD I switched it to be pg_ascii_tolower to ensure it is cheap in all cases; but there are corner cases in Turkish where this'd change behavior, so leave it as pg_tolower in the back branches. * seq_search insisted on knowing the case form (all-upper, all-lower, or initcap) of the constant strings, so that it didn't have to case-fold them to perform case-insensitive comparisons. This likewise seems like excessive micro-optimization, given that pg_tolower is certainly very cheap for ASCII data. It seems unsafe to assume that we know the case form that will come out of pg_locale.c for localized month/day names, so it's better just to define the comparison rule as "downcase all strings before comparing". (The choice between downcasing and upcasing is arbitrary so far as English is concerned, but it might not be in other locales, so follow citext's lead here.) * seq_search also had a parameter that'd cause it to report a match after a maximum number of characters, even if the constant string were longer than that. This was not actually used because no caller passed a value small enough to cut off a comparison. Replicating that behavior for localized month/day names seems expensive as well as useless, so let's get rid of that too. * from_char_seq_search used the maximum-length parameter to truncate the input string in error reports about not finding a matching name. This leads to rather confusing reports in many cases. Worse, it is outright dangerous if the input string isn't all-ASCII, because we risk truncating the string in the middle of a multibyte character. That'd lead either to delivering an illegible error message to the client, or to encoding-conversion failures that obscure the actual data problem. Get rid of that in favor of truncating at whitespace if any (a suggestion due to Alvaro Herrera). In addition to fixing these things, I const-ified the input string pointers of DCH_from_char and its subroutines, to make sure there aren't any other scribbling-on-input problems. The risk of generating a badly-encoded error message seems like enough of a bug to justify back-patching, so patch all supported branches. Discussion: https://postgr.es/m/29432.1579731087@sss.pgh.pa.us
* Fix concurrent indexing operations with temporary tablesMichael Paquier2020-01-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Attempting to use CREATE INDEX, DROP INDEX or REINDEX with CONCURRENTLY on a temporary relation with ON COMMIT actions triggered unexpected errors because those operations use multiple transactions internally to complete their work. Here is for example one confusing error when using ON COMMIT DELETE ROWS: ERROR: index "foo" already contains data Issues related to temporary relations and concurrent indexing are fixed in this commit by enforcing the non-concurrent path to be taken for temporary relations even if using CONCURRENTLY, transparently to the user. Using a non-concurrent path does not matter in practice as locks cannot be taken on a temporary relation by a session different than the one owning the relation, and the non-concurrent operation is more effective. The problem exists with REINDEX since v12 with the introduction of CONCURRENTLY, and with CREATE/DROP INDEX since CONCURRENTLY exists for those commands. In all supported versions, this caused only confusing error messages to be generated. Note that with REINDEX, it was also possible to issue a REINDEX CONCURRENTLY for a temporary relation owned by a different session, leading to a server crash. The idea to enforce transparently the non-concurrent code path for temporary relations comes originally from Andres Freund. Reported-by: Manuel Rigger Author: Michael Paquier, Heikki Linnakangas Reviewed-by: Andres Freund, Álvaro Herrera, Heikki Linnakangas Discussion: https://postgr.es/m/CA+u7OA6gP7YAeCguyseusYcc=uR8+ypjCcgDDCTzjQ+k6S9ksQ@mail.gmail.com Backpatch-through: 9.4
* Fix edge case leading to agg transitions skipping ExecAggTransReparent() calls.Andres Freund2020-01-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code checking whether an aggregate transition value needs to be reparented into the current context has always only compared the transition return value with the previous transition value by datum, i.e. without regard for NULLness. This normally works, because when the transition function returns NULL (via fcinfo->isnull), it'll return a value that won't be the same as its input value. But there's no hard requirement that that's the case. And it turns out, it's possible to hit this case (see discussion or reproducers), leading to a non-null transition value not being reparented, followed by a crash caused by that. Instead of adding another comparison of NULLness, instead have ExecAggTransReparent() ensure that pergroup->transValue ends up as 0 when the new transition value is NULL. That avoids having to add an additional branch to the much more common cases of the transition function returning the old transition value (which is a pointer in this case), and when the new value is different, but not NULL. In branches since 69c3936a149, also deduplicate the reparenting code between the expression evaluation based transitions, and the path for ordered aggregates. Reported-By: Teodor Sigaev, Nikita Glukhov Author: Andres Freund Discussion: https://postgr.es/m/bd34e930-cfec-ea9b-3827-a8bc50891393@sigaev.ru Backpatch: 9.4-, this issue has existed since at least 7.4
* Fix crash in BRIN inclusion op functions, due to missing datum copy.Heikki Linnakangas2020-01-20
| | | | | | | | | | | | | | | | | | | | | | | | | The BRIN add_value() and union() functions need to make a longer-lived copy of the argument, if they want to store it in the BrinValues struct also passed as argument. The functions for the "inclusion operator classes" used with box, range and inet types didn't take into account that the union helper function might return its argument as is, without making a copy. Check for that case, and make a copy if necessary. That case arises at least with the range_union() function, when one of the arguments is an 'empty' range: CREATE TABLE brintest (n numrange); CREATE INDEX brinidx ON brintest USING brin (n); INSERT INTO brintest VALUES ('empty'); INSERT INTO brintest VALUES (numrange(0, 2^1000::numeric)); INSERT INTO brintest VALUES ('(-1, 0)'); SELECT brin_desummarize_range('brinidx', 0); SELECT brin_summarize_range('brinidx', 0); Backpatch down to 9.5, where BRIN was introduced. Discussion: https://www.postgresql.org/message-id/e6e1d6eb-0a67-36aa-e779-bcca59167c14%40iki.fi Reviewed-by: Emre Hasegeli, Tom Lane, Alvaro Herrera
* Repair more failures with SubPlans in multi-row VALUES lists.Tom Lane2020-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 9b63c13f0 turns out to have been fundamentally misguided: the parent node's subPlan list is by no means the only way in which a child SubPlan node can be hooked into the outer execution state. As shown in bug #16213 from Matt Jibson, we can also get short-lived tuple table slots added to the outer es_tupleTable list. At this point I have little faith that there aren't other possible connections as well; the long time it took to notice this problem shows that this isn't a heavily-exercised situation. Therefore, revert that fix, returning to the coding that passed a NULL parent plan pointer down to the transiently-built subexpressions. That gives us a pretty good guarantee that they won't hook into the outer executor state in any way. But then we need some other solution to make SubPlans work. Adopt the solution speculated about in the previous commit's log message: do expression initialization at plan startup for just those VALUES rows containing SubPlans, abandoning the goal of reclaiming memory intra-query for those rows. In practice it seems unlikely that queries containing a vast number of VALUES rows would be using SubPlans in them, so this should not give up much. (BTW, this test case also refutes my claim in connection with the prior commit that the issue only arises with use of LATERAL. That was just wrong: some variants of SubLink always produce SubPlans.) As with previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/16213-871ac3bc208ecf23@postgresql.org