aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
* Add pg_stat_wal statistics view.Fujii Masao2020-10-02
| | | | | | | | | | | | | | | | | | | | This view shows the statistics about WAL activity. Currently it has only two columns: wal_buffers_full and stats_reset. wal_buffers_full column indicates the number of times WAL data was written to the disk because WAL buffers got full. This information is useful when tuning wal_buffers. stats_reset column indicates the time at which these statistics were last reset. pg_stat_wal view is also the basic infrastructure to expose other various statistics about WAL activity later. Bump PGSTAT_FILE_FORMAT_ID due to the change in pgstat format. Bump catalog version. Author: Masahiro Ikeda Reviewed-by: Takayuki Tsunakawa, Kyotaro Horiguchi, Amit Kapila, Fujii Masao Discussion: https://postgr.es/m/188bd3f2d2233cf97753b5ced02bb050@oss.nttdata.com
* Add block information in error context of WAL REDO apply loopMichael Paquier2020-10-02
| | | | | | | | | | | | | | | | Providing this information can be useful for example when diagnosing problems related to recovery conflicts or for recovery issues without having to go through the output generated by pg_waldump to get some information about the blocks a WAL record works on. The block information is printed in the same format as pg_waldump. This already existed in xlog.c for debugging purposes with -DWAL_DEBUG, so adding the block information in the callback has required just a small refactoring. Author: Bertrand Drouvot Reviewed-by: Michael Paquier, Masahiko Sawada Discussion: https://postgr.es/m/c31e2cba-efda-762c-f4ad-5c25e5dac3d0@amazon.com
* Set right-links during sorted GiST index build.Heikki Linnakangas2020-10-01
| | | | | | | | | | | This is not strictly necessary, as the right-links are only needed by scans that are concurrent with page splits, and neither scans or page splits can happen during sorted index build. But it seems like a good idea to set them anyway, if we e.g. want to add a check to amcheck in the future to verify that the chain of right-links is complete. Author: Andrey Borodin Discussion: https://www.postgresql.org/message-id/4D68C21F-9FB9-41DA-B663-FDFC8D143788%40yandex-team.ru
* Fix and test snapshot behavior on standby.Andres Freund2020-09-30
| | | | | | | | | | | | | | | | | | | | | | I (Andres) broke this in 623a9CA79bx, because I didn't think about the way snapshots are built on standbys sufficiently. Unfortunately our existing tests did not catch this, as they are all just querying with psql (therefore ending up with fresh snapshots). The fix is trivial, we just need to increment the transaction completion counter in ExpireTreeKnownAssignedTransactionIds(), which is the equivalent of ProcArrayEndTransaction() during recovery. This commit also adds a new test doing some basic testing of the correctness of snapshots built on standbys. To avoid the aforementioned issue of one-shot psql's not exercising the snapshot caching, the test uses a long lived psqls, similar to 013_crash_restart.pl. It'd be good to extend the test further. Reported-By: Ian Barwick <ian.barwick@2ndquadrant.com> Author: Andres Freund <andres@anarazel.de> Author: Ian Barwick <ian.barwick@2ndquadrant.com> Discussion: https://postgr.es/m/61291ffe-d611-f889-68b5-c298da9fb18f@2ndquadrant.com
* Reword partitioning error messageAlvaro Herrera2020-09-30
| | | | | | | | | | The error message about columns in the primary key not including all of the partition key was unclear; reword it. Backpatch all the way to pg11, where it appeared. Reported-by: Nagaraj Raj <nagaraj.sf@yahoo.com> Discussion: https://postgr.es/m/64062533.78364.1601415362244@mail.yahoo.com
* Fix handling of BC years in to_date/to_timestamp.Tom Lane2020-09-30
| | | | | | | | | | | | | | | | | | | | | | | Previously, a conversion such as to_date('-44-02-01','YYYY-MM-DD') would result in '0045-02-01 BC', as the code attempted to interpret the negative year as BC, but failed to apply the correction needed for our internal handling of BC years. Fix the off-by-one problem. Also, arrange for the combination of a negative year and an explicit "BC" marker to cancel out and produce AD. This is how the negative-century case works, so it seems sane to do likewise. Continue to read "year 0000" as 1 BC. Oracle would throw an error, but we've accepted that case for a long time so I'm hesitant to change it in a back-patch. Per bug #16419 from Saeed Hubaishan. Back-patch to all supported branches. Dar Alathar-Yemen and Tom Lane Discussion: https://postgr.es/m/16419-d8d9db0a7553f01b@postgresql.org
* Fix make_timestamp[tz] to accept negative years as meaning BC.Tom Lane2020-09-29
| | | | | | | | | | | Previously we threw an error. But make_date already allowed the case, so it is inconsistent as well as unhelpful for make_timestamp not to. Both functions continue to reject year zero. Code and test fixes by Peter Eisentraut, doc changes by me Discussion: https://postgr.es/m/13c0992c-f15a-a0ca-d839-91d3efd965d9@2ndquadrant.com
* Support for ISO 8601 in the jsonpath .datetime() methodAlexander Korotkov2020-09-29
| | | | | | | | | | | | | | | | The SQL standard doesn't require jsonpath .datetime() method to support the ISO 8601 format. But our to_json[b]() functions convert timestamps to text in the ISO 8601 format in the sake of compatibility with javascript. So, we add support of the ISO 8601 to the jsonpath .datetime() in the sake compatibility with to_json[b](). The standard mode of datetime parsing currently supports just template patterns and separators in the format string. In order to implement ISO 8601, we have to add support of the format string double quotes to the standard parsing mode. Discussion: https://postgr.es/m/94321be0-cc96-1a81-b6df-796f437f7c66%40postgrespro.ru Author: Nikita Glukhov, revised by me Backpatch-through: 13
* Remove excess space from jsonpath .datetime() default format stringAlexander Korotkov2020-09-29
| | | | | | | | | | | bffe1bd684 has introduced jsonpath .datetime() method, but default formats for time and timestamp contain excess space between time and timezone. This commit removes this excess space making behavior of .datetime() method standard-compliant. Discussion: https://postgr.es/m/94321be0-cc96-1a81-b6df-796f437f7c66%40postgrespro.ru Author: Nikita Glukhov Backpatch-through: 13
* Archive timeline history files in standby if archive_mode is set to "always".Fujii Masao2020-09-29
| | | | | | | | | | | | | | | | | | | | Previously the standby server didn't archive timeline history files streamed from the primary even when archive_mode is set to "always", while it archives the streamed WAL files. This could cause the PITR to fail because there was no required timeline history file in the archive. The cause of this issue was that walreceiver didn't mark those files as ready for archiving. This commit makes walreceiver mark those streamed timeline history files as ready for archiving if archive_mode=always. Then the archiver process archives the marked timeline history files. Back-patch to all supported versions. Reported-by: Grigory Smolkin Author: Grigory Smolkin, Fujii Masao Reviewed-by: David Zhang, Anastasia Lubennikova Discussion: https://postgr.es/m/54b059d4-2b48-13a4-6f43-95a087c92367@postgrespro.ru
* Fix progress reporting of REINDEX CONCURRENTLYMichael Paquier2020-09-29
| | | | | | | | | | | | | | | | This addresses a couple of issues with the so-said subject: - Report the correct parent relation with the index actually being rebuilt or validated. Previously, the command status remained set to the last index created for the progress of the index build and validation, which would be incorrect when working on a table that has more than one index. - Use the correct phase when waiting before the drop of the old indexes. Previously, this was reported with the same status as when waiting before the old indexes are marked as dead. Author: Matthias van de Meent, Michael Paquier Discussion: https://postgr.es/m/CAEze2WhqFgcwe1_tv=sFYhLWV2AdpfukumotJ6JNcAOQs3jufg@mail.gmail.com Backpatch-through: 12
* Add for_each_from, to simplify loops starting from non-first list cells.Tom Lane2020-09-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a dozen or so places that need to iterate over all but the first cell of a List. Prior to v13 this was typically written as for_each_cell(lc, lnext(list_head(list))) Commit 1cff1b95a changed these to for_each_cell(lc, list, list_second_cell(list)) This patch introduces a new macro for_each_from() which expresses the start point as a list index, allowing these to be written as for_each_from(lc, list, 1) This is marginally more efficient, since ForEachState.i can be initialized directly instead of backing into it from a ListCell address. It also seems clearer and less typo-prone. Some of the remaining uses of for_each_cell() look like they could profitably be changed to for_each_from(), but here I confined myself to changing uses of list_second_cell(). Also, fix for_each_cell_setup() and for_both_cell_setup() to const-ify their arguments; that's a simple oversight in 1cff1b95a. Back-patch into v13, on the grounds that (1) the const-ification is a minor bug fix, and (2) it's better for back-patching purposes if we only have two ways to write these loops rather than three. In HEAD, also remove list_third_cell() and list_fourth_cell(), which were also introduced in 1cff1b95a, and are unused as of cc99baa43. It seems unlikely that any third-party code would have started to use them already; anyone who has can be directed to list_nth_cell instead. Discussion: https://postgr.es/m/CAApHDvpo1zj9KhEpU2cCRZfSM3Q6XGdhzuAS2v79PH7WJBkYVA@mail.gmail.com
* Assign collations in partition bound expressions.Tom Lane2020-09-28
| | | | | | | | | | Failure to do this can result in errors during evaluation of the bound expression, as illustrated by the new regression test. Back-patch to v12 where the ability for partition bounds to be expressions was added. Discussion: https://postgr.es/m/CAJV4CdrZ5mKuaEsRSbLf2URQ3h6iMtKD=hik8MaF5WwdmC9uZw@mail.gmail.com
* Remove complaints about COLLATE clauses in partition bound values.Tom Lane2020-09-28
| | | | | | | | | | | | | | | | transformPartitionBoundValue went out of its way to do the wrong thing: there is no reason to complain about a non-matching COLLATE clause in a partition boundary expression. We're coercing the bound expression to the target column type as though by an implicit assignment, and the rules for implicit assignment say that collations can be implicitly converted. What we *do* need to do, and the code is not doing, is apply assign_expr_collations() to the bound expression. While this is merely a definition disagreement, that is a bug that needs to be back-patched, so I'll commit it separately. Discussion: https://postgr.es/m/CAJV4CdrZ5mKuaEsRSbLf2URQ3h6iMtKD=hik8MaF5WwdmC9uZw@mail.gmail.com
* Cache the result of converting now() to a struct pg_tm.Tom Lane2020-09-28
| | | | | | | | | | | | | | | SQL operations such as CURRENT_DATE, CURRENT_TIME, LOCALTIME, and conversion of "now" in a datetime input string have to obtain the transaction start timestamp ("now()") as a broken-down struct pg_tm. This is a remarkably expensive conversion, and since now() does not change intra-transaction, it doesn't really need to be done more than once per transaction. Introducing a simple cache provides visible speedups in queries that compute these values many times, for example insertion of many rows that use a default value of CURRENT_DATE. Peter Smith, with a bit of kibitzing by me Discussion: https://postgr.es/m/CAHut+Pu89TWjq530V2gY5O6SWi=OEJMQ_VHMt8bdZB_9JFna5A@mail.gmail.com
* Minor mop-up for List improvements.Tom Lane2020-09-27
| | | | | | | | | Fix a few places that were using written-out versions of the pg_list.h macros that commit cc99baa43 just improved, making them also use those macros so as to gain whatever performance improvement is to be had. Discussion: https://postgr.es/m/CAApHDvpo1zj9KhEpU2cCRZfSM3Q6XGdhzuAS2v79PH7WJBkYVA@mail.gmail.com
* Move resolution of AlternativeSubPlan choices to the planner.Tom Lane2020-09-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When commit bd3daddaf introduced AlternativeSubPlans, I had some ambitions towards allowing the choice of subplan to change during execution. That has not happened, or even been thought about, in the ensuing twelve years; so it seems like a failed experiment. So let's rip that out and resolve the choice of subplan at the end of planning (in setrefs.c) rather than during executor startup. This has a number of positive benefits: * Removal of a few hundred lines of executor code, since AlternativeSubPlans need no longer be supported there. * Removal of executor-startup overhead (particularly, initialization of subplans that won't be used). * Removal of incidental costs of having a larger plan tree, such as tree-scanning and copying costs in the plancache; not to mention setrefs.c's own costs of processing the discarded subplans. * EXPLAIN no longer has to print a weird (and undocumented) representation of an AlternativeSubPlan choice; it sees only the subplan actually used. This should mean less confusion for users. * Since setrefs.c knows which subexpression of a plan node it's working on at any instant, it's possible to adjust the estimated number of executions of the subplan based on that. For example, we should usually estimate more executions of a qual expression than a targetlist expression. The implementation used here is pretty simplistic, because we don't want to expend a lot of cycles on the issue; but it's better than ignoring the point entirely, as the executor had to. That last point might possibly result in shifting the choice between hashed and non-hashed EXISTS subplans in a few cases, but in general this patch isn't meant to change planner choices. Since we're doing the resolution so late, it's really impossible to change any plan choices outside the AlternativeSubPlan itself. Patch by me; thanks to David Rowley for review. Discussion: https://postgr.es/m/1992952.1592785225@sss.pgh.pa.us
* Revise RelationBuildRowSecurity() to avoid memory leaks.Tom Lane2020-09-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | This function leaked some memory while loading qual clauses for an RLS policy. While ordinarily negligible, that could build up in some repeated-reload cases, as reported by Konstantin Knizhnik. We can improve matters by borrowing the coding long used in RelationBuildRuleLock: build stringToNode's result directly in the target context, and remember to explicitly pfree the input string. This patch by no means completely guarantees zero leaks within this function, since we have no real guarantee that the catalog- reading subroutines it calls don't leak anything. However, practical tests suggest that this is enough to resolve the issue. In any case, any remaining leaks are similar to those risked by RelationBuildRuleLock and other relcache-loading subroutines. If we need to fix them, we should adopt a more global approach such as that used by the RECOVER_RELATION_BUILD_MEMORY hack. While here, let's remove the need for an expensive PG_TRY block by using MemoryContextSetParent to reparent an initially-short-lived context for the RLS data. Back-patch to all supported branches. Discussion: https://postgr.es/m/21356c12-8917-8249-b35f-1c447231922b@postgrespro.ru
* Fix the logical replication from HEAD to lower versions.Amit Kapila2020-09-26
| | | | | | | | | | | | Commit 464824323e changed the logical replication protocol to allow the streaming of in-progress transactions and used the new version of protocol irrespective of the server version. Use the appropriate version of the protocol based on the server version. Reported-by: Ashutosh Sharma Author: Dilip Kumar Reviewed-by: Ashutosh Sharma and Amit Kapila Discussion: https://postgr.es/m/CAE9k0P=9OpXcNrcU5Gsvd5MZ8GFpiN833vNHzX6Uc=8+h1ft1Q@mail.gmail.com
* Defer flushing of SLRU files.Thomas Munro2020-09-25
| | | | | | | | | | | | | | | | | | | | | | | | Previously, we called fsync() after writing out individual pg_xact, pg_multixact and pg_commit_ts pages due to cache pressure, leading to regular I/O stalls in user backends and recovery. Collapse requests for the same file into a single system call as part of the next checkpoint, as we already did for relation files, using the infrastructure developed by commit 3eb77eba. This can cause a significant improvement to recovery performance, especially when it's otherwise CPU-bound. Hoist ProcessSyncRequests() up into CheckPointGuts() to make it clearer that it applies to all the SLRU mini-buffer-pools as well as the main buffer pool. Rearrange things so that data collected in CheckpointStats includes SLRU activity. Also remove the Shutdown{CLOG,CommitTS,SUBTRANS,MultiXact}() functions, because they were redundant after the shutdown checkpoint that immediately precedes them. (I'm not sure if they were ever needed, but they aren't now.) Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (parts) Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> Discussion: https://postgr.es/m/CA+hUKGLJ=84YT+NvhkEEDAuUtVHMfQ9i-N7k_o50JmQ6Rpj_OQ@mail.gmail.com
* Fix two bugs in MaintainOldSnapshotTimeMapping.Robert Haas2020-09-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous coding was confused about whether head_timestamp was intended to represent the timestamp for the newest bucket in the mapping or the oldest timestamp for the oldest bucket in the mapping. Decide that it's intended to be the oldest one, and repair accordingly. To do that, we need to do two things. First, when advancing to a new bucket, don't categorically set head_timestamp to the new timestamp. Do this only if we're blowing out the map completely because a lot of time has passed since we last maintained it. If we're replacing entries one by one, advance head_timestamp by 1 minute for each; if we're filling in unused entries, don't advance head_timestamp at all. Second, fix the computation of how many buckets we need to advance. The previous formula would be correct if head_timestamp were the timestamp for the new bucket, but we're now making all the code agree that it's the timestamp for the oldest bucket, so adjust the formula accordingly. This is certainly a bug fix, but I don't feel good about back-patching it without the introspection tools added by commit aecf5ee2bb36c597d3c6142e367e38d67816c777, and perhaps also some actual tests. Since back-patching the introspection tools might not attract sufficient support and since there are no automated tests of these fixes yet, I'm just committing this to master for now. Patch by me, reviewed by Thomas Munro, Dilip Kumar, Hamid Akhtar. Discussion: http://postgr.es/m/CA+TgmoY=aqf0zjTD+3dUWYkgMiNDegDLFjo+6ze=Wtpik+3XqA@mail.gmail.com
* Standardize the printf format for st_sizePeter Eisentraut2020-09-24
| | | | | | Existing code used various inconsistent ways to printf struct stat's st_size member. The type of that is off_t, which is in most cases a signed 64-bit integer, so use the long long int format for it.
* Expose oldSnapshotControl definition via new header.Robert Haas2020-09-24
| | | | | | | | | | This makes it possible for code outside snapmgr.c to examine the contents of this data structure. This commit does not add any code which actually does so; a subsequent commit will make that change. Patch by me, reviewed by Thomas Munro, Dilip Kumar, Hamid Akhtar. Discussion: http://postgr.es/m/CA+TgmoY=aqf0zjTD+3dUWYkgMiNDegDLFjo+6ze=Wtpik+3XqA@mail.gmail.com
* Improve behavior of tsearch_readline(), and remove t_readline().Tom Lane2020-09-23
| | | | | | | | | | | | | | | | | | | | | | | | Commit fbeb9da22, which added the tsearch_readline APIs, left t_readline() in place as a compatibility measure. But that function has been unused and deprecated for twelve years now, so that seems like enough time to remove it. Doing so, and merging t_readline's code into tsearch_readline, aids in making several useful improvements: * The hard-wired 4K limit on line length in tsearch data files is removed, by using a StringInfo buffer instead of a fixed-size buffer. * We can buy back the per-line palloc/pfree added by 3ea7e9550 in the common case where encoding conversion is not required. * We no longer need a separate pg_verify_mbstr call, as that functionality was folded into encoding conversion some time ago. (We could have done some of this stuff while keeping t_readline as a separate API, but there seems little point, since there's no reason for anyone to still be using t_readline directly.) Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
* Fix missing fsync of SLRU directories.Thomas Munro2020-09-24
| | | | | | | | | | | | | Harmonize behavior by moving reponsibility for fsyncing directories down into slru.c. In 10 and later, only the multixact directories were missed (see commit 1b02be21), and in older branches all SLRUs were missed. Back-patch to all supported releases. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CA%2BhUKGLtsTUOScnNoSMZ-2ZLv%2BwGh01J6kAo_DM8mTRq1sKdSQ%40mail.gmail.com
* Improve error cursor positions for problems with partition bounds.Tom Lane2020-09-23
| | | | | | | | | | | | | | | | | | | | | | We failed to pass down the query string to check_new_partition_bound, so that its attempts to provide error cursor positions were for naught; one must have the query string to get parser_errposition to do anything. Adjust its API to require a ParseState to be passed down. Also, improve the logic inside check_new_partition_bound so that the cursor points at the partition bound for the specific column causing the issue, when one can be identified. That part is also for naught if we can't determine the query position of the column with the problem. Improve transformPartitionBoundValue so that it makes sure that const-simplified partition expressions will be properly labeled with positions. In passing, skip calling evaluate_expr if the value is already a Const, which is surely the most common case. Alexandra Wang, Ashwin Agrawal, Amit Langote; reviewed by Ashutosh Bapat Discussion: https://postgr.es/m/CACiyaSopZoqssfMzgHk6fAkp01cL6vnqBdmTw2C5_KJaFR_aMg@mail.gmail.com Discussion: https://postgr.es/m/CAJV4CdrZ5mKuaEsRSbLf2URQ3h6iMtKD=hik8MaF5WwdmC9uZw@mail.gmail.com
* Avoid possible dangling-pointer access in tsearch_readline_callback.Tom Lane2020-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | tsearch_readline() saves the string pointer it returns to the caller for possible use in the associated error context callback. However, the caller will usually pfree that string sometime before it next calls tsearch_readline(), so that there is a window where an ereport will try to print an already-freed string. The built-in users of tsearch_readline() happen to all do that pfree at the bottoms of their loops, so that the window is effectively empty for them. However, this is not documented as a requirement, and contrib/dict_xsyn doesn't do it like that, so it seems likely that third-party dictionaries might have live bugs here. The practical consequences of this seem pretty limited in any case, since production builds wouldn't clobber the freed string immediately, besides which you'd not expect syntax errors in dictionary files being used in production. Still, it's clearly a bug waiting to bite somebody. Fix by pstrdup'ing the string to be saved for the error callback, and then pfree'ing it next time through. It's been like this for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
* Allow WaitLatch() to be used without a latch.Thomas Munro2020-09-23
| | | | | | | | | | Due to flaws in commit 3347c982bab, using WaitLatch() without WL_LATCH_SET could cause an assertion failure or crash. Repair. While here, also add a check that the latch we're switching to belongs to this backend, when changing from one latch to another. Discussion: https://postgr.es/m/CA%2BhUKGK1607VmtrDUHQXrsooU%3Dap4g4R2yaoByWOOA3m8xevUQ%40mail.gmail.com
* Improve the error message for an inappropriate column definition list.Tom Lane2020-09-22
| | | | | | | | | | | The existing message about "a column definition list is only allowed for functions returning "record"" could be given in some cases where it was fairly confusing; in particular, a function with multiple OUT parameters *does* return record according to pg_proc. Break it down into a couple more cases to deliver a more on-point complaint. Per complaint from Bruce Momjian. Discussion: https://postgr.es/m/798909.1600562993@sss.pgh.pa.us
* Fix a few more generator scripts to produce pgindent-clean output.Tom Lane2020-09-21
| | | | | | | | | This completes the project of making all our derived files be pgindent-clean (or else explicitly excluded from indentation), so that no surprises result when running pgindent in a built-out development tree. Discussion: https://postgr.es/m/79ed5348-be7a-b647-dd40-742207186a22@2ndquadrant.com
* Copy editing: fix a bunch of misspellings and poor wording.Tom Lane2020-09-21
| | | | | | | | 99% of this is docs, but also a couple of comments. No code changes. Justin Pryzby Discussion: https://postgr.es/m/20200919175804.GE30557@telsasoft.com
* Standardize order of use strict and use warnings in Perl codePeter Eisentraut2020-09-21
| | | | | The standard order in PostgreSQL and other code is use strict first, but some code was uselessly inconsistent about this.
* Fix checksum calculation in the new sorting GiST build.Heikki Linnakangas2020-09-21
| | | | | | | | | | | | | Since we're bypassing the buffer manager, we need to call PageSetChecksumInplace() directly. As reported by Justin Pryzby. In the passing, add RelationOpenSmgr() calls before all smgrwrite() and smgrextend() calls. Tom added one before the first smgrextend() call in commit c2bb287025, which seems to be enough, but let's play it safe and do it before each one. That's how it's done in the similar code in nbtsort.c, too. Discussion: https://www.postgresql.org/message-id/20200920224446.GF30557@telsasoft.com
* Fix new GIST build code to work under CLOBBER_CACHE_ALWAYS.Tom Lane2020-09-20
| | | | | | | Can't say if this fixes *all* cases, but at least we get through the "point" regression test now, which hyrax's last run did not. Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2020-09-19%2021%3A27%3A23
* Fix whitespacePeter Eisentraut2020-09-20
|
* Remove precedence hacks no longer needed without postfix operators.Tom Lane2020-09-19
| | | | | | | | | | | | | | | | It's no longer necessary to assign explicit precedences to GENERATED, NULL_P, PRESERVE, or STRIP_P. Actually, we don't need to assign precedence to IDENT either; that was really just there to govern the behavior of target_el's "a_expr IDENT" production, which no longer ends with that terminal. However, it seems like a good idea to continue to do so, because it provides a reference point for a precedence level that we can assign to other unreserved keywords that lack a natural precedence level. Research by Peter Eisentraut and John Naylor; comment rewrite by me. Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
* Code review for dynahash change.Thomas Munro2020-09-19
| | | | | | | | | | | Commit be0a6666 left behind a comment about the order of some tests that didn't make sense without the expensive division, and in fact we might as well change the order to one that fails more cheaply most of the time as a micro-optimization. Also, remove the "+ 1" applied to max_bucket, to drop an instruction and match the original behavior. Per review from Tom Lane. Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
* Remove large fill factor support from dynahash.c.Thomas Munro2020-09-19
| | | | | | | | | | | | | | | | | | | | | | Since ancient times we have had support for a fill factor (maximum load factor) to be set for a dynahash hash table, but: 1. It was an integer, whereas for in-memory hash tables interesting load factor targets are probably somewhere near the 0.75-1.0 range. 2. It was implemented in a way that performed an expensive division operation that regularly showed up in profiles. 3. We are not aware of anyone ever having used a non-default value. Therefore, remove support, effectively fixing it at 1. Author: Jakub Wartak <Jakub.Wartak@tomtom.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
* Allow most keywords to be used as column labels without requiring AS.Tom Lane2020-09-18
| | | | | | | | | | | | | | | | | | | | | | | | | Up to now, if you tried to omit "AS" before a column label in a SELECT list, it would only work if the column label was an IDENT, that is not any known keyword. This is rather unfriendly considering that we have so many keywords and are constantly growing more. In the wake of commit 1ed6b8956 it's possible to improve matters quite a bit. We'd originally tried to make this work by having some of the existing keyword categories be allowed without AS, but that didn't work too well, because each category contains a few special cases that don't work without AS. Instead, invent an entirely orthogonal keyword property "can be bare column label", and mark all keywords that way for which we don't get shift/reduce errors by doing so. It turns out that of our 450 current keywords, all but 39 can be made bare column labels, improving the situation by over 90%. This number might move around a little depending on future grammar work, but it's a pretty nice improvement. Mark Dilger, based on work by myself and Robert Haas; review by John Naylor Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
* Update file header comments for logical/relation.c.Amit Kapila2020-09-18
| | | | | | Author: Amit Langote Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CA+HiwqE20oZoix13JyCeALpTf_SmjarZWtBFe5sND6zz+iupAw@mail.gmail.com
* Fix comments in heapam.c.Amit Kapila2020-09-18
| | | | | | | | | | | | After commits 85f6b49c2c and 3ba59ccc89, we can allow parallel inserts which was earlier not possible as parallel group members won't conflict for relation extension and page lock. In those commits, we forgot to update comments at few places. Author: Amit Kapila Reviewed-by: Robert Haas and Dilip Kumar Backpatch-through: 13 Discussion: https://postgr.es/m/CAFiTN-tMrQh5FFMPx5aWJ+1gi1H6JxktEhq5mDwCHgnEO5oBkA@mail.gmail.com
* Remove support for postfix (right-unary) operators.Tom Lane2020-09-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | This feature has been a thorn in our sides for a long time, causing many grammatical ambiguity problems. It doesn't seem worth the pain to continue to support it, so remove it. There are some follow-on improvements we can make in the grammar, but this commit only removes the bare minimum number of productions, plus assorted backend support code. Note that pg_dump and psql continue to have full support, since they may be used against older servers. However, pg_dump warns about postfix operators. There is also a check in pg_upgrade. Documentation-wise, I (tgl) largely removed the "left unary" terminology in favor of saying "prefix operator", which is a more standard and IMO less confusing term. I included a catversion bump, although no initial catalog data changes here, to mark the boundary at which oprkind = 'r' stopped being valid in pg_operator. Mark Dilger, based on work by myself and Robert Haas; review by John Naylor Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
* Update parallel BTree scan state when the scan keys can't be satisfied.Amit Kapila2020-09-17
| | | | | | | | | | | | | | For parallel btree scan to work for array of scan keys, it should reach BTPARALLEL_DONE state once for every distinct combination of array keys. This is required to ensure that the parallel workers don't try to seize blocks at the same time for different scan keys. We missed to update this state when we discovered that the scan keys can't be satisfied. Author: James Hunter Reviewed-by: Amit Kapila Tested-by: Justin Pryzby Backpatch-through: 10, where it was introduced Discussion: https://postgr.es/m/4248CABC-25E3-4809-B4D0-128E1BAABC3C@amazon.com
* Allow CURRENT_ROLE where CURRENT_USER is acceptedPeter Eisentraut2020-09-17
| | | | | | | | | | | | | In the particular case of GRANTED BY, this is specified in the SQL standard. Since in PostgreSQL, CURRENT_ROLE is equivalent to CURRENT_USER, and CURRENT_USER is already supported here, adding CURRENT_ROLE is trivial. The other cases are PostgreSQL extensions, but for the same reason it also makes sense there. Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: Asif Rehman <asifr.rehman@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/f2feac44-b4c5-f38f-3699-2851d6a76dc9%402ndquadrant.com
* Add support for building GiST index by sorting.Heikki Linnakangas2020-09-17
| | | | | | | | | | | | | | | | | | | | | | This adds a new optional support function to the GiST access method: sortsupport. If it is defined, the GiST index is built by sorting all data to the order defined by the sortsupport's comparator function, and packing the tuples in that order to GiST pages. This is similar to how B-tree index build works, and is much faster than inserting the tuples one by one. The resulting index is smaller too, because the pages are packed more tightly, upto 'fillfactor'. The normal build method works by splitting pages, which tends to lead to more wasted space. The quality of the resulting index depends on how good the opclass-defined sort order is. A good order preserves locality of the input data. As the first user of this facility, add 'sortsupport' function to the point_ops opclass. It sorts the points in Z-order (aka Morton Code), by interleaving the bits of the X and Y coordinates. Author: Andrey Borodin Reviewed-by: Pavel Borisov, Thomas Munro Discussion: https://www.postgresql.org/message-id/1A36620E-CAD8-4267-9067-FB31385E7C0D%40yandex-team.ru
* Teach walsender to update its process title for replication commands.Tom Lane2020-09-16
| | | | | | | | | | | | | | | | | | | | Because the code path taken for SQL commands executed in a walsender will update the process title, we pretty much have to update the title for replication commands as well. Otherwise, the title shows "idle" for the rest of a logical walsender's lifetime once it's executed any SQL command. Playing with this, I confirm that a walsender now typically spends most of its life reporting walsender postgres [local] START_REPLICATION Considering this in isolation, it might be better to have it say walsender postgres [local] sending replication data However, consistency with the other cases seems to be a stronger argument. In passing, remove duplicative pgstat_report_activity call. Discussion: https://postgr.es/m/880181.1600026471@sss.pgh.pa.us
* Fix bogus completion tag usage in walsenderAlvaro Herrera2020-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit fd5942c18f97 (2012, 9.3-era), walsender has been sending completion tags for certain replication commands twice -- and they're not even consistent. Apparently neither libpq nor JDBC have a problem with it, but it's not kosher. Fix by remove the EndCommand() call in the common code path for them all, and inserting specific calls to EndReplicationCommand() specifically in those places where it's needed. EndReplicationCommand() is a new simple function to send the completion tag for replication commands. Do this instead of sending a generic SELECT completion tag for them all, which was also pretty bogus (if innocuous). While at it, change StartReplication() to use EndReplicationCommand() instead of pg_puttextmessage(). In commit 2f9661311b83, I failed to realize that replication commands are not close-enough kin of regular SQL commands, so the DROP_REPLICATION_SLOT tag I added is undeserved and a type pun. Take it out. Backpatch to 13, where the latter commit appeared. The duplicate tag has been sent since 9.3, but since nothing is broken, it doesn't seem worth fixing. Per complaints from Tom Lane. Discussion: https://postgr.es/m/1347966.1600195735@sss.pgh.pa.us
* Centralize setup of SIGQUIT handling for postmaster child processes.Tom Lane2020-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | We decided that the policy established in commit 7634bd4f6 for the bgwriter, checkpointer, walwriter, and walreceiver processes, namely that they should accept SIGQUIT at all times, really ought to apply uniformly to all postmaster children. Therefore, get rid of the duplicative and inconsistent per-process code for establishing that signal handler and removing SIGQUIT from BlockSig. Instead, make InitPostmasterChild do it. The handler set up by InitPostmasterChild is SignalHandlerForCrashExit, which just summarily does _exit(2). In interactive backends, we almost immediately replace that with quickdie, since we would prefer to try to tell the client that we're dying. However, this patch is changing the behavior of autovacuum (both launcher and workers), as well as walsenders. Those processes formerly also used quickdie, but AFAICS that was just mindless copy-and-paste: they don't have any interactive client that's likely to benefit from being told this. The stats collector continues to be an outlier, in that it thinks SIGQUIT means normal exit. That should probably be changed for consistency, but there's another patch set where that's being dealt with, so I didn't do so here. Discussion: https://postgr.es/m/644875.1599933441@sss.pgh.pa.us
* Don't fetch partition check expression during InitResultRelInfo.Tom Lane2020-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since there is only one place that actually needs the partition check expression, namely ExecPartitionCheck, it's better to fetch it from the relcache there. In this way we will never fetch it at all if the query never has use for it, and we still fetch it just once when we do need it. The reason for taking an interest in this is that if the relcache doesn't already have the check expression cached, fetching it requires obtaining AccessShareLock on the partition root. That means that operations that look like they should only touch the partition itself will also take a lock on the root. In particular we observed that TRUNCATE on a partition may take a lock on the partition's root, contributing to a deadlock situation in parallel pg_restore. As written, this patch does have a small cost, which is that we are microscopically reducing efficiency for the case where a partition has an empty check expression. ExecPartitionCheck will be called, and will go through the motions of setting up and checking an empty qual, where before it would not have been called at all. We could avoid that by adding a separate boolean flag to track whether there is a partition expression to test. However, this case only arises for a default partition with no siblings, which surely is not an interesting case in practice. Hence adding complexity for it does not seem like a good trade-off. Amit Langote, per a suggestion by me Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com
* Avoid unnecessary recursion to child tables in ALTER TABLE SET NOT NULL.Tom Lane2020-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a partitioned table's column is already marked NOT NULL, there is no need to examine its partitions, because we can rely on previous DDL to have enforced that the child columns are NOT NULL as well. (Unfortunately, the same cannot be said for traditional inheritance, so for now we have to restrict the optimization to partitioned tables.) Hence, we may skip recursing to child tables in this situation. The reason this case is worth worrying about is that when pg_dump dumps a partitioned table having a primary key, it will include the requisite NOT NULL markings in the CREATE TABLE commands, and then add the primary key as a separate step. The primary key addition generates a SET NOT NULL as a subcommand, just to be sure. So the situation where a SET NOT NULL is redundant does arise in the real world. Skipping the recursion does more than just save a few cycles: it means that a command such as "ALTER TABLE ONLY partition_parent ADD PRIMARY KEY" will take locks only on the partition parent table, not on the partitions. It turns out that parallel pg_restore is effectively assuming that that's true, and has little choice but to do so because the dependencies listed for such a TOC entry don't include the partitions. pg_restore could thus issue this ALTER while data restores on the partitions are still in progress. Taking unnecessary locks on the partitions not only hurts concurrency, but can lead to actual deadlock failures, as reported by Domagoj Smoljanovic. (A contributing factor in the deadlock is that TRUNCATE on a child partition wants a non-exclusive lock on the parent. This seems likewise unnecessary, but the fix for it is more invasive so we won't consider back-patching it. Fortunately, getting rid of one of these two poor behaviors is enough to remove the deadlock.) Although support for partitioned primary keys came in with v11, this patch is dependent on the SET NOT NULL refactoring done by commit f4a3fdfbd, so we can only patch back to v12. Patch by me; thanks to Alvaro Herrera and Amit Langote for review. Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com