path: root/src
...
* Fix behavior of ecpg's "EXEC SQL elif name".  (Tom Lane, 2020-08-03)

  This ought to work much like C's "#elif defined(name)"; but the code implemented it in a way equivalent to "endif" followed by "ifdef", so that it didn't matter whether any previous branch of the IF construct had succeeded. Fix that; add some test cases covering elif and nested IFs; and improve the documentation, which also seemed a bit confused.

  AFAICS the code has been like this since the feature was added in 1999 (commit b57b0e044). So while it's surely wrong, there might be code out there relying on the current behavior. Hence, don't back-patch into stable branches. It seems all right to fix it in v13 though.

  Per report from Ashutosh Sharma. Reviewed by Ashutosh Sharma and Michael Meskes.

  Discussion: https://postgr.es/m/CAE9k0P=dQk9X0cU2tN49S7a9tv733-e1pVdpB1P-pWJ5PdTktg@mail.gmail.com
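  For illustration (not from the commit; hypothetical symbol names), a minimal sketch of the construct in question:

      EXEC SQL define FOO;

      EXEC SQL ifdef FOO;
          /* taken: FOO is defined */
      EXEC SQL elif BAR;
          /* with the fix: skipped, because an earlier branch already
           * succeeded; before the fix: re-evaluated as a fresh ifdef,
           * so this branch could also be taken */
      EXEC SQL endif;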
* Fix rare failure in LDAP tests.  (Thomas Munro, 2020-08-03)

  Instead of writing a query to psql's stdin, use -c. This avoids a failure where psql exits before we write, seen a few times on the build farm. Thanks to Tom Lane for the suggestion.

  Back-patch to 11, where the LDAP tests arrived.

  Reviewed-by: Noah Misch <noah@leadboat.com>
  Discussion: https://postgr.es/m/CA%2BhUKGLFmW%2BHQYPeKiwSp5sdFFHtFViCpw4Mh6yAgEx74r5-Cw%40mail.gmail.com
* Fix minor issues in psql's new \dAc and related commands.  (Tom Lane, 2020-08-02)

  The type-name pattern in \dAc and \dAf was matched only to the actual pg_type.typname string, which is fairly user-unfriendly in cases where that is not what's shown to the user by format_type (compare "_int4" and "integer[]"). Make this code match what \dT does, i.e. match the pattern against either typname or format_type() output. Also fix its broken handling of schema-name restrictions. (IOW, make these processSQLNamePattern calls match \dT's.) While here, adjust whitespace to make the query a little prettier in -E output, too.

  Also improve some inaccuracies and shaky grammar in the related documentation.

  Noted while working on a patch for intarray's opclasses; I wondered why I couldn't get a match to "integer*" for the input type name.
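  For illustration (not from the commit), in a psql session:

      -- before the fix, only the raw catalog type name matched:
      \dAc btree _int4
      -- after it, patterns against the user-facing name work too;
      -- this now matches both integer and integer[]:
      \dAc btree integer*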
* Use int64 instead of long in incremental sort code  (David Rowley, 2020-08-02)

  64-bit Windows has 4-byte long values, which are not suitable for tracking disk space usage in the incremental sort code. Let's just make all these fields int64s.

  Author: James Coleman
  Discussion: https://postgr.es/m/CAApHDvpky%2BUhof8mryPf5i%3D6e6fib2dxHqBrhp0Qhu0NeBhLJw%40mail.gmail.com
  Backpatch-through: 13, where the incremental sort code was added
* Fix oversight in ALTER TYPE: typmodin/typmodout must propagate to arrays.  (Tom Lane, 2020-07-31)

  If a base type supports typmods, its array type does too, with the same interpretation. Hence changes in pg_type.typmodin/typmodout must be propagated to the array type.

  While here, improve AlterTypeRecurse to not recurse to domains if there is nothing we'd need to change.

  Oversight in fe30e7ebf. Back-patch to v13 where that came in.
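  For illustration (hypothetical type and function names), the case now handled:

      ALTER TYPE mytype
        SET (typmod_in = mytype_typmodin, typmod_out = mytype_typmodout);
      -- with the fix, the array type mytype[] (catalog name "_mytype")
      -- picks up the same typmod I/O functions automatically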
* Fix recently-introduced performance problem in ts_headline().  (Tom Lane, 2020-07-31)

  The new hlCover() algorithm that I introduced in commit c9b0c678d turns out to potentially take O(N^2) or worse time on long documents, if there are many occurrences of individual query words but few or no substrings that actually satisfy the query. (One way to hit this behavior is with a "common_word & rare_word" type of query.) This seems unavoidable given the original goal of checking every substring of the document, so we have to back off that idea. Fortunately, it seems unlikely that anyone would really want headlines spanning all of a long document, so we can avoid the worse-than-linear behavior by imposing a maximum length of substring that we'll consider.

  For now, just hard-wire that maximum length as a multiple of max_words times max_fragments. Perhaps at some point somebody will argue for exposing it as a ts_headline parameter, but I'm hesitant to make such a feature addition in a back-patched bug fix.

  I also noted that the hlFirstIndex() function I'd added in that commit was unnecessarily stupid: it really only needs to check whether a HeadlineWordEntry's item pointer is null or not. This wouldn't make all that much difference in typical cases with queries having just a few terms, but a cycle shaved is a cycle earned.

  In addition, add a CHECK_FOR_INTERRUPTS call in TS_execute_recurse. This ensures that hlCover's loop is cancellable if it manages to take a long time, and it may protect some other TS_execute callers as well.

  Back-patch to 9.6 as the previous commit was. I also chose to add the CHECK_FOR_INTERRUPTS call to 9.5. The old hlCover() algorithm seems to avoid the O(N^2) behavior, at least on the test case I tried, but nonetheless it's not very quick on a long document.

  Per report from Stephen Frost.

  Discussion: https://postgr.es/m/20200724160535.GW12375@tamriel.snowman.net
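  For illustration (not from the commit; long_document is a placeholder), the pathological shape is roughly:

      SELECT ts_headline('english', long_document,
                         to_tsquery('english', 'common_word & rare_word'));
      -- before the fix: hlCover() could examine O(N^2) substrings when
      -- common_word appears many times but the pair rarely co-occurs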
* Use pg_bitutils for HyperLogLog.  (Jeff Davis, 2020-07-30)

  Using pg_leftmost_one_pos32() yields substantial performance benefits.

  Backpatching to version 13 because HLL is used for HashAgg improvements in 9878b643, which was also backpatched to 13.

  Reviewed-by: Peter Geoghegan
  Discussion: https://postgr.es/m/CAH2-WzkGvDKVDo+0YvfvZ+1CE=iCi88DCOGFF3i1hTGGaxcKPw@mail.gmail.com
  Backpatch-through: 13
* Add hash_mem_multiplier GUC.  (Peter Geoghegan, 2020-07-29)

  Add a GUC that acts as a multiplier on work_mem. It gets applied when sizing executor node hash tables that were previously size constrained using work_mem alone.

  The new GUC can be used to preferentially give hash-based nodes more memory than the generic work_mem limit. It is intended to enable admin tuning of the executor's memory usage. Overall system throughput and system responsiveness can be improved by giving hash-based executor nodes more memory (especially over sort-based alternatives, which are often much less sensitive to being memory constrained).

  The default value for hash_mem_multiplier is 1.0, which is also the minimum valid value. This means that hash-based nodes continue to apply work_mem in the traditional way by default.

  hash_mem_multiplier is generally useful. However, it is being added now due to concerns about hash aggregate performance stability for users that upgrade to Postgres 13 (which added disk-based hash aggregation in commit 1f39bce0). While the old hash aggregate behavior risked out-of-memory errors, it is nevertheless likely that many users actually benefited. Hash agg's previous indifference to work_mem during query execution was not just faster; it also accidentally made aggregation resilient to grouping estimate problems (at least in cases where this didn't create destabilizing memory pressure).

  hash_mem_multiplier can provide a certain kind of continuity with the behavior of Postgres 12 hash aggregates in cases where the planner incorrectly estimates that all groups (plus related allocations) will fit in work_mem/hash_mem. This seems necessary because hash-based aggregation is usually much slower when only a small fraction of all groups can fit. Even when it isn't possible to totally avoid hash aggregates that spill, giving hash aggregation more memory will reliably improve performance (the same cannot be said for external sort operations, which appear to be almost unaffected by memory availability provided it's at least possible to get a single merge pass).

  The PostgreSQL 13 release notes should advise users that increasing hash_mem_multiplier can help with performance regressions associated with hash aggregation. That can be taken care of by a later commit.

  Author: Peter Geoghegan
  Reviewed-By: Álvaro Herrera, Jeff Davis
  Discussion: https://postgr.es/m/20200625203629.7m6yvut7eqblgmfo@alap3.anarazel.de
  Discussion: https://postgr.es/m/CAH2-WzmD%2Bi1pG6rc1%2BCjc4V6EaFJ_qSuKCCHVnH%3DoruqD-zqow%40mail.gmail.com
  Backpatch: 13-, where disk-based hash aggregation was introduced.
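  For illustration (not from the commit), how the knob composes with work_mem:

      SET work_mem = '64MB';
      SET hash_mem_multiplier = 2.0;
      -- hash-based executor nodes (e.g. hash aggregate, hash join) may now
      -- use up to ~128MB before spilling, while sorts still get 64MB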
* HashAgg: use better cardinality estimate for recursive spilling.  (Jeff Davis, 2020-07-28)

  Use HyperLogLog to estimate the group cardinality in a spilled partition. This estimate is used to choose the number of partitions if we recurse.

  The previous behavior was to use the number of tuples in a spilled partition as the estimate for the number of groups, which led to overpartitioning. That could cause the number of batches to be much higher than expected (with each batch being very small), which made it harder to interpret EXPLAIN ANALYZE results.

  Reviewed-by: Peter Geoghegan
  Discussion: https://postgr.es/m/a856635f9284bc36f7a77d02f47bbb6aaf7b59b3.camel@j-davis.com
  Backpatch-through: 13
* Rename another "hash_mem" local variable.  (Peter Geoghegan, 2020-07-28)

  Missed by my commit 564ce621.

  Backpatch: 13-, where disk-based hash aggregation was introduced.
* Correct obsolete UNION hash aggs comment.  (Peter Geoghegan, 2020-07-28)

  Oversight in commit 1f39bce0, which added disk-based hash aggregation.

  Backpatch: 13-, where disk-based hash aggregation was introduced.
* Make EXPLAIN ANALYZE of HashAgg more similar to Hash Join  (David Rowley, 2020-07-29)

  There were various unnecessary differences between Hash Agg's EXPLAIN ANALYZE output and Hash Join's. Here we modify the Hash Agg output so that it's better aligned to Hash Join's. The following changes have been made:

  1. Start batches counter at 1 instead of 0.
  2. Always display the "Batches" property, even when we didn't spill to disk.
  3. Use the text "Batches" instead of "HashAgg Batches" for text format.
  4. Use the text "Memory Usage" instead of "Peak Memory Usage" for text format.
  5. Include "Batches" before "Memory Usage" in both text and non-text formats.

  In passing also modify the "Planned Partitions" property so that we show it regardless of whether the value is 0 for non-text EXPLAIN formats. This was pointed out by Justin Pryzby and probably should have been part of 40efbf870.

  Reviewed-by: Justin Pryzby, Jeff Davis
  Discussion: https://postgr.es/m/CAApHDvrshRnA6C0VFnu7Fb9TVvgGo80PUMm5+2DiaS1gEkPvtw@mail.gmail.com
  Backpatch-through: 13, where HashAgg batching was introduced
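  For illustration (not from the commit; numbers invented), the text-format output for a non-spilling aggregate now reads roughly:

      ->  HashAggregate (actual time=... rows=... loops=1)
            Batches: 1  Memory Usage: 1561kB
      -- previously the in-memory case printed "Peak Memory Usage" and
      -- omitted the batch count entirely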
* Fix some issues with step generation in partition pruning.  (Etsuro Fujita, 2020-07-28)

  In the case of range partitioning, get_steps_using_prefix() assumes that the passed-in prefix list contains at least one clause for each of the partition keys earlier than one specified in the passed-in step_lastkeyno, but the caller (ie, gen_prune_steps_from_opexps()) didn't take it into account, which led to a server crash or incorrect results when the list contained no clauses for such partition keys, as reported in bug #16500 and #16501 from Kobayashi Hisanori. Update the caller to call that function only when the list created there contains at least one clause for each of the earlier partition keys in the case of range partitioning.

  While at it, fix some other issues:

  * The list to pass to get_steps_using_prefix() is allowed to contain multiple clauses for the same partition key, as described in the comment for that function, but that function actually assumed that the list contained just a single clause for each of the middle partition keys, which led to an assertion failure when the list contained multiple clauses for such partition keys. Update that function to match the comment.

  * In the case of hash partitioning, partition keys are allowed to be NULL, in which case the list to pass to get_steps_using_prefix() contains no clauses for NULL partition keys, but that function treated that case like the case of range partitioning, which led to the assertion failure. Update the assertion test to take into account NULL partition keys in the case of hash partitioning.

  * Fix a typo in a comment in get_steps_using_prefix_recurse().

  * gen_partprune_steps() failed to detect self-contradiction from strict-qual clauses and an IS NULL clause for the same partition key in some cases, producing incorrect partition-pruning steps, which led to incorrect results of partition pruning, but fortunately didn't cause any user-visible problems, as the self-contradiction is detected later in query planning. Update that function to detect the self-contradiction.

  Per bug #16500 and #16501 from Kobayashi Hisanori. Patch by me, initial diagnosis for the reported issue and review by Dmitry Dolgov. Back-patch to v11, where partition pruning was introduced.

  Discussion: https://postgr.es/m/16500-d1613f2a78e1e090%40postgresql.org
  Discussion: https://postgr.es/m/16501-5234a9a0394f6754%40postgresql.org
* Remove hashagg_avoid_disk_plan GUC.  (Peter Geoghegan, 2020-07-27)

  Note: This GUC was originally named enable_hashagg_disk when it appeared in commit 1f39bce0, which added disk-based hash aggregation. It was subsequently renamed in commit 92c58fd9.

  Author: Peter Geoghegan
  Reviewed-By: Jeff Davis, Álvaro Herrera
  Discussion: https://postgr.es/m/9d9d1e1252a52ea1bad84ea40dbebfd54e672a0f.camel%40j-davis.com
  Backpatch: 13-, where disk-based hash aggregation was introduced.
* Fix handling of structure for bytea data type in ECPG  (Michael Paquier, 2020-07-27)

  Some code paths dedicated to bytea used the structure for varchar. This did not lead to any actual bugs, as bytea and varchar have the same definition, but it could become a trap if one of these definitions changes for a new feature or a bug fix.

  Issue introduced by 050710b.

  Author: Shenhao Wang
  Reviewed-by: Vignesh C, Michael Paquier
  Discussion: https://postgr.es/m/07ac7dee1efc44f99d7f53a074420177@G08CNEXMBPEKD06.g08.fujitsu.local
  Backpatch-through: 12
* Fix LookupTupleHashEntryHash() pipeline-stall issue.  (Jeff Davis, 2020-07-26)

  Refactor hash lookups in nodeAgg.c to improve performance.

  Author: Andres Freund and Jeff Davis
  Discussion: https://postgr.es/m/20200612213715.op4ye4q7gktqvpuo%40alap3.anarazel.de
  Backpatch-through: 13
* Tweak behavior of pg_stat_activity.leader_pid  (Michael Paquier, 2020-07-26)

  The initial implementation of leader_pid in pg_stat_activity added by b025f32 took the approach of strictly printing what a PGPROC entry includes. In short, if a backend has been involved in parallel query at least once, leader_pid would remain set as long as the backend is alive. For a parallel group leader, this means that the field would always be set after it participated at least once in parallel query, and after more discussion this could be confusing if using, for example, a connection pooler.

  This commit changes the data printed so that leader_pid is always NULL for a parallel group leader, with a non-NULL value shown only for the parallel workers, and then only while a parallel query is actually running, as workers are shut down once the query has completed.

  This does not change the definition of any catalog, so no catalog bump is needed.

  Per discussion with Justin Pryzby, Álvaro Herrera, Julien Rouhaud and me.

  Discussion: https://postgr.es/m/20200721035145.GB17300@paquier.xyz
  Backpatch-through: 13
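  For illustration (not from the commit), with the new behavior:

      SELECT pid, leader_pid FROM pg_stat_activity
      WHERE leader_pid IS NOT NULL;
      -- returns only the parallel workers of currently-running parallel
      -- queries; the group leader itself now always shows leader_pid = NULL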
* Fix buffer usage stats for nodes above Gather Merge.  (Amit Kapila, 2020-07-25)

  Commit 85c9d347 addressed a similar problem for Gather and Gather Merge nodes but forgot to account for nodes above parallel nodes. This still works for nodes above a Gather node because we shut down the workers for Gather as soon as there are no more tuples. We could do a similar thing for Gather Merge as well, but it seems better to account for stats during node shutdown after completing the execution.

  Reported-by: Stéphane Lorek, Jehan-Guillaume de Rorthais
  Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
  Reviewed-by: Amit Kapila
  Backpatch-through: 10, where it was introduced
  Discussion: https://postgr.es/m/20200718160206.584532a2@firost
* Replace TS_execute's TS_EXEC_CALC_NOT flag with TS_EXEC_SKIP_NOT.  (Tom Lane, 2020-07-24)

  It's fairly silly that ignoring NOT subexpressions is TS_execute's default behavior. It's wrong on its face and it encourages errors of omission. Moreover, the only two remaining callers that aren't specifying CALC_NOT are in ts_headline calculations, and it's very arguable that those are bugs: if you've specified "!foo" in your query, why would you want to get a headline that includes "foo"?

  Hence, rip that out and change the default behavior to be to calculate NOT accurately. As a concession to the slim chance that there is still somebody somewhere who needs the incorrect behavior, provide a new SKIP_NOT flag to explicitly request that.

  Back-patch into v13, mainly because it seems better to change this at the same time as the previous commit's rejiggering of TS_execute related APIs. Any outside callers affected by this change are probably also affected by that one.

  Discussion: https://postgr.es/m/CALT9ZEE-aLotzBg-pOp2GFTesGWVYzXA3=mZKzRDa_OKnLF7Mg@mail.gmail.com
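  For illustration (not from the commit), the headline case it calls out:

      SELECT ts_headline('english',
                         'foo appears here but bar is what we want',
                         to_tsquery('english', 'bar & !foo'));
      -- with NOT now calculated by default, cover selection no longer
      -- behaves as if the "!foo" part weren't there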
* Fix assorted bugs by changing TS_execute's callback API to ternary logic.  (Tom Lane, 2020-07-24)

  Text search sometimes failed to find valid matches, for instance '!crew:A'::tsquery might fail to locate 'crew:1B'::tsvector during an index search. The root of the issue is that TS_execute's callback functions were not changed to use ternary (yes/no/maybe) reporting when we made the search logic itself do so. It's somewhat annoying to break that API, but on the other hand we now see that any code using plain boolean logic is almost certainly broken since the addition of phrase search. There seem to be very few outside callers of this code anyway, so we'll just break them intentionally to get them to adapt.

  This allows removal of tsginidx.c's private re-implementation of TS_execute, since that's now entirely duplicative. It's also no longer necessary to avoid use of CALC_NOT in tsgistidx.c, since the underlying callbacks can now do something reasonable.

  Back-patch into v13. We can't change this in stable branches, but it seems not quite too late to fix it in v13.

  Tom Lane and Pavel Borisov

  Discussion: https://postgr.es/m/CALT9ZEE-aLotzBg-pOp2GFTesGWVYzXA3=mZKzRDa_OKnLF7Mg@mail.gmail.com
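  A minimal sketch of the reported case (hypothetical table name), where an index search previously could miss the row:

      CREATE TABLE docs (v tsvector);
      INSERT INTO docs VALUES ('crew:1B'::tsvector);
      CREATE INDEX ON docs USING gin (v);

      SET enable_seqscan = off;
      SELECT * FROM docs WHERE v @@ '!crew:A'::tsquery;
      -- the row does match ("crew" carries weight B, not A), but the
      -- boolean-logic GIN callbacks could fail to return it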
* Fix error message.  (Thomas Munro, 2020-07-23)

  Remove extra space. Back-patch to all releases, like commit 7897e3bb.

  Author: Lu, Chenyang <lucy.fnst@cn.fujitsu.com>
  Discussion: https://postgr.es/m/795d03c6129844d3803e7eea48f5af0d%40G08CNEXMBPEKD04.g08.fujitsu.local
* neqjoinsel must now pass through collation to eqjoinsel.  (Tom Lane, 2020-07-21)

  Since commit 044c99bc5, eqjoinsel passes the passed-in collation to any operators it invokes. However, neqjoinsel failed to pass on whatever collation it got, so that if we invoked a collation-dependent operator via that code path, we'd get "could not determine which collation to use for string comparison" or the like.

  Per report from Justin Pryzby. Back-patch to v12, like the previous commit.

  Discussion: https://postgr.es/m/20200721191606.GL5748@telsasoft.com
* Assert that we don't insert nulls into attnotnull catalog columns.  (Tom Lane, 2020-07-21)

  The executor checks for this error, and so does the bootstrap catalog loader, but we never checked for it in retail catalog manipulations. The folly of that has now been exposed, so let's add assertions checking it. Checking in CatalogTupleInsert[WithInfo] and CatalogTupleUpdate[WithInfo] should be enough to cover this.

  Back-patch to v10; the aforesaid functions didn't exist before that, and it didn't seem worth adapting the patch to the oldest branches. But given the risk of JIT crashes, I think we certainly need this as far back as v11.

  Pre-v13, we have to explicitly exclude pg_subscription.subslotname and pg_subscription_rel.srsublsn from the checks, since they are mismarked. (Even if we change our mind about applying BKI_FORCE_NULL in the branch tips, it doesn't seem wise to have assertions that would fire in existing databases.)

  Discussion: https://postgr.es/m/298837.1595196283@sss.pgh.pa.us
* Correctly mark pg_subscription_rel.srsublsn as nullable.  (Tom Lane, 2020-07-20)

  The code has always set this column to NULL when it's not valid, but the catalog header's description failed to reflect that, as did the SGML docs, as did some of the code. To prevent future coding errors of the same ilk, let's hide the field from C code as though it were variable-length (which, in a sense, it is).

  As with commit 72eab84a5, we can only fix this cleanly in HEAD and v13; the problem extends further back but we'll need some klugery in the released branches.

  Discussion: https://postgr.es/m/367660.1595202498@sss.pgh.pa.us
* Fix construction of updated-columns bitmap in logical replication.  (Tom Lane, 2020-07-20)

  Commit b9c130a1f failed to apply the publisher-to-subscriber column mapping while checking which columns were updated. Perhaps less significantly, it didn't exclude dropped columns either. This could result in an incorrect updated-columns bitmap and thus wrong decisions about whether to fire column-specific triggers on the subscriber while applying updates. In HEAD (since commit 9de77b545), it could also result in accesses off the end of the colstatus array, as detected by buildfarm member skink. Fix the logic, and adjust 003_constraints.pl so that the problem is exposed in unpatched code.

  In HEAD, also add some assertions to check that we don't access off the ends of these newly variable-sized arrays.

  Back-patch to v10, as b9c130a1f was.

  Discussion: https://postgr.es/m/CAH2-Wz=79hKQ4++c5A060RYbjTHgiYTHz=fw6mptCtgghH2gJA@mail.gmail.com
* Rename wal_keep_segments to wal_keep_size.  (Fujii Masao, 2020-07-20)

  max_slot_wal_keep_size, which was added in v13, and wal_keep_segments are the GUC parameters that specify how much WAL to retain for standby servers. While max_slot_wal_keep_size accepts a number of bytes of WAL, wal_keep_segments accepts a number of WAL files. This difference in setting units between two similar parameters could be confusing to users.

  To alleviate this situation, this commit renames wal_keep_segments to wal_keep_size, and makes users specify the WAL size in it instead of the number of WAL files.

  There was also the idea to rename max_slot_wal_keep_size to max_slot_wal_keep_segments in the discussion. But we have been moving away from measuring in segments; for example, checkpoint_segments was replaced by max_wal_size. So we concluded we should rename wal_keep_segments to wal_keep_size.

  Back-patch to v13 where max_slot_wal_keep_size was added.

  Author: Fujii Masao
  Reviewed-by: Álvaro Herrera, Kyotaro Horiguchi, David Steele
  Discussion: https://postgr.es/m/574b4ea3-e0f9-b175-ead2-ebea7faea855@oss.nttdata.com
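  For illustration (not from the commit), an equivalent setting before and after the rename, assuming the default 16MB segment size:

      # postgresql.conf, before (v12 and earlier):
      wal_keep_segments = 64        # 64 segments x 16MB
      # after (v13):
      wal_keep_size = '1GB'         # the same retention, expressed as a size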
* Fix minor typo in nodeIncrementalSort.c.  (Amit Kapila, 2020-07-20)

  Author: Vignesh C
  Reviewed-by: James Coleman
  Backpatch-through: 13, where it was introduced
  Discussion: https://postgr.es/m/CALDaNm0WjZqRvdeL59ZfYH0o4mLbKQ23jm-bnjXcFzgpANx55g@mail.gmail.com
* Correctly mark pg_subscription.subslotname as nullable.  (Tom Lane, 2020-07-19)

  Due to the layout of this catalog, subslotname has to be explicitly marked BKI_FORCE_NULL, else initdb will default to the assumption that it's non-nullable. Since, in fact, CREATE/ALTER SUBSCRIPTION will store null values there, the existing marking is just wrong, and has been since this catalog was invented. We haven't noticed because not much in the system actually depends on attnotnull being truthful. However, JIT'ed tuple deconstruction does depend on that in some cases, allowing crashes or wrong answers in queries that inspect pg_subscription. Commit 9de77b545 quite accidentally exposed this on the buildfarm members that force JIT activation.

  Back-patch to v13. The problem goes further back, but we cannot force initdb in released branches, so some klugier solution will be needed there. Before working on that, push this simple fix to try to get the buildfarm back to green.

  Discussion: https://postgr.es/m/4118109.1595096139@sss.pgh.pa.us
* Cope with data-offset-less archive files during out-of-order restores.  (Tom Lane, 2020-07-17)

  pg_dump produces custom-format archive files that lack data offsets when it is unable to seek its output. Up to now that's been a hazard for pg_restore. But if pg_restore is able to seek in the archive file, there is no reason to throw up our hands when asked to restore data blocks out of order. Instead, whenever we are searching for a data block, record the locations of the blocks we passed over (that is, fill in the missing data-offset fields in our in-memory copy of the TOC data). Then, when we hit a case that requires going backwards, we can just seek back.

  Also track the furthest point that we've searched to, and seek back to there when beginning a search for a new data block. This avoids possible O(N^2) time consumption, by ensuring that each data block is examined at most twice. (On Unix systems, that's at most twice per parallel-restore job; but since Windows uses threads here, the threads can share block location knowledge, reducing the amount of duplicated work.)

  We can also improve the code a bit by using fseeko() to skip over data blocks during the search.

  This is all of some use even in simple restores, but it's really significant for parallel pg_restore. In that case, we require seekability of the input already, and we will very probably need to do out-of-order restores.

  Back-patch to v12, as this fixes a regression introduced by commit 548e50976. Before that, parallel restore avoided requesting out-of-order restores, so it would work on a data-offset-less archive. Now it will again.

  Ideally this patch would include some test coverage, but there are other open bugs that need to be fixed before we can extend our coverage of parallel restore very much. Plan to revisit that later.

  David Gilman and Tom Lane; reviewed by Justin Pryzby

  Discussion: https://postgr.es/m/CALBH9DDuJ+scZc4MEvw5uO-=vRyR2=QF9+Yh=3hPEnKHWfS81A@mail.gmail.com
* Remove manual tracking of file position in pg_dump/pg_backup_custom.c.  (Tom Lane, 2020-07-17)

  We do not really need to track the file position by hand. We were already relying on ftello() whenever the archive file is seekable, while if it's not seekable we don't need the file position info anyway because we're not going to be able to re-write the TOC.

  Moreover, that tracking was buggy since it failed to account for the effects of fseeko(). Somewhat remarkably, that seems not to have made for any live bugs up to now. We could fix the oversights, but it seems better to just get rid of the whole error-prone mess.

  In itself this is merely code cleanup. However, it's necessary infrastructure for an upcoming bug-fix patch (because that code *does* need valid file position after fseeko). The bug fix needs to go back as far as v12; hence, back-patch that far.

  Discussion: https://postgr.es/m/CALBH9DDuJ+scZc4MEvw5uO-=vRyR2=QF9+Yh=3hPEnKHWfS81A@mail.gmail.com
* Avoid CREATE INDEX unique index deduplication.  (Peter Geoghegan, 2020-07-17)

  There is no advantage to attempting deduplication for a unique index during CREATE INDEX, since there cannot possibly be any duplicates. Doing so wastes cycles due to unnecessary copying. Make sure that we avoid it consistently.

  We already avoided unique index deduplication in the case where there were some spool2 tuples to merge. That didn't account for the fact that spool2 is removed early/unset in the common case where it has no tuples that need to be merged (i.e. it failed to account for the "spool2 turns out to be unnecessary" optimization in _bt_spools_heapscan()).

  Oversight in commit 0d861bbb, which added nbtree deduplication.

  Backpatch: 13-, where nbtree deduplication was introduced.
* Ensure that distributed timezone abbreviation files are plain ASCII.  (Tom Lane, 2020-07-17)

  We had two occurrences of "Mitteleuropäische Zeit" in Europe.txt, though the corresponding entries in Default were spelled "Mitteleuropaeische Zeit". Standardize on the latter spelling to avoid questions of which encoding to use.

  While here, correct a couple of other trivial inconsistencies between the Default file and the supposedly-matching entries in the *.txt files, as exposed by some checking with comm(1). Also, add BDST to the Europe.txt file; it previously was only listed in Default. None of this has any direct functional effect.

  Per complaint from Christoph Berg. As usual for timezone data patches, apply to all branches.

  Discussion: https://postgr.es/m/20200716100743.GE3534683@msg.df7cb.de
* Fix whitespace  (Peter Eisentraut, 2020-07-17)
* Resolve gratuitous tabs in SQL file  (Peter Eisentraut, 2020-07-17)
* Fix signal handler setup for SIGHUP in the apply launcher process.  (Amit Kapila, 2020-07-17)

  Commit 1e53fe0e70 unified the usage of the config-file reload flag by using the same signal handler function for the SIGHUP signal in many places in the code. By mistake, it hooked that handler function up to the wrong signal in the apply launcher process.

  Author: Bharath Rupireddy
  Reviewed-by: Dilip Kumar
  Backpatch-through: 13, where it was introduced
  Discussion: https://postgr.es/m/CALj2ACVzHCRnS20bOiEHaLtP5PVBENZQn4khdsSJQgOv_GM-LA@mail.gmail.com
* Switch pg_test_fsync to use binary mode on Windows  (Michael Paquier, 2020-07-16)

  pg_test_fsync has always opened files in text mode on Windows, as this is the default mode used if not enforced by _setmode().

  This fixes a failure when running pg_test_fsync on versions down to 12, because O_DSYNC and the text mode are not able to work together nicely. We fixed the handling of O_DSYNC in 12~ for the tool by switching to the concurrent-safe version of fopen() in src/port/ with 0ba06e0. And 40cfe86, by enforcing the text mode for compatibility reasons if O_TEXT or O_BINARY are not specified by the caller, broke pg_test_fsync. For all versions, this avoids any translation overhead, and pg_test_fsync should test binary writes anyway, so it is a gain in all cases.

  Note that O_DSYNC is still not handled correctly in ~11, leading pg_test_fsync to show insanely high numbers for open_datasync() (using this property it is easy to notice that the binary mode is much faster). This would require a backpatch of 0ba06e0 and 40cfe86, which could potentially break existing applications, so this is left out.

  There are no TAP tests for this tool yet, so I have checked all builds manually using MSVC. We could invent a new option to run a single transaction instead of using a duration of 1s to keep the test run as short as possible, but this is left as future work.

  Thanks to Bruce Momjian for the discussion.

  Reported-by: Jeff Janes
  Author: Michael Paquier
  Discussion: https://postgr.es/m/16526-279ded30a230d275@postgresql.org
  Backpatch-through: 9.5
* Fix handling of missing files when using pg_rewind with online source  (Michael Paquier, 2020-07-15)

  When working with an online source cluster, pg_rewind gets a list of all the files in the source data directory using a WITH RECURSIVE query, returning a NULL result for a file's metadata if it gets removed between the moment it is listed in a directory and the moment its metadata is obtained with pg_stat_file() (say a recycled WAL segment). The query result was processed in such a way that for each tuple we checked only whether the first file's metadata was NULL. This could have two consequences, both resulting in a failure of the rewind:

  - If the first tuple referred to a removed file, all files from the source would be ignored.
  - Any file actually missing would not be considered as such.

  While on it, rework slightly the code so as no values are saved if we know that a file is going to be skipped.

  Issue introduced by b36805f, so backpatch down to 9.5.

  Author: Justin Pryzby, Michael Paquier
  Reviewed-by: Daniel Gustafsson, Masahiko Sawada
  Discussion: https://postgr.es/m/20200713061010.GC23581@telsasoft.com
  Backpatch-through: 9.5
* Fix bitmap AND/OR scans on the inside of a nestloop partition-wise join.  (Tom Lane, 2020-07-14)

  reparameterize_path_by_child() failed to reparameterize BitmapAnd and BitmapOr paths. This matters only if such a path is chosen as the inside of a nestloop partition-wise join, where we have to pass in parameters from the outside of the nestloop. If that did happen, we generated a bad plan that would likely lead to crashes at execution.

  This is not entirely reparameterize_path_by_child()'s fault though; it's the victim of an ancient decision (my ancient decision, I think) to not bother filling in param_info in BitmapAnd/Or path nodes. That caused the function to believe that such nodes and their children contain no parameter references and so need not be processed.

  In hindsight that decision looks pretty penny-wise and pound-foolish: while it saves a few cycles during path node setup, we do commonly need the information later. In particular, by reversing the decision and requiring valid param_info data in all nodes of a bitmap path tree, we can get rid of indxpath.c's get_bitmap_tree_required_outer() function, which computed the data on-demand. It's not unlikely that that nets out as a savings of cycles in many scenarios. A couple of other things in indxpath.c can be simplified as well.

  While here, get rid of some cases in reparameterize_path_by_child() that are visibly dead or useless, given that we only care about reparameterizing paths that can be on the inside of a parameterized nestloop. This case reminds one of the maxim that untested code probably does not work, so I'm unwilling to leave unreachable code in this function. (I did leave the T_Gather case in place even though it's not reached in the regression tests. It's not very clear to me when the planner might prefer to put Gather below rather than above a nestloop, but at least in principle the case might be interesting.)

  Per bug #16536, originally from Arne Roland but with a test case by Andrew Gierth. Back-patch to v11 where this code came in.

  Discussion: https://postgr.es/m/16536-2213ee0b3aad41fd@postgresql.org
* Fix timing issue with ALTER TABLE's validate constraint  (David Rowley, 2020-07-14)

  An ALTER TABLE to validate a foreign key in which another subcommand already caused a pending table rewrite could fail due to ALTER TABLE attempting to validate the foreign key before the actual table rewrite takes place. This situation could result in an error such as:

      ERROR: could not read block 0 in file "base/nnnnn/nnnnn": read only 0 of 8192 bytes

  The failure here was due to the SPI call which validates the foreign key trying to access an index which is yet to be rebuilt.

  Similarly, we also incorrectly tried to validate CHECK constraints before the heap had been rewritten. The fix for both is to delay constraint validation until phase 3, after the table has been rewritten.

  For CHECK constraints this means a slight behavioral change. Previously ALTER TABLE VALIDATE CONSTRAINT on inheritance tables would be validated from the bottom up. This was different from the order of evaluation when a new CHECK constraint was added. The changes made here align the VALIDATE CONSTRAINT evaluation order for inheritance tables with ADD CONSTRAINT, which is generally top-down.

  Reported-by: Nazli Ugur Koyluoglu, using SQLancer
  Discussion: https://postgr.es/m/CAApHDvp%3DZXv8wiRyk_0rWr00skhGkt8vXDrHJYXRMft3TjkxCA%40mail.gmail.com
  Backpatch-through: 9.5 (all supported versions)
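  A sketch of the failing shape (hypothetical names; any rewriting subcommand can trigger it):

      ALTER TABLE orders
        ALTER COLUMN qty TYPE bigint,      -- queues a table rewrite
        VALIDATE CONSTRAINT orders_fk;     -- previously ran before the
                                           -- rewrite and could hit the
                                           -- error shown above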
* Fix comments related to table AMs  (Michael Paquier, 2020-07-14)

  Incorrect function names were referenced. As this fixes some portions of tableam.h, which is mentioned in the docs as something to look at when implementing a table AM, backpatch down to 12 where this has been introduced.

  Author: Hironobu Suzuki
  Discussion: https://postgr.es/m/8fe6d672-28dd-3f1d-7aed-ac2f6d599d3f@interdb.jp
  Backpatch-through: 12
* Cope with lateral references in the quals of a subquery RTE.  (Tom Lane, 2020-07-13)

  The qual pushdown logic assumed that all Vars in a restriction clause must be Vars referencing subquery outputs; but since we introduced LATERAL, it's possible for such a Var to be a lateral reference instead. This led to an assertion failure in debug builds. In a non-debug build, there might be no ill effects (if qual_is_pushdown_safe decided the qual was unsafe anyway), or we could get failures later due to construction of an invalid plan. I've not gone to much length to characterize the possible failures, but at least segfaults in the executor have been observed.

  Given that this has been busted since 9.3 and it took this long for anybody to notice, I judge that the case isn't worth going to great lengths to optimize. Hence, fix by just teaching qual_is_pushdown_safe that such quals are unsafe to push down, matching the previous behavior when it accidentally didn't fail.

  Per report from Tom Ellis. Back-patch to all supported branches.

  Discussion: https://postgr.es/m/20200713175124.GQ8220@cloudinit-builder
* Fix uninitialized value in segno calculation  (Alvaro Herrera, 2020-07-13)

  Remove previous hack in KeepLogSeg that added a case to deal with a (badly represented) invalid segment number. This was added for the sake of GetWALAvailability. But it's not needed if in that function we initialize the segment number to be retreated to the currently-being-written segment, so do that instead.

  Per valgrind-running buildfarm member skink, and some sparc64 animals.

  Discussion: https://postgr.es/m/1724648.1594230917@sss.pgh.pa.us
* Fix bugs in libpq's management of GSS encryption state.  (Tom Lane, 2020-07-13)

  GSS-related resources should be cleaned up in pqDropConnection, not freePGconn, else the wrong things happen when resetting a connection or trying to switch to a different server. It's also critical to reset conn->gssenc there.

  During connection setup, initialize conn->try_gss at the correct place, else switching to a different server won't work right.

  Remove now-redundant cleanup of GSS resources around one (and, for some reason, only one) pqDropConnection call in connectDBStart.

  Per report from Kyotaro Horiguchi that psql would freeze up, rather than successfully resetting a GSS-encrypted connection after a server restart.

  This is YA oversight in commit b0b39f72b, so back-patch to v12.

  Discussion: https://postgr.es/m/20200710.173803.435804731896516388.horikyota.ntt@gmail.com
* Improvements to psql \dAo and \dAp commands  (Alexander Korotkov, 2020-07-13)

  * Strategy number and purpose are essential information for an opfamily operator. So, show those columns in non-verbose output.
  * The "Left/right arg type" column names in \dAp are confusing, because those types don't necessarily match the function's arguments. Rename them to "Registered left/right type".
  * Replace manual assembling of operator/procedure names with casts to regoperator/regprocedure.
  * Add schema-qualification for pg_catalog functions and tables.

  Reported-by: Peter Eisentraut, Tom Lane
  Reviewed-by: Tom Lane
  Discussion: https://postgr.es/m/2edc7b27-031f-b2b6-0db2-864241c91cb9%402ndquadrant.com
  Backpatch-through: 13
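  For illustration (not from the commit), strategy numbers now appear without the verbose form:

      \dAo btree integer_ops
      -- lists the btree integer_ops operators along with their strategy
      -- numbers (for btree: 1 = less than, 3 = equal, 5 = greater than, ...)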
* HashAgg: before spilling tuples, set unneeded columns to NULL.  (Jeff Davis, 2020-07-12)

  This is a replacement for 4cad2534. Instead of projecting all tuples going into a HashAgg, only remove unnecessary attributes when actually spilling. This avoids the regression for the in-memory case.

  Discussion: https://postgr.es/m/a2fb7dfeb4f50aa0a123e42151ee3013933cb802.camel%40j-davis.com
  Backpatch-through: 13
* Revert "Use CP_SMALL_TLIST for hash aggregate"Jeff Davis2020-07-12
| | | | | | | | | | This reverts commit 4cad2534da6d17067d98cf04be2dfc1bda8f2cd0 due to a performance regression. It will be replaced by a new approach in an upcoming commit. Reported-by: Andres Freund Discussion: https://postgr.es/m/20200614181418.mx4bvljmfkkhoqzl@alap3.anarazel.de Backpatch-through: 13
* Revert "Track statistics for spilling of changes from ReorderBuffer".Amit Kapila2020-07-13
| | | | | | | | | | | | | | | | | | | | The stats with this commit was available only for WALSenders, however, users might want to see for backends doing logical decoding via SQL API. Then, users might want to reset and access these stats across server restart which was not possible with the current patch. List of commits reverted: caa3c4242c Don't call elog() while holding spinlock. e641b2a995 Doc: Update the documentation for spilled transaction statistics. 5883f5fe27 Fix unportable printf format introduced in commit 9290ad198. 9290ad198b Track statistics for spilling of changes from ReorderBuffer. Additionaly, remove the release notes entry for this feature. Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/CA+fd4k5_pPAYRTDrO2PbtTOe0eHQpBvuqmCr8ic39uTNmR49Eg@mail.gmail.com
* Avoid trying to restore table ACLs and per-column ACLs in parallel.  (Tom Lane, 2020-07-11)

  Parallel pg_restore has always supposed that ACL items for different objects are independent and can be restored in parallel without conflicts. However, there is one case where this fails: because REVOKE on a table is defined to also revoke the privilege(s) at column level, we can't restore per-column ACLs till after we restore any table-level privileges on their table. Failure to honor this restriction can lead to "tuple concurrently updated" errors during parallel restore, or even to the per-column ACLs silently disappearing because the table-level REVOKE is executed afterwards.

  To fix, add a dependency from each column-level ACL item to its table's ACL item, if there is one. Note that this doesn't fix the hazard for pre-existing archive files, only for ones made with a corrected pg_dump. Given that the bug's been there quite awhile without field reports, I think this is acceptable.

  This requires changing the API of pg_dump's dumpACL() function. To keep its argument list from getting even longer, I removed the "CatalogId objCatId" argument, which has been unused for ages.

  Per report from Justin Pryzby. Back-patch to all supported branches.

  Discussion: https://postgr.es/m/20200706050129.GW4107@telsasoft.com
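  For illustration (hypothetical names), the ordering hazard comes from REVOKE's column-level side effect:

      GRANT SELECT (id) ON t TO alice;   -- per-column ACL item in the dump
      REVOKE ALL ON t FROM alice;        -- table-level item: also revokes
                                         -- column privileges, so it must
                                         -- be restored first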
* Forbid numeric NaN in jsonpath  (Alexander Korotkov, 2020-07-11)

  The SQL standard doesn't define numeric Inf or NaN values. It appears even more ridiculous to support them in jsonpath, given that JSON doesn't support these values either. This commit forbids returning NaN from .double(), which was previously allowed. NaN can't be the result of an inner-jsonpath computation over non-NaNs, so we cannot expect NaN in the jsonpath output.

  Reported-by: Tom Lane
  Discussion: https://postgr.es/m/203949.1591879542%40sss.pgh.pa.us
  Author: Alexander Korotkov
  Reviewed-by: Tom Lane
  Backpatch-through: 12
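  For illustration (not from the commit):

      SELECT jsonb_path_query('"NaN"', '$.double()');
      -- previously: returned a NaN double value
      -- now: raises an error, since NaN is not a valid SQL/JSON number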
* Improve error reporting for jsonpath .double() method  (Alexander Korotkov, 2020-07-11)

  When the jsonpath .double() method detects that a numeric or string value can't be converted to double precision, it throws an error. This commit makes these errors explicitly state the reason for the failure.

  Discussion: https://postgr.es/m/CAPpHfdtqJtiSXkP7tOXez18NxhLUH_-75bL8%3DOce4Ki%2Bbv7V6Q%40mail.gmail.com
  Author: Alexander Korotkov
  Reviewed-by: Tom Lane
  Backpatch-through: 12