aboutsummaryrefslogtreecommitdiff
path: root/src/test
Commit message (Collapse)AuthorAge
* Fix interpolation in test name.Noah Misch2020-09-13
| | | | | | | A pre-commit review had reported the problem, but the fix reached only v10 and earlier. Back-patch to v11. Discussion: https://postgr.es/m/20200423.140546.1055476118690602079.horikyota.ntt@gmail.com
* Use the properly transformed RangeVar for expandTableLikeClause().Tom Lane2020-09-13
| | | | | | | | | | | | | | | transformCreateStmt() adjusts the transformed statement's RangeVar to specify the target schema explicitly, for the express reason of making sure that auxiliary statements derived by parse transformation operate on the right table. But the refactoring I did in commit 502898192 got this wrong and passed the untransformed RangeVar to expandTableLikeClause(). This could lead to assertion failures or weird misbehavior if the wrong table was accessed. Per report from Alexander Lakhin. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/05051f9d-b32b-cb35-6735-0e9f2ab86b5f@gmail.com
* Check default partitions constraints while descendingAlvaro Herrera2020-09-08
| | | | | | | | | | | | | | | | | | | | | | | Partitioning tuple route code assumes that the partition chosen while descending the partition hierarchy is always the correct one. This is true except when the partition is the default partition and another partition has been added concurrently: the partition constraint changes and we don't recheck it. This can lead to tuples mistakenly being added to the default partition that should have been rejected. Fix by rechecking the default partition constraint while descending the hierarchy. An isolation test based on the reproduction steps described by Hao Wu (with tweaks for extra coverage) is included. Backpatch to 12, where this bug came in with 898e5e3290a7. Reported by: Hao Wu <hawu@vmware.com> Author: Amit Langote <amitlangote09@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/CA+HiwqFqBmcSSap4sFnCBUEL_VfOMmEKaQ3gwUhyfa4c7J_-nA@mail.gmail.com Discussion: https://postgr.es/m/DM5PR0501MB3910E97A9EDFB4C775CF3D75A42F0@DM5PR0501MB3910.namprd05.prod.outlook.com
* Make new authentication test case more robust.Tom Lane2020-09-04
| | | | | | | | | | | I happened to notice that the new test case I added in b55b4dad9 falls over if one runs "make check" repeatedly; though not in branches after v10. That's because it was assuming that tmp_check/pgpass wouldn't exist already. However, it's only been since v11 that the Makefiles forcibly remove all of tmp_check/ before starting a TAP run. This fix to unlink the file is therefore strictly necessary only in v10 ... but it seems wisest to do it across the board, rather than let the test rely on external logic to get the conditions right.
* Collect attribute data on extension owned tables being dumpedAndrew Dunstan2020-09-04
| | | | | | | | | If this data is not collected, pg_dump segfaults if asked for column inserts. Fix by Fabrízio de Royes Mello Backpatch to release 12 where the bug was introduced.
* Fix rare deadlock failure in create_am regression test.Tom Lane2020-09-04
| | | | | | | | | | | | | | | | | The "DROP ACCESS METHOD gist2" test will require locking the index to be dropped and then its table; while most ordinary operations lock a table first then its index. While no concurrent test scripts should be touching fast_emp4000, autovacuum might chance to be processing that table when the DROP runs, resulting in a deadlock failure. This is pretty rare but we see it in the buildfarm from time to time. To fix, acquire a lock on fast_emp4000 before issuing the DROP. Since the point of the exercise is mostly to prevent buildfarm failures, back-patch to 9.6 where this test was introduced. Discussion: https://postgr.es/m/839004.1599185607@sss.pgh.pa.us
* Avoid lockup of a parallel worker when reporting a long error message.Tom Lane2020-09-03
| | | | | | | | | | | | | | | | | | | | | Because sigsetjmp() will restore the initial state with signals blocked, the code path in bgworker.c for reporting an error and exiting would execute that way. Usually this is fairly harmless; but if a parallel worker had an error message exceeding the shared-memory communication buffer size (16K) it would lock up, because it would wait for a resume-sending signal from its parallel leader which it would never detect. To fix, just unblock signals at the appropriate point. This can be shown to fail back to 9.6. The lack of parallel query infrastructure makes it difficult to provide a simple test case for 9.5; but I'm pretty sure the issue exists in some form there as well, so apply the code change there too. Vignesh C, reviewed by Bharath Rupireddy, Robert Haas, and myself Discussion: https://postgr.es/m/CALDaNm1d1hHPZUg3xU4XjtWBOLCrA+-2cJcLpw-cePZ=GgDVfA@mail.gmail.com
* Raise error on concurrent drop of partitioned indexAlvaro Herrera2020-09-01
| | | | | | | | | | | | | | | | | | We were already raising an error for DROP INDEX CONCURRENTLY on a partitioned table, albeit a different and confusing one: ERROR: DROP INDEX CONCURRENTLY must be first action in transaction Change that to throw a more comprehensible error: ERROR: cannot drop partitioned index \"%s\" concurrently Michael Paquier authored the test case for indexes on temporary partitioned tables. Backpatch to 11, where indexes on partitioned tables were added. Reported-by: Jan Mussler <jan.mussler@zalando.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/16594-d2956ca909585067@postgresql.org
* Teach libpq to handle arbitrary-length lines in .pgpass files.Tom Lane2020-09-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Historically there's been a hard-wired assumption here that no line of a .pgpass file could be as long as NAMEDATALEN*5 bytes. That's a bit shaky to start off with, because (a) there's no reason to suppose that host names fit in NAMEDATALEN, and (b) this figure fails to allow for backslash escape characters. However, it fails completely if someone wants to use a very long password, and we're now hearing reports of people wanting to use "security tokens" that can run up to several hundred bytes. Another angle is that the file is specified to allow comment lines, but there's no reason to assume that long comment lines aren't possible. Rather than guessing at what might be a more suitable limit, let's replace the fixed-size buffer with an expansible PQExpBuffer. That adds one malloc/free cycle to the typical use-case, but that's surely pretty cheap relative to the I/O this code has to do. Also, add TAP test cases to exercise this code, because there was no test coverage before. This reverts most of commit 2eb3bc588, as there's no longer a need for a warning message about overlength .pgpass lines. (I kept the explicit check for comment lines, though.) In HEAD and v13, this also fixes an oversight in 74a308cf5: there's not much point in explicit_bzero'ing the line buffer if we only do so in two of the three exit paths. Back-patch to all supported branches, except that the test case only goes back to v10 where src/test/authentication/ was added. Discussion: https://postgr.es/m/4187382.1598909041@sss.pgh.pa.us
* Avoid pushing quals down into sub-queries that have grouping sets.Tom Lane2020-08-22
| | | | | | | | | | | | | | | | | | | | | | | | | | The trouble with doing this is that an apparently-constant subquery output column isn't really constant if it is a grouping column that appears in only some of the grouping sets. A qual using such a column would be subject to incorrect const-folding after push-down, as seen in bug #16585 from Paul Sivash. To fix, just disable qual pushdown altogether if the sub-query has nonempty groupingSets. While we could imagine far less restrictive solutions, there is not much point in working harder right now, because subquery_planner() won't move HAVING clauses to WHERE within such a subquery. If the qual stays in HAVING it's not going to be a lot more useful than if we'd kept it at the outer level. Having said that, this restriction could be removed if we used a parsetree representation that distinguished such outputs from actual constants, which is something I hope to do in future. Hence, make the patch a minimal addition rather than integrating it more tightly (e.g. by renumbering the existing items in subquery_is_pushdown_safe's comment). Back-patch to 9.5 where grouping sets were introduced. Discussion: https://postgr.es/m/16585-9d8c340d23ade8c1@postgresql.org
* Fix handling of CREATE TABLE LIKE with inheritance.Tom Lane2020-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a CREATE TABLE command uses both LIKE and traditional inheritance, Vars in CHECK constraints and expression indexes that are absorbed from a LIKE parent table tended to get mis-numbered, resulting in wrong answers and/or bizarre error messages (though probably not any actual crashes, thanks to validation occurring in the executor). In v12 and up, the same could happen to Vars in GENERATED expressions, even in cases with no LIKE clause but multiple traditional-inheritance parents. The cause of the problem for LIKE is that parse_utilcmd.c supposed it could renumber such Vars correctly during transformCreateStmt(), which it cannot since we have not yet accounted for columns added via inheritance. Fix that by postponing processing of LIKE INCLUDING CONSTRAINTS, DEFAULTS, GENERATED, INDEXES till after we've performed DefineRelation(). The error with GENERATED and multiple inheritance is a simple oversight in MergeAttributes(); it knows it has to renumber Vars in inherited CHECK constraints, but forgot to apply the same processing to inherited GENERATED expressions (a/k/a defaults). Per bug #16272 from Tom Gottfried. The non-GENERATED variants of the issue are ancient, presumably dating right back to the addition of CREATE TABLE LIKE; hence back-patch to all supported branches. Discussion: https://postgr.es/m/16272-6e32da020e9a9381@postgresql.org
* Disable autovacuum for BRIN test tableAlvaro Herrera2020-08-17
| | | | | | | | This should improve stability in the tests. Per buildfarm member hyrax (CLOBBER_CACHE_ALWAYS) via Tom Lane. Discussion: https://postgr.es/m/871534.1597503261@sss.pgh.pa.us
* Be more careful about the shape of hashable subplan clauses.Tom Lane2020-08-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nodeSubplan.c expects that the testexpr for a hashable ANY SubPlan has the form of one or more OpExprs whose LHS is an expression of the outer query's, while the RHS is an expression over Params representing output columns of the subquery. However, the planner only went as far as verifying that the clauses were all binary OpExprs. This works 99.99% of the time, because the clauses have the right shape when emitted by the parser --- but it's possible for function inlining to break that, as reported by PegoraroF10. To fix, teach the planner to check that the LHS and RHS contain the right things, or more accurately don't contain the wrong things. Given that this has been broken for years without anyone noticing, it seems sufficient to just give up hashing when it happens, rather than go to the trouble of commuting the clauses back again (which wouldn't necessarily work anyway). While poking at that, I also noticed that nodeSubplan.c had a baked-in assumption that the number of hash clauses is identical to the number of subquery output columns. Again, that's fine as far as parser output goes, but it's not hard to break it via function inlining. There seems little reason for that assumption though --- AFAICS, the only thing it's buying us is not having to store the number of hash clauses explicitly. Adding code to the planner to reject such cases would take more code than getting nodeSubplan.c to cope, so I fixed it that way. This has been broken for as long as we've had hashable SubPlans, so back-patch to all supported branches. Discussion: https://postgr.es/m/1549209182255-0.post@n3.nabble.com
* Fix typo in test comment.Heikki Linnakangas2020-08-14
|
* Empty search_path in logical replication apply worker and walsender.Noah Misch2020-08-10
| | | | | | | | | | | | | | This is like CVE-2018-1058 commit 582edc369cdbd348d68441fc50fa26a84afd0c1a. Today, a malicious user of a publisher or subscriber database can invoke arbitrary SQL functions under an identity running replication, often a superuser. This fix may cause "does not exist" or "no schema has been selected to create in" errors in a replication process. After upgrading, consider watching server logs for these errors. Objects accruing schema qualification in the wake of the earlier commit are unlikely to need further correction. Back-patch to v10, which introduced logical replication. Security: CVE-2020-14349
* Fix yet another issue with step generation in partition pruning.Etsuro Fujita2020-08-07
| | | | | | | | | | | | | | | | Commit 13838740f fixed some issues with step generation in partition pruning, but there was yet another one: get_steps_using_prefix() assumes that clauses in the passed-in prefix list are sorted in ascending order of their partition key numbers, but the caller failed to ensure this for range partitioning, which led to an assertion failure in debug builds. Adjust the caller function to arrange the clauses in the prefix list in the required order for range partitioning. Back-patch to v11, like the previous commit. Patch by me, reviewed by Amit Langote. Discussion: https://postgr.es/m/CAPmGK16jkXiFG0YqMbU66wte-oJTfW6D1HaNvQf%3D%2B5o9%3Dm55wQ%40mail.gmail.com
* Fix rare failure in LDAP tests.Thomas Munro2020-08-03
| | | | | | | | | | | Instead of writing a query to psql's stdin, use -c. This avoids a failure where psql exits before we write, seen a few times on the build farm. Thanks to Tom Lane for the suggestion. Back-patch to 11, where the LDAP tests arrived. Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/CA%2BhUKGLFmW%2BHQYPeKiwSp5sdFFHtFViCpw4Mh6yAgEx74r5-Cw%40mail.gmail.com
* Fix some issues with step generation in partition pruning.Etsuro Fujita2020-07-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the case of range partitioning, get_steps_using_prefix() assumes that the passed-in prefix list contains at least one clause for each of the partition keys earlier than one specified in the passed-in step_lastkeyno, but the caller (ie, gen_prune_steps_from_opexps()) didn't take it into account, which led to a server crash or incorrect results when the list contained no clauses for such partition keys, as reported in bug #16500 and #16501 from Kobayashi Hisanori. Update the caller to call that function only when the list created there contains at least one clause for each of the earlier partition keys in the case of range partitioning. While at it, fix some other issues: * The list to pass to get_steps_using_prefix() is allowed to contain multiple clauses for the same partition key, as described in the comment for that function, but that function actually assumed that the list contained just a single clause for each of middle partition keys, which led to an assertion failure when the list contained multiple clauses for such partition keys. Update that function to match the comment. * In the case of hash partitioning, partition keys are allowed to be NULL, in which case the list to pass to get_steps_using_prefix() contains no clauses for NULL partition keys, but that function treats that case as like the case of range partitioning, which led to the assertion failure. Update the assertion test to take into account NULL partition keys in the case of hash partitioning. * Fix a typo in a comment in get_steps_using_prefix_recurse(). * gen_partprune_steps() failed to detect self-contradiction from strict-qual clauses and an IS NULL clause for the same partition key in some cases, producing incorrect partition-pruning steps, which led to incorrect results of partition pruning, but didn't cause any user-visible problems fortunately, as the self-contradiction is detected later in the query planning. Update that function to detect the self-contradiction. Per bug #16500 and #16501 from Kobayashi Hisanori. Patch by me, initial diagnosis for the reported issue and review by Dmitry Dolgov. Back-patch to v11, where partition pruning was introduced. Discussion: https://postgr.es/m/16500-d1613f2a78e1e090%40postgresql.org Discussion: https://postgr.es/m/16501-5234a9a0394f6754%40postgresql.org
* Kluge slot_compile_deform() to ignore incorrect attnotnull markings.Tom Lane2020-07-20
| | | | | | | | | | | | | | | | | | | | | | Since we mustn't force an initdb in released branches, there is no simple way to correct the markings of pg_subscription.subslotname and pg_subscription_rel.srsublsn as attnotnull in existing pre-v13 installations. Fortunately, released branches don't rely on attnotnull being correct for much. The planner looks at it in relation_excluded_by_constraints, but it'd be difficult to get that to matter for a query on a system catalog. The only place where it's really problematic is in JIT's slot_compile_deform(), which can produce incorrect code that crashes if there are NULLs in an allegedly not-null column. Hence, hack up slot_compile_deform() to be specifically aware of these two incorrect markings and not trust them. This applies to v11 and v12; the JIT code didn't exist before that, and we've fixed the markings in v13. Discussion: https://postgr.es/m/229396.1595191345@sss.pgh.pa.us
* Fix construction of updated-columns bitmap in logical replication.Tom Lane2020-07-20
| | | | | | | | | | | | | | | | | | | Commit b9c130a1f failed to apply the publisher-to-subscriber column mapping while checking which columns were updated. Perhaps less significantly, it didn't exclude dropped columns either. This could result in an incorrect updated-columns bitmap and thus wrong decisions about whether to fire column-specific triggers on the subscriber while applying updates. In HEAD (since commit 9de77b545), it could also result in accesses off the end of the colstatus array, as detected by buildfarm member skink. Fix the logic, and adjust 003_constraints.pl so that the problem is exposed in unpatched code. In HEAD, also add some assertions to check that we don't access off the ends of these newly variable-sized arrays. Back-patch to v10, as b9c130a1f was. Discussion: https://postgr.es/m/CAH2-Wz=79hKQ4++c5A060RYbjTHgiYTHz=fw6mptCtgghH2gJA@mail.gmail.com
* Fix bitmap AND/OR scans on the inside of a nestloop partition-wise join.Tom Lane2020-07-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | reparameterize_path_by_child() failed to reparameterize BitmapAnd and BitmapOr paths. This matters only if such a path is chosen as the inside of a nestloop partition-wise join, where we have to pass in parameters from the outside of the nestloop. If that did happen, we generated a bad plan that would likely lead to crashes at execution. This is not entirely reparameterize_path_by_child()'s fault though; it's the victim of an ancient decision (my ancient decision, I think) to not bother filling in param_info in BitmapAnd/Or path nodes. That caused the function to believe that such nodes and their children contain no parameter references and so need not be processed. In hindsight that decision looks pretty penny-wise and pound-foolish: while it saves a few cycles during path node setup, we do commonly need the information later. In particular, by reversing the decision and requiring valid param_info data in all nodes of a bitmap path tree, we can get rid of indxpath.c's get_bitmap_tree_required_outer() function, which computed the data on-demand. It's not unlikely that that nets out as a savings of cycles in many scenarios. A couple of other things in indxpath.c can be simplified as well. While here, get rid of some cases in reparameterize_path_by_child() that are visibly dead or useless, given that we only care about reparameterizing paths that can be on the inside of a parameterized nestloop. This case reminds one of the maxim that untested code probably does not work, so I'm unwilling to leave unreachable code in this function. (I did leave the T_Gather case in place even though it's not reached in the regression tests. It's not very clear to me when the planner might prefer to put Gather below rather than above a nestloop, but at least in principle the case might be interesting.) Per bug #16536, originally from Arne Roland but with a test case by Andrew Gierth. Back-patch to v11 where this code came in. Discussion: https://postgr.es/m/16536-2213ee0b3aad41fd@postgresql.org
* Fix timing issue with ALTER TABLE's validate constraintDavid Rowley2020-07-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | An ALTER TABLE to validate a foreign key in which another subcommand already caused a pending table rewrite could fail due to ALTER TABLE attempting to validate the foreign key before the actual table rewrite takes place. This situation could result in an error such as: ERROR: could not read block 0 in file "base/nnnnn/nnnnn": read only 0 of 8192 bytes The failure here was due to the SPI call which validates the foreign key trying to access an index which is yet to be rebuilt. Similarly, we also incorrectly tried to validate CHECK constraints before the heap had been rewritten. The fix for both is to delay constraint validation until phase 3, after the table has been rewritten. For CHECK constraints this means a slight behavioral change. Previously ALTER TABLE VALIDATE CONSTRAINT on inheritance tables would be validated from the bottom up. This was different from the order of evaluation when a new CHECK constraint was added. The changes made here aligns the VALIDATE CONSTRAINT evaluation order for inheritance tables to be the same as ADD CONSTRAINT, which is generally top-down. Reported-by: Nazli Ugur Koyluoglu, using SQLancer Discussion: https://postgr.es/m/CAApHDvp%3DZXv8wiRyk_0rWr00skhGkt8vXDrHJYXRMft3TjkxCA%40mail.gmail.com Backpatch-through: 9.5 (all supported versions)
* Cope with lateral references in the quals of a subquery RTE.Tom Lane2020-07-13
| | | | | | | | | | | | | | | | | | | | | | The qual pushdown logic assumed that all Vars in a restriction clause must be Vars referencing subquery outputs; but since we introduced LATERAL, it's possible for such a Var to be a lateral reference instead. This led to an assertion failure in debug builds. In a non-debug build, there might be no ill effects (if qual_is_pushdown_safe decided the qual was unsafe anyway), or we could get failures later due to construction of an invalid plan. I've not gone to much length to characterize the possible failures, but at least segfaults in the executor have been observed. Given that this has been busted since 9.3 and it took this long for anybody to notice, I judge that the case isn't worth going to great lengths to optimize. Hence, fix by just teaching qual_is_pushdown_safe that such quals are unsafe to push down, matching the previous behavior when it accidentally didn't fail. Per report from Tom Ellis. Back-patch to all supported branches. Discussion: https://postgr.es/m/20200713175124.GQ8220@cloudinit-builder
* Forbid numeric NaN in jsonpathAlexander Korotkov2020-07-11
| | | | | | | | | | | | | | SQL standard doesn't define numeric Inf or NaN values. It appears even more ridiculous to support then in jsonpath assuming JSON doesn't support these values as well. This commit forbids returning NaN from .double(), which was previously allowed. NaN can't be result of inner-jsonpath computation over non-NaNs. So, we can not expect NaN in the jsonpath output. Reported-by: Tom Lane Discussion: https://postgr.es/m/203949.1591879542%40sss.pgh.pa.us Author: Alexander Korotkov Reviewed-by: Tom Lane Backpatch-through: 12
* Improve error reporting for jsonpath .double() methodAlexander Korotkov2020-07-11
| | | | | | | | | | | When jsonpath .double() method detects that numeric or string can't be converted to double precision, it throws an error. This commit makes these errors explicitly express the reason of failure. Discussion: https://postgr.es/m/CAPpHfdtqJtiSXkP7tOXez18NxhLUH_-75bL8%3DOce4Ki%2Bbv7V6Q%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Tom Lane Backpatch-through: 12
* Tighten up Windows CRLF conversion in our TAP test scripts.Tom Lane2020-07-09
| | | | | | | | | | Back-patch commits 91bdf499b and ffb4cee43, so that all branches agree on when and how to do Windows CRLF conversion. This should close the referenced thread. Thanks to Andrew Dunstan for discussion/review. Discussion: https://postgr.es/m/412ae8da-76bb-640f-039a-f3513499e53d@gmx.net
* Future-proof regression tests against possibly-missing posixrules file.Tom Lane2020-06-19
| | | | | | | | | | | | | | | | | | | | | | | | | | The IANA time zone folk have deprecated use of a "posixrules" file in the tz database. While for now it's our choice whether to keep supplying one in our own builds, installations built with --with-system-tzdata will soon be needing to cope with that file not being present, at least on some platforms. This causes a problem for the horology test, which expected the nonstandard POSIX zone spec "CST7CDT" to apply pre-2007 US daylight savings rules. That does happen if the posixrules file supplies such information, but otherwise the test produces undesired results. To fix, add an explicit transition date rule that matches 2005 practice. (We could alternatively have switched the test to use some real time zone, but it seems useful to have coverage of this type of zone spec.) While at it, update a documentation example that also relied on "CST7CDT"; use a real-world zone name instead. Also, document why the zone names EST5EDT, CST6CDT, MST7MDT, PST8PDT aren't subject to similar failures when "posixrules" is missing. Back-patch to all supported branches, since the hazard is the same for all. Discussion: https://postgr.es/m/1665379.1592581287@sss.pgh.pa.us
* Fix deadlock danger when atomic ops are done under spinlock.Andres Freund2020-06-18
| | | | | | | | | | | | | | | | | This was a danger only for --disable-spinlocks in combination with atomic operations unsupported by the current platform. While atomics.c was careful to signal that a separate semaphore ought to be used when spinlock emulation is active, spin.c didn't actually implement that mechanism. That's my (Andres') fault, it seems to have gotten lost during the development of the atomic operations support. Fix that issue and add test for nesting atomic operations inside a spinlock. Author: Andres Freund Discussion: https://postgr.es/m/20200605023302.g6v3ydozy5txifji@alap3.anarazel.de Backpatch: 9.5-
* Add basic spinlock tests to regression tests.Andres Freund2020-06-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | As s_lock_test, the already existing test for spinlocks, isn't run in an automated fashion (and doesn't test a normal backend environment), adding tests that are run as part of a normal regression run is a good idea. Particularly in light of several recent and upcoming spinlock related fixes. Currently the new tests are run as part of the pre-existing test_atomic_ops() test. That perhaps can be quibbled about, but for now seems ok. The only operations that s_lock_test tests but the new tests don't are the detection of a stuck spinlock and S_LOCK_FREE (which is otherwise unused, not implemented on all platforms, and will be removed). This currently contains a test for more than INT_MAX spinlocks (only run with --disable-spinlocks), to ensure the recent commit fixing a bug with more than INT_MAX spinlock initializations is correct. That test is somewhat slow, so we might want to disable it after a few days. It might be worth retiring s_lock_test after this. The added coverage of a stuck spinlock probably isn't worth the added complexity? Author: Andres Freund Discussion: https://postgr.es/m/20200606023103.avzrctgv7476xj7i@alap3.anarazel.de
* Fix oldest xmin and LSN computation across repslots after advancingMichael Paquier2020-06-18
| | | | | | | | | | | | | | | | | Advancing a replication slot did not recompute the oldest xmin and LSN values across replication slots, preventing resource removal like segments not recycled at checkpoint time. The original commit that introduced the slot advancing in 9c7d06d never did the update of those oldest values, and b0afdca removed this code. This commit adds a TAP test to check segment recycling with advancing for physical slots, enforcing an extra segment switch before advancing to check if the segment gets correctly recycled after a checkpoint. Reported-by: Andres Freund Reviewed-by: Alexey Kondratov, Kyptaro Horiguchi Discussion: https://postgr.es/m/20200609171904.kpltxxvjzislidks@alap3.anarazel.de Backpatch-through: 11
* Fix behavior of float aggregates for single Inf or NaN inputs.Tom Lane2020-06-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When there is just one non-null input value, and it is infinity or NaN, aggregates such as stddev_pop and covar_pop should produce a NaN result, because the calculation is not well-defined. They used to do so, but since we adopted Youngs-Cramer aggregation in commit e954a727f, they produced zero instead. That's an oversight, so fix it. Add tests exercising these edge cases. Affected aggregates are var_pop(double precision) stddev_pop(double precision) var_pop(real) stddev_pop(real) regr_sxx(double precision,double precision) regr_syy(double precision,double precision) regr_sxy(double precision,double precision) regr_r2(double precision,double precision) regr_slope(double precision,double precision) regr_intercept(double precision,double precision) covar_pop(double precision,double precision) corr(double precision,double precision) Back-patch to v12 where the behavior change was accidentally introduced. Report and patch by me; thanks to Dean Rasheed for review. Discussion: https://postgr.es/m/353062.1591898766@sss.pgh.pa.us
* Fix mishandling of NaN counts in numeric_[avg_]combine.Tom Lane2020-06-11
| | | | | | | | | | | | | | | | | | | | | | When merging two NumericAggStates, the code missed adding the new state's NaNcount unless its N was also nonzero; since those counts are independent, this is wrong. This would only have visible effect if some partial aggregate scans found only NaNs while earlier ones found only non-NaNs; then we could end up falsely deciding that there were no NaNs and fail to return a NaN final result as expected. That's pretty improbable, so it's no surprise this hasn't been reported from the field. Still, it's a bug. I didn't try to produce a regression test that would show the bug, but I did notice that these functions weren't being reached at all in our regression tests, so I improved the tests to at least exercise them. With these additions, I see pretty complete code coverage on the aggregation-related functions in numeric.c. Back-patch to 9.6 where this code was introduced. (I only added the improved test case as far back as v10, though, since the relevant part of aggregates.sql isn't there at all in 9.6.)
* Avoid update conflict out serialization anomalies.Peter Geoghegan2020-06-11
| | | | | | | | | | | | | | | | | | | | | | | | | | SSI's HeapCheckForSerializableConflictOut() test failed to correctly handle conditions involving a concurrently inserted tuple which is later concurrently updated by a separate transaction . A SELECT statement that called HeapCheckForSerializableConflictOut() could end up using the same XID (updater's XID) for both the original tuple, and the successor tuple, missing the XID of the xact that created the original tuple entirely. This only happened when neither tuple from the chain was visible to the transaction's MVCC snapshot. The observable symptoms of this bug were subtle. A pair of transactions could commit, with the later transaction failing to observe the effects of the earlier transaction (because of the confusion created by the update to the non-visible row). This bug dates all the way back to commit dafaa3ef, which added SSI. To fix, make sure that we check the xmin of concurrently inserted tuples that happen to also have been updated concurrently. Author: Peter Geoghegan Reported-By: Kyle Kingsbury Reviewed-By: Thomas Munro Discussion: https://postgr.es/m/db7b729d-0226-d162-a126-8a8ab2dc4443@jepsen.io Backpatch: All supported versions
* Fix typos.Amit Kapila2020-06-11
| | | | | | | Reported-by: John Naylor Author: John Naylor Backpatch-through: 9.5 Discussion: https://postgr.es/m/CACPNZCtRuvs6G+EYqejhVJgBq2AKeZdXRVJsbX4syhO9gn5SNQ@mail.gmail.com
* Preserve pg_index.indisreplident across REINDEX CONCURRENTLYMichael Paquier2020-06-05
| | | | | | | | | | | If the flag value is lost, logical decoding would work the same way as REPLICA IDENTITY NOTHING, meaning that no old tuple values would be included in the changes anymore produced by logical decoding. Author: Michael Paquier Reviewed-by: Euler Taveira Discussion: https://postgr.es/m/20200603065340.GK89559@paquier.xyz Backpatch-through: 12
* Reject "23:59:60.nnn" in datetime input.Tom Lane2020-06-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It's intentional that we don't allow values greater than 24 hours, while we do allow "24:00:00" as well as "23:59:60" as inputs. However, the range check was miscoded in such a way that it would accept "23:59:60.nnn" with a nonzero fraction. For time or timetz, the stored result would then be greater than "24:00:00" which would fail dump/reload, not to mention possibly confusing other operations. Fix by explicitly calculating the result and making sure it does not exceed 24 hours. (This calculation is redundant with what will happen later in tm2time or tm2timetz. Maybe someday somebody will find that annoying enough to justify refactoring to avoid the duplication; but that seems too invasive for a back-patched bug fix, and the cost is probably unmeasurable anyway.) Note that this change also rejects such input as the time portion of a timestamp(tz) value. Back-patch to v10. The bug is far older, but to change this pre-v10 we'd need to ensure that the logic behaves sanely with float timestamps, which is possibly nontrivial due to roundoff considerations. Doesn't really seem worth troubling with. Per report from Christoph Berg. Discussion: https://postgr.es/m/20200520125807.GB296739@msg.df7cb.de
* Fix crashes with currtid() and currtid2()Michael Paquier2020-06-01
| | | | | | | | | | | | | | | | | | | | | | | A relation that has no storage initializes rd_tableam to NULL, which caused those two functions to crash because of a pointer dereference. Note that in 11 and older versions, this has always failed with a confusing error "could not open file". These two functions are used by the Postgres ODBC driver, which requires them only when connecting to a backend strictly older than 8.1. When connected to 8.2 or a newer version, the driver uses a RETURNING clause instead whose support has been added in 8.2, so it should be possible to just remove both functions in the future. This is left as an issue to address later. While on it, add more regression tests for those functions as we never really had coverage for them, and for aggregates of TIDs. Reported-by: Jaime Casanova, via sqlsmith Author: Michael Paquier Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/CAJGNTeO93u-5APMga6WH41eTZ3Uee9f3s8dCpA-GSSqNs1b=Ug@mail.gmail.com Backpatch-through: 12
* Make install-tests target work with vpath buildsAndrew Dunstan2020-05-31
| | | | | | | | Also add a top-level install-tests target. Backpatch to all live branches. Craig Ringer, tweaked by me.
* Fix typo in test comment.Heikki Linnakangas2020-05-28
| | | | | The same comment was copied to a few different places, with the same typo. Backpatch down to v11, where this typo was introduced.
* Fix assertion with relation using REPLICA IDENTITY FULL in subscriberMichael Paquier2020-05-16
| | | | | | | | | | | | | | | | In a logical replication subscriber, a table using REPLICA IDENTITY FULL which has a primary key would try to use the primary key's index available to scan for a tuple, but an assertion only assumed as correct the case of an index associated to REPLICA IDENTITY USING INDEX. This commit corrects the assertion so as the use of a primary key index is a valid case. Reported-by: Dilip Kumar Analyzed-by: Dilip Kumar Author: Euler Taveira Reviewed-by: Michael Paquier, Masahiko Sawada Discussion: https://postgr.es/m/CAFiTN-u64S5bUiPL1q5kwpHNd0hRnf1OE-bzxNiOs5zo84i51w@mail.gmail.com Backpatch-through: 10
* In successful pg_recvlogical, end PGRES_COPY_OUT cleanly.Noah Misch2020-05-13
| | | | | | | | | | | | | | | | | pg_recvlogical merely called PQfinish(), so the backend sent messages after the disconnect. When that caused EPIPE in internal_flush(), before a LogicalConfirmReceivedLocation(), the next pg_recvlogical would repeat already-acknowledged records. Whether or not the defect causes EPIPE, post-disconnect messages could contain an ErrorResponse that the user should see. One properly ends PGRES_COPY_OUT by repeating PQgetCopyData() until it returns a negative value. Augment one of the tests to cover the case of WAL past --endpos. Back-patch to v10, where commit 7c030783a5bd07cadffc2a1018bc33119a4c7505 first appeared. Before that commit, pg_recvlogical never reached PGRES_COPY_OUT. Reported by Thomas Munro. Discussion: https://postgr.es/m/CAEepm=1MzM2Z_xNe4foGwZ1a+MO_2S9oYDq3M5D11=JDU_+0Nw@mail.gmail.com
* Fix several DDL issues of generated columns versus inheritancePeter Eisentraut2020-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | Several combinations of generated columns and inheritance in CREATE TABLE were not handled correctly. Specifically: - Disallow a child column specifying a generation expression if the parent column is a generated column. The child column definition must be unadorned and the parent column's generation expression will be copied. - Prohibit a child column of a generated parent column specifying default values or identity. - Allow a child column of a not-generated parent column specifying itself as a generated column. This previously did not work, but it was possible to arrive at the state via other means (involving ALTER TABLE), so it seems sensible to support it. Add tests for each case. Also add documentation about the rules involving generated columns and inheritance. Discussion: https://www.postgresql.org/message-id/flat/15830.1575468847%40sss.pgh.pa.us https://www.postgresql.org/message-id/flat/2678bad1-048f-519a-ef24-b12962f41807%40enterprisedb.com https://www.postgresql.org/message-id/flat/CAJvUf_u4h0DxkCMCeEKAWCuzGUTnDP-G5iVmSwxLQSXn0_FWNQ%40mail.gmail.com
* Propagate ALTER TABLE ... SET STORAGE to indexesPeter Eisentraut2020-05-08
| | | | | | | | | When creating a new index, the attstorage setting of the table column is copied to regular (non-expression) index columns. But a later ALTER TABLE ... SET STORAGE is not propagated to indexes, thus creating an inconsistent and undumpable state. Discussion: https://www.postgresql.org/message-id/flat/9765d72b-37c0-06f5-e349-2a580aafd989%402ndquadrant.com
* Fix YA text phrase search bug.Tom Lane2020-05-07
| | | | | | | | | | | | | | | | | | | checkcondition_str() failed to report multiple matches for a prefix pattern correctly: it would dutifully merge the match positions, but then after exiting that loop, if the last prefix-matching word had had no suitable positions, it would report there were no matches. The upshot would be failing to recognize a match that the query should match. It looks like you need all of these conditions to see the bug: * a phrase search (else we don't ask for match position details) * a prefix search item (else we don't get to this code) * a weight restriction (else checkclass_str won't fail) Noted while investigating a problem report from Pavel Borisov, though this is distinct from the issue he was on about. Back-patch to 9.6 where phrase search was added.
* Fix full text search to handle NOT above a phrase search correctly.Tom Lane2020-04-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Queries such as '!(foo<->bar)' failed to find matching rows when implemented as a GiST or GIN index search. That's because of failing to handle phrase searches as tri-valued when considering a query without any position information for the target tsvector. We can only say that the phrase operator might match, not that it does match; and therefore its NOT also might match. The previous coding incorrectly inverted the approximate phrase result to decide that there was certainly no match. To fix, we need to make TS_phrase_execute return a real ternary result, and then bubble that up accurately in TS_execute. As long as we have to do that anyway, we can simplify the baroque things TS_phrase_execute was doing internally to manage tri-valued searching with only a bool as explicit result. For now, I left the externally-visible result of TS_execute as a plain bool. There do not appear to be any outside callers that need to distinguish a three-way result, given that they passed in a flag saying what to do in the absence of position data. This might need to change someday, but we wouldn't want to back-patch such a change. Although tsginidx.c has its own TS_execute_ternary implementation for use at upper index levels, that sadly managed to get this case wrong as well :-(. Fixing it is a lot easier fortunately. Per bug #16388 from Charles Offenbacher. Back-patch to 9.6 where phrase search was introduced. Discussion: https://postgr.es/m/16388-98cffba38d0b7e6e@postgresql.org
* Remove some unstable parts from new TAP test for archive status checkMichael Paquier2020-04-24
| | | | | | | | | | | | | | The test is proving to have timing issues when looking at archive status files on standbys after crash recovery, while other parts of the test rely on pg_stat_archiver as a wait point to make sure that a given state of the archiving is reached. The coverage is not heavily impacted by the removal those extra tests. Per reports from several buildfarm animals, like crake, piculet, culicidae and francolin. Discussion: https://postgr.es/m/20200424005929.GK33034@paquier.xyz Backpatch-through: 9.5
* Fix handling of WAL segments ready to be archived during crash recoveryMichael Paquier2020-04-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 78ea8b5 has fixed an issue related to the recycling of WAL segments on standbys depending on archive_mode. However, it has introduced a regression with the handling of WAL segments ready to be archived during crash recovery, causing those files to be recycled without getting archived. This commit fixes the regression by tracking in shared memory if a live cluster is either in crash recovery or archive recovery as the handling of WAL segments ready to be archived is different in both cases (those WAL segments should not be removed during crash recovery), and by using this new shared memory state to decide if a segment can be recycled or not. Previously, it was not possible to know if a cluster was in crash recovery or archive recovery as the shared state was able to track only if recovery was happening or not, leading to the problem. A set of TAP tests is added to close the gap here, making sure that WAL segments ready to be archived are correctly handled when a cluster is in archive or crash recovery with archive_mode set to "on" or "always", for both standby and primary. Reported-by: Benoît Lobréau Author: Jehan-Guillaume de Rorthais Reviewed-by: Kyotaro Horiguchi, Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/20200331172229.40ee00dc@firost Backpatch-through: 9.5
* Fix detaching partitions with cloned row triggersAlvaro Herrera2020-04-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a partition is detached, any triggers that had been cloned from its parent were not properly disentangled from its parent triggers. This resulted in triggers that could not be dropped because they depended on the trigger in the trigger in the no-longer-parent table: ALTER TABLE t DETACH PARTITION t1; DROP TRIGGER trig ON t1; ERROR: cannot drop trigger trig on table t1 because trigger trig on table t requires it HINT: You can drop trigger trig on table t instead. Moreover the table can no longer be re-attached to its parent, because the trigger name is already taken: ALTER TABLE t ATTACH PARTITION t1 FOR VALUES FROM (1)TO(2); ERROR: trigger "trig" for relation "t1" already exists The former is a bug introduced in commit 86f575948c77. (The latter is not necessarily a bug, but it makes the bug more uncomfortable.) To avoid the complexity that would be needed to tell whether the trigger has a local definition that has to be merged with the one coming from the parent table, establish the behavior that the trigger is removed when the table is detached. Backpatch to pg11. Author: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://www.postgresql.org/message-id/flat/20200408152412.GZ2228@telsasoft.com
* Fix possible crash with GENERATED ALWAYS columnsDavid Rowley2020-04-18
| | | | | | | | | | | | | | | | | | In some corner cases, this could also lead to corrupted values being included in the tuple. Users who are concerned that they are affected by this should first upgrade and then perform a base backup of their database and restore onto an off-line server. They should then query each table with generated columns to ensure there are no rows where the generated expression does not match a newly calculated version of the GENERATED ALWAYS expression. If no crashes occur and no rows are returned then you're not affected. Fixes bug #16369. Reported-by: Cameron Ezell Discussion: https://postgr.es/m/16369-5845a6f1bef59884@postgresql.org Backpatch-through: 12 (where GENERATED ALWAYS columns were added.)
* Add a wait_for_catchup() before immediate stop of a test master.Noah Misch2020-04-13
| | | | | | | Per buildfarm member hoverfly, a slow walsender could make the test fail. Back-patch to v10, where the test was introduced. Discussion: https://postgr.es/m/20200414013849.GA886648@rfd.leadboat.com