aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
...
* Fix handling of "Subplans Removed" field in EXPLAIN output.Tom Lane2020-02-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 499be013d added this field in a rather poorly-thought-through manner, with the result being that rather than being a field of the Append or MergeAppend plan node as intended (and as it seems to be, in text format), it was actually an element of the "Plans" subgroup. At least in JSON format, that's flat out invalid syntax, because "Plans" is an array not an object. While it's not hard to move the generation of the field so that it appears where it's supposed to, this does result in a visible change in field order in text format, in cases where a Append or MergeAppend plan node has any InitPlans attached. That's slightly annoying to do in stable branches; but the alternative of continuing to emit broken non-text formats seems worse. Also, since the set of fields emitted is not supposed to be data-dependent in non-text formats, make sure that "Subplans Removed" appears in Append and MergeAppend nodes even when it's zero, in those formats. (The previous coding made it look like it could appear in some other node types such as BitmapAnd, but we don't actually support runtime pruning there, so don't emit it in those cases.) Per bug #16171 from Mahadevan Ramachandran. Fix by Daniel Gustafsson and Tom Lane, reviewed by Hamid Akhtar. Back-patch to v11 where this code came in. Discussion: https://postgr.es/m/16171-b72259ab75505fa2@postgresql.org
* Add missing break out seqscan loop in logical replicationAlvaro Herrera2020-02-03
| | | | | | | | | | | | | When replica identity is FULL (an admittedly unusual case), the loop that searches for tuples in execReplication.c didn't stop scanning the table when once a matching tuple was found. Add the missing 'break'. Note slight behavior change: we now return the first matching tuple rather than the last one. They are supposed to be indistinguishable anyway, so this shouldn't matter. Author: Konstantin Knizhnik Discussion: https://postgr.es/m/379743f6-ae91-b866-f7a2-5624e6d2b0a4@postgrespro.ru
* Revert commit de0177788b.Fujii Masao2020-02-03
| | | | | | | | | | | | This commit reverts the fix "Make inherited TRUNCATE perform access permission checks on parent table only" only in the back branches. It's not hard to imagine that there are some applications expecting the old behavior and the fix breaks their security. To avoid this compatibility problem, we decided to apply the fix only in HEAD and revert it in all supported back branches. Discussion: https://postgr.es/m/21015.1580400165@sss.pgh.pa.us
* Fix memory leak on DSM slot exhaustion.Thomas Munro2020-02-01
| | | | | | | | | | | | | | If we attempt to create a DSM segment when no slots are available, we should return the memory to the operating system. Previously we did that if the DSM_CREATE_NULL_IF_MAXSEGMENTS flag was passed in, but we didn't do it if an error was raised. Repair. Back-patch to 9.4, where DSM segments arrived. Author: Thomas Munro Reviewed-by: Robert Haas Reported-by: Julian Backes Discussion: https://postgr.es/m/CA%2BhUKGKAAoEw-R4om0d2YM4eqT1eGEi6%3DQot-3ceDR-SLiWVDw%40mail.gmail.com
* Fix CheckAttributeType's handling of collations for ranges.Tom Lane2020-01-31
| | | | | | | | | | | | | | | | | | | | Commit fc7695891 changed CheckAttributeType to recurse into ranges, but made it pass down the wrong collation (always InvalidOid, since ranges as such have no collation). This would result in guaranteed failure when considering a range type whose subtype is collatable. Embarrassingly, we lack any regression tests that would expose such a problem (but fortunately, somebody noticed before we shipped this bug in any release). Fix it to pass down the range's subtype collation property instead, and add some regression test cases to exercise collatable-subtype ranges a bit more. Back-patch to all supported branches, as the previous patch was. Report and patch by Julien Rouhaud, test cases tweaked by me Discussion: https://postgr.es/m/CAOBaU_aBWqNweiGUFX0guzBKkcfJ8mnnyyGC_KBQmO12Mj5f_A@mail.gmail.com
* Handle lack of DSM slots in parallel btree build.Thomas Munro2020-01-31
| | | | | | | | | | | | | If no DSM slots are available, a ParallelContext can still be created, but its seg pointer is NULL. Teach parallel btree build to cope with that by falling back to a regular non-parallel build, to avoid crashing with a segmentation fault. Back-patch to 11, where parallel CREATE INDEX landed. Reported-by: Nicola Contu Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/CA%2BhUKGJgJEBnkuODBVomyK3MWFvDBbMVj%3Dgdt6DnRPU-5sQ6UQ%40mail.gmail.com
* Make inherited TRUNCATE perform access permission checks on parent table only.Fujii Masao2020-01-31
| | | | | | | | | | | | | | Previously, TRUNCATE command through a parent table checked the permissions on not only the parent table but also the children tables inherited from it. This was a bug and inherited queries should perform access permission checks on the parent table only. This commit fixes that bug. Back-patch to all supported branches. Author: Amit Langote Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/CAHGQGwFHdSvifhJE+-GSNqUHSfbiKxaeQQ7HGcYz6SC2n_oDcg@mail.gmail.com
* Fix slot data persistency when advancing physical replication slotsMichael Paquier2020-01-30
| | | | | | | | | | | | | | | | | | | Advancing a physical replication slot with pg_replication_slot_advance() did not mark the slot as dirty if any advancing was done, preventing the follow-up checkpoint to flush the slot data to disk. This caused the advancing to be lost even on clean restarts. This does not happen for logical slots as any advancing marked the slot as dirty. Per discussion, the original feature has been implemented so as in the event of a crash the slot may move backwards to a past LSN. This property is kept and more documentation is added about that. This commit adds some new TAP tests to check the persistency of physical and logical slots after advancing across clean restarts. Author: Alexey Kondratov, Michael Paquier Reviewed-by: Andres Freund, Kyotaro Horiguchi, Craig Ringer Discussion: https://postgr.es/m/059cc53a-8b14-653a-a24d-5f867503b0ee@postgrespro.ru Backpatch-through: 11
* Fix dangling pointer in EvalPlanQual machinery.Tom Lane2020-01-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | EvalPlanQualStart() supposed that it could re-use the relsubs_rowmark and relsubs_done arrays from a prior instantiation. But since they are allocated in the es_query_cxt of the recheckestate, that's just wrong; EvalPlanQualEnd() will blow away that storage. Therefore we were using storage that could have been reallocated to something else, causing all sorts of havoc. I think this was modeled on the old code's handling of es_epqTupleSlot, but since the code was anyway clearing the arrays at re-use, there's clearly no expectation of importing any outside state. So it's just a dubious savings of a couple of pallocs, which is negligible compared to setting up a new planstate tree. Therefore, just allocate the arrays always. (I moved the allocations slightly for readability.) In principle this bug could cause a problem whenever EPQ rechecks are needed in more than one target table of a ModifyTable plan node. In practice it seems not quite so easy to trigger as that; I couldn't readily duplicate a crash with a partitioned target table, for instance. That's probably down to incidental choices about when to free or reallocate stuff. The added isolation test case does seem to reliably show an assertion failure, though. Per report from Oleksii Kliukin. Back-patch to v12 where the bug was introduced (evidently by commit 3fb307bc4). Discussion: https://postgr.es/m/EEF05F66-2871-4786-992B-5F45C92FEE2E@hintbits.com
* Avoid unnecessary shm writes in Parallel Hash Join.Thomas Munro2020-01-27
| | | | | | | | | | | | | | Currently, Parallel Hash Join cannot be used for full/right joins, so there is no point in setting the match flag. It turns out that the cache coherence traffic generated by those writes slows down large systems running many-core joins, so let's stop doing that. In future, if we need to use match bits in parallel joins, we might want to consider setting them only if not already set. Back-patch to 11, where Parallel Hash Join arrived. Reported-by: Deng, Gang Discussion: https://postgr.es/m/0F44E799048C4849BAE4B91012DB910462E9897A%40SHSMSX103.ccr.corp.intel.com
* Fix EXPLAIN (SETTINGS) to follow policy about when to print empty fields.Tom Lane2020-01-26
| | | | | | | | | | | | | | | | | | | | | | In non-TEXT output formats, the "Settings" field should appear when requested, even if it would be empty. Also, get rid of the premature optimization of counting all the GUC_EXPLAIN variables at startup. Since there was no provision for adjusting that count later, all it'd take would be some extension marking a parameter as GUC_EXPLAIN to risk an assertion failure or memory stomp. We could make get_explain_guc_options() count those variables on-the-fly, or dynamically resize its array ... but TBH I do not think that making a transient array of pointers a bit smaller is worth any extra complication, especially when you consider all the other transient space EXPLAIN eats. So just allocate that array at the max possible size. In HEAD, also add some regression test coverage for this feature. Because of the memory-stomp hazard, back-patch to v12 where this feature was added. Discussion: https://postgr.es/m/19416.1580069629@sss.pgh.pa.us
* Fix an oversight in commit 4c70098ff.Tom Lane2020-01-23
| | | | | | | | | | | | | | | | | | | | I had supposed that the from_char_seq_search() call sites were all passing the constant arrays you'd expect them to pass ... but on looking closer, the one for DY format was passing the days[] array not days_short[]. This accidentally worked because the day abbreviations in English are all the same as the first three letters of the full day names. However, once we took out the "maximum comparison length" logic, it stopped working. As penance for that oversight, add regression test cases covering this, as well as every other switch case in DCH_from_char() that was not reached according to the code coverage report. Also, fold the DCH_RM and DCH_rm cases into one --- now that seq_search is case independent, there's no need to pass different comparison arrays for those cases. Back-patch, as the previous commit was.
* Clean up formatting.c's logic for matching constant strings.Tom Lane2020-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | seq_search(), which is used to match input substrings to constants such as month and day names, had a lot of bizarre and unnecessary behaviors. It was mostly possible to avert our eyes from that before, but we don't want to duplicate those behaviors in the upcoming patch to allow recognition of non-English month and day names. So it's time to clean this up. In particular: * seq_search scribbled on the input string, which is a pretty dangerous thing to do, especially in the badly underdocumented way it was done here. Fortunately the input string is a temporary copy, but that was being made three subroutine levels away, making it something easy to break accidentally. The behavior is externally visible nonetheless, in the form of odd case-folding in error reports about unrecognized month/day names. The scribbling is evidently being done to save a few calls to pg_tolower, but that's such a cheap function (at least for ASCII data) that it's pretty pointless to worry about. In HEAD I switched it to be pg_ascii_tolower to ensure it is cheap in all cases; but there are corner cases in Turkish where this'd change behavior, so leave it as pg_tolower in the back branches. * seq_search insisted on knowing the case form (all-upper, all-lower, or initcap) of the constant strings, so that it didn't have to case-fold them to perform case-insensitive comparisons. This likewise seems like excessive micro-optimization, given that pg_tolower is certainly very cheap for ASCII data. It seems unsafe to assume that we know the case form that will come out of pg_locale.c for localized month/day names, so it's better just to define the comparison rule as "downcase all strings before comparing". (The choice between downcasing and upcasing is arbitrary so far as English is concerned, but it might not be in other locales, so follow citext's lead here.) * seq_search also had a parameter that'd cause it to report a match after a maximum number of characters, even if the constant string were longer than that. This was not actually used because no caller passed a value small enough to cut off a comparison. Replicating that behavior for localized month/day names seems expensive as well as useless, so let's get rid of that too. * from_char_seq_search used the maximum-length parameter to truncate the input string in error reports about not finding a matching name. This leads to rather confusing reports in many cases. Worse, it is outright dangerous if the input string isn't all-ASCII, because we risk truncating the string in the middle of a multibyte character. That'd lead either to delivering an illegible error message to the client, or to encoding-conversion failures that obscure the actual data problem. Get rid of that in favor of truncating at whitespace if any (a suggestion due to Alvaro Herrera). In addition to fixing these things, I const-ified the input string pointers of DCH_from_char and its subroutines, to make sure there aren't any other scribbling-on-input problems. The risk of generating a badly-encoded error message seems like enough of a bug to justify back-patching, so patch all supported branches. Discussion: https://postgr.es/m/29432.1579731087@sss.pgh.pa.us
* Fix concurrent indexing operations with temporary tablesMichael Paquier2020-01-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Attempting to use CREATE INDEX, DROP INDEX or REINDEX with CONCURRENTLY on a temporary relation with ON COMMIT actions triggered unexpected errors because those operations use multiple transactions internally to complete their work. Here is for example one confusing error when using ON COMMIT DELETE ROWS: ERROR: index "foo" already contains data Issues related to temporary relations and concurrent indexing are fixed in this commit by enforcing the non-concurrent path to be taken for temporary relations even if using CONCURRENTLY, transparently to the user. Using a non-concurrent path does not matter in practice as locks cannot be taken on a temporary relation by a session different than the one owning the relation, and the non-concurrent operation is more effective. The problem exists with REINDEX since v12 with the introduction of CONCURRENTLY, and with CREATE/DROP INDEX since CONCURRENTLY exists for those commands. In all supported versions, this caused only confusing error messages to be generated. Note that with REINDEX, it was also possible to issue a REINDEX CONCURRENTLY for a temporary relation owned by a different session, leading to a server crash. The idea to enforce transparently the non-concurrent code path for temporary relations comes originally from Andres Freund. Reported-by: Manuel Rigger Author: Michael Paquier, Heikki Linnakangas Reviewed-by: Andres Freund, Álvaro Herrera, Heikki Linnakangas Discussion: https://postgr.es/m/CA+u7OA6gP7YAeCguyseusYcc=uR8+ypjCcgDDCTzjQ+k6S9ksQ@mail.gmail.com Backpatch-through: 9.4
* Fix edge case leading to agg transitions skipping ExecAggTransReparent() calls.Andres Freund2020-01-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code checking whether an aggregate transition value needs to be reparented into the current context has always only compared the transition return value with the previous transition value by datum, i.e. without regard for NULLness. This normally works, because when the transition function returns NULL (via fcinfo->isnull), it'll return a value that won't be the same as its input value. But there's no hard requirement that that's the case. And it turns out, it's possible to hit this case (see discussion or reproducers), leading to a non-null transition value not being reparented, followed by a crash caused by that. Instead of adding another comparison of NULLness, instead have ExecAggTransReparent() ensure that pergroup->transValue ends up as 0 when the new transition value is NULL. That avoids having to add an additional branch to the much more common cases of the transition function returning the old transition value (which is a pointer in this case), and when the new value is different, but not NULL. In branches since 69c3936a149, also deduplicate the reparenting code between the expression evaluation based transitions, and the path for ordered aggregates. Reported-By: Teodor Sigaev, Nikita Glukhov Author: Andres Freund Discussion: https://postgr.es/m/bd34e930-cfec-ea9b-3827-a8bc50891393@sigaev.ru Backpatch: 9.4-, this issue has existed since at least 7.4
* Fix crash in BRIN inclusion op functions, due to missing datum copy.Heikki Linnakangas2020-01-20
| | | | | | | | | | | | | | | | | | | | | | | | | The BRIN add_value() and union() functions need to make a longer-lived copy of the argument, if they want to store it in the BrinValues struct also passed as argument. The functions for the "inclusion operator classes" used with box, range and inet types didn't take into account that the union helper function might return its argument as is, without making a copy. Check for that case, and make a copy if necessary. That case arises at least with the range_union() function, when one of the arguments is an 'empty' range: CREATE TABLE brintest (n numrange); CREATE INDEX brinidx ON brintest USING brin (n); INSERT INTO brintest VALUES ('empty'); INSERT INTO brintest VALUES (numrange(0, 2^1000::numeric)); INSERT INTO brintest VALUES ('(-1, 0)'); SELECT brin_desummarize_range('brinidx', 0); SELECT brin_summarize_range('brinidx', 0); Backpatch down to 9.5, where BRIN was introduced. Discussion: https://www.postgresql.org/message-id/e6e1d6eb-0a67-36aa-e779-bcca59167c14%40iki.fi Reviewed-by: Emre Hasegeli, Tom Lane, Alvaro Herrera
* Add GUC checks for ssl_min_protocol_version and ssl_max_protocol_versionMichael Paquier2020-01-18
| | | | | | | | | | | | | | | | | Mixing incorrect bounds set in the SSL context leads to confusing error messages generated by OpenSSL which are hard to act on. New checks are added within the GUC machinery to improve the user experience as they apply to any SSL implementation, not only OpenSSL, and doing the checks beforehand avoids the creation of a SSL during a reload (or startup) which we know will never be used anyway. Backpatch down to 12, as those parameters have been introduced by e73e67c. Author: Michael Paquier Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/20200114035420.GE1515@paquier.xyz Backpatch-through: 12
* Repair more failures with SubPlans in multi-row VALUES lists.Tom Lane2020-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 9b63c13f0 turns out to have been fundamentally misguided: the parent node's subPlan list is by no means the only way in which a child SubPlan node can be hooked into the outer execution state. As shown in bug #16213 from Matt Jibson, we can also get short-lived tuple table slots added to the outer es_tupleTable list. At this point I have little faith that there aren't other possible connections as well; the long time it took to notice this problem shows that this isn't a heavily-exercised situation. Therefore, revert that fix, returning to the coding that passed a NULL parent plan pointer down to the transiently-built subexpressions. That gives us a pretty good guarantee that they won't hook into the outer executor state in any way. But then we need some other solution to make SubPlans work. Adopt the solution speculated about in the previous commit's log message: do expression initialization at plan startup for just those VALUES rows containing SubPlans, abandoning the goal of reclaiming memory intra-query for those rows. In practice it seems unlikely that queries containing a vast number of VALUES rows would be using SubPlans in them, so this should not give up much. (BTW, this test case also refutes my claim in connection with the prior commit that the issue only arises with use of LATERAL. That was just wrong: some variants of SubLink always produce SubPlans.) As with previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/16213-871ac3bc208ecf23@postgresql.org
* Set ReorderBufferTXN->final_lsn more eagerlyAlvaro Herrera2020-01-17
| | | | | | | | | | | | | | | | | | | ... specifically, set it incrementally as each individual change is spilled down to disk. This way, it is set correctly when the transaction disappears without trace, ie. without leaving an XACT_ABORT wal record. (This happens when the server crashes midway through a transaction.) Failing to have final_lsn prevents ReorderBufferRestoreCleanup() from working, since it needs the final_lsn in order to know the endpoint of its iteration through spilled files. Commit df9f682c7bf8 already tried to fix the problem, but it didn't set the final_lsn in all cases. Revert that, since it's no longer needed. Author: Vignesh C Reviewed-by: Amit Kapila, Dilip Kumar Discussion: https://postgr.es/m/CALDaNm2CLk+K9JDwjYST0sPbGg5AQdvhUt0jbKyX_HdAE0jk3A@mail.gmail.com
* Allocate freechunks bitmap as part of SlabContextTomas Vondra2020-01-17
| | | | | | | | | | | | | | | | | | | | The bitmap used by SlabCheck to cross-check free chunks in a block used to be allocated for each SlabCheck call, and was never freed. The memory leak could be fixed by simply adding a pfree call, but it's actually a bad idea to do any allocations in SlabCheck at all as it assumes the state of the memory management as a whole is sane. So instead we allocate the bitmap as part of SlabContext, which means we don't need to do any allocations in SlabCheck and the bitmap goes away together with the SlabContext. Backpatch to 10, where the Slab context was introduced. Author: Tomas Vondra Reported-by: Andres Freund Reviewed-by: Tom Lane Backpatch-through: 10 Discussion: https://www.postgresql.org/message-id/20200116044119.g45f7pmgz4jmodxj%40alap3.anarazel.de
* Fix buggy logic in isTempNamespaceInUse()Michael Paquier2020-01-15
| | | | | | | | | | | | | | | The logic introduced in this routine as of 246a6c8 would report an incorrect result when a session calls it to check if the temporary namespace owned by the session is in use or not. It is possible to optimize more the routine in this case to avoid a PGPROC lookup, but let's keep the logic simple. As this routine is used only by autovacuum for now, there were no live bugs, still let's be correct for any future code involving it. Author: Michael Paquier Reviewed-by: Julien Rouhaud Discussion: https://postgr.es/m/20200113093703.GA41902@paquier.xyz Backpatch-through: 11
* Make rewriter prevent auto-updates on views with conditional INSTEAD rules.Dean Rasheed2020-01-14
| | | | | | | | | | | | | | | | | | | | A view with conditional INSTEAD rules and no unconditional INSTEAD rules or INSTEAD OF triggers is not auto-updatable. Previously we relied on a check in the executor to catch this, but that's problematic since the planner may fail to properly handle such a query and thus return a particularly unhelpful error to the user, before reaching the executor check. Instead, trap this in the rewriter and report the correct error there. Doing so also allows us to include more useful error detail than the executor check can provide. This doesn't change the existing behaviour of updatable views; it merely ensures that useful error messages are reported when a view isn't updatable. Per report from Pengzhou Tang, though not adopting that suggested fix. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAG4reAQn+4xB6xHJqWdtE0ve_WqJkdyCV4P=trYr4Kn8_3_PEA@mail.gmail.com
* Fix base backup with database OIDs larger than INT32_MAXPeter Eisentraut2020-01-13
| | | | | | | | | | The use of pg_atoi() for parsing a string into an Oid fails for values larger than INT32_MAX, since OIDs are unsigned. Instead, use atooid(). While this has less error checking, the contents of the data directory are expected to be trustworthy, so we don't need to go out of our way to do full error checking. Discussion: https://www.postgresql.org/message-id/flat/dea47fc8-6c89-a2b1-07e3-754ff1ab094b%402ndquadrant.com
* Fix edge-case crashes and misestimation in range containment selectivity.Tom Lane2020-01-12
| | | | | | | | | | | | | | | | | | | | | | | | When estimating the selectivity of "range_var <@ range_constant" or "range_var @> range_constant", if the upper (or respectively lower) bound of the range_constant was above the last bin of the range_var's histogram, the code would access uninitialized memory and potentially crash (though it seems the probability of a crash is quite low). Handle the endpoint cases explicitly to fix that. While at it, be more paranoid about the possibility of getting NaN or other silly results from the range type's subdiff function. And improve some comments. Ordinarily we'd probably add a regression test case demonstrating the bug in unpatched code. But it's too hard to get it to crash reliably because of the uninitialized-memory dependence, so skip that. Per bug #16122 from Adam Scott. It's been broken from the beginning, apparently, so backpatch to all supported branches. Diagnosis by Michael Paquier, patch by Andrey Borodin and Tom Lane. Discussion: https://postgr.es/m/16122-eb35bc248c806c15@postgresql.org
* Remove incorrect assertion for INSERT in logical replication's publisherMichael Paquier2020-01-12
| | | | | | | | | | | | | | | | | On the publisher, it was assumed that an INSERT change cannot happen for a relation with no replica identity. However this is true only for a change that needs references to old rows, aka UPDATE or DELETE, so trying to use logical replication with a relation that has no replica identity led to an assertion failure in the publisher when issuing an INSERT. This commit removes the incorrect assertion, and adds more regression tests to provide coverage for relations without replica identity. Reported-by: Neha Sharma Author: Dilip Kumar, Michael Paquier Reviewed-by: Andres Freund Discussion: https://postgr.es/m/CANiYTQsL1Hb8_Km08qd32svrqNumXLJeoGo014O7VZymgOhZEA@mail.gmail.com Backpatch-through: 10
* Extensive code review for GSSAPI encryption mechanism.Tom Lane2020-01-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix assorted bugs in handling of non-blocking I/O when using GSSAPI encryption. The encryption layer could return the wrong status information to its caller, resulting in effectively dropping some data (or possibly in aborting a not-broken connection), or in a "livelock" situation where data remains to be sent but the upper layers think transmission is done and just go to sleep. There were multiple small thinkos contributing to that, as well as one big one (failure to think through what to do when a send fails after having already transmitted data). Note that these errors could cause failures whether the client application asked for non-blocking I/O or not, since both libpq and the backend always run things in non-block mode at this level. Also get rid of use of static variables for GSSAPI inside libpq; that's entirely not okay given that multiple connections could be open at once inside a single client process. Also adjust a bunch of random small discrepancies between the frontend and backend versions of the send/receive functions -- except for error handling, they should be identical, and now they are. Also extend the Kerberos TAP tests to exercise cases where nontrivial amounts of data need to be pushed through encryption. Before, those tests didn't provide any useful coverage at all for the cases of interest here. (They still might not, depending on timing, but at least there's a chance.) Per complaint from pmc@citylink and subsequent investigation. Back-patch to v12 where this code was introduced. Discussion: https://postgr.es/m/20200109181822.GA74698@gate.oper.dinoex.org
* Maintain valid md.c state when FileClose() fails.Noah Misch2020-01-10
| | | | | | | | | | | | | | | FileClose() failure ordinarily causes a PANIC. Suppose the user disables that PANIC via data_sync_retry=on. After mdclose() issued a FileClose() that failed, calls into md.c raised SIGSEGV. This fix adds repalloc() calls during mdclose(); update a comment about ignoring repalloc() cost. The rate of relation segment count change is a minor factor; more relevant to overall performance is the rate of mdclose() and subsequent re-opening of segments. Back-patch to v10, where commit 45e191e3aa62d47a8bc1a33f784286b2051f45cb introduced the bug. Reviewed by Kyotaro Horiguchi. Discussion: https://postgr.es/m/20191222091930.GA1280238@rfd.leadboat.com
* Reimplement nullification of walsender timestampAlvaro Herrera2020-01-08
| | | | | | | | | | | | | | | Make the value null only at pg_stat_activity-output time, as suggested by Tom Lane, instead of messing with the internal state. This should appease buildfarm members with force_parallel_mode=regress, which are running parallel queries on logical replication walsenders. The fact that walsenders can run parallel queries should perhaps be studied more carefully, but for the moment let's get rid of the red blots in buildfarm. Backpatch to pg10, like the previous commit. Discussion: https://postgr.es/m/30804.1578438763@sss.pgh.pa.us
* Fix handling of generated columns in ALTER TABLE.Tom Lane2020-01-08
| | | | | | | | | | | | | | | ALTER TABLE failed if a column referenced in a GENERATED expression had been added or changed in type earlier in the ALTER command. That's because the GENERATED expression needs to be evaluated against the table's updated tuples, but it was being evaluated against the original tuples. (Fortunately the executor has adequate cross-checks to notice the mismatch, so we just got an obscure error message and not anything more dangerous.) Per report from Andreas Joseph Krogh. Back-patch to v12 where GENERATED was added. Discussion: https://postgr.es/m/VisenaEmail.200.231b0a41523275d0.16ea7f800c7@tc7-visena
* Revert "Forbid DROP SCHEMA on temporary namespaces"Michael Paquier2020-01-08
| | | | | | | | This reverts commit a052f6c, following complains from Robert Haas and Tom Lane. Backpatch down to 9.4, like the previous commit. Discussion: https://postgr.es/m/CA+TgmobL4npEX5=E5h=5Jm_9mZun3MT39Kq2suJFVeamc9skSQ@mail.gmail.com Backpatch-through: 9.4
* pg_stat_activity: show NULL stmt start time for walsendersAlvaro Herrera2020-01-07
| | | | | | | | | | | | | Returning a non-NULL time is pointless, sinc a walsender is not a process that would be running normal transactions anyway, but the code was unintentionally exposing the process start time intermittently, which was not only bogus but it also confused monitoring systems looking for idle transactions. Fix by avoiding all updates in walsenders. Backpatch to 11, where walsenders started appearing in pg_stat_activity. Reported-by: Tomas Vondra Discussion: https://postgr.es/m/20191209234409.exe7osmyalwkt5j4@development
* Reduce the number of GetFlushRecPtr() calls done by walsenders.Tom Lane2020-01-06
| | | | | | | | | | | | | | | | | | | | | | Since the WAL flush position only moves forward, it's safe to cache its previous value within each walsender process, and update from shared memory only once we've caught up to the previously-seen value. When there are many active walsenders, this makes for a very significant reduction in the amount of contention on the XLogCtl->info_lck spinlock. This patch also adjusts the logic so that we update our idea of the flush position after processing a WAL record, rather than beforehand. This may cause us to realize we're not caught up when the preceding coding would've thought that we were, but that seems all to the good; it may avoid a useless sleep-and-wakeup cycle. Back-patch to v12. The contention problem exists in prior branches, but it's much less severe (due to inefficiencies elsewhere) so there seems no need to take any risk of back-patching further. Pierre Ducroquet, reviewed by Julien Rouhaud Discussion: https://postgr.es/m/2931018.Vxl9zapr77@pierred-pdoc
* Have logical replication subscriber fire column triggersPeter Eisentraut2020-01-06
| | | | | | | | | The logical replication apply worker did not fire per-column update triggers because the updatedCols bitmap in the RTE was not populated. This fixes that. Reviewed-by: Euler Taveira <euler@timbira.com.br> Discussion: https://www.postgresql.org/message-id/flat/21673e2d-597c-6afe-637e-e8b10425b240%402ndquadrant.com
* Fix cloning of row triggers to sub-partitionsAlvaro Herrera2020-01-02
| | | | | | | | | | | | | | When row triggers exist in partitioned partitions that are not either part of FKs or deferred unique constraints, they are not correctly cloned to their partitions. That's because they are marked "internal", and those are purposefully skipped when doing the clone triggers dance. Fix by relaxing the condition on which internal triggers are skipped. Amit Langote initially diagnosed the problem and proposed a fix, but I used a different approach. Reported-by: Petr Fedorov Discussion: https://postgr.es/m/6b3f0646-ba8c-b3a9-c62d-1c6651a1920f@phystech.edu
* Fix running out of file descriptors for spill files.Amit Kapila2020-01-02
| | | | | | | | | | | | | | | | | | | Currently while decoding changes, if the number of changes exceeds a certain threshold, we spill those to disk.  And this happens for each (sub)transaction.  Now, while reading all these files, we don't close them until we read all the files.  While reading these files, if the number of such files exceeds the maximum number of file descriptors, the operation errors out. Use PathNameOpenFile interface to open these files as that internally has the mechanism to release kernel FDs as needed to get us under the max_safe_fds limit. Reported-by: Amit Khandekar Author: Amit Khandekar Reviewed-by: Amit Kapila Backpatch-through: 9.4 Discussion: https://postgr.es/m/CAJ3gD9c-sECEn79zXw4yBnBdOttacoE-6gAyP0oy60nfs_sabQ@mail.gmail.com
* Forbid DROP SCHEMA on temporary namespacesMichael Paquier2019-12-27
| | | | | | | | | | | | | | | | | | This operation was possible for the owner of the schema or a superuser. Down to 9.4, doing this operation would cause inconsistencies in a session whose temporary schema was dropped, particularly if trying to create new temporary objects after the drop. A more annoying consequence is a crash of autovacuum on an assertion failure when logging information about an orphaned temp table dropped. Note that because of 246a6c8 (present in v11~), which has made the removal of orphaned temporary tables more aggressive, the failure could be triggered more easily, but it is possible to reproduce down to 9.4. Reported-by: Mahendra Singh, Prabhat Sahu Author: Michael Paquier Reviewed-by: Kyotaro Horiguchi, Mahendra Singh Discussion: https://postgr.es/m/CAKYtNAr9Zq=1-ww4etHo-VCC-k120YxZy5OS01VkaLPaDbv2tg@mail.gmail.com Backpatch-through: 9.4
* Fix some comments related to logical repslot advancingMichael Paquier2019-12-26
| | | | | | | | | confirmed_flush is part of a replication slot's information, but not confirmed_lsn. Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/20191226.175919.17237335658671970.horikyota.ntt@gmail.com Backpatch-through: 11
* Rotate instead of shifting hash join batch number.Thomas Munro2019-12-24
| | | | | | | | | | | | | | | | | | | Our algorithm for choosing batch numbers turned out not to work effectively for multi-billion key inner relations. We would use more hash bits than we have, and effectively concentrate all tuples into a smaller number of batches than we intended. While ideally we should switch to wider hashes, for now, change the algorithm to one that effectively gives up bits from the bucket number when we don't have enough bits. That means we'll finish up with longer bucket chains than would be ideal, but that's better than having batches that don't fit in work_mem and can't be divided. Batch-patch to all supported releases. Author: Thomas Munro Reviewed-by: Tom Lane, thanks also to Tomas Vondra, Alvaro Herrera, Andres Freund for testing and discussion Reported-by: James Coleman Discussion: https://postgr.es/m/16104-dc11ed911f1ab9df%40postgresql.org
* Disallow partition key expressions that return pseudo-types.Tom Lane2019-12-23
| | | | | | | | | | | | | | | | | This wasn't checked originally, but it should have been, because in general pseudo-types can't be stored to and retrieved from disk. Notably, partition bound values of type "record" would not be interpretable by another session. In v12 and HEAD, add another flag to CheckAttributeType's repertoire so that it can produce a specific error message for this case. That's infeasible in older branches without an ABI break, so fall back to a slightly-less-nicely-worded error message in v10 and v11. Problem noted by Amit Langote, though this patch is not his initial solution. Back-patch to v10 where partitioning was introduced. Discussion: https://postgr.es/m/CA+HiwqFUzjfj9HEsJtYWcr1SgQ_=iCAvQ=O2Sx6aQxoDu4OiHw@mail.gmail.com
* Prevent a rowtype from being included in itself via a range.Tom Lane2019-12-23
| | | | | | | | | | We probably should have thought of this case when ranges were added, but we didn't. (It's not the fault of commit eb51af71f, because ranges didn't exist then.) It's an old bug, so back-patch to all supported branches. Discussion: https://postgr.es/m/7782.1577051475@sss.pgh.pa.us
* Update neglected comment.Robert Haas2019-12-19
| | | | | | | Commit d986d4e87f61c68f52c68ebc274960dc664b7b4e renamed a variable but neglected to update the corresponding comment. Amit Langote
* Fix subscriber invalid memory access on DDL.Amit Kapila2019-12-18
| | | | | | | | | | | | | This patch allows building the local relmap cache for a subscribed relation after processing pending invalidation messages and potential relcache updates. Without this, the attributes in the local cache don't tally with the updated relcache entry leading to invalid memory access. Reported-by Jehan-Guillaume de Rorthais Author: Jehan-Guillaume de Rorthais and Vignesh C Reviewed-by: Amit Kapila Backpatch-through: 10 Discussion: https://postgr.es/m/20191025175929.7e90dbf5@firost
* Remove shadow variables linked to RedoRecPtr in xlog.cMichael Paquier2019-12-18
| | | | | | | | | | | | | | | This changes the routines in charge of recycling WAL segments past the last redo LSN to not use anymore "RedoRecPtr" as a local variable, which is also available in the context of the session as a static declaration, replacing it with "lastredoptr". This confusion has been introduced by d9fadbf, so backpatch down to v11 like the other commit. Thanks to Tom Lane, Robert Haas, Alvaro Herrera, Mark Dilger and Kyotaro Horiguchi for the input provided. Author: Ranier Vilela Discussion: https://postgr.es/m/MN2PR18MB2927F7B5F690065E1194B258E35D0@MN2PR18MB2927.namprd18.prod.outlook.com Backpatch-through: 11
* Fix error reporting for index expressions of prohibited types.Tom Lane2019-12-17
| | | | | | | | | | | | | | | | | | | If CheckAttributeType() threw an error about the datatype of an index expression column, it would report an empty column name, which is pretty unhelpful and certainly not the intended behavior. I (tgl) evidently broke this in commit cfc5008a5, by not noticing that the column's attname was used above where I'd placed the assignment of it. In HEAD and v12, this is trivially fixable by moving up the assignment of attname. Before v12 the code is a bit more messy; to avoid doing substantial refactoring, I took the lazy way out and just put in two copies of the assignment code. Report and patch by Amit Langote. Back-patch to all supported branches. Discussion: https://postgr.es/m/CA+HiwqFA+BGyBFimjiYXXMa2Hc3fcL0+OJOyzUNjhU4NCa_XXw@mail.gmail.com
* Change overly strict Assert in TransactionGroupUpdateXidStatus.Amit Kapila2019-12-17
| | | | | | | | | | | | | | | | | | This Assert thought that an overflowed transaction can never get registered for the group update.  But that is not true, because even when the number of children for a transaction got reduced, the overflow flag is not changed. And, for group update, we only care about the current number of children for a transaction that is being committed. Based on comments by Andres Freund, remove a redundant Assert in TransactionIdSetPageStatus as we already had a static Assert for the same condition a few lines earlier. Reported-by: Vignesh C Author: Dilip Kumar Reviewed-by: Amit Kapila Backpatch-through: 11 Discussion: https://postgr.es/m/CAFiTN-s5=uJw-Z6JC9gcqtBSjXsrHnU63PXBrA=pnBjqnkm5UA@mail.gmail.com
* Fix yet another crash in page split during GiST index creation.Heikki Linnakangas2019-12-16
| | | | | | | | | | | | | | | | Commit a7ee7c8513 fixed a bug in GiST page split during index creation, where we failed to re-find the position of a downlink after the page containing it was split. However, that fix was incomplete; the other call to gistinserttuples() in the same function needs to also clear 'downlinkoffnum'. Fixes bug #16134 reported by Alexander Lakhin, for real this time. The previous fix was enough to fix the crash with the reproducer script for bug #16162, but the original script for #16134 was still crashing. Backpatch to v12, like the previous incomplete fix. Discussion: https://www.postgresql.org/message-id/d869f537-abe4-d2ea-0510-38cd053f5152%40gmail.com
* Prevent overly-aggressive collapsing of joins to RTE_RESULT relations.Tom Lane2019-12-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The RTE_RESULT simplification logic added by commit 4be058fe9 had a flaw: it would collapse out a RTE_RESULT that is due to compute a PlaceHolderVar, and reassign the PHV to the parent join level, even if another input relation of the join contained a lateral reference to the PHV. That can't work because the PHV would be computed too late. In practice it led to failures of internal sanity checks later in planning (either assertion failures or errors such as "failed to construct the join relation"). To fix, add code to check for the presence of such PHVs in relevant portions of the query tree. Notably, this required refactoring range_table_walker so that a caller could ask to walk individual RTEs not the whole list. (It might be a good idea to refactor range_table_mutator in the same way, if only to keep those functions looking similar; but I didn't do so here as it wasn't necessary for the bug fix.) This exercise also taught me that find_dependent_phvs(), as it stood, could only safely be used on the entire Query, not on subtrees. Adjust its API to reflect that; which in passing allows it to have a fast path for the common case of no PHVs anywhere. Per report from Will Leinweber. Back-patch to v12 where the bug was introduced. Discussion: https://postgr.es/m/CALLb-4xJMd4GZt2YCecMC95H-PafuWNKcmps4HLRx2NHNBfB4g@mail.gmail.com
* Fix mdsyncfiletag(), take II.Thomas Munro2019-12-14
| | | | | | | | The previous commit failed to consider that FileGetRawDesc() might not return a valid fd, as discovered on the build farm. Switch to using the File interface only. Back-patch to 12, like the previous commit.
* Don't use _mdfd_getseg() in mdsyncfiletag().Thomas Munro2019-12-14
| | | | | | | | | | | | | | | | | | _mdfd_getseg() opens all segments up to the requested one. That causes problems for mdsyncfiletag(), if mdunlinkfork() has already unlinked other segment files. Open the file we want directly by name instead, if it's not already open. The consequence of this bug was a rare panic in the checkpointer, made more likely if you saturated the sync request queue so that the SYNC_FORGET_REQUEST messages for a given relation were more likely to be absorbed in separate cycles by the checkpointer. Back-patch to 12. Defect in commit 3eb77eba. Author: Thomas Munro Reported-by: Justin Pryzby Discussion: https://postgr.es/m/20191119115759.GI30362%40telsasoft.com
* Fix crash when a page was split during GiST index creation.Heikki Linnakangas2019-12-13
| | | | | | | | | | | | | | | | | | | | | The bug was similar to the one that was fixed in commit 22251686f0. When we split page X and insert the downlink for the new page, the parent page might also need to be split. When that happens, the downlink offset number we remembered for X is no longer valid. We correctly called gistFindCorrectParent() to re-find it, but gistFindCorrectParent() doesn't do anything if the LSN of the page hasn't changed, and we stopped updating LSNs during index build in commit 9155580fd5. The buggy codepath was taken if the page was split into three or more pages, and inserting the downlink caused the parent page to split. To fix, explicitly mark the downlink offset number as invalid, to force gistFindCorrectParent() to re-find it. Fixes bug #16134 reported by Alexander Lakhin, reported again as #16162 by Andreas Kunert. Thanks to Jeff Janes, Tom Lane and Tomas Vondra for debugging. Backpatch to v12, where we stopped WAL-logging during index build. Discussion: https://www.postgresql.org/message-id/16134-0423f729671dec64%40postgresql.org Discussion: https://www.postgresql.org/message-id/16162-45d21b7b6c1a3105%40postgresql.org