path: root/src/backend
Commit message (Author, Age)
...
* Fix error message. (Thomas Munro, 2020-07-23)
| | | | | | | Remove extra space. Back-patch to all releases, like commit 7897e3bb. Author: Lu, Chenyang <lucy.fnst@cn.fujitsu.com> Discussion: https://postgr.es/m/795d03c6129844d3803e7eea48f5af0d%40G08CNEXMBPEKD04.g08.fujitsu.local
* neqjoinsel must now pass through collation to eqjoinsel. (Tom Lane, 2020-07-21)
| | | | | | | | | | | | | Since commit 044c99bc5, eqjoinsel passes the passed-in collation to any operators it invokes. However, neqjoinsel failed to pass on whatever collation it got, so that if we invoked a collation-dependent operator via that code path, we'd get "could not determine which collation to use for string comparison" or the like. Per report from Justin Pryzby. Back-patch to v12, like the previous commit. Discussion: https://postgr.es/m/20200721191606.GL5748@telsasoft.com
* Assert that we don't insert nulls into attnotnull catalog columns. (Tom Lane, 2020-07-21)
| | | | | | | | | | | | | | | | | | | | | The executor checks for this error, and so does the bootstrap catalog loader, but we never checked for it in retail catalog manipulations. The folly of that has now been exposed, so let's add assertions checking it. Checking in CatalogTupleInsert[WithInfo] and CatalogTupleUpdate[WithInfo] should be enough to cover this. Back-patch to v10; the aforesaid functions didn't exist before that, and it didn't seem worth adapting the patch to the oldest branches. But given the risk of JIT crashes, I think we certainly need this as far back as v11. Pre-v13, we have to explicitly exclude pg_subscription.subslotname and pg_subscription_rel.srsublsn from the checks, since they are mismarked. (Even if we change our mind about applying BKI_FORCE_NULL in the branch tips, it doesn't seem wise to have assertions that would fire in existing databases.) Discussion: https://postgr.es/m/298837.1595196283@sss.pgh.pa.us
* Avoid direct C access to possibly-null pg_subscription_rel.srsublsn. (Tom Lane, 2020-07-21)
| | | | | | | | | | | | | | | | | | | | | | This coding technique is unsafe, since we'd be accessing off the end of the tuple if the field is null. SIGSEGV is pretty improbable, but perhaps not impossible. Also, returning garbage for the LSN doesn't seem like a great idea, even if callers aren't looking at it today. Also update docs to point out explicitly that pg_subscription.subslotname and pg_subscription_rel.srsublsn can be null. Perhaps we should mark these two fields BKI_FORCE_NULL, so that they'd be correctly labeled in databases that are initdb'd in the future. But we can't force that for existing databases, and on balance it's not too clear that having a mix of different catalog contents in the field would be wise. Apply to v10 (where this code came in) through v12. Already fixed in v13 and HEAD. Discussion: https://postgr.es/m/732838.1595278439@sss.pgh.pa.us
* Kluge slot_compile_deform() to ignore incorrect attnotnull markings. (Tom Lane, 2020-07-20)
| | | | | | | | | | | | | | | | | | | | | | Since we mustn't force an initdb in released branches, there is no simple way to correct the markings of pg_subscription.subslotname and pg_subscription_rel.srsublsn as attnotnull in existing pre-v13 installations. Fortunately, released branches don't rely on attnotnull being correct for much. The planner looks at it in relation_excluded_by_constraints, but it'd be difficult to get that to matter for a query on a system catalog. The only place where it's really problematic is in JIT's slot_compile_deform(), which can produce incorrect code that crashes if there are NULLs in an allegedly not-null column. Hence, hack up slot_compile_deform() to be specifically aware of these two incorrect markings and not trust them. This applies to v11 and v12; the JIT code didn't exist before that, and we've fixed the markings in v13. Discussion: https://postgr.es/m/229396.1595191345@sss.pgh.pa.us
* Fix construction of updated-columns bitmap in logical replication. (Tom Lane, 2020-07-20)
| | | | | | | | | | | | | | | | | | | Commit b9c130a1f failed to apply the publisher-to-subscriber column mapping while checking which columns were updated. Perhaps less significantly, it didn't exclude dropped columns either. This could result in an incorrect updated-columns bitmap and thus wrong decisions about whether to fire column-specific triggers on the subscriber while applying updates. In HEAD (since commit 9de77b545), it could also result in accesses off the end of the colstatus array, as detected by buildfarm member skink. Fix the logic, and adjust 003_constraints.pl so that the problem is exposed in unpatched code. In HEAD, also add some assertions to check that we don't access off the ends of these newly variable-sized arrays. Back-patch to v10, as b9c130a1f was. Discussion: https://postgr.es/m/CAH2-Wz=79hKQ4++c5A060RYbjTHgiYTHz=fw6mptCtgghH2gJA@mail.gmail.com
* Fix whitespace (Peter Eisentraut, 2020-07-17)
|
* Fix bitmap AND/OR scans on the inside of a nestloop partition-wise join. (Tom Lane, 2020-07-14)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | reparameterize_path_by_child() failed to reparameterize BitmapAnd and BitmapOr paths. This matters only if such a path is chosen as the inside of a nestloop partition-wise join, where we have to pass in parameters from the outside of the nestloop. If that did happen, we generated a bad plan that would likely lead to crashes at execution. This is not entirely reparameterize_path_by_child()'s fault though; it's the victim of an ancient decision (my ancient decision, I think) to not bother filling in param_info in BitmapAnd/Or path nodes. That caused the function to believe that such nodes and their children contain no parameter references and so need not be processed. In hindsight that decision looks pretty penny-wise and pound-foolish: while it saves a few cycles during path node setup, we do commonly need the information later. In particular, by reversing the decision and requiring valid param_info data in all nodes of a bitmap path tree, we can get rid of indxpath.c's get_bitmap_tree_required_outer() function, which computed the data on-demand. It's not unlikely that that nets out as a savings of cycles in many scenarios. A couple of other things in indxpath.c can be simplified as well. While here, get rid of some cases in reparameterize_path_by_child() that are visibly dead or useless, given that we only care about reparameterizing paths that can be on the inside of a parameterized nestloop. This case reminds one of the maxim that untested code probably does not work, so I'm unwilling to leave unreachable code in this function. (I did leave the T_Gather case in place even though it's not reached in the regression tests. It's not very clear to me when the planner might prefer to put Gather below rather than above a nestloop, but at least in principle the case might be interesting.) Per bug #16536, originally from Arne Roland but with a test case by Andrew Gierth. 
Back-patch to v11 where this code came in. Discussion: https://postgr.es/m/16536-2213ee0b3aad41fd@postgresql.org
* Fix timing issue with ALTER TABLE's validate constraint (David Rowley, 2020-07-14)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | An ALTER TABLE to validate a foreign key in which another subcommand already caused a pending table rewrite could fail due to ALTER TABLE attempting to validate the foreign key before the actual table rewrite takes place. This situation could result in an error such as: ERROR: could not read block 0 in file "base/nnnnn/nnnnn": read only 0 of 8192 bytes The failure here was due to the SPI call which validates the foreign key trying to access an index which is yet to be rebuilt. Similarly, we also incorrectly tried to validate CHECK constraints before the heap had been rewritten. The fix for both is to delay constraint validation until phase 3, after the table has been rewritten. For CHECK constraints this means a slight behavioral change. Previously ALTER TABLE VALIDATE CONSTRAINT on inheritance tables would be validated from the bottom up. This was different from the order of evaluation when a new CHECK constraint was added. The changes made here align the VALIDATE CONSTRAINT evaluation order for inheritance tables with that of ADD CONSTRAINT, which is generally top-down. Reported-by: Nazli Ugur Koyluoglu, using SQLancer Discussion: https://postgr.es/m/CAApHDvp%3DZXv8wiRyk_0rWr00skhGkt8vXDrHJYXRMft3TjkxCA%40mail.gmail.com Backpatch-through: 9.5 (all supported versions)
* Fix comments related to table AMs (Michael Paquier, 2020-07-14)
| | | | | | | | | | | Incorrect function names were referenced. As this fixes some portions of tableam.h, that is mentioned in the docs as something to look at when implementing a table AM, backpatch down to 12 where this has been introduced. Author: Hironobu Suzuki Discussion: https://postgr.es/m/8fe6d672-28dd-3f1d-7aed-ac2f6d599d3f@interdb.jp Backpatch-through: 12
* Cope with lateral references in the quals of a subquery RTE. (Tom Lane, 2020-07-13)
| | | | | | | | | | | | | | | | | | | | | | The qual pushdown logic assumed that all Vars in a restriction clause must be Vars referencing subquery outputs; but since we introduced LATERAL, it's possible for such a Var to be a lateral reference instead. This led to an assertion failure in debug builds. In a non-debug build, there might be no ill effects (if qual_is_pushdown_safe decided the qual was unsafe anyway), or we could get failures later due to construction of an invalid plan. I've not gone to much length to characterize the possible failures, but at least segfaults in the executor have been observed. Given that this has been busted since 9.3 and it took this long for anybody to notice, I judge that the case isn't worth going to great lengths to optimize. Hence, fix by just teaching qual_is_pushdown_safe that such quals are unsafe to push down, matching the previous behavior when it accidentally didn't fail. Per report from Tom Ellis. Back-patch to all supported branches. Discussion: https://postgr.es/m/20200713175124.GQ8220@cloudinit-builder
* Forbid numeric NaN in jsonpath (Alexander Korotkov, 2020-07-11)
| | | | | | | | | | | | | | The SQL standard doesn't define numeric Inf or NaN values. It appears even more ridiculous to support them in jsonpath, given that JSON doesn't support these values either. This commit forbids returning NaN from .double(), which was previously allowed. NaN can't be the result of inner-jsonpath computation over non-NaNs, so we cannot expect NaN in the jsonpath output. Reported-by: Tom Lane Discussion: https://postgr.es/m/203949.1591879542%40sss.pgh.pa.us Author: Alexander Korotkov Reviewed-by: Tom Lane Backpatch-through: 12
* Improve error reporting for jsonpath .double() method (Alexander Korotkov, 2020-07-11)
| | | | | | | | | | | When the jsonpath .double() method detects that a numeric or string can't be converted to double precision, it throws an error. This commit makes these errors explicitly state the reason for the failure. Discussion: https://postgr.es/m/CAPpHfdtqJtiSXkP7tOXez18NxhLUH_-75bL8%3DOce4Ki%2Bbv7V6Q%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Tom Lane Backpatch-through: 12
* Remove WARNING message from brin_desummarize_range (Alvaro Herrera, 2020-07-09)
| | | | | | | | | | This message was being emitted on the grounds that only crashed summarization could cause it, but in reality even an aborted vacuum could do it ... which makes it way too noisy, particularly since it shows up in regression tests and makes them die. Reported by Tom Lane. Discussion: https://postgr.es/m/489091.1593534251@sss.pgh.pa.us
* Fix pg_current_logfile() to not emit a carriage return on Windows. (Tom Lane, 2020-07-09)
| | | | | | | | | | | | | | | | | | | | Due to not having our signals straight about CRLF vs. LF line termination, the output of pg_current_logfile() included a trailing \r on Windows. To fix, force the file descriptor it uses into text mode. While here, move a couple of local variable declarations to make the function's logic clearer. In v12 and v13, also back-patch the test added by 1c4e88e2f so that this function has some test coverage. However, the 004_logrotate.pl test script doesn't exist before v12, and it didn't seem worth adding to older branches just for this. Per report from Thomas Kellerer. Back-patch to v10 where this function was added. Discussion: https://postgr.es/m/412ae8da-76bb-640f-039a-f3513499e53d@gmx.net
* Fix "ignoring return value" complaints from commit 96d1f423f9 (Joe Conway, 2020-07-04)
| | | | | | | | | | | | | | | The cfbot and some BF animals are complaining about the previous read_binary_file commit because of ignoring return value of ‘fread’. So let's make everyone happy by testing the return value even though not strictly needed. Reported by Justin Pryzby, and suggested patch by Tom Lane. Backpatched to v11 same as the previous commit. Reported-By: Justin Pryzby Reviewed-By: Tom Lane Discussion: https://postgr.es/m/flat/969b8d82-5bb2-5fa8-4eb1-f0e685c5d736%40joeconway.com Backpatch-through: 11
* Read until EOF vice stat-reported size in read_binary_file (Joe Conway, 2020-07-04)
| | | | | | | | | | | | | | | | | | read_binary_file(), used by SQL functions pg_read_file() and friends, uses stat to determine file length to read, when not passed an explicit length as an argument. This is problematic, for example, if the file being read is a virtual file with a stat-reported length of zero. Arrange to read until EOF, or StringInfo data string length limit, is reached instead. Original complaint and patch by me, with significant review, corrections, advice, and code optimizations by Tom Lane. Backpatched to v11. Prior to that only paths relative to the data and log dirs were allowed for files, so no "zero length" files were reachable anyway. Reviewed-By: Tom Lane Discussion: https://postgr.es/m/flat/969b8d82-5bb2-5fa8-4eb1-f0e685c5d736%40joeconway.com Backpatch-through: 11
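The read-until-EOF idea described above can be sketched roughly as follows. This is a minimal illustration in plain C, not the code from the commit: `read_until_eof` is a hypothetical helper, it uses malloc/realloc rather than StringInfo, and it does not model the StringInfo length limit. It also checks the result of fread, which is the point of the follow-up commit listed just above.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Read everything from fp until EOF into a malloc'd buffer.
 * Hypothetical helper for illustration; not PostgreSQL's read_binary_file().
 * Returns NULL on I/O or allocation failure; *len receives the byte count.
 */
char *
read_until_eof(FILE *fp, size_t *len)
{
    size_t  cap = 4096;
    size_t  used = 0;
    char   *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    for (;;)
    {
        /* Ask for as much as fits; a short count means EOF or an error. */
        size_t  want = cap - used;
        size_t  got = fread(buf + used, 1, want, fp);

        used += got;
        if (got < want)
            break;

        /* Buffer is full: double it and keep reading. */
        cap *= 2;
        char   *tmp = realloc(buf, cap);

        if (tmp == NULL)
        {
            free(buf);
            return NULL;
        }
        buf = tmp;
    }

    /* Distinguish a real I/O error from a normal end of file. */
    if (ferror(fp))
    {
        free(buf);
        return NULL;
    }

    *len = used;
    return buf;
}
```

Because the loop never consults a stat-reported size, a file that reports length zero but still yields data on read would be returned in full under this approach.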
* Clamp total-tuples estimates for foreign tables to ensure planner sanity. (Tom Lane, 2020-07-03)
| | | | | | | | | | | | | | | | | | | | | After running GetForeignRelSize for a foreign table, adjust rel->tuples to be at least as large as rel->rows. This prevents bizarre behavior in estimate_num_groups() and perhaps other places, especially in the scenario where rel->tuples is zero because pg_class.reltuples is (suggesting that ANALYZE has never been run for the table). As things stood, we'd end up estimating one group out of any GROUP BY on such a table, whereas the default group-count estimate is more likely to result in a sane plan. Also, clarify in the documentation that GetForeignRelSize has the option to override the rel->tuples value if it has a better idea of what to use than what is in pg_class.reltuples. Per report from Jeff Janes. Back-patch to all supported branches. Patch by me; thanks to Etsuro Fujita for review Discussion: https://postgr.es/m/CAMkU=1xNo9cnan+Npxgz0eK7394xmjmKg-QEm8wYG9P5-CcaqQ@mail.gmail.com
* Fix temporary tablespaces for shared filesets some more. (Tom Lane, 2020-07-03)
| | | | | | | | | | | | | | | | | | | | | | | | Commit ecd9e9f0b fixed the problem in the wrong place, causing unwanted side-effects on the behavior of GetNextTempTableSpace(). Instead, let's make SharedFileSetInit() responsible for subbing in the value of MyDatabaseTableSpace when the default tablespace is called for. The convention about what is in the tempTableSpaces[] array is evidently insufficiently documented, so try to improve that. It also looks like SharedFileSetInit() is doing the wrong thing in the case where temp_tablespaces is empty. It was hard-wiring use of the pg_default tablespace, but it seems like using MyDatabaseTableSpace is more consistent with what happens for other temp files. Back-patch the reversion of PrepareTempTablespaces()'s behavior to 9.5, as ecd9e9f0b was. The changes in SharedFileSetInit() go back to v11 where that was introduced. (Note there is net zero code change before v11 from these two patch sets, so nothing to release-note.) Magnus Hagander and Tom Lane Discussion: https://postgr.es/m/CABUevExg5YEsOvqMxrjoNvb3ApVyH+9jggWGKwTDFyFCVWczGQ@mail.gmail.com
* Fix temporary tablespaces for shared filesets (Magnus Hagander, 2020-07-03)
| | | | | | | | | | | | | | A likely copy/paste error in 98e8b480532 from back in 2004 would cause temp tablespace to be reset to InvalidOid if temp_tablespaces was set to the same value as the primary tablespace in the database. This would cause shared filesets (such as for parallel hash joins) to ignore them, putting the temporary files in the default tablespace instead of the configured one. The bug is in the old code, but it appears to have been exposed only once we had shared filesets. Reviewed-By: Daniel Gustafsson Discussion: https://postgr.es/m/CABUevExg5YEsOvqMxrjoNvb3ApVyH+9jggWGKwTDFyFCVWczGQ@mail.gmail.com Backpatch-through: 9.5
* Add parens to ConvertToXSegs macro (Alvaro Herrera, 2020-06-24)
| | | | | | | The current definition is dangerous. No bugs exist in our code at present, but backpatch to 11 nonetheless where it was introduced. Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
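For readers wondering why a missing pair of parentheses makes a macro "dangerous", here is a small sketch. It assumes the hazard is the usual one of an unparenthesized macro argument; the macro names below are hypothetical and are not the actual ConvertToXSegs definition.

```c
/* Hypothetical macros for illustration; not the real ConvertToXSegs. */
#define MB_TO_SEGS_UNSAFE(x, segsize) (x / ((segsize) / (1024 * 1024)))
#define MB_TO_SEGS_SAFE(x, segsize)   ((x) / ((segsize) / (1024 * 1024)))

/* Passing an expression exposes the precedence problem in the unsafe form. */
int
segs_unsafe(int a, int b, int segsize)
{
    /* Expands to (a + b / (segsize / 1MB)): only b gets divided. */
    return MB_TO_SEGS_UNSAFE(a + b, segsize);
}

int
segs_safe(int a, int b, int segsize)
{
    /* Expands to ((a + b) / (segsize / 1MB)): the whole sum gets divided. */
    return MB_TO_SEGS_SAFE(a + b, segsize);
}
```

With a = 64, b = 64 and a 16 MB segment size, the unsafe form yields 68 while the safe form yields 8. As long as every caller passes a plain variable the two forms agree, which is consistent with the commit's remark that no bugs exist at present.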
* Undo double-quoting of index names in non-text EXPLAIN output formats. (Tom Lane, 2020-06-22)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | explain_get_index_name() applied quote_identifier() to the index name. This is fine for text output, but the non-text output formats all have their own quoting conventions and would much rather start from the actual index name. For example in JSON you'd get something like "Index Name": "\"My Index\"", which is surely not desirable, especially when the same does not happen for table names. Hence, move the responsibility for applying quoting out to the callers, where it can go into already-existing special code paths for text format. This changes the API spec for users of explain_get_index_name_hook: before, they were supposed to apply quote_identifier() if necessary, now they should not. Research suggests that the only publicly available user of the hook is hypopg, and it actually forgot to apply quoting anyway, so it's fine. (In any case, there's no behavioral change for the output of a hook as seen in non-text EXPLAIN formats, so this won't break any case that programs should be relying on.) Digging in the commit logs, it appears that quoting was included in explain_get_index_name's duties when commit 604ffd280 invented it; and that was fine at the time because we only had text output format. This should have been rethought when non-text formats were invented, but it wasn't. This is a fairly clear bug for users of non-text EXPLAIN formats, so back-patch to all supported branches. Per bug #16502 from Maciek Sakrejda. Patch by me (based on investigation by Euler Taveira); thanks to Julien Rouhaud for review. Discussion: https://postgr.es/m/16502-57bd1c9f913ed1d1@postgresql.org
* Fix masking of SP-GiST pages during xlog consistency check (Alexander Korotkov, 2020-06-20)
| | | | | | | | | | spg_mask() didn't take into account that pd_lower equal to SizeOfPageHeaderData is still a valid value. This commit fixes that. Backpatch to 11, where the spg_mask() pd_lower check was introduced. Reported-by: Michael Paquier Discussion: https://postgr.es/m/20200615131405.GM52676%40paquier.xyz Backpatch-through: 11
* Fix deadlock danger when atomic ops are done under spinlock. (Andres Freund, 2020-06-18)
| | | | | | | | | | | | | | | | | This was a danger only for --disable-spinlocks in combination with atomic operations unsupported by the current platform. While atomics.c was careful to signal that a separate semaphore ought to be used when spinlock emulation is active, spin.c didn't actually implement that mechanism. That's my (Andres') fault, it seems to have gotten lost during the development of the atomic operations support. Fix that issue and add test for nesting atomic operations inside a spinlock. Author: Andres Freund Discussion: https://postgr.es/m/20200605023302.g6v3ydozy5txifji@alap3.anarazel.de Backpatch: 9.5-
* Fix oldest xmin and LSN computation across repslots after advancing (Michael Paquier, 2020-06-18)
| | | | | | | | | | | | | | | | | Advancing a replication slot did not recompute the oldest xmin and LSN values across replication slots, preventing resource removal like segments not recycled at checkpoint time. The original commit that introduced the slot advancing in 9c7d06d never did the update of those oldest values, and b0afdca removed this code. This commit adds a TAP test to check segment recycling with advancing for physical slots, enforcing an extra segment switch before advancing to check if the segment gets correctly recycled after a checkpoint. Reported-by: Andres Freund Reviewed-by: Alexey Kondratov, Kyotaro Horiguchi Discussion: https://postgr.es/m/20200609171904.kpltxxvjzislidks@alap3.anarazel.de Backpatch-through: 11
* spinlock emulation: Fix bug when more than INT_MAX spinlocks are initialized. (Andres Freund, 2020-06-17)
| | | | | | | | | | Once the counter goes negative we ended up with spinlocks that errored out on first use (due to check in tas_sema). Author: Andres Freund Reviewed-By: Robert Haas Discussion: https://postgr.es/m/20200606023103.avzrctgv7476xj7i@alap3.anarazel.de Backpatch: 9.5-
* Fix buffile.c error handling. (Thomas Munro, 2020-06-16)
| | | | | | | | | | | | | | | | | | Convert buffile.c error handling to use ereport. This fixes cases where I/O errors were indistinguishable from EOF or not reported. Also remove "%m" from error messages where errno would be bogus. While we're modifying those strings, add block numbers and short read byte counts where appropriate. Back-patch to all supported releases. Reported-by: Amit Khandekar <amitdkhan.pg@gmail.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Ibrar Ahmed <ibrar.ahmad@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CA%2BhUKGJE04G%3D8TLK0DLypT_27D9dR8F1RQgNp0jK6qR0tZGWOw%40mail.gmail.com
* Fix behavior of float aggregates for single Inf or NaN inputs. (Tom Lane, 2020-06-13)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When there is just one non-null input value, and it is infinity or NaN, aggregates such as stddev_pop and covar_pop should produce a NaN result, because the calculation is not well-defined. They used to do so, but since we adopted Youngs-Cramer aggregation in commit e954a727f, they produced zero instead. That's an oversight, so fix it. Add tests exercising these edge cases. Affected aggregates are var_pop(double precision) stddev_pop(double precision) var_pop(real) stddev_pop(real) regr_sxx(double precision,double precision) regr_syy(double precision,double precision) regr_sxy(double precision,double precision) regr_r2(double precision,double precision) regr_slope(double precision,double precision) regr_intercept(double precision,double precision) covar_pop(double precision,double precision) corr(double precision,double precision) Back-patch to v12 where the behavior change was accidentally introduced. Report and patch by me; thanks to Dean Rasheed for review. Discussion: https://postgr.es/m/353062.1591898766@sss.pgh.pa.us
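To see why a single Inf or NaN input can silently produce zero under Youngs-Cramer accumulation, consider the sketch below. It is an assumption-laden illustration, not PostgreSQL's transition function: `var_pop_sketch` is a hypothetical name, and the special case on the first input is only one plausible way to restore the NaN result the commit describes.

```c
#include <math.h>
#include <stddef.h>

/*
 * Population variance via Youngs-Cramer running sums (N, Sx, Sxx).
 * Illustrative sketch only; not PostgreSQL's float8 accumulator.
 */
double
var_pop_sketch(const double *vals, size_t n)
{
    double  N = 0.0;
    double  Sx = 0.0;
    double  Sxx = 0.0;

    for (size_t i = 0; i < n; i++)
    {
        double  x = vals[i];

        N += 1.0;
        Sx += x;
        if (N > 1.0)
        {
            /* Standard Youngs-Cramer update of the sum of squared deviations. */
            double  tmp = x * N - Sx;

            Sxx += tmp * tmp / (N * (N - 1.0));
        }
        else if (isnan(x) || isinf(x))
        {
            /*
             * First input is Inf or NaN.  Without this branch Sxx stays 0
             * and the result would be 0 instead of NaN.
             */
            Sxx = NAN;
        }
    }

    if (N == 0.0)
        return NAN;             /* SQL would return NULL; NaN stands in here */
    return Sxx / N;
}
```

For the inputs {1, 2, 3, 4} the running Sxx ends at 5, giving a variance of 1.25, while a lone INFINITY or NAN input gives NaN rather than 0.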
* Fix mishandling of NaN counts in numeric_[avg_]combine. (Tom Lane, 2020-06-11)
| | | | | | | | | | | | | | | | | | | | | | When merging two NumericAggStates, the code missed adding the new state's NaNcount unless its N was also nonzero; since those counts are independent, this is wrong. This would only have visible effect if some partial aggregate scans found only NaNs while earlier ones found only non-NaNs; then we could end up falsely deciding that there were no NaNs and fail to return a NaN final result as expected. That's pretty improbable, so it's no surprise this hasn't been reported from the field. Still, it's a bug. I didn't try to produce a regression test that would show the bug, but I did notice that these functions weren't being reached at all in our regression tests, so I improved the tests to at least exercise them. With these additions, I see pretty complete code coverage on the aggregation-related functions in numeric.c. Back-patch to 9.6 where this code was introduced. (I only added the improved test case as far back as v10, though, since the relevant part of aggregates.sql isn't there at all in 9.6.)
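The combine-step bug described above comes down to treating two independent counters as if one implied the other. A minimal sketch, using a hypothetical `AggStateSketch` struct rather than the real NumericAggState, and assuming (as the commit text implies) that N counts only non-NaN inputs:

```c
#include <stdint.h>

/* Hypothetical stand-in for an aggregate transition state. */
typedef struct
{
    int64_t N;          /* number of non-NaN inputs seen */
    int64_t NaNcount;   /* number of NaN inputs seen */
    double  sumX;       /* running sum of the non-NaN inputs */
} AggStateSketch;

/* Merge src into dst, as a parallel-aggregate combine function would. */
void
combine_states(AggStateSketch *dst, const AggStateSketch *src)
{
    /* NaN inputs are counted independently of N, so always carry them over. */
    dst->NaNcount += src->NaNcount;

    /* Only the running sums may be skipped when src saw no non-NaN rows. */
    if (src->N > 0)
    {
        dst->N += src->N;
        dst->sumX += src->sumX;
    }
}
```

The buggy shape would place the NaNcount addition inside the `src->N > 0` branch, so a partial state that saw only NaNs (N = 0, NaNcount > 0) would contribute nothing.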
* Avoid update conflict out serialization anomalies. (Peter Geoghegan, 2020-06-11)
| | | | | | | | | | | | | | | | | | | | | | | | | | SSI's HeapCheckForSerializableConflictOut() test failed to correctly handle conditions involving a concurrently inserted tuple which is later concurrently updated by a separate transaction. A SELECT statement that called HeapCheckForSerializableConflictOut() could end up using the same XID (updater's XID) for both the original tuple, and the successor tuple, missing the XID of the xact that created the original tuple entirely. This only happened when neither tuple from the chain was visible to the transaction's MVCC snapshot. The observable symptoms of this bug were subtle. A pair of transactions could commit, with the later transaction failing to observe the effects of the earlier transaction (because of the confusion created by the update to the non-visible row). This bug dates all the way back to commit dafaa3ef, which added SSI. To fix, make sure that we check the xmin of concurrently inserted tuples that happen to also have been updated concurrently. Author: Peter Geoghegan Reported-By: Kyle Kingsbury Reviewed-By: Thomas Munro Discussion: https://postgr.es/m/db7b729d-0226-d162-a126-8a8ab2dc4443@jepsen.io Backpatch: All supported versions
* Update description of parameter password_encryption (Peter Eisentraut, 2020-06-10)
| | | | | | | The previous description string still described the pre-PostgreSQL 10 (pre eb61136dc75a76caef8460fa939244d8593100f2) behavior of selecting between encrypted and unencrypted, but it is now choosing between encryption algorithms.
* Fix locking bugs that could corrupt pg_control. (Thomas Munro, 2020-06-08)
| | | | | | | | | | | | | | | | | | The redo routines for XLOG_CHECKPOINT_{ONLINE,SHUTDOWN} must acquire ControlFileLock before modifying ControlFile->checkPointCopy, or the checkpointer could write out a control file with a bad checksum. Likewise, XLogReportParameters() must acquire ControlFileLock before modifying ControlFile and calling UpdateControlFile(). Back-patch to all supported releases. Author: Nathan Bossart <bossartn@amazon.com> Author: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/70BF24D6-DC51-443F-B55A-95735803842A%40amazon.com
* Use query collation, not column's collation, while examining statistics. (Tom Lane, 2020-06-05)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 5e0928005 changed the planner so that, instead of blindly using DEFAULT_COLLATION_OID when invoking operators for selectivity estimation, it would use the collation of the column whose statistics we're considering. This was recognized as still being not quite the right thing, but it seemed like a good incremental improvement. However, shortly thereafter we introduced nondeterministic collations, and that creates cases where operators can fail if they're passed the wrong collation. We don't want planning to fail in cases where the query itself would work, so this means that we *must* use the query's collation when invoking operators for estimation purposes. The only real problem this creates is in ineq_histogram_selectivity, where the binary search might produce a garbage answer if we perform comparisons using a different collation than the column's histogram is ordered with. However, when the query's collation is significantly different from the column's default collation, the estimate we previously generated would be pretty irrelevant anyway; so it's not clear that this will result in noticeably worse estimates in practice. (A follow-on patch will improve this situation in HEAD, but it seems too invasive for back-patch.) The patch requires changing the signatures of mcv_selectivity and allied functions, which are exported and very possibly are used by extensions. In HEAD, I just did that, but an API/ABI break of this sort isn't acceptable in stable branches. Therefore, in v12 the patch introduces "mcv_selectivity_ext" and so on, with signatures matching HEAD, and makes the old functions into wrappers that assume DEFAULT_COLLATION_OID should be used. That does not match the prior behavior, but it should avoid risk of failure in most cases. (In practice, I think most extension datatypes aren't collation-aware, so the change probably doesn't matter to them.) Per report from James Lucas. 
Back-patch to v12 where the problem was introduced. Discussion: https://postgr.es/m/CAAFmbbOvfi=wMM=3qRsPunBSLb8BFREno2oOzSBS=mzfLPKABw@mail.gmail.com
* Preserve pg_index.indisreplident across REINDEX CONCURRENTLY (Michael Paquier, 2020-06-05)
| | | | | | | | | | | If the flag value is lost, logical decoding would work the same way as REPLICA IDENTITY NOTHING, meaning that old tuple values would no longer be included in the changes produced by logical decoding. Author: Michael Paquier Reviewed-by: Euler Taveira Discussion: https://postgr.es/m/20200603065340.GK89559@paquier.xyz Backpatch-through: 12
* Reject "23:59:60.nnn" in datetime input. (Tom Lane, 2020-06-04)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It's intentional that we don't allow values greater than 24 hours, while we do allow "24:00:00" as well as "23:59:60" as inputs. However, the range check was miscoded in such a way that it would accept "23:59:60.nnn" with a nonzero fraction. For time or timetz, the stored result would then be greater than "24:00:00" which would fail dump/reload, not to mention possibly confusing other operations. Fix by explicitly calculating the result and making sure it does not exceed 24 hours. (This calculation is redundant with what will happen later in tm2time or tm2timetz. Maybe someday somebody will find that annoying enough to justify refactoring to avoid the duplication; but that seems too invasive for a back-patched bug fix, and the cost is probably unmeasurable anyway.) Note that this change also rejects such input as the time portion of a timestamp(tz) value. Back-patch to v10. The bug is far older, but to change this pre-v10 we'd need to ensure that the logic behaves sanely with float timestamps, which is possibly nontrivial due to roundoff considerations. Doesn't really seem worth troubling with. Per report from Christoph Berg. Discussion: https://postgr.es/m/20200520125807.GB296739@msg.df7cb.de
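The range check described above (accept "24:00:00" and "23:59:60", reject anything that totals more than 24 hours) can be sketched by explicitly computing the result. This is an illustration under stated assumptions, not the patched PostgreSQL code: `time_fields_ok` and the constants are hypothetical names, and fractional seconds are assumed to arrive as integer microseconds.

```c
#include <stdbool.h>
#include <stdint.h>

#define USECS_PER_SEC   INT64_C(1000000)
#define USECS_PER_DAY   (INT64_C(86400) * USECS_PER_SEC)

/*
 * Return true if the fields describe a time of day no later than 24:00:00.
 * Hypothetical helper for illustration only.
 */
bool
time_fields_ok(int hour, int min, int sec, int32_t usec)
{
    /* Per-field limits: a leap second (sec == 60) is tolerated here... */
    if (hour < 0 || min < 0 || min > 59 || sec < 0 || sec > 60 ||
        usec < 0 || usec >= USECS_PER_SEC)
        return false;

    /* ...but the total must still not exceed one full day. */
    int64_t total = (((int64_t) hour * 60 + min) * 60 + sec) * USECS_PER_SEC
        + usec;

    return total <= USECS_PER_DAY;
}
```

Checking the computed total instead of each field in isolation is what closes the gap: 23:59:60 with a zero fraction totals exactly 24 hours and passes, while any nonzero fraction pushes the total past the limit.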
* Fix instance of elog() called while holding a spinlock (Michael Paquier, 2020-06-04)
| | | | | | | | This broke the project rule to not call any complex code while a spinlock is held. Issue introduced by b89e151. Discussion: https://postgr.es/m/20200602.161518.1399689010416646074.horikyota.ntt@gmail.com Backpatch-through: 9.5
* Don't call palloc() while holding a spinlock, either. (Tom Lane, 2020-06-03)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix some more violations of the "only straight-line code inside a spinlock" rule. These are hazardous not only because they risk holding the lock for an excessively long time, but because it's possible for palloc to throw elog(ERROR), leaving a stuck spinlock behind. copy_replication_slot() had two separate places that did pallocs while holding a spinlock. We can make the code simpler and safer by copying the whole ReplicationSlot struct into a local variable while holding the spinlock, and then referencing that copy. (While that's arguably more cycles than we really need to spend holding the lock, the struct isn't all that big, and this way seems far more maintainable than copying fields piecemeal. Anyway this is surely much cheaper than a palloc.) That bug goes back to v12. InvalidateObsoleteReplicationSlots() not only did a palloc while holding a spinlock, but for extra sloppiness then leaked the memory --- probably for the lifetime of the checkpointer process, though I didn't try to verify that. Fortunately that silliness is new in HEAD. pg_get_replication_slots() had a cosmetic violation of the rule, in that it only assumed it's safe to call namecpy() while holding a spinlock. Still, that's a hazard waiting to bite somebody, and there were some other cosmetic coding-rule violations in the same function, so clean it up. I back-patched this as far as v10; the code exists before that but it looks different, and this didn't seem important enough to adapt the patch further back. Discussion: https://postgr.es/m/20200602.161518.1399689010416646074.horikyota.ntt@gmail.com
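The "copy the struct under the lock, then work on the copy" pattern described above can be sketched in portable C11. Everything here is an assumption made for illustration: the struct and function names are hypothetical, a C11 atomic_flag stands in for a PostgreSQL spinlock, and malloc stands in for palloc.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared data, standing in for a replication slot's fields. */
typedef struct
{
    char    name[64];
    int     active_pid;
} SlotDataSketch;

/* The lock lives beside the data so the data can be copied by assignment. */
typedef struct
{
    atomic_flag     lock;
    SlotDataSketch  data;
} SharedSlotSketch;

static void
spin_acquire(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;                       /* busy-wait until the flag was clear */
}

static void
spin_release(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Return a malloc'd copy of the slot name, or NULL on allocation failure. */
char *
copy_slot_name(SharedSlotSketch *slot)
{
    SlotDataSketch local;

    /* Straight-line code only while the lock is held: one struct copy. */
    spin_acquire(&slot->lock);
    local = slot->data;
    spin_release(&slot->lock);

    /* Allocation, which may fail, happens after the lock is released. */
    size_t  n = strlen(local.name) + 1;
    char   *out = malloc(n);

    if (out != NULL)
        memcpy(out, local.name, n);
    return out;
}
```

The struct copy costs a few more cycles under the lock than copying one field, but nothing between acquire and release can fail, so the lock cannot be left stuck.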
* Fix use-after-release mistake in currtid() and currtid2() for viewsMichael Paquier2020-06-01
| | | | | | | | | | This issue has been present since the introduction of this code as of a3519a2 from 2002, and has been found by buildfarm member prion that uses RELCACHE_FORCE_RELEASE via the tests introduced recently in e786be5. Discussion: https://postgr.es/m/20200601022055.GB4121@paquier.xyz Backpatch-through: 9.5
* Fix crashes with currtid() and currtid2()Michael Paquier2020-06-01
| | | | | | | | | | | | | | | | | | | | | | | A relation that has no storage initializes rd_tableam to NULL, which caused those two functions to crash because of a pointer dereference. Note that in 11 and older versions, this has always failed with a confusing error "could not open file". These two functions are used by the Postgres ODBC driver, which requires them only when connecting to a backend strictly older than 8.1. When connected to 8.2 or a newer version, the driver uses a RETURNING clause instead whose support has been added in 8.2, so it should be possible to just remove both functions in the future. This is left as an issue to address later. While on it, add more regression tests for those functions as we never really had coverage for them, and for aggregates of TIDs. Reported-by: Jaime Casanova, via sqlsmith Author: Michael Paquier Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/CAJGNTeO93u-5APMga6WH41eTZ3Uee9f3s8dCpA-GSSqNs1b=Ug@mail.gmail.com Backpatch-through: 12
* llvmjit: Fix building against LLVM 11 by removing unnecessary include.Andres Freund2020-05-28
| | | | | | | | | LLVM has removed this header, in the branch that will become llvm 11. But as it turns out we didn't actually need it, so just remove it. Author: Jesse Zhang <sbjesse@gmail.com> Discussion: https://postgr.es/m/CAGf+fX7bvtP0YXMu7pOsu_NwhxW6dArTkxb=jt7M2-UJkyJ_3g@mail.gmail.com Backpatch: 11, where JIT support using llvm was introduced.
* Add CHECK_FOR_INTERRUPTS() to the repeat() functionJoe Conway2020-05-28
| | | | | | | | | | | The repeat() function loops for potentially a long time without ever checking for interrupts. This prevents, for example, a query cancel from interrupting until the work is all done. Fix by inserting a CHECK_FOR_INTERRUPTS() into the loop. Backpatch to all supported versions. Discussion: https://www.postgresql.org/message-id/flat/8692553c-7fe8-17d9-cbc1-7cddb758f4c6%40joeconway.com
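As a rough illustration of checking for a pending cancel inside a long copy loop, the sketch below polls a flag on every iteration. The names `demo_cancel_pending` and `demo_repeat()` are hypothetical, and the flag check is only a stand-in for CHECK_FOR_INTERRUPTS(); the backend's real macro and error handling work differently.

```c
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag, assumed to be set asynchronously (e.g. by a signal handler). */
static volatile sig_atomic_t demo_cancel_pending = 0;

/*
 * Returns a malloc'd string holding `count` copies of `s`.
 * Returns NULL if the request overflows, allocation fails, or a cancel
 * was requested while the loop was running.
 */
char *
demo_repeat(const char *s, int count)
{
    size_t slen = strlen(s);
    size_t n = (count > 0) ? (size_t) count : 0;
    char  *result;
    char  *cp;

    /* Guard against size_t overflow in slen * n + 1. */
    if (slen != 0 && n > (SIZE_MAX - 1) / slen)
        return NULL;

    result = malloc(slen * n + 1);
    if (result == NULL)
        return NULL;

    cp = result;
    for (size_t i = 0; i < n; i++)
    {
        /* Poll once per iteration so a cancel does not wait for the whole loop. */
        if (demo_cancel_pending)
        {
            free(result);
            return NULL;
        }
        memcpy(cp, s, slen);
        cp += slen;
    }
    *cp = '\0';
    return result;
}
```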
* Add missing error code to "cannot attach index ..." error.Heikki Linnakangas2020-05-28
| | | | | | | | ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE was used in an ereport with the same message but different errdetail a few lines earlier, so use that here as well. Backpatch-through: 11
* Add lcov exclusion markers to jsonpath scannerPeter Eisentraut2020-05-26
| | | | | This was done for all scanners in 421167362242ce1fb46d6d720798787e7cd65aad but not added to the new one.
* gss: add missing references to hostgssenc and hostnogssencBruce Momjian2020-05-25
| | | | | | | | | | | | | These were missed when these were added to pg_hba.conf in PG 12; updates docs and pg_hba.conf.sample. Reported-by: Arthur Nascimento Bug: 16380 Discussion: https://postgr.es/m/20200421182736.GG19613@momjian.us Backpatch-through: 12
* Fix two typos in a commentAlvaro Herrera2020-05-22
| | | | They were introduced in 898e5e3290a7; backpatch to 12.
* Fix comment in slot.c.Amit Kapila2020-05-18
| | | | | | | | Reported-by: Sawada Masahiko Author: Sawada Masahiko Reviewed-by: Amit Kapila Backpatch-through: 9.5 Discussion: https://postgr.es/m/CA+fd4k4Ws7M7YQ8PqSym5WB1y75dZeBTd1sZJUQdfe0KJQ-iSA@mail.gmail.com
* Fix assertion with relation using REPLICA IDENTITY FULL in subscriberMichael Paquier2020-05-16
| | | | | | | | | | | | | | | | In a logical replication subscriber, a table using REPLICA IDENTITY FULL which has a primary key would try to use the primary key's index to scan for a tuple, but an assertion treated only an index associated with REPLICA IDENTITY USING INDEX as correct. This commit corrects the assertion so that the use of a primary key index is also a valid case. Reported-by: Dilip Kumar Analyzed-by: Dilip Kumar Author: Euler Taveira Reviewed-by: Michael Paquier, Masahiko Sawada Discussion: https://postgr.es/m/CAFiTN-u64S5bUiPL1q5kwpHNd0hRnf1OE-bzxNiOs5zo84i51w@mail.gmail.com Backpatch-through: 10
* Fix bogus initialization of replication origin shared memory state.Tom Lane2020-05-15
| | | | | | | | | | | | | | | | | The previous coding zeroed out offsetof(ReplicationStateCtl, states) more bytes than it was entitled to, as a consequence of starting the zeroing from the wrong pointer (or, if you prefer, using the wrong calculation of how much to zero). It's unsurprising that this has not caused any reported problems, since it can be expected that the newly-allocated block is at the end of what we've used in shared memory, and we always make the shmem block substantially bigger than minimally necessary. Nonetheless, this is wrong and it could bite us someday; plus it's a dangerous model for somebody to copy. This dates back to the introduction of this code (commit 5aa235042), so back-patch to all supported branches.
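The class of mistake described above can be shown with a small sketch: a control struct with a header followed by a flexible array, where the zeroing length is computed for the whole block. `DemoCtl`, `DemoState`, `demo_ctl_size()` and `demo_ctl_init()` are hypothetical names, and the layout is an assumption for illustration rather than the actual ReplicationStateCtl definition.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-entry state. */
typedef struct DemoState
{
    int  id;
    long lsn;
} DemoState;

/* Hypothetical control block: a header followed by a flexible array. */
typedef struct DemoCtl
{
    int       tranche_id;
    DemoState states[];
} DemoCtl;

/* Total bytes needed for a control block with n states. */
size_t
demo_ctl_size(size_t n)
{
    return offsetof(DemoCtl, states) + n * sizeof(DemoState);
}

void
demo_ctl_init(DemoCtl *ctl, size_t n)
{
    /*
     * Correct: the length covers the whole block, so zeroing must start at
     * the base of the block.
     *
     * The buggy shape would be memset(ctl->states, 0, demo_ctl_size(n)),
     * which starts offsetof(DemoCtl, states) bytes into the block and so
     * writes that many bytes past its end.
     */
    memset(ctl, 0, demo_ctl_size(n));
}
```

An equivalent fix keeps the `states` pointer and shortens the length to `n * sizeof(DemoState)`, at the cost of leaving the header uninitialized.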
* Avoid killing btree items that are already deadAlvaro Herrera2020-05-15
| | | | | | | | | | | | | | | | | | | | | | _bt_killitems marks btree items dead when a scan leaves the page where they live, but it does so with only share lock (to improve concurrency). This was historically okay, since killing a dead item has no consequences. However, with the advent of data checksums and wal_log_hints, this action incurs a WAL full-page-image record of the page. Multiple concurrent processes would write the same page several times, leading to WAL bloat. The probability of this happening can be reduced by only killing items if they're not already dead, so change the code to do that. The problem could be eliminated completely by having _bt_killitems upgrade to exclusive lock upon seeing a killable item, but that would reduce concurrency so it's considered a cure worse than the disease. Backpatch all the way back to 9.5, since wal_log_hints was introduced in 9.4. Author: Masahiko Sawada <masahiko.sawada@2ndquadrant.com> Discussion: https://postgr.es/m/CA+fd4k6PeRj2CkzapWNrERkja5G0-6D-YQiKfbukJV+qZGFZ_Q@mail.gmail.com
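The idea of only killing items that are not already dead can be sketched as below: the function reports whether anything actually changed, so a caller could skip dirtying the page when nothing did. `DemoItem` and `demo_kill_items()` are hypothetical names; real btree pages, line pointers and hint-bit logging are considerably more involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an index item's dead flag. */
typedef struct DemoItem
{
    bool dead;
} DemoItem;

/*
 * Marks the items listed in kill[] as dead.  Returns true only if at least
 * one item changed state, which is the case where a page would need to be
 * marked dirty (and possibly logged as a full-page image).
 */
bool
demo_kill_items(DemoItem *items, const size_t *kill, size_t nkill)
{
    bool changed = false;

    for (size_t i = 0; i < nkill; i++)
    {
        DemoItem *item = &items[kill[i]];

        /* Skip items some other scan has already marked dead. */
        if (!item->dead)
        {
            item->dead = true;
            changed = true;
        }
    }
    return changed;
}
```

A second scan that visits the same items gets `false` back and has no reason to write the page again.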
* Move check for fsync=off so that pendingOps still gets cleared.Heikki Linnakangas2020-05-14
| | | | | | | | | | | | Commit 3eb77eba5a moved the loop and refactored it, and inadvertently changed the effect of fsync=off so that it also skipped removing entries from the pendingOps table. That was not intentional, and leads to an assertion failure if you turn fsync on while the server is running and reload the config. Backpatch-through: 12- Reviewed-By: Thomas Munro Discussion: https://www.postgresql.org/message-id/3cbc7f4b-a5fa-56e9-9591-c886deb07513%40iki.fi