aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
...
* Set ReorderBufferTXN->final_lsn more eagerlyAlvaro Herrera2020-01-17
| | | | | | | | | | | | | | | | | | | ... specifically, set it incrementally as each individual change is spilled down to disk. This way, it is set correctly when the transaction disappears without trace, ie. without leaving an XACT_ABORT wal record. (This happens when the server crashes midway through a transaction.) Failing to have final_lsn prevents ReorderBufferRestoreCleanup() from working, since it needs the final_lsn in order to know the endpoint of its iteration through spilled files. Commit df9f682c7bf8 already tried to fix the problem, but it didn't set the final_lsn in all cases. Revert that, since it's no longer needed. Author: Vignesh C Reviewed-by: Amit Kapila, Dilip Kumar Discussion: https://postgr.es/m/CALDaNm2CLk+K9JDwjYST0sPbGg5AQdvhUt0jbKyX_HdAE0jk3A@mail.gmail.com
* Make rewriter prevent auto-updates on views with conditional INSTEAD rules.Dean Rasheed2020-01-14
| | | | | | | | | | | | | | | | | | | | A view with conditional INSTEAD rules and no unconditional INSTEAD rules or INSTEAD OF triggers is not auto-updatable. Previously we relied on a check in the executor to catch this, but that's problematic since the planner may fail to properly handle such a query and thus return a particularly unhelpful error to the user, before reaching the executor check. Instead, trap this in the rewriter and report the correct error there. Doing so also allows us to include more useful error detail than the executor check can provide. This doesn't change the existing behaviour of updatable views; it merely ensures that useful error messages are reported when a view isn't updatable. Per report from Pengzhou Tang, though not adopting that suggested fix. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAG4reAQn+4xB6xHJqWdtE0ve_WqJkdyCV4P=trYr4Kn8_3_PEA@mail.gmail.com
* Fix edge-case crashes and misestimation in range containment selectivity.Tom Lane2020-01-12
| | | | | | | | | | | | | | | | | | | | | | | | When estimating the selectivity of "range_var <@ range_constant" or "range_var @> range_constant", if the upper (or respectively lower) bound of the range_constant was above the last bin of the range_var's histogram, the code would access uninitialized memory and potentially crash (though it seems the probability of a crash is quite low). Handle the endpoint cases explicitly to fix that. While at it, be more paranoid about the possibility of getting NaN or other silly results from the range type's subdiff function. And improve some comments. Ordinarily we'd probably add a regression test case demonstrating the bug in unpatched code. But it's too hard to get it to crash reliably because of the uninitialized-memory dependence, so skip that. Per bug #16122 from Adam Scott. It's been broken from the beginning, apparently, so backpatch to all supported branches. Diagnosis by Michael Paquier, patch by Andrey Borodin and Tom Lane. Discussion: https://postgr.es/m/16122-eb35bc248c806c15@postgresql.org
* Revert "Forbid DROP SCHEMA on temporary namespaces"Michael Paquier2020-01-08
| | | | | | | | This reverts commit a052f6c, following complains from Robert Haas and Tom Lane. Backpatch down to 9.4, like the previous commit. Discussion: https://postgr.es/m/CA+TgmobL4npEX5=E5h=5Jm_9mZun3MT39Kq2suJFVeamc9skSQ@mail.gmail.com Backpatch-through: 9.4
* Fix running out of file descriptors for spill files.Amit Kapila2020-01-02
| | | | | | | | | | | | | | | | | | | Currently while decoding changes, if the number of changes exceeds a certain threshold, we spill those to disk.  And this happens for each (sub)transaction.  Now, while reading all these files, we don't close them until we read all the files.  While reading these files, if the number of such files exceeds the maximum number of file descriptors, the operation errors out. Use PathNameOpenFile interface to open these files as that internally has the mechanism to release kernel FDs as needed to get us under the max_safe_fds limit. Reported-by: Amit Khandekar Author: Amit Khandekar Reviewed-by: Amit Kapila Backpatch-through: 9.4 Discussion: https://postgr.es/m/CAJ3gD9c-sECEn79zXw4yBnBdOttacoE-6gAyP0oy60nfs_sabQ@mail.gmail.com
* Forbid DROP SCHEMA on temporary namespacesMichael Paquier2019-12-27
| | | | | | | | | | | | | | | | | | This operation was possible for the owner of the schema or a superuser. Down to 9.4, doing this operation would cause inconsistencies in a session whose temporary schema was dropped, particularly if trying to create new temporary objects after the drop. A more annoying consequence is a crash of autovacuum on an assertion failure when logging information about an orphaned temp table dropped. Note that because of 246a6c8 (present in v11~), which has made the removal of orphaned temporary tables more aggressive, the failure could be triggered more easily, but it is possible to reproduce down to 9.4. Reported-by: Mahendra Singh, Prabhat Sahu Author: Michael Paquier Reviewed-by: Kyotaro Horiguchi, Mahendra Singh Discussion: https://postgr.es/m/CAKYtNAr9Zq=1-ww4etHo-VCC-k120YxZy5OS01VkaLPaDbv2tg@mail.gmail.com Backpatch-through: 9.4
* Rotate instead of shifting hash join batch number.Thomas Munro2019-12-24
| | | | | | | | | | | | | | | | | | | Our algorithm for choosing batch numbers turned out not to work effectively for multi-billion key inner relations. We would use more hash bits than we have, and effectively concentrate all tuples into a smaller number of batches than we intended. While ideally we should switch to wider hashes, for now, change the algorithm to one that effectively gives up bits from the bucket number when we don't have enough bits. That means we'll finish up with longer bucket chains than would be ideal, but that's better than having batches that don't fit in work_mem and can't be divided. Batch-patch to all supported releases. Author: Thomas Munro Reviewed-by: Tom Lane, thanks also to Tomas Vondra, Alvaro Herrera, Andres Freund for testing and discussion Reported-by: James Coleman Discussion: https://postgr.es/m/16104-dc11ed911f1ab9df%40postgresql.org
* Prevent a rowtype from being included in itself via a range.Tom Lane2019-12-23
| | | | | | | | | | We probably should have thought of this case when ranges were added, but we didn't. (It's not the fault of commit eb51af71f, because ranges didn't exist then.) It's an old bug, so back-patch to all supported branches. Discussion: https://postgr.es/m/7782.1577051475@sss.pgh.pa.us
* Fix error reporting for index expressions of prohibited types.Tom Lane2019-12-17
| | | | | | | | | | | | | | | | | | | If CheckAttributeType() threw an error about the datatype of an index expression column, it would report an empty column name, which is pretty unhelpful and certainly not the intended behavior. I (tgl) evidently broke this in commit cfc5008a5, by not noticing that the column's attname was used above where I'd placed the assignment of it. In HEAD and v12, this is trivially fixable by moving up the assignment of attname. Before v12 the code is a bit more messy; to avoid doing substantial refactoring, I took the lazy way out and just put in two copies of the assignment code. Report and patch by Amit Langote. Back-patch to all supported branches. Discussion: https://postgr.es/m/CA+HiwqFA+BGyBFimjiYXXMa2Hc3fcL0+OJOyzUNjhU4NCa_XXw@mail.gmail.com
* Fix EXTRACT(ISOYEAR FROM timestamp) for years BC.Tom Lane2019-12-12
| | | | | | | | | The test cases added by commit 26ae3aa80 exposed an old oversight in timestamp[tz]_part: they didn't correct the result of date2isoyear() for BC years, so that we produced an off-by-one answer for such years. Fix that, and back-patch to all supported branches. Discussion: https://postgr.es/m/SG2PR06MB37762CAE45DB0F6CA7001EA9B6550@SG2PR06MB3776.apcprd06.prod.outlook.com
* Remove redundant function calls in timestamp[tz]_part().Tom Lane2019-12-12
| | | | | | | | | | | | | | | | | | | | The DTK_DOW/DTK_ISODOW and DTK_DOY switch cases in timestamp_part() and timestamptz_part() contained calls of timestamp2tm() that were fully redundant with the ones done just above the switch. This evidently crept in during commit 258ee1b63, which relocated that code from another place where the calls were indeed needed. Just delete the redundant calls. I (tgl) noted that our test coverage of these functions left quite a bit to be desired, so extend timestamp.sql and timestamptz.sql to cover all the branches. Back-patch to all supported branches, as the previous commit was. There's no real issue here other than some wasted cycles in some not-too-heavily-used code paths, but the test coverage seems valuable. Report and patch by Li Japin; test case adjustments by me. Discussion: https://postgr.es/m/SG2PR06MB37762CAE45DB0F6CA7001EA9B6550@SG2PR06MB3776.apcprd06.prod.outlook.com
* Fix race condition in our Windows signal emulation.Tom Lane2019-12-09
| | | | | | | | | | | | | | | | | | | | | | | | pg_signal_dispatch_thread() responded to the client (signal sender) and disconnected the pipe before actually setting the shared variables that make the signal visible to the backend process's main thread. In the worst case, it seems, effective delivery of the signal could be postponed for as long as the machine has any other work to do. To fix, just move the pg_queue_signal() call so that we do it before responding to the client. This essentially makes pgkill() synchronous, which is a stronger guarantee than we have on Unix. That may be overkill, but on the other hand we have not seen comparable timing bugs on any Unix platform. While at it, add some comments to this sadly underdocumented code. Problem diagnosis and fix by Amit Kapila; I just added the comments. Back-patch to all supported versions, as it appears that this can cause visible NOTIFY timing oddities on all of them, and there might be other misbehavior due to slow delivery of other signals. Discussion: https://postgr.es/m/32745.1575303812@sss.pgh.pa.us
* Fix misbehavior with expression indexes on ON COMMIT DELETE ROWS tables.Tom Lane2019-12-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We implement ON COMMIT DELETE ROWS by truncating tables marked that way, which requires also truncating/rebuilding their indexes. But RelationTruncateIndexes asks the relcache for up-to-date copies of any index expressions, which may cause execution of eval_const_expressions on them, which can result in actual execution of subexpressions. This is a bad thing to have happening during ON COMMIT. Manuel Rigger reported that use of a SQL function resulted in crashes due to expectations that ActiveSnapshot would be set, which it isn't. The most obvious fix perhaps would be to push a snapshot during PreCommit_on_commit_actions, but I think that would just open the door to more problems: CommitTransaction explicitly expects that no user-defined code can be running at this point. Fortunately, since we know that no tuples exist to be indexed, there seems no need to use the real index expressions or predicates during RelationTruncateIndexes. We can set up dummy index expressions instead (we do need something that will expose the right data type, as there are places that build index tupdescs based on this), and just ignore predicates and exclusion constraints. In a green field it'd likely be better to reimplement ON COMMIT DELETE ROWS using the same "init fork" infrastructure used for unlogged relations. That seems impractical without catalog changes though, and even without that it'd be too big a change to back-patch. So for now do it like this. Per private report from Manuel Rigger. This has been broken forever, so back-patch to all supported branches.
* Fix typo in comment.Etsuro Fujita2019-11-27
|
* Avoid assertion failure with LISTEN in a serializable transaction.Tom Lane2019-11-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If LISTEN is the only action in a serializable-mode transaction, and the session was not previously listening, and the notify queue is not empty, predicate.c reported an assertion failure. That happened because we'd acquire the transaction's initial snapshot during PreCommit_Notify, which was called *after* predicate.c expects any such snapshot to have been established. To fix, just swap the order of the PreCommit_Notify and PreCommit_CheckForSerializationFailure calls during CommitTransaction. This will imply holding the notify-insertion lock slightly longer, but the difference could only be meaningful in serializable mode, which is an expensive option anyway. It appears that this is just an assertion failure, with no consequences in non-assert builds. A snapshot used only to scan the notify queue could not have been involved in any serialization conflicts, so there would be nothing for PreCommit_CheckForSerializationFailure to do except assign it a prepareSeqNo and set the SXACT_FLAG_PREPARED flag. And given no conflicts, neither of those omissions affect the behavior of ReleasePredicateLocks. This admittedly once-over-lightly analysis is backed up by the lack of field reports of trouble. Per report from Mark Dilger. The bug is old, so back-patch to all supported branches; but the new test case only goes back to 9.6, for lack of adequate isolationtester infrastructure before that. Discussion: https://postgr.es/m/3ac7f397-4d5f-be8e-f354-440020675694@gmail.com Discussion: https://postgr.es/m/13881.1574557302@sss.pgh.pa.us
* Defend against self-referential views in relation_is_updatable().Tom Lane2019-11-21
| | | | | | | | | | | | | | | | | | | | | While a self-referential view doesn't actually work, it's possible to create one, and it turns out that this breaks some of the information_schema views. Those views call relation_is_updatable(), which neglected to consider the hazards of being recursive. In older PG versions you get a "stack depth limit exceeded" error, but since v10 it'd recurse to the point of stack overrun and crash, because commit a4c35ea1c took out the expression_returns_set() call that was incidentally checking the stack depth. Since this function is only used by information_schema views, it seems like it'd be better to return "not updatable" than suffer an error. Hence, add tracking of what views we're examining, in just the same way that the nearby fireRIRrules() code detects self-referential views. I added a check_stack_depth() call too, just to be defensive. Per private report from Manuel Rigger. Back-patch to all supported versions.
* Revise GIN READMEAlexander Korotkov2019-11-20
| | | | | | | | | | | | | | | | | We find GIN concurrency bugs from time to time. One of the problems here is that concurrency of GIN isn't well-documented in README. So, it might be even hard to distinguish design bugs from implementation bugs. This commit revised concurrency section in GIN README providing more details. Some examples are illustrated in ASCII art. Also, this commit add the explanation of how is tuple layout in internal GIN B-tree page different in comparison with nbtree. Discussion: https://postgr.es/m/CAPpHfduXR_ywyaVN4%2BOYEGaw%3DcPLzWX6RxYLBncKw8de9vOkqw%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Peter Geoghegan Backpatch-through: 9.4
* Fix traversing to the deleted GIN page via downlinkAlexander Korotkov2019-11-20
| | | | | | | | | | | | | | | Current GIN code appears to don't handle traversing to the deleted page via downlink. This commit fixes that by stepping right from the delete page like we do in nbtree. This commit also fixes setting 'deleted' flag to the GIN pages. Now other page flags are not erased once page is deleted. That helps to keep our assertions true if we arrive deleted page via downlink. Discussion: https://postgr.es/m/CAPpHfdvMvsw-NcE5bRS7R1BbvA4BxoDnVVjkXC5W0Czvy9LVrg%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Peter Geoghegan Backpatch-through: 9.4
* Further fix dumping of views that contain just VALUES(...).Tom Lane2019-11-16
| | | | | | | | | | | | | | | | | | | | | | | It turns out that commit e9f1c01b7 missed a case: we must print a VALUES clause in long format if get_query_def is given a resultDesc that would require the query's output column name(s) to be different from what the bare VALUES clause would produce. This applies in case an ALTER ... RENAME COLUMN has been done to a view that formerly could be printed in simple format, as shown in the added regression test case. It also explains bug #16119 from Dmitry Telpt, because it turns out that (unlike CREATE VIEW) CREATE MATERIALIZED VIEW fails to apply any column aliases it's given to the stored ON SELECT rule. So to get them to be printed, we have to account for the resultDesc renaming. It might be worth changing the matview code so that it creates the ON SELECT rule with the correct aliases; but we'd still need these messy checks in get_simple_values_rte to handle the case of a subsequent column rename, so any such change would be just neatnik-ism not a bug fix. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/16119-e64823f30a45a754@postgresql.org
* Translation updatesPeter Eisentraut2019-11-11
| | | | | Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: ccaca304fd66160be942ccf657be787511659ac1
* Fix gratuitous error message variationPeter Eisentraut2019-11-08
|
* Fix integer-overflow edge case detection in interval_mul and pgbench.Tom Lane2019-11-07
| | | | | | | | | | | | | | | | | | This patch adopts the overflow check logic introduced by commit cbdb8b4c0 into two more places. interval_mul() failed to notice if it computed a new microseconds value that was one more than INT64_MAX, and pgbench's double-to-int64 logic had the same sorts of edge-case problems that cbdb8b4c0 fixed in the core code. To make this easier to get right in future, put the guts of the checks into new macros in c.h, and add commentary about how to use the macros correctly. Back-patch to all supported branches, as we did with the previous fix. Yuya Watari Discussion: https://postgr.es/m/CAJ2pMkbkkFw2hb9Qb1Zj8d06EhWAQXFLy73St4qWv6aX=vqnjw@mail.gmail.com
* Fix assertion failure when running pgbench -s.Fujii Masao2019-11-07
| | | | | | | | | | | | | | | | | | If there is the WAL page that the continuation WAL record just fits within (i.e., the continuation record ends just at the end of the page) and the LSN in such page is specified with -s option, previously pg_waldump caused an assertion failure. The cause of this assertion failure was that XLogFindNextRecord() that pg_waldump -s calls mistakenly handled such special WAL page. This commit changes XLogFindNextRecord() so that it can handle such WAL page correctly. Back-patch to all supported versions. Author: Andrey Lepikhov Reviewed-by: Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/99303554-5dd5-06e6-f943-b3005ccd6edd@postgrespro.ru
* Fix timestamp of sent message for write context in logical decodingMichael Paquier2019-11-06
| | | | | | | | | | | | | | | | | | | When sending data for logical decoding using the streaming replication protocol via a WAL sender, the timestamp of the sent write message is allocated at the beginning of the message when preparing for the write, and actually computed when the write message is ready to be sent. The timestamp was getting computed after sending the message. This impacts anything using logical decoding, causing for example logical replication to report mostly NULL for last_msg_send_time in pg_stat_subscription. This commit makes sure that the timestamp is computed before sending the message. This is wrong since 5a991ef, so backpatch down to 9.4. Author: Jeff Janes Discussion: https://postgr.es/m/CAMkU=1z=WMn8jt7iEdC5sYNaPgAgOASb_OW5JYv-vMdYaJSL-w@mail.gmail.com Backpatch-through: 9.4
* Avoid logging complaints about abandoned connections when using PAM.Tom Lane2019-11-05
| | | | | | | | | | | | | | | | | | | For a long time (since commit aed378e8d) we have had a policy to log nothing about a connection if the client disconnects when challenged for a password. This is because libpq-using clients will typically do that, and then come back for a new connection attempt once they've collected a password from their user, so that logging the abandoned connection attempt will just result in log spam. However, this did not work well for PAM authentication: the bottom-level function pam_passwd_conv_proc() was on board with it, but we logged messages at higher levels anyway, for lack of any reporting mechanism. Add a flag and tweak the logic so that the case is silent, as it is for other password-using auth mechanisms. Per complaint from Yoann La Cancellera. It's been like this for awhile, so back-patch to all supported branches. Discussion: https://postgr.es/m/CACP=ajbrFFYUrLyJBLV8=q+eNCapa1xDEyvXhMoYrNphs-xqPw@mail.gmail.com
* Catch invalid typlens in a couple of placesPeter Eisentraut2019-11-04
| | | | | | | | | | Rearrange the logic in record_image_cmp() and record_image_eq() to error out on unexpected typlens (either not supported there or completely invalid due to corruption). Barring corruption, this is not possible today but it seems more future-proof and robust to fix this. Reported-by: Peter Geoghegan <pg@bowt.ie>
* Fix race condition at backend exit when deleting element in syncrep queueMichael Paquier2019-11-01
| | | | | | | | | | | | | | | When a backend exits, it gets deleted from the syncrep queue if present. The queue was checked without SyncRepLock taken in exclusive mode, so it would have been possible for a backend to remove itself after a WAL sender already did the job. Fix this issue based on a suggestion from Fujii Masao, by first checking the queue without the lock. Then, if the backend is present in the queue, take the lock and perform an additional lookup check before doing the element deletion. Author: Dongming Liu Reviewed-by: Kyotaro Horiguchi, Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/a0806273-8bbb-43b3-bbe1-c45a58f6ae21.lingce.ldm@alibaba-inc.com Backpatch-through: 9.4
* Clean up properly error_context_stack in autovacuum worker on exceptionMichael Paquier2019-10-23
| | | | | | | | | | | | | | Any callback set would have no meaning in the context of an exception. As an autovacuum worker exits quickly in this context, this could be only an issue within EmitErrorReport(), where the elog hook is for example called. That's unlikely to going to be a problem, but let's be clean and consistent with other code paths handling exceptions. This is present since 2909419, which introduced autovacuum. Author: Ashwin Agrawal Reviewed-by: Tom Lane, Michael Paquier Discussion: https://postgr.es/m/CALfoeisM+_+dgmAdAOHAu0k-ZpEHHqSSG=GRf3pKJGm8OqWX0w@mail.gmail.com Backpatch-through: 9.4
* Fix failure of archive recovery with recovery_min_apply_delay enabled.Fujii Masao2019-10-18
| | | | | | | | | | | | | | | | | | | | | | | | recovery_min_apply_delay parameter is intended for use with streaming replication deployments. However, the document clearly explains that the parameter will be honored in all cases if it's specified. So it should take effect even if in archive recovery. But, previously, archive recovery with recovery_min_apply_delay enabled always failed, and caused assertion failure if --enable-caasert is enabled. The cause of this problem is that; the ownership of recoveryWakeupLatch that recovery_min_apply_delay uses was taken only when standby mode is requested. So unowned latch could be used in archive recovery, and which caused the failure. This commit changes recovery code so that the ownership of recoveryWakeupLatch is taken even in archive recovery. Which prevents archive recovery with recovery_min_apply_delay from failing. Back-patch to v9.4 where recovery_min_apply_delay was added. Author: Fujii Masao Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CAHGQGwEyD6HdZLfdWc+95g=VQFPR4zQL4n+yHxQgGEGjaSVheQ@mail.gmail.com
* Fix minor bug in logical-replication walsender shutdownAlvaro Herrera2019-10-17
| | | | | | | | | | | | | | | | | Logical walsender should exit when it catches up with sending WAL during shutdown; but there was a rare corner case when it failed to because of a race condition that puts it back to wait for more WAL instead -- but since there wasn't any, it'd not shut down immediately. It would only continue the shutdown when wal_sender_timeout terminates the sleep, which causes annoying waits during shutdown procedure. Restructure the code so that we no longer forget to set WalSndCaughtUp in that case. This was an oversight in commit c6c333436. Backpatch all the way down to 9.4. Author: Craig Ringer, Álvaro Herrera Discussion: https://postgr.es/m/CAMsr+YEuz4XwZX_QmnX_-2530XhyAmnK=zCmicEnq1vLr0aZ-g@mail.gmail.com
* When restoring GUCs in parallel workers, show an error context.Thomas Munro2019-10-17
| | | | | | | | | | | Otherwise it can be hard to see where an error is coming from, when the parallel worker sets all the GUCs that it received from the leader. Bug #15726. Back-patch to 9.5, where RestoreGUCState() appeared. Reported-by: Tiago Anastacio Reviewed-by: Daniel Gustafsson, Tom Lane Discussion: https://postgr.es/m/15726-6d67e4fa14f027b3%40postgresql.org
* Fix bug that could try to freeze running multixacts.Thomas Munro2019-10-17
| | | | | | | | | | | | | Commits 801c2dc7 and 801c2dc7 made it possible for vacuum to try to freeze a multixact that is still running. That was prevented by a check, but raised an error. Repair. Back-patch all the way. Author: Nathan Bossart, Jeremy Schneider Reported-by: Jeremy Schneider Reviewed-by: Jim Nasby, Thomas Munro Discussion: https://postgr.es/m/DAFB8AFF-2F05-4E33-AD7F-FF8B0F760C17%40amazon.com
* Flush logical mapping files with fd opened for read/write at checkpointMichael Paquier2019-10-09
| | | | | | | | | | | | | | | The file descriptor was opened with read-only to fsync a regular file, which would cause EBADFD errors on some platforms. This is similar to the recent fix done by a586cc4b (which was broken by me with 82a5649), except that I noticed this issue while monitoring the backend code for similar mistakes. Backpatch to 9.4, as this has been introduced since logical decoding exists as of b89e151. Author: Michael Paquier Reviewed-by: Andres Freund Discussion: https://postgr.es/m/20191006045548.GA14532@paquier.xyz Backpatch-through: 9.4
* Check for too many postmaster children before spawning a bgworker.Tom Lane2019-10-07
| | | | | | | | | | | | | | | | | | | | The postmaster's code path for spawning a bgworker neglected to check whether we already have the max number of live child processes. That's a bit hard to hit, since it would necessarily be a transient condition; but if we do, AssignPostmasterChildSlot() fails causing a postmaster crash, as seen in a report from Bhargav Kamineni. To fix, invoke canAcceptConnections() in the bgworker code path, as we do in the other code paths that spawn children. Since we don't want the same pmState tests in this case, add a child-process-type parameter to canAcceptConnections() so that it can know what to do. Back-patch to 9.5. In principle the same hazard exists in 9.4, but the code is enough different that this patch wouldn't quite fix it there. Given the tiny usage of bgworkers in that branch it doesn't seem worth creating a variant patch for it. Discussion: https://postgr.es/m/18733.1570382257@sss.pgh.pa.us
* Fix bitshiftright()'s zero-padding some more.Tom Lane2019-10-04
| | | | | | | | | | | | | | | Commit 5ac0d9360 failed to entirely fix bitshiftright's habit of leaving one-bits in the pad space that should be all zeroes, because in a moment of sheer brain fade I'd concluded that only the code path used for not-a-multiple-of-8 shift distances needed to be fixed. Of course, a multiple-of-8 shift distance can also cause the problem, so we need to forcibly zero the extra bits in both cases. Per bug #16037 from Alexander Lakhin. As before, back-patch to all supported branches. Discussion: https://postgr.es/m/16037-1d1ebca564db54f4@postgresql.org
* Avoid unnecessary out-of-memory errors during encoding conversion.Tom Lane2019-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Encoding conversion uses the very simplistic rule that the output can't be more than 4X longer than the input, and palloc's a buffer of that size. This results in failure to convert any string longer than 1/4 GB, which is becoming an annoying limitation. As a band-aid to improve matters, allow the allocated output buffer size to exceed 1GB. We still insist that the final result fit into MaxAllocSize (1GB), though. Perhaps it'd be safe to relax that restriction, but it'd require close analysis of all callers, which is daunting (not least because external modules might call these functions). For the moment, this should allow a 2X to 4X improvement in the longest string we can convert, which is a useful gain in return for quite a simple patch. Also, once we have successfully converted a long string, repalloc the output down to the actual string length, returning the excess to the malloc pool. This seems worth doing since we can usually expect to give back several MB if we take this path at all. This still leaves much to be desired, most notably that the assumption that MAX_CONVERSION_GROWTH == 4 is very fragile, and yet we have no guard code verifying that the output buffer isn't overrun. Fixing that would require significant changes in the encoding conversion APIs, so it'll have to wait for some other day. The present patch seems safely back-patchable, so patch all supported branches. Alvaro Herrera and Tom Lane Discussion: https://postgr.es/m/20190816181418.GA898@alvherre.pgsql Discussion: https://postgr.es/m/3614.1569359690@sss.pgh.pa.us
* Allow repalloc() to give back space when a large chunk is downsized.Tom Lane2019-10-03
| | | | | | | | | | | | | | | | | | Up to now, if you resized a large (>8K) palloc chunk down to a smaller size, aset.c made no attempt to return any space to the malloc pool. That's unpleasant if a really large allocation is resized to a significantly smaller size. I think no such cases existed when this code was designed, and I'm not sure whether they're common even yet, but an upcoming fix to encoding conversion will certainly create such cases. Therefore, fix AllocSetRealloc so that it gives realloc() a chance to do something with the block. This doesn't noticeably increase complexity, we mostly just have to change the order in which the cases are considered. Back-patch to all supported branches. Discussion: https://postgr.es/m/20190816181418.GA898@alvherre.pgsql Discussion: https://postgr.es/m/3614.1569359690@sss.pgh.pa.us
* Selectively include window frames in expression walks/mutates.Andrew Gierth2019-10-03
| | | | | | | | | | | | | | | | | | | | query_tree_walker and query_tree_mutator were skipping the windowClause of the query, without regard for the fact that the startOffset and endOffset in a WindowClause node are expression trees that need to be processed. This was an oversight in commit ec4be2ee6 from 2010 which added the expression fields; the main symptom is that function parameters in window frame clauses don't work in inlined functions. Fix (as conservatively as possible since this needs to not break existing out-of-tree callers) and add tests. Backpatch all the way, since this has been broken since 9.0. Per report from Alastair McKinley; fix by me with kibitzing and review from Tom Lane. Discussion: https://postgr.es/m/DB6PR0202MB2904E7FDDA9D81504D1E8C68E3800@DB6PR0202MB2904.eurprd02.prod.outlook.com
* Remove temporary WAL and history files at the end of archive recoveryMichael Paquier2019-10-02
| | | | | | | | | | | | | | | cbc55da has reworked the order of some actions at the end of archive recovery. Unfortunately this overlooked the fact that the startup process needs to remove RECOVERYXLOG (for temporary WAL segment newly recovered from archives) and RECOVERYHISTORY (for temporary history file) at this step, leaving the files around even after recovery ended. Backpatch to 9.5, like the previous commit. Author: Sawada Masahiko Reviewed-by: Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/CAD21AoBO_eDQub6zojFnWtnmutRBWvYf7=cW4Hsqj+U_R26w3Q@mail.gmail.com Backpatch-through: 9.5
* Fix failure to zero-pad the result of bitshiftright().Tom Lane2019-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the bitstring length is not a multiple of 8, we'd shift the rightmost bits into the pad space, which must be zeroes --- bit_cmp, for one, depends on that. This'd lead to the result failing to compare equal to what it should compare equal to, as reported in bug #16013 from Daryl Waycott. This is, if memory serves, not the first such bug in the bitstring functions. In hopes of making it the last one, do a bit more work than minimally necessary to fix the bug: * Add assertion checks to bit_out() and varbit_out() to complain if they are given incorrectly-padded input. This will improve the odds that manual testing of any new patch finds problems. * Encapsulate the padding-related logic in macros to make it easier to use. Also, remove unnecessary padding logic from bit_or() and bitxor(). Somebody had already noted that we need not re-pad the result of bit_and() since the inputs are required to be the same length, but failed to extrapolate that to the other two. Also, move a comment block that once was near the head of varbit.c (but people kept putting other stuff in front of it), to put it in the header block. Note for the release notes: if anyone has inconsistent data as a result of saving the output of bitshiftright() in a table, it's possible to fix it with something like UPDATE mytab SET bitcol = ~(~bitcol) WHERE bitcol != ~(~bitcol); This has been broken since day one, so back-patch to all supported branches. Discussion: https://postgr.es/m/16013-c2765b6996aacae9@postgresql.org
* Fix oversight in backpatch of 6cae9d2c10Alexander Korotkov2019-09-19
| | | | | | | | | | During backpatch of 6cae9d2c10 Float8GetDatum() was accidentally removed. This commit turns it back. Reported-by: Erik Rijkers Discussion: https://postgr.es/m/6d51305e1159241cabee132f7efc7eff%40xs4all.nl Author: Tom Lane Backpatch-through: from 11 to 9.5
* Improve handling of NULLs in KNN-GiST and KNN-SP-GiSTAlexander Korotkov2019-09-19
| | | | | | | | | | | | | | | | | | | | | This commit improves subject in two ways: * It removes ugliness of 02f90879e7, which stores distance values and null flags in two separate arrays after GISTSearchItem struct. Instead we pack both distance value and null flag in IndexOrderByDistance struct. Alignment overhead should be negligible, because we typically deal with at most few "col op const" expressions in ORDER BY clause. * It fixes handling of "col op NULL" expression in KNN-SP-GiST. Now, these expression are not passed to support functions, which can't deal with them. Instead, NULL result is implicitly assumed. It future we may decide to teach support functions to deal with NULL arguments, but current solution is bugfix suitable for backpatch. Reported-by: Nikita Glukhov Discussion: https://postgr.es/m/826f57ee-afc7-8977-c44c-6111d18b02ec%40postgrespro.ru Author: Nikita Glukhov Reviewed-by: Alexander Korotkov Backpatch-through: 9.4
* logical decoding: process ASSIGNMENT during snapshot buildAlvaro Herrera2019-09-13
| | | | | | | | | | | | | Most WAL records are ignored in early SnapBuild snapshot build phases. But it's critical to process some of them, so that later messages have the correct transaction state after the snapshot is completely built; in particular, XLOG_XACT_ASSIGNMENT messages are critical in order for sub-transactions to be correctly assigned to their parent transactions, or at least one assert misbehaves, as reported by Ildar Musin. Diagnosed-by: Masahiko Sawada Author: Masahiko Sawada Discussion: https://postgr.es/m/CAONYFtOv+Er1p3WAuwUsy1zsCFrSYvpHLhapC_fMD-zNaRWxYg@mail.gmail.com
* Fix nbtree page split rmgr desc routine.Peter Geoghegan2019-09-12
| | | | | | | | | | | | Include newitemoff in rmgr desc output for nbtree page split records. In passing, correct an obsolete comment that claimed that newitemoff is only logged for _L variant nbtree page split WAL records. Both issues were oversights in commit 2c03216d831, which revamped the WAL format. Author: Peter Geoghegan Backpatch: 9.5-, where the WAL format was revamped.
* Fix usage of whole-row variables in WCO and RLS policy expressions.Tom Lane2019-09-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | Since WITH CHECK OPTION was introduced, ExecInitModifyTable has initialized WCO expressions with the wrong plan node as parent -- that is, it passed its input subplan not the ModifyTable node itself. Up to now we thought this was harmless, but bug #16006 from Vinay Banakar shows it's not: if the input node is a SubqueryScan then ExecInitWholeRowVar can get confused into doing the wrong thing. (The fact that ExecInitWholeRowVar contains such logic is certainly a horrid kluge that doesn't deserve to live, but figuring out another way to do that is a task for some other day.) Andres had already noticed the wrong-parent mistake and fixed it in commit 148e632c0, but not being aware of any user-visible consequences, he quite reasonably didn't back-patch. This patch is simply a back-patch of 148e632c0, plus addition of a test case based on bug #16006. I also added the test case to v12/HEAD, even though the bug is already fixed there. Back-patch to all supported branches. 9.4 lacks RLS policies so the new test case doesn't work there, but I'm pretty sure a test could be devised based on using a whole-row Var in a plain WITH CHECK OPTION condition. (I lack the cycles to do so myself, though.) Andres Freund and Tom Lane Discussion: https://postgr.es/m/16006-99290d2e4642cbd5@postgresql.org Discussion: https://postgr.es/m/20181205225213.hiwa3kgoxeybqcqv@alap3.anarazel.de
* Fix RelationIdGetRelation calls that weren't bothering with error checks.Tom Lane2019-09-08
| | | | | | | | | | | | Some of these are quite old, but that doesn't make them not bugs. We'd rather report a failure via elog than SIGSEGV. While at it, uniformly spell the error check as !RelationIsValid(rel) rather than a bare rel == NULL test. The machine code is the same but it seems better to be consistent. Coverity complained about this today, not sure why, because the mistake is in fact old.
* Fix handling of NULL distances in KNN-GiSTAlexander Korotkov2019-09-08
| | | | | | | | | | | | | | In order to implement NULL LAST semantic GiST previously assumed distance to the NULL value to be Inf. However, our distance functions can return Inf and NaN for non-null values. In such cases, NULL LAST semantic appears to be broken. This commit fixes that by introducing separate array of null flags for distances. Backpatch to all supported versions. Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com Author: Alexander Korotkov Backpatch-through: 9.4
* Fix handling Inf and Nan values in GiST pairing heap comparatorAlexander Korotkov2019-09-08
| | | | | | | | | | | | | | | | Previously plain float comparison was used in GiST pairing heap. Such comparison doesn't provide proper ordering for value sets containing Inf and Nan values. This commit fixes that by usage of float8_cmp_internal(). Note, there is remaining problem with NULL distances, which are represented as Inf in pairing heap. It would be fixes in subsequent commit. Backpatch to all supported versions. Reported-by: Andrey Borodin Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Heikki Linnakangas Backpatch-through: 9.4
* When performing a base backup, check for read errors.Robert Haas2019-09-06
| | | | | | | | | | | | | | | | The old code didn't differentiate between a read error and a concurrent truncation. fread reports both of these by returning 0; you have to use feof() or ferror() to distinguish between them, which this code did not do. It might be a better idea to use read() rather than fread() here, so that we can display a less-generic error message, but I'm not sure that would qualify as a back-patchable bug fix, so just do this much for now. Jeevan Chalke, reviewed by Jeevan Ladhe and by me. Discussion: http://postgr.es/m/CA+TgmobG4ywMzL5oQq2a8YKp8x2p3p1LOMMcGqpS7aekT9+ETA@mail.gmail.com
* Fix overflow check and comment in GIN posting list encoding.Heikki Linnakangas2019-08-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The comment did not match what the code actually did for integers with the 43rd bit set. You get an integer like that, if you have a posting list with two adjacent TIDs that are more than 2^31 blocks apart. According to the comment, we would store that in 6 bytes, with no continuation bit on the 6th byte, but in reality, the code encodes it using 7 bytes, with a continuation bit on the 6th byte as normal. The decoding routine also handled these 7-byte integers correctly, except for an overflow check that assumed that one integer needs at most 6 bytes. Fix the overflow check, and fix the comment to match what the code actually does. Also fix the comment that claimed that there are 17 unused bits in the 64-bit representation of an item pointer. In reality, there are 64-32-11=21. Fitting any item pointer into max 6 bytes was an important property when this was written, because in the old pre-9.4 format, item pointers were stored as plain arrays, with 6 bytes for every item pointer. The maximum of 6 bytes per integer in the new format guaranteed that we could convert any page from the old format to the new format after upgrade, so that the new format was never larger than the old format. But we hardly need to worry about that anymore, and running into that problem during upgrade, where an item pointer is expanded from 6 to 7 bytes such that the data doesn't fit on a page anymore, is implausible in practice anyway. Backpatch to all supported versions. This also includes a little test module to test these large distances between item pointers, without requiring a 16 TB table. It is not backpatched, I'm including it more for the benefit of future development of new posting list formats. Discussion: https://www.postgresql.org/message-id/33bfc20a-5c86-f50c-f5a5-58e9925d05ff%40iki.fi Reviewed-by: Masahiko Sawada, Alexander Korotkov