aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
* Fix negative bitmapset member not allowed error in logical replicationPeter Eisentraut2019-11-09
| | | | | | | | | | | | | | | | | This happens when we add a replica identity column on a subscriber that does not yet exist on the publisher, according to the mapping maintained by the subscriber. Code that checks whether the target relation on the subscriber is updatable would check the replica identity attribute bitmap with a column number -1, which would result in an error. To fix, skip such columns in the bitmap lookup and consider the relation not updatable. The result is consistent with the rule that the replica identity columns on the subscriber must be a subset of those on the publisher, since if the column doesn't exist on the publisher, the column set on the subscriber can't be a subset. Reported-by: Tim Clarke <tim.clarke@minerva.info> Analyzed-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Discussion: https://www.postgresql.org/message-id/flat/a9139c29-7ddd-973b-aa7f-71fed9c38d75%40minerva.info
* Fix gratuitous error message variationPeter Eisentraut2019-11-08
|
* Fix SET CONSTRAINTS .. DEFERRED on partitioned tablesAlvaro Herrera2019-11-07
| | | | | | | | | | | | | | | | | SET CONSTRAINTS ... DEFERRED failed on partitioned tables, because of a sanity check that ensures that the affected constraints have triggers. On partitioned tables, the triggers are in the leaf partitions, not in the partitioned relations themselves, so the sanity check fails. Removing the sanity check solves the problem, because the code needed to support the case is already there. Backpatch to 11. Note: deferred unique constraints are not affected by this bug, because they do have triggers in the parent partitioned table. I did not add a test for this scenario. Discussion: https://postgr.es/m/20191105212915.GA11324@alvherre.pgsql
* Fix integer-overflow edge case detection in interval_mul and pgbench.Tom Lane2019-11-07
| | | | | | | | | | | | | | | | | | This patch adopts the overflow check logic introduced by commit cbdb8b4c0 into two more places. interval_mul() failed to notice if it computed a new microseconds value that was one more than INT64_MAX, and pgbench's double-to-int64 logic had the same sorts of edge-case problems that cbdb8b4c0 fixed in the core code. To make this easier to get right in future, put the guts of the checks into new macros in c.h, and add commentary about how to use the macros correctly. Back-patch to all supported branches, as we did with the previous fix. Yuya Watari Discussion: https://postgr.es/m/CAJ2pMkbkkFw2hb9Qb1Zj8d06EhWAQXFLy73St4qWv6aX=vqnjw@mail.gmail.com
* Fix assertion failure when running pgbench -s.Fujii Masao2019-11-07
| | | | | | | | | | | | | | | | | | If there is the WAL page that the continuation WAL record just fits within (i.e., the continuation record ends just at the end of the page) and the LSN in such page is specified with -s option, previously pg_waldump caused an assertion failure. The cause of this assertion failure was that XLogFindNextRecord() that pg_waldump -s calls mistakenly handled such special WAL page. This commit changes XLogFindNextRecord() so that it can handle such WAL page correctly. Back-patch to all supported versions. Author: Andrey Lepikhov Reviewed-by: Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/99303554-5dd5-06e6-f943-b3005ccd6edd@postgrespro.ru
* Minor code review for tuple slot rewrite.Tom Lane2019-11-06
| | | | | | | | | | | | | | | | | Avoid creating transiently-inconsistent slot states where possible, by not setting TTS_FLAG_SHOULDFREE until after the slot actually has a free'able tuple pointer, and by making sure that we reset tts_nvalid and related derived state before we replace the tuple contents. This would only matter if something were to examine the slot after we'd suffered some kind of error (e.g. out of memory) while manipulating the slot. We typically don't do that, so these changes might just be cosmetic --- but even if so, it seems like good future-proofing. Also remove some redundant Asserts, and add a couple for consistency. Back-patch to v12 where all this code was rewritten. Discussion: https://postgr.es/m/16095-c3ff2e5283b8dba5@postgresql.org
* Sync our DTrace infrastructure with c.h's definition of type bool.Tom Lane2019-11-06
| | | | | | | | | | | | | | | | | | | Since commit d26a810eb, we've defined bool as being either _Bool from <stdbool.h>, or "unsigned char"; but that commit overlooked the fact that probes.d has "#define bool char". For consistency, make it say "unsigned char" instead. This should be strictly a cosmetic change, but it seems best to be in sync. Formally, in the now-normal case where we're using <stdbool.h>, it'd be better to write "#define bool _Bool". However, then we'd need some build infrastructure to inject that configuration choice into probes.d, and it doesn't seem worth the trouble. We only use <stdbool.h> if sizeof(_Bool) is 1, so having DTrace think that bool parameters are "unsigned char" should be close enough. Back-patch to v12 where d26a810eb came in. Discussion: https://postgr.es/m/CAA4eK1LmaKO7Du9M9Lo=kxGU8sB6aL8fa3sF6z6d5yYYVe3BuQ@mail.gmail.com
* Fix memory allocation mistakePeter Eisentraut2019-11-06
| | | | | | | | The previous code was allocating more memory than necessary because the formula used the wrong data type. Reported-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Discussion: https://www.postgresql.org/message-id/20191105172918.3e32a446@firost
* Fix timestamp of sent message for write context in logical decodingMichael Paquier2019-11-06
| | | | | | | | | | | | | | | | | | | When sending data for logical decoding using the streaming replication protocol via a WAL sender, the timestamp of the sent write message is allocated at the beginning of the message when preparing for the write, and actually computed when the write message is ready to be sent. The timestamp was getting computed after sending the message. This impacts anything using logical decoding, causing for example logical replication to report mostly NULL for last_msg_send_time in pg_stat_subscription. This commit makes sure that the timestamp is computed before sending the message. This is wrong since 5a991ef, so backpatch down to 9.4. Author: Jeff Janes Discussion: https://postgr.es/m/CAMkU=1z=WMn8jt7iEdC5sYNaPgAgOASb_OW5JYv-vMdYaJSL-w@mail.gmail.com Backpatch-through: 9.4
* Request small targetlist for input to WindowAgg.Andrew Gierth2019-11-06
| | | | | | | | | | | | | | | | WindowAgg will potentially store large numbers of input rows into tuplestores to allow access to other rows in the frame. If the input is coming via an explicit Sort node, then unneeded columns will already have been discarded (since Sort requests a small tlist); but there are idioms like COUNT(*) OVER () that result in the input not being sorted at all, and cases where the input is being sorted by some means other than a Sort; if we don't request a small tlist, then WindowAgg's storage requirement is inflated by the unneeded columns. Backpatch back to 9.6, where the current tlist handling was added. (Prior to that, WindowAgg would always use a small tlist.) Discussion: https://postgr.es/m/87a7ator8n.fsf@news-spur.riddles.org.uk
* Avoid logging complaints about abandoned connections when using PAM.Tom Lane2019-11-05
| | | | | | | | | | | | | | | | | | | For a long time (since commit aed378e8d) we have had a policy to log nothing about a connection if the client disconnects when challenged for a password. This is because libpq-using clients will typically do that, and then come back for a new connection attempt once they've collected a password from their user, so that logging the abandoned connection attempt will just result in log spam. However, this did not work well for PAM authentication: the bottom-level function pam_passwd_conv_proc() was on board with it, but we logged messages at higher levels anyway, for lack of any reporting mechanism. Add a flag and tweak the logic so that the case is silent, as it is for other password-using auth mechanisms. Per complaint from Yoann La Cancellera. It's been like this for awhile, so back-patch to all supported branches. Discussion: https://postgr.es/m/CACP=ajbrFFYUrLyJBLV8=q+eNCapa1xDEyvXhMoYrNphs-xqPw@mail.gmail.com
* Fix "unexpected relkind" error when denying permissions on toast tables.Tom Lane2019-11-05
| | | | | | | | | | | | | | | | | | | | get_relkind_objtype, and hence get_object_type, failed when applied to a toast table. This is not a good thing, because it prevents reporting of perfectly legitimate permissions errors. (At present, these functions are in fact *only* used to determine the ObjectType argument for acl_error() calls.) It seems best to have them fall back to returning OBJECT_TABLE in every case where they can't determine an object type for a pg_class entry, so do that. In passing, make some edits to alter.c to make it more obvious that those calls of get_object_type() are used only for error reporting. This might save a few cycles in the non-error code path, too. Back-patch to v11 where this issue originated. John Hsu, Michael Paquier, Tom Lane Discussion: https://postgr.es/m/C652D3DF-2B0C-4128-9420-FB5379F6B1E4@amazon.com
* Generate EquivalenceClass members for partitionwise child join rels.Tom Lane2019-11-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | Commit d25ea0127 got rid of what I thought were entirely unnecessary derived child expressions in EquivalenceClasses for EC members that mention multiple baserels. But it turns out that some of the child expressions that code created are necessary for partitionwise joins, else we fail to find matching pathkeys for Sort nodes. (This happens only for certain shapes of the resulting plan; it may be that partitionwise aggregation is also necessary to show the failure, though I'm not sure of that.) Reverting that commit entirely would be quite painful performance-wise for large partition sets. So instead, add code that explicitly generates child expressions that match only partitionwise child join rels we have actually generated. Per report from Justin Pryzby. (Amit Langote noticed the problem earlier, though it's not clear if he recognized then that it could result in a planner error, not merely failure to exploit partitionwise join, in the code as-committed.) Back-patch to v12 where commit d25ea0127 came in. Amit Langote, with lots of kibitzing from me Discussion: https://postgr.es/m/CA+HiwqG2WVUGmLJqtR0tPFhniO=H=9qQ+Z3L_ZC+Y3-EVQHFGg@mail.gmail.com Discussion: https://postgr.es/m/20191011143703.GN10470@telsasoft.com
* Catch invalid typlens in a couple of placesPeter Eisentraut2019-11-04
| | | | | | | | | | Rearrange the logic in record_image_cmp() and datum_image_eq() to error out on unexpected typlens (either not supported there or completely invalid due to corruption). Barring corruption, this is not possible today but it seems more future-proof and robust to fix this. Reported-by: Peter Geoghegan <pg@bowt.ie>
* Suppress warning from older compilers.Tom Lane2019-11-03
| | | | | | | | Commit 8af1624e3 introduced a warning about possibly returning without a value, on compilers that don't realize that ereport(ERROR) doesn't return. Tweak the code to avoid that. Per buildfarm. Back-patch to 9.6, like the aforesaid commit.
* Validate ispell dictionaries more carefully.Tom Lane2019-11-02
| | | | | | | | | | | | | | | Using incorrect, or just mismatched, dictionary and affix files could result in a crash, due to failure to cross-check offsets obtained from the file. Add necessary validation, as well as some Asserts for future-proofing. Per bug #16050 from Alexander Lakhin. Back-patch to 9.6 where the problem was introduced. Arthur Zakirov, per initial investigation by Tomas Vondra Discussion: https://postgr.es/m/16050-024ae722464ab604@postgresql.org Discussion: https://postgr.es/m/20191013012610.2p2fp3zzpoav7jzf@development
* Fix failure when creating cloned indexes for a partitionMichael Paquier2019-11-02
| | | | | | | | | | | | | | | | | | | When using CREATE TABLE for a new partition, the partitioned indexes of the parent are created automatically in a fashion similar to LIKE INDEXES. The new partition and its parent use a mapping for attribute numbers for this operation, and while the mapping was correctly built, its length was defined as the number of attributes of the newly-created child, and not the parent. If the parent includes dropped columns, this could cause failures. This is wrong since 8b08f7d which has introduced the concept of partitioned indexes, so backpatch down to 11. Reported-by: Wyatt Alt Author: Michael Paquier Reviewed-by: Amit Langote Discussion: https://postgr.es/m/CAGem3qCcRmhbs4jYMkenYNfP2kEusDXvTfw-q+eOhM0zTceG-g@mail.gmail.com Backpatch-through: 11
* Fix race condition at backend exit when deleting element in syncrep queueMichael Paquier2019-11-01
| | | | | | | | | | | | | | | When a backend exits, it gets deleted from the syncrep queue if present. The queue was checked without SyncRepLock taken in exclusive mode, so it would have been possible for a backend to remove itself after a WAL sender already did the job. Fix this issue based on a suggestion from Fujii Masao, by first checking the queue without the lock. Then, if the backend is present in the queue, take the lock and perform an additional lookup check before doing the element deletion. Author: Dongming Liu Reviewed-by: Kyotaro Horiguchi, Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/a0806273-8bbb-43b3-bbe1-c45a58f6ae21.lingce.ldm@alibaba-inc.com Backpatch-through: 9.4
* Fix handling of pg_class.relispartition at swap phase in REINDEX CONCURRENTLYMichael Paquier2019-10-29
| | | | | | | | | | | | | | | | | | | | When cancelling REINDEX CONCURRENTLY after swapping the old and new indexes (for example interruption at step 5), the old index remains around and is marked as invalid. The old index should also be manually droppable to clean up the parent relation from any invalid indexes still remaining. For a partition index reindexed, pg_class.relispartition was not getting updated, causing the index to not be droppable as DROP INDEX would look for dependencies in a partition tree, which do not exist anymore after the swap phase is done. The fix here is simple: when swapping the old and new indexes, make sure that pg_class.relispartition is correctly switched, similarly to what is done for the index name. Reported-by: Justin Pryzby Author: Michael Paquier Discussion: https://postgr.es/m/20191015164047.GA22729@telsasoft.com Backpatch-through: 12
* Handle empty-string edge cases correctly in strpos().Tom Lane2019-10-28
| | | | | | | | | | | | | | Commit 9556aa01c rearranged the innards of text_position() in a way that would make it not work for empty search strings. Which is fine, because all callers of that code special-case an empty pattern in some way. However, the primary use-case (text_position itself) got special-cased incorrectly: historically it's returned 1 not 0 for an empty search string. Restore the historical behavior. Per complaint from Austin Drenski (via Shay Rojansky). Back-patch to v12 where it got broken. Discussion: https://postgr.es/m/CADT4RqAz7oN4vkPir86Kg1_mQBmBxCp-L_=9vRpgSNPJf0KRkw@mail.gmail.com
* Fix dependency handling at swap phase of REINDEX CONCURRENTLYMichael Paquier2019-10-28
| | | | | | | | | | | | | | | | | | | When swapping the dependencies of the old and new indexes, the code has been correctly switching all links in pg_depend from the old to the new index for both referencing and referenced entries. However it forgot the fact that the new index may itself have existing entries in pg_depend, like references to the parent table attributes. This resulted in duplicated entries in pg_depend after running REINDEX CONCURRENTLY. Fix this problem by removing any existing entries in pg_depend for the new index before switching the dependencies of the old index to the new one. More regression tests are added to check the consistency of entries in pg_depend for indexes, including partitions. Author: Michael Paquier Discussion: https://postgr.es/m/20191025064318.GF8671@paquier.xyz Backpatch-through: 12
* Fix initialization of fake LSN for unlogged relationsMichael Paquier2019-10-27
| | | | | | | | | | | | | | 9155580 has changed the value of the first fake LSN for unlogged relations from 1 to FirstNormalUnloggedLSN (aka 1000), GiST requiring a non-zero LSN on some pages to allow an interlocking logic to work, but its value was still initialized to 1 at the beginning of recovery or after running pg_resetwal. This fixes the initialization for both code paths. Author: Takayuki Tsunakawa Reviewed-by: Dilip Kumar, Kyotaro Horiguchi, Michael Paquier Discussion: https://postgr.es/m/OSBPR01MB2503CE851940C17DE44AE3D9FE6F0@OSBPR01MB2503.jpnprd01.prod.outlook.com Backpatch-through: 12
* Handle interrupts within a transaction context in REINDEX CONCURRENTLYMichael Paquier2019-10-25
| | | | | | | | | | | | | | | | | | | | | | | Phases 2 (building the new index) and 3 (validating the new index) checked for interrupts outside a transaction context, having as consequence to not release session-level locks taken on the parent relation and the old and new indexes processed. This could for example be triggered with statement_timeout and a bad timing, and would issue confusing error messages when shutting down the session still holding the locks (note that an assertion failure would be triggered first), on top of more issues with concurrent sessions trying to take a lock that would interfere with the SHARE UPDATE EXCLUSIVE locks hold here. This moves all the interruption checks inside a transaction context. Note that I have manually tested all interruptions to make sure that invalid indexes can be cleaned up properly. Partition indexes still have issues on their own with some missing dependency handling, which will be dealt with in a follow-up patch. Reported-by: Justin Pryzby Author: Michael Paquier Discussion: https://postgr.es/m/20191013025145.GC4475@telsasoft.com Backpatch-through: 12
* Acquire properly session-level lock on new index in REINDEX CONCURRENTLYMichael Paquier2019-10-23
| | | | | | | | | | | | | | | In the first transaction run for REINDEX CONCURRENTLY, a thinko in the existing logic caused two session locks to be taken on the old index, causing the session lock on the newly-created index to be missed. This made possible concurrent DDL commands (like ALTER INDEX) on the new index while REINDEX CONCURRENTLY was processing from the point where the first internal transaction committed. This issue has been discovered while digging into another bug. Author: Michael Paquier Discussion: https://postgr.es/m/20191021074323.GB1869@paquier.xyz Backpatch-through: 12
* Clean up properly error_context_stack in autovacuum worker on exceptionMichael Paquier2019-10-23
| | | | | | | | | | | | | | Any callback set would have no meaning in the context of an exception. As an autovacuum worker exits quickly in this context, this could be only an issue within EmitErrorReport(), where the elog hook is for example called. That's unlikely to going to be a problem, but let's be clean and consistent with other code paths handling exceptions. This is present since 2909419, which introduced autovacuum. Author: Ashwin Agrawal Reviewed-by: Tom Lane, Michael Paquier Discussion: https://postgr.es/m/CALfoeisM+_+dgmAdAOHAu0k-ZpEHHqSSG=GRf3pKJGm8OqWX0w@mail.gmail.com Backpatch-through: 9.4
* Update obsolete comment.Etsuro Fujita2019-10-21
| | | | | | | | | | | Commit b52b7dc25, which moved code creating PartitionBoundInfo in RelationBuildPartitionDesc() in partcache.c (relocated to partdesc.c afterwards) to partbounds.c, should have updated this, but didn't. Author: Etsuro Fujita Reviewed-by: Alvaro Herrera Backpatch-through: 12 Discussion: https://postgr.es/m/CAPmGK16Uxr%3DPatiGyaRwiQVLB7Y-GqbkK3AxRLVYzU0Czv%3DsEw%40mail.gmail.com
* Fix memory leak introduced in commit 7df159a620.Amit Kapila2019-10-21
| | | | | | | | | | | | | We memorize all internal and empty leaf pages in the 1st vacuum stage for gist indexes. They are used in the 2nd stage, to delete all the empty pages. There was a memory context page_set_context for this purpose, but we never used it. Reported-by: Amit Kapila Author: Dilip Kumar Reviewed-by: Amit Kapila Backpatch-through: 12, where it got introduced Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
* Fix failure of archive recovery with recovery_min_apply_delay enabled.Fujii Masao2019-10-18
| | | | | | | | | | | | | | | | | | | | | | | | recovery_min_apply_delay parameter is intended for use with streaming replication deployments. However, the document clearly explains that the parameter will be honored in all cases if it's specified. So it should take effect even if in archive recovery. But, previously, archive recovery with recovery_min_apply_delay enabled always failed, and caused assertion failure if --enable-caasert is enabled. The cause of this problem is that; the ownership of recoveryWakeupLatch that recovery_min_apply_delay uses was taken only when standby mode is requested. So unowned latch could be used in archive recovery, and which caused the failure. This commit changes recovery code so that the ownership of recoveryWakeupLatch is taken even in archive recovery. Which prevents archive recovery with recovery_min_apply_delay from failing. Back-patch to v9.4 where recovery_min_apply_delay was added. Author: Fujii Masao Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CAHGQGwEyD6HdZLfdWc+95g=VQFPR4zQL4n+yHxQgGEGjaSVheQ@mail.gmail.com
* Make crash recovery ignore recovery_min_apply_delay setting.Fujii Masao2019-10-18
| | | | | | | | | | | | | | | | | | | In v11 or before, this setting could not take effect in crash recovery because it's specified in recovery.conf and crash recovery always starts without recovery.conf. But commit 2dedf4d9a8 integrated recovery.conf into postgresql.conf and which unexpectedly allowed this setting to take effect even in crash recovery. This is definitely not good behavior. To fix the issue, this commit makes crash recovery always ignore recovery_min_apply_delay setting. Back-patch to v12 where the issue was added. Author: Fujii Masao Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CAHGQGwEyD6HdZLfdWc+95g=VQFPR4zQL4n+yHxQgGEGjaSVheQ@mail.gmail.com Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net
* Fix typoAlvaro Herrera2019-10-18
| | | | | | | Apparently while this code was being developed, ReindexRelationConcurrently operated on multiple relations. The version that was ultimately pushed doesn't, so this comment's use of plural is inaccurate.
* Update comments about progress reporting by index_dropAlvaro Herrera2019-10-18
| | | | | | | | Michaël Paquier complained that index_drop is requesting progress reporting for non-obvious reasons, so let's add a comment to explain why. Discussion: https://postgr.es/m/20191017010412.GH2602@paquier.xyz
* Fix timeout handling in logical replication workerMichael Paquier2019-10-18
| | | | | | | | | | | | | | | | The timestamp tracking the last moment a message is received in a logical replication worker was initialized in each loop checking if a message was received or not, causing wal_receiver_timeout to be ignored in basically any logical replication deployments. This also broke the ping sent to the server when reaching half of wal_receiver_timeout. This simply moves the initialization of the timestamp out of the apply loop to the beginning of LogicalRepApplyLoop(). Reported-by: Jehan-Guillaume De Rorthais Author: Julien Rouhaud Discussion: https://postgr.es/m/CAOBaU_ZHESFcWva8jLjtZdCLspMj7vqaB2k++rjHLY897ZxbYw@mail.gmail.com Backpatch-through: 10
* Fix minor bug in logical-replication walsender shutdownAlvaro Herrera2019-10-17
| | | | | | | | | | | | | | | | | Logical walsender should exit when it catches up with sending WAL during shutdown; but there was a rare corner case when it failed to because of a race condition that puts it back to wait for more WAL instead -- but since there wasn't any, it'd not shut down immediately. It would only continue the shutdown when wal_sender_timeout terminates the sleep, which causes annoying waits during shutdown procedure. Restructure the code so that we no longer forget to set WalSndCaughtUp in that case. This was an oversight in commit c6c333436. Backpatch all the way down to 9.4. Author: Craig Ringer, Álvaro Herrera Discussion: https://postgr.es/m/CAMsr+YEuz4XwZX_QmnX_-2530XhyAmnK=zCmicEnq1vLr0aZ-g@mail.gmail.com
* When restoring GUCs in parallel workers, show an error context.Thomas Munro2019-10-17
| | | | | | | | | | | Otherwise it can be hard to see where an error is coming from, when the parallel worker sets all the GUCs that it received from the leader. Bug #15726. Back-patch to 9.5, where RestoreGUCState() appeared. Reported-by: Tiago Anastacio Reviewed-by: Daniel Gustafsson, Tom Lane Discussion: https://postgr.es/m/15726-6d67e4fa14f027b3%40postgresql.org
* Fix bug that could try to freeze running multixacts.Thomas Munro2019-10-17
| | | | | | | | | | | | | Commits 801c2dc7 and 801c2dc7 made it possible for vacuum to try to freeze a multixact that is still running. That was prevented by a check, but raised an error. Repair. Back-patch all the way. Author: Nathan Bossart, Jeremy Schneider Reported-by: Jeremy Schneider Reviewed-by: Jim Nasby, Thomas Munro Discussion: https://postgr.es/m/DAFB8AFF-2F05-4E33-AD7F-FF8B0F760C17%40amazon.com
* Fix crash when reporting CREATE INDEX progressAlvaro Herrera2019-10-16
| | | | | | | | | | | | A race condition can make us try to dereference a NULL pointer to the PGPROC struct of a process that's already finished. That results in crashes during REINDEX CONCURRENTLY and CREATE INDEX CONCURRENTLY. This was introduced in ab0dfc961b6a, so backpatch to pg12. Reported by: Justin Pryzby Reviewed-by: Michaël Paquier Discussion: https://postgr.es/m/20191012004446.GT10470@telsasoft.com
* Fix CLUSTER on expression indexes.Andres Freund2019-10-15
| | | | | | | | | | | | | | | | Since the introduction of different slot types, in 1a0586de3657, we create a virtual slot in tuplesort_begin_cluster(). While that looks right, it unfortunately doesn't actually work, as ExecStoreHeapTuple() is used to store tuples in the slot. Unfortunately no regression tests for CLUSTER on expression indexes existed so far. Fix the slot type, and add bare bones tests for CLUSTER on expression indexes. Reported-By: Justin Pryzby Author: Andres Freund Discussion: https://postgr.es/m/20191011210320.GS10470@telsasoft.com Backpatch: 12, like 1a0586de3657
* Fix dependency handling of column drop with partitioned tablesMichael Paquier2019-10-13
| | | | | | | | | | | | | | | | | | | | | | When dropping a column on a partitioned table which has one or more partitioned indexes, the operation was failing as dependencies with partitioned indexes using the column dropped were not getting removed in a way consistent with the columns involved across all the relations part of an inheritance tree. This commit refactors the code executing column drop so as all the columns from an inheritance tree to remove are gathered first, and dropped all at the end. This way, we let the dependency machinery sort out by itself the deletion of all the columns with the partitioned indexes across a partition tree. This issue has been introduced by 1d92a0c, so backpatch down to REL_12_STABLE. Author: Amit Langote, Michael Paquier Reviewed-by: Álvaro Herrera, Ashutosh Sharma Discussion: https://postgr.es/m/CA+HiwqE9kuBsZ3b5pob2-cvE8ofzPWs-og+g8bKKGnu6b4-yTQ@mail.gmail.com Backpatch-through: 12
* Make crash recovery ignore restore_command and recovery_end_command settings.Fujii Masao2019-10-11
| | | | | | | | | | | | | | | | | | In v11 or before, those settings could not take effect in crash recovery because they are specified in recovery.conf and crash recovery always starts without recovery.conf. But commit 2dedf4d9a8 integrated recovery.conf into postgresql.conf and which unexpectedly allowed those settings to take effect even in crash recovery. This is definitely not good behavior. To fix the issue, this commit makes crash recovery always ignore restore_command and recovery_end_command settings. Back-patch to v12 where the issue was added. Author: Fujii Masao Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net
* Fix table rewrites that include a column without a default.Andres Freund2019-10-09
| | | | | | | | | | | | | | In c2fe139c201c I made ATRewriteTable() use tuple slots. Unfortunately I did not notice that columns can be added in a rewrite that do not have a default, when another column is added/altered requiring one. Initialize columns to NULL again, and add tests. Bug: #16038 Reported-By: anonymous Author: Andres Freund Discussion: https://postgr.es/m/16038-5c974541f2bf6749@postgresql.org Backpatch: 12, where the bug was introduced in c2fe139c201c
* Flush logical mapping files with fd opened for read/write at checkpointMichael Paquier2019-10-09
| | | | | | | | | | | | | | | The file descriptor was opened with read-only to fsync a regular file, which would cause EBADFD errors on some platforms. This is similar to the recent fix done by a586cc4b (which was broken by me with 82a5649), except that I noticed this issue while monitoring the backend code for similar mistakes. Backpatch to 9.4, as this has been introduced since logical decoding exists as of b89e151. Author: Michael Paquier Reviewed-by: Andres Freund Discussion: https://postgr.es/m/20191006045548.GA14532@paquier.xyz Backpatch-through: 9.4
* Check for too many postmaster children before spawning a bgworker.Tom Lane2019-10-07
| | | | | | | | | | | | | | | | | | | | The postmaster's code path for spawning a bgworker neglected to check whether we already have the max number of live child processes. That's a bit hard to hit, since it would necessarily be a transient condition; but if we do, AssignPostmasterChildSlot() fails causing a postmaster crash, as seen in a report from Bhargav Kamineni. To fix, invoke canAcceptConnections() in the bgworker code path, as we do in the other code paths that spawn children. Since we don't want the same pmState tests in this case, add a child-process-type parameter to canAcceptConnections() so that it can know what to do. Back-patch to 9.5. In principle the same hazard exists in 9.4, but the code is enough different that this patch wouldn't quite fix it there. Given the tiny usage of bgworkers in that branch it doesn't seem worth creating a variant patch for it. Discussion: https://postgr.es/m/18733.1570382257@sss.pgh.pa.us
* Fix crash caused by EPQ happening with a before update trigger present.Andres Freund2019-10-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When ExecBRUpdateTriggers()'s GetTupleForTrigger() follows an EPQ chain the former needs to run the result tuple through the junkfilter again, and update the slot containing the new version of the tuple to contain that new version. The input tuple may already be in the junkfilter's output slot, which used to be OK - we don't need the previous version anymore. Unfortunately ff11e7f4b9ae started to use ExecCopySlot() to update newslot, and ExecCopySlot() doesn't support copying a slot into itself, leading to a slot in a corrupt state, which then can cause crashes or other symptoms. Fix this by skipping the ExecCopySlot() when copying into itself. While we could have easily made ExecCopySlot() handle that case, it seems better to add an assert forbidding doing so instead. As the goal of copying might be to make the contents of one slot independent from another, it seems failure prone to handle doing so silently. A follow-up commit will add tests for the obviously under-covered combination of EPQ and triggers. Done as a separate commit as it might make sense to backpatch them further than this bug. Also remove confusion with confusing variable names for slots in ExecBRDeleteTriggers() and ExecBRUpdateTriggers(). Bug: #16036 Reported-By: Антон Власов Author: Andres Freund Discussion: https://postgr.es/m/16036-28184c90d952fb7f@postgresql.org Backpatch: 12-, where ff11e7f4b9ae was merged
* Use a fd opened for read/write when syncing slots during startup, take 2.Andres Freund2019-10-04
| | | | | | | | | | | | | | | | | | | | | | | Cribbing from dfbaed45975: Some operating systems, including the reporter's windows, return EBADFD or similar when fsync() is invoked on a O_RDONLY file descriptor. Unfortunately RestoreSlotFromDisk() does exactly that; which causes failures after restarts in at least some scenarios. If you hit the bug the error message will be something like ERROR: could not fsync file "pg_replslot/$name/state": Bad file descriptor Simply use O_RDWR instead of O_RDONLY when opening the relevant file descriptor to fix the bug. Unfortunately this fix was undone in 82a5649fb9db. Re-apply, and add a comment. Bug: 16039 Reported-By: Hans Buschmann Author: Andres Freund Discussion: https://postgr.es/m/16039-196fc97cc05e141c@postgresql.org Backpatch: 12-, as 82a5649fb9db
* Fix bitshiftright()'s zero-padding some more.Tom Lane2019-10-04
| | | | | | | | | | | | | | | Commit 5ac0d9360 failed to entirely fix bitshiftright's habit of leaving one-bits in the pad space that should be all zeroes, because in a moment of sheer brain fade I'd concluded that only the code path used for not-a-multiple-of-8 shift distances needed to be fixed. Of course, a multiple-of-8 shift distance can also cause the problem, so we need to forcibly zero the extra bits in both cases. Per bug #16037 from Alexander Lakhin. As before, back-patch to all supported branches. Discussion: https://postgr.es/m/16037-1d1ebca564db54f4@postgresql.org
* Avoid unnecessary out-of-memory errors during encoding conversion.Tom Lane2019-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Encoding conversion uses the very simplistic rule that the output can't be more than 4X longer than the input, and palloc's a buffer of that size. This results in failure to convert any string longer than 1/4 GB, which is becoming an annoying limitation. As a band-aid to improve matters, allow the allocated output buffer size to exceed 1GB. We still insist that the final result fit into MaxAllocSize (1GB), though. Perhaps it'd be safe to relax that restriction, but it'd require close analysis of all callers, which is daunting (not least because external modules might call these functions). For the moment, this should allow a 2X to 4X improvement in the longest string we can convert, which is a useful gain in return for quite a simple patch. Also, once we have successfully converted a long string, repalloc the output down to the actual string length, returning the excess to the malloc pool. This seems worth doing since we can usually expect to give back several MB if we take this path at all. This still leaves much to be desired, most notably that the assumption that MAX_CONVERSION_GROWTH == 4 is very fragile, and yet we have no guard code verifying that the output buffer isn't overrun. Fixing that would require significant changes in the encoding conversion APIs, so it'll have to wait for some other day. The present patch seems safely back-patchable, so patch all supported branches. Alvaro Herrera and Tom Lane Discussion: https://postgr.es/m/20190816181418.GA898@alvherre.pgsql Discussion: https://postgr.es/m/3614.1569359690@sss.pgh.pa.us
* Allow repalloc() to give back space when a large chunk is downsized.Tom Lane2019-10-03
| | | | | | | | | | | | | | | | | | Up to now, if you resized a large (>8K) palloc chunk down to a smaller size, aset.c made no attempt to return any space to the malloc pool. That's unpleasant if a really large allocation is resized to a significantly smaller size. I think no such cases existed when this code was designed, and I'm not sure whether they're common even yet, but an upcoming fix to encoding conversion will certainly create such cases. Therefore, fix AllocSetRealloc so that it gives realloc() a chance to do something with the block. This doesn't noticeably increase complexity, we mostly just have to change the order in which the cases are considered. Back-patch to all supported branches. Discussion: https://postgr.es/m/20190816181418.GA898@alvherre.pgsql Discussion: https://postgr.es/m/3614.1569359690@sss.pgh.pa.us
* Selectively include window frames in expression walks/mutates.Andrew Gierth2019-10-03
| | | | | | | | | | | | | | | | | | | | query_tree_walker and query_tree_mutator were skipping the windowClause of the query, without regard for the fact that the startOffset and endOffset in a WindowClause node are expression trees that need to be processed. This was an oversight in commit ec4be2ee6 from 2010 which added the expression fields; the main symptom is that function parameters in window frame clauses don't work in inlined functions. Fix (as conservatively as possible since this needs to not break existing out-of-tree callers) and add tests. Backpatch all the way, since this has been broken since 9.0. Per report from Alastair McKinley; fix by me with kibitzing and review from Tom Lane. Discussion: https://postgr.es/m/DB6PR0202MB2904E7FDDA9D81504D1E8C68E3800@DB6PR0202MB2904.eurprd02.prod.outlook.com
* Remove temporary WAL and history files at the end of archive recoveryMichael Paquier2019-10-02
| | | | | | | | | | | | | | | cbc55da has reworked the order of some actions at the end of archive recovery. Unfortunately this overlooked the fact that the startup process needs to remove RECOVERYXLOG (for temporary WAL segment newly recovered from archives) and RECOVERYHISTORY (for temporary history file) at this step, leaving the files around even after recovery ended. Backpatch to 9.5, like the previous commit. Author: Sawada Masahiko Reviewed-by: Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/CAD21AoBO_eDQub6zojFnWtnmutRBWvYf7=cW4Hsqj+U_R26w3Q@mail.gmail.com Backpatch-through: 9.5
* Make crash recovery ignore recovery target settings.Fujii Masao2019-09-30
| | | | | | | | | | | | | | | | | | In v11 or before, recovery target settings could not take effect in crash recovery because they are specified in recovery.conf and crash recovery always starts without recovery.conf. But commit 2dedf4d9a8 integrated recovery.conf into postgresql.conf and which unexpectedly allowed recovery target settings to take effect even in crash recovery. This is definitely not good behavior. To fix the issue, this commit makes crash recovery always ignore recovery target settings. Back-patch to v12. Author: Peter Eisentraut Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net