path: root/src/include
* windows: Define WIN32_LEAN_AND_MEAN to make compilation faster.  (Andres Freund, 2021-10-04)

  windows.h includes a lot of other headers, slowing down compilation significantly. WIN32_LEAN_AND_MEAN reduces that a bit. It'd be better to remove the include of windows.h (as well as indirect inclusions of it) from such a central place, but until then...

  Discussion: https://postgr.es/m/20210921193035.pqzay43vpyv7in43@alap3.anarazel.de

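  The mechanism is standard Win32 practice: the macro has to be defined before windows.h is pulled in so that the slimmed-down set of headers is used. A minimal standalone sketch of that ordering (illustrative only, not code from the tree):

      /* Trim windows.h before including it; core kernel32 APIs remain visible. */
      #define WIN32_LEAN_AND_MEAN
      #include <windows.h>

      int
      main(void)
      {
          Sleep(0);               /* basic Win32 calls still compile and link */
          return 0;
      }
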
* Fix duplicate words in comments  (Daniel Gustafsson, 2021-10-04)

  Remove accidentally duplicated words in code comments.

  Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
  Discussion: https://postgr.es/m/87bl45t0co.fsf@wibble.ilmari.org

* Fix checking of query type in plpgsql's RETURN QUERY command.  (Tom Lane, 2021-10-03)

  Prior to v14, we insisted that the query in RETURN QUERY be of a type that returns tuples. (For instance, INSERT RETURNING was allowed, but not plain INSERT.) That happened indirectly because we opened a cursor for the query, so spi.c checked SPI_is_cursor_plan(). As a consequence, the error message wasn't terribly on-point, but at least it was there.

  Commit 2f48ede08 lost this detail. Instead, plain RETURN QUERY insisted that the query be a SELECT (by checking for SPI_OK_SELECT) while RETURN QUERY EXECUTE failed to check the query type at all. Neither of these changes was intended.

  The only convenient place to check this in the EXECUTE case is inside _SPI_execute_plan, because we haven't done parse analysis until then. So we need to pass down a flag saying whether to enforce that the query returns tuples. Fortunately, we can squeeze another boolean into struct SPIExecuteOptions without an ABI break, since there's padding space there. (It's unlikely that any extensions would already be using this new struct, but preserving ABI in v14 seems like a smart idea anyway.)

  Within spi.c, it seemed like _SPI_execute_plan's parameter list was already ridiculously long, and I didn't want to make it longer. So I thought of passing SPIExecuteOptions down as-is, allowing that parameter list to become much shorter. This makes the patch a bit more invasive than it might otherwise be, but it's all internal to spi.c, so that seems fine.

  Per report from Marc Bachmann. Back-patch to v14 where the faulty code came in.

  Discussion: https://postgr.es/m/1F2F75F0-27DF-406F-848D-8B50C7EEF06A@gmail.com

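  A hedged sketch of how extension code might request the new behavior through SPI_execute_extended(); the flag name must_return_tuples is an assumption based on the description above, and the caller is assumed to have already done SPI_connect():

      #include "postgres.h"
      #include "executor/spi.h"

      /* Run a query and insist that it returns tuples (e.g. reject plain INSERT). */
      static void
      run_query_expecting_rows(const char *querytext)
      {
          SPIExecuteOptions options;

          memset(&options, 0, sizeof(options));
          options.read_only = false;
          options.must_return_tuples = true;    /* assumed name of the new boolean */

          if (SPI_execute_extended(querytext, &options) < 0)
              elog(ERROR, "SPI_execute_extended failed");
      }
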
* Fix Portal snapshot tracking to handle subtransactions properly.  (Tom Lane, 2021-10-01)

  Commit 84f5c2908 forgot to consider the possibility that EnsurePortalSnapshotExists could run inside a subtransaction with lifespan shorter than the Portal's. In that case, the new active snapshot would be popped at the end of the subtransaction, leaving a dangling pointer in the Portal, with mayhem ensuing.

  To fix, make sure the ActiveSnapshot stack entry is marked with the same subtransaction nesting level as the associated Portal. It's certainly safe to do so since we won't be here at all unless the stack is empty; hence we can't create an out-of-order stack.

  Let's also apply this logic in the case where PortalRunUtility sets portalSnapshot, just to be sure that path can't cause similar problems. It's slightly less clear that that path can't create an out-of-order stack, so add an assertion guarding it.

  Report and patch by Bertrand Drouvot (with kibitzing by me). Back-patch to v11, like the previous commit.

  Discussion: https://postgr.es/m/ff82b8c5-77f4-3fe7-6028-fcf3303e82dd@amazon.com

* Ensure interleaved_parts field is always initialized  (David Rowley, 2021-10-01)

  This field was recently added in db632fbca; however, that commit missed one place where it should have initialized the new field to NULL. The missed location is where the PartitionBoundInfo is created for partition-wise join relations. Technically there could be interleaved partitions in a partition-wise join relation, but currently the only optimization that uses this field does so only for base rels and other member rels. So just document that we don't populate this field for join rels.

  Reported-by: Amit Langote
  Author: Amit Langote, David Rowley
  Reviewed-by: Amit Langote, David Rowley
  Discussion: https://postgr.es/m/CA+HiwqE76Rps24kwHsd2Cr82Ua07tJC9t9reG0c7ScX9n_xrEA@mail.gmail.com

* Treat ETIMEDOUT as indicating a non-recoverable connection failure.  (Tom Lane, 2021-09-30)

  Add ETIMEDOUT to ALL_CONNECTION_FAILURE_ERRNOS' list of "errnos that identify hard failure of a previously-established network connection". While one could imagine that this is sometimes recoverable, the same could be said of other entries such as ENETDOWN.

  In support of this, handle ETIMEDOUT on par with other socket errors in relevant infrastructure, such as TranslateSocketError(). (I made a couple of cosmetic adjustments in TranslateSocketError(), too.) The code now assumes that ETIMEDOUT is defined everywhere, which it should be given that POSIX has required it since SUSv2.

  Perhaps this should be back-patched, but I'm hesitant to do so given the lack of previous complaints, and the hazard that there's a small ABI break on Windows from redefining the symbol. Even if we decide to do that, it'd be prudent to let this bake awhile in HEAD first.

  Jelte Fennema

  Discussion: https://postgr.es/m/AM5PR83MB01782BFF2978505F6D6C559AF7AA9@AM5PR83MB0178.EURPRD83.prod.outlook.com

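  A standalone sketch of the classification idea (not the actual ALL_CONNECTION_FAILURE_ERRNOS macro): ETIMEDOUT is simply treated like the other errnos that mean the connection is gone for good.

      #include <errno.h>
      #include <stdbool.h>
      #include <stdio.h>

      /* Does this errno indicate hard failure of an established connection? */
      static bool
      errno_indicates_connection_loss(int e)
      {
          switch (e)
          {
              case ECONNABORTED:
              case ECONNRESET:
              case EHOSTUNREACH:
              case ENETDOWN:
              case ENETRESET:
              case ENETUNREACH:
              case EPIPE:
              case ETIMEDOUT:     /* now counted among the hard failures */
                  return true;
              default:
                  return false;
          }
      }

      int
      main(void)
      {
          printf("ETIMEDOUT -> %d\n", errno_indicates_connection_loss(ETIMEDOUT));
          return 0;
      }
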
* Fix WAL replay in presence of an incomplete record  (Alvaro Herrera, 2021-09-29)

  Physical replication always ships WAL segment files to replicas once they are complete. This is a problem if one WAL record is split across a segment boundary and the primary server crashes before writing down the segment with the next portion of the WAL record: WAL writing after crash recovery would happily resume at the point where the broken record started, overwriting that record ... but any standby or backup may have already received a copy of that segment, and they are not rewinding. This causes standbys to stop following the primary after the latter crashes:

      LOG:  invalid contrecord length 7262 at A8/D9FFFBC8

  because the standby is still trying to read the continuation record (contrecord) for the original long WAL record, but it is not there and it will never be. A workaround is to stop the replica, delete the WAL file, and restart it -- at which point a fresh copy is brought over from the primary. But that's pretty labor intensive, and I bet many users would just give up and re-clone the standby instead.

  A fix for this problem was already attempted in commit 515e3d84a0b5, but it only addressed the case for the scenario of WAL archiving, so streaming replication would still be a problem (as well as other things such as taking a filesystem-level backup while the server is down after having crashed), and it had performance scalability problems too; so it had to be reverted.

  This commit fixes the problem using an approach suggested by Andres Freund, whereby the initial portion(s) of the split-up WAL record are kept, and a special type of WAL record is written where the contrecord was lost, so that WAL replay in the replica knows to skip the broken parts. With this approach, we can continue to stream/archive segment files as soon as they are complete, and replay of the broken records will proceed across the crash point without a hitch.

  Because a new type of WAL record is added, users should be careful to upgrade standbys first, primaries later. Otherwise they risk the standby being unable to start if the primary happens to write such a record.

  A new TAP test that exercises this is added, but the portability of it is yet to be seen.

  This has been wrong since the introduction of physical replication, so backpatch all the way back. In stable branches, keep the new XLogReaderState members at the end of the struct, to avoid an ABI break.

  Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
  Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
  Reviewed-by: Nathan Bossart <bossartn@amazon.com>
  Discussion: https://postgr.es/m/202108232252.dh7uxf6oxwcy@alvherre.pgsql

* Fix incorrect format placeholder  (Peter Eisentraut, 2021-09-29)

* Split macros from visibilitymap.h into a separate header  (Alexander Korotkov, 2021-09-23)

  That allows including just visibilitymapdefs.h from file.c and, in turn, removing the include of postgres.h from relcache.h.

  Reported-by: Andres Freund
  Discussion: https://postgr.es/m/20210913232614.czafiubr435l6egi%40alap3.anarazel.de
  Author: Alexander Korotkov
  Reviewed-by: Andres Freund, Tom Lane, Alvaro Herrera
  Backpatch-through: 13

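  A small sketch of the kind of frontend code the split enables: only the bit definitions are needed, so postgres.h never enters the picture (illustrative; the helper below is hypothetical).

      #include "postgres_fe.h"
      #include "access/visibilitymapdefs.h"

      /* Hypothetical helper: interpret visibility-map bits for one heap page. */
      static bool
      page_is_all_frozen(uint8 vmbits)
      {
          return (vmbits & VISIBILITYMAP_ALL_FROZEN) != 0;
      }
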
* Invalidate all partitions for a partitioned table in publication.  (Amit Kapila, 2021-09-22)

  Updates/Deletes on a partition were allowed even without replica identity after the parent table was added to a publication. This would later lead to an error on subscribers. The reason was that we were not invalidating the partition's relcache and the publication information for partitions was not getting rebuilt. Similarly, we were not invalidating the partitions' relcache after dropping a partitioned table from a publication, which will prohibit Updates/Deletes on its partition without replica identity even without any publication.

  Reported-by: Haiying Tang
  Author: Hou Zhijie and Vignesh C
  Reviewed-by: Vignesh C and Amit Kapila
  Backpatch-through: 13
  Discussion: https://postgr.es/m/OS0PR01MB6113D77F583C922F1CEAA1C3FBD29@OS0PR01MB6113.jpnprd01.prod.outlook.com

* Fix "single value strategy" index deletion issue.Peter Geoghegan2021-09-21
| | | | | | | | | | | | | | | | | | | | | | | | It is not appropriate for deduplication to apply single value strategy when triggered by a bottom-up index deletion pass. This wastes cycles because later bottom-up deletion passes will overinterpret older duplicate tuples that deduplication actually just skipped over "by design". It also makes bottom-up deletion much less effective for low cardinality indexes that happen to cross a meaningless "index has single key value per leaf page" threshold. To fix, slightly narrow the conditions under which deduplication's single value strategy is considered. We already avoided the strategy for a unique index, since our high level goal must just be to buy time for VACUUM to run (not to buy space). We'll now also avoid it when we just had a bottom-up pass that reported failure. The two cases share the same high level goal, and already overlapped significantly, so this approach is quite natural. Oversight in commit d168b666, which added bottom-up index deletion. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-WznaOvM+Gyj-JQ0X=JxoMDxctDTYjiEuETdAGbF5EUc3MA@mail.gmail.com Backpatch: 14-, where bottom-up deletion was introduced.
* Document XLOG_INCLUDE_XID a little better  (Alvaro Herrera, 2021-09-21)

  I noticed that commit 0bead9af484c left this flag undocumented in XLogSetRecordFlags, which led me to discover that the flag doesn't actually do what the one comment on it said it does. Improve the situation by adding some more comments.

  Backpatch to 14, where the aforementioned commit appears.

  Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
  Discussion: https://postgr.es/m/202109212119.c3nhfp64t2ql@alvherre.pgsql

* Introduce GUC shared_memory_size_in_huge_pages  (Michael Paquier, 2021-09-21)

  This runtime-computed GUC shows the number of huge pages required for the server's main shared memory area, taking advantage of the work done in 0c39c29 and 0bd305e. This is useful for users to estimate the number of huge pages required for a server, as it becomes possible to do the estimation without having to start the server and potentially allocate a large chunk of shared memory.

  The number of huge pages is calculated based on the existing GUC huge_page_size if set, or by using the system's default by looking at /proc/meminfo on Linux. There is nothing new here as this commit reuses the existing calculation methods, and just exposes this information directly to the user. The routine calculating the huge page size is refactored to limit the number of files with platform-specific flags.

  This new GUC's name was the most popular choice based on the discussion done. This is only supported on Linux.

  I have taken the time to test the change on Linux, Windows and MacOS, though large pages are not supported on the latter two. On Linux, the number of pages is calculated correctly depending on the existing GUC huge_page_size or the system's default.

  Thanks to Andres Freund, Robert Haas, Kyotaro Horiguchi, Tom Lane, Justin Pryzby (and anybody forgotten here) for the discussion.

  Author: Nathan Bossart
  Discussion: https://postgr.es/m/F2772387-CE0F-46BF-B5F1-CC55516EB885@amazon.com

* pgstat: Prepare to use mechanism for truncated rels also for dropped rels.  (Andres Freund, 2021-09-20)

  The upcoming shared memory stats patch drops stats for dropped objects in a transactional manner, rather than removing them later as part of vacuum. This means that stats for DROP inside a transaction need to handle aborted (sub-)transactions similar to TRUNCATE: the stats up to the DROP should be restored.

  Rename the existing infrastructure in preparation.

  Author: Andres Freund <andres@anarazel.de>
  Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de

* Doc: add glossary term for "auxiliary process"  (Alvaro Herrera, 2021-09-20)

  Add entries for existing processes not documented, too, and adjust existing definitions for consistency.

  Per question from Bharath Rupireddy.

  Author: Justin Pryzby <pryzby@telsasoft.com>
  Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
  Discussion: https://postgr.es/m/CALj2ACVpYCT0M+k8zqrAa4ZQZV+ce5s6G=yajwoS1m=h-jj8NQ@mail.gmail.com

* process startup: Split single user code out of PostgresMain().  (Andres Freund, 2021-09-17)

  It was harder than necessary to understand PostgresMain() because the code for a normal backend was interspersed with single-user mode specific code. Split most of the single-user mode code into its own function PostgresSingleUserMain(), that does all the necessary setup for single-user mode, and then hands off after that to PostgresMain().

  There still is some single-user mode code in InitPostgres(), and it'd likely be worth moving at least some of it out. But that's for later.

  Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
  Author: Andres Freund <andres@anarazel.de>
  Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de

* Fix performance regression from session statistics.  (Andres Freund, 2021-09-16)

  Session statistics, as introduced by 960869da08, had several shortcomings:

  - an additional GetCurrentTimestamp() call that also impaired the accuracy of the data collected

    This can be avoided by passing the current timestamp we already have in pgstat_report_stat().

  - an additional statistics UDP packet sent every 500ms

    This is solved by adding the new statistics to PgStat_MsgTabstat. This is conceptually ugly, because session statistics are not table statistics. But the struct already contains data unrelated to tables, so there is not much damage done.

    Connection and disconnection are reported in separate messages, which reduces the number of additional messages to two messages per session and a slight increase in PgStat_MsgTabstat size (but the same number of table stats fit).

  - Session time computation could overflow on systems where long is 32 bit.

  Reported-By: Andres Freund <andres@anarazel.de>
  Author: Andres Freund <andres@anarazel.de>
  Author: Laurenz Albe <laurenz.albe@cybertec.at>
  Discussion: https://postgr.es/m/20210801205501.nyxzxoelqoo4x2qc%40alap3.anarazel.de
  Backpatch: 14-, where the feature was introduced.

* Support "postgres -C" with runtime-computed GUCsMichael Paquier2021-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Until now, the -C option of postgres was handled before a small subset of GUCs computed at runtime are initialized, leading to incorrect results as GUC machinery would fall back to default values for such parameters. For example, data_checksums could report "off" for a cluster as the control file is not loaded yet. Or wal_segment_size would show a segment size at 16MB even if initdb --wal-segsize used something else. Worse, the command would fail to properly report the recently-introduced shared_memory, that requires to load shared_preload_libraries as these could ask for a chunk of shared memory. Support for runtime GUCs comes with a limitation, as the operation is now allowed on a running server. One notable reason for this is that _PG_init() functions of loadable libraries are called before all runtime-computed GUCs are initialized, and this is not guaranteed to be safe to do on running servers. For the case of shared_memory_size, where we want to know how much memory would be used without allocating it, this limitation is fine. Another case where this will help is for huge pages, with the introduction of a different GUC to evaluate the amount of huge pages required for a server before starting it, without having to allocate large chunks of memory. This feature is controlled with a new GUC flag, and four parameters are classified as runtime-computed as of this change: - data_checksums - shared_memory_size - data_directory_mode - wal_segment_size Some TAP tests are added to provide some coverage here, using data_checksums in the tests of pg_checksums. Per discussion with Andres Freund, Justin Pryzby, Magnus Hagander and more. Author: Nathan Bossart Discussion: https://postgr.es/m/F2772387-CE0F-46BF-B5F1-CC55516EB885@amazon.com
* Remove arbitrary 64K-or-so limit on rangetable size.  (Tom Lane, 2021-09-15)

  Up to now the size of a query's rangetable has been limited by the constants INNER_VAR et al, which mustn't be equal to any real rangetable index. 65000 doubtless seemed like enough for anybody, and it still is orders of magnitude larger than the number of joins we can realistically handle. However, we need a rangetable entry for each child partition that is (or might be) processed by a query. Queries with a few thousand partitions are getting more realistic, so that the day when that limit becomes a problem is in sight, even if it's not here yet. Hence, let's raise the limit.

  Rather than just increase the values of INNER_VAR et al, this patch adopts the approach of making them small negative values, so that rangetables could theoretically become as long as INT_MAX.

  The bulk of the patch is concerned with changing Var.varno and some related variables from "Index" (unsigned int) to plain "int". This is basically cosmetic, with little actual effect other than to help debuggers print their values nicely. As such, I've only bothered with changing places that could actually see INNER_VAR et al, which the parser and most of the planner don't. We do have to be careful in places that are performing less/greater comparisons on varnos, but there are very few such places, other than the IS_SPECIAL_VARNO macro itself.

  A notable side effect of this patch is that while it used to be possible to add INNER_VAR et al to a Bitmapset, that will now draw an error. I don't see any likelihood that it wouldn't be a bug to include these fake varnos in a bitmapset of real varnos, so I think this is all to the good.

  Although this touches outfuncs/readfuncs, I don't think a catversion bump is required, since stored rules would never contain Vars with these fake varnos.

  Andrey Lepikhov and Tom Lane, after a suggestion by Peter Eisentraut

  Discussion: https://postgr.es/m/43c7f2f5-1e27-27aa-8c65-c91859d15190@postgrespro.ru

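  A standalone sketch of the new numbering scheme; the concrete values below are assumptions for illustration. The point is that special varnos live below zero, so every non-negative int stays available for real rangetable indexes.

      #include <stdio.h>

      #define INNER_VAR   (-1)    /* assumed illustrative values */
      #define OUTER_VAR   (-2)
      #define INDEX_VAR   (-3)

      #define IS_SPECIAL_VARNO(varno)  ((int) (varno) < 0)

      int
      main(void)
      {
          int varno = 70000;      /* would have collided with the old ~65000 limit */

          printf("%d special? %d\n", varno, IS_SPECIAL_VARNO(varno));
          printf("INNER_VAR special? %d\n", IS_SPECIAL_VARNO(INNER_VAR));
          return 0;
      }
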
* Add Cardinality typedef  (Peter Eisentraut, 2021-09-15)

  Similar to Cost and Selectivity, this is just a double, which can be used in path and plan nodes to give some hint about the meaning of a field.

  Discussion: https://www.postgresql.org/message-id/c091e5cd-45f8-69ee-6a9b-de86912cc7e7@enterprisedb.com

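  The addition is essentially a one-line typedef; the struct below is a hypothetical fragment showing the documentation value it buys in node definitions.

      typedef double Cost;            /* pre-existing */
      typedef double Selectivity;     /* pre-existing */
      typedef double Cardinality;     /* new: an expected row count */

      /* Hypothetical node fragment: the field's meaning is now self-describing. */
      typedef struct ExamplePathInfo
      {
          Cardinality rows;           /* estimated number of result rows */
          Cost        total_cost;     /* estimated execution cost */
      } ExamplePathInfo;
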
* Update Unicode data to Unicode 14.0.0  (Peter Eisentraut, 2021-09-15)

* Send NOTIFY signals during CommitTransaction.  (Tom Lane, 2021-09-14)

  Formerly, we sent signals for outgoing NOTIFY messages within ProcessCompletedNotifies, which was also responsible for sending relevant ones of those messages to our connected client. It therefore had to run during the main-loop processing that occurs just before going idle. This arrangement had two big disadvantages:

  * Now that procedures allow intra-command COMMITs, it would be useful to send NOTIFYs to other sessions immediately at COMMIT (though, for reasons of wire-protocol stability, we still shouldn't forward them to our client until end of command).

  * Background processes such as replication workers would not send NOTIFYs at all, since they never execute the client communication loop. We've had requests to allow triggers running in replication workers to send NOTIFYs, so that's a problem.

  To fix these things, move transmission of outgoing NOTIFY signals into AtCommit_Notify, where it will happen during CommitTransaction. Also move the possible call of asyncQueueAdvanceTail there, to ensure we don't bloat the async SLRU if a background worker sends many NOTIFYs with no one listening.

  We can also drop the call of asyncQueueReadAllNotifications, allowing ProcessCompletedNotifies to go away entirely. That's because commit 790026972 added a call of ProcessNotifyInterrupt adjacent to PostgresMain's call of ProcessCompletedNotifies, and that does its own call of asyncQueueReadAllNotifications, meaning that we were uselessly doing two such calls (inside two separate transactions) whenever inbound notify signals coincided with an outbound notify. We need only set notifyInterruptPending to ensure that ProcessNotifyInterrupt runs, and we're done.

  The existing documentation suggests that custom background workers should call ProcessCompletedNotifies if they want to send NOTIFY messages. To avoid an ABI break in the back branches, reduce it to an empty routine rather than removing it entirely. Removal will occur in v15.

  Although the problems mentioned above have existed for awhile, I don't feel comfortable back-patching this any further than v13. There was quite a bit of churn in adjacent code between 12 and 13. At minimum we'd have to also backpatch 51004c717, and a good deal of other adjustment would also be needed, so the benefit-to-risk ratio doesn't look attractive.

  Per bug #15293 from Michael Powers (and similar gripes from others).

  Artur Zakirov and Tom Lane

  Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org

* Fix planner error with multiple copies of an AlternativeSubPlan.  (Tom Lane, 2021-09-14)

  It's possible for us to copy an AlternativeSubPlan expression node into multiple places, for example the scan quals of several partition children. Then it's possible that we choose a different one of the alternatives as optimal in each place. Commit 41efb8340 failed to consider this scenario, so its attempt to remove "unused" subplans could remove subplans that were still used elsewhere.

  Fix by delaying the removal logic until we've examined all the AlternativeSubPlans in a given query level. (This does assume that AlternativeSubPlans couldn't get copied to other query levels, but for the foreseeable future that's fine; cf qual_is_pushdown_safe.)

  Per report from Rajkumar Raghuwanshi. Back-patch to v14 where the faulty logic came in.

  Discussion: https://postgr.es/m/CAKcux6==O3NNZC3bZ2prRYv3cjm3_Zw1GfzmOjEVqYN4jub2+Q@mail.gmail.com

* Remove T_Expr  (Peter Eisentraut, 2021-09-14)

  This is an abstract node that shouldn't have a node tag defined.

  Reviewed-by: Jacob Champion <pchampion@vmware.com>
  Discussion: https://www.postgresql.org/message-id/c091e5cd-45f8-69ee-6a9b-de86912cc7e7@enterprisedb.com

* jit: Do not try to shut down LLVM state in case of LLVM triggered errors.  (Andres Freund, 2021-09-13)

  If an allocation failed within LLVM it is not safe to call back into LLVM as LLVM is not generally safe against exceptions / stack-unwinding. Thus errors while in LLVM code are promoted to FATAL. However llvm_shutdown() did call back into LLVM even in such cases, while llvm_release_context() was careful not to do so.

  We cannot generally skip shutting down LLVM, as that can break profiling. But it's OK to do so if there was an error from within LLVM.

  Reported-By: Jelte Fennema <Jelte.Fennema@microsoft.com>
  Author: Andres Freund <andres@anarazel.de>
  Author: Justin Pryzby <pryzby@telsasoft.com>
  Discussion: https://postgr.es/m/AM5PR83MB0178C52CCA0A8DEA0207DC14F7FF9@AM5PR83MB0178.EURPRD83.prod.outlook.com
  Backpatch: 11-, where jit was introduced

* Remove code duplication for permission checks with replication slots  (Michael Paquier, 2021-09-14)

  Two functions, both named check_permissions(), used the same checks to verify if a user had required privileges to work on replication slots. This commit removes the duplication, and moves the function doing the checks to slot.c to be centralized.

  Author: Bharath Rupireddy
  Reviewed-by: Nathan Bossart, Euler Taveira
  Discussion: https://postgr.es/m/CALj2ACUPpVw1u7sQocFVWrSs0n10pt_G_4NPZKSxXK6cW1dErw@mail.gmail.com

* Refactor the syslogger pipe protocol to use a bitmask for its options  (Michael Paquier, 2021-09-13)

  The previous protocol expected a set of matching characters to check if a message sent was the last one or not, which changed depending on the destination wanted:
  - 't' and 'f' tracked the last message of a log sent to stderr.
  - 'T' and 'F' tracked the last message of a log sent to csvlog.

  This could be extended with more characters when introducing new destinations, but using a bitmask is much more elegant. This commit changes the protocol so that a bitmask is used in the header of a log chunk message sent to the syslogger, with the following options available for now:
  - log_destination as stderr.
  - log_destination as csvlog.
  - if a message is the last chunk of a message.

  Sehrope found this issue in a patch set to introduce JSON as an option for log_destination, but his patch made the size of the protocol header larger. This commit keeps the same size as the original, and adapts the protocol as wanted.

  Thanks also to Andrew Dunstan and Greg Stark for the discussion.

  Author: Michael Paquier, Sehrope Sarkuni
  Discussion: https://postgr.es/m/CAH7T-aqswBM6JWe4pDehi1uOiufqe06DJWaU5=X7dDLyqUExHg@mail.gmail.com

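  A standalone sketch of the bitmask idea; the flag names and bit values are assumptions for illustration, not copied from the tree. Adding a new destination later only needs one more bit rather than another pair of characters.

      #include <stdio.h>

      #define PIPE_PROTO_IS_LAST      0x01    /* last chunk of this message */
      #define PIPE_PROTO_DEST_STDERR  0x10    /* route to the stderr log */
      #define PIPE_PROTO_DEST_CSVLOG  0x20    /* route to csvlog */

      int
      main(void)
      {
          unsigned char flags = PIPE_PROTO_DEST_CSVLOG | PIPE_PROTO_IS_LAST;

          if (flags & PIPE_PROTO_DEST_CSVLOG)
              printf("csvlog chunk, last = %s\n",
                     (flags & PIPE_PROTO_IS_LAST) ? "yes" : "no");
          return 0;
      }
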
* Revoke PUBLIC CREATE from public schema, now owned by pg_database_owner.  (Noah Misch, 2021-09-09)

  This switches the default ACL to what the documentation has recommended since CVE-2018-1058. Upgrades will carry forward any old ownership and ACL. Sites that declined the 2018 recommendation should take a fresh look. Recipes for commissioning a new database cluster from scratch may need to create a schema, grant more privileges, etc. Out-of-tree test suites may require such updates.

  Reviewed by Peter Eisentraut.

  Discussion: https://postgr.es/m/20201031163518.GB4039133@rfd.leadboat.com

* Remove Value node struct  (Peter Eisentraut, 2021-09-09)

  The Value node struct is a weird construct. It is its own node type, but most of the time, it actually has a node type of Integer, Float, String, or BitString. As a consequence, the struct name and the node type don't match most of the time, and so it has to be treated specially a lot. There doesn't seem to be any value in the special construct. There is very little code that wants to accept all Value variants but nothing else (and even if it did, this doesn't provide any convenient way to check it), and most code wants either just one particular node type (usually String), or it accepts a broader set of node types besides just Value.

  This change removes the Value struct and node type and replaces them by separate Integer, Float, String, and BitString node types that are proper node types and structs of their own and behave mostly like normal node types.

  Also, this removes the T_Null node tag, which was previously also a possible variant of Value but wasn't actually used outside of the Value contained in A_Const. Replace that by an isnull field in A_Const.

  Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
  Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
  Discussion: https://www.postgresql.org/message-id/flat/5ba6bc5b-3f95-04f2-2419-f8ddb4c046fb@enterprisedb.com

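  A hedged sketch of what caller code looks like after the change: String is an ordinary node type, so the usual IsA() test and the strVal() accessor are all that's needed (backend headers assumed; the helper itself is hypothetical).

      #include "postgres.h"
      #include "nodes/value.h"

      /* Return the C string carried by a String node, erroring out otherwise. */
      static const char *
      name_from_node(Node *node)
      {
          if (IsA(node, String))
              return strVal(node);

          elog(ERROR, "expected a String node, got tag %d", (int) nodeTag(node));
          return NULL;            /* unreachable; keeps the compiler quiet */
      }
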
* Fix misleading comments about TOAST access macros.  (Tom Lane, 2021-09-08)

  Seems to have been my error in commit aeb1631ed.

  Noted by Christoph Berg.

  Discussion: https://postgr.es/m/YTeLipdnSOg4NNcI@msg.df7cb.de

* Disable anonymous record hash support except in special cases  (Peter Eisentraut, 2021-09-08)

  Commit 01e658fa74 added hash support for row types. This also added support for hashing anonymous record types, using the same approach that the type cache uses for comparison support for record types: It just reports that it works, but it might fail at run time if a component type doesn't actually support the operation. We get away with that for comparison because most types support that. But some types don't support hashing, so the current state can result in failures at run time where the planner chooses hashing over sorting, whereas that previously worked if only sorting was an option.

  We do, however, want the record hashing support for path tracking in recursive unions, and the SEARCH and CYCLE clauses built on that. In that case, hashing is the only plan option. To enable that, this commit implements the following approach: The type cache does not report that hashing is available for the record type. This undoes that part of 01e658fa74. Instead, callers that require hashing no matter what can override that result themselves. This patch only touches the callers to make the aforementioned recursive query cases work, namely the parse analysis of unions, as well as the hash_array() function.

  Reported-by: Sait Talha Nisanci <sait.nisanci@microsoft.com>
  Bug: #17158
  Discussion: https://www.postgresql.org/message-id/flat/17158-8a2ba823982537a4%40postgresql.org

* Invalidate relcache for publications defined for all tables.  (Amit Kapila, 2021-09-08)

  Updates/Deletes on a relation were allowed even without replica identity after we define the publication for all tables. This would later lead to an error on subscribers. The reason was that for such publications we were not invalidating the relcache and the publication information for relations was not getting rebuilt. Similarly, we were not invalidating the relcache after dropping of such publications which will prohibit Updates/Deletes without replica identity even without any publication.

  Author: Vignesh C and Hou Zhijie
  Reviewed-by: Hou Zhijie, Kyotaro Horiguchi, Amit Kapila
  Backpatch-through: 10, where it was introduced
  Discussion: https://postgr.es/m/CALDaNm0pF6zeWqCA8TCe2sDuwFAy8fCqba=nHampCKag-qLixg@mail.gmail.com

* Introduce GUC shared_memory_size  (Michael Paquier, 2021-09-08)

  This runtime-computed GUC shows the size of the server's main shared memory area, taking into account the amount of shared memory allocated by extensions as this is calculated after processing shared_preload_libraries.

  Author: Nathan Bossart
  Discussion: https://postgr.es/m/F2772387-CE0F-46BF-B5F1-CC55516EB885@amazon.com

* Add PublicationTable and PublicationRelInfo structs  (Alvaro Herrera, 2021-09-06)

  These encapsulate a relation when referred from replication DDL. Currently they don't do anything useful (they're just wrappers around RangeVar and Relation respectively) but in the future they'll be used to carry column lists.

  Extracted from a larger patch by Rahila Syed.

  Author: Rahila Syed <rahilasyed90@gmail.com>
  Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
  Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
  Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
  Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com

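  Roughly what the message describes, as a hedged sketch: thin wrappers that currently add nothing beyond RangeVar and Relation, but give the future column-list work somewhere to live. The exact field layout here is an assumption.

      #include "postgres.h"
      #include "nodes/primnodes.h"
      #include "utils/rel.h"

      typedef struct PublicationTable
      {
          NodeTag     type;
          RangeVar   *relation;   /* table as named in the DDL statement */
          /* a per-table column list is expected to be added here later */
      } PublicationTable;

      typedef struct PublicationRelInfo
      {
          Relation    relation;   /* the opened relation */
      } PublicationRelInfo;
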
* Fix actively-misleading comments about the contents of struct pg_tm.  (Tom Lane, 2021-09-06)

  pgtime.h documented the PG interpretation of tm_mon right alongside the POSIX interpretation of tm_year, with no hint that neither comment was correct throughout our code.

  Perhaps someday we ought to switch to using two separate struct definitions to provide a clearer indication of which semantics are in use where. But I fear the tedium-versus-safety-gain tradeoff would not be very good.

  Discussion: https://postgr.es/m/CAJ7c6TOMG8zSNEZtCn5SPe+cCk3Lfxb71ZaQwT2F4T7PJ_t=KA@mail.gmail.com

* Make timetz_zone() stable, and correct a bug for DYNTZ abbreviations.  (Tom Lane, 2021-09-06)

  Historically, timetz_zone() has used time(NULL) as the reference point for deciding whether DST is active. That means its result can change intra-statement, requiring it to be marked VOLATILE (cf. 35979e6c3). But that definition is pretty inconsistent with the way we deal with timestamps elsewhere. Let's make it use the transaction start time ("now()") as the reference point instead. That lets it be marked STABLE, and also saves a kernel call per invocation.

  While at it, remove the function's use of pg_time_t and pg_localtime. Those are inconsistent with the other code in this area, which indeed created a bug: timetz_zone() delivered completely wrong answers if the zone was specified by a dynamic TZ abbreviation. (We need to do something about that in the back branches, but the fix will look different from this.)

  Aleksander Alekseev and Tom Lane

  Discussion: https://postgr.es/m/CAJ7c6TOMG8zSNEZtCn5SPe+cCk3Lfxb71ZaQwT2F4T7PJ_t=KA@mail.gmail.com

* Move the shared memory size calculation to its own function  (Michael Paquier, 2021-09-06)

  This change refactors the shared memory size calculation in CreateSharedMemoryAndSemaphores() to its own function. This is intended for use in a future change related to the setup of huge pages and shared memory with some GUCs, while useful on its own for extensions.

  Author: Nathan Bossart
  Discussion: https://postgr.es/m/F2772387-CE0F-46BF-B5F1-CC55516EB885@amazon.com

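  A hedged sketch of the refactoring pattern: one routine that only computes the total, which the allocation path and any future reporting path can share. The function name and exact set of contributors below are assumptions.

      #include "postgres.h"
      #include "storage/bufmgr.h"
      #include "storage/lock.h"
      #include "storage/shmem.h"

      /* Compute (but do not allocate) the size of the main shared memory area. */
      static Size
      compute_total_shmem_size(void)
      {
          Size        size = 0;

          size = add_size(size, BufferShmemSize());
          size = add_size(size, LockShmemSize());
          /* ... every other subsystem contributes its estimate here ... */

          return size;
      }
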
* Revert "Avoid creating archive status ".ready" files too early"Alvaro Herrera2021-09-04
| | | | | | | | | | This reverts commit 515e3d84a0b5 and equivalent commits in back branches. This solution to the problem has a number of problems, so we'll try again with a different approach. Per note from Andres Freund Discussion: https://postgr.es/m/20210831042949.52eqp5xwbxgrfank@alap3.anarazel.de
* Set the volatility of the timestamptz version of date_bin() back to immutable  (John Naylor, 2021-09-03)

  543f36b43d was too hasty in thinking that the volatility of date_bin() had to match date_trunc(), since only the latter references session_timezone.

  Bump catversion

  Per feedback from Aleksander Alekseev

  Backpatch to v14, as the former commit was

* Enhance pg_stat_reset_single_table_counters function.  (Fujii Masao, 2021-09-02)

  This commit allows pg_stat_reset_single_table_counters() to reset statistics for a single relation shared across all databases in the cluster to zero.

  Bump catalog version.

  Author: B Sadhu Prasad Patro
  Reviewed-by: Mahendra Singh Thalor, Himanshu Upadhyaya, Dilip Kumar, Fujii Masao
  Discussion: https://postgr.es/m/CAFF0-CGy7EHeF=AqqkGMF85cySPQBgDcvNk73G2O0vL94O5U5A@mail.gmail.com

* Optimize fileset usage in apply worker.  (Amit Kapila, 2021-09-02)

  Use one fileset for the entire worker lifetime instead of using separate filesets for each streaming transaction. Now, the changes/subxacts files for every streaming transaction will be created under the same fileset and the files will be deleted after the transaction is completed.

  This patch extends the BufFileOpenFileSet and BufFileDeleteFileSet APIs to allow users to specify whether to give an error on missing files.

  Author: Dilip Kumar, based on suggestion by Thomas Munro
  Reviewed-by: Hou Zhijie, Masahiko Sawada, Amit Kapila
  Discussion: https://postgr.es/m/E1mCC6U-0004Ik-Fs@gemulon.postgresql.org

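  A hedged sketch of the extended calls as the message describes them; the trailing boolean (assumed to be a missing_ok-style flag) suppresses the error when the named file is already gone.

      #include "postgres.h"
      #include <fcntl.h>
      #include "storage/buffile.h"
      #include "storage/fileset.h"

      /* Drop one streamed-transaction file, tolerating its absence. */
      static void
      drop_streamed_file(FileSet *fileset, const char *path)
      {
          BufFile    *fd;

          fd = BufFileOpenFileSet(fileset, path, O_RDONLY, true);  /* true: OK if missing */
          if (fd != NULL)
              BufFileClose(fd);

          BufFileDeleteFileSet(fileset, path, true);               /* likewise */
      }
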
* Mark the timestamptz variant of date_bin() as stable  (John Naylor, 2021-08-31)

  Previously, it was immutable by lack of marking. This is not correct, since the time zone could change.

  Bump catversion

  Discussion: https://www.postgresql.org/message-id/CAFBsxsG2UHk8mOWL0tca%3D_cg%2B_oA5mVRNLhDF0TBw980iOg5NQ%40mail.gmail.com

  Backpatch to v14, when this function came in

* Refactor sharedfileset.c to separate out fileset implementation.  (Amit Kapila, 2021-08-30)

  Move fileset related implementation out of sharedfileset.c to allow its usage by backends that don't want to share filesets among different processes. After this split, fileset infrastructure is used by both sharedfileset.c and worker.c for the named temporary files that survive across transactions.

  Author: Dilip Kumar, based on suggestion by Andres Freund
  Reviewed-by: Hou Zhijie, Masahiko Sawada, Amit Kapila
  Discussion: https://postgr.es/m/E1mCC6U-0004Ik-Fs@gemulon.postgresql.org

* Add logical change details to logical replication worker errcontext.  (Amit Kapila, 2021-08-27)

  Previously, on the subscriber, we set the error context callback for the tuple data conversion failures. This commit replaces the existing error context callback with a comprehensive one so that it shows not only the details of data conversion failures but also the details of logical change being applied by the apply worker or table sync worker. The additional information displayed will be the command, transaction id, and timestamp.

  The error context is added to an error only when applying a change but not while doing other work like receiving data etc.

  This will help users in diagnosing the problems that occur during logical replication. It also can be used for future work that allows skipping a particular transaction on the subscriber.

  Author: Masahiko Sawada
  Reviewed-by: Hou Zhijie, Greg Nancarrow, Haiying Tang, Amit Kapila
  Tested-by: Haiying Tang
  Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com

* Extend collection of Unicode combining characters to beyond the BMP  (John Naylor, 2021-08-26)

  The former limit was perhaps a carryover from an older hand-coded table. Since commit bab982161 we have enough space in mbinterval to store larger codepoints, so collect all combining characters.

  Discussion: https://www.postgresql.org/message-id/49ad1fa0-174e-c901-b14c-c484b60907f1%40enterprisedb.com

* Update display widths as part of updating Unicode  (John Naylor, 2021-08-26)

  The hardcoded "wide character" set in ucs_wcwidth() was last updated around the Unicode 5.0 era. This led to misalignment when printing emojis and other codepoints that have since been designated wide or full-width.

  To fix and keep up to date, extend update-unicode to download the list of wide and full-width codepoints from the official sources.

  In passing, remove some comments about non-spacing characters that haven't been accurate since we removed the former hardcoded logic.

  Jacob Champion

  Reported and reviewed by Pavel Stehule

  Discussion: https://www.postgresql.org/message-id/flat/CAFj8pRCeX21O69YHxmykYySYyprZAqrKWWg0KoGKdjgqcGyygg@mail.gmail.com

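  A standalone sketch of the underlying technique: codepoint ranges generated from the Unicode data, probed with a binary search. The ranges below are a tiny assumed sample, not the generated table.

      #include <stdio.h>

      typedef struct
      {
          unsigned int first;
          unsigned int last;
      } interval;

      /* Assumed sample of wide/fullwidth ranges; the real table is generated. */
      static const interval wide[] = {
          {0x1100, 0x115F},       /* Hangul Jamo */
          {0x2E80, 0x303E},       /* CJK radicals and symbols */
          {0x1F300, 0x1F64F},     /* emoji blocks */
      };

      static int
      in_ranges(unsigned int cp, const interval *tab, int n)
      {
          int lo = 0, hi = n - 1;

          while (lo <= hi)
          {
              int mid = (lo + hi) / 2;

              if (cp < tab[mid].first)
                  hi = mid - 1;
              else if (cp > tab[mid].last)
                  lo = mid + 1;
              else
                  return 1;
          }
          return 0;
      }

      int
      main(void)
      {
          unsigned int cp = 0x1F600;  /* an emoji: now treated as width 2 */

          printf("U+%04X width = %d\n", cp, in_ranges(cp, wide, 3) ? 2 : 1);
          return 0;
      }
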
* Revert "Rename unicode_combining_table to unicode_width_table"John Naylor2021-08-26
| | | | | | | | | | | | | This reverts commit eb0d0d2c7300c9c5c22b35975c11265aa4becc84. After I had committed eb0d0d2c7 and 78ab944cd, I decided to add a sanity check for a "can't happen" scenario just to be cautious. It turned out that it already happened in the official Unicode source data, namely that a character can be both wide and a combining character. This fact renders the aforementioned commits unnecessary, so revert both of them. Discussion: https://www.postgresql.org/message-id/CAFBsxsH5ejH4-1xaTLpSK8vWoK1m6fA1JBtTM6jmBsLfmDki1g%40mail.gmail.com
* Revert "Change mbbisearch to return the character range"John Naylor2021-08-26
| | | | | | | | | | | | | | This reverts commit 78ab944cd4b9977732becd9d0bc83223b88af9a2. After I had committed eb0d0d2c7 and 78ab944cd, I decided to add a sanity check for a "can't happen" scenario just to be cautious. It turned out that it already happened in the official Unicode source data, namely that a character can be both wide and a combining character. This fact renders the aforementioned commits unnecessary, so revert both of them. Discussion: https://www.postgresql.org/message-id/CAFBsxsH5ejH4-1xaTLpSK8vWoK1m6fA1JBtTM6jmBsLfmDki1g%40mail.gmail.com
* Change mbbisearch to return the character range  (John Naylor, 2021-08-25)

  Add a width field to mbinterval and have mbbisearch return a pointer to the found range rather than just bool for success. A future commit will add another width besides zero, and this will allow that to use the same search.

  Reviewed by Jacob Champion

  Discussion: https://www.postgresql.org/message-id/CAFBsxsGOCpzV7c-f3a8ADsA1n4uZ%3D8puCctQp%2Bx7W0vgkv%3Dw%2Bg%40mail.gmail.com

* Rename unicode_combining_table to unicode_width_table  (John Naylor, 2021-08-25)

  No functional changes. A future commit will use this table for other purposes besides combining characters.