aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils
Commit message (Collapse)AuthorAge
...
* Remove non-functional code for unloading loadable modules.Robert Haas2022-05-11
| | | | | | | | | | The code for unloading a library has been commented-out for over 12 years, ever since commit 602a9ef5a7c60151e10293ae3c4bb3fbb0132d03, and we're no closer to supporting it now than we were back then. Nathan Bossart, reviewed by Michael Paquier and by me. Discussion: http://postgr.es/m/Ynsc9bRL1caUSBSE@paquier.xyz
* Fix some incorrect preprocessor tests in tuplesort specializationsDavid Rowley2022-05-11
| | | | | | | | | | | | | | | | | | | | | | | | | 697492434 added 3 new quicksort specialization functions for common datatypes. That commit was not very consistent in how it would determine if we're compiling for 32-bit or 64-bit machines. It would sometimes use USE_FLOAT8_BYVAL and at other times check if SIZEOF_DATUM == 8. This could cause theoretical problems due to the way USE_FLOAT8_BYVAL is now defined based on SIZEOF_VOID_P >= 8. If pointers for some reason were ever larger than 8-bytes then we'd end up doing 32-bit comparisons mistakenly. Let's just always check SIZEOF_DATUM >= 8. It also seems that ssup_datum_signed_cmp is just never used on 32-bit builds, so let's just ifdef that out to make sure we never accidentally use that comparison function on such machines. This also allows us to ifdef out 1 of the 3 new specialization quicksort functions in 32-bit builds which seems to shrink down the binary by over 4KB on my machine. In passing, also add the missing DatumGetInt32() / DatumGetInt64() macros in the comparison functions. Discussion: https://postgr.es/m/CAApHDvqcQExRhtRa9hJrJB_5egs3SUfOcutP3m+3HO8A+fZTPA@mail.gmail.com Reviewed-by: John Naylor
* Formatting and punctuation improvements in sample configuration filesPeter Eisentraut2022-05-10
|
* Revert "Disallow infinite endpoints in generate_series() for timestamps."Tom Lane2022-05-09
| | | | | | | | | | | | | | | | | | | This reverts commit eafdf9de06e9b60168f5e47cedcfceecdc6d4b5f and its back-branch counterparts. Corey Huinker pointed out that we'd discussed this exact change back in 2016 and rejected it, on the grounds that there's at least one usage pattern with LIMIT where an infinite endpoint can usefully be used. Perhaps that argument needs to be re-litigated, but there's no time left before our back-branch releases. To keep our options open, restore the status quo ante; if we do end up deciding to change things, waiting one more quarter won't hurt anything. Rather than just doing a straight revert, I added a new test case demonstrating the usage with LIMIT. That'll at least remind us of the issue if we forget again. Discussion: https://postgr.es/m/3603504.1652068977@sss.pgh.pa.us Discussion: https://postgr.es/m/CADkLM=dzw0Pvdqp5yWKxMd+VmNkAMhG=4ku7GnCZxebWnzmz3Q@mail.gmail.com
* Make relation-enumerating operations be security-restricted operations.Noah Misch2022-05-09
| | | | | | | | | | | | | | | | | | When a feature enumerates relations and runs functions associated with all found relations, the feature's user shall not need to trust every user having permission to create objects. BRIN-specific functionality in autovacuum neglected to account for this, as did pg_amcheck and CLUSTER. An attacker having permission to create non-temp objects in at least one schema could execute arbitrary SQL functions under the identity of the bootstrap superuser. CREATE INDEX (not a relation-enumerating operation) and REINDEX protected themselves too late. This change extends to the non-enumerating amcheck interface. Back-patch to v10 (all supported versions). Sergey Shinderuk, reviewed (in earlier versions) by Alexander Lakhin. Reported by Alexander Lakhin. Security: CVE-2022-1552
* Fix misleading comments about background worker registration.Robert Haas2022-05-06
| | | | | | | | | | | | | | | | Since 6bc8ef0b7f1f1df3998745a66e1790e27424aa0c, the maximum number of backends can't change as background workers are registered, but these comments still reflect the way things worked prior to that. Also, per recent discussion, some modules call SetConfigOption() from _PG_init(). It's not entirely clear to me whether we want to regard that as a fully supported operation, but since we know it's a thing that happens, it at least deserves a mention in the comments, so add that. Nathan Bossart, reviewed by Anton A. Melnikov Discussion: http://postgr.es/m/20220419154658.GA2487941@nathanxps13
* Fix JSON_OBJECTAGG uniquefying bugAndrew Dunstan2022-04-28
| | | | | | | | Commit f4fb45d15c contained a bug in removing items with null values when unique keys are required, where the leading items that are sorted contained such values. Fix that and add a test for it. Discussion: https://postgr.es/m/CAJA4AWQ_XbSmsNbW226UqNyRLJ+wb=iQkQMj77cQyoNkqtf=2Q@mail.gmail.com
* Fix incorrect format placeholdersPeter Eisentraut2022-04-27
|
* Always pfree strings returned by GetDatabasePathAlvaro Herrera2022-04-25
| | | | | | | | | | | | | | Several places didn't do it, and in many cases it didn't matter because it would be a small allocation in a short-lived context; but other places may accumulate a few (for example, in CreateDatabaseUsingFileCopy, one per tablespace). In most databases this is highly unlikely to be very serious either, but it seems better to make the code consistent in case there's future copy-and-paste. The only case of actual concern seems to be the aforementioned routine, which is new with commit 9c08aea6a309, so there's no need to backpatch. As pointed out by Coverity.
* Fix performance regression in tuplesort specializationsDavid Rowley2022-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 697492434 added 3 new qsort specialization functions aimed to improve the performance of sorting many of the common pass-by-value data types when they're the leading or only sort key. Unfortunately, that has caused a performance regression when sorting datasets where many of the values being compared were equal. What was happening here was that we were falling back to the standard sort comparison function to handle tiebreaks. When the two given Datums compared equally we would incur both the overhead of an indirect function call to the standard comparer to perform the tiebreak and also the standard comparer function would go and compare the leading key needlessly all over again. Here improve the situation in the 3 new comparison functions. We now return 0 directly when the two Datums compare equally and we're performing a 1-key sort. Here we don't do anything to help the multi-key sort case where the leading key uses one of the sort specializations functions. On testing this case, even when the leading key's values are all equal, there appeared to be no performance regression. Let's leave it up to future work to optimize that case so that the tiebreak function no longer re-compares the leading key over again. Another possible fix for this would have been to add 3 additional sort specialization functions to handle single-key sorts for these pass-by-value types. The reason we didn't do that here is that we may deem some other sort specialization to be more useful than single-key sorts. It may be impractical to have sort specialization functions for every single combination of what may be useful and it was already decided that further analysis into which ones are the most useful would be delayed until the v16 cycle. Let's not let this regression force our hand into trying to make that decision for v15. Author: David Rowley Reviewed-by: John Naylor Discussion: https://postgr.es/m/CA+hUKGJRbzaAOUtBUcjF5hLtaSHnJUqXmtiaLEoi53zeWSizeA@mail.gmail.com
* Rethink method for assigning OIDs to the template0 and postgres DBs.Tom Lane2022-04-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit aa0105141 assigned fixed OIDs to template0 and postgres in a very ad-hoc way. Notably, instead of teaching Catalog.pm about these OIDs, the unused_oids script was just hacked to not show them as unused. That's problematic since, for example, duplicate_oids wouldn't report any future conflict. Hence, invent a macro DECLARE_OID_DEFINING_MACRO() that can be used to define an OID that is known to Catalog.pm and will participate in duplicate-detection as well as renumbering by renumber_oids.pl. (We don't anticipate renumbering these particular OIDs, but we might as well build out all the Catalog.pm infrastructure while we're here.) Another issue is that aa0105141 neglected to touch IsPinnedObject, with the result that it now claimed template0 and postgres are pinned. The right thing to do there seems to be to teach it that no database is pinned, since in fact DROP DATABASE doesn't check for pinned-ness (and at least for these cases, that is an intentional choice). It's not clear whether this wrong answer had any visible effect, but perhaps it could have resulted in erroneous management of dependency entries. In passing, rename the TemplateDbOid macro to Template1DbOid to reduce confusion (likely we should have done that way back when we invented template0, but we didn't), and rename the OID macros for template0 and postgres to have a similar style. There are no changes to postgres.bki here, so no need for a catversion bump. Discussion: https://postgr.es/m/2935358.1650479692@sss.pgh.pa.us
* Fix CLUSTER tuplesorts on abbreviated expressions.Peter Geoghegan2022-04-20
| | | | | | | | | | | | | | | | | | | | CLUSTER sort won't use the datum1 SortTuple field when clustering against an index whose leading key is an expression. This makes it unsafe to use the abbreviated keys optimization, which was missed by the logic that sets up SortSupport state. Affected tuplesorts output tuples in a completely bogus order as a result (the wrong SortSupport based comparator was used for the leading attribute). This issue is similar to the bug fixed on the master branch by recent commit cc58eecc5d. But it's a far older issue, that dates back to the introduction of the abbreviated keys optimization by commit 4ea51cdfe8. Backpatch to all supported versions. Author: Peter Geoghegan <pg@bowt.ie> Author: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA+hUKG+bA+bmwD36_oDxAoLrCwZjVtST2fqe=b4=qZcmU7u89A@mail.gmail.com Backpatch: 10-
* Disallow infinite endpoints in generate_series() for timestamps.Tom Lane2022-04-20
| | | | | | | | | | Such cases will lead to infinite loops, so they're of no practical value. The numeric variant of generate_series() already threw error for this, so borrow its message wording. Per report from Richard Wesley. Back-patch to all supported branches. Discussion: https://postgr.es/m/91B44E7B-68D5-448F-95C8-B4B3B0F5DEAF@duckdblabs.com
* set_deparse_plan: Reuse variable to appease CoverityAlvaro Herrera2022-04-20
| | | | | | | | | | | | | | Coverity complains that dpns->outer_plan is deferenced (to obtain ->targetlist) when possibly NULL. We can avoid this by using dpns->outer_tlist instead, which was already obtained a few lines up. The fact that we end up with dpns->inner_tlist = dpns->outer_tlist is a bit suspicious-looking and maybe worthy of more investigation, but I'll leave that for another day. Reviewed-by: Michaƫl Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/202204191345.qerjy3kxi3eb@alvherre.pgsql
* Fix extract epoch from interval calculationPeter Eisentraut2022-04-19
| | | | | | | | | | | | | | The new numeric code for extract epoch from interval accidentally truncated the DAYS_PER_YEAR value to an integer, leading to results that mismatched the floating-point interval_part calculations. The commit a2da77cdb4661826482ebf2ddba1f953bc74afe4 that introduced this actually contains the regression test change that this reverts. I suppose this was missed at the time. Reported-by: Joseph Koshakow <koshy44@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/CAAvxfHd5n%3D13NYA2q_tUq%3D3%3DSuWU-CufmTf-Ozj%3DfrEgt7pXwQ%40mail.gmail.com
* pgstat: Use correct lock level in pgstat_drop_all_entries().Andres Freund2022-04-16
| | | | | | | | | | | | | | | Previously we didn't, which lead to an assertion failure when resetting partially loaded statistics. This was encountered on the buildfarm, for as-of-yet unknown reasons. Ttighten up a validity check when reading the stats file, verifying 'E' signals the end of the file (rather than just stopping reading). That's then used in a test appending to the stats file that crashed before the fix in pgstat_drop_all_entries(). Reported by buildfarm animals mylodon and kestrel, via Tom Lane. Discussion: https://postgr.es/m/1656446.1650043715@sss.pgh.pa.us
* Fix incorrect logic in HaveRegisteredOrActiveSnapshot().Tom Lane2022-04-16
| | | | | | | | | | | | | | | | | This function gave the wrong answer when there's more than one RegisteredSnapshots entry, whether or not any of them is the CatalogSnapshot. This leads to assertion failure in some scenarios involving fetching toasted data using a cursor. (As per discussion, I'm dubious that this is the right contract to be enforcing at all; but it surely doesn't help to be enforcing it incorrectly.) Fetching toasted data using a cursor is evidently under-tested, so add a test case too. Per report from Erik Rijkers. This is new code, so no need for back-patch. Discussion: https://postgr.es/m/dc9dd229-ed30-6c62-4c41-d733ffff776b@xs4all.nl
* Small cleanups in SQL/JSON codeAndrew Dunstan2022-04-15
| | | | | | These are to keep Coverity happy. In one case remove a redundant NULL check, and in another explicitly ignore a function result that is already known.
* pgstat: set timestamps of fixed-numbered stats after a crash.Andres Freund2022-04-14
| | | | | | | | | | | | When not loading stats at startup (i.e. pgstat_discard_stats() getting called), reset timestamps of fixed numbered stats would be left at 0. Oversight in 5891c7a8ed8. Instead use pgstat_reset_after_failure() and add tests verifying that fixed-numbered reset timestamps are set appropriately. Reported-By: "David G. Johnston" <david.g.johnston@gmail.com> Discussion: https://postgr.es/m/CAKFQuwamFuaQHKdhcMt4Gbw5+Hca2UE741B8gOOXoA=TtAd2Yw@mail.gmail.com
* Remove extraneous blank lines before block-closing bracesAlvaro Herrera2022-04-13
| | | | | | | | | These are useless and distracting. We wouldn't have written the code with them to begin with, so there's no reason to keep them. Author: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com Discussion: https://postgr.es/m/attachment/133167/0016-Extraneous-blank-lines.patch
* Fix finalization for json_objectagg and friendsAndrew Dunstan2022-04-13
| | | | | | | | | | | | | Commit f4fb45d15c misguidedly tried to free some state during aggregate finalization for json_objectagg. This resulted in attempts to access freed memory, especially when the function is used as a window function. Commit 4eb9798879 attempted to ameliorate that, but in fact it should just be ripped out, which is done here. Also add some regression tests for json_objectagg in various flavors as a window function. Original report from Jaime Casanova, diagnosis by Andres Freund. Discussion: https://postgr.es/m/YkfeMNYRCGhySKyg@ahch-to
* Fix incorrect format placeholdersPeter Eisentraut2022-04-13
|
* Revert the addition of GetMaxBackends() and related stuff.Robert Haas2022-04-12
| | | | | | | | | | | | This reverts commits 0147fc7, 4567596, aa64f23, and 5ecd018. There is no longer agreement that introducing this function was the right way to address the problem. The consensus now seems to favor trying to make a correct value for MaxBackends available to mdules executing their _PG_init() functions. Nathan Bossart Discussion: http://postgr.es/m/20220323045229.i23skfscdbvrsuxa@jrouhaud
* Explicitly ignore guaranteed-true result from pgstat_lock_entry().Tom Lane2022-04-11
| | | | | | With nowait passed as false, pgstat_lock_entry() must return true so there's no need to check its result. Coverity seems unconvinced of this, so whack it upside the head with a (void) cast.
* fgetc() returns int, not char.Tom Lane2022-04-11
| | | | | This has no practical effect, since this code doesn't actually need to distinguish EOF (-1) from \0377; but it silences a Coverity complaint.
* Fix various typos and spelling mistakes in code commentsDavid Rowley2022-04-11
| | | | | Author: Justin Pryzby Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
* Add missing serial commasPeter Eisentraut2022-04-09
|
* Remove error message hints mentioning configure optionsPeter Eisentraut2022-04-08
| | | | | | | | | | | | These are usually not useful since users will use packaged distributions and won't be interested in rebuilding their installation from source. Also, we have only used these kinds of hints for some features and in some places, not consistently throughout. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/2552aed7-d0e9-280a-54aa-2dc7073f371d%40enterprisedb.com
* Teach planner and executor about monotonic window funcsDavid Rowley2022-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Window functions such as row_number() always return a value higher than the previously returned value for tuples in any given window partition. Traditionally queries such as; SELECT * FROM ( SELECT *, row_number() over (order by c) rn FROM t ) t WHERE rn <= 10; were executed fairly inefficiently. Neither the query planner nor the executor knew that once rn made it to 11 that nothing further would match the outer query's WHERE clause. It would blindly continue until all tuples were exhausted from the subquery. Here we implement means to make the above execute more efficiently. This is done by way of adding a pg_proc.prosupport function to various of the built-in window functions and adding supporting code to allow the support function to inform the planner if the window function is monotonically increasing, monotonically decreasing, both or neither. The planner is then able to make use of that information and possibly allow the executor to short-circuit execution by way of adding a "run condition" to the WindowAgg to allow it to determine if some of its execution work can be skipped. This "run condition" is not like a normal filter. These run conditions are only built using quals comparing values to monotonic window functions. For monotonic increasing functions, quals making use of the btree operators for <, <= and = can be used (assuming the window function column is on the left). You can see here that once such a condition becomes false that a monotonic increasing function could never make it subsequently true again. For monotonically decreasing functions the >, >= and = btree operators for the given type can be used for run conditions. The best-case situation for this is when there is a single WindowAgg node without a PARTITION BY clause. Here when the run condition becomes false the WindowAgg node can simply return NULL. No more tuples will ever match the run condition. It's a little more complex when there is a PARTITION BY clause. In this case, we cannot return NULL as we must still process other partitions. To speed this case up we pull tuples from the outer plan to check if they're from the same partition and simply discard them if they are. When we find a tuple belonging to another partition we start processing as normal again until the run condition becomes false or we run out of tuples to process. When there are multiple WindowAgg nodes to evaluate then this complicates the situation. For intermediate WindowAggs we must ensure we always return all tuples to the calling node. Any filtering done could lead to incorrect results in WindowAgg nodes above. For all intermediate nodes, we can still save some work when the run condition becomes false. We've no need to evaluate the WindowFuncs anymore. Other WindowAgg nodes cannot reference the value of these and these tuples will not appear in the final result anyway. The savings here are small in comparison to what can be saved in the top-level WingowAgg, but still worthwhile. Intermediate WindowAgg nodes never filter out tuples, but here we change WindowAgg so that the top-level WindowAgg filters out tuples that don't match the intermediate WindowAgg node's run condition. Such filters appear in the "Filter" clause in EXPLAIN for the top-level WindowAgg node. Here we add prosupport functions to allow the above to work for; row_number(), rank(), dense_rank(), count(*) and count(expr). It appears technically possible to do the same for min() and max(), however, it seems unlikely to be useful enough, so that's not done here. Bump catversion Author: David Rowley Reviewed-by: Andy Fan, Zhihong Yu Discussion: https://postgr.es/m/CAApHDvqvp3At8++yF8ij06sdcoo1S_b2YoaT9D4Nf+MObzsrLQ@mail.gmail.com
* Revert "Rewrite some RI code to avoid using SPI"Alvaro Herrera2022-04-07
| | | | | | | This reverts commit 99392cdd78b788295e52b9f4942fa11992fd5ba9. We'd rather rewrite ri_triggers.c as a whole rather than piecemeal. Discussion: https://postgr.es/m/E1ncXX2-000mFt-Pe@gemulon.postgresql.org
* Rewrite some RI code to avoid using SPIAlvaro Herrera2022-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Modify the subroutines called by RI trigger functions that want to check if a given referenced value exists in the referenced relation to simply scan the foreign key constraint's unique index, instead of using SPI to execute SELECT 1 FROM referenced_relation WHERE ref_key = $1 This saves a lot of work, especially when inserting into or updating a referencing relation. This rewrite allows to fix a PK row visibility bug caused by a partition descriptor hack which requires ActiveSnapshot to be set to come up with the correct set of partitions for the RI query running under REPEATABLE READ isolation. We now set that snapshot indepedently of the snapshot to be used by the PK index scan, so the two no longer interfere. The buggy output in src/test/isolation/expected/fk-snapshot.out of the relevant test case added by commit 00cb86e75d6d has been corrected. (The bug still exists in branch 14, however, but this fix is too invasive to backpatch.) Author: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Li Japin <japinli@hotmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Discussion: https://postgr.es/m/CA+HiwqGkfJfYdeq5vHPh6eqPKjSbfpDDY+j-kXYFePQedtSLeg@mail.gmail.com
* Revert "Logical decoding of sequences"Tomas Vondra2022-04-07
| | | | | | | | | | | | | | | | | | | | | This reverts a sequence of commits, implementing features related to logical decoding and replication of sequences: - 0da92dc530c9251735fc70b20cd004d9630a1266 - 80901b32913ffa59bf157a4d88284b2b3a7511d9 - b779d7d8fdae088d70da5ed9fcd8205035676df3 - d5ed9da41d96988d905b49bebb273a9b2d6e2915 - a180c2b34de0989269fdb819bff241a249bf5380 - 75b1521dae1ff1fde17fda2e30e591f2e5d64b6a - 2d2232933b02d9396113662e44dca5f120d6830e - 002c9dd97a0c874fd1693a570383e2dd38cd40d5 - 05843b1aa49df2ecc9b97c693b755bd1b6f856a9 The implementation has issues, mostly due to combining transactional and non-transactional behavior of sequences. It's not clear how this could be fixed, but it'll require reworking significant part of the patch. Discussion: https://postgr.es/m/95345a19-d508-63d1-860a-f5c2f41e8d40@enterprisedb.com
* Prefetch data referenced by the WAL, take II.Thomas Munro2022-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new GUC recovery_prefetch. When enabled, look ahead in the WAL and try to initiate asynchronous reading of referenced data blocks that are not yet cached in our buffer pool. For now, this is done with posix_fadvise(), which has several caveats. Since not all OSes have that system call, "try" is provided so that it can be enabled where available. Better mechanisms for asynchronous I/O are possible in later work. Set to "try" for now for test coverage. Default setting to be finalized before release. The GUC wal_decode_buffer_size limits the distance we can look ahead in bytes of decoded data. The existing GUC maintenance_io_concurrency is used to limit the number of concurrent I/Os allowed, based on pessimistic heuristics used to infer that I/Os have begun and completed. We'll also not look more than maintenance_io_concurrency * 4 block references ahead. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> (earlier version) Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version) Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> (earlier version) Tested-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> (earlier version) Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> (earlier version) Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> (earlier version) Tested-by: Sait Talha Nisanci <Sait.Nisanci@microsoft.com> (earlier version) Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com
* pgstat: add pg_stat_have_stats() test helper.Andres Freund2022-04-07
| | | | | | | | | | Will be used by tests committed subsequently. Bumps catversion (this time for real, the one in 0f96965c658 got lost when rebasing over 5c279a6d350). Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/CAAKRu_aNxL1WegCa45r=VAViCLnpOU7uNC7bTtGw+=QAPyYivw@mail.gmail.com
* pgstat: add pg_stat_force_next_flush(), use it to simplify tests.Andres Freund2022-04-06
| | | | | | | | | | | | | | | In the stats collector days it was hard to write tests for the stats system, because fundamentally delivery of stats messages over UDP was not synchronous (nor guaranteed). Now we easily can force pending stats updates to be flushed synchronously. This moves stats.sql into a parallel group, there isn't a reason for it to run in isolation anymore. And it may shake out some bugs. Bumps catversion. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* pgstat: fix small bug in pgstat_drop_relation().Andres Freund2022-04-06
| | | | | | | | | | Just after committing 5891c7a8ed8, a test running with debug_discard_caches=1 failed locally... pgstat_drop_relation() neither checked pgstat_should_count_relation() nor called pgstat_prep_relation_pending(). With debug_discard_caches=1 rel->pgstat_info wasn't set up, leading pg_stat_get_xact_tuples_inserted() spuriously still returning > 0 while in the transaction dropping the table.
* pgstat: prevent fix pgstat_reinit_entry() from zeroing out lwlock.Andres Freund2022-04-06
| | | | | | | | | | | Zeroing out an lwlock in a normal build turns out to not trigger any alarms, if nobody can use the lwlock at that moment (as the case here). But with --disable-spinlocks --disable-atomics, the sema field needs to be initialized. We probably should make sure that this fails on more common configurations as well... Per buildfarm animal rorqual
* Custom WAL Resource Managers.Jeff Davis2022-04-06
| | | | | | | | | | | | | Allow extensions to specify a new custom resource manager (rmgr), which allows specialized WAL. This is meant to be used by a Table Access Method or Index Access Method. Prior to this commit, only Generic WAL was available, which offers support for recovery and physical replication but not logical replication. Reviewed-by: Julien Rouhaud, Bharath Rupireddy, Andres Freund Discussion: https://postgr.es/m/ed1fb2e22d15d3563ae0eb610f7b61bb15999c0a.camel%40j-davis.com
* pgstat: move pgstat.c to utils/activity.Andres Freund2022-04-06
| | | | | | | | Now that pgstat is not related to postmaster anymore, src/backend/postmaster is not a well fitting directory. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* pgstat: rename STATS_COLLECTOR GUC group to STATS_CUMULATIVE.Andres Freund2022-04-06
| | | | | | Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* pgstat: remove stats_temp_directory.Andres Freund2022-04-06
| | | | | | | | | | | | | | | | With stats now being stored in shared memory, the GUC isn't needed anymore. However, the pg_stat_tmp directory and PG_STAT_TMP_DIR define are kept, as pg_stat_statements (and some out-of-core extensions) store data in it. Docs will be updated in a subsequent commit, together with the other pending docs updates due to shared memory stats. Author: Andres Freund <andres@anarazel.de> Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220330233550.eiwsbearu6xhuqwe@alap3.anarazel.de Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* pgstat: store statistics in shared memory.Andres Freund2022-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously the statistics collector received statistics updates via UDP and shared statistics data by writing them out to temporary files regularly. These files can reach tens of megabytes and are written out up to twice a second. This has repeatedly prevented us from adding additional useful statistics. Now statistics are stored in shared memory. Statistics for variable-numbered objects are stored in a dshash hashtable (backed by dynamic shared memory). Fixed-numbered stats are stored in plain shared memory. The header for pgstat.c contains an overview of the architecture. The stats collector is not needed anymore, remove it. By utilizing the transactional statistics drop infrastructure introduced in a prior commit statistics entries cannot "leak" anymore. Previously leaked statistics were dropped by pgstat_vacuum_stat(), called from [auto-]vacuum. On systems with many small relations pgstat_vacuum_stat() could be quite expensive. Now that replicas drop statistics entries for dropped objects, it is not necessary anymore to reset stats when starting from a cleanly shut down replica. Subsequent commits will perform some further code cleanup, adapt docs and add tests. Bumps PGSTAT_FILE_FORMAT_ID. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Andres Freund <andres@anarazel.de> Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-By: Andres Freund <andres@anarazel.de> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Reviewed-By: Justin Pryzby <pryzby@telsasoft.com> Reviewed-By: "David G. Johnston" <david.g.johnston@gmail.com> Reviewed-By: Tomas Vondra <tomas.vondra@2ndquadrant.com> (in a much earlier version) Reviewed-By: Arthur Zakirov <a.zakirov@postgrespro.ru> (in a much earlier version) Reviewed-By: Antonin Houska <ah@cybertec.at> (in a much earlier version) Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de Discussion: https://postgr.es/m/20210319235115.y3wz7hpnnrshdyv6@alap3.anarazel.de
* pgstat: normalize function naming.Andres Freund2022-04-06
| | | | | | Most of pgstat uses pgstat_<verb>_<subject>() or just <verb>_<subject>(). But not all (some introduced fairly recently by me). Rename ones that aren't intentionally following a different scheme (e.g. AtEOXact_*).
* pgstat: revise replication slot API in preparation for shared memory stats.Andres Freund2022-04-06
| | | | | | | | | | | | | | | | | | Previously the pgstat <-> replication slots API was done with on the basis of names. However, the upcoming move to storing stats in shared memory makes it more convenient to use a integer as key. Change the replication slot functions to take the slot rather than the slot name, and expose ReplicationSlotIndex() to compute the index of an replication slot. Special handling will be required for restarts, as the index is not stable across restarts. For now pgstat internally still uses names. Rename pgstat_report_replslot_{create,drop}() to pgstat_{create,drop}_replslot() to match the functions for other kinds of stats. Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
* pgstat: scaffolding for transactional stats creation / drop.Andres Freund2022-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One problematic part of the current statistics collector design is that there is no reliable way of getting rid of statistics entries. Because of that pgstat_vacuum_stat() (called by [auto-]vacuum) matches all stats for the current database with the catalog contents and tries to drop now-superfluous entries. That's quite expensive. What's worse, it doesn't work on physical replicas, despite physical replicas collection statistics entries. This commit introduces infrastructure to create / drop statistics entries transactionally, together with the underlying catalog objects (functions, relations, subscriptions). pgstat_xact.c maintains a list of stats entries created / dropped transactionally in the current transaction. To ensure the removal of statistics entries is durable dropped statistics entries are included in commit / abort (and prepare) records, which also ensures that stats entries are dropped on standbys. Statistics entries created separately from creating the underlying catalog object (e.g. when stats were previously lost due to an immediate restart) are *not* WAL logged. However that can only happen outside of the transaction creating the catalog object, so it does not lead to "leaked" statistics entries. For this to work, functions creating / dropping functions / relations / subscriptions need to call into pgstat. For subscriptions this was already done when dropping subscriptions, via pgstat_report_subscription_drop() (now renamed to pgstat_drop_subscription()). This commit does not actually drop stats yet, it just provides the infrastructure. It is however a largely independent piece of infrastructure, so committing it separately makes sense. Bumps XLOG_PAGE_MAGIC. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* pgstat: prepare APIs used by pgstatfuncs for shared memory stats.Andres Freund2022-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the introduction of PgStat_Kind PgStat_Single_Reset_Type, PgStat_Shared_Reset_Target don't make sense anymore. Replace them with PgStat_Kind. Instead of having dedicated reset functions for different kinds of stats, use two generic helper routines (one to reset all stats of a kind, one to reset one stats entry). A number of reset functions were named pgstat_reset_*_counter(), despite affecting multiple counters. The generic helper routines get rid of pgstat_reset_single_counter(), pgstat_reset_subscription_counter(). Rename pgstat_reset_slru_counter(), pgstat_reset_replslot_counter() to pgstat_reset_slru(), pgstat_reset_replslot() respectively, and have them only deal with a single SLRU/slot. Resetting all SLRUs/slots goes through the generic pgstat_reset_of_kind(). Previously pg_stat_reset_replication_slot() used SearchNamedReplicationSlot() to check if a slot exists. API wise it seems better to move that to pgstat_replslot.c. This is done separately from the - quite large - shared memory statistics patch to make review easier. Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
* pgstat: add pgstat_copy_relation_stats().Andres Freund2022-04-06
| | | | | | | | | | | | | | Until now index_concurrently_swap() directly modified pgstat internal datastructures. That will break with the introduction of shared memory statistics and seems off architecturally. This is done separately from the - quite large - shared memory statistics patch to make review easier. Author: Andres Freund <andres@anarazel.de> Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* pgstat: rename some pgstat_send_* functions to pgstat_report_*.Andres Freund2022-04-06
| | | | | | | | | | Only the pgstat_send_* functions that are called from outside pgstat*.c are renamed (the rest will go away). This is done separately from the - quite large - shared memory statistics patch to make review easier. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
* pgstat: stats collector references in comments.Andres Freund2022-04-06
| | | | | | | | | | | | | | | | | | Soon the stats collector will be no more, with statistics instead getting stored in shared memory. There are a lot of references to the stats collector in comments. This commit replaces most of these references with "cumulative statistics system", with the remaining ones getting replaced as part of subsequent commits. This is done separately from the - quite large - shared memory statistics patch to make review easier. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Justin Pryzby <pryzby@telsasoft.com> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de
* pgstat: move transactional code into pgstat_xact.c.Andres Freund2022-04-06
| | | | | | | | | The transactional integration code is largely independent from the rest of pgstat.c. Subsequent commits will add more related code. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de