postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
...
*	Prohibit combining publications with different column lists.	Amit Kapila	2022-06-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, we simply combine the column lists when publishing tables on multiple publications and that can sometimes lead to unexpected behavior. Say, if a column is published in any row-filtered publication, then the values for that column are sent to the subscriber even for rows that don't match the row filter, as long as the row matches the row filter for any other publication, even if that other publication doesn't include the column. The main purpose of introducing a column list is to have statically different shapes on publisher and subscriber or hide sensitive column data. In both cases, it doesn't seem to make sense to combine column lists. So, we disallow the cases where the column list is different for the same table when combining publications. It can be later extended to combine the column lists for selective cases where required. Reported-by: Alvaro Herrera Author: Hou Zhijie Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/202204251548.mudq7jbqnh7r@alvherre.pgsql
*	Fix COPY FROM when database encoding is SQL_ASCII.	Heikki Linnakangas	2022-05-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the codepath when no encoding conversion is required, the check for incomplete character at the end of input incorrectly used server encoding's max character length, instead of the client's. Usually the server and client encodings are the same when we're not performing encoding conversion, but SQL_ASCII is an exception. In the passing, also fix some outdated comments that still talked about the old COPY protocol. It was removed in v14. Per bug #17501 from Vitaly Voronov. Backpatch to v14 where this was introduced. Discussion: https://www.postgresql.org/message-id/17501-128b1dd039362ae6@postgresql.org
*	Fix DDL deparse of CREATE OPERATOR CLASS	Alvaro Herrera	2022-05-20
\| \| \| \| \| \| \| \| \| \| \| \| \|	When an implicit operator family is created, it wasn't getting reported. Make it do so. This has always been missing. Backpatch to 10. Author: Masahiko Sawada <sawada.mshk@gmail.com> Reported-by: Leslie LEMAIRE <leslie.lemaire@developpement-durable.gouv.fr> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Michael Paquiër <michael@paquier.xyz> Discussion: https://postgr.es/m/f74d69e151b22171e8829551b1159e77@developpement-durable.gouv.fr
*	Fix EXPLAIN MERGE output when no tuples are processed	Alvaro Herrera	2022-05-18
\| \| \| \| \| \| \| \| \|	An 'else' clause was misplaced in commit 598ac10be1c2, making zero-rows output look a bit silly. Add a test case for it. Pointed out by Tom Lane. Discussion: https://postgr.es/m/21030.1652893083@sss.pgh.pa.us
*	Make EXPLAIN MERGE output format more compact	Alvaro Herrera	2022-05-18
\| \| \| \| \| \| \| \| \| \|	We can use a single line to print all tuple counts that MERGE processed, for conciseness, and elide those that are zeroes. Non-text formats report all numbers, as is typical. Per comment from Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/20220511163350.GL19626@telsasoft.com
*	Add missing binary_upgrade.h includes.	Andres Freund	2022-05-12
\| \| \| \| \| \| \| \| \|	A few places used binary_upgrade_* variables without including the header, which worked without warnings because the variables are defined in those places. However that can cause linker complaints with MSVC - except that we don't see them right now, due to the use of a symbol export file. Discussion: https://postgr.es/m/20220512164513.vaheofqp2q24l65r@alap3.anarazel.de
*	Pre-beta mechanical code beautification.	Tom Lane	2022-05-12
\| \| \| \| \|	Run pgindent, pgperltidy, and reformat-dat-files. I manually fixed a couple of comments that pgindent uglified.
*	In REFRESH MATERIALIZED VIEW, set user ID before running user code.	Noah Misch	2022-05-09
\| \| \| \| \| \| \| \| \| \|	It intended to, but did not, achieve this. Adopt the new standard of setting user ID just after locking the relation. Back-patch to v10 (all supported versions). Reviewed by Simon Riggs. Reported by Alvaro Herrera. Security: CVE-2022-1552
*	Make relation-enumerating operations be security-restricted operations.	Noah Misch	2022-05-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a feature enumerates relations and runs functions associated with all found relations, the feature's user shall not need to trust every user having permission to create objects. BRIN-specific functionality in autovacuum neglected to account for this, as did pg_amcheck and CLUSTER. An attacker having permission to create non-temp objects in at least one schema could execute arbitrary SQL functions under the identity of the bootstrap superuser. CREATE INDEX (not a relation-enumerating operation) and REINDEX protected themselves too late. This change extends to the non-enumerating amcheck interface. Back-patch to v10 (all supported versions). Sergey Shinderuk, reviewed (in earlier versions) by Alexander Lakhin. Reported by Alexander Lakhin. Security: CVE-2022-1552
*	Fix old-fd issues using global barriers everywhere.	Thomas Munro	2022-05-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commits 4eb21763 and b74e94dc introduced a way to force every backend to close all relation files, to fix an ancient Windows-only bug. This commit extends that behavior to all operating systems and adds a couple of extra barrier points, to fix a totally different class of bug: the reuse of relfilenodes in scenarios that have no other kind of cache invalidation to prevent file descriptor mix-ups. In all releases, data corruption could occur when you moved a database to another tablespace and then back again. Despite that, no back-patch for now as the infrastructure required is too new and invasive. In master only, since commit aa010514, it could also happen when using CREATE DATABASE with a user-supplied OID or via pg_upgrade. Author: Andres Freund <andres@anarazel.de> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/20220209220004.kb3dgtn2x2k2gtdm%40alap3.anarazel.de
*	Always pfree strings returned by GetDatabasePath	Alvaro Herrera	2022-04-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Several places didn't do it, and in many cases it didn't matter because it would be a small allocation in a short-lived context; but other places may accumulate a few (for example, in CreateDatabaseUsingFileCopy, one per tablespace). In most databases this is highly unlikely to be very serious either, but it seems better to make the code consistent in case there's future copy-and-paste. The only case of actual concern seems to be the aforementioned routine, which is new with commit 9c08aea6a309, so there's no need to backpatch. As pointed out by Coverity.
*	Fix breakage in AlterFunction().	Tom Lane	2022-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	An ALTER FUNCTION command that tried to update both the function's proparallel property and its proconfig list failed to do the former, because it stored the new proparallel value into a tuple that was no longer the interesting one. Carelessness in 7aea8e4f2. (I did not bother with a regression test, because the only likely future breakage would be for someone to ignore the comment I added and add some other field update after the heap_modify_tuple step. A test using existing function properties could not catch that.) Per report from Bryn Llewellyn. Back-patch to all supported branches. Discussion: https://postgr.es/m/8AC9A37F-99BD-446F-A2F7-B89AD0022774@yugabyte.com
*	Have CLUSTER ignore partitions not owned by caller	Alvaro Herrera	2022-04-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a partitioned table has partitions owned by roles other than the owner of the partitioned table, don't include them in the to-be- clustered list. This is similar to what VACUUM FULL does (except we do it sooner, because there is no reason to postpone it). Add a simple test to verify that only owned partitions are clustered. While at it, change memory context switch-and-back to occur once per partition instead of outside of the loop. Author: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/20220411140609.GF26620@telsasoft.com
*	Remove extraneous blank lines before block-closing braces	Alvaro Herrera	2022-04-13
\| \| \| \| \| \| \| \| \|	These are useless and distracting. We wouldn't have written the code with them to begin with, so there's no reason to keep them. Author: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com Discussion: https://postgr.es/m/attachment/133167/0016-Extraneous-blank-lines.patch
*	Release cache tuple when no longer needed	Alvaro Herrera	2022-04-13
\| \| \| \| \| \| \| \| \| \| \| \|	There was a small buglet in commit 52e4f0cd472d whereby a tuple acquired from cache was not released, giving rise to WARNING messages; fix that. While at it, restructure the code a bit on stylistic grounds. Author: Hou zj <houzj.fnst@fujitsu.com> Reported-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAHut+PvKTyhTBtYCQsP6Ph7=o-oWRSX+v+PXXLXp81-o2bazig@mail.gmail.com
*	Remove "recheck" argument from check_index_is_clusterable()	Michael Paquier	2022-04-13
\| \| \| \| \| \| \| \| \| \| \|	The last usage of this argument in this routine can be tracked down to 7e2f9062, aka 11 years ago. Getting rid of this argument can also be an advantage for extensions calling check_index_is_clusterable(), as it removes any need to worry about the meaning of what a recheck would be at this level. Author: Justin Pryzby Discussion: https://postgr.es/m/20220411140609.GF26620@telsasoft.com
*	Revert the addition of GetMaxBackends() and related stuff.	Robert Haas	2022-04-12
\| \| \| \| \| \| \| \| \| \| \| \|	This reverts commits 0147fc7, 4567596, aa64f23, and 5ecd018. There is no longer agreement that introducing this function was the right way to address the problem. The consensus now seems to favor trying to make a correct value for MaxBackends available to mdules executing their _PG_init() functions. Nathan Bossart Discussion: http://postgr.es/m/20220323045229.i23skfscdbvrsuxa@jrouhaud
*	Fix various typos and spelling mistakes in code comments	David Rowley	2022-04-11
\| \| \| \| \|	Author: Justin Pryzby Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
*	Add missing serial commas	Peter Eisentraut	2022-04-09
\|
*	Track I/O timing for temporary file blocks in EXPLAIN (BUFFERS)	Michael Paquier	2022-04-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, the output of EXPLAIN (BUFFERS) option showed only the I/O timing spent reading and writing shared and local buffers. This commit adds on top of that the I/O timing for temporary buffers in the output of EXPLAIN (for spilled external sorts, hashes, materialization. etc). This can be helpful for users in cases where the I/O related to temporary buffers is the bottleneck. Like its cousin, this information is available only when track_io_timing is enabled. Playing the patch, this is showing an extra overhead of up to 1% even when using gettimeofday() as implementation for interval timings, which is slightly within the usual range noise still that's measurable. Author: Masahiko Sawada Reviewed-by: Georgios Kokolatos, Melanie Plageman, Julien Rouhaud, Ranier Vilela Discussion: https://postgr.es/m/CAD21AoAJgotTeP83p6HiAGDhs_9Fw9pZ2J=_tYTsiO5Ob-V5GQ@mail.gmail.com
*	Teach planner and executor about monotonic window funcs	David Rowley	2022-04-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Window functions such as row_number() always return a value higher than the previously returned value for tuples in any given window partition. Traditionally queries such as; SELECT * FROM ( SELECT , row_number() over (order by c) rn FROM t ) t WHERE rn <= 10; were executed fairly inefficiently. Neither the query planner nor the executor knew that once rn made it to 11 that nothing further would match the outer query's WHERE clause. It would blindly continue until all tuples were exhausted from the subquery. Here we implement means to make the above execute more efficiently. This is done by way of adding a pg_proc.prosupport function to various of the built-in window functions and adding supporting code to allow the support function to inform the planner if the window function is monotonically increasing, monotonically decreasing, both or neither. The planner is then able to make use of that information and possibly allow the executor to short-circuit execution by way of adding a "run condition" to the WindowAgg to allow it to determine if some of its execution work can be skipped. This "run condition" is not like a normal filter. These run conditions are only built using quals comparing values to monotonic window functions. For monotonic increasing functions, quals making use of the btree operators for <, <= and = can be used (assuming the window function column is on the left). You can see here that once such a condition becomes false that a monotonic increasing function could never make it subsequently true again. For monotonically decreasing functions the >, >= and = btree operators for the given type can be used for run conditions. The best-case situation for this is when there is a single WindowAgg node without a PARTITION BY clause. Here when the run condition becomes false the WindowAgg node can simply return NULL. No more tuples will ever match the run condition. It's a little more complex when there is a PARTITION BY clause. In this case, we cannot return NULL as we must still process other partitions. To speed this case up we pull tuples from the outer plan to check if they're from the same partition and simply discard them if they are. When we find a tuple belonging to another partition we start processing as normal again until the run condition becomes false or we run out of tuples to process. When there are multiple WindowAgg nodes to evaluate then this complicates the situation. For intermediate WindowAggs we must ensure we always return all tuples to the calling node. Any filtering done could lead to incorrect results in WindowAgg nodes above. For all intermediate nodes, we can still save some work when the run condition becomes false. We've no need to evaluate the WindowFuncs anymore. Other WindowAgg nodes cannot reference the value of these and these tuples will not appear in the final result anyway. The savings here are small in comparison to what can be saved in the top-level WingowAgg, but still worthwhile. Intermediate WindowAgg nodes never filter out tuples, but here we change WindowAgg so that the top-level WindowAgg filters out tuples that don't match the intermediate WindowAgg node's run condition. Such filters appear in the "Filter" clause in EXPLAIN for the top-level WindowAgg node. Here we add prosupport functions to allow the above to work for; row_number(), rank(), dense_rank(), count() and count(expr). It appears technically possible to do the same for min() and max(), however, it seems unlikely to be useful enough, so that's not done here. Bump catversion Author: David Rowley Reviewed-by: Andy Fan, Zhihong Yu Discussion: https://postgr.es/m/CAApHDvqvp3At8++yF8ij06sdcoo1S_b2YoaT9D4Nf+MObzsrLQ@mail.gmail.com
*	Revert "Logical decoding of sequences"	Tomas Vondra	2022-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts a sequence of commits, implementing features related to logical decoding and replication of sequences: - 0da92dc530c9251735fc70b20cd004d9630a1266 - 80901b32913ffa59bf157a4d88284b2b3a7511d9 - b779d7d8fdae088d70da5ed9fcd8205035676df3 - d5ed9da41d96988d905b49bebb273a9b2d6e2915 - a180c2b34de0989269fdb819bff241a249bf5380 - 75b1521dae1ff1fde17fda2e30e591f2e5d64b6a - 2d2232933b02d9396113662e44dca5f120d6830e - 002c9dd97a0c874fd1693a570383e2dd38cd40d5 - 05843b1aa49df2ecc9b97c693b755bd1b6f856a9 The implementation has issues, mostly due to combining transactional and non-transactional behavior of sequences. It's not clear how this could be fixed, but it'll require reworking significant part of the patch. Discussion: https://postgr.es/m/95345a19-d508-63d1-860a-f5c2f41e8d40@enterprisedb.com
*	Unlogged sequences	Peter Eisentraut	2022-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for unlogged sequences. Unlike for unlogged tables, this is not a performance feature. It allows sequences associated with unlogged tables to be excluded from replication. A new subcommand ALTER SEQUENCE ... SET LOGGED/UNLOGGED is added. An identity/serial sequence now automatically gets and follows the persistence level (logged/unlogged) of its owning table. (The sequences owned by temporary tables were already temporary through the separate mechanism in RangeVarAdjustRelationPersistence().) But you can still change the persistence of an owned sequence separately. Also, pg_dump and pg_upgrade preserve the persistence of existing sequences. Discussion: https://www.postgresql.org/message-id/flat/04e12818-2f98-257c-b926-2845d74ed04f%402ndquadrant.com
*	pgstat: store statistics in shared memory.	Andres Freund	2022-04-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously the statistics collector received statistics updates via UDP and shared statistics data by writing them out to temporary files regularly. These files can reach tens of megabytes and are written out up to twice a second. This has repeatedly prevented us from adding additional useful statistics. Now statistics are stored in shared memory. Statistics for variable-numbered objects are stored in a dshash hashtable (backed by dynamic shared memory). Fixed-numbered stats are stored in plain shared memory. The header for pgstat.c contains an overview of the architecture. The stats collector is not needed anymore, remove it. By utilizing the transactional statistics drop infrastructure introduced in a prior commit statistics entries cannot "leak" anymore. Previously leaked statistics were dropped by pgstat_vacuum_stat(), called from [auto-]vacuum. On systems with many small relations pgstat_vacuum_stat() could be quite expensive. Now that replicas drop statistics entries for dropped objects, it is not necessary anymore to reset stats when starting from a cleanly shut down replica. Subsequent commits will perform some further code cleanup, adapt docs and add tests. Bumps PGSTAT_FILE_FORMAT_ID. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Andres Freund <andres@anarazel.de> Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-By: Andres Freund <andres@anarazel.de> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Reviewed-By: Justin Pryzby <pryzby@telsasoft.com> Reviewed-By: "David G. Johnston" <david.g.johnston@gmail.com> Reviewed-By: Tomas Vondra <tomas.vondra@2ndquadrant.com> (in a much earlier version) Reviewed-By: Arthur Zakirov <a.zakirov@postgrespro.ru> (in a much earlier version) Reviewed-By: Antonin Houska <ah@cybertec.at> (in a much earlier version) Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de Discussion: https://postgr.es/m/20210319235115.y3wz7hpnnrshdyv6@alap3.anarazel.de
*	Reorder subskiplsn in pg_subscription to avoid alignment issues.	Amit Kapila	2022-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The column 'subskiplsn' uses TYPALIGN_DOUBLE (which has 4 bytes alignment on AIX) for storage. But the C Struct (Form_pg_subscription) has 8-byte alignment for this field, so retrieving it from storage causes an unaligned read. To fix this, we rearranged the 'subskiplsn' column in the catalog so that it naturally comes at an 8-byte boundary. We have fixed a similar problem in commit f3b421da5f. This patch adds a test to avoid a similar mistake in the future. Reported-by: Noah Misch Diagnosed-by: Noah Misch, Masahiko Sawada, Amit Kapila Author: Masahiko Sawada Reviewed-by: Noah Misch, Amit Kapila Discussion: https://postgr.es/m/20220401074423.GC3682158@rfd.leadboat.com https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
*	pgstat: scaffolding for transactional stats creation / drop.	Andres Freund	2022-04-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	One problematic part of the current statistics collector design is that there is no reliable way of getting rid of statistics entries. Because of that pgstat_vacuum_stat() (called by [auto-]vacuum) matches all stats for the current database with the catalog contents and tries to drop now-superfluous entries. That's quite expensive. What's worse, it doesn't work on physical replicas, despite physical replicas collection statistics entries. This commit introduces infrastructure to create / drop statistics entries transactionally, together with the underlying catalog objects (functions, relations, subscriptions). pgstat_xact.c maintains a list of stats entries created / dropped transactionally in the current transaction. To ensure the removal of statistics entries is durable dropped statistics entries are included in commit / abort (and prepare) records, which also ensures that stats entries are dropped on standbys. Statistics entries created separately from creating the underlying catalog object (e.g. when stats were previously lost due to an immediate restart) are not WAL logged. However that can only happen outside of the transaction creating the catalog object, so it does not lead to "leaked" statistics entries. For this to work, functions creating / dropping functions / relations / subscriptions need to call into pgstat. For subscriptions this was already done when dropping subscriptions, via pgstat_report_subscription_drop() (now renamed to pgstat_drop_subscription()). This commit does not actually drop stats yet, it just provides the infrastructure. It is however a largely independent piece of infrastructure, so committing it separately makes sense. Bumps XLOG_PAGE_MAGIC. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
*	pgstat: stats collector references in comments.	Andres Freund	2022-04-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Soon the stats collector will be no more, with statistics instead getting stored in shared memory. There are a lot of references to the stats collector in comments. This commit replaces most of these references with "cumulative statistics system", with the remaining ones getting replaced as part of subsequent commits. This is done separately from the - quite large - shared memory statistics patch to make review easier. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Justin Pryzby <pryzby@telsasoft.com> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de
*	Allow granting SET and ALTER SYSTEM privileges on GUC parameters.	Tom Lane	2022-04-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch allows "PGC_SUSET" parameters to be set by non-superusers if they have been explicitly granted the privilege to do so. The privilege to perform ALTER SYSTEM SET/RESET on a specific parameter can also be granted. Such privileges are cluster-wide, not per database. They are tracked in a new shared catalog, pg_parameter_acl. Granting and revoking these new privileges works as one would expect. One caveat is that PGC_USERSET GUCs are unaffected by the SET privilege --- one could wish that those were handled by a revocable grant to PUBLIC, but they are not, because we couldn't make it robust enough for GUCs defined by extensions. Mark Dilger, reviewed at various times by Andrew Dunstan, Robert Haas, Joshua Brindle, and myself Discussion: https://postgr.es/m/3D691E20-C1D5-4B80-8BA5-6BEB63AF3029@enterprisedb.com
*	Have VACUUM warn on relfrozenxid "in the future".	Peter Geoghegan	2022-04-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commits 74cf7d46 and a61daa14 fixed pg_upgrade bugs involving oversights in how relfrozenxid or relminmxid are carried forward or initialized. Corruption caused by bugs of this nature was ameliorated by commit 78db307bb2, which taught VACUUM to always overwrite existing invalid relfrozenxid or relminmxid values that are apparently "in the future". Extend that work now by showing a warning in the event of overwriting either relfrozenxid or relminmxid due to an existing value that is "in the future". There is probably a decent chance that the sanity checks added by commit 699bf7d05c will raise an error before VACUUM reaches this point, but we shouldn't rely on that. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAH2-WzmRZEzeGvLv8yDW0AbFmSvJjTziORqjVUrf74mL4GL0Ww@mail.gmail.com
*	JSON_TABLE	Andrew Dunstan	2022-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This feature allows jsonb data to be treated as a table and thus used in a FROM clause like other tabular data. Data can be selected from the jsonb using jsonpath expressions, and hoisted out of nested structures in the jsonb to form multiple rows, more or less like an outer join. Nikita Glukhov Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zhihong Yu (whose name I previously misspelled), Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/7e2cb85d-24cf-4abb-30a5-1a33715959bd@postgrespro.ru
*	Set relfrozenxid to oldest extant XID seen by VACUUM.	Peter Geoghegan	2022-04-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When VACUUM set relfrozenxid before now, it set it to whatever value was used to determine which tuples to freeze -- the FreezeLimit cutoff. This approach was very naive. The relfrozenxid invariant only requires that new relfrozenxid values be <= the oldest extant XID remaining in the table (at the point that the VACUUM operation ends), which in general might be much more recent than FreezeLimit. VACUUM now carefully tracks the oldest remaining XID/MultiXactId as it goes (the oldest remaining values _after_ lazy_scan_prune processing). The final values are set as the table's new relfrozenxid and new relminmxid in pg_class at the end of each VACUUM. The oldest XID might come from a tuple's xmin, xmax, or xvac fields. It might even come from one of the table's remaining MultiXacts. Final relfrozenxid values must still be >= FreezeLimit in an aggressive VACUUM (FreezeLimit still acts as a lower bound on the final value that aggressive VACUUM can set relfrozenxid to). Since standard VACUUMs still make no guarantees about advancing relfrozenxid, they might as well set relfrozenxid to a value from well before FreezeLimit when the opportunity presents itself. In general standard VACUUMs may now set relfrozenxid to any value > the original relfrozenxid and <= OldestXmin. Credit for the general idea of using the oldest extant XID to set pg_class.relfrozenxid at the end of VACUUM goes to Andres Freund. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Reviewed-By: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CAH2-WzkymFbz6D_vL+jmqSn_5q1wsFvFrE+37yLgL_Rkfd6Gzg@mail.gmail.com
*	Allow CLUSTER on partitioned tables	Alvaro Herrera	2022-04-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is essentially the same as applying VACUUM FULL to a partitioned table, which has been supported since commit 3c3bb99330aa (March 2017). While there's no great use case in applying CLUSTER to partitioned tables, we don't have any strong reason not to allow it either. For now, partitioned indexes cannot be marked clustered, so an index must always be specified. While at it, rename some variables that were RangeVars during the development that led to 8bc717cb8878 but never made it that way to the source tree; there's no need to perpetuate names that have always been more confusing than helpful. Author: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/20201028003312.GU9241@telsasoft.com Discussion: https://postgr.es/m/20200611153502.GT14879@telsasoft.com
*	Raise a WARNING for missing publications.	Amit Kapila	2022-03-31
\| \| \| \| \| \| \| \| \| \|	When we create or alter a subscription to add publications raise a warning for non-existent publications. We don't want to give an error here because it is possible that users can later create the missing publications. Author: Vignesh C Reviewed-by: Bharath Rupireddy, Japin Li, Dilip Kumar, Euler Taveira, Ashutosh Sharma, Amit Kapila Discussion: https://postgr.es/m/CALDaNm0f4YujGW+q-Di0CbZpnQKFFrXntikaQQKuEmGG0=Zw=Q@mail.gmail.com
*	Fix typo in comment.	Etsuro Fujita	2022-03-30
\|
*	Add header matching mode to COPY FROM	Peter Eisentraut	2022-03-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	COPY FROM supports the HEADER option to silently discard the header line from a CSV or text file. It is possible to load by mistake a file that matches the expected format, for example, if two text columns have been swapped, resulting in garbage in the database. This adds a new option value HEADER MATCH that checks the column names in the header line against the actual column names and errors out if they do not match. Author: Rémi Lapeyre <remi.lapeyre@lenstra.fr> Reviewed-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/CAF1-J-0PtCWMeLtswwGV2M70U26n4g33gpe1rcKQqe6wVQDrFA@mail.gmail.com
*	Add new block-by-block strategy for CREATE DATABASE.	Robert Haas	2022-03-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Because this strategy logs changes on a block-by-block basis, it avoids the need to checkpoint before and after the operation. However, because it logs each changed block individually, it might generate a lot of extra write-ahead logging if the template database is large. Therefore, the older strategy remains available via a new STRATEGY parameter to CREATE DATABASE, and a corresponding --strategy option to createdb. Somewhat controversially, this patch assembles the list of relations to be copied to the new database by reading the pg_class relation of the template database. Cross-database access like this isn't normally possible, but it can be made to work here because there can't be any connections to the database being copied, nor can it contain any in-doubt transactions. Even so, we have to use lower-level interfaces than normal, since the table scan and relcache interfaces will not work for a database to which we're not connected. The advantage of this approach is that we do not need to rely on the filesystem to determine what ought to be copied, but instead on PostgreSQL's own knowledge of the database structure. This avoids, for example, copying stray files that happen to be located in the source database directory. Dilip Kumar, with a fairly large number of cosmetic changes by me. Reviewed and tested by Ashutosh Sharma, Andres Freund, John Naylor, Greg Nancarrow, Neha Sharma. Additional feedback from Bruce Momjian, Heikki Linnakangas, Julien Rouhaud, Adam Brusselback, Kyotaro Horiguchi, Tomas Vondra, Andrew Dunstan, Álvaro Herrera, and others. Discussion: http://postgr.es/m/CA+TgmoYtcdxBjLh31DLxUXHxFVMPGzrU5_T=CYCvRyFHywSBUQ@mail.gmail.com
*	Revert "Fix replay of create database records on standby"	Alvaro Herrera	2022-03-29
\| \| \| \| \| \| \|	This reverts commit 49d9cfc68bf4. The approach taken by this patch has problems, so we'll come up with a radically different fix. Discussion: https://postgr.es/m/CA+TgmoYcUPL+WOJL2ZzhH=zmrhj0iOQ=iCFM0SuYqBbqZEamEg@mail.gmail.com
*	Use has_privs_for_roles for predefined role checks	Joe Conway	2022-03-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Generally if a role is granted membership to another role with NOINHERIT they must use SET ROLE to access the privileges of that role, however with predefined roles the membership and privilege is conflated. Fix that by replacing is_member_of_role with has_privs_for_role for predefined roles. Patch does not remove is_member_of_role from acl.h, but it does add a warning not to use that function for privilege checking. Not backpatched based on hackers list discussion. Author: Joshua Brindle Reviewed-by: Stephen Frost, Nathan Bossart, Joe Conway Discussion: https://postgr.es/m/flat/CAGB+Vh4Zv_TvKt2tv3QNS6tUM_F_9icmuj0zjywwcgVi4PAhFA@mail.gmail.com
*	Remove the ability of a role to administer itself.	Robert Haas	2022-03-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit f9fd1764615ed5d85fab703b0ffb0c323fe7dfd5 effectively gave every role ADMIN OPTION on itself. However, this appears to be something that happened accidentally as a result of refactoring work rather than an intentional decision. Almost a decade later, it was discovered that this was a security vulnerability. As a result, commit fea164a72a7bfd50d77ba5fb418d357f8f2bb7d0 restricted this implicit ADMIN OPTION privilege to be exercisable only when the role being administered is the same as the session user and when no security-restricted operation is in progress. That commit also documented the existence of this implicit privilege for what seems to be the first time. The effect of the privilege is to allow a login role to grant the privileges of that role, and optionally ADMIN OPTION on it, to some other role. That's an unusual thing to do, because generally membership is granted in roles used as groups, rather than roles used as users. Therefore, it does not seem likely that removing the privilege will break things for many PostgreSQL users. However, it will make it easier to reason about the permissions system. This is the only case where a user who has not been given any special permission (superuser, or ADMIN OPTION on some role) can modify role membership, so removing it makes things more consistent. For example, if a superuser sets up role A and B and grants A to B but no other privileges to anyone, she can now be sure that no one else will be able to revoke that grant. Without this change, that would have been true only if A was a non-login role. Patch by me. Reviewed by Tom Lane and Stephen Frost. Discussion: http://postgr.es/m/CA+Tgmoawdt03kbA+dNyBcNWJpRxu0f4X=69Y3+DkXXZqmwMDLg@mail.gmail.com
*	Add support for MERGE SQL command	Alvaro Herrera	2022-03-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT/UPDATE/DELETE rows -- a task that would otherwise require multiple PL statements. For example, MERGE INTO target AS t USING source AS s ON t.tid = s.sid WHEN MATCHED AND t.balance > s.delta THEN UPDATE SET balance = t.balance - s.delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED AND s.delta > 0 THEN INSERT VALUES (s.sid, s.delta) WHEN NOT MATCHED THEN DO NOTHING; MERGE works with regular tables, partitioned tables and inheritance hierarchies, including column and row security enforcement, as well as support for row and statement triggers and transition tables therein. MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE since there is some overhead. MERGE can be used from PL/pgSQL. MERGE does not support targetting updatable views or foreign tables, and RETURNING clauses are not allowed either. These limitations are likely fixable with sufficient effort. Rewrite rules are also not supported, but it's not clear that we'd want to support them. Author: Pavan Deolasee <pavan.deolasee@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Author: Amit Langote <amitlangote09@gmail.com> Author: Simon Riggs <simon.riggs@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions) Reviewed-by: Peter Geoghegan <pg@bowt.ie> (earlier versions) Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions) Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com Discussion: https://postgr.es/m/20201231134736.GA25392@alvherre.pgsql
*	Allow specifying column lists for logical replication	Tomas Vondra	2022-03-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This allows specifying an optional column list when adding a table to logical replication. The column list may be specified after the table name, enclosed in parentheses. Columns not included in this list are not sent to the subscriber, allowing the schema on the subscriber to be a subset of the publisher schema. For UPDATE/DELETE publications, the column list needs to cover all REPLICA IDENTITY columns. For INSERT publications, the column list is arbitrary and may omit some REPLICA IDENTITY columns. Furthermore, if the table uses REPLICA IDENTITY FULL, column list is not allowed. The column list can contain only simple column references. Complex expressions, function calls etc. are not allowed. This restriction could be relaxed in the future. During the initial table synchronization, only columns included in the column list are copied to the subscriber. If the subscription has several publications, containing the same table with different column lists, columns specified in any of the lists will be copied. This means all columns are replicated if the table has no column list at all (which is treated as column list with all columns), or when of the publications is defined as FOR ALL TABLES (possibly IN SCHEMA that matches the schema of the table). For partitioned tables, publish_via_partition_root determines whether the column list for the root or the leaf relation will be used. If the parameter is 'false' (the default), the list defined for the leaf relation is used. Otherwise, the column list for the root partition will be used. Psql commands \dRp+ and \d <table-name> now display any column lists. Author: Tomas Vondra, Alvaro Herrera, Rahila Syed Reviewed-by: Peter Eisentraut, Alvaro Herrera, Vignesh C, Ibrar Ahmed, Amit Kapila, Hou zj, Peter Smith, Wang wei, Tang, Shi yu Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com
*	Minor improvements in sequence decoding code and docs	Tomas Vondra	2022-03-25
\| \| \| \| \| \| \| \|	A couple minor comment improvements and code cleanups, based on post-commit feedback to the sequence decoding patch. Author: Amit Kapila, vignesh C Discussion: https://postgr.es/m/aeb2ba8d-e6f4-5486-cc4c-0d4982c291cb@enterprisedb.com
*	Fix replay of create database records on standby	Alvaro Herrera	2022-03-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Crash recovery on standby may encounter missing directories when replaying create database WAL records. Prior to this patch, the standby would fail to recover in such a case. However, the directories could be legitimately missing. Consider a sequence of WAL records as follows: CREATE DATABASE DROP DATABASE DROP TABLESPACE If, after replaying the last WAL record and removing the tablespace directory, the standby crashes and has to replay the create database record again, the crash recovery must be able to move on. This patch adds a mechanism similar to invalid-page tracking, to keep a tally of missing directories during crash recovery. If all the missing directory references are matched with corresponding drop records at the end of crash recovery, the standby can safely continue following the primary. Backpatch to 13, at least for now. The bug is older, but fixing it in older branches requires more careful study of the interactions with commit e6d8069522c8, which appeared in 13. A new TAP test file is added to verify the condition. However, because it depends on commit d6d317dbf615, it can only be added to branch master. I (Álvaro) manually verified that the code behaves as expected in branch 14. It's a bit nervous-making to leave the code uncovered by tests in older branches, but leaving the bug unfixed is even worse. Also, the main reason this fix took so long is precisely that we couldn't agree on a good strategy to approach testing for the bug, so perhaps this is the best we can do. Diagnosed-by: Paul Guo <paulguo@gmail.com> Author: Paul Guo <paulguo@gmail.com> Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Asim R Praveen <apraveen@pivotal.io> Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
*	Remove some useless free calls.	Amit Kapila	2022-03-25
\| \| \| \| \| \| \| \| \| \| \|	These were introduced in recent commit 52e4f0cd47. We were trying to free some transient space consumption and that too was not entirely correct and complete. We don't need this partial freeing of memory as it will be allocated just once for a query and will be freed at the end of the query. Author: Zhihong Yu Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CALNJ-vQORfQ=vicbKA_RmeGZGzm1y3WsEcZqXWi7qjN43Cz_vg@mail.gmail.com
*	Add decoding of sequences to built-in replication	Tomas Vondra	2022-03-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit adds support for decoding of sequences to the built-in replication (the infrastructure was added by commit 0da92dc530). The syntax and behavior mostly mimics handling of tables, i.e. a publication may be defined as FOR ALL SEQUENCES (replicating all sequences in a database), FOR ALL SEQUENCES IN SCHEMA (replicating all sequences in a particular schema) or individual sequences. To publish sequence modifications, the publication has to include 'sequence' action. The protocol is extended with a new message, describing sequence increments. A new system view pg_publication_sequences lists all the sequences added to a publication, both directly and indirectly. Various psql commands (\d and \dRp) are improved to also display publications including a given sequence, or sequences included in a publication. Author: Tomas Vondra, Cary Huang Reviewed-by: Peter Eisentraut, Amit Kapila, Hannu Krosing, Andres Freund, Petr Jelinek Discussion: https://postgr.es/m/d045f3c2-6cfb-06d3-5540-e63c320df8bc@enterprisedb.com Discussion: https://postgr.es/m/1710ed7e13b.cd7177461430746.3372264562543607781@highgo.ca
*	Add support for security invoker views.	Dean Rasheed	2022-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A security invoker view checks permissions for accessing its underlying base relations using the privileges of the user of the view, rather than the privileges of the view owner. Additionally, if any of the base relations are tables with RLS enabled, the policies of the user of the view are applied, rather than those of the view owner. This allows views to be defined without giving away additional privileges on the underlying base relations, and matches a similar feature available in other database systems. It also allows views to operate more naturally with RLS, without affecting the assignments of policies to users. Christoph Heiss, with some additional hacking by me. Reviewed by Laurenz Albe and Wolfgang Walther. Discussion: https://postgr.es/m/b66dd6d6-ad3e-c6f2-8b90-47be773da240%40cybertec.at
*	Add ALTER SUBSCRIPTION ... SKIP.	Amit Kapila	2022-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This feature allows skipping the transaction on subscriber nodes. If incoming change violates any constraint, logical replication stops until it's resolved. Currently, users need to either manually resolve the conflict by updating a subscriber-side database or by using function pg_replication_origin_advance() to skip the conflicting transaction. This commit introduces a simpler way to skip the conflicting transactions. The user can specify LSN by ALTER SUBSCRIPTION ... SKIP (lsn = XXX), which allows the apply worker to skip the transaction finished at specified LSN. The apply worker skips all data modification changes within the transaction. Author: Masahiko Sawada Reviewed-by: Takamichi Osumi, Hou Zhijie, Peter Eisentraut, Amit Kapila, Shi Yu, Vignesh C, Greg Nancarrow, Haiying Tang, Euler Taveira Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
*	Fix bogus dependency handling for GENERATED expressions.	Tom Lane	2022-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For GENERATED columns, we record all dependencies of the generation expression as AUTO dependencies of the column itself. This means that the generated column is silently dropped if any dependency is removed, even if CASCADE wasn't specified. This is at least a POLA violation, but I think it's actually based on a misreading of the standard. The standard does say that you can't drop a dependent GENERATED column in RESTRICT mode; but that's buried down in a subparagraph, on a different page from some pseudocode that makes it look like an AUTO drop is being suggested. Change this to be more like the way that we handle regular default expressions, ie record the dependencies as NORMAL dependencies of the pg_attrdef entry. Also, make the pg_attrdef entry's dependency on the column itself be INTERNAL not AUTO. That has two effects: * the column will go away, not just lose its default, if any dependency of the expression is dropped with CASCADE. So we don't need any special mechanism to make that happen. * it provides an additional cross-check preventing someone from dropping the default expression without dropping the column. catversion bump because of change in the contents of pg_depend (which also requires a change in one information_schema view). Per bug #17439 from Kevin Humphreys. Although this is a longstanding bug, it seems impractical to back-patch because of the need for catalog contents changes. Discussion: https://postgr.es/m/17439-7df4421197e928f0@postgresql.org
*	Move pg_attrdef manipulation code into new file catalog/pg_attrdef.c.	Tom Lane	2022-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a pure refactoring commit: there isn't (I hope) any functional change. StoreAttrDefault and RemoveAttrDefault[ById] are moved from heap.c, reducing the size of that overly-large file by about 300 lines. I took the opportunity to trim unused #includes from heap.c, too. Two new functions for translating between a pg_attrdef OID and the relid/attnum of the owning column are created by extracting ad-hoc code from objectaddress.c. This already removes one copy of said code, and a follow-on bug fix will create more callers. The only other function directly manipulating pg_attrdef is AttrDefaultFetch. I judged it was better to leave that in relcache.c, since it shares special concerns about recursion and error handling with the rest of that module. Discussion: https://postgr.es/m/651168.1647451676@sss.pgh.pa.us
*	Fix risk of deadlock failure while dropping a partitioned index.	Tom Lane	2022-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DROP INDEX needs to lock the index's table before the index itself, else it will deadlock against ordinary queries that acquire the relation locks in that order. This is correctly mechanized for plain indexes by RangeVarCallbackForDropRelation; but in the case of a partitioned index, we neglected to lock the child tables in advance of locking the child indexes. We can fix that by traversing the inheritance tree and acquiring the needed locks in RemoveRelations, after we have acquired our locks on the parent partitioned table and index. While at it, do some refactoring to eliminate confusion between the actual and expected relkind in RangeVarCallbackForDropRelation. We can save a couple of syscache lookups too, by having that function pass back info that RemoveRelations will need. Back-patch to v11 where partitioned indexes were added. Jimmy Yih, Gaurab Dey, Tom Lane Discussion: https://postgr.es/m/BYAPR05MB645402330042E17D91A70C12BD5F9@BYAPR05MB6454.namprd05.prod.outlook.com