Author: Yongtao Huang
Discussion: https://postgr.es/m/CAOe1Go1F99o5JsphtXdDC5bxm7AzetU8q3AxLh4AAVGKu1AzEQ@mail.gmail.com
This is support function 12 for the GiST AM and translates
"well-known" RT*StrategyNumber values into whatever strategy number is
used by the opclass (since no particular numbers are actually
required). We will use this to support temporal PRIMARY
KEY/UNIQUE/FOREIGN KEY/FOR PORTION OF functionality.

This commit adds two implementations, one for internal GiST opclasses
(just an identity function) and another for btree_gist opclasses. It
updates btree_gist from 1.7 to 1.8, adding the support function for
all its opclasses.
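For example, an existing installation picks up the new support
function once the extension is updated to the new version:

    ALTER EXTENSION btree_gist UPDATE TO '1.8';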
Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
Commit cb970240f13df2b63f0410f81f452179a2b78d6f moved some code from
lazy_scan_heap() to lazy_scan_prune(), and now some things that used to
need to be passed back and forth are completely local to lazy_scan_prune().
Hence, this struct is mostly obsolete. The only thing that still
needs to be passed back to the caller is has_lpdead_items, and that's
also passed back by lazy_scan_noprune(), so do it the same way in both
cases.
Melanie Plageman, reviewed and slightly revised by me.
Discussion: http://postgr.es/m/CAAKRu_aM=OL85AOr-80wBsCr=vLVzhnaavqkVPRkFBtD0zsuLQ@mail.gmail.com
Most of the output parameters of lazy_scan_prune() were being
used to update the VM in lazy_scan_heap(). Moving that code into
lazy_scan_prune() simplifies lazy_scan_heap() and requires less
communication between the two.
This change permits some further code simplification, but that
is left for a separate commit.
Melanie Plageman, reviewed by me.
Discussion: http://postgr.es/m/CAAKRu_aM=OL85AOr-80wBsCr=vLVzhnaavqkVPRkFBtD0zsuLQ@mail.gmail.com
If there are no indexes on a relation, items can be marked LP_UNUSED
instead of LP_DEAD when pruning. This significantly reduces WAL
volume, since we no longer need to emit one WAL record for pruning
and a second to change the LP_DEAD line pointers thus created to
LP_UNUSED.
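A minimal sketch of the case this helps (hypothetical table name; any
table without indexes qualifies):

    CREATE TABLE audit_log (payload text);   -- deliberately no indexes
    INSERT INTO audit_log SELECT 'x' FROM generate_series(1, 10000);
    DELETE FROM audit_log;
    VACUUM audit_log;  -- dead items can be set to LP_UNUSED directly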
Melanie Plageman, reviewed by Andres Freund, Peter Geoghegan, and me
Discussion: https://postgr.es/m/CAAKRu_bgvb_k0gKOXWzNKWHt560R0smrGe3E8zewKPs8fiMKkw%40mail.gmail.com
try_index_open() is able to open an index if its relkind fits, except
that it returns NULL instead of generating an error if the relation
does not exist. This new routine will be used by an upcoming patch to
make REINDEX on partitioned relations more robust when an index in a
partition tree is dropped.
Extracted from a larger patch by the same author.
Author: Fei Changhong
Discussion: https://postgr.es/m/tencent_6A52106095ACDE55333E3AD33F304C0C3909@qq.com
Backpatch-through: 14
Previously, when lazy_scan_noprune() was called and returned true, we would
update the FSM immediately if the relation had no indexes or if the page
contained no dead items. On the other hand, when lazy_scan_prune() was
called, we would update the FSM if either of those things was true or
if index vacuuming was disabled. Eliminate that behavioral difference by
considering vacrel->do_index_vacuuming in both cases.
Also, make lazy_scan_heap() responsible for deciding whether to update
the FSM, instead of doing it inside lazy_scan_noprune(). This is
more consistent with the lazy_scan_prune() case. lazy_scan_noprune()
still needs an output parameter for whether there are LP_DEAD items
on the page, but the real decision-making now happens in the caller.
Patch by me, reviewed by Peter Geoghegan and Melanie Plageman.
Discussion: http://postgr.es/m/CA+TgmoaOzvN1TcHd9iej=PR3fY40En1USxzOnXSR2CxCLaRM0g@mail.gmail.com
This changes the pg_attribute field attstattarget into a nullable
field in the variable-length part of the row. If no value is set by
the user for attstattarget, it is now null instead of previously -1.
This saves space in pg_attribute and tuple descriptors for most
practical scenarios. (ATTRIBUTE_FIXED_PART_SIZE is reduced from 108
to 104.) Also, null is the semantically more correct value.

The ANALYZE code internally continues to represent the default
statistics target by -1, so that that code can avoid having to deal
with null values. But that is now contained to the ANALYZE code.
Only the DDL code has to deal with a possibly-null attstattarget.

For system columns, the field is now always null. The ANALYZE code
skips system columns anyway.

To set a column's statistics target to the default value, the new
command form ALTER TABLE ... SET STATISTICS DEFAULT can be used. (SET
STATISTICS -1 still works.)
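For example (hypothetical table and column names):

    ALTER TABLE measurements
        ALTER COLUMN reading SET STATISTICS DEFAULT;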
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/4da8d211-d54d-44b9-9847-f2a9f1184c76@eisentraut.org
Instead, just have lazy_scan_prune() and lazy_scan_noprune() update
LVRelState->nonempty_pages directly. This makes the two functions
more similar and also means that lazy_scan_noprune() needs one fewer
output parameter.
Melanie Plageman, reviewed by Andres Freund, Michael Paquier, and me
Discussion: http://postgr.es/m/CAAKRu_btji_wQdg=ok-5E4v_bGVxKYnnFFe7RA6Frc1EcOwtSg@mail.gmail.com
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz
Backpatch-through: 12
Dagfinn Ilmari Mannsåker, reviewed by Shubham Khanna.
Discussion: http://postgr.es/m/87le9fmi01.fsf@wibble.ilmari.org
Dagfinn Ilmari Mannsåker, reviewed by Shubham Khanna. Some subtractions
by me.
Discussion: http://postgr.es/m/87le9fmi01.fsf@wibble.ilmari.org
The brinbuildCallbackParallel callback used by parallel BRIN builds did
not consider that parallel table scans may be synchronized, starting
from an arbitrary block and then wrapping around.

If this happened and the scan actually did wrap around, tuples from the
beginning of the table were added to the last range produced by the same
worker. The index would then be missing ranges at the beginning of the
table, while the last range would be too wide. This would not produce
incorrect query results, but it'd be less efficient.
Fixed by checking for both past and future ranges in the callback. The
worker may produce multiple summaries for the same page range, but the
leader will merge them as if the summaries came from different workers.
Discussion: https://postgr.es/m/c2ee7d69-ce17-43f2-d1a0-9811edbda6e6%40enterprisedb.com
Commit b437571714 added support for parallel builds for BRIN indexes,
using code similar to BTREE parallel builds, and also a new tuplesort
variant. This commit simplifies the new code in two ways:
* The "spool" grouping the tuplesort and the heap/index is not
necessary. The heap/index are already available as separate arguments,
so the extra wrapping only caused confusion. Remove the spool and use
the tuplesort directly.
* The new tuplesort variant does not need the heap/index, as it sorts
simply by the range block number, without accessing the tuple data.
So simplify that too.
Initial report and patch by Ranier Vilela, further cleanup by me.
Author: Ranier Vilela
Discussion: https://postgr.es/m/CAEudQAqD7f2i4iyEaAz-5o-bf6zXVX-AkNUBm-YjUXEemaEh6A%40mail.gmail.com
Recently-introduced code in reconstruct.c was using "unsigned"
to store the result of read(), pg_pread(), or write(). This is
completely bogus: it breaks subsequent tests for the result being
negative, as we're being reminded of by a chorus of buildfarm
warnings. Switch to "int" as was doubtless intended. (There are
several other uses of "unsigned" in this file that also look poorly
chosen to me, but for now I'm just trying to clean up the buildfarm.)

A larger problem is that "int" is not necessarily wide enough to hold
the result: per POSIX, all these functions return ssize_t. In places
where the requested read or write length clearly fits in int, that's
academic. It may be academic anyway as long as we constrain
individual data files to 1GB, since even a readv or writev-like
operation would then not be responsible for transferring more than
1GB. Nonetheless it seems like trouble waiting to happen, so I made
a pass over readv and writev calls and fixed the result variables
where that seemed appropriate. We might want to think about changing
some of the fd.c functions to return ssize_t too, for future-proofing;
but I didn't tackle that here.
Discussion: https://postgr.es/m/1672202.1703441340@sss.pgh.pa.us
e0b1ee17dc introduced an optimization for matching B-tree scan keys
required for the directional scan. However, it incorrectly assumed that
all keys required for the opposite-direction scan are satisfied by
_bt_first(). It has been illustrated that with multiple scan keys over
the same column, a lesser one (according to the scan direction) could
win, leaving the other one unsatisfied.

Instead of relying on _bt_first(), this commit introduces code that
remembers whether there was at least one match on the page. If so, we
know that the keys required for the opposite-direction scan are
satisfied as long as the corresponding values are not NULL.

Also, this commit simplifies the description of the optimization for
keys required for the current-direction scan. The flag used for this is
now named continuescanPrechecked and means exactly that the
*continuescan flag is known to be true for the last item on the page.
Reported-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-Wzn0LeLcb1PdBnK0xisz8NpHkxRrMr3NWJ%2BKOK-WZ%2BQtTQ%40mail.gmail.com
Reviewed-by: Pavel Borisov
It's not necessary to keep the firstPage flag as a field of BTScanOpaqueData.
This commit makes it an argument of the _bt_readpage() function. We can easily
distinguish first-time and repeated calls (within the scan) of this function.
Reported-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-Wzk4SOsw%2BtHuTFiz8U9Jqj-R77rYPkhWKODCBb1mdHACXA%40mail.gmail.com
Reviewed-by: Pavel Borisov
This is necessary when spgcanreturn() is invoked on a partitioned
index, and the failure might be reachable in other scenarios as
well. The rest of what spgGetCache() does is perfectly sensible
for a partitioned index, so we should allow it to go through.
I think the main takeaway from this is that we lack sufficient test
coverage for non-btree partitioned indexes. Therefore, I added
simple test cases for brin and gin as well as spgist (hash and
gist AMs were covered already in indexing.sql).
Per bug #18256 from Alexander Lakhin. Although the known test case
only fails since v16 (3c569049b), I've got no faith at all that there
aren't other ways to reach this problem; so back-patch to all
supported branches.
Discussion: https://postgr.es/m/18256-0b0e1b6e4a620f1b@postgresql.org
Commit 6af179395 added an isCatalogRel field to some WAL record types,
but this field was not shown in the rmgr descriptions. This commit
updates the relevant rmgr descriptions to display the isCatalogRel
field.
Author: Bertrand Drouvot
Reviewed-by: Michael Paquier, Masahiko Sawada
Discussion: https://postgr.es/m/957dc8f9-2a02-4640-9c01-9dcbf97c4187%40gmail.com
To take an incremental backup, you use the new replication command
UPLOAD_MANIFEST to upload the manifest for the prior backup. This
prior backup could either be a full backup or another incremental
backup. You then use BASE_BACKUP with the INCREMENTAL option to take
the backup. pg_basebackup now has an --incremental=PATH_TO_MANIFEST
option to trigger this behavior.

An incremental backup is like a regular full backup except that
some relation files are replaced with files with names like
INCREMENTAL.${ORIGINAL_NAME}, and the backup_label file contains
additional lines identifying it as an incremental backup. The new
pg_combinebackup tool can be used to reconstruct a data directory
from a full backup and a series of incremental backups.

Patch by me. Reviewed by Matthias van de Meent, Dilip Kumar, Jakub
Wartak, Peter Eisentraut, and Álvaro Herrera. Thanks especially to
Jakub for incredibly helpful and extensive testing.
Discussion: http://postgr.es/m/CA+TgmoYOYZfMCyOXFyC-P+-mdrZqm5pP2N7S-r0z3_402h9rsA@mail.gmail.com
When active, this process writes WAL summary files to
$PGDATA/pg_wal/summaries. Each summary file contains information for a
certain range of LSNs on a certain TLI. For each relation, it stores a
"limit block" which is 0 if a relation is created or destroyed within
a certain range of WAL records, or otherwise the shortest length to
which the relation was truncated during that range of WAL records, or
otherwise InvalidBlockNumber. In addition, it stores a list of blocks
which have been modified during that range of WAL records, but
excluding blocks which were removed by truncation after they were
modified and never subsequently modified again.

In other words, it tells us which blocks need to be copied in case of
an incremental backup covering that range of WAL records. But this
doesn't yet add the capability to actually perform an incremental
backup; the next patch will do that.

A new parameter, summarize_wal, enables or disables this new background
process. The background process also automatically deletes summary
files that are older than wal_summary_keep_time, if that parameter
has a non-zero value and the summarizer is configured to run.
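For instance, the summarizer can be enabled on a running server with:

    ALTER SYSTEM SET summarize_wal = on;
    SELECT pg_reload_conf();
    -- summary files then start appearing under pg_wal/summaries/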
Patch by me, with some design help from Dilip Kumar and Andres Freund.
Reviewed by Matthias van de Meent, Dilip Kumar, Jakub Wartak, Peter
Eisentraut, and Álvaro Herrera.
Discussion: http://postgr.es/m/CA+TgmoYOYZfMCyOXFyC-P+-mdrZqm5pP2N7S-r0z3_402h9rsA@mail.gmail.com
First, mark the xlblocks member with InvalidXLogRecPtr, then issue a
write barrier, then initialize it. That ensures that the xlblocks
member doesn't appear valid while the contents are being initialized.
In preparation for reading WAL buffer contents without a lock.
Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACVfFMfqD5oLzZSQQZWfXiJqd-NdX0_317veP6FuB31QWA@mail.gmail.com
Reviewed-by: Andres Freund
In preparation for reading the contents of WAL buffers without a
lock. Also, avoids the previously-needed comment in GetXLogBuffer()
explaining why it's safe from torn reads.
Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACVfFMfqD5oLzZSQQZWfXiJqd-NdX0_317veP6FuB31QWA@mail.gmail.com
Reviewed-by: Andres Freund
Dead tuples are ignored and are not marked as dead during recovery, as
that can lead to MVCC issues on a standby because its xmin may not
match the primary's. This information is tracked by a field called
"xactStartedInRecovery" in the transaction state data, switched on when
starting a transaction in recovery.

Unfortunately, this information was not correctly tracked when starting
a subtransaction, because the transaction state used for the
subtransaction did not update "xactStartedInRecovery" based on the state
of its parent. This would cause index scans done in subtransactions to
return inconsistent data, depending on how the xmin of the primary
and/or the standby evolved.
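For example (hypothetical table name), an affected case is an index
scan issued from inside a savepoint on a hot standby:

    BEGIN;
    SAVEPOINT s1;
    SELECT * FROM accounts WHERE id = 42;  -- index scan in a subxact
    ROLLBACK;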
This is broken since the introduction of hot standby in efc16ea52067, so
backpatch all the way down.
Author: Fei Changhong
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/tencent_C4D907A5093C071A029712E73B43C6512706@qq.com
Backpatch-through: 12
This GUC was intended as a debugging aid back in the 9.0 era, when hot
standby and streaming replication were being developed, able to offer
more information at LOG level rather than DEBUGn. There are more tools
available these days that are able to offer equivalent information,
like pg_waldump introduced in 9.3. It is no longer obvious how this
facility is useful, so let's remove it.
Author: Bharath Rupireddy
Discussion: https://postgr.es/m/ZXEXEAUVFrvpquSd@paquier.xyz
Remove comments that supposed that holding a pin was a useful interlock
for _bt_walk_left(). There are times when _bt_walk_left() doesn't hold
either a lock or a pin on any page, so clearly this can't be true.
_bt_walk_left() is even prepared to deal with concurrent deletion of
both the original page and any pages to its left.
Oversight in commit 2ed5b87f96.
Teach _bt_binsrch (and related helper routines like _bt_search and
_bt_compare) about the initial positioning requirements of backward
scans. Routines like _bt_binsrch already know all about "nextkey"
searches, so it seems natural to teach them about "goback"/backward
searches, too. These concepts are closely related, and are much easier
to understand when discussed together.

Now that certain implementation details are hidden from _bt_first, it's
straightforward to add a new optimization: backward scans using the <
strategy now avoid extra leaf page accesses in certain "boundary cases".
Consider the following example, which uses the tenk1 table (and its
tenk1_hundred index) from the standard regression tests:

    SELECT * FROM tenk1 WHERE hundred < 12 ORDER BY hundred DESC LIMIT 1;

Before this commit, nbtree would scan two leaf pages, even though it was
only really necessary to scan one leaf page. We'll now descend straight
to the leaf page containing a (12, -inf) high key instead. The scan
will locate matching non-pivot tuples with "hundred" values starting
from the value 11. The scan won't waste a page access on the right
sibling leaf page, which cannot possibly contain any matching tuples.
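One way to observe the reduced leaf page accesses is the Buffers line
of EXPLAIN for the same query (a sketch; run against the regression
test database):

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM tenk1 WHERE hundred < 12 ORDER BY hundred DESC LIMIT 1;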
You can think of the optimization added by this commit as disabling an
optimization (the _bt_compare "!pivotsearch" behavior that was added to
Postgres 12 in commit dd299df8) for a small subset of cases where it was
always counterproductive.
Equivalently, you can think of the new optimization as extending the
"pivotsearch" behavior that page deletion by VACUUM has long required
(since the aforementioned Postgres 12 commit went in) to other, similar
cases. Obviously, this isn't strictly necessary for these new cases
(unlike VACUUM, _bt_first is prepared to move the scan to the left once
on the leaf level), but the underlying principle is the same.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=XPzM8HzaLPq278Vms420mVSHfgs9wi5tjFKHcapZCEw@mail.gmail.com
Allow using multiple worker processes to build a BRIN index, which until
now was supported only for BTREE indexes. For large tables this often
results in significant speedup when the build is CPU-bound.

The work is split in a simple way - each worker builds BRIN summaries on
a subset of the table, determined by the regular parallel scan used to
read the data, and feeds them into a shared tuplesort which sorts them
by blkno (start of the range). The leader then reads this sorted stream
of ranges, merges duplicates (which may happen if the parallel scan does
not align with BRIN pages_per_range), and adds the resulting ranges into
the index.

The number of duplicate results produced by workers (requiring merging
in the leader process) should be fairly small, thanks to how parallel
scans assign chunks to workers. The likelihood of duplicate results may
increase for higher pages_per_range values, but then there are fewer
page ranges in total. In any case, we expect the merging to be much
cheaper than summarization, so this should be a win.

Most of the parallelism infrastructure is a simplified copy of the code
used by BTREE indexes, omitting the parts irrelevant for BRIN indexes
(e.g. uniqueness checks).

This also introduces a new index AM flag amcanbuildparallel, determining
whether to attempt to start parallel workers for the index build.
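A minimal sketch (hypothetical table and index names); as with btree,
the number of workers requested is subject to
max_parallel_maintenance_workers:

    SET max_parallel_maintenance_workers = 4;
    CREATE INDEX measurements_ts_brin
        ON measurements USING brin (ts) WITH (pages_per_range = 128);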
Original patch by me, with reviews and substantial reworks by Matthias
van de Meent, certainly enough to make him a co-author.
Author: Tomas Vondra, Matthias van de Meent
Reviewed-by: Matthias van de Meent
Discussion: https://postgr.es/m/c2ee7d69-ce17-43f2-d1a0-9811edbda6e6%40enterprisedb.com
When building BRIN indexes, the brinbuildCallback only advances to the
next page range when seeing a tuple that doesn't belong to the current
one. This means that the index may end up missing ranges at the end of
the table, if those pages do not contain any indexable tuples.

We tend not to have completely empty pages at the end of a relation, but
this also applies to partial indexes, where the tuples may simply not
match the index predicate. This results in inefficient scans using the
affected BRIN index - without the summaries, the page ranges have to be
read and processed, which consumes I/O and possibly also CPU time.
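For instance (hypothetical names), the trailing pages of the table
behind a partial index like the following may contain no matching rows
at all, and previously ended up without summaries:

    CREATE INDEX orders_active_brin
        ON orders USING brin (created_at) WHERE NOT archived;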
The existing code already added empty ranges for earlier parts of the
table; this commit makes sure we add them for the ranges at the end of
the table too.
Patch by Matthias van de Meent, with review/improvements by me.
Author: Matthias van de Meent
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/CAEze2WiMsPZg%3DxkvSF_jt4%3D69k6K7gz5B8V2wY3gCGZ%2B1BzCbQ%40mail.gmail.com
The old name was misleading: it's not a cache; the values kept in the
struct are the authoritative source.
Reviewed-by: Tristan Partin, Richard Guo
Discussion: https://www.postgresql.org/message-id/6537d63d-4bb5-46f8-9b5d-73a8ba4720ab@iki.fi
For the sake of consistency.
Reviewed-by: Tristan Partin, Richard Guo
Discussion: https://www.postgresql.org/message-id/6537d63d-4bb5-46f8-9b5d-73a8ba4720ab@iki.fi
While checking if a record could fit in the circular WAL decoding
buffer, the coding from commit 3f1ce973 used arithmetic that could
overflow. 64 bit systems were unaffected for various technical reasons,
which probably explains the lack of problem reports. Likewise for 32
bit systems running known 32 bit kernels. The systems at risk of
problems appear to be 32 bit processes running on 64 bit kernels, with
unlucky placement in memory.
Per complaint from GCC -fsanitize=undefined -m32, while testing
variations of 039_end_of_wal.pl.
Back-patch to 15.
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKH0oRPOX7DhiQ_b51sM8HqcPp2J3WA-Oen%3DdXog%2BAGGQ%40mail.gmail.com
This has been broken since b060dbe0001a, which reworked the callback
mechanism of XLogReader. It most likely went unnoticed because any form
of development involving WAL happens on platforms where this compiles
fine.
Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACVF14WKQMFwcJ=3okVDhiXpuK5f7YdT+BdYXbbypMHqWA@mail.gmail.com
Backpatch-through: 13
Commit 861f86beea changed hash_xlog_squeeze_page() to start reading
the write buffer conditionally, but forgot to initialize it, leading to
an uninitialized access.
Reported-by: Alexander Lakhin
Author: Hayato Kuroda
Reviewed-by: Alexander Lakhin, Amit Kapila
Discussion: http://postgr.es/m/62ed1a9f-746a-8e86-904b-51b9b806a1d9@gmail.com
Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/55d8800f-4a80-5256-1e84-246fbe79acd0@gmail.com
Reported-by: Alexander Lakhin, Tom Lane
Author: Pavel Borisov
Discussion: https://postgr.es/m/55d8800f-4a80-5256-1e84-246fbe79acd0@gmail.com
Quotes are applied to GUC names in a very inconsistent way across the
code base, with a mix of double quotes or no quotes used. This commit
removes the double quotes around all the GUC names that are obviously
recognizable as parameters rather than plain English words (they use
underscores, mixed case, etc.).
This is the result of a discussion with Álvaro Herrera, Nathan Bossart,
Laurenz Albe, Peter Eisentraut, Tom Lane and Daniel Gustafsson.
Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
Switch from using TransactionId to FullTransactionId in the naming of
2PC files. Transaction state files in the pg_twophase directory now
have an extra 8 bytes in the name to address the epoch of a given xid.
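The resulting file names (now 16 hex digits, covering the full 64-bit
transaction ID) can be inspected with, for example:

    SELECT * FROM pg_ls_dir('pg_twophase');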
Author: Maxim Orlov, Aleksander Alekseev, Alexander Korotkov, Teodor Sigaev
Author: Nikita Glukhov, Pavel Borisov, Yura Sokolov
Reviewed-by: Jacob Champion, Heikki Linnakangas, Alexander Korotkov
Reviewed-by: Japin Li, Pavel Borisov, Tom Lane, Peter Eisentraut, Andres Freund
Reviewed-by: Andrey Borodin, Dilip Kumar, Aleksander Alekseev
Discussion: https://postgr.es/m/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com
Discussion: https://postgr.es/m/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq%2BvfkmTF5Q%40mail.gmail.com
We've had repeated bugs in the area of handling SLRU wraparound in the past,
some of which have caused data loss. Switching to an indexing system for SLRUs
that does not wrap around should allow us to get rid of a whole bunch
of problems and improve the overall reliability of the system.
This particular patch, however, only changes the indexing and doesn't
address the wraparound per se. That is going to be done in the
following patches.
Author: Maxim Orlov, Aleksander Alekseev, Alexander Korotkov, Teodor Sigaev
Author: Nikita Glukhov, Pavel Borisov, Yura Sokolov
Reviewed-by: Jacob Champion, Heikki Linnakangas, Alexander Korotkov
Reviewed-by: Japin Li, Pavel Borisov, Tom Lane, Peter Eisentraut, Andres Freund
Reviewed-by: Andrey Borodin, Dilip Kumar, Aleksander Alekseev
Discussion: https://postgr.es/m/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com
Discussion: https://postgr.es/m/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq%2BvfkmTF5Q%40mail.gmail.com
If the tuple being updated is not visible to the crosscheck snapshot,
we return TM_Updated but the assertions would not hold in that case.
Move them to before the cross-check.
Fixes bug #17893. Backpatch to all supported versions.
Author: Alexander Lakhin
Backpatch-through: 12
Discussion: https://www.postgresql.org/message-id/17893-35847009eec517b5%40postgresql.org
Fix a bug introduced by c1ec02be1d79. It may happen that the executor
opens indexes on the result relation, but no rows end up being
inserted. Then index_insert_cleanup() still gets executed, but passes
down NULL to the AM callback. The AM callback may not expect this, as
is the case for brininsertcleanup, leading to a crash.

Fixed by only calling the cleanup callback if (ii_AmCache != NULL).
This way the AM can simply assume it only ever sees a valid cache.
Reported-by: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-w9qC-o9hQox9UHvdVZAYTp8OrPQOKtwbvzWaRejTT=Q@mail.gmail.com
XLogSetAsyncXactLSN(), called at asynchronous commit, would wake up
walwriter every time the LSN advances, but walwriter doesn't actually
do anything unless it has at least 'wal_writer_flush_after' full
blocks of WAL to write. Repeatedly waking up walwriter to do nothing
is a waste of CPU cycles in both walwriter and the backends doing the
wakeups. To fix, apply the same logic in XLogSetAsyncXactLSN() to
decide whether to wake up walwriter, as walwriter uses to determine if
it has any work to do.
In passing, rename the misleadingly named 'flushbytes' local variable
to 'flushblocks'.
Author: Andres Freund, Heikki Linnakangas
Discussion: https://www.postgresql.org/message-id/20231024230929.vsc342baqs7kmbte@awork3.anarazel.de
Per buildfarm member koel.
The brininsert code used to initialize (and destroy) BrinDesc and
BrinRevmap for each tuple, which is not free. This patch initializes
these structures only once, and reuses them for all inserts in the same
command. The data is passed through indexInfo->ii_AmCache.
This also introduces an optional AM callback "aminsertcleanup" that
allows performing custom cleanup in case simply pfree-ing ii_AmCache is
not sufficient (which is the case when the cache contains TupleDesc,
Buffers, and so on).
Author: Soumyadeep Chakraborty
Reviewed-by: Alvaro Herrera, Matthias van de Meent, Tomas Vondra
Discussion: https://postgr.es/m/CAE-ML%2B9r2%3DaO1wwji1sBN9gvPz2xRAtFUGfnffpd0ZqyuzjamA%40mail.gmail.com
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/CAB7nPqSDdF0heotQU3gsepgqx+9c+6KjLd3R6aNYH7KKfDd2ig@mail.gmail.com
Author: Michael Paquier
Backpatch-through: master
Previously these functions returned the previous segment number if the
LSN was on a segment boundary. We now always return the current segment
number for an LSN.
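Assuming these are pg_walfile_name() and pg_walfile_name_offset(), an
LSN exactly on a segment boundary now maps to the segment beginning at
that LSN; for example, with the default 16MB segment size:

    SELECT pg_walfile_name('0/2000000'::pg_lsn);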
Docs updated to reflect this change. Regression tests added (authored
by Andres Freund).
Also mentioned in thread https://postgr.es/m/flat/20220204225057.GA1535307%40nathanxps13#d964275c9540d8395e138efc0a75f7e8
BACKWARD INCOMPATIBILITY
Reported-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20190726.172120.101752680.horikyota.ntt@gmail.com
Co-authored-by: Kyotaro Horiguchi
Backpatch-through: master
When there are no indexes on a table, we vacuum each heap block after
pruning it and then update the freespace map. Periodically, we also
vacuum the freespace map. This was done while unnecessarily holding a
lock on the heap page. Release the lock before calling
FreeSpaceMapVacuumRange() and, while we're at it, ensure the range
includes the heap block we just vacuumed.
There are no known deadlocks or other similar issues, therefore don't
backpatch. It's certainly not good to do all this work under a lock, but it's
not frequently reached, making it not worth the risk of backpatching.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAAKRu_YiL%3D44GvGnt1dpYouDSSoV7wzxVoXs8m3p311rp-TVQQ%40mail.gmail.com
As of commit eaa5808e8e, MemoryContextResetAndDeleteChildren() is
just a backwards compatibility macro for MemoryContextReset(). Now
that some time has passed, this macro seems more likely to create
confusion.

This commit removes the macro and replaces all remaining uses with
calls to MemoryContextReset(). Any third-party code that uses this
macro will need to be adjusted to call MemoryContextReset()
instead. Since the two have behaved the same way since v9.5, such
adjustments won't produce any behavior changes for all
currently-supported versions of PostgreSQL.
Reviewed-by: Amul Sul, Tom Lane, Alvaro Herrera, Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/20231113185950.GA1668018%40nathanxps13
Alexander reported a crash with repeated create + drop database, after
the ResourceOwner rewrite (commit b8bff07daa). That was fixed by the
previous commit, but it nevertheless seems like a good idea to clear
CurrentResourceOwner earlier, because you're not supposed to use it
for anything after we start releasing it.
Reviewed-by: Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/11b70743-c5f3-3910-8e5b-dd6c115ff829%40gmail.com
It's clearly stated in the comments that ginFindParents() must keep
the pin on the index's root page that's associated with the topmost
GinBtreeStack item. However, the code path for the case that the
desired downlink has been pushed down to the next index level
ignored this proviso, and would release the pin anyway if we were
still examining the root level. That led to an assertion failure
or "buffer NNNN is not owned by resource owner" error later, when
we try to release the pin again at the end of the insertion.

This is quite hard to reproduce, since it can only happen if an
index root page split occurs concurrently with our own insertion.
Thanks to Jeff Janes for finding a test case that triggers it
often enough to allow investigation.

This has been there since the beginning of GIN, so back-patch
to all supported branches.
Discussion: https://postgr.es/m/CAMkU=1yCAKtv86dMrD__Ja-7KzjE=uMeKX8y__cx5W-OEWy2ow@mail.gmail.com