| Commit message (Collapse) | Author | Age |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For import and export, use schemaname/relname rather than
regclass.
This is more natural during export, fits with the other arguments
better, and it gives better control over error handling in case we
need to downgrade more errors to warnings.
Also, use text for the argument types for schemaname, relname, and
attname so that casts to "name" are not required.
Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CADkLM=ceOSsx_=oe73QQ-BxUFR2Cwqum7-UP_fPe22DBY0NerA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The default interval for \watch to wait between executing queries,
when executed without a specified interval, was hardcoded to two
seconds. This adds the new variable WATCH_INTERVAL which is used
to set the default interval, making it configurable for the user.
This makes \watch the first command which has a user configurable
default setting.
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/B2FD26B4-8F64-4552-A603-5CC3DF1C7103@yesql.se
|
|
|
|
|
|
|
|
|
| |
This adds a missing PQclear in the error path of StreamLogicalLog, a
fix in the same vein as e889422d98e with an equivalent low impact.
Author: Steven Niu <niushiji@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/c4b1c627-a3e4-4347-a670-1e28a43ce0eb@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
createForeignKeyCheckTriggers()
Currently, createForeignKeyCheckTriggers() takes a Relation type as
its first argument, but it doesn't use that argument directly.
Instead, it fetches the relation OID by calling RelationGetRelid().
Therefore, it would be more consistent with other functions (e.g.,
createForeignKeyCheckTriggers()) to pass the relation OID directly
instead of the whole Relation.
Author: Amul Sul <amul.sul@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Split ATExecAlterConstraintInternal() into two functions:
ATExecAlterConstrDeferrability() and
ATExecAlterConstrInheritability(). This simplifies the code and
avoids unnecessary confusion caused by recursive code, which isn't
needed for ATExecAlterConstrInheritability().
(This also takes over the changes in commit 64224a834ce, as the new
AlterConstrDeferrabilityRecurse() is essentially the old
ATExecAlterChildConstr().)
Author: Amul Sul <amul.sul@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
|
|
|
|
|
|
|
|
|
| |
This extracts common/duplicate code for different ALTER CONSTRAINT
variants into a common function. We plan to add more variants that
would use the same code.
Author: Amul Sul <amul.sul@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Small fixes for commit f4e53e10b6c: Add missing calls to
InvokeObjectPostAlterHook() and also CacheInvalidateRelcache(). The
former change could have a user-visible effect. The latter omission
might have caused other bugs, but it is not clear whether one actually
existed. With these changes, the code is now more consistent with
similar ALTER CONSTRAINT variants, especially the ones that set the
deferrability.
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAF1DzPVfOW6Kk=7SSh7LbneQDJWh=PbJrEC_Wkzc24tHOyQWGg@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we used pg_int64 in three function prototypes in libpq. It
was added by commit 461ef73f to expose the platform-dependent type used
for int64 in the C89 era. As of commit 962da900 it is defined as
standard int64_t, and the dust seems to have settled.
Let's just use int64_t directly in these three client-facing functions
instead of (yet) another name. We've required C99 and thus <stdint.h>
since PostgreSQL 12, C89 and C++98 compilers are long gone, and client
applications very likely use standard types for their own 64-bit needs.
This also cleans up the obscure placement of a new #include <stdint.h>
directive in postgres_ext.h, required for the new definition. The
typedef was hiding in there for historical reasons, but it doesn't fit
postgres_ext.h's own description of its purpose and there is no evidence
of client applications including postgres_ext.h directly to see it.
Keep a typedef marked deprecated for backward compatibility, but move it
into libpq-fe.h where it was used.
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CA%2BhUKGKn_EkNNGMY5RzMcKP%2Ba6urT4JF%3DCPhw_zHtQwjvX6P2g%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
The network (inet) support functions currently only supported a
hardcoded btree operator family. With the generalized compare type
facility, we can generalize this to support any operator family from
any index type that supports the required operators.
Author: Mark Dilger <mark.dilger@enterprisedb.com>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This option gives the possibility for query jumble to define a custom
routine for the field of a Node, extending support for
custom_query_jumble as a node field attribute. When dealing with
complex node structures, this can be simpler than having to enforce a
custom function across a full node.
Custom functions need to be defined in queryjumblefuncs.c, named as
_jumble${node}_${field}(), and use in input the JumbleState, the node
and its field. The field is not really required if we have the Node,
but it makes custom implementations somewhat easier to think about. The
code generated by gen_node_support.pl uses a macro called
JUMBLE_CUSTOM(), hiding the internals of the logic inside
queryjumblefuncs.c.
This will be used by an upcoming patch manipulating adding a custom
routine into a field of RangeTblEntry, but this facility can become
useful in more cases.
Reviewed-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/Z9y43-dRvb4EtxQ0@paquier.xyz
|
|
|
|
|
|
|
|
|
|
| |
Reduces memory required for hash aggregation by avoiding an allocation
and a pointer in the TupleHashEntryData structure. That structure is
used for all buckets, whether occupied or not, so the savings is
substantial.
Discussion: https://postgr.es/m/AApHDvpN4v3t_sdz4dvrv1Fx_ZPw=twSnxuTEytRYP7LFz5K9A@mail.gmail.com
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
|
|
|
|
|
|
|
|
|
| |
Allows an "extra" argument that allocates extra memory at the end of
the MinimalTuple. This is important for callers that need to store
additional data, but do not want to perform an additional allocation.
Suggested-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvppeqw2pNM-+ahBOJwq2QmC0hOAGsmCpC89QVmEoOvsdg@mail.gmail.com
|
|
|
|
|
|
|
| |
Refactor for upcoming optimizations.
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/1cc3b400a0e8eead18ff967436fa9e42c0c14cfb.camel@j-davis.com
|
|
|
|
|
|
|
|
|
|
| |
The entries aren't freed until the entire hash table is destroyed, so
use the Bump allocator to improve allocation speed, avoid wasting
space on the chunk header, and avoid wasting space due to the
power-of-two allocations.
Discussion: https://postgr.es/m/CAApHDvqv1aNB4cM36FzRwivXrEvBO_LsG_eQ3nqDXTjECaatOQ@mail.gmail.com
Reviewed-by: David Rowley
|
|
|
|
|
|
| |
Author: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CALDaNm2ms1deM5EYNLFEfESv_Kw=Y4AiTB0LP=qGS-UpFwGbPg@mail.gmail.com
Discussion: https://postgr.es/m/CABdArM7FW-_dnthGkg2s0fy1HhUB8C3ELA0gZX1kkbs1ZZoV3Q@mail.gmail.com
|
|
|
|
|
|
|
| |
Forgot to update the comment atop one of the functions.
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/OSCPR01MB1496623BE1125B44614494E7AF5A72@OSCPR01MB14966.jpnprd01.prod.outlook.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now max_files_per_process=N limited each backend to open N files in
total (minus a safety factor), even if there were already more files opened in
postmaster and inherited by backends. Change max_files_per_process to control
how many additional files each process is allowed to open.
The main motivation for this is the patch to add io_method=io_uring, which
needs to open one file for each backend. Without this patch, even if
RLIMIT_NOFILE is high enough, postmaster will fail in set_max_safe_fds() if
started with a high max_connections. The cause of the failure is that, until
now, set_max_safe_fds() subtracted the already open files from
max_files_per_process.
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/w6uiicyou7hzq47mbyejubtcyb2rngkkf45fk4q7inue5kfbeo@bbfad3qyubvs
Discussion: https://postgr.es/m/CAGECzQQh6VSy3KG4pN1d=h9J=D1rStFCMR+t7yh_Kwj-g87aLQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This field was added in commit 0164a0f9ee to provide a way to
determine whether a storage parameter was explicitly set for the
relation or if it just picked up the default value. In most cases,
this can be accomplished by giving the storage parameter a special
out-of-range default value (e.g., the
autovacuum_vacuum_insert_threshold storage parameter defaults to
-2), but this approach doesn't work in all cases. For example, a
Boolean storage parameter cannot be given an out-of-range default,
so we need another way to discover the source of its value.
Reported-by: "David G. Johnston" <david.g.johnston@gmail.com>
Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CAKFQuwYKtEUYKS%2B18gRs-xPhn0qOJgM2KGyyWVCODHuVn9F-XQ%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bitmap heap scan skip fetch optimization skips fetching the heap
block when a page is set all-visible in the visibility map and no
columns from the table are needed to satisfy the query.
2b73a8cd33b and c3953226a07 changed the control flow of bitmap heap scan
to use the read stream API. The read stream API returns buffers
containing blocks to the user. To make this work with the skip fetch
optimization, we keep a count of the empty tuples we need to emit for
all the blocks skipped and only emit the empty tuples after processing
the next block fetched from the heap or at the end of the scan.
It's incorrect to recheck NULL tuples, so we must set `recheck` to false
before yielding control back to BitmapHeapNext(). This was done before
emitting any remaining empty tuples at the end of the scan but not for
empty tuples emitted during the scan. This meant that if a page fetched
from the heap did require recheck and set `recheck` to true and then we
emitted empty tuples for subsequent blocks, we would get wrong results.
Fix this by always setting `recheck` to false before emitting empty
tuples.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Tested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/496f7acd-881c-4df3-9bd3-8f8534dfec26%40gmail.com
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When pg_recvlogical was introduced in 9.4, the --dbname option was not
required for --drop-slot. Without it, pg_recvlogical --drop-slot connected
using a replication connection (not tied to a specific database) and
was able to drop both physical and logical replication slots, similar to
pg_receivewal --drop-slot.
However, commit 0c013e08cfb unintentionally changed this behavior in 9.5,
making pg_recvlogical always check whether it's connected to a specific
database and fail if it's not. This change was expected for --create-slot
and --start, which handle logical replication slots and require a database
connection, but it was unnecessary for --drop-slot, which should work with
any replication connection. As a result, --dbname became a required option
for --drop-slot.
This commit removes that restriction, restoring the original behavior and
allowing pg_recvlogical --drop-slot to work without specifying --dbname.
Although this issue originated from an unintended change, it has existed
for a long time without complaints or bug reports, and the documentation
never explicitly stated that --drop-slot should work without --dbname.
Therefore, the change is not treated as a bug fix and is applied only to
master.
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/b15ecf4f-e5af-4fbb-82c2-a425f453e0b2@oss.nttdata.com
|
|
|
|
|
|
| |
Author:Jelte Fennema-Nio <github-tech@jeltef.nl>
Suggested-By: Michael Banck <mbanck@gmx.net>
Discussion: https://www.postgresql.org/message-id/67813520.170a0220.183245.7bf0%40mx.google.com
|
|
|
|
|
|
|
|
|
| |
Reviewed-By: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-By: Michael Banck <mbanck@gmx.net>
Reviewed-By: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-By: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-By: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/CABUevEyTMyXC6OvCWkj+rPnHrfi8_Rw_+DD_jzgFFNPqgf+Oig@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
We haven't had bugs in this area, but there's some not-entirely
trivial code to detect that case, so it seems good to have test
coverage for it.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://www.postgresql.org/message-id/CAHut%2BPtX8P0EGhsk9p%3DhQGUHrzxeCSzANXSMKOvYiLX-EjdyNw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce a new conflict type, multiple_unique_conflicts, to handle cases
where an incoming row during logical replication violates multiple UNIQUE
constraints.
Previously, the apply worker detected and reported only the first
encountered key conflict (insert_exists/update_exists), causing repeated
failures as each constraint violation needs to be handled one by one
making the process slow and error-prone.
With this patch, the apply worker checks all unique constraints upfront
once the first key conflict is detected and reports
multiple_unique_conflicts if multiple violations exist. This allows users
to resolve all conflicts at once by deleting all conflicting tuples rather
than dealing with them individually or skipping the transaction.
In the future, this will also allow us to specify different resolution
handlers for such a conflict type.
Add the stats for this conflict type in pg_stat_subscription_stats.
Author: Nisha Moond <nisha.moond412@gmail.com>
Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Discussion: https://postgr.es/m/CABdArM7FW-_dnthGkg2s0fy1HhUB8C3ELA0gZX1kkbs1ZZoV3Q@mail.gmail.com
|
|
|
|
|
|
|
|
|
| |
Previously there was no coverage for this function.
Author: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Rustam ALLAKOV <rustamallakov@gmail.com>
Discussion: https://postgr.es/m/CAJ7c6TMT6XCooMVKnCd_tR2oBdGcnjefSeCDCv8jzKy9VkWA5w@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This field can be optionally set in a PlannedStmt through the planner
hook, giving extensions the possibility to assign an identifier related
to a computed plan. The backend is changed to report it in the backend
entry of a process running (including the extended query protocol), with
semantics and APIs to set or get it similar to what is used for the
existing query ID (introduced in the backend via 4f0b0966c8). The plan
ID is reset at the same timing as the query ID. Currently, this
information is not added to the system view pg_stat_activity; extensions
can access it through PgBackendStatus.
Some patches have been proposed to provide some features in the planning
area, where a plan identifier is used as a key to know the plan involved
(for statistics, plan storage and manipulations, etc.), and the point of
this commit is to provide an anchor in the backend that extensions can
rely on for future work. The reset of the plan identifier is
controlled by core and follows the same pattern as the query identifier
added in 4f0b0966c8.
The contents of this commit are extracted from a larger set proposed
originally by Lukas Fittl, that Sami Imseih has proposed as an
independent change, with a few tweaks sprinkled by me.
Author: Lukas Fittl <lukas@fittl.com>
Author: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAP53Pkyow59ajFMHGpmb1BK9WHDypaWtUsS_5DoYUEfsa_Hktg@mail.gmail.com
Discussion: https://postgr.es/m/CAA5RZ0vyWd4r35uUBUmhngv8XqeiJUkJDDKkLf5LCoWxv-t_pw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Improve psql's tab completion for VACUUM and ANALYZE by supporting
the ONLY option introduced in 62ddf7ee9.
In passing, simplify some of the VACUUM patterns by making use
of MatchAnyN.
Author: Umar Hayat <postgresql.wizard@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Discussion: https://postgr.es/m/CAD68Dp3L6yW_nWs+MWBs6s8tKLRzXaQdQgVRm4byZe0L-hRD8g@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During hot standby, ExpireAllKnownAssignedTransactionIds() and
ExpireOldKnownAssignedTransactionIds() functions mark old transactions
as no-longer running, but they failed to update xactCompletionCount
and latestCompletedXid. AFAICS it would not lead to incorrect query
results, because those functions effectively turn in-progress
transactions into aborted transactions and an MVCC snapshot considers
both as "not visible". But it could surprise GetSnapshotDataReuse()
and trigger the "TransactionIdPrecedesOrEquals(TransactionXmin,
RecentXmin))" assertion in it, if the apparent xmin in a backend would
move backwards. We saw this happen when GetCatalogSnapshot() would
reuse an older catalog snapshot, when GetTransactionSnapshot() had
already advanced TransactionXmin.
The bug goes back all the way to commit 623a9ba79b in v14 that
introduced the snapshot reuse mechanism, but it started to happen more
frequently with commit 952365cded6 which removed a
GetTransactionSnapshot() call from backend startup. That made it more
likely for ExpireOldKnownAssignedTransactionIds() to be called between
GetCatalogSnapshot() and the first GetTransactionSnapshot() in a
backend.
Andres Freund first spotted this assertion failure on buildfarm member
'skink'. Reproduction and analysis by Tomas Vondra.
Backpatch-through: 14
Discussion: https://www.postgresql.org/message-id/oey246mcw43cy4qw2hqjmurbd62lfdpcuxyqiu7botx3typpax%40h7o7mfg5zmdj
|
|
|
|
| |
Commit 28f04984f0c240b76e61f00cd247554fbc850056 missed this.
|
|
|
|
|
|
|
|
| |
The previous prefix wasn't consistent with the naming of other AIO related
enum values. It seems best to rename it before the users are introduced.
Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_Yb+JzQpNsgUxCB0gBi+sE-mi_HmcJF6ALnmO4W+UgwpA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The catchall exception condition OTHERS was represented as
sqlerrstate == 0, which was a poor choice because that comes
out the same as SQLSTATE '00000'. While we don't issue that
as an error code ourselves, there isn't anything particularly
stopping users from doing so. Use -1 instead, which can't
match any allowed SQLSTATE string.
While at it, invent a macro PLPGSQL_OTHERS to use instead of
a hard-coded magic number.
While this seems like a bug fix, I'm inclined not to back-patch.
It seems barely possible that someone has written code like this
and would be annoyed by changing the behavior in a minor release.
Reported-by: David Fiedler <david.fido.fiedler@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAHjN70-=H5EpTOuZVbC8mPvRS5EfZ4MY2=OUdVDWoyGvKhb+Rw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new scheduling heuristic: don't end the ongoing primitive index
scan immediately (at the point where _bt_advance_array_keys notices that
the next set of matching tuples must be on a later page) if the primscan
already managed to step right/left from its first leaf page. Schedule a
recheck against the next sibling leaf page's finaltup instead.
The new heuristic tends to avoid scenarios where the top-level scan
repeatedly starts and ends primitive index scans that each read only one
leaf page from a group of neighboring leaf pages. Affected top-level
scans will now tend to step forward (or backward) through the index
instead, without wasting cycles on descending the index anew.
The recheck mechanism isn't exactly new. But up until now it has only
been used to deal with edge cases involving high key finaltups with one
or more truncated -inf attributes that _bt_advance_array_keys deemed
"provisionally satisfied" (satisfied for the purposes of allowing the
scan to step onto the next page, subject to recheck once on that page).
The mechanism was added by commit 5bf748b8, which invented the general
concept of primitive scan scheduling. It was later enhanced by commit
79fa7b3b, which taught it about cases involving -inf attributes that
satisfy inequality scan keys required in the opposite-to-scan direction
only (arguably, they should have been covered by the earliest version).
Now the recheck mechanism can be applied based on scan-level heuristics,
which have nothing to do with truncated high keys. Now rechecks might
be performed by _bt_readpage when scanning in _either_ scan direction.
The theory behind the new heuristic is that any primitive scan that
makes it past its first leaf page is one that is already likely to have
arrays whose key values match index tuples that are closely clustered
together in the index. The rules that determine whether we ever get
past the first page are still conservative (that'll still only happen
when pstate.finaltup strongly suggests that it's the right thing to do).
Surviving past the first leaf page is a strong signal in itself.
Preparation for an upcoming patch that will add skip scan optimizations
to nbtree. That'll work by adding skip arrays, which behave similarly
to SAOP arrays, but generate their elements procedurally and on-demand.
Note that this commit isn't specifically concerned with skip arrays; the
scheduling logic doesn't (and won't) condition anything on whether the
scan uses skip arrays, SAOP arrays, or some combination of the two
(which seems like a good general principle for _bt_advance_array_keys).
While the problems that this commit ameliorates are more likely with
skip arrays (at least in practice), SAOP arrays (or those with very
dense, contiguous array elements) are also affected.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wzkz0wPe6+02kr+hC+JJNKfGtjGTzpG3CFVTQmKwWNrXNw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like 69273b818b1df did for GiST vacuuming, make SP-GiST vacuum use the
read stream API for vacuuming physically contiguous index pages.
Concurrent insertions may cause SP-GiST index tuples to be redirected.
While vacuuming, these are added to a pending list which is later
processed to ensure no dead tuples are left behind. Pages containing
such tuples are still read by directly calling ReadBuffer() and do not
use the read stream API.
Author: Andrey M. Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/37432403-8657-403B-9CDF-5A642BECDD81%40yandex-team.ru
|
|
|
|
|
|
|
|
|
| |
This code must have missed a memo about the backend type description
being supplied automatically these days, and was duplicating that
information.
Before: "io worker io worker: N"
After: "io worker N"
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 682ce911f modified exec_save_simple_expr to accept a Param
in the tlist of a Gather node, rather than the normal case of a Var
referencing the Gather's input. It turns out that this was a kluge
to work around the bug later fixed in 0f7ec8d9c, namely that setrefs.c
was failing to replace Params in upper plan nodes with Var references
to the same Params appearing in the child tlists. With that fixed,
there seems no reason to continue to allow a Param here. (Moreover,
even if we did expect a Param here, the semantically correct thing
to do would be to take the Param as the expression being sought.
Whatever it may represent, it is *not* a reference to the child.)
Hence, revert that part of 682ce911f.
That all happened a long time ago. However, since the net effect
here is just to tighten an Assert condition, I'm content to change
it only in master.
Discussion: https://postgr.es/m/1565347.1742572349@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit introduces a new GUC option max_active_replication_origins
to control the maximum number of active replication
origins. Previously, this was controlled by
'max_replication_slots'. Having a separate GUC option provides better
flexibility for setting up subscribers, as they may not require
replication slots (for cascading replication) but always require
replication origins.
Author: Euler Taveira <euler@eulerto.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/b81db436-8262-4575-b7c4-bc0c1551000b@app.fastmail.com
|
|
|
|
|
|
|
|
|
| |
errdetail_relkind_not_supported() was declared within
EXPOSE_TO_CLIENT_CODE, which is mistaken since that function
isn't available client-side. While relatively harmless,
this isn't good precedent.
Discussion: https://postgr.es/m/1134562.1742507765@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
| |
Make genbki.pl emit some boilerplate comments identifying the
sections of the pg_*_d.h files that it generates. This is in
hopes of making them slightly more readable, in case people
look at those files and not the pg_*.h/pg_*.dat originals.
Discussion: https://postgr.es/m/1134562.1742507765@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like c5c239e26e387 did for btree vacuuming, make GiST vacuum use the
read stream API for sequentially processed pages.
Because it is possible for concurrent insertions to relocate unprocessed
index entries to already vacuumed pages, GiST vacuum must backtrack and
reprocess those pages. These pages are still read with explicit
ReadBuffer() calls.
Author: Andrey M. Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/EFEBED92-18D1-4C0F-A4EB-CD47072EF071%40yandex-team.ru
|
|
|
|
|
|
|
| |
c5c239e26e made btree vacuum use the read stream API. Though it used
functions declared in read_stream.h, it relied on transitively including
it. Explicitly include that file. Also remove an extraneous newline and
decrease the scope of one of the local variables in btvacuumscan().
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
exec_save_simple_expr did not account for the possibility that
standard_planner would stick a Materialize node atop the plan
of even a simple Result, if CURSOR_OPT_SCROLL is set. This led
to an "unexpected plan node type" error.
This is a very old bug, but it'd only be reached by declaring a
cursor for a "SELECT simple-expression" query and explicitly
marking it scrollable, which is an odd thing to do. So the lack
of prior reports isn't too surprising.
Bug: #18859
Reported-by: Olleg Samoylov <splarv@ya.ru>
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18859-0d5f28ac99a37059@postgresql.org
Backpatch-through: 13
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Btree vacuum processes all index pages in physical order. Now it uses
the read stream API to get the next buffer instead of explicitly
invoking ReadBuffer().
It is possible for concurrent insertions to cause page splits during
index vacuuming. This can lead to index entries that have yet to be
vacuumed being moved to pages that have already been vacuumed. Btree
vacuum code handles this by backtracking to reprocess those pages. So,
while sequentially encountered pages are now read through the
read stream API, backtracked pages are still read with explicit
ReadBuffer() calls.
Author: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_bW1UOyup%3DjdFw%2BkOF9bCaAm%3D9UpiyZtbPMn8n_vnP%2Big%40mail.gmail.com#3b3a84132fc683b3ee5b40bc4c2ea2a5
|
|
|
|
|
|
|
|
|
|
| |
All TupleDescAttr() calls in tablecmds.c that aren't in loops across all
attributes use AttrNumber-style indexes (1-based); there was only one
place in ATRewriteTable that was stashing 0-based indexes in a list for
later processing. Switch that to use attnums for consistency.
Author: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEoYA5ScUr2=CmA1xcpaS_1ixneDbEkVU77X1ctGxY2mA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
StartReadBuffers() reports a short read when it finds a cached block
that ends a range needing I/O by updating the caller's *nblocks. It
doesn't want to have to unpin the trailing hit that it knows the caller
wants, so the v17 version used sleight of hand in the name of
simplicity: it included it in *nblocks as if it were part of the I/O,
but internally tracked the shorter real I/O size in io_buffers_len (now
removed).
This API change "forwards" the delimiting buffer to the next call. It's
still pinned, and still stored in the caller's array, but *nblocks no
longer includes stray buffers that are not really part of the operation.
The expectation is that the caller still wants the rest of the blocks
and will call again starting from that point, and now it can pass the
already pinned buffer back in (or choose not to and release it).
The change is needed for the coming asynchronous I/O version's larger
version of the problem: by definition it must move BM_IO_IN_PROGRESS
negotiation from WaitReadBuffers() to StartReadBuffers(), but it might
already have many buffers pinned before it discovers a need to split an
I/O. (The current synchronous I/O version hides that detail from
callers by looping over smaller reads if required to make all covered
buffers valid in WaitReadBuffers(), so it looks like one operation but
it might occasionally be several under the covers.)
Aside from avoiding unnecessary pin traffic, this will also be important
for later work on out-of-order streams: you can't prioritize data that
is already available right now if that fact is hidden from you.
The new API is natural for read_stream.c (see ed0b87ca). After a short
read it leaves forwarded buffers where they fell in its circular queue
for the continuing call to pick up.
Single-block StartReadBuffer() and traditional ReadBuffer() share code
but are not affected by the change. They don't do multi-block I/O.
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions)
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In preparation for a follow-up change to the buffer manager, teach
read_stream.c to manage buffers "forwarded" from one StartReadBuffers()
call to the next after a short read. This involves a small amount of
extra book-keeping, and opens the way for lower levels to split I/O
operations without having to drop pins, as required for efficient
handling of various edge cases.
Concretely, the "buffers" argument will change from an out parameter to
an in/out parameter. Buffer queue elements must be initialized on first
use and cleared after they're consumed, but forwarded buffers are left
where they fall ahead of the current pending read in the queue, ready
for use by the operation that continues where a short read left off.
The stream also needs to count them for pin limit management and release
them on reset/early end.
Tested-by: Andres Freund <andres@anarazel.de> (earlier versions)
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This removes a needless special case for Memoize's FORMAT TEXT EXPLAIN
output.
ExplainPropertyText() outputs the same thing in text mode as the
special-case code was doing, so removing the special-case code results in
the same EXPLAIN output, just with less code.
It seems like a good idea to fix this to help prevent future changes in
this area from copying the same pattern.
Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Reported-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/88a71bcd-0b5c-4d0b-8107-757e96f402d5@tantorlabs.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we would have the following inaccuracies when a backend tried to
read in a buffer, but that buffer was read in concurrently by another backend:
- the read IO was double-counted in the global buffer access stats (pgBufferUsage)
- the buffer hit was not accounted for in:
- global buffer access statistics
- pg_stat_io
- relation level IO stats
- vacuum cost balancing
While trying to read in a buffer that is concurrently read in by another
backend is not a common occurrence, it's also not that rare, e.g. due to
concurrent sequential scans on the same relation. This scenario has become
more likely in PG 17, due to the introducing of read streams, which can pin
multiple buffers before calling StartBufferIO() for all the buffers.
This behaviour has historically grown, but there doesn't seem to be any reason
to continue with the wrong accounting.
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_Zk-B08AzPsO-6680LUHLOCGaNJYofaxTFseLa=OepV1g@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to hold interrupts across most of the smgr.c/md.c functions, as
otherwise interrupt processing, e.g. due to a < ERROR elog/ereport, can
trigger procsignal processing, which in turn can trigger smgrreleaseall(). As
the relevant code is not reentrant, we quickly end up in a bad situation.
The only reason we haven't noticed this before is that there is only one
non-error ereport called in affected routines, in register_dirty_segments(),
and that one is extremely rarely reached. If one enables fd.c's FDDEBUG it's
easy to reproduce crashes.
It seems better to put the HOLD_INTERRUPTS()/RESUME_INTERRUPTS() in smgr.c,
instead of trying to push them down to md.c where possible: For one, every
smgr implementation would be vulnerable, for another, a good bit of smgr.c
code itself is affected too.
Eventually we might want a more targeted solution, allowing e.g. a networked
smgr implementation to be interrupted, but many other, more complicated,
problems would need to be fixed for that to be viable (e.g. smgr.c is often
called with interrupts already held).
One could argue this should be backpatched, but the existing < ERROR
elog/ereports that can be reached with unmodified sources are unlikely to be
reached. On balance the risk of backpatching seems higher than the gain - at
least for now.
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/3vae7l5ozvqtxmd7rr7zaeq3qkuipz365u3rtim5t5wdkr6f4g@vkgf2fogjirl
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit c65bc2e1d14a2d4daed7c1921ac518f2c5ac3d17 made it possible for
loadable modules to add EXPLAIN options. Normally, any necessary
validation can be performed by the hook function passed to
RegisterExtensionExplainOption, but if a loadable module wants to sanity
check options against each other, that needs to be done after the entire
options list has been processed. So, add an additional hook for that
purpose.
Author: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CAA5RZ0vOcJF91O2e5AQN+V6guMNLMhJx83dxALf-iUZ-hLGO_Q@mail.gmail.com
|