| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, if an infinite value was passed to date_trunc(), then the
same infinite value would always be returned regardless of the field
unit given by the caller. This commit updates the function so that an
error is returned when an invalid unit is passed to date_trunc() with an
infinite value.
This matches the behavior of date_trunc() with a finite value and
date_part() with an infinite value, making the handling of interval,
timestamp and timestamptz more consistent across the board for these two
functions.
Some tests are added to cover all these new failure cases, with an
unsupported unit and infinite values for the three data types. There
were no test cases in core that checked all these patterns up to now.
Author: Joseph Koshakow
Discussion: https://postgr.es/m/CAAvxfHc4084dGzEJR0_pBZkDuqbPGc5wn7gK_M0XR_kRiCdUJQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
| |
The totalrows parameter in compute_expr_stats is unused, so remove it.
This is a static function, so the parameter can easily be added again if
it's ever needed.
Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.ru>
Discussion: https://postgr.es/m/667b92d2-f953-4fcb-9377-3765f5b94187@tantorlabs.com
|
|
|
|
|
|
|
| |
Rename "core_yy_extra_type core_yy" to "core_yy_extra". The previous
name was a bit unclear and confusing. The new name matches the name
used elsewhere for the same purpose, for example in
src/backend/parser/gramparse.h.
|
|
|
|
|
| |
Author: Japin Li
Discussion: https://postgr.es/m/ME0P300MB0445D51BCFA8680F0B35FD6EB60C2@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes "unresolved external symbol" errors with extensions that
use functions from libpgport that need special CFLAGS to
compile. Currently, that includes the CRC-32 functions.
Commit 2571c1d5cc did this for libcommon, but I missed that libpqport
has the same issue.
Reported-by: Tom Lane
Backpatch-through: 16, where Meson was introduced
Discussion: https://www.postgresql.org/message-id/CAOdR5yF0krWrxycA04rgUKCgKugRvGWzzGLAhDZ9bzNv8g0Lag@mail.gmail.com
|
|
|
|
|
|
|
| |
Trying to clean up the code a bit while we're working on these files
for the reentrant scanner/pure parser patches. This cleanup only
touches the code sections after the second '%%' in each file, via a
manually-supervised and locally hacked up pgindent.
|
|
|
|
|
|
|
|
|
|
| |
This fixes "unresolved external symbol" errors with extensions that
use functions from libcommon. This was reported with pgvector.
Reported-by: Andrew Kane
Author: Vladlen Popolitov
Backpatch-through: 16, where Meson was introduced
Discussion: https://www.postgresql.org/message-id/CAOdR5yF0krWrxycA04rgUKCgKugRvGWzzGLAhDZ9bzNv8g0Lag@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
| |
Use the flex %option reentrant to make the generated scanner
reentrant, and perhaps eventually thread-safe, but that will require
additional work.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the flex %option reentrant to make the generated scanner
reentrant and thread-safe. Note: The parser was already pure.
Simplify flex scan buffer management: Instead of constructing the
buffer from pieces and then using yy_scan_buffer(), we can just use
yy_scan_string(), which does the same thing internally. (Actually, we
use yy_scan_bytes() here because we already have the length.)
Use flex yyextra to handle context information, instead of global
variables. This complements the other changes to make the scanner
reentrant.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
|
|
|
|
| |
Oversight in commit 5bf748b86b.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the flex %option reentrant and the bison option %pure-parser to
make the generated scanner and parser pure, reentrant, and
thread-safe.
Make the generated scanner use palloc() etc. instead of malloc() etc.
Previously, we only used palloc() for the buffer, but flex would still
use malloc() for its internal structures. Now, all the memory is
under palloc() control.
Simplify flex scan buffer management: Instead of constructing the
buffer from pieces and then using yy_scan_buffer(), we can just use
yy_scan_string(), which does the same thing internally.
The previous code was necessary because we allocated the buffer with
palloc() and the rest of the state was handled by malloc(). But this
is no longer the case; everything is under palloc() now.
Use flex yyextra to handle context information, instead of global
variables. This complements the other changes to make the scanner
reentrant.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the flex %option reentrant and the bison option %pure-parser to
make the generated scanner and parser pure, reentrant, and
thread-safe.
Make the generated scanner use palloc() etc. instead of malloc() etc.
Previously, we only used palloc() for the buffer, but flex would still
use malloc() for its internal structures. As a result, there could be
some small memory leaks in case of uncaught errors. Now, all the
memory is under palloc() control, so there are no more such issues.
Simplify flex scan buffer management: Instead of constructing the
buffer from pieces and then using yy_scan_buffer(), we can just use
yy_scan_string(), which does the same thing internally.
The previous code was necessary because we allocated the buffer with
palloc() and the rest of the state was handled by malloc(). But this
is no longer the case; everything is under palloc() now.
Use flex yyextra to handle context information, instead of global
variables. This complements the other changes to make the scanner
reentrant.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Co-authored-by: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
|
|
|
|
|
|
|
|
|
| |
Per git log, the last time someone tried to do something with
pgrminclude was around 2011. And it's always had a tendency of
causing trouble when it was active. Also, pgcominclude is redundant
with headerscheck.
Discussion: https://www.postgresql.org/message-id/flat/2d4dc7b2-cb2e-49b1-b8ca-ba5f7024f05b%40eisentraut.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Per git log, the last time someone tried to do something with
pgrminclude was around 2011. Many (not all) of the "pgrminclude
ignore" annotations are of a newer date but seem to have just been
copied around during refactorings and file moves and don't seem to
reflect an actual need anymore.
There have been some parallel experiments with include-what-you-use
(IWYU) annotations, but these don't seem to correspond very strongly
to pgrminclude annotations, so there is no value in keeping the
existing ones even for that kind of thing.
So, wipe them all away. We can always add new ones in the future
based on actual needs.
Discussion: https://www.postgresql.org/message-id/flat/2d4dc7b2-cb2e-49b1-b8ca-ba5f7024f05b%40eisentraut.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
5983a4cff added CompactAttribute as an abbreviated alternative to
FormData_pg_attribute to allow more cache-friendly processing in tasks
related to TupleDescs. That commit contained some assert-only code to
check that the CompactAttribute had been populated correctly, however,
the method used to do that checking caused the TupleDesc's
CompactAttribute to be zeroed before it was repopulated and compared to
the snapshot taken before the memset call. This caused issues as the type
cache caches TupleDescs in shared memory which can be used by multiple
backend processes at the same time. There was a window of time between
the zero and repopulation of the CompactAttribute where another process
would mistakenly think that the CompactAttribute is invalid due to the
memset.
To fix this, instead of taking a snapshot of the CompactAttribute and
calling populate_compact_attribute() and comparing the snapshot to the
freshly populated TupleDesc's CompactAttribute, refactor things so we
can just populate a temporary CompactAttribute on the stack. This way
we don't touch the TupleDesc's memory.
Reported-by: Alexander Lakhin, SQLsmith
Discussion: https://postgr.es/m/ca3a256a-5d12-42db-aabe-a75a030d9fb9@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These two platforms have a remarkably tight default limit on the
number of SysV semaphores in the system: SEMMNS is only 60
out-of-the-box. Unless manual action is taken to raise that,
we'll only be able to allocate 3 sets of 16 usable semaphores
each, leading to initdb setting max_connections to just 20.
That's problematic because the core regression tests expect
to be able to launch 20 concurrent sessions, leaving us with
no headroom. This seems to be the cause of intermittent
buildfarm failures on some machines.
While there's no getting around the fact that you'd better raise
SEMMNS for production use on these platforms, it does seem desirable
for "make check" to pass reliably without that. We can make that
happen, at least for awhile longer, with two small changes:
* Change sysv_sema.c's SEMAS_PER_SET to 19, so that we can eat up
all of the available semas not just most of them.
* Change initdb to make the smallest max_connections value it will
consider be 25 not 20.
As of HEAD this will leave us with four free semaphores (using the
default values for other relevant parameters such as max_wal_senders).
So we won't need to consider this again until we've invented five
more background processes. Maybe by then we can switch both these
platforms to some other semaphore API.
For the moment, do this only in master; there've not been field
complaints that might justify a back-patch.
Discussion: https://postgr.es/m/db2773a2-aca0-43d0-99c1-060efcd9954e@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reset btpo_cycleid to 0 in btree_xlog_vacuum for consistency with
_bt_delitems_vacuum (the corresponding original execution code). This
makes things neater.
There might be some performance benefit to being consistent like this.
When btvacuumpage doesn't call _bt_delitems_vacuum, it can still
proactively reset btpo_cycleid to 0 via a separate hint-like update
mechanism (it does so whenever it sees that it isn't already set to 0).
And so it's possible that being consistent about resetting btpo_cycleid
like this will save work later on, after standby promotion: subsequent
VACUUMs won't need to clear btpo_cycleid using the hint-like update
mechanism as often as they otherwise would.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAH2-Wz=+LDFxn9NZyEsCo8ifcyKt6+n-VLyygySEHgMz+oynqw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
| |
A historic snapshot should only be used for catalog access, not
general queries. We never call GetTransactionSnapshot() during logical
decoding, which is good because it wouldn't be very sensible, so the
code to deal with that was unreachable and untested. Turn it into an
error, to avoid doing that in the future either.
Discussion: https://www.postgresql.org/message-id/a868fe78-ddb4-4b0a-9b96-873d91d93cfd@iki.fi
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In get_database_list() and get_subscription_list(), the
GetTransactionSnapshot() call is not required because the catalog
table scans use the catalog snapshot, which is held until the end of
the scan. See table_beginscan_catalog(), which calls
RegisterSnapshot(GetCatalogSnapshot(relid)).
In InitPostgres, it's a little less obvious that it's not required,
but still true I believe. All the catalog lookups in InitPostgres()
also use the catalog snapshot, and the looked up values are copied
while still holding the snapshot.
Furthermore, as the removed FIXME comments said, calling
GetTransactionSnapshot() didn't really prevent MyProc->xmin from being
reset anyway.
Discussion: https://www.postgresql.org/message-id/7c56f180-b9e1-481e-8c1d-efa63de3ecbb@iki.fi
|
|
|
|
|
|
|
|
|
|
|
|
| |
Jian He reported the src/include/utility/tcop.h one and the remainder
were found by using a script to look for src/* and check that we have a
filename or directory of that name.
In passing, fix some out-date comments.
Reported-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CACJufxGoE3H-7VgO02=PrR4SNuVWDVbfTyUnwO0HvS-Lxurnog@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Library unloading has never been supported with its code removed in
ab02d702ef08, and there were some comments still mentioning that it was
a possible operation.
ChangAo has noticed the incorrect references in dfmgr.c, while I have
noticed the other ones while scanning the rest of the tree for similar
mistakes.
Author: ChangAo Chen, Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/tencent_1D09840A1632D406A610C8C4E2491D74DB0A@qq.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GetSnapshotData() set TransactionXmin = MyProc->xmin, but when
SnapshotResetXmin() advanced MyProc->xmin, it did not advance
TransactionXmin correspondingly. That meant that TransactionXmin could
be older than MyProc->xmin, and XIDs between than TransactionXmin and
the real MyProc->xmin could be vacuumed away. One known consequence is
in pg_subtrans lookups: we might try to look up the status of an XID
that was already truncated away.
Back-patch to all supported versions.
Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/d27a046d-a1e4-47d1-a95c-fbabe41debb4@iki.fi
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here we convert CompactAttribute.attalign from a char, which is directly
derived from pg_attribute.attalign into a uint8, which stores the number
of bytes to align the column's value by in the tuple.
This allows tuple deformation and tuple size calculations to move away
from using the inefficient att_align_nominal() macro, which manually
checks each TYPALIGN_* char to translate that into the alignment bytes
for the given type. Effectively, this commit changes those to TYPEALIGN
calls, which are branchless and only perform some simple arithmetic with
some bit-twiddling.
The removed branches were often mispredicted by CPUs, especially so in
real-world tables which often contain a mishmash of different types
with different alignment requirements.
Author: David Rowley
Reviewed-by: Andres Freund, Victor Yegorov
Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
|
|
|
|
|
| |
Like CurrentSnapshotData, it should not be accessed directly outside
snapmgr.c.
|
|
|
|
|
|
|
|
|
| |
This used to say "nsubxcnt isn't decreased when subtransactions
abort", but there's no variable called nsubxcnt. Commit 8548ddc61b
changed it to "subxcnt", among other typo fixes, but that was wrong
too: the comment actually talks about txn->nsubtxns. That's the field
that's incremented but never decremented and is used for the
allocation earlier in the function.
|
|
|
|
|
|
|
|
|
|
| |
28328ec87b45725 addressed one overflow danger in
SampleHeapTupleVisible() but introduced another, albeit a less likely
one. Modify the binary search code to remove this danger.
Reported-by: Richard Guo
Reviewed-by: Richard Guo, Ranier Vilela
Discussion: https://postgr.es/m/CAMbWs4_bE%2BNscChbKWzw6HZOipCUyXfA5133qvoXQ654D3B2gQ%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RelationTruncate() does three things, while holding an
AccessExclusiveLock and preventing checkpoints:
1. Logs the truncation.
2. Drops buffers, even if they're dirty.
3. Truncates some number of files.
Step 2 could previously be canceled if it had to wait for I/O, and step
3 could and still can fail in file APIs. All orderings of these
operations have data corruption hazards if interrupted, so we can't give
up until the whole operation is done. When dirty pages were discarded
but the corresponding blocks were left on disk due to ERROR, old page
versions could come back from disk, reviving deleted data (see
pgsql-bugs #18146 and several like it). When primary and standby were
allowed to disagree on relation size, standbys could panic (see
pgsql-bugs #18426) or revive data unknown to visibility management on
the primary (theorized).
Changes:
* WAL is now unconditionally flushed first
* smgrtruncate() is now called in a critical section, preventing
interrupts and causing PANIC on file API failure
* smgrtruncate() has a new parameter for existing fork sizes,
because it can't call smgrnblocks() itself inside a critical section
The changes apply to RelationTruncate(), smgr_redo() and
pg_truncate_visibility_map(). That last is also brought up to date with
other evolutions of the truncation protocol.
The VACUUM FileTruncate() failure mode had been discussed in older
reports than the ones referenced below, with independent analysis from
many people, but earlier theories on how to fix it were too complicated
to back-patch. The more recently invented cancellation bug was
diagnosed by Alexander Lakhin. Other corruption scenarios were spotted
by me while iterating on this patch and earlier commit 75818b3a.
Back-patch to all supported releases.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reported-by: rootcause000@gmail.com
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/18146-04e908c662113ad5%40postgresql.org
Discussion: https://postgr.es/m/18426-2d18da6586f152d6%40postgresql.org
|
|
|
|
|
|
|
|
|
| |
The column is no longer needed as the offset is now cached in the
CompactAttribute struct per commit 5983a4cff.
Author: David Rowley
Reviewed-by: Andres Freund, Victor Yegorov
Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
One test added in 9aea73fc61d4 did not take into account that the
backend may have some fsync even after a checkpoint. Let's relax it to
be more flexible.
Per report from buildfarm member grassquit, via Alexander Lakhin.
Author: Bertrand Drouvot
Discussion: https://postgr.es/m/6143ab0a-9e88-4790-8d9d-50ba45657761@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new compact_attrs array stores a few select fields from
FormData_pg_attribute in a more compact way, using only 16 bytes per
column instead of the 104 bytes that FormData_pg_attribute uses. Using
CompactAttribute allows performance-critical operations such as tuple
deformation to be performed without looking at the FormData_pg_attribute
element in TupleDesc which means fewer cacheline accesses.
For some workloads, tuple deformation can be the most CPU intensive part
of processing the query. Some testing with 16 columns on a table
where the first column is variable length showed around a 10% increase in
transactions per second for an OLAP type query performing aggregation on
the 16th column. However, in certain cases, the increases were much
higher, up to ~25% on one AMD Zen4 machine.
This also makes pg_attribute.attcacheoff redundant. A follow-on commit
will remove it, thus shrinking the FormData_pg_attribute struct by 4
bytes.
Author: David Rowley
Reviewed-by: Andres Freund, Victor Yegorov
Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
|
|
|
|
|
|
|
| |
b7493e1ab35 removed leftover mentions of XLOG_HEAP2_FREEZE_PAGE records
from comments but neglected to remove one mention of FREEZE_PAGE.
Reported off-list by Alexander Lakhin
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was reasonable to preserve the old API of BuildTupleHashTable()
in the back branches, but in HEAD we should actively discourage use
of that version. There are no remaining callers in core, so just
get rid of it. Then rename BuildTupleHashTableExt() back to
BuildTupleHashTable().
While at it, fix up the miserably-poorly-maintained header comment
for BuildTupleHashTable[Ext]. It looks like more than one patch in
this area has had the opinion that updating comments is beneath them.
Discussion: https://postgr.es/m/538343.1734646986@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Append, MergeAppend, and RecursiveUnion can all use the support
functions added in commit 276279295. The first two can report a
fixed result slot type if all their children return the same fixed
slot type. That does nothing for the append step itself, but might
allow optimizations in the parent plan node. RecursiveUnion can
optimize tuple hash table operations in the same way as SetOp now
does.
Patch by me; thanks to Richard Guo and David Rowley for review.
Discussion: https://postgr.es/m/1850138.1731549611@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the code for inserting flag columns in the inputs of a SetOp.
That was the only reason why there would be resjunk columns in a
set-operations plan tree, so we can get rid of some code that
supported that, too.
Get rid of choose_hashed_setop() in favor of building Paths for
the hashed and sorted alternatives, and letting them fight it out
within add_path().
Remove set_operation_ordered_results_useful(), which was giving wrong
answers due to examining the wrong ancestor node: we need to examine
the immediate SetOperationStmt parent not the topmost node. Instead
make each caller of recurse_set_operations() pass down the relevant
parent node. (This thinko seems to have led only to wasted planning
cycles and possibly-inferior plans, not wrong query answers. Perhaps
we should back-patch it, but I'm not doing so right now.)
Teach generate_nonunion_paths() to consider pre-sorted inputs for
sorted SetOps, rather than always generating a Sort node.
Patch by me; thanks to Richard Guo and David Rowley for review.
Discussion: https://postgr.es/m/1850138.1731549611@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original design for set operations involved appending the two
input relations into one and adding a flag column that allows
distinguishing which side each row came from. Then the SetOp node
pries them apart again based on the flag. This is bizarre. The
only apparent reason to do it is that when sorting, we'd only need
one Sort node not two. But since sorting is at least O(N log N),
sorting all the data is actually worse than sorting each side
separately --- plus, we have no chance of taking advantage of
presorted input. On top of that, adding the flag column frequently
requires an additional projection step that adds cycles, and then
the Append node isn't free either. Let's get rid of all of that
and make the SetOp node have two separate children, using the
existing outerPlan/innerPlan infrastructure.
This initial patch re-implements nodeSetop.c and does a bare minimum
of work on the planner side to generate correctly-shaped plans.
In particular, I've tried not to change the cost estimates here,
so that the visible changes in the regression test results will only
involve removal of useless projection steps and not any changes in
whether to use sorted vs hashed mode.
For SORTED mode, we combine successive identical tuples from each
input into groups, and then merge-join the groups. The tuple
comparisons now use SortSupport instead of simple equality, but
the group-formation part should involve roughly the same number of
tuple comparisons as before. The cross-comparisons between left and
right groups probably add to that, but I'm not sure to quantify how
many more comparisons we might need.
For HASHED mode, nodeSetop's logic is almost the same as before,
just refactored into two separate loops instead of one loop that
has an assumption that it will see all the left-hand inputs first.
In both modes, I added early-exit logic to not bother reading the
right-hand relation if the left-hand input is empty, since neither
INTERSECT nor EXCEPT modes can produce any output if the left input
is empty. This could have been done before in the hashed mode, but
not in sorted mode. Sorted mode can also stop as soon as it exhausts
the left input; any remaining right-hand tuples cannot have matches.
Also, this patch adds some infrastructure for detecting whether
child plan nodes all output the same type of tuple table slot.
If they do, the hash table logic can use slightly more efficient
code based on assuming that that's the input slot type it will see.
We'll make use of that infrastructure in other plan node types later.
Patch by me; thanks to Richard Guo and David Rowley for review.
Discussion: https://postgr.es/m/1850138.1731549611@sss.pgh.pa.us
|
|
|
|
|
|
|
| |
1a0da347a7ac98db replaced Bitmap Table Scan's separate private and
shared bitmap iterators with a unified iterator. It accidentally set up
the prefetch iterator twice for non-parallel bitmap table scans. Remove
the extra set up call to tbm_begin_iterate().
|
|
|
|
|
|
|
|
|
|
|
| |
1a0da347a7ac98db replaced Bitmap Table Scan's individual private and
shared iterators with a unified iterator. It neglected, however, to
check if the iterator had already been cleaned up before doing so on
rescan. Add this check both on rescan and end scan to be safe.
Reported-by: Richard Guo
Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs48nrhcLY1kcd-u9oD%2B6yiS631F_8Fx8ZGsO-BYDwH%2Bbyw%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consistently reset so->scanBehind at the beginning of nbtree array
advancement, even during sktrig_required=false calls (calls where array
advancement is triggered by an unsatisfied non-required array scan key).
Otherwise, it's possible for queries to fail to return all relevant
tuples to the scan given a low-order required scan key that was
previously deemed "satisfied" by a truncated high key attribute value.
This only happened at the point where a later non-required array scan
key needed to be "advanced" once on the next leaf page (that is, once
the right sibling of the truncated high key page was reached).
The underlying issue was that later code within _bt_advance_array_keys
assumed that the so->scanBehind flag must have been set using the
current page's high key (not the previous page's high key). Any later
successful recheck call to _bt_check_compare would therefore spuriously
be prevented from making _bt_advance_array_keys return true, based on
the faulty belief that the truncated attribute must be from the scan's
current tuple (i.e. the non-pivot tuple at the start of the next page).
_bt_advance_array_keys would return false for the tuple, ultimately
resulting in _bt_checkkeys failing to return a matching tuple.
Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp
execution.
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkJKncfqyAUTeuB5GgRhT1vhsWO2q11dbZNqKmvjopP_g@mail.gmail.com
Backpatch: 17-, where commit 5bf748b8 first appears.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the flex %option reentrant and the bison option %pure-parser to
make the generated scanner and parser pure, reentrant, and
thread-safe.
Make the generated scanner use palloc() etc. instead of malloc() etc.
For the bootstrap scanner and parser, reentrancy and memory management
aren't that important, but we make this change here anyway so that all
the scanners and parsers in the backend use a similar set of options
and APIs.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
|
|
|
|
|
| |
Author: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a new variable-numbered statistics kind in pgstats, where the
object ID key of the stats entries is based on the proc number of the
backends. This acts as an upper-bound for the number of stats entries
that can exist at once. The entries are created when a backend starts
after authentication succeeds, and are removed when the backend exits,
making the stats entry exist for as long as their backend is up and
running. These are not written to the pgstats file at shutdown (note
that write_to_file is disabled, as a safety measure).
Currently, these stats include only information about the I/O generated
by a backend, using the same layer as pg_stat_io, except that it is now
possible to know how much activity is happening in each backend rather
than an overall aggregate of all the activity. A function called
pg_stat_get_backend_io() is added to access this data depending on the
PID of a backend. The existing structure could be expanded in the
future to add more information about other statistics related to
backends, depending on requirements or ideas.
Auxiliary processes are not included in this set of statistics. These
are less interesting to have than normal backends as they have dedicated
entries in pg_stat_io, and stats kinds of their own.
This commit includes also pg_stat_reset_backend_stats(), function able
to reset all the stats associated to a single backend.
Bump catalog version and PGSTAT_FILE_FORMAT_ID.
Author: Bertrand Drouvot
Reviewed-by: Álvaro Herrera, Kyotaro Horiguchi, Michael Paquier, Nazir
Bilal Yavuz
Discussion: https://postgr.es/m/ZtXR+CtkEVVE/LHF@ip-10-97-1-34.eu-west-3.compute.internal
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds pg_stat_io_build_tuples(), a helper routine for
pg_stat_get_io(), that fills its result tuplestore based on the contents
of PgStat_BktypeIO. This will be used in a follow-up commit that uses
the same structures as pg_stat_io for reporting, including the same
object types and contexts, but for a different statistics kind.
Author: Bertrand Drouvot, Michael Paquier
Discussion: https://postgr.es/m/ZtXR+CtkEVVE/LHF@ip-10-97-1-34.eu-west-3.compute.internal
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
8f4ee9626 fixed an old Assert failure that could happen when the slot
type used to look up the hash table for BuildTupleHashTableExt() users
wasn't a TTSOpsMinimalTuple slot. The fix for that in the back branches
had to be to pass the TupleTableSlotOps as NULL, however in master,
since we have the inputOps parameter as was added by d96d1d515, we can
pass that down instead.
At least one caller uses a fixed slot that's always TTSOpsVirtual, so
passing down inputOps for these cases allows ExecBuildGroupingEqual() to
skip adding the EEOP_INNER_FETCHSOME ExprEvalStep.
This should increase the performance of hashed subplans very slightly.
Author: Tom Lane, David Rowley
Discussion: https://postgr.es/m/2543667.1734483723@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the non-recursive part of a recursive CTE ended up using
TTSOpsBufferHeapTuple as the table slot type, then a duplicate value
could cause an Assert failure in CheckOpSlotCompatibility() when
checking the hash table for the duplicate value. The expected slot type
for the deform step was TTSOpsMinimalTuple so the Assert failed when the
TTSOpsBufferHeapTuple slot was used.
This is a long-standing bug which we likely didn't notice because it
seems much more likely that the non-recursive term would have required
projection and used a TTSOpsVirtual slot, which CheckOpSlotCompatibility
is ok with.
There doesn't seem to be any harm done here other than the Assert
failure. Both TTSOpsMinimalTuple and TTSOpsBufferHeapTuple slot types
require tuple deformation, so the EEOP_*_FETCHSOME ExprState step would
have properly existed in the ExprState.
The solution is to pass NULL for the ExecBuildGroupingEqual's 'lops'
parameter. This means the ExprState's EEOP_*_FETCHSOME step won't
expect a fixed slot type. This makes CheckOpSlotCompatibility() happy as
no checking is performed when the ExprEvalStep is not expecting a fixed
slot type.
Reported-by: Richard Guo
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAMbWs4-8U9q2LAtf8+ghV11zeUReA3AmrYkxzBEv0vKnDxwkKA@mail.gmail.com
Backpatch-through: 13, all supported versions
|
|
|
|
|
|
|
|
|
|
|
| |
f83d709760d merged the separate XLOG_HEAP2_FREEZE_PAGE records into a
new combined prune, freeze, and vacuum record with opcode
XLOG_HEAP2_PRUNE_VACUUM_SCAN. Remove the last few references to
XLOG_HEAP2_FREEZE_PAGE records which were accidentally left behind.
Reported-by: Tomas Vondra
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CA%2BTgmoY1tYff-1CEn8kYt5FsOrynTbtr%3DUZw%3D7mTC1Hv1HpeBQ%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the repurposing of TBMIterator as an interface for both parallel
and serial iteration through TIDBitmaps in commit 7f9d4187e7bab10329cc,
bitmap table scans may now use it.
Modify bitmap table scan code to use the TBMIterator. This requires
moving around a bit of code, so a few variables are initialized
elsewhere.
Author: Melanie Plageman
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/c736f6aa-8b35-4e20-9621-62c7c82e2168%40vondra.me
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add and use TBMPrivateIterator, which replaces the current TBMIterator
for serial use cases, and repurpose TBMIterator to be a unified
interface for both the serial ("private") and parallel ("shared") TID
Bitmap iterator interfaces. This encapsulation simplifies call sites for
callers supporting both parallel and serial TID Bitmap access.
TBMIterator is not yet used in this commit.
Author: Melanie Plageman
Reviewed-by: Tomas Vondra, Heikki Linnakangas
Discussion: https://postgr.es/m/063e4eb4-32d9-439e-a0b1-75565a9835a8%40iki.fi
|
|
|
|
|
|
|
|
|
|
|
| |
68d9662be1c4b70 made HeapScanDesc->rs_ntuples unsigned but neglected to
change how it was being used in SampleHeapTupleVisible().
Return early if rs_ntuples is 0 to avoid overflowing and incorrectly
executing the loop code in SampleHeapTupleVisible().
Reported-by: Ranier Vilela
Discussion: https://postgr.es/m/CAEudQAot_xQoZyPZjpj1aBUPrPykY5mOPHGyvfe%3Djz%2BWowdA3A%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
HeapScanDescData.rs_cindex and rs_ntuples can't be less than 0. All scan
types using the heap scan descriptor expect these values to be >= 0.
Make that expectation clear by making rs_cindex and rs_ntuples unsigned.
Also remove the test in heapam_scan_bitmap_next_tuple() that checks if
rs_cindex < 0. This was never true, but now that rs_cindex is unsigned,
it makes even less sense.
While we are at it, initialize both rs_cindex and rs_ntuples to 0 in
initscan().
Author: Melanie Plageman
Reviewed-by: Dilip Kumar
Discussion: https://postgr.es/m/CAAKRu_ZxF8cDCM_BFi_L-t%3DRjdCZYP1usd1Gd45mjHfZxm0nZw%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds support for the following items in psql, able to show a
service name, when available:
- Variable SERVICE.
- Substitution %s in PROMPT{1,2,3}.
This relies on 4b99fed7541e, that has made the service name available in
PGconn for libpq.
Author: Michael Banck
Reviewed-by: Greg Sabino Mullane
Discussion: https://postgr.es/m/6723c612.050a0220.1567f4.b94a@mx.google.com
|