| Commit message | Author | Age |
| |
Windows can enumerate the locales that are either installed or
supported by calling EnumSystemLocalesEx(), similar to what is already
done in the READ_LOCALE_A_OUTPUT switch. We can refactor some of the
logic already used in that switch into a new function
create_collation_from_locale().
The enumerated locales have BCP 47 shape, that is with a hyphen
between language and territory, instead of POSIX's underscore. The
created collations will retain the BCP 47 shape, but we will also
create a POSIX alias, so xx-YY will have an xx_YY alias.
A new test collate.windows.win1252 is added that is like
collate.linux.utf8.
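A rough sketch of the alias derivation (illustrative code, not the exact
shape of create_collation_from_locale()):

    /* derive the POSIX alias from an enumerated BCP 47 locale name */
    char        alias[LOCALE_NAME_MAX_LENGTH];

    strlcpy(alias, "en-US", sizeof(alias));
    for (char *p = alias; *p; p++)
    {
        if (*p == '-')
            *p = '_';           /* "en-US" becomes "en_US" */
    }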
Author: Juan Jose Santamaria Flecha <juanjo.santamaria@gmail.com>
Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/0050ec23-34d9-2765-9015-98c04f0e18ac@postgrespro.ru
| |
While on it, newlines are removed from the end of two elog() strings.
The others are simple grammar mistakes. One comment in pg_upgrade
referred incorrectly to sequences since a7e5457.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20221230231257.GI1153@telsasoft.com
Backpatch-through: 11
| |
When considering an empty grouping set, we fetched
phasedata->eqfunctions[-1]. Because the eqfunctions array is
palloc'd, that would always be an aset pointer in released versions,
and thus the code accidentally failed to malfunction (since it would
do nothing unless it found a null pointer). Nonetheless this seems
like trouble waiting to happen, so add a check for length == 0.
It's depressing that our valgrind testing did not catch this.
Maybe we should reconsider the choice to not mark that word NOACCESS?
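A minimal sketch of the length == 0 check, with illustrative names:

    /* an empty grouping set compares zero columns; all rows match */
    if (numCols == 0)
        return true;
    /* only now is it safe to index eqfunctions[numCols - 1] */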
Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-vZuuPOZsKOYnSAaPYGKhmacxhki+vpOKk0O7rymccXQ@mail.gmail.com
| |
Backpatch-through: 11
| |
The term "truncation" has been ambiguous since commit 10a8d13823 added
line pointer array truncation during heap pruning. Clear things up by
specifying that we're talking about rel truncation here, to match nearby
comments that apply to tuples with storage.
| |
Don't allow VACUUM to WAL-log the value FrozenTransactionId as the
snapshotConflictHorizon of freezing or visibility map related WAL
records.
The only special XID value that's an allowable snapshotConflictHorizon
is InvalidTransactionId, which is interpreted as "record definitely
doesn't require a recovery conflict".
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WznuNGSzF8v6OsgjaC5aYsb3cZ6HW6MLm30X0d65cmSH6A@mail.gmail.com
| |
Author: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_YSOnhKsDyFcqJsKtBSrd32DP-jjXmv7hL0BPD-z0TGXQ@mail.gmail.com
| |
The float and numeric types accept this variant spelling of
"infinity", so it seems like the datetime types should too.
Vik Fearing, some cosmetic mods by me
Discussion: https://postgr.es/m/d0bef637-2dbd-0a5d-e539-48243b6f6c5e@postgresfriends.org
| |
When brin_minmax_multi_union merges summaries, we may end up with just a
single range after merge_overlapping_ranges. The summaries may contain
just one range each, and they may overlap (or be exactly the same).
With a single range there's no distance to calculate, but we happened to
call build_distances anyway. That is fine, as we don't calculate the
distance in this case, except that with asserts enabled this failed due
to a check that there are at least two ranges.
The assert is unnecessarily strict, so relax it a bit and bail out if
there's just a single range. The relaxed assert would be enough, but
this way we also don't allocate unnecessary memory for the distances.
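A sketch of the relaxed check, using illustrative names:

    /* with a single range there are no distances to compute */
    Assert(neranges >= 1);      /* was: Assert(neranges >= 2) */
    if (neranges == 1)
        return NULL;            /* skip allocating the distances array */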
Backpatch to 14, where minmax-multi opclasses were introduced.
Reported-by: Jaime Casanova
Backpatch-through: 14
Discussion: https://postgr.es/m/YzVA55qS0hgz8P3r@ahch-to
| |
f193883 has been incorrectly setting up the precision used in the
timestamp computations returned by the following functions:
- LOCALTIME
- LOCALTIMESTAMP
- CURRENT_TIME
- CURRENT_TIMESTAMP
Specifying an out-of-range precision for CURRENT_TIMESTAMP and
LOCALTIMESTAMP raised a WARNING without adjusting the precision,
leading to a subsequent error. LOCALTIME and CURRENT_TIME raised a
WARNING without an error, but the precision given to the internal
routines was still not correct, so let's be clean.
Ian has reported the problems in timestamp.c, while I have noticed the
ones in date.c. Regression tests are added for all of them with
precisions high enough to provide coverage for the warnings, something
that went missing up to this commit.
Author: Ian Lawrence Barwick, Michael Paquier
Discussion: https://postgr.es/m/CAB8KJ=jQEnn9sYG+N752spt68wMrhmT-ocHCh4oeNmHF82QMWA@mail.gmail.com
| |
There is some code that uses this function to assemble some kind of
packed binary layout, which requires a bunch of casts because of this.
Functions taking binary data plus length should take void * instead,
like memcpy() for example.
Discussion: https://www.postgresql.org/message-id/flat/a0086cfc-ff0f-2827-20fe-52b591d2666c%40enterprisedb.com
| |
For the jsonpath output, we don't need to squeeze out every bit of
performance, so instead use a more robust coding style. There are
similar calls in jsonb.c, which we leave alone here since there is
indeed a performance impact for bulk exports.
Discussion: https://www.postgresql.org/message-id/flat/a0086cfc-ff0f-2827-20fe-52b591d2666c%40enterprisedb.com
| |
Make data buffer argument to BufFileWrite a const pointer and bubble
this up to various callers and related APIs. This makes the APIs
clearer and more consistent.
Discussion: https://www.postgresql.org/message-id/flat/11dda853-bb5b-59ba-a746-e168b1ce4bdb%40enterprisedb.com
| |
Some code carefully cast all data buffer arguments for data write and
read function calls to void *, even though the respective arguments
are already void *. Remove this unnecessary clutter.
Discussion: https://www.postgresql.org/message-id/flat/11dda853-bb5b-59ba-a746-e168b1ce4bdb%40enterprisedb.com
| |
Teach VACUUM to decide whether or not to trigger freezing at the
level of whole heap pages. Individual XID and MXID fields from tuple
headers now trigger freezing of whole pages, rather than independently
triggering freezing of each individual tuple header field.
Managing the cost of freezing over time now significantly influences
when and how VACUUM freezes. The overall amount of WAL written is the
single most important freezing related cost, in general. Freezing each
page's tuples together in batch allows VACUUM to take full advantage of
the freeze plan WAL deduplication optimization added by commit 9e540599.
Also teach VACUUM to trigger page-level freezing whenever it detects
that heap pruning generated an FPI. We'll have already written a large
amount of WAL just to do that much, so it's very likely a good idea to
get freezing out of the way for the page early. This only happens in
cases where it will directly lead to marking the page all-frozen in the
visibility map.
In most cases "freezing a page" removes all XIDs < OldestXmin, and all
MXIDs < OldestMxact. It doesn't quite work that way in certain rare
cases involving MultiXacts, though. It is convenient to define "freeze
the page" in a way that gives FreezeMultiXactId the leeway to put off
the work of processing an individual tuple's xmax whenever it happens to
be a MultiXactId that would require an expensive second pass to process
aggressively (allocating a new multi is especially worth avoiding here).
FreezeMultiXactId is eager when processing is cheap (as it usually is),
and lazy in the event of an individual multi that happens to require
expensive second pass processing. This avoids regressions related to
processing of multis that page-level freezing might otherwise cause.
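A rough sketch of the FPI-based trigger, assuming illustrative names
around the existing pgWalUsage instrumentation:

    int64       fpi_before = pgWalUsage.wal_fpi;

    /* ... heap pruning runs here, possibly emitting a full-page image ... */

    if (pgWalUsage.wal_fpi > fpi_before && all_frozen_after_freezing)
        freeze_required = true;     /* get freezing out of the way now */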
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-WzkFok_6EAHuK39GaW4FjEFQsY=3J0AAd6FXk93u-Xq3Fg@mail.gmail.com
| |
Some compilers complain about sub_rteperminfos not being
initialized, evidently because they don't detect that it
is only used and set if isGeneralSelect is true.
Make it follow the long-established pattern for its
sibling variable sub_rtable.
Per reports from Pavel Stehule and the buildfarm.
Discussion: https://postgr.es/m/CAFj8pRDOvGOi-n616kM0Cc7qSbg_nGoS=-haB+D785sUXADqSg@mail.gmail.com
| |
The modified error message for regcollationin failure includes
the database encoding, which it should've occurred to me is a
portability hazard for the regression tests. Adjust the test
so the expected output doesn't include that.
In passing, fix a comment typo introduced in b8c0ffbd2.
Per buildfarm.
| |
Given the soft-input-error feature, we can reduce these functions
to be just thin wrappers around a soft-error call of the
corresponding datatype input function. This means less code and
more certainty that the to_reg* functions match the normal input
behavior.
Notably, it also means that they will accept numeric OID input,
which they didn't before. It's not clear to me if that omission
had more than laziness behind it, but it doesn't seem like
something we need to work hard to preserve.
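In effect, each to_reg* function is now reduced to roughly this shape
(a sketch; the real code first looks up the datatype's input function,
and flinfo/string here are illustrative):

    Datum       result;
    ErrorSaveContext escontext = {T_ErrorSaveContext};

    if (!InputFunctionCallSafe(&flinfo, string,
                               InvalidOid, -1,
                               (Node *) &escontext, &result))
        PG_RETURN_NULL();       /* bad input yields NULL, not an error */
    PG_RETURN_DATUM(result);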
Discussion: https://postgr.es/m/3910031.1672095600@sss.pgh.pa.us
| |
This is not really complete, but it catches most cases of practical
interest. The main omissions are:
* regtype, regprocedure, and regoperator parse type names by
calling the main grammar, so any grammar-detected syntax error
will still be a hard error. Also, if one includes a type
modifier in such a type specification, errors detected by the
typmodin function will be hard errors.
* Lookup errors are handled just by passing missing_ok = true
to the relevant catalog lookup function. Because we've used
quite a restrictive definition of "missing_ok", this means that
edge cases such as "the named schema exists, but you lack
USAGE permission on it" are still hard errors.
It would make sense to me to replace most/all missing_ok
parameters with an escontext parameter and then allow these
additional lookup failure cases to be trapped too. But that's
a job for some other day.
Discussion: https://postgr.es/m/3342239.1671988406@sss.pgh.pa.us
| |
This is slightly tedious because the adjustments cascade through
a couple of levels of subroutines, but it's not very hard.
I chose to avoid changing function signatures more than absolutely
necessary, by passing the escontext pointer in existing structs
where possible.
tsquery's nuisance NOTICEs about empty queries are suppressed in
soft-error mode, since they're not errors and we surely don't want
them to be shown to the user anyway. Maybe that whole behavior
should be reconsidered.
Discussion: https://postgr.es/m/3824377.1672076822@sss.pgh.pa.us
| |
Historically these input functions just called strtoul or strtoull
and returned the result, with no error detection whatever. Upgrade
them to reject garbage input and out-of-range values, similarly to
our other numeric input routines.
To share the code for this with type oid, adjust the existing
"oidin_subr" to be agnostic about the SQL name of the type it is
handling, and move it to numutils.c; then clone it for 64-bit types.
Because the xid types previously accepted hex and octal input by
reason of calling strtoul[l] with third argument zero, I made the
common subroutine do that too, with the consequence that type oid
now also accepts hex and octal input. In view of 6fcda9aba, that
seems like a good thing.
While at it, simplify the existing over-complicated handling of
syntax errors from strtoul: we only need one ereturn not three.
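The common subroutine behaves roughly like this sketch (illustrative,
not the exact numutils.c code):

    uint64      val;
    char       *endptr;

    errno = 0;
    val = strtoull(str, &endptr, 0);    /* base 0: decimal, hex, octal */
    if (errno == ERANGE)
        ereturn(escontext, 0,
                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                 errmsg("value \"%s\" is out of range for type %s",
                        str, type_name)));
    if (endptr == str || *endptr != '\0')
        ereturn(escontext, 0,
                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                 errmsg("invalid input syntax for type %s: \"%s\"",
                        type_name, str)));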
Discussion: https://postgr.es/m/3526121.1672000729@sss.pgh.pa.us
| |
When VACUUM determines that an existing MultiXact should use a freeze
plan that sets xmax to InvalidTransactionId, the original Multi may or
may not be before OldestMxact. Remove an incorrect assertion that
expected it to always be from before OldestMxact.
Oversight in commit 4ce3af.
Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/TYAPR01MB5866B24104FD80B5D7E65C3EF5ED9@TYAPR01MB5866.jpnprd01.prod.outlook.com
| |
This enables streaming or serializing changes immediately in logical
decoding. The parameter is intended for testing logical decoding
and replication of large transactions, for which we would otherwise need
to generate changes until logical_decoding_work_mem is reached.
This helps reduce the runtime of existing tests related to logical
replication of in-progress transactions and will help in writing tests
for the upcoming feature for parallelly applying large in-progress
transactions.
Author: Shi yu
Reviewed-by: Sawada Masahiko, Shveta Mallik, Amit Kapila, Dilip Kumar, Kuroda Hayato, Kyotaro Horiguchi
Discussion: https://postgr.es/m/OSZPR01MB63104E7449DBE41932DB19F1FD1B9@OSZPR01MB6310.jpnprd01.prod.outlook.com
| |
I missed this in my initial survey, probably because I examined
the contents of pg_type in the postgres database, which lacks
any enumerated types.
Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com
| |
Reviewed by Tom Lane
Discussion: https://postgr.es/m/a8dc5700-c341-3ba8-0507-cc09881e6200@dunslane.net
| |
While at it, make these functions use a common subroutine.
This doesn't move the needle very far in terms of making these
functions non-throwing; the only case we're now able to trap is
numeric-OID-is-out-of-range. Still, it seems like a pretty
non-controversial step in that direction.
| |
Fix incorrect code which was trying to take a Bitmapset of columns with
attnums according to a parent table and transform it into the
equivalent Bitmapset with the attnums according to the given child table.
This code is new as of a61b1f748 and was failing to do the correct
translation when there was an intermediate parent table between 'rel' and
'top_parent_rel'.
Reported-by: Ranier Vilela
Author: Richard Guo, Amit Langote
Discussion: https://postgr.es/m/CAEudQArohfB_Gy%2BhcH2-bANUkxgjJiP%3DABq01_LgTNTbcNijag%40mail.gmail.com
| |
An epoll fd belonging to the parent should be closed in the child. A
kqueue fd is automatically closed by fork(), but we should still adjust
our counter. For poll and Windows systems, nothing special is required.
On all systems we free the memory.
No caller yet, but we'll need this if we start using WaitEventSet in the
postmaster as planned.
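A usage sketch in the child process (the call name here follows this
commit's purpose; treat it as illustrative):

    if (fork() == 0)
    {
        /* close the inherited epoll fd / fix up the kqueue fd counter */
        FreeWaitEventSetAfterFork(wes);
    }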
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKG%2BZ-HpOj1JsO9eWUP%2Bar7npSVinsC_npxSy%2BjdOMsx%3DGg%40mail.gmail.com
| |
At the time of commit 6a2a70a02 we didn't use latch infrastructure in
the postmaster. We're planning to start doing that, so we'd better make
sure that the signalfd inherited from a postmaster is not duplicated and
then leaked in the child.
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2BZ-HpOj1JsO9eWUP%2Bar7npSVinsC_npxSy%2BjdOMsx%3DGg%40mail.gmail.com
| |
To be able to handle incoming connections on a server socket with
the WaitEventSet API, we'll need a new kind of event to indicate that
the socket is ready to accept a connection.
On Unix, it's just the same as WL_SOCKET_READABLE, but on Windows there
is a different underlying kernel event that we need to map our
abstraction to.
No user yet, but a proposed patch would use this.
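A caller sketch, assuming the new event is exposed as WL_SOCKET_ACCEPT:

    WaitEvent   event;

    AddWaitEventToSet(set, WL_SOCKET_ACCEPT, listen_sock, NULL, NULL);
    if (WaitEventSetWait(set, -1, &event, 1, 0) == 1 &&
        (event.events & WL_SOCKET_ACCEPT))
        accept_connection(event.fd);    /* hypothetical handler */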
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKG%2BZ-HpOj1JsO9eWUP%2Bar7npSVinsC_npxSy%2BjdOMsx%3DGg%40mail.gmail.com
| |
Three error strings used with cache lookup failures were referring to
incorrect object types for ACL checks:
- Schemas
- Types
- Foreign Servers
These errors should never be triggered, but if they were, incorrect
information would be reported.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20221222153041.GN1153@telsasoft.com
Backpatch-through: 11
| |
The former name was discussed as being confusing, so use "split", as per
a suggestion from Magnus Hagander.
While on it, one of the output arguments is renamed from "segno" to
"segment_number", as per a suggestion from Kyotaro Horiguchi.
The documentation is updated to reflect all these changes.
Bump catalog version.
Author: Bharath Rupireddy, Michael Paquier
Discussion: https://postgr.es/m/CABUevEytQVaOOhGdoh0D7hGwe3fuKcRF6NthsSW7ww04EmtFgQ@mail.gmail.com
| |
WindowFuncs such as row_number() don't care whether they're called with ROWS
UNBOUNDED PRECEDING AND CURRENT ROW or with RANGE UNBOUNDED PRECEDING AND
CURRENT ROW. The latter is less efficient as the RANGE option requires
that the executor check for peer rows, so using the ROW option instead
would cause less overhead. Because RANGE is part of the default frame
options for WindowClauses, it means WindowAgg is, by default, working much
harder than it needs to for window functions where the ROWS / RANGE option
has no effect on the window function's result.
On a test query from the discussion thread, a performance improvement of
344% was seen by using ROWS instead of RANGE.
Here we add a new support function node type to allow support functions to
be called for window functions so that the most optimal version of the
frame options can be set. The planner has been adjusted so that the frame
options are changed only if all window functions sharing the same window
clause agree on what the optimized frame options are.
Here we give the ability for row_number(), rank(), dense_rank(),
percent_rank(), cume_dist() and ntile() to alter their WindowClause's
frameOptions.
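The support function for one of these window functions ends up looking
roughly like this (a sketch; it assumes the new support request node is
named SupportRequestOptimizeWindowClause):

    Datum
    window_row_number_support(PG_FUNCTION_ARGS)
    {
        Node       *rawreq = (Node *) PG_GETARG_POINTER(0);

        if (IsA(rawreq, SupportRequestOptimizeWindowClause))
        {
            SupportRequestOptimizeWindowClause *req =
                (SupportRequestOptimizeWindowClause *) rawreq;

            /* row_number() is unaffected by the frame: pick cheap ROWS */
            req->frameOptions = (FRAMEOPTION_NONDEFAULT |
                                 FRAMEOPTION_ROWS |
                                 FRAMEOPTION_START_UNBOUNDED_PRECEDING |
                                 FRAMEOPTION_END_CURRENT_ROW);
            PG_RETURN_POINTER(req);
        }
        PG_RETURN_POINTER(NULL);
    }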
Reviewed-by: Vik Fearing, Erwin Brandstetter, Zhihong Yu
Discussion: https://postgr.es/m/CAGHENJ7LBBszxS+SkWWFVnBmOT2oVsBhDMB1DFrgerCeYa_DyA@mail.gmail.com
Discussion: https://postgr.es/m/CAApHDvohAKEtTXxq7Pc-ic2dKT8oZfbRKeEJP64M0B6+S88z+A@mail.gmail.com
| |
Use C99 designated initializer syntax for the array elements, instead of
writing the enumerator name and position in a comment. Replace nkeys
and key with a local variadic macro, for a shorter notation.
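An illustrative entry in the new style, with KEY standing in for the
local variadic macro:

    [AGGFNOID] = {
        AggregateRelationId,
        AggregateFnoidIndexId,
        KEY(Anum_pg_aggregate_aggfnoid),
        16
    },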
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/CA%2BhUKGKdpDjKL2jgC-GpoL4DGZU1YPqnOFHbDqFkfRQcPaR5DQ%40mail.gmail.com
| |
Perform a failsafe check every time VACUUM's first heap scan scans a
further FAILSAFE_EVERY_PAGES pages, rather than using an approach based
on the number of physical blocks that our current blkno is from the
blkno at the time of the previous failsafe check. That way VACUUM will
perform a failsafe check every time it has scanned a uniform number of
pages, without it mattering when or how VACUUM skipped pages using the
visibility map.
Sami Imseih, with changes to FAILSAFE_EVERY_PAGES comments added by me.
Author: Sami Imseih <simseih@amazon.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/401CE010-4049-4B94-9961-0B610A5D254D%40amazon.com
| |
Use a dedicated struct for the XID/MXID cutoffs used by VACUUM, such as
FreezeLimit and OldestXmin. This state is initialized in vacuum.c, and
then passed around by code from vacuumlazy.c to heapam.c freezing
related routines. The new convention is that everybody works off of the
same cutoff state, which is passed around via pointers to const.
Also simplify some of the logic for dealing with frozen xmin in
heap_prepare_freeze_tuple: add dedicated "xmin_already_frozen" state to
clearly distinguish xmin XIDs that we're going to freeze from those that
were already frozen from before. That way the routine's xmin handling
code is symmetrical with the existing xmax handling code. This is
preparation for an upcoming commit that will add page level freezing.
Also refactor the control flow within FreezeMultiXactId(), while adding
stricter sanity checks. We now test OldestXmin directly, instead of
using FreezeLimit as an inexact proxy for OldestXmin. This is further
preparation for the page level freezing work, which will make the
function's caller cede control of page level freezing to the function
where appropriate (where heap_prepare_freeze_tuple sees a tuple that
happens to contain a MultiXactId in its xmax).
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CAH2-WznS9TxXmz2_=SY+SyJyDFbiOftKofM9=aDo68BbXNBUMA@mail.gmail.com
| |
perform_pullup_replace_vars() knows how to scan the whole parent
query tree when we are replacing Vars during a subquery flattening
operation. However, for the specific case of flattening a
UNION ALL leaf query, that's mostly wasted work: the only place
where relevant Vars could exist is in the AppendRelInfo that we
just made for this leaf. Teaching perform_pullup_replace_vars()
to just deal with that and exit is worthwhile because, if we have
N such subqueries to pull up, we were spending O(N^2) work uselessly
mutating the AppendRelInfos for all the other subqueries.
While we're at it, avoid calling substitute_phv_relids if there are no
PlaceHolderVars, and remove an obsolete check of parse->hasSubLinks.
Andrey Lepikhov and Tom Lane
Discussion: https://postgr.es/m/703c09a2-08f3-d2ec-b33d-dbecd62428b8@postgrespro.ru
| |
Andrey Lepikhov demonstrated a case where we spend an unreasonable
amount of time in pull_up_subqueries(). Not only is that recursing
with no explicit check for stack overrun, but the code seems not
interruptible by control-C. Let's stick a CHECK_FOR_INTERRUPTS
there, along with sprinkling some stack depth checks.
An actual fix for the excessive time consumption seems a bit
risky to back-patch; but this isn't, so let's do so.
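The guards in question are the standard pair:

    /* fail cleanly before the stack overruns; honor pending cancels */
    check_stack_depth();
    CHECK_FOR_INTERRUPTS();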
Discussion: https://postgr.es/m/703c09a2-08f3-d2ec-b33d-dbecd62428b8@postgrespro.ru
| |
The second appearance of NamespaceRelationId in this if-else chain is
in error and can be removed.
| |
A bitwise operator was getting used on two bools in
ATAddCheckConstraint() to track if constraints should be merged or not
with the existing ones of a relation, though obviously this should use
a boolean OR operator. This led to the same result, but let's be
clean.
Oversight in 074c5cf.
Author: Ranier Vilela
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/CAEudQAp2R2fbbi0OHHhv_n4=Ch0t1VtjObR9YMqtGKHJ+faUFQ@mail.gmail.com
| |
This introduces palloc_aligned() and MemoryContextAllocAligned() which
allow callers to obtain memory which is allocated to the given size and
also aligned to the specified alignment boundary. The alignment
boundaries may be any power-of-2 value. Currently, the alignment is
capped at 2^26; however, we don't expect values anywhere near that large.
The primary expected use case is to align allocations to perhaps CPU
cache line size or to maybe I/O page size. Certain use cases can benefit
from having aligned memory by either having better performance or more
predictable performance.
The alignment is achieved by requesting 'alignto' additional bytes from
the underlying allocator function and then aligning the address that is
returned to the requested alignment. This obviously does waste some
memory, so alignments should be kept as small as required.
It's also important to note that these alignment bytes eat into the
maximum allocation size. So something like:
palloc_aligned(MaxAllocSize, 64, 0);
will not work as we cannot request MaxAllocSize + 64 bytes.
Additionally, because we're just requesting the requested size plus the
alignment requirements from the given MemoryContext, this is not going to
work if that context is the Slab allocator, since slab can only provide
chunks of the size that was specified when the slab context was created.
Slab will generate an error to indicate that the requested size is not
supported.
The alignment that is requested in palloc_aligned() is stored along with
the allocated memory. This allows the alignment to remain intact through
repalloc() calls.
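A usage sketch, assuming a 4096-byte I/O page as the target boundary and
that MCXT_ALLOC_ZERO is a valid flag here, as with palloc_extended():

    /* an 8 kB buffer aligned to a presumed 4 kB I/O page size */
    char       *buf = (char *) palloc_aligned(8192, 4096, MCXT_ALLOC_ZERO);

    /* the stored alignment means this stays 4 kB aligned */
    buf = (char *) repalloc(buf, 16384);
    pfree(buf);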
Author: Andres Freund, David Rowley
Reviewed-by: Maxim Orlov, Andres Freund, John Naylor
Discussion: https://postgr.es/m/CAApHDvpxLPUMV1mhxs6g7GNwCP6Cs6hfnYQL5ffJQTuFAuxt8A%40mail.gmail.com
| |
This is the guts of float4in, callable as a routine to input floats,
which will be useful in an upcoming patch for allowing soft errors in
the seg module's input function.
A similar operation was performed some years ago for float8in in
commit 50861cd683e.
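Since the split mirrors float8in_internal, callers presumably end up with
something like this sketch:

    /* parse a float4 out of a larger string, reporting errors softly */
    float4      val = float4in_internal(str, &endptr, "seg", str,
                                        escontext);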
Reviewed by Tom Lane
Discussion: https://postgr.es/m/cee4e426-d014-c0b7-aa22-a659f2cd9130@dunslane.net
| |
d21ded75f changed the way slab.c works but introduced a bug that meant we
could end up with the slab's curBlocklistIndex pointing to the wrong list.
The condition which was checking for this was failing to account for two
things:
1. The curBlocklistIndex could be 0 as we've currently got no non-full
blocks to put chunks on. In this case, the dlist_is_empty() check cannot
be performed as there can be any number of completely full blocks at that
index.
2. The curBlocklistIndex may be greater than the index we just moved the
block onto. Since we need to ensure we fill up fuller blocks first, we
must reset curBlocklistIndex when changing any blocklist element that's
less than the curBlocklistIndex too.
Reported-by: Takamichi Osumi
Discussion: https://postgr.es/m/TYCPR01MB8373329C6329768D7E093D68EDEB9@TYCPR01MB8373.jpnprd01.prod.outlook.com
| |
This shaves some code by replacing combinations of
CreateTemplateTupleDesc()/TupleDescInitEntry(), which hardcoded a mapping
of the attributes listed in pg_proc.dat, with get_call_result_type() to
build the TupleDesc needed for the rows generated.
get_call_result_type() is more expensive than the former style, but this
removes some duplication with the lists of OUT parameters (pg_proc.dat
and the attributes hardcoded in these code paths). This is applied to
functions that are not considered critical (i.e., that could be called
repeatedly for monitoring purposes).
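The replacement pattern is the standard get_call_result_type() idiom:

    TupleDesc   tupdesc;

    /* build the output row description from the pg_proc entry */
    if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
        elog(ERROR, "return type must be a row type");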
Author: Bharath Rupireddy
Reviewed-by: Robert Haas, Álvaro Herrera, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CALj2ACV23HW5HP5hFjd89FNS-z5X8r2jNXdMXcpN2BgTtKd87w@mail.gmail.com
| |
Discussion: https://postgr.es/m/222b43a5-2fb3-2c1b-9cd0-375d376c8246@dunslane.net
| |
Commit 927f453a9 disallowed the use of the batching added by commit
b663a4136 for the inserts performed as part of cross-partition updates of
partitioned tables, mainly because the previous code in
nodeModifyTable.c couldn't handle pending inserts into foreign-table
partitions that are also UPDATE target partitions. But we don't have
such a limitation anymore (cf. commit ffbb7e65a), so let's allow for
this by removing from execPartition.c the restriction added by commit
927f453a9 that batching is only allowed if the query command type is
CMD_INSERT.
In postgres_fdw, since commit 86dc90056 changed it to effectively
disable cross-partition updates in the case where a foreign-table
partition chosen to insert rows into is also an UPDATE target partition,
allow batching in the case where a foreign-table partition chosen to
do so is *not* also an UPDATE target partition. This is enabled by the
"batch_size" option added by commit b663a4136, which is disabled by
default.
This patch also adjusts the test case added by commit 927f453a9 to
confirm that the inserts performed as part of a cross-partition update
of a partitioned table indeed use batching.
Amit Langote, reviewed and/or tested by Georgios Kokolatos, Zhihong Yu,
Bharath Rupireddy, Hou Zhijie, Vignesh C, and me.
Discussion: http://postgr.es/m/CA%2BHiwqH1Lz1yJmPs%3DaD-pzd_HLLynLHvq5iYeT9mB0bBV7oJ6w%40mail.gmail.com
| |
1349d279 added query planner support to allow more efficient execution of
aggregate functions which have an ORDER BY or a DISTINCT clause. Prior to
that commit, the planner would only request that the lower planner produce
a plan with the order required for the GROUP BY clause and it would be
left up to nodeAgg.c to perform the final sort of records within each
group so that the aggregate transition functions were called in the
correct order. Now that the planner requests that the lower planner produce a
plan with the GROUP BY and the ORDER BY / DISTINCT aggregates in mind,
there is the possibility that the planner chooses a plan which could be
less efficient than what would have been produced before 1349d279.
While developing 1349d279, I had in mind that Incremental Sort would help
us in cases where an index exists only on the GROUP BY column(s).
Incremental Sort would just replace the implicit tuplesorts which are
being performed in nodeAgg.c. However, because the planner has the
flexibility to instead choose a plan which just performs a full sort on
both the GROUP BY and ORDER BY / DISTINCT aggregate columns, there is
potential for the planner to make a bad choice. The costing for
Incremental Sort is not perfect as it assumes an even distribution of rows
to sort within each sort group.
Here we add an escape hatch in the form of the enable_presorted_aggregate
GUC. This will allow users to get the pre-PG16 behavior in cases where
they have no other means to convince the query planner to produce a plan
which only sorts on the GROUP BY column(s).
Discussion: https://postgr.es/m/CAApHDvr1Sm+g9hbv4REOVuvQKeDWXcKUAhmbK5K+dfun0s9CvA@mail.gmail.com
| |
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Only once we already have 10 of these do we
free the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
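A sketch of the emptyblocks bookkeeping, with illustrative names:

    /* keep up to 10 empty blocks around for quick reuse */
    if (dclist_count(&slab->emptyblocks) < 10)
        dclist_push_head(&slab->emptyblocks, &block->node);
    else
        free(block);            /* past the cap, return it to the OS */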
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
| |
This is less error prone and matches the placement of other code
in the file.
Justin Pryzby
Reviewed by Tom Lane
Discussion: https://www.postgresql.org/message-id/20221123172436.GJ11463@telsasoft.com
| |
This function takes as input a WAL segment name and returns a tuple made
of the segment sequence number (dependent on the WAL segment size of the
cluster) and its timeline, as a thin SQL wrapper around the existing
XLogFromFileName().
This function has multiple uses, like being able to compute an LSN from
a file name and an offset, or finding the timeline of a segment without
having to do some maths based on the first eight characters of the
segment name.
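The underlying C call that this SQL function wraps:

    TimeLineID  tli;
    XLogSegNo   segno;

    /* parse timeline and segment number out of a WAL file name */
    XLogFromFileName("000000010000000100000023", &tli, &segno,
                     wal_segment_size);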
Bump catalog version.
Author: Bharath Rupireddy
Reviewed-by: Nathan Bossart, Kyotaro Horiguchi, Maxim Orlov, Michael
Paquier
Discussion: https://postgr.es/m/CALj2ACWV=FCddsxcGbVOA=cvPyMr75YCFbSQT6g4KDj=gcJK4g@mail.gmail.com
|