| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The prohibitValueChange code paths in set_config_option(), which
are executed whenever we re-read a PGC_POSTMASTER variable from
postgresql.conf, neglected to free anything before exiting. Thus
we'd leak the proposed new value of a PGC_STRING variable, as noted
by BoChen in bug #16666. For all variable types, if the check hook
creates an "extra" chunk, we'd also leak that.
These are malloc not palloc chunks, so there is no mechanism for
recovering the leaks before process exit. Fortunately, the values
are typically not very large, meaning you'd have to go through an
awful lot of SIGHUP configuration-reload cycles to make the leakage
amount to anything. Still, for a long-lived postmaster process it
could potentially be a problem.
Oversight in commit 2594cf0e8. Back-patch to all supported branches.
Discussion: https://postgr.es/m/16666-2c41a4eec61b03e1@postgresql.org
|
|
|
|
|
|
|
| |
The previous description string still described the pre-PostgreSQL
10 (pre eb61136dc75a76caef8460fa939244d8593100f2) behavior of
selecting between encrypted and unencrypted, but it is now choosing
between encryption algorithms.
|
|
|
|
|
|
|
|
| |
This reverts commit cb2fd7eac285b1b0a24eeb2b8ed4456b66c5a09f. Per
numerous buildfarm members, it was incompatible with parallel query, and
a test case assumed LP64. Back-patch to 9.5 (all supported versions).
Discussion: https://postgr.es/m/20200321224920.GB1763544@rfd.leadboat.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now, only selected bulk operations (e.g. COPY) did this. If a
given relfilenode received both a WAL-skipping COPY and a WAL-logged
operation (e.g. INSERT), recovery could lose tuples from the COPY. See
src/backend/access/transam/README section "Skipping WAL for New
RelFileNode" for the new coding rules. Maintainers of table access
methods should examine that section.
To maintain data durability, just before commit, we choose between an
fsync of the relfilenode and copying its contents to WAL. A new GUC,
wal_skip_threshold, guides that choice. If this change slows a workload
that creates small, permanent relfilenodes under wal_level=minimal, try
adjusting wal_skip_threshold. Users setting a timeout on COMMIT may
need to adjust that timeout, and log_min_duration_statement analysis
will reflect time consumption moving to COMMIT from commands like COPY.
Internally, this requires a reliable determination of whether
RollbackAndReleaseCurrentSubTransaction() would unlink a relation's
current relfilenode. Introduce rd_firstRelfilenodeSubid. Amend the
specification of rd_createSubid such that the field is zero when a new
rel has an old rd_node. Make relcache.c retain entries for certain
dropped relations until end of transaction.
Back-patch to 9.5 (all supported versions). This introduces a new WAL
record type, XLOG_GIST_ASSIGN_LSN, without bumping XLOG_PAGE_MAGIC. As
always, update standby systems before master systems. This changes
sizeof(RelationData) and sizeof(IndexStmt), breaking binary
compatibility for affected extensions. (The most recent commit to
affect the same class of extensions was
089e4d405d0f3b94c74a2c6a54357a84a681754b.)
Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert
Haas. Heikki Linnakangas and Michael Paquier implemented earlier
designs that materially clarified the problem. Reviewed, in earlier
designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane,
Fujii Masao, and Simon Riggs. Reported by Martijn van Oosterhout.
Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise it can be hard to see where an error is coming from, when
the parallel worker sets all the GUCs that it received from the
leader. Bug #15726. Back-patch to 9.5, where RestoreGUCState()
appeared.
Reported-by: Tiago Anastacio
Reviewed-by: Daniel Gustafsson, Tom Lane
Discussion: https://postgr.es/m/15726-6d67e4fa14f027b3%40postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An empty file name or subdirectory name leads join_path_components() to
just produce the parent directory name, which leads to weird failures or
recursive inclusions. Let's throw a specific error for that. It takes
only slightly more code to detect all-blank names, so do so.
Also, detect direct recursion, ie a file calling itself. As coded
this will also detect recursion via "include_dir '.'", which is
perhaps more likely than explicitly including the file itself.
Detecting indirect recursion would require API changes for guc-file.l
functions, which seems not worth it since extensions might call them.
The nesting depth limit will catch such cases eventually, just not
with such an on-point error message.
In passing, adjust the example usages in postgresql.conf.sample
to perhaps eliminate the problem at the source: there's no reason
for the examples to suggest that an empty value is valid.
Per a trouble report from Brent Bates. Back-patch to 9.5; the
issue is old, but the code in 9.4 is enough different that the
patch doesn't apply easily, and it doesn't seem worth the trouble
to fix there.
Ian Barwick and Tom Lane
Discussion: https://postgr.es/m/8c8bcbca-3bd9-dc6e-8986-04a5abdef142@2ndquadrant.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ALTER SYSTEM itself normally won't make duplicate entries (although
up till this patch, it was possible to confuse it by writing case
variants of a GUC's name). However, if some external tool has appended
entries to the file, that could result in duplicate entries for a single
GUC name. In such a situation, ALTER SYSTEM did exactly the wrong thing,
because it replaced or removed only the first matching entry, leaving
the later one(s) still there and hence still determining the active value.
This patch fixes that by making ALTER SYSTEM sweep through the file and
remove all matching entries, then (if not ALTER SYSTEM RESET) append the
new setting to the end. This means entries will be in order of last
setting rather than first setting, but that shouldn't hurt anything.
Also, make the comparisons case-insensitive so that the right things
happen if you do, say, ALTER SYSTEM SET "TimeZone" = 'whatever'.
This has been broken since ALTER SYSTEM was invented, so back-patch
to all supported branches.
Ian Barwick, with minor mods by me
Discussion: https://postgr.es/m/aed6cc9f-98f3-2693-ac81-52bb0052307e@2ndquadrant.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, include actions include_dir, include_if_exists, and include
listed commented-out values which were not the defaults, which is
inconsistent with other entries. Instead, replace them with '', which
is the default value.
Reported-by: Emanuel Araújo
Discussion: https://postgr.es/m/CAMuTAkYMx6Q27wpELDR3_v9aG443y7ZjeXu15_+1nGUjhMWOJA@mail.gmail.com
Backpatch-through: 9.4
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
None of the code that uses GUC values is really prepared for them to
hold NaN, but parse_real() didn't have any defense against accepting
such a value. Treat it the same as a syntax error.
I haven't attempted to analyze the exact consequences of setting any
of the float GUCs to NaN, but since they're quite unlikely to be good,
this seems like a back-patchable bug fix.
Note: we don't need an explicit test for +-Infinity because those will
be rejected by existing range checks. I added a regression test for
that in HEAD, but not older branches because the spelling of the value
in the error message will be platform-dependent in branches where we
don't always use port/snprintf.c.
Discussion: https://postgr.es/m/1798.1552165479@sss.pgh.pa.us
|
|
|
|
|
|
|
|
| |
Reflecting an updated parameter value requires a server restart, which
was not mentioned in the documentation and in postgresql.conf.sample.
Reported-by: Thomas Poty
Discussion: https://postgr.es/m/15659-0cd812f13027a2d8@postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On some operating systems, it doesn't make sense to retry fsync(),
because dirty data cached by the kernel may have been dropped on
write-back failure. In that case the only remaining copy of the
data is in the WAL. A subsequent fsync() could appear to succeed,
but not have flushed the data. That means that a future checkpoint
could apparently complete successfully but have lost data.
Therefore, violently prevent any future checkpoint attempts by
panicking on the first fsync() failure. Note that we already
did the same for WAL data; this change extends that behavior to
non-temporary data files.
Provide a GUC data_sync_retry to control this new behavior, for
users of operating systems that don't eject dirty data, and possibly
forensic/testing uses. If it is set to on and the write-back error
was transient, a later checkpoint might genuinely succeed (on a
system that does not throw away buffers on failure); if the error is
permanent, later checkpoints will continue to fail. The GUC defaults
to off, meaning that we panic.
Back-patch to all supported releases.
There is still a narrow window for error-loss on some operating
systems: if the file is closed and later reopened and a write-back
error occurs in the intervening time, but the inode has the bad
luck to be evicted due to memory pressure before we reopen, we could
miss the error. A later patch will address that with a scheme
for keeping files with dirty data open at all times, but we judge
that to be too complicated to back-patch.
Author: Craig Ringer, with some adjustments by Thomas Munro
Reported-by: Craig Ringer
Reviewed-by: Robert Haas, Thomas Munro, Andres Freund
Discussion: https://postgr.es/m/20180427222842.in2e4mibx45zdth5%40alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously it was possible to set client_min_messages to FATAL or PANIC,
which had the effect of suppressing transmission of regular ERROR messages
to the client. Perhaps that seemed like a useful option in the past, but
the trouble with it is that it breaks guarantees that are explicitly made
in our FE/BE protocol spec about how a query cycle can end. While libpq
and psql manage to cope with the omission, that's mostly because they
are not very bright; client libraries that have more semantic knowledge
are likely to get confused. Notably, pgODBC doesn't behave very sanely.
Let's fix this by getting rid of the ability to set client_min_messages
above ERROR.
In HEAD, just remove the FATAL and PANIC options from the set of allowed
enum values for client_min_messages. (This change also affects
trace_recovery_messages, but that's OK since these aren't useful values
for that variable either.)
In the back branches, there was concern that rejecting these values might
break applications that are explicitly setting things that way. I'm
pretty skeptical of that argument, but accommodate it by accepting these
values and then internally setting the variable to ERROR anyway.
In all branches, this allows a couple of tiny simplifications in the
logic in elog.c, so do that.
Also respond to the point that was made that client_min_messages has
exactly nothing to do with the server's logging behavior, and therefore
does not belong in the "When To Log" subsection of the documentation.
The "Statement Behavior" subsection is a better match, so move it there.
Jonah Harris and Tom Lane
Discussion: https://postgr.es/m/7809.1541521180@sss.pgh.pa.us
Discussion: https://postgr.es/m/15479-ef0f4cc2fd995ca2@postgresql.org
|
|
|
|
|
|
|
|
|
|
| |
Follow on patch for commit 3e0f1a4741f564c1a2fa6e944729d6967355d8c7.
Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/369ec766-b947-51bd-4dad-6fb9e026439f@2ndquadrant.com
Backpatch-through: 9.4
|
|
|
|
|
|
|
|
|
|
|
| |
Clarify that effective_cache_size is both kernel buffers and shared
buffers.
Reported-by: nat@makarevitch.org
Discussion: https://postgr.es/m/153685164808.22334.15432535018443165207@wrigleys.postgresql.org
Backpatch-through: 9.3
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Give a specific error complaining about lack of posix_fadvise() when
someone tries to set effective_io_concurrency > 0 on platforms
without that.
This probably isn't worth extensive back-patching, but I (tgl) felt
cramming it into v11 was reasonable.
James Robinson
Discussion: https://postgr.es/m/153771876450.14994.560017943128223619@wrigleys.postgresql.org
Discussion: https://postgr.es/m/A3942987-5BC7-4F05-B54D-2A0EC2914B33@jlr-photo.com
|
|
|
|
|
|
|
|
|
|
|
| |
Per discussion, JIT isn't quite mature enough to ship enabled-by-default.
I failed to resist the temptation to do a bunch of copy-editing on the
related documentation. Also, clean up some inconsistencies in which
section of config.sgml the JIT GUCs are documented in vs. what guc.c
and postgresql.config.sample had.
Discussion: https://postgr.es/m/20180914222657.mw25esrzbcnu6qlu@alap3.anarazel.de
|
|
|
|
|
| |
Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20180821.115841.93250330.horiguchi.kyotaro@lab.ntt.co.jp
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix a small number of places that were testing the result of snprintf()
but doing so incorrectly. The right test for buffer overrun, per C99,
is "result >= bufsize" not "result > bufsize". Some places were also
checking for failure with "result == -1", but the standard only says
that a negative value is delivered on failure.
(Note that this only makes these places correct if snprintf() delivers
C99-compliant results. But at least now these places are consistent
with all the other places where we assume that.)
Also, make psql_start_test() and isolation_start_test() check for
buffer overrun while constructing their shell commands. There seems
like a higher risk of overrun, with more severe consequences, here
than there is for the individual file paths that are made elsewhere
in the same functions, so this seemed like a worthwhile change.
Also fix guc.c's do_serialize() to initialize errno = 0 before
calling vsnprintf. In principle, this should be unnecessary because
vsnprintf should have set errno if it returns a failure indication ...
but the other two places this coding pattern is cribbed from don't
assume that, so let's be consistent.
These errors are all very old, so back-patch as appropriate. I think
that only the shell command overrun cases are even theoretically
reachable in practice, but there's not much point in erroneous error
checks.
Discussion: https://postgr.es/m/17245.1534289329@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
6ca33a88 sets upper limit for vacuum_cleanup_index_scale_factor to
DBL_MAX. DBL_MAX appears to be platform-dependent. That causes
many buildfarm animals to fail, because we check boundaries of
vacuum_cleanup_index_scale_factor in regression tests.
This commit changes upper limit from DBL_MAX to just "large enough"
limit, which was arbitrary selected as 1e10.
Author: Alexander Korotkov
Reported-by: Tom Lane, Darafei Praliaskouski
Discussion: https://postgr.es/m/CAPpHfdvewmr4PcpRjrkstoNn1n2_6dL-iHRB21CCfZ0efZdBTg%40mail.gmail.com
Discussion: https://postgr.es/m/CAC8Q8tLYFOpKNaPS_E7V8KtPdE%3D_TnAn16t%3DA3LuL%3DXjfOO-BQ%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upper limits for vacuum_cleanup_index_scale_factor GUC and reloption
were initially set to 100.0 in 857f9c36. However, after further
discussion, it appears that some users like to disable B-tree cleanup
index scan completely (assuming there are no deleted pages).
vacuum_cleanup_index_scale_factor is used barely to protect against
stalled index statistics. And after detailed consideration it appears
that risk of stalled index statistics is low. And it would be nice to
allow advanced users setting higher values of
vacuum_cleanup_index_scale_factor. So, set upper limit for these
GUC and reloption to DBL_MAX.
Author: Alexander Korotkov
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAC8Q8tJCb%3DgxhzcV7T6ctx7PY-Ux1oA-AsTJc6cAVNsQiYcCzA%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vacuum_cleanup_index_scale_factor was located in autovacuum group of
GUCs. However, it affects not only autovacuum, but also manually run
VACUUM. It appears that "client connection defaults" group of GUCs
is more appropriate for vacuum_cleanup_index_scale_factor, because
vacuum_*_age options are already located there.
Also, vacuum_cleanup_index_scale_factor was missed in
postgresql.conf.sample. So, add it there with appropriate comment.
Author: Masahiko Sawada with minor editorization by me
Discussion: https://postgr.es/m/CAD21AoArsoXMLKudXSKN679FRzs6oubEchM53bHwn8Tp%3D2boNg%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also, fix the pg_settings view to display source filename and line
number when invoked by a pg_read_all_settings member. This addition by
me (Álvaro).
Also, fix wording of the comment in GetConfigOption regarding the
restriction it implements, renaming the parameter for extra clarity.
Noted by Michaël.
These were all oversight in commit 25fff40798fc; backpatch to pg10,
where that commit first appeared.
Author: Laurenz Albe
Reviewed-by: Michaël Paquier, Álvaro Herrera
Discussion: https://postgr.es/m/1519917758.6586.8.camel@cybertec.at
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 6e7baa3227 added support for "B" unit, for specifying config options
in bytes. However, it was only accepted in GUC_UNIT_BYTE settings,
wal_segment_size and track_activity_query_size, and not e.g. in work_mem.
This patch makes it consistent, so that "B" accepted in all the same
contexts where "kB", "MB", and so forth are accepted.
Add "B" to the list of accepted units in the error hint, along with "kB",
"MB", etc.
Add an entry in the conversion table for "TB" to "B" conversion. A terabyte
is out of range for any GUC_UNIT_BYTE option, so you always get an "out of
range" error with that, but without it, you get a confusing error message
that claims that "TB" is not an accepted unit, with a hint that nevertheless
lists "TB" as an accepted unit.
Reviewed-by: Alexander Korotkov, Andres Freund
Discussion: https://www.postgresql.org/message-id/1bfe7f4a-7e22-aa6e-7b37-f4d222ed2d67@iki.fi
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Change vacuum_cleanup_index_scale_factor GUC to PGC_USERSET.
vacuum_cleanup_index_scale_factor GUC was defined as PGC_SIGHUP. But this
GUC affects not only autovacuum. So it might be useful to change it from user
session in order to influence manually runned VACUUM.
- Add missing tab-complete support for vacuum_cleanup_index_scale_factor
reloption.
- Fix condition for B-tree index cleanup.
Zero value of vacuum_cleanup_index_scale_factor means that user wants B-tree
index cleanup to be never skipped.
- Documentation and comment improvements
Authors: Justin Pryzby, Alexander Korotkov, Liudmila Mantrova
Reviewed by: all authors and Robert Haas
Discussion: https://www.postgresql.org/message-id/flat/20180502023025.GD7631%40telsasoft.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Failure to use DatumGetFoo/FooGetDatum macros correctly, or at all,
causes some warnings about sign conversion. This is just cosmetic
at the moment but in principle it's a type violation, so clean up
the instances I could find.
autoprewarm.c and sharedfileset.c contained code that unportably
assumed that pid_t is the same size as int. We've variously dealt
with this by casting pid_t to int or to unsigned long for printing
purposes; I went with the latter.
Fix uninitialized-variable warning in RestoreGUCState. This is
a live bug in some sense, but of no great significance given that
nobody is very likely to care what "line number" is associated with
a GUC that hasn't got a source file recorded.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Recent gcc can warn about switch-case fall throughs that are not
explicitly labeled as intentional. This seems like a good thing,
so clean up the warnings exposed thereby by labeling all such
cases with comments that gcc will recognize.
In files that already had one or more suitable comments, I generally
matched the existing style of those. Otherwise I went with
/* FALLTHROUGH */, which is one of the spellings approved at the
more-restrictive-than-default level -Wimplicit-fallthrough=4.
(At the default level you can also spell it /* FALL ?THRU */,
and it's not picky about case. What you can't do is include
additional text in the same comment, so some existing comments
containing versions of this aren't good enough.)
Testing with gcc 8.0.1 (Fedora 28's current version), I found that
I also had to put explicit "break"s after elog(ERROR) or ereport(ERROR);
apparently, for this purpose gcc doesn't recognize that those don't
return. That seems like possibly a gcc bug, but it's fine because
in most places we did that anyway; so this amounts to a visit from the
style police.
Discussion: https://postgr.es/m/15083.1525207729@sss.pgh.pa.us
|
|
|
|
| |
Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This controls both plan-time and execution-time new-style partition
pruning. While finer-grain control is possible (maybe using an enum GUC
instead of boolean), there doesn't seem to be much need for that.
This new parameter controls partition pruning for all queries:
trivially, SELECT queries that affect partitioned tables are naturally
under its control since they are using the new technology. However,
while UPDATE/DELETE queries do not use the new code, we make the new GUC
control their behavior also (stealing control from
constraint_exclusion), because it is more natural, and it leads to a
more natural transition to the future in which those queries will also
use the new pruning code.
Constraint exclusion still controls pruning for regular inheritance
situations (those not involving partitioned tables).
Author: David Rowley
Review: Amit Langote, Ashutosh Bapat, Justin Pryzby, David G. Johnston
Discussion: https://postgr.es/m/CAKJS1f_0HwsxJG9m+nzU+CizxSdGtfe6iF_ykPYBiYft302DCw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's been a massive addition of partitioning code in PostgreSQL 11,
with little oversight on its placement, resulting in a
catalog/partition.c with poorly defined boundaries and responsibilities.
This commit tries to set a couple of distinct modules to separate things
a little bit. There are no code changes here, only code movement.
There are three new files:
src/backend/utils/cache/partcache.c
src/include/partitioning/partdefs.h
src/include/utils/partcache.h
The previous arrangement of #including catalog/partition.h almost
everywhere is no more.
Authors: Amit Langote and Álvaro Herrera
Discussion: https://postgr.es/m/98e8d509-790a-128c-be7f-e48a5b2d8d97@lab.ntt.co.jp
https://postgr.es/m/11aa0c50-316b-18bb-722d-c23814f39059@lab.ntt.co.jp
https://postgr.es/m/143ed9a4-6038-76d4-9a55-502035815e68@lab.ntt.co.jp
https://postgr.es/m/20180413193503.nynq7bnmgh6vs5vm@alvherre.pgsql
|
|
|
|
|
|
|
|
| |
This reverts the backend sides of commit 1fde38beaa0c3e66c340efc7cc0dc272d6254bb0.
I have, at least for now, left the pg_verify_checksums tool in place, as
this tool can be very valuable without the rest of the patch as well,
and since it's a read-only tool that only runs when the cluster is down
it should be a lot safer.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow the cluster to be optionally init'd with read access for the
group.
This means a relatively non-privileged user can perform a backup of the
cluster without requiring write privileges, which enhances security.
The mode of PGDATA is used to determine whether group permissions are
enabled for directory and file creates. This method was chosen as it's
simple and works well for the various utilities that write into PGDATA.
Changing the mode of PGDATA manually will not automatically change the
mode of all the files contained therein. If the user would like to
enable group access on an existing cluster then changing the mode of all
the existing files will be required. Note that pg_upgrade will
automatically change the mode of all migrated files if the new cluster
is init'd with the -g option.
Tests are included for the backend and all the utilities which operate
on the PG data directory to ensure that the correct mode is set based on
the data directory permissions.
Author: David Steele <david@pgmasters.net>
Reviewed-By: Michael Paquier, with discussion amongst many others.
Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it possible to turn checksums on in a live cluster, without
the previous need for dump/reload or logical replication (and to turn it
off).
Enabling checkusm starts a background process in the form of a
launcher/worker combination that goes through the entire database and
recalculates checksums on each and every page. Only when all pages have
been checksummed are they fully enabled in the cluster. Any failure of
the process will revert to checksums off and the process has to be
started.
This adds a new WAL record that indicates the state of checksums, so
the process works across replicated clusters.
Authors: Magnus Hagander and Daniel Gustafsson
Review: Tomas Vondra, Michael Banck, Heikki Linnakangas, Andrey Borodin
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Vacuum of index consists from two stages: multiple (zero of more) ambulkdelete
calls and one amvacuumcleanup call. When workload on particular table
is append-only, then autovacuum isn't intended to touch this table. However,
user may run vacuum manually in order to fill visibility map and get benefits
of index-only scans. Then ambulkdelete wouldn't be called for indexes
of such table (because no heap tuples were deleted), only amvacuumcleanup would
be called In this case, amvacuumcleanup would perform full index scan for
two objectives: put recyclable pages into free space map and update index
statistics.
This patch allows btvacuumclanup to skip full index scan when two conditions
are satisfied: no pages are going to be put into free space map and index
statistics isn't stalled. In order to check first condition, we store
oldest btpo_xact in the meta-page. When it's precedes RecentGlobalXmin, then
there are some recyclable pages. In order to check second condition we store
number of heap tuples observed during previous full index scan by cleanup.
If fraction of newly inserted tuples is less than
vacuum_cleanup_index_scale_factor, then statistics isn't considered to be
stalled. vacuum_cleanup_index_scale_factor can be defined as both reloption and GUC (default).
This patch bumps B-tree meta-page version. Upgrade of meta-page is performed
"on the fly": during VACUUM meta-page is rewritten with new version. No special
handling in pg_upgrade is required.
Author: Masahiko Sawada, Alexander Korotkov
Review by: Peter Geoghegan, Kyotaro Horiguchi, Alexander Korotkov, Yura Sokolov
Discussion: https://www.postgresql.org/message-id/flat/CAD21AoAX+d2oD_nrd9O2YkpzHaFr=uQeGr9s1rKC3O4ENc568g@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This provides infrastructure to allow JITed code to inline code
implemented in C. This e.g. can be postgres internal functions or
extension code.
This already speeds up long running queries, by allowing the LLVM
optimizer to optimize across function boundaries. The optimization
potential currently doesn't reach its full potential because LLVM
cannot optimize the FunctionCallInfoData argument fully away, because
it's allocated on the heap rather than the stack. Fixing that is
beyond what's realistic for v11.
To be able to do that, use CLANG to convert C code to LLVM bitcode,
and have LLVM build a summary for it. That bitcode can then be used to
to inline functions at runtime. For that the bitcode needs to be
installed. Postgres bitcode goes into $pkglibdir/bitcode/postgres,
extensions go into equivalent directories. PGXS has been modified so
that happens automatically if postgres has been compiled with LLVM
support.
Currently this isn't the fastest inline implementation, modules are
reloaded from disk during inlining. That's to work around an apparent
LLVM bug, triggering an apparently spurious error in LLVM assertion
enabled builds. Once that is resolved we can remove the superfluous
read from disk.
Docs will follow in a later commit containing docs for the whole JIT
feature.
Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Performing JIT compilation for deforming gains performance benefits
over unJITed deforming from compile-time knowledge of the tuple
descriptor. Fixed column widths, NOT NULLness, etc can be taken
advantage of.
Right now the JITed deforming is only used when deforming tuples as
part of expression evaluation (and obviously only if the descriptor is
known). It's likely to be beneficial in other cases, too.
By default tuple deforming is JITed whenever an expression is JIT
compiled. There's a separate boolean GUC controlling it, but that's
expected to be primarily useful for development and benchmarking.
Docs will follow in a later commit containing docs for the whole JIT
feature.
Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In addition to the interpretation of expressions (which back
evaluation of WHERE clauses, target list projection, aggregates
transition values etc) support compiling expressions to native code,
using the infrastructure added in earlier commits.
To avoid duplicating a lot of code, only support emitting code for
cases that are likely to be performance critical. For expression steps
that aren't deemed that, use the existing interpreter.
The generated code isn't great - some architectural changes are
required to address that. But this already yields a significant
speedup for some analytics queries, particularly with WHERE clauses
filtering a lot, or computing multiple aggregates.
Author: Andres Freund
Tested-By: Thomas Munro
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
Disable JITing for VALUES() nodes.
VALUES() nodes are only ever executed once. This is primarily helpful
for debugging, when forcing JITing even for cheap queries.
Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds simple cost based plan time decision about whether JIT
should be performed. jit_above_cost, jit_optimize_above_cost are
compared with the total cost of a plan, and if the cost is above them
JIT is performed / optimization is performed respectively.
For that PlannedStmt and EState have a jitFlags (es_jit_flags) field
that stores information about what JIT operations should be performed.
EState now also has a new es_jit field, which can store a
JitContext. When there are no errors the context is released in
standard_ExecutorEnd().
It is likely that the default values for jit_[optimize_]above_cost
will need to be adapted further, but in my test these values seem to
work reasonably.
Author: Andres Freund, with feedback by Peter Eisentraut
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
| |
This currently requires patches to the LLVM codebase to be
effective (submitted upstream), the GUCs are available without those
patches however.
Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit introduces the ability to actually generate code using
LLVM. In particular, this adds:
- Ability to emit code both in heavily optimized and largely
unoptimized fashion
- Batching facility to allow functions to be defined in small
increments, but optimized and emitted in executable form in larger
batches (for performance and memory efficiency)
- Type and function declaration synchronization between runtime
generated code and normal postgres code. This is critical to be able
to access struct fields etc.
- Developer oriented jit_dump_bitcode GUC, for inspecting / debugging
the generated code.
- per JitContext statistics of number of functions, time spent
generating code, optimizing, and emitting it. This will later be
employed for EXPLAIN support.
This commit doesn't yet contain any code actually generating
functions. That'll follow in later commits.
Documentation for GUCs added, and for JIT in general, will be added in
later commits.
Author: Andres Freund, with contributions by Pierre Ducroquet
Testing-By: Thomas Munro, Peter Eisentraut
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the partition keys of input relation are part of the GROUP BY
clause, all the rows belonging to a given group come from a single
partition. This allows aggregation/grouping over a partitioned
relation to be broken down * into aggregation/grouping on each
partition. This should be no worse, and often better, than the normal
approach.
If the GROUP BY clause does not contain all the partition keys, we can
still perform partial aggregation for each partition and then finalize
aggregation after appending the partial results. This is less certain
to be a win, but it's still useful.
Jeevan Chalke, Ashutosh Bapat, Robert Haas. The larger patch series
of which this patch is a part was also reviewed and tested by Antonin
Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin
Knizhnik, Pascal Legrand, and Rafia Sabih.
Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit introduces:
1) JIT provider abstraction, which allows JIT functionality to be
implemented in separate shared libraries. That's desirable because
it allows to install JIT support as a separate package, and because
it allows experimentation with different forms of JITing.
2) JITContexts which can be, using functions introduced in follow up
commits, used to emit JITed functions, and have them be cleaned up
on error.
3) The outline of a LLVM JIT provider, which will be fleshed out in
subsequent commits.
Documentation for GUCs added, and for JIT in general, will be added in
later commits.
Author: Andres Freund, with architectural input from Jeff Davis
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pending some solution for the problems noted in commit 742869946,
disallow dynamic creation of GUC_LIST_QUOTE variables.
If there are any extensions out there using this feature, they'd not
be happy for us to start enforcing this rule in minor releases, so
this is a HEAD-only change. The previous commit didn't make things
any worse than they already were for such cases.
Discussion: https://postgr.es/m/20180111064900.GA51030@paquier.xyz
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code that prints out the contents of setconfig or proconfig arrays in
SQL format needs to handle GUC_LIST_QUOTE variables differently from
other ones, because for those variables, flatten_set_variable_args()
already applied a layer of quoting. The value can therefore safely
be printed as-is, and indeed must be, or flatten_set_variable_args()
will muck it up completely on reload. For all other GUC variables,
it's necessary and sufficient to quote the value as a SQL literal.
We'd recognized the need for this long ago, but mis-analyzed the
need slightly, thinking that all GUC_LIST_INPUT variables needed
the special treatment. That's actually wrong, since a valid value
of a LIST variable might include characters that need quoting,
although no existing variables accept such values.
More to the point, we hadn't made any particular effort to keep the
various places that deal with this up-to-date with the set of variables
that actually need special treatment, meaning that we'd do the wrong
thing with, for example, temp_tablespaces values. This affects dumping
of SET clauses attached to functions, as well as ALTER DATABASE/ROLE SET
commands.
In ruleutils.c we can fix it reasonably honestly by exporting a guc.c
function that allows discovering the flags for a given GUC variable.
But pg_dump doesn't have easy access to that, so continue the old method
of having a hard-wired list of affected variable names. At least we can
fix it to have just one list not two, and update the list to match
current reality.
A remaining problem with this is that it only works for built-in
GUC variables. pg_dump's list obvious knows nothing of third-party
extensions, and even the "ask guc.c" method isn't bulletproof since
the relevant extension might not be loaded. There's no obvious
solution to that, so for now, we'll just have to discourage extension
authors from inventing custom GUCs that need GUC_LIST_QUOTE.
This has been busted for a long time, so back-patch to all supported
branches.
Michael Paquier and Tom Lane, reviewed by Kyotaro Horiguchi and
Pavel Stehule
Discussion: https://postgr.es/m/20180111064900.GA51030@paquier.xyz
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows specifying an external command for prompting for or
otherwise obtaining passphrases for SSL key files. This is useful
because in many cases there is no TTY easily available during service
startup.
Also add a setting ssl_passphrase_command_supports_reload, which allows
supporting SSL configuration reload even if SSL files need passphrases.
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
|
|
|
|
|
|
|
|
|
| |
We call this thing a "transaction block" everywhere except in a few
functions, where it is mysteriously called a "transaction chain". In
the SQL standard, a transaction chain is something different. So rename
these functions to match the common terminology.
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix the warnings created by the compiler warning options
-Wformat-overflow=2 -Wformat-truncation=2, supported since GCC 7. This
is a more aggressive variant of the fixes in
6275f5d28a1577563f53f2171689d4f890a46881, which GCC 7 warned about by
default.
The issues are all harmless, but some dubious coding patterns are
cleaned up.
One issue that is of external interest is that BGW_MAXLEN is increased
from 64 to 96. Apparently, the old value would cause the bgw_name of
logical replication workers to be truncated in some circumstances.
But this doesn't actually add those warning options. It appears that
the warnings depend a bit on compilation and optimization options, so it
would be annoying to have to keep up with that. This is more of a
once-in-a-while cleanup.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
|
|
|
|
|
|
|
|
|
| |
A Value node would store an integer as a long. This causes needless
portability risks, as long can be of varying sizes. Change it to use
int instead. All code using this was already careful to only store
32-bit values anyway.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were independently checking ReservedBackends < MaxConnections and
max_wal_senders < MaxConnections, but because walsenders aren't allowed
to use superuser-reserved connections, that's really the wrong thing.
Correct behavior is to insist on ReservedBackends + max_wal_senders being
less than MaxConnections. Fix the code and associated documentation.
This has been wrong for a long time, but since the situation probably
hardly ever arises in the field (especially pre-v10, when the default
for max_wal_senders was zero), no back-patch.
Discussion: https://postgr.es/m/28271.1520195491@sss.pgh.pa.us
|
|
|
|
|
| |
Author: Kyotaro HORIGUCHI
Discussion: https://postgr.es/m/20180307.163428.209919771.horiguchi.kyotaro@lab.ntt.co.jp
|