| Commit message (Collapse) | Author | Age |
... | |
|
|
|
|
|
| |
While we probably don't want to split up all error messages into
function and procedure variants, this one is a very prominent one, so
it's helpful to be more specific here.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When executing CALL in PL/pgSQL, we need to set a snapshot before
invoking the to-be-called procedure. Otherwise, the to-be-called
procedure might end up running without a snapshot. For LANGUAGE SQL
procedures, this would result in an assertion failure. (For most other
languages, this is usually not a problem, because those use SPI and SPI
sets snapshots in most cases.) Setting the snapshot restores the
behavior of how CALL worked when it was handled as a generic SQL
statement in PL/pgSQL (exec_stmt_execsql()).
This change revealed another problem: In SPI_commit(), we popped the
active snapshot before committing the transaction, to avoid "snapshot %p
still active" errors. However, there is no particular reason why only
at most one snapshot should be on the stack. So change this to pop all
active snapshots instead of only one.
|
|
|
|
|
| |
In order to be able to resolve polymorphic types, we need to set fn_expr
before invoking the procedure.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Starting with commit 9915de6c1cb2, replication slot drop uses a
condition variable sleep to wait until the current user of the slot goes
away. This is more user friendly than the previous behavior of erroring
out if the slot is in use, but it fails with a not-for-user-consumption
error message in single-user mode; plus, if you're using single-user
mode because you don't want to start the server in the regular mode
(say, disk is full and WAL won't recycle because of the slot), it's
inconvenient.
Fix by skipping the cond variable sleep in single-user mode, since
there can't be anybody to wait for anyway.
Reported-by: tushar <tushar.ahuja@enterprisedb.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/3b2f809f-326c-38dd-7a9e-897f957a4eb1@enterprisedb.com
|
|
|
|
|
|
| |
Coverity complains that there is no protection in the code (at least in
non-assertion-enabled builds) against speculative insertion failing to
follow the expected protocol. Add an elog(ERROR) for the case.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a standby crashes after promotion before having completed its first
post-recovery checkpoint, then the minimal recovery point which marks
the LSN position where the cluster is able to reach consistency may be
set to a position older than the first end-of-recovery checkpoint while
all the WAL available should be replayed. This leads to the instance
thinking that it contains inconsistent pages, causing a PANIC and a hard
instance crash even if all the WAL available has not been replayed for
certain sets of records replayed. When in crash recovery,
minRecoveryPoint is expected to always be set to InvalidXLogRecPtr,
which forces the recovery to replay all the WAL available, so this
commit makes sure that the local copy of minRecoveryPoint from the
control file is initialized properly and stays as it is while crash
recovery is performed. Once switching to archive recovery or if crash
recovery finishes, then the local copy minRecoveryPoint can be safely
updated.
Pavan Deolasee has reported and diagnosed the failure in the first
place, and the base fix idea to rely on the local copy of
minRecoveryPoint comes from Kyotaro Horiguchi, which has been expanded
into a full-fledged patch by me. The test included in this commit has
been written by Álvaro Herrera and Pavan Deolasee, which I have modified
to make it faster and more reliable with sleep phases.
Backpatch down to all supported versions where the bug appears, aka 9.3
which is where the end-of-recovery checkpoint is not run by the startup
process anymore. The test gets easily supported down to 10, still it
has been tested on all branches.
Reported-by: Pavan Deolasee
Diagnosed-by: Pavan Deolasee
Reviewed-by: Pavan Deolasee, Kyotaro Horiguchi
Author: Michael Paquier, Kyotaro Horiguchi, Pavan Deolasee, Álvaro
Herrera
Discussion: https://postgr.es/m/CABOikdPOewjNL=05K5CbNMxnNtXnQjhTx2F--4p4ruorCjukbA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
The query lifetime expression context created in
hypothetical_dense_rank_final() was buggily allocated in the calling
memory context. I (Andres) broke that in bf6c614a2f2.
Reported-By: Rajkumar Raghuwanshi
Author: Amit Langote
Discussion: https://postgr.es/m/CAKcux6kmzWmur5HhA_aU6gYVFu0RLQdgJJ+aC9SLdcOvBSrpfA@mail.gmail.com
Backpatch: 11-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When deleting pages the nbtree code has to walk through siblings of a
tree node. When those sibling links are corrupted that can lead to
endless loops - which are currently not interruptible. This is
especially problematic if autovacuum is repeatedly blocked on such
indexes, as it can be hard to get out of that situation without
resorting to single user mode.
Thus add interrupt checks to appropriate places in such
loops. Unfortunately in one of the cases it's it's not easy to do so.
Between 9.3 and 9.4 the page deletion (and page split) code changed
significantly. Before it was significantly less robust against
interruptions. Therefore don't backpatch to 9.3.
Author: Andres Freund
Discussion: https://postgr.es/m/20180627191629.wkunw2qbibnvlz53@alap3.anarazel.de
Backpatch: 9.4-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When multiple relations are deleted at the same transaction,
the files of those relations are deleted by one call to smgrdounlinkall(),
which leads to scan whole shared_buffers only one time. OTOH,
previously, during recovery, smgrdounlink() (not smgrdounlinkall()) was
called for each file to delete, which led to scan shared_buffers
multiple times. Obviously this could cause to increase the WAL replay
time very much especially when shared_buffers was huge.
To alleviate this situation, this commit changes the recovery so that
it also calls smgrdounlinkall() only one time to delete multiple
relation files.
This is just fix for oversight of commit 279628a0a7, not new feature.
So, per discussion on pgsql-hackers, we concluded to backpatch this
to all supported versions.
Author: Fujii Masao
Reviewed-by: Michael Paquier, Andres Freund, Thomas Munro, Kyotaro Horiguchi, Takayuki Tsunakawa
Discussion: https://postgr.es/m/CAHGQGwHVQkdfDqtvGVkty+19cQakAydXn1etGND3X0PHbZ3+6w@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since recent commit 1c7c317c, temporary relations cannot be mixed with
permanent relations within the same partition tree, and the same counts
for temporary relations created by other sessions, which the planner
simply discarded. Instead be paranoid and issue an error, as those
should be blocked at definition time, at least for now.
At the same time, a test case is added to stress what has been moved
when expand_partitioned_rtentry gets called recursively but bumps on a
partitioned relation with no partitions which should be handled the same
way as the non-inheritance case. This code may be reworked in a close
future, and covering this code path will limit surprises.
Reported-by: David Rowley
Author: David Rowley
Reviewed-by: Amit Langote, Robert Haas, Michael Paquier
Discussion: https://postgr.es/m/CAKJS1f_HyV1txn_4XSdH5EOhBMYaCwsXyAj6bHXk9gOu4JKsbw@mail.gmail.com
|
| |
|
|
|
|
|
|
|
|
|
| |
The skip_build flag was not being passed correctly when recursing to
indexes on partitions, leading to attempts to rebuild indexes when they
were not yet ready to be rebuilt.
Reported-by: Rajkumar Raghuwanshi
Discussion: https://postgr.es/m/CAKcux6mxNCGsgATwf5CGMF8g4WSupCXicCVMeKUTuWbyxHOMsQ@mail.gmail.com
|
|
|
|
|
|
|
|
| |
This includes code comments and documentation. No backpatch as this is
cosmetic even if there are documentation changes which are user-facing.
Author: Daniel Gustafsson
Discussion: https://postgr.es/m/BB89928E-2BC7-489E-A5E4-6D204B3954CF@yesql.se
|
|
|
|
|
|
|
|
|
| |
A slot can not be stored in a tuple but it's vice versa.
Reported-by: Ashutosh Bapat
Author: Ashutosh Bapat
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAFjFpRcHhNhXdegyJv3KKDWrwO1_NB_KYZM_ZSDeMOZaL1A5jQ@mail.gmail.com
|
|
|
|
|
|
|
|
| |
The duplicated return clearly doesn't make sense / isn't
reachable. Likely introduced by me (Andres), while revising the code.
Author: Rushabh Lathia
Discussion: https://postgr.es/m/CAGPqQf2raxWOcbuTP36M1rEF3=Rfo7oD29K3psdyHMeE5swBRg@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
| |
Changed the name of few structure members for the sake of clarity and
removed spurious whitespace.
Reported-by: Amit Kapila
Author: Amit Kapila, based on suggestion by Andrew Dunstan
Reviewed-by: Alvaro Herrera
Discussion: https://postgr.es/m/CAA4eK1K2znsFpC+NQ9A4vxT4uDxADN4RmvHX0L6Y=aHVo9gB4Q@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two closely related bugs are fixed. First, xmin of logical slots was
advanced too early. During xl_running_xacts processing, xmin of the
slot was set to the oldest running xid in the record, but that's wrong:
actually, snapshots which will be used for not-yet-replayed transactions
might consider older txns as running too, so we need to keep xmin back
for them. The problem wasn't noticed earlier because DDL which allows
to delete tuple (set xmax) while some another not-yet-committed
transaction looks at it is pretty rare, if not unique: e.g. all forms of
ALTER TABLE which change schema acquire ACCESS EXCLUSIVE lock
conflicting with any inserts. The included test case (test_decoding's
oldest_xmin) uses ALTER of a composite type, which doesn't have such
interlocking.
To deal with this, we must be able to quickly retrieve oldest xmin
(oldest running xid among all assigned snapshots) from ReorderBuffer. To
fix, add another list of ReorderBufferTXNs to the reorderbuffer, where
transactions are sorted by base-snapshot-LSN. This is slightly
different from the existing (sorted by first-LSN) list, because a
transaction can have an earlier LSN but a later Xmin, if its first
record does not obtain an xmin (eg. xl_xact_assignment). Note this new
list doesn't fully replace the existing txn list: we still need that one
to prevent WAL recycling.
The second issue concerns SnapBuilder snapshots and subtransactions.
SnapBuildDistributeNewCatalogSnapshot never assigned a snapshot to a
transaction that is known to be a subtxn, which is good in the common
case that the top-level transaction already has one (no point in doing
so), but a bug otherwise. To fix, arrange to transfer the snapshot from
the subtxn to its top-level txn as soon as the kinship gets known.
test_decoding's snapshot_transfer verifies this.
Also, fix a minor memory leak: refcount of toplevel's old base snapshot
was not decremented when the snapshot is transferred from child.
Liberally sprinkle code comments, and rewrite a few existing ones. This
part is my (Álvaro's) contribution to this commit, as I had to write all
those comments in order to understand the existing code and Arseny's
patch.
Reported-by: Arseny Sher <a.sher@postgrespro.ru>
Diagnosed-by: Arseny Sher <a.sher@postgrespro.ru>
Co-authored-by: Arseny Sher <a.sher@postgrespro.ru>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/87lgdyz1wj.fsf@ars-thinkpad
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
6ca33a88 sets upper limit for vacuum_cleanup_index_scale_factor to
DBL_MAX. DBL_MAX appears to be platform-dependent. That causes
many buildfarm animals to fail, because we check boundaries of
vacuum_cleanup_index_scale_factor in regression tests.
This commit changes upper limit from DBL_MAX to just "large enough"
limit, which was arbitrary selected as 1e10.
Author: Alexander Korotkov
Reported-by: Tom Lane, Darafei Praliaskouski
Discussion: https://postgr.es/m/CAPpHfdvewmr4PcpRjrkstoNn1n2_6dL-iHRB21CCfZ0efZdBTg%40mail.gmail.com
Discussion: https://postgr.es/m/CAC8Q8tLYFOpKNaPS_E7V8KtPdE%3D_TnAn16t%3DA3LuL%3DXjfOO-BQ%40mail.gmail.com
|
|
|
|
|
|
|
|
| |
randomAccess parallel tuplesorts are disallowed because the leader would
try to write to its own leader tape, not because the leader would try to
write to a worker tape directly.
Cleanup from commit 9da0cc35284.
|
|
|
|
|
|
|
|
|
|
|
| |
Building a new nbtree index through incremental insertions would always
be slower than our actual approach of sorting using tuplesort,
assembling leaf pages from tuplesort output, and writing and WAL-logging
whole pages. Remove a comment block from the Berkeley days claiming
that incremental insertions might be slightly faster with presorted
input.
Discussion: https://postgr.es/m/CAH2-WzmKs4mLAoFgJ3yHMRYc849efc=dw+pNRb3NEog2oJoCNw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
Concurrently with partitioned index development (commit 8b08f7d4820f),
the code to handle failure to rename indexes was refactored (commit
8b9e9644dc6a). Turns out that that particular case was untested, which
naturally led it to be broken. Add tests and the missing code line.
Co-authored-by: David Rowley <dgrowley@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>
Discussion: https://postgr.es/m/CAKcux6mfYMS3OX0ywjOiWiGSEKhJf-1zdeTceHFbd0mScUzU5A@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
find_appinfos_by_relids had quite a large overhead when the number of
items in the append_rel_list was high, as it had to trawl through the
append_rel_list looking for AppendRelInfos belonging to the given
childrelids. Since there can only be a single AppendRelInfo for each
child rel, it seems much better to store an array in PlannerInfo which
indexes these by child relid, making the function O(1) rather than O(N).
This function was only called once inside the planner, so just replace
that call with a lookup to the new array. find_childrel_appendrelinfo
is now unused and thus removed.
This fixes a planner performance regression new to v11 reported by
Thomas Reiss.
Author: David Rowley
Reported-by: Thomas Reiss
Reviewed-by: Ashutosh Bapat
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/94dd7a4b-5e50-0712-911d-2278e055c622@dalibo.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upper limits for vacuum_cleanup_index_scale_factor GUC and reloption
were initially set to 100.0 in 857f9c36. However, after further
discussion, it appears that some users like to disable B-tree cleanup
index scan completely (assuming there are no deleted pages).
vacuum_cleanup_index_scale_factor is used barely to protect against
stalled index statistics. And after detailed consideration it appears
that risk of stalled index statistics is low. And it would be nice to
allow advanced users setting higher values of
vacuum_cleanup_index_scale_factor. So, set upper limit for these
GUC and reloption to DBL_MAX.
Author: Alexander Korotkov
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAC8Q8tJCb%3DgxhzcV7T6ctx7PY-Ux1oA-AsTJc6cAVNsQiYcCzA%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Standbys frequently need to release all locks held by a given xid.
Instead of searching one big list linearly, let's create one list
per xid and put them in a hash table, so we can find what we need
in O(1) time.
Earlier analysis and a prototype were done by David Rowley, though
this isn't his patch.
Back-patch all the way.
Author: Thomas Munro
Diagnosed-by: David Rowley, Andres Freund
Reviewed-by: Andres Freund, Tom Lane, Robert Haas
Discussion: https://postgr.es/m/CAEepm%3D1mL0KiQ2KJ4yuPpLGX94a4Ns_W6TL4EGRouxWibu56pA%40mail.gmail.com
Discussion: https://postgr.es/m/CAKJS1f9vJ841HY%3DwonnLVbfkTWGYWdPN72VMxnArcGCjF3SywA%40mail.gmail.com
|
|
|
|
|
|
|
| |
Commit 9fab40ad32ef removed some pre-allocating logic in
reorderbuffer.c, but left outdated comments in place. Repair.
Author: Álvaro Herrera
|
|
|
|
|
| |
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 884f33d735870f94357820800840af3e93ff4628
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
System calls mixed up in error code paths are causing two issues which
several code paths have not correctly handled:
1) For write() calls, sometimes the system may return less bytes than
what has been written without errno being set. Some paths were careful
enough to consider that case, and assumed that errno should be set to
ENOSPC, other calls missed that.
2) errno generated by a system call is overwritten by other system calls
which may succeed once an error code path is taken, causing what is
reported to the user to be incorrect.
This patch uses the brute-force approach of correcting all those code
paths. Some refactoring could happen in the future, but this is let as
future work, which is not targeted for back-branches anyway.
Author: Michael Paquier
Reviewed-by: Ashutosh Sharma
Discussion: https://postgr.es/m/20180622061535.GD5215@paquier.xyz
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two out of three code paths were mapping column numbers correctly if a
partition had different column numbers than parent table, but the most
commonly used one (recursing in CREATE INDEX to a new index on a
partition) failed to map attribute numbers in expressions. Oddly
enough, attnums in WHERE clauses are already handled correctly
everywhere.
Reported-by: Amit Langote
Author: Amit Langote
Discussion: https://postgr.es/m/dce1fda4-e0f0-94c9-6abb-f5956a98c057@lab.ntt.co.jp
Reviewed-by: Álvaro Herrera
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, if some or all partitions had no partially aggregated path,
we would still try to generate a partially aggregated path for the
parent, leading to assertion failures or wrong answers.
Report by Rajkumar Raghuwanshi. Patch by Jeevan Chalke, reviewed
by Ashutosh Bapat. A few changes by me.
Discussion: http://postgr.es/m/CAKcux6=q4+Mw8gOOX16ef6ZMFp9Cve7KWFstUsrDa4GiFaXGUQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 16828d5c02 neglected to do this, so upgraded databases would
silently get null instead of the specified default in rows without the
attribute defined.
A new binary upgrade function is provided to perform this and pg_dump is
adjusted to output a call to the function if required in binary upgrade
mode.
Also included is code to drop missing attribute values for dropped
columns. That way if the type is later dropped the missing value won't
have a dangling reference to the type.
Finally the regression tests are adjusted to ensure that there is a row
with a missing value so that this code is exercised in upgrade testing.
Catalog version unfortunately bumped.
Regression test changes from Tom Lane.
Remainder from me, reviewed by Tom Lane, Andres Freund, Alvaro Herrera
Discussion: https://postgr.es/m/19987.1529420110@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vacuum_cleanup_index_scale_factor was located in autovacuum group of
GUCs. However, it affects not only autovacuum, but also manually run
VACUUM. It appears that "client connection defaults" group of GUCs
is more appropriate for vacuum_cleanup_index_scale_factor, because
vacuum_*_age options are already located there.
Also, vacuum_cleanup_index_scale_factor was missed in
postgresql.conf.sample. So, add it there with appropriate comment.
Author: Masahiko Sawada with minor editorization by me
Discussion: https://postgr.es/m/CAD21AoArsoXMLKudXSKN679FRzs6oubEchM53bHwn8Tp%3D2boNg%40mail.gmail.com
|
|
|
|
| |
Author: Shao Bret
|
|
|
|
|
|
|
|
|
|
|
|
| |
The create_append_path code didn't consider that list_concat will
modify it's first argument leading to inconsistent traversal of
resulting list. In practice, it won't lead to any user-visible bug
but changing it for making the code behave consistently.
Reported-by: Tom Lane
Author: Tom Lane
Reviewed-by: Amit Khandekar and Amit Kapila
Discussion: https://postgr.es/m/32365.1528994120@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A typo in numeric_poly_combine caused bogus results for queries using
it, but of course would only manifest if parallel aggregation is
performed. Reported by Rajkumar Raghuwanshi.
David Rowley did the diagnosis and the fix; I editorialized rather
heavily on his regression test additions.
Back-patch to v10 where the breakage was introduced (by 9cca11c91).
Discussion: https://postgr.es/m/CAKcux6nU4E2x8nkSBpLOT2DPvQ5LviJ3SGyAN6Sz7qDH4G4+Pw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the SQL standard, the context of XMLTABLE's XPath
row_expression is the document node of the XML input document, not the
root node. This becomes visible when a relative path rather than
absolute is used as row expression. Absolute paths is what was used in
original tests and docs (and the most common form used in examples
throughout the interwebs), which explains why this wasn't noticed
before.
Other functions such as xpath() and xpath_exists() also have this
problem. While not specified by the SQL standard, it would be pretty
odd to leave those functions to behave differently than XMLTABLE, so
change them too. However, this is a backwards-incompatible change.
No backpatch, out of fear of breaking code depending on the original
broken behavior.
Author: Markus Winand
Reported-By: Markus Winand
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/0684A598-002C-42A2-AE12-F024A324EAE4@winand.at
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
split_pathtarget_at_srfs() neglected to worry about sortgroupref labels
in the intermediate PathTargets it constructs. I think we'd supposed
that their labeling didn't matter, but it does at least for the case that
GroupAggregate/GatherMerge nodes appear immediately under the ProjectSet
step(s). This results in "ERROR: ORDER/GROUP BY expression not found in
targetlist" during create_plan(), as reported by Rajkumar Raghuwanshi.
To fix, make this logic track the sortgroupref labeling of expressions,
not just their contents. This also restores the pre-v10 behavior that
separate GROUP BY expressions will be kept distinct even if they are
textually equal().
Discussion: https://postgr.es/m/CAKcux6=1_Ye9kx8YLBPmJs_xE72PPc6vNi5q2AOHowMaCWjJ2w@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
Column expressions that match TEXT or CDATA nodes must return the
contents of the nodes themselves, not the content of non-existing
children (i.e. the empty string).
Author: Markus Winand
Reported-by: Markus Winand
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/0684A598-002C-42A2-AE12-F024A324EAE4@winand.at
|
|
|
|
|
|
|
|
|
| |
We were using 'partition rel' in a few places, which is quite confusing.
Author: Amit Langote
Reviewed-by: David Rowley
Reviewed-by: Michaël Paquier
Discussion: https://postgr.es/m/fd256561-31a2-4b7e-cd84-d8241e7ebc3f@lab.ntt.co.jp
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit ab72716778 allowed Parallel Append paths to be generated for a
relation that is not parallel safe. Prevent that from happening.
Initial analysis by Tom Lane.
Reported-by: Rajkumar Raghuwanshi
Author: Amit Kapila and Rajkumar Raghuwanshi
Reviewed-by: Amit Khandekar and Robert Haas
Discussion:https://postgr.es/m/CAKcux6=tPJ6nJ08r__nU_pmLQiC0xY15Fn0HvG1Cprsjdd9s_Q@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since their introduction, partition trees have been a bit lossy
regarding temporary relations. Inheritance trees respect the following
patterns:
1) a child relation can be temporary if the parent is permanent.
2) a child relation can be temporary if the parent is temporary.
3) a child relation cannot be permanent if the parent is temporary.
4) The use of temporary relations also imply that when both parent and
child need to be from the same sessions.
Partitions share many similar patterns with inheritance, however the
handling of the partition bounds make the situation a bit tricky for
case 1) as the partition code bases a lot of its lookup code upon
PartitionDesc which does not really look after relpersistence. This
causes for example a temporary partition created by session A to be
visible by another session B, preventing this session B to create an
extra partition which overlaps with the temporary one created by A with
a non-intuitive error message. There could be use-cases where mixing
permanent partitioned tables with temporary partitions make sense, but
that would be a new feature. Partitions respect 2), 3) and 4) already.
It is a bit depressing to see those error checks happening in
MergeAttributes() whose purpose is different, but that's left as future
refactoring work.
Back-patch down to 10, which is where partitioning has been introduced,
except that default partitions do not apply there. Documentation also
includes limitations related to the use of temporary tables with
partition trees.
Reported-by: David Rowley
Author: Amit Langote, Michael Paquier
Reviewed-by: Ashutosh Bapat, Amit Langote, Michael Paquier
Discussion: https://postgr.es/m/CAKJS1f94Ojk0og9GMkRHGt8wHTW=ijq5KzJKuoBoqWLwSVwGmw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ProcedureCreate formerly threw an error if the function to be created
has one argument of composite type and the function name matches some
column of the composite type. This was a (very non-bulletproof) defense
against creating situations where f(x) and x.f are ambiguous. But we
don't really need such a defense in the wake of commit b97a3465d, which
allows us to deal with such situations fairly cleanly. This behavior
also created a dump-and-reload hazard, since a function might be
rejected if a conflicting column name had been added to the input
composite type later. Hence, let's just drop the check.
Discussion: https://postgr.es/m/CAOW5sYa3Wp7KozCuzjOdw6PiOYPi6D=VvRybtH2S=2C0SVmRmA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Postgres has traditionally considered the syntactic forms f(x) and x.f
to be equivalent, allowing tricks such as writing a function and then
using it as though it were a computed-on-demand column. However, our
behavior when both interpretations are feasible left something to be
desired: we always chose the column interpretation. This could lead
to very surprising results, as in a recent bug report from Neil Conway.
It also created a dump-and-reload hazard, since what was a function
call in a dumped view could get interpreted as a column reference
at reload, if a matching column name had been added to the underlying
table since the view was created.
What seems better, in ambiguous situations, is to prefer the choice
matching the syntactic form of the reference. This seems much less
astonishing in general, and it fixes the dump/reload hazard.
Although this could be called a bug fix, there have been few complaints
and there's some small risk of breaking applications that depend on the
old behavior, so no back-patch. It does seem reasonable to slip it
into v11, though.
Discussion: https://postgr.es/m/CAOW5sYa3Wp7KozCuzjOdw6PiOYPi6D=VvRybtH2S=2C0SVmRmA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a standby's WAL receiver stops reading WAL from a WAL stream, it
writes data to the current WAL segment without having priorily zero'ed
the page currently written to, which can cause the WAL reader to read
junk data from a past recycled segment and then it would try to get a
record from it. While sanity checks in place provide most of the
protection needed, in some rare circumstances, with chances increasing
when a record header crosses a page boundary, then the startup process
could fail violently on an allocation failure, as follows:
FATAL: invalid memory alloc request size XXX
This is confusing for the user and also unhelpful as this requires in
the worst case a manual restart of the instance, impacting potentially
the availability of the cluster, and this also makes WAL data look like
it is in a corrupted state.
The chances of seeing failures are higher if the connection between the
standby and its root node is unstable, causing WAL pages to be written
in the middle. A couple of approaches have been discussed, like
zero-ing new WAL pages within the WAL receiver itself but this has the
disadvantage of impacting performance of any existing instances as this
breaks the sequential writes done by the WAL receiver. This commit
deals with the problem with a more simple approach, which has no
performance impact without reducing the detection of the problem: if a
record is found with a length higher than 1GB for backends, then do not
try any allocation and report a soft failure which will force the
standby to retry reading WAL. It could be possible that the allocation
call passes and that an unnecessary amount of memory is allocated,
however follow-up checks on records would just fail, making this
allocation short-lived anyway.
This patch owes a great deal to Tsunakawa Takayuki for reporting the
failure first, and then discussing a couple of potential approaches to
the problem.
Backpatch down to 9.5, which is where palloc_extended has been
introduced.
Reported-by: Tsunakawa Takayuki
Reviewed-by: Tsunakawa Takayuki
Author: Michael Paquier
Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8B57AD@G01JPEXMBYT05
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up four places that result in compiler warnings when using recent
gcc with this warning class enabled (as seen on buildfarm members
calliphoridae, skink, and others). In all these places, this is purely
cosmetic, because the shift distance could not be large enough to risk
a change of sign, so there's no chance of implementation-dependent
behavior. Still, it's easy enough to avoid the warning by casting the
shifted value to unsigned, so let's do that.
Patch HEAD only, this isn't worth a back-patch.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Recent additions to ParseFuncOrColumn to make it reject non-procedure
functions in CALL were neither adequate nor documented. Reorganize
the code to ensure uniform results for all the cases that should be
rejected. Also, use ERRCODE_WRONG_OBJECT_TYPE for this case as well
as the converse case of a procedure in a non-CALL context. The
original coding used ERRCODE_UNDEFINED_FUNCTION which seems wrong,
and is certainly inconsistent with the adjacent wrong-kind-of-routine
errors.
This reorganization also causes the checks for aggregate decoration with
a non-aggregate function to be made in the FUNCDETAIL_COERCION case;
that they were not is a long-standing oversight.
Discussion: https://postgr.es/m/14497.1529089235@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issues relate only to subtransactions that hold AccessExclusiveLocks
when replayed on standby.
Prior to PG10, aborting subtransactions that held an
AccessExclusiveLock failed to release the lock until top level commit or
abort. 49bff5300d527 fixed that.
However, 49bff5300d527 also introduced a similar bug where subtransaction
commit would fail to release an AccessExclusiveLock, leaving the lock to
be removed sometimes early and sometimes late. This commit fixes
that bug also. Backpatch to PG10 needed.
Tested by observation. Note need for multi-node isolationtester to improve
test coverage for this and other HS cases.
Reported-by: Simon Riggs
Author: Simon Riggs
|
|
|
|
|
|
|
|
|
| |
Also this commit unifies some duplicated code in makeBufFile() and
BufFileOpenShared() into new function makeBufFileCommon().
Author: Antonin Houska
Reviewed-By: Thomas Munro, Tatsuo Ishii
Discussion: https://postgr.es/m/16139.1529049566%40localhost
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 1eb6d6527aae introduced zeroed alignment bytes in the GID field
of commit/abort WAL records. Fixup commit cf5a1890592b later changed
that representation into a regular cstring with a single terminating
zero byte, but it also introduced an off-by-one mistake. Fix that.
Author: Nikhil Sontakke
Reported-by: Nikhil Sontakke
Discussion: https://postgr.es/m/CAMGcDxey6dG1DP34_tJMoWPcp5sPJUAL4K5CayUUXLQSx2GQpA@mail.gmail.com
|
|
|
|
|
|
|
|
| |
Memory is allocated twice for "file" and "files" variables in
BufFileOpenShared().
Author: Antonin Houska
Discussion: https://postgr.es/m/11329.1529045692%40localhost
|
|
|
|
|
|
|
|
|
|
|
| |
They already fail anyway, but prior to this patch they raise an ugly
error message about a lock that cannot be acquired. This just improves
the message.
Author: Masahiko Sawada
Reported-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoBZau4g4_NUf3BKNd=CdYK+xaPdtJCzvOC1TxGdTiJx_Q@mail.gmail.com
Reviewed-by: Kuntal Ghosh, Alexander Korotkov, Simon Riggs, Michaël Paquier, Álvaro Herrera
|