aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
* Fix more wrong paths in header commentsAlexander Korotkov2018-07-11
| | | | | | | It appears that there are more files, whose header comment paths are wrong. So, fix those paths. No backpatching per proposal of Tom Lane. Discussion: https://postgr.es/m/CAPpHfdsJyYbOj59MOQL%2B4XxdcomLSLfLqBtAvwR%2BpsCqj3ELdQ%40mail.gmail.com
* Rethink how to get float.h in old Windows API for isnan/isinfAlvaro Herrera2018-07-11
| | | | | | | | | | | | | | | | We include <float.h> in every place that needs isnan(), because MSVC used to require it. However, since MSVC 2013 that's no longer necessary (cf. commit cec8394b5ccd), so we can retire the inclusion to a version-specific stanza in win32_port.h, where it doesn't need to pollute random .c files. The header is of course still needed in a few places for other reasons. I (Álvaro) removed float.h from a few more files than in Emre's original patch. This doesn't break the build in my system, but we'll see what the buildfarm has to say about it all. Author: Emre Hasegeli Discussion: https://postgr.es/m/CAE2gYzyc0+5uG+Cd9-BSL7NKC8LSHLNg1Aq2=8ubjnUwut4_iw@mail.gmail.com
* Fix wrong file path in header commentAlexander Korotkov2018-07-11
| | | | | | Header comment of shm_mq.c was mistakenly specifying path to shm_mq.h. It was introduced in ec9037df. So, theoretically it could be backpatched to 9.4, but it doesn't seem to worth it.
* Use signals for postmaster death on FreeBSD.Thomas Munro2018-07-11
| | | | | | | | | Use FreeBSD 11.2's new support for detecting parent process death to make PostmasterIsAlive() very cheap, as was done for Linux in an earlier commit. Author: Thomas Munro Discussion: https://postgr.es/m/7261eb39-0369-f2f4-1bb5-62f3b6083b5e@iki.fi
* Use signals for postmaster death on Linux.Thomas Munro2018-07-11
| | | | | | | | | | | | Linux provides a way to ask for a signal when your parent process dies. Use that to make PostmasterIsAlive() very cheap. Based on a suggestion from Andres Freund. Author: Thomas Munro, Heikki Linnakangas Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/7261eb39-0369-f2f4-1bb5-62f3b6083b5e%40iki.fi Discussion: https://postgr.es/m/20180411002643.6buofht4ranhei7k%40alap3.anarazel.de
* Block replication slot advance for these not yet reserving WALMichael Paquier2018-07-11
| | | | | | | | | | | | | | | | Such replication slots are physical slots freshly created without WAL being reserved, which is the default behavior, which have not been used yet as WAL consumption resources to retain WAL. This prevents advancing a slot to a position older than any WAL available, which could falsify calculations for WAL segment recycling. This also cleans up a bit the code, as ReplicationSlotRelease() would be called on ERROR, and improves error messages. Reported-by: Kyotaro Horiguchi Author: Michael Paquier Reviewed-by: Andres Freund, Álvaro Herrera, Kyotaro Horiguchi Discussion: https://postgr.es/m/20180626071305.GH31353@paquier.xyz
* Better handle pseudotypes as partition keysAlvaro Herrera2018-07-10
| | | | | | | | | | | | | | | | | | | | | | We fail to handle polymorphic types properly when they are used as partition keys: we were unnecessarily adding a RelabelType node on top, which confuses code examining the nodes. In particular, this makes predtest.c-based partition pruning not to work, and ruleutils.c to emit expressions that are uglier than needed. Fix it by not adding RelabelType when not needed. In master/11 the new pruning code is separate so it doesn't suffer from this problem, since we already fixed it (in essentially the same way) in e5dcbb88a15d, which also added a few tests; back-patch those tests to pg10 also. But since UPDATE/DELETE still uses predtest.c in pg11, this change improves partitioning for those cases too. Add tests for this. The ruleutils.c behavior change is relevant in pg11/master too. Co-authored-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/54745d13-7ed4-54ac-97d8-ea1eec95ae25@lab.ntt.co.jp
* Remove dynamic_shared_memory_type=nonePeter Eisentraut2018-07-10
| | | | | | | | | PostgreSQL nowadays offers some kind of dynamic shared memory feature on all supported platforms. Having the choice of "none" prevents us from relying on DSM in core features. So this patch removes the choice of "none". Author: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
* Avoid emitting a bogus WAL record when recycling an all-zero btree page.Tom Lane2018-07-09
| | | | | | | | | | | | | | | | | | | | Commit fafa374f2 caused _bt_getbuf() to possibly emit a WAL record for a page that it was about to recycle. However, it failed to distinguish all-zero pages from dead pages, which is important because only the latter have valid btpo.xact values, or indeed any special space at all. Recycling an all-zero page with XLogStandbyInfoActive() enabled therefore led to an Assert failure, or to emission of a WAL record containing a bogus cutoff XID, which might lead to unnecessary query cancellations on hot standby servers. Per reports from Antonin Houska and 自己. Amit Kapila was first to propose this fix, and Robert Haas, myself, and Kyotaro Horiguchi reviewed it at various times. This is an old bug, so back-patch to all supported branches. Discussion: https://postgr.es/m/2628.1474272158@localhost Discussion: https://postgr.es/m/48875502.f4a0.1635f0c27b0.Coremail.zoulx1982@163.com
* Flip argument order in XLogSegNoOffsetToRecPtrAlvaro Herrera2018-07-09
| | | | | | | | | Commit fc49e24fa69a added an input argument after the existing output argument. Flip those. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20180708182345.imdgovmkffgtihhk@alvherre.pgsql
* Add UtilityReturnsTuples() support for CALLPeter Eisentraut2018-07-09
| | | | This ensures that prepared statements for CALL can return tuples.
* Rework order of end-of-recovery actions to delay timeline history writeMichael Paquier2018-07-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A critical failure in some of the end-of-recovery actions before the end-of-recovery record is written can cause PostgreSQL to react inconsistently with the rest of the cluster in the event of a crash before the final record is written. Two such failures are for example an error while processing a two-phase state files or when operating on recovery.conf. With this commit, the failures are still considered FATAL, but the write of the timeline history file is delayed as much as possible so as the window between the moment the file is written and the end-of-recovery record is generated gets minimized. This way, in the event of a crash or a failure, the new timeline decided at promotion will not seem taken by other nodes in the cluster. It is not really possible to reduce to zero this window, hence one could still see failures if a crash happens between the history file write and the end-of-recovery record, so any future code should be careful when adding new end-of-recovery actions. The original report from Magnus Hagander mentioned a renamed recovery.conf as original end-of-recovery failure which caused a timeline to be seen as taken but the subsequent processing on the now-missing recovery.conf cause the startup process to issue stop on FATAL, which at follow-up startup made the system inconsistent because of on-disk changes which already happened. Processing of two-phase state files still needs some work as corrupted entries are simply ignored now. This is left as a future item and this commit fixes the original complain. Reported-by: Magnus Hagander Author: Heikki Linnakangas Reviewed-by: Alexander Korotkov, Michael Paquier, David Steele Discussion: https://postgr.es/m/CABUevEz09XY2EevA2dLjPCY-C5UO4Hq=XxmXLmF6ipNFecbShQ@mail.gmail.com
* Correct obsolete unique index insertion comment.Peter Geoghegan2018-07-08
| | | | | | Commit bc292937ae6 failed to update a comment about unique index checking. _bt_insertonpg() is no longer responsible for finding an insertion location while preventing conflicting insertions.
* Use access() to check file existence in GetNewRelFileNode()Michael Paquier2018-07-08
| | | | | | | | | | Previous code used BasicOpenFile() and close() just to check for a file collision, while there is no need to hold open a file descriptor but that's an overkill here. Author: Paul Guo Reviewed-by: Peter Eisentraut, Michael Paquier Discussion: https://postgr.es/m/CABQrizcUtiHaquxK=d4etBX8GF9kbZB50Nt1gO9_aN-e9SptyQ@mail.gmail.com
* Add separate error message for procedure does not existPeter Eisentraut2018-07-07
| | | | | | While we probably don't want to split up all error messages into function and procedure variants, this one is a very prominent one, so it's helpful to be more specific here.
* Fix assert in nested SQL procedure callPeter Eisentraut2018-07-06
| | | | | | | | | | | | | | | | | When executing CALL in PL/pgSQL, we need to set a snapshot before invoking the to-be-called procedure. Otherwise, the to-be-called procedure might end up running without a snapshot. For LANGUAGE SQL procedures, this would result in an assertion failure. (For most other languages, this is usually not a problem, because those use SPI and SPI sets snapshots in most cases.) Setting the snapshot restores the behavior of how CALL worked when it was handled as a generic SQL statement in PL/pgSQL (exec_stmt_execsql()). This change revealed another problem: In SPI_commit(), we popped the active snapshot before committing the transaction, to avoid "snapshot %p still active" errors. However, there is no particular reason why only at most one snapshot should be on the stack. So change this to pop all active snapshots instead of only one.
* Allow CALL with polymorphic type argumentsPeter Eisentraut2018-07-06
| | | | | In order to be able to resolve polymorphic types, we need to set fn_expr before invoking the procedure.
* Allow replication slots to be dropped in single-user modeAlvaro Herrera2018-07-06
| | | | | | | | | | | | | | | | | | Starting with commit 9915de6c1cb2, replication slot drop uses a condition variable sleep to wait until the current user of the slot goes away. This is more user friendly than the previous behavior of erroring out if the slot is in use, but it fails with a not-for-user-consumption error message in single-user mode; plus, if you're using single-user mode because you don't want to start the server in the regular mode (say, disk is full and WAL won't recycle because of the slot), it's inconvenient. Fix by skipping the cond variable sleep in single-user mode, since there can't be anybody to wait for anyway. Reported-by: tushar <tushar.ahuja@enterprisedb.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/3b2f809f-326c-38dd-7a9e-897f957a4eb1@enterprisedb.com
* Print DEBUG2 like that rather than as DEBUGAndrew Dunstan2018-07-06
| | | | | | | | | DEBUG is an alias for DEBUG2, but we want DEBUG2 to show in the settings no matter how it was spelled. Takeshi Ideriha Discussion: https://postgr.es/m/4E72940DA2BF16479384A86D54D0988A5678EC03@G01JPEXMBKW04
* logical decoding: beware of an unset specinsert changeAlvaro Herrera2018-07-05
| | | | | | Coverity complains that there is no protection in the code (at least in non-assertion-enabled builds) against speculative insertion failing to follow the expected protocol. Add an elog(ERROR) for the case.
* Prevent references to invalid relation pages after fresh promotionMichael Paquier2018-07-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a standby crashes after promotion before having completed its first post-recovery checkpoint, then the minimal recovery point which marks the LSN position where the cluster is able to reach consistency may be set to a position older than the first end-of-recovery checkpoint while all the WAL available should be replayed. This leads to the instance thinking that it contains inconsistent pages, causing a PANIC and a hard instance crash even if all the WAL available has not been replayed for certain sets of records replayed. When in crash recovery, minRecoveryPoint is expected to always be set to InvalidXLogRecPtr, which forces the recovery to replay all the WAL available, so this commit makes sure that the local copy of minRecoveryPoint from the control file is initialized properly and stays as it is while crash recovery is performed. Once switching to archive recovery or if crash recovery finishes, then the local copy minRecoveryPoint can be safely updated. Pavan Deolasee has reported and diagnosed the failure in the first place, and the base fix idea to rely on the local copy of minRecoveryPoint comes from Kyotaro Horiguchi, which has been expanded into a full-fledged patch by me. The test included in this commit has been written by Álvaro Herrera and Pavan Deolasee, which I have modified to make it faster and more reliable with sleep phases. Backpatch down to all supported versions where the bug appears, aka 9.3 which is where the end-of-recovery checkpoint is not run by the startup process anymore. The test gets easily supported down to 10, still it has been tested on all branches. Reported-by: Pavan Deolasee Diagnosed-by: Pavan Deolasee Reviewed-by: Pavan Deolasee, Kyotaro Horiguchi Author: Michael Paquier, Kyotaro Horiguchi, Pavan Deolasee, Álvaro Herrera Discussion: https://postgr.es/m/CABOikdPOewjNL=05K5CbNMxnNtXnQjhTx2F--4p4ruorCjukbA@mail.gmail.com
* Use context with correct lifetime in hypothetical_dense_rank_final.Andres Freund2018-07-04
| | | | | | | | | | | The query lifetime expression context created in hypothetical_dense_rank_final() was buggily allocated in the calling memory context. I (Andres) broke that in bf6c614a2f2. Reported-By: Rajkumar Raghuwanshi Author: Amit Langote Discussion: https://postgr.es/m/CAKcux6kmzWmur5HhA_aU6gYVFu0RLQdgJJ+aC9SLdcOvBSrpfA@mail.gmail.com Backpatch: 11-
* Check for interrupts inside the nbtree page deletion code.Andres Freund2018-07-04
| | | | | | | | | | | | | | | | | | | | When deleting pages the nbtree code has to walk through siblings of a tree node. When those sibling links are corrupted that can lead to endless loops - which are currently not interruptible. This is especially problematic if autovacuum is repeatedly blocked on such indexes, as it can be hard to get out of that situation without resorting to single user mode. Thus add interrupt checks to appropriate places in such loops. Unfortunately in one of the cases it's it's not easy to do so. Between 9.3 and 9.4 the page deletion (and page split) code changed significantly. Before it was significantly less robust against interruptions. Therefore don't backpatch to 9.3. Author: Andres Freund Discussion: https://postgr.es/m/20180627191629.wkunw2qbibnvlz53@alap3.anarazel.de Backpatch: 9.4-
* Improve the performance of relation deletes during recovery.Fujii Masao2018-07-05
| | | | | | | | | | | | | | | | | | | | | | When multiple relations are deleted at the same transaction, the files of those relations are deleted by one call to smgrdounlinkall(), which leads to scan whole shared_buffers only one time. OTOH, previously, during recovery, smgrdounlink() (not smgrdounlinkall()) was called for each file to delete, which led to scan shared_buffers multiple times. Obviously this could cause to increase the WAL replay time very much especially when shared_buffers was huge. To alleviate this situation, this commit changes the recovery so that it also calls smgrdounlinkall() only one time to delete multiple relation files. This is just fix for oversight of commit 279628a0a7, not new feature. So, per discussion on pgsql-hackers, we concluded to backpatch this to all supported versions. Author: Fujii Masao Reviewed-by: Michael Paquier, Andres Freund, Thomas Munro, Kyotaro Horiguchi, Takayuki Tsunakawa Discussion: https://postgr.es/m/CAHGQGwHVQkdfDqtvGVkty+19cQakAydXn1etGND3X0PHbZ3+6w@mail.gmail.com
* Remove dead code for temporary relations in partition planningMichael Paquier2018-07-04
| | | | | | | | | | | | | | | | | | | Since recent commit 1c7c317c, temporary relations cannot be mixed with permanent relations within the same partition tree, and the same counts for temporary relations created by other sessions, which the planner simply discarded. Instead be paranoid and issue an error, as those should be blocked at definition time, at least for now. At the same time, a test case is added to stress what has been moved when expand_partitioned_rtentry gets called recursively but bumps on a partitioned relation with no partitions which should be handled the same way as the non-inheritance case. This code may be reworked in a close future, and covering this code path will limit surprises. Reported-by: David Rowley Author: David Rowley Reviewed-by: Amit Langote, Robert Haas, Michael Paquier Discussion: https://postgr.es/m/CAKJS1f_HyV1txn_4XSdH5EOhBMYaCwsXyAj6bHXk9gOu4JKsbw@mail.gmail.com
* Add wait event for fsync of WAL segmentsMichael Paquier2018-07-02
| | | | | | | | | | | | This has been visibly a forgotten spot in the first implementation of wait events for I/O added by 249cf07, and what has been missing is a fsync call for WAL segments which is a wrapper reacting on the value of GUC wal_sync_method. Reported-by: Konstantin Knizhnik Author: Konstantin Knizhnik Reviewed-by: Craig Ringer, Michael Paquier Discussion: https://postgr.es/m/4a243897-0ad8-f471-aa40-242591f2476e@postgrespro.ru
* Correct function name in comment of logical decoding codeMichael Paquier2018-07-02
| | | | | | Reported-by: Dave Cramer Author: Euler Taveira Discussion: https://postgr.es/m/CADK3HHKnPGJDLhjOFBY6+70Wd14iEH8c2GKw7UrOuUHp_GNFrA@mail.gmail.com
* pgindent run prior to branchingAndrew Dunstan2018-06-30
|
* Fix crash when ALTER TABLE recreates indexes on partitionsAlvaro Herrera2018-06-29
| | | | | | | | | The skip_build flag was not being passed correctly when recursing to indexes on partitions, leading to attempts to rebuild indexes when they were not yet ready to be rebuilt. Reported-by: Rajkumar Raghuwanshi Discussion: https://postgr.es/m/CAKcux6mxNCGsgATwf5CGMF8g4WSupCXicCVMeKUTuWbyxHOMsQ@mail.gmail.com
* Make capitalization of term "OpenSSL" more consistentMichael Paquier2018-06-29
| | | | | | | | This includes code comments and documentation. No backpatch as this is cosmetic even if there are documentation changes which are user-facing. Author: Daniel Gustafsson Discussion: https://postgr.es/m/BB89928E-2BC7-489E-A5E4-6D204B3954CF@yesql.se
* Fix thinko in comments.Amit Kapila2018-06-27
| | | | | | | | | A slot can not be stored in a tuple but it's vice versa. Reported-by: Ashutosh Bapat Author: Ashutosh Bapat Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CAFjFpRcHhNhXdegyJv3KKDWrwO1_NB_KYZM_ZSDeMOZaL1A5jQ@mail.gmail.com
* Remove duplicated return statement from llvmjit code.Andres Freund2018-06-26
| | | | | | | | The duplicated return clearly doesn't make sense / isn't reachable. Likely introduced by me (Andres), while revising the code. Author: Rushabh Lathia Discussion: https://postgr.es/m/CAGPqQf2raxWOcbuTP36M1rEF3=Rfo7oD29K3psdyHMeE5swBRg@mail.gmail.com
* Cosmetic improvements for faster column addition.Amit Kapila2018-06-27
| | | | | | | | | | Changed the name of few structure members for the sake of clarity and removed spurious whitespace. Reported-by: Amit Kapila Author: Amit Kapila, based on suggestion by Andrew Dunstan Reviewed-by: Alvaro Herrera Discussion: https://postgr.es/m/CAA4eK1K2znsFpC+NQ9A4vxT4uDxADN4RmvHX0L6Y=aHVo9gB4Q@mail.gmail.com
* Fix "base" snapshot handling in logical decodingAlvaro Herrera2018-06-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Two closely related bugs are fixed. First, xmin of logical slots was advanced too early. During xl_running_xacts processing, xmin of the slot was set to the oldest running xid in the record, but that's wrong: actually, snapshots which will be used for not-yet-replayed transactions might consider older txns as running too, so we need to keep xmin back for them. The problem wasn't noticed earlier because DDL which allows to delete tuple (set xmax) while some another not-yet-committed transaction looks at it is pretty rare, if not unique: e.g. all forms of ALTER TABLE which change schema acquire ACCESS EXCLUSIVE lock conflicting with any inserts. The included test case (test_decoding's oldest_xmin) uses ALTER of a composite type, which doesn't have such interlocking. To deal with this, we must be able to quickly retrieve oldest xmin (oldest running xid among all assigned snapshots) from ReorderBuffer. To fix, add another list of ReorderBufferTXNs to the reorderbuffer, where transactions are sorted by base-snapshot-LSN. This is slightly different from the existing (sorted by first-LSN) list, because a transaction can have an earlier LSN but a later Xmin, if its first record does not obtain an xmin (eg. xl_xact_assignment). Note this new list doesn't fully replace the existing txn list: we still need that one to prevent WAL recycling. The second issue concerns SnapBuilder snapshots and subtransactions. SnapBuildDistributeNewCatalogSnapshot never assigned a snapshot to a transaction that is known to be a subtxn, which is good in the common case that the top-level transaction already has one (no point in doing so), but a bug otherwise. To fix, arrange to transfer the snapshot from the subtxn to its top-level txn as soon as the kinship gets known. test_decoding's snapshot_transfer verifies this. Also, fix a minor memory leak: refcount of toplevel's old base snapshot was not decremented when the snapshot is transferred from child. Liberally sprinkle code comments, and rewrite a few existing ones. This part is my (Álvaro's) contribution to this commit, as I had to write all those comments in order to understand the existing code and Arseny's patch. Reported-by: Arseny Sher <a.sher@postgrespro.ru> Diagnosed-by: Arseny Sher <a.sher@postgrespro.ru> Co-authored-by: Arseny Sher <a.sher@postgrespro.ru> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Antonin Houska <ah@cybertec.at> Discussion: https://postgr.es/m/87lgdyz1wj.fsf@ars-thinkpad
* Fix upper limit for vacuum_cleanup_index_scale_factorAlexander Korotkov2018-06-26
| | | | | | | | | | | | | | | 6ca33a88 sets upper limit for vacuum_cleanup_index_scale_factor to DBL_MAX. DBL_MAX appears to be platform-dependent. That causes many buildfarm animals to fail, because we check boundaries of vacuum_cleanup_index_scale_factor in regression tests. This commit changes upper limit from DBL_MAX to just "large enough" limit, which was arbitrary selected as 1e10. Author: Alexander Korotkov Reported-by: Tom Lane, Darafei Praliaskouski Discussion: https://postgr.es/m/CAPpHfdvewmr4PcpRjrkstoNn1n2_6dL-iHRB21CCfZ0efZdBTg%40mail.gmail.com Discussion: https://postgr.es/m/CAC8Q8tLYFOpKNaPS_E7V8KtPdE%3D_TnAn16t%3DA3LuL%3DXjfOO-BQ%40mail.gmail.com
* Correct a comment on logtape.c's leader tape.Peter Geoghegan2018-06-26
| | | | | | | | randomAccess parallel tuplesorts are disallowed because the leader would try to write to its own leader tape, not because the leader would try to write to a worker tape directly. Cleanup from commit 9da0cc35284.
* Remove obsolete comment block in nbtsort.c.Peter Geoghegan2018-06-26
| | | | | | | | | | | Building a new nbtree index through incremental insertions would always be slower than our actual approach of sorting using tuplesort, assembling leaf pages from tuplesort output, and writing and WAL-logging whole pages. Remove a comment block from the Berkeley days claiming that incremental insertions might be slightly faster with presorted input. Discussion: https://postgr.es/m/CAH2-WzmKs4mLAoFgJ3yHMRYc849efc=dw+pNRb3NEog2oJoCNw@mail.gmail.com
* Enable failure to rename a partitioned indexAlvaro Herrera2018-06-26
| | | | | | | | | | | | Concurrently with partitioned index development (commit 8b08f7d4820f), the code to handle failure to rename indexes was refactored (commit 8b9e9644dc6a). Turns out that that particular case was untested, which naturally led it to be broken. Add tests and the missing code line. Co-authored-by: David Rowley <dgrowley@gmail.com> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reported-by: Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com> Discussion: https://postgr.es/m/CAKcux6mfYMS3OX0ywjOiWiGSEKhJf-1zdeTceHFbd0mScUzU5A@mail.gmail.com
* Allow direct lookups of AppendRelInfo by child relidAlvaro Herrera2018-06-26
| | | | | | | | | | | | | | | | | | | | | find_appinfos_by_relids had quite a large overhead when the number of items in the append_rel_list was high, as it had to trawl through the append_rel_list looking for AppendRelInfos belonging to the given childrelids. Since there can only be a single AppendRelInfo for each child rel, it seems much better to store an array in PlannerInfo which indexes these by child relid, making the function O(1) rather than O(N). This function was only called once inside the planner, so just replace that call with a lookup to the new array. find_childrel_appendrelinfo is now unused and thus removed. This fixes a planner performance regression new to v11 reported by Thomas Reiss. Author: David Rowley Reported-by: Thomas Reiss Reviewed-by: Ashutosh Bapat Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/94dd7a4b-5e50-0712-911d-2278e055c622@dalibo.com
* Increase upper limit for vacuum_cleanup_index_scale_factorAlexander Korotkov2018-06-26
| | | | | | | | | | | | | | | | | | Upper limits for vacuum_cleanup_index_scale_factor GUC and reloption were initially set to 100.0 in 857f9c36. However, after further discussion, it appears that some users like to disable B-tree cleanup index scan completely (assuming there are no deleted pages). vacuum_cleanup_index_scale_factor is used barely to protect against stalled index statistics. And after detailed consideration it appears that risk of stalled index statistics is low. And it would be nice to allow advanced users setting higher values of vacuum_cleanup_index_scale_factor. So, set upper limit for these GUC and reloption to DBL_MAX. Author: Alexander Korotkov Reviewed-by: Masahiko Sawada Discussion: https://postgr.es/m/CAC8Q8tJCb%3DgxhzcV7T6ctx7PY-Ux1oA-AsTJc6cAVNsQiYcCzA%40mail.gmail.com
* Move RecoveryLockList into a hash table.Thomas Munro2018-06-26
| | | | | | | | | | | | | | | | | | Standbys frequently need to release all locks held by a given xid. Instead of searching one big list linearly, let's create one list per xid and put them in a hash table, so we can find what we need in O(1) time. Earlier analysis and a prototype were done by David Rowley, though this isn't his patch. Back-patch all the way. Author: Thomas Munro Diagnosed-by: David Rowley, Andres Freund Reviewed-by: Andres Freund, Tom Lane, Robert Haas Discussion: https://postgr.es/m/CAEepm%3D1mL0KiQ2KJ4yuPpLGX94a4Ns_W6TL4EGRouxWibu56pA%40mail.gmail.com Discussion: https://postgr.es/m/CAKJS1f9vJ841HY%3DwonnLVbfkTWGYWdPN72VMxnArcGCjF3SywA%40mail.gmail.com
* Update obsolete commentsAlvaro Herrera2018-06-25
| | | | | | | Commit 9fab40ad32ef removed some pre-allocating logic in reorderbuffer.c, but left outdated comments in place. Repair. Author: Álvaro Herrera
* Translation updatesPeter Eisentraut2018-06-25
| | | | | Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 884f33d735870f94357820800840af3e93ff4628
* Address set of issues with errno handlingMichael Paquier2018-06-25
| | | | | | | | | | | | | | | | | | | | System calls mixed up in error code paths are causing two issues which several code paths have not correctly handled: 1) For write() calls, sometimes the system may return less bytes than what has been written without errno being set. Some paths were careful enough to consider that case, and assumed that errno should be set to ENOSPC, other calls missed that. 2) errno generated by a system call is overwritten by other system calls which may succeed once an error code path is taken, causing what is reported to the user to be incorrect. This patch uses the brute-force approach of correcting all those code paths. Some refactoring could happen in the future, but this is let as future work, which is not targeted for back-branches anyway. Author: Michael Paquier Reviewed-by: Ashutosh Sharma Discussion: https://postgr.es/m/20180622061535.GD5215@paquier.xyz
* When index recurses to a partition, map columns numbersAlvaro Herrera2018-06-22
| | | | | | | | | | | | | | Two out of three code paths were mapping column numbers correctly if a partition had different column numbers than parent table, but the most commonly used one (recursing in CREATE INDEX to a new index on a partition) failed to map attribute numbers in expressions. Oddly enough, attnums in WHERE clauses are already handled correctly everywhere. Reported-by: Amit Langote Author: Amit Langote Discussion: https://postgr.es/m/dce1fda4-e0f0-94c9-6abb-f5956a98c057@lab.ntt.co.jp Reviewed-by: Álvaro Herrera
* Avoid generating bogus paths with partitionwise aggregate.Robert Haas2018-06-22
| | | | | | | | | | | Previously, if some or all partitions had no partially aggregated path, we would still try to generate a partially aggregated path for the parent, leading to assertion failures or wrong answers. Report by Rajkumar Raghuwanshi. Patch by Jeevan Chalke, reviewed by Ashutosh Bapat. A few changes by me. Discussion: http://postgr.es/m/CAKcux6=q4+Mw8gOOX16ef6ZMFp9Cve7KWFstUsrDa4GiFaXGUQ@mail.gmail.com
* Allow for pg_upgrade of attributes with missing valuesAndrew Dunstan2018-06-22
| | | | | | | | | | | | | | | | | | | | | | | | Commit 16828d5c02 neglected to do this, so upgraded databases would silently get null instead of the specified default in rows without the attribute defined. A new binary upgrade function is provided to perform this and pg_dump is adjusted to output a call to the function if required in binary upgrade mode. Also included is code to drop missing attribute values for dropped columns. That way if the type is later dropped the missing value won't have a dangling reference to the type. Finally the regression tests are adjusted to ensure that there is a row with a missing value so that this code is exercised in upgrade testing. Catalog version unfortunately bumped. Regression test changes from Tom Lane. Remainder from me, reviewed by Tom Lane, Andres Freund, Alvaro Herrera Discussion: https://postgr.es/m/19987.1529420110@sss.pgh.pa.us
* Fixes for vacuum_cleanup_index_scale_factor GUC optionAlexander Korotkov2018-06-22
| | | | | | | | | | | | | | vacuum_cleanup_index_scale_factor was located in autovacuum group of GUCs. However, it affects not only autovacuum, but also manually run VACUUM. It appears that "client connection defaults" group of GUCs is more appropriate for vacuum_cleanup_index_scale_factor, because vacuum_*_age options are already located there. Also, vacuum_cleanup_index_scale_factor was missed in postgresql.conf.sample. So, add it there with appropriate comment. Author: Masahiko Sawada with minor editorization by me Discussion: https://postgr.es/m/CAD21AoArsoXMLKudXSKN679FRzs6oubEchM53bHwn8Tp%3D2boNg%40mail.gmail.com
* Fix typo in comment of commit_ts.c for incorrect reference to CLOGMichael Paquier2018-06-22
| | | | Author: Shao Bret
* Improve coding pattern in Parallel Append code.Amit Kapila2018-06-22
| | | | | | | | | | | | The create_append_path code didn't consider that list_concat will modify it's first argument leading to inconsistent traversal of resulting list. In practice, it won't lead to any user-visible bug but changing it for making the code behave consistently. Reported-by: Tom Lane Author: Tom Lane Reviewed-by: Amit Khandekar and Amit Kapila Discussion: https://postgr.es/m/32365.1528994120@sss.pgh.pa.us