aboutsummaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAge
...
* Fix create_scan_plan's handling of sortgrouprefs for physical tlists.Tom Lane2018-07-11
| | | | | | | | | | | | | | | | | | | | | | | | We should only run apply_pathtarget_labeling_to_tlist if CP_LABEL_TLIST was specified, because only in that case has use_physical_tlist checked that the labeling will succeed; otherwise we may get an "ORDER/GROUP BY expression not found in targetlist" error. (This subsumes the previous test about gating_clauses, because we reset "flags" to zero earlier if there are gating clauses to apply.) The only known case in which a failure can occur is with a ProjectSet path directly atop a table scan path, although it seems likely that there are other cases or will be such in future. This means that the failure is currently only visible in the v10 branch: 9.6 didn't have ProjectSet, while in v11 and HEAD, apply_scanjoin_target_to_paths for some weird reason is using create_projection_path not apply_projection_to_path, masking the problem because there's a ProjectionPath in between. Nonetheless this code is clearly wrong on its own terms, so back-patch to 9.6 where this logic was introduced. Per report from Regina Obe. Discussion: https://postgr.es/m/001501d40f88$75186950$5f493bf0$@pcorp.us
* Fix bugs with degenerate window ORDER BY clauses in GROUPS/RANGE mode.Tom Lane2018-07-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nodeWindowAgg.c failed to cope with the possibility that no ordering columns are defined in the window frame for GROUPS mode or RANGE OFFSET mode, leading to assertion failures or odd errors, as reported by Masahiko Sawada and Lukas Eder. In RANGE OFFSET mode, an ordering column is really required, so add an Assert about that. In GROUPS mode, the code would work, except that the node initialization code wasn't in sync with the execution code about when to set up tuplestore read pointers and spare slots. Fix the latter for consistency's sake (even though I think the changes described below make the out-of-sync cases unreachable for now). Per SQL spec, a single ordering column is required for RANGE OFFSET mode, and at least one ordering column is required for GROUPS mode. The parser enforced the former but not the latter; add a check for that. We were able to reach the no-ordering-column cases even with fully spec compliant queries, though, because the planner would drop partitioning and ordering columns from the generated plan if they were redundant with earlier columns according to the redundant-pathkey logic, for instance "PARTITION BY x ORDER BY y" in the presence of a "WHERE x=y" qual. While in principle that's an optimization that could save some pointless comparisons at runtime, it seems unlikely to be meaningful in the real world. I think this behavior was not so much an intentional optimization as a side-effect of an ancient decision to construct the plan node's ordering-column info by reverse-engineering the PathKeys of the input path. If we give up redundant-column removal then it takes very little code to generate the plan node info directly from the WindowClause, ensuring that we have the expected number of ordering columns in all cases. (If anyone does complain about this, the planner could perhaps be taught to remove redundant columns only when it's safe to do so, ie *not* in RANGE OFFSET mode. But I doubt anyone ever will.) With these changes, the WindowAggPath.winpathkeys field is not used for anything anymore, so remove it. The test cases added here are not actually very interesting given the removal of the redundant-column-removal logic, but they would represent important corner cases if anyone ever tries to put that back. Tom Lane and Masahiko Sawada. Back-patch to v11 where RANGE OFFSET and GROUPS modes were added. Discussion: https://postgr.es/m/CAD21AoDrWqycq-w_+Bx1cjc+YUhZ11XTj9rfxNiNDojjBx8Fjw@mail.gmail.com Discussion: https://postgr.es/m/153086788677.17476.8002640580496698831@wrigleys.postgresql.org
* Block replication slot advance for these not yet reserving WALMichael Paquier2018-07-11
| | | | | | | | | | | | | | | | Such replication slots are physical slots freshly created without WAL being reserved, which is the default behavior, which have not been used yet as WAL consumption resources to retain WAL. This prevents advancing a slot to a position older than any WAL available, which could falsify calculations for WAL segment recycling. This also cleans up a bit the code, as ReplicationSlotRelease() would be called on ERROR, and improves error messages. Reported-by: Kyotaro Horiguchi Author: Michael Paquier Reviewed-by: Andres Freund, Álvaro Herrera, Kyotaro Horiguchi Discussion: https://postgr.es/m/20180626071305.GH31353@paquier.xyz
* Better handle pseudotypes as partition keysAlvaro Herrera2018-07-10
| | | | | | | | | | | | | | | | | | | | | | We fail to handle polymorphic types properly when they are used as partition keys: we were unnecessarily adding a RelabelType node on top, which confuses code examining the nodes. In particular, this makes predtest.c-based partition pruning not to work, and ruleutils.c to emit expressions that are uglier than needed. Fix it by not adding RelabelType when not needed. In master/11 the new pruning code is separate so it doesn't suffer from this problem, since we already fixed it (in essentially the same way) in e5dcbb88a15d, which also added a few tests; back-patch those tests to pg10 also. But since UPDATE/DELETE still uses predtest.c in pg11, this change improves partitioning for those cases too. Add tests for this. The ruleutils.c behavior change is relevant in pg11/master too. Co-authored-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/54745d13-7ed4-54ac-97d8-ea1eec95ae25@lab.ntt.co.jp
* Fix typosPeter Eisentraut2018-07-10
|
* Fix typoPeter Eisentraut2018-07-10
|
* Avoid emitting a bogus WAL record when recycling an all-zero btree page.Tom Lane2018-07-09
| | | | | | | | | | | | | | | | | | | | Commit fafa374f2 caused _bt_getbuf() to possibly emit a WAL record for a page that it was about to recycle. However, it failed to distinguish all-zero pages from dead pages, which is important because only the latter have valid btpo.xact values, or indeed any special space at all. Recycling an all-zero page with XLogStandbyInfoActive() enabled therefore led to an Assert failure, or to emission of a WAL record containing a bogus cutoff XID, which might lead to unnecessary query cancellations on hot standby servers. Per reports from Antonin Houska and 自己. Amit Kapila was first to propose this fix, and Robert Haas, myself, and Kyotaro Horiguchi reviewed it at various times. This is an old bug, so back-patch to all supported branches. Discussion: https://postgr.es/m/2628.1474272158@localhost Discussion: https://postgr.es/m/48875502.f4a0.1635f0c27b0.Coremail.zoulx1982@163.com
* Flip argument order in XLogSegNoOffsetToRecPtrAlvaro Herrera2018-07-09
| | | | | | | | | Commit fc49e24fa69a added an input argument after the existing output argument. Flip those. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20180708182345.imdgovmkffgtihhk@alvherre.pgsql
* Fix yet more problems with incorrectly-constructed zero-length arrays.Tom Lane2018-07-09
| | | | | | | | | | | | | | | | | | | Commit 716ea626a attempted to fix the problem of building 1-D zero-size arrays once and for all. But it turns out that contrib/intarray has some code that doesn't use construct_array() but just builds arrays by hand, so it didn't get the memo. This appears to affect all of subarray(), intset_subtract(), inner_int_union(), inner_int_inter(), and intarray_concat_arrays(). Back-patch into v11. In the past we've not back-patched this type of change, but since v11 is still in beta it seems all right to include this fix in it. Besides it's more consistent to make the fix in v11 where 716ea626a appeared. Report and patch by Alexey Kryuchkov, some cosmetic adjustments by me Report: https://postgr.es/m/153053285112.13258.434620894305716755@wrigleys.postgresql.org Discussion: https://postgr.es/m/CAN85JcYphDLYt4CpMDLZjjNVqGDrFJ5eS3YF=wLAhFoDQuBsyg@mail.gmail.com
* rel notes: mention enabling of parallelism in PG 10Bruce Momjian2018-07-09
| | | | | | | | Reported-by: Justin Pryzby Discussion: https://postgr.es/m/20180525010025.GT30060@telsasoft.com Backpatch-through: 10
* Add UtilityReturnsTuples() support for CALLPeter Eisentraut2018-07-09
| | | | This ensures that prepared statements for CALL can return tuples.
* Rework order of end-of-recovery actions to delay timeline history writeMichael Paquier2018-07-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A critical failure in some of the end-of-recovery actions before the end-of-recovery record is written can cause PostgreSQL to react inconsistently with the rest of the cluster in the event of a crash before the final record is written. Two such failures are for example an error while processing a two-phase state files or when operating on recovery.conf. With this commit, the failures are still considered FATAL, but the write of the timeline history file is delayed as much as possible so as the window between the moment the file is written and the end-of-recovery record is generated gets minimized. This way, in the event of a crash or a failure, the new timeline decided at promotion will not seem taken by other nodes in the cluster. It is not really possible to reduce to zero this window, hence one could still see failures if a crash happens between the history file write and the end-of-recovery record, so any future code should be careful when adding new end-of-recovery actions. The original report from Magnus Hagander mentioned a renamed recovery.conf as original end-of-recovery failure which caused a timeline to be seen as taken but the subsequent processing on the now-missing recovery.conf cause the startup process to issue stop on FATAL, which at follow-up startup made the system inconsistent because of on-disk changes which already happened. Processing of two-phase state files still needs some work as corrupted entries are simply ignored now. This is left as a future item and this commit fixes the original complain. Reported-by: Magnus Hagander Author: Heikki Linnakangas Reviewed-by: Alexander Korotkov, Michael Paquier, David Steele Discussion: https://postgr.es/m/CABUevEz09XY2EevA2dLjPCY-C5UO4Hq=XxmXLmF6ipNFecbShQ@mail.gmail.com
* Add separate error message for procedure does not existPeter Eisentraut2018-07-07
| | | | | | While we probably don't want to split up all error messages into function and procedure variants, this one is a very prominent one, so it's helpful to be more specific here.
* Add note in pg_rewind documentation about read-only filesMichael Paquier2018-07-07
| | | | | | | | | | | | | | | | | When performing pg_rewind, the presence of a read-only file which is not accessible for writes will cause a failure while processing. This can cause the control file of the target data folder to be truncated, causing it to not be reusable with a successive run. Also, when pg_rewind fails mid-flight, there is likely no way to be able to recover the target data folder anyway, in which case a new base backup is the best option. A note is added in the documentation as well about. Reported-by: Christian H. Author: Michael Paquier Reviewed-by: Andrew Dunstan Discussion: https://postgr.es/m/20180104200633.17004.16377%40wrigleys.postgresql.org
* Fix assert in nested SQL procedure callPeter Eisentraut2018-07-06
| | | | | | | | | | | | | | | | | When executing CALL in PL/pgSQL, we need to set a snapshot before invoking the to-be-called procedure. Otherwise, the to-be-called procedure might end up running without a snapshot. For LANGUAGE SQL procedures, this would result in an assertion failure. (For most other languages, this is usually not a problem, because those use SPI and SPI sets snapshots in most cases.) Setting the snapshot restores the behavior of how CALL worked when it was handled as a generic SQL statement in PL/pgSQL (exec_stmt_execsql()). This change revealed another problem: In SPI_commit(), we popped the active snapshot before committing the transaction, to avoid "snapshot %p still active" errors. However, there is no particular reason why only at most one snapshot should be on the stack. So change this to pop all active snapshots instead of only one.
* Allow CALL with polymorphic type argumentsPeter Eisentraut2018-07-06
| | | | | In order to be able to resolve polymorphic types, we need to set fn_expr before invoking the procedure.
* Allow replication slots to be dropped in single-user modeAlvaro Herrera2018-07-06
| | | | | | | | | | | | | | | | | | Starting with commit 9915de6c1cb2, replication slot drop uses a condition variable sleep to wait until the current user of the slot goes away. This is more user friendly than the previous behavior of erroring out if the slot is in use, but it fails with a not-for-user-consumption error message in single-user mode; plus, if you're using single-user mode because you don't want to start the server in the regular mode (say, disk is full and WAL won't recycle because of the slot), it's inconvenient. Fix by skipping the cond variable sleep in single-user mode, since there can't be anybody to wait for anyway. Reported-by: tushar <tushar.ahuja@enterprisedb.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/3b2f809f-326c-38dd-7a9e-897f957a4eb1@enterprisedb.com
* doc: Reword old inheritance partitioning documentationAlvaro Herrera2018-07-06
| | | | | | | | | | | | Prefer to use phrases like "child" instead of "partition" when describing the legacy inheritance-based partitioning. The word "partition" now has a fixed meaning for the built-in partitioning, so keeping it out of the documentation of the old method makes things clearer. Author: Justin Pryzby <pryzby@telsasoft.com> Committer: Peter Eisentraut <peter_e@gmx.net> Backpatch of: 0c06534bd63b
* logical decoding: beware of an unset specinsert changeAlvaro Herrera2018-07-05
| | | | | | Coverity complains that there is no protection in the code (at least in non-assertion-enabled builds) against speculative insertion failing to follow the expected protocol. Add an elog(ERROR) for the case.
* doc: Fix typosPeter Eisentraut2018-07-05
| | | | Author: Justin Pryzby <pryzby@telsasoft.com>
* Reduce cost of test_decoding's new oldest_xmin testAlvaro Herrera2018-07-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change a whole-database VACUUM into doing just pg_attribute, which is the portion that verifies what we want it to do. The original formulation wastes a lot of CPU time, which leads the test to fail when runtime exceeds isolationtester timeout when it's super-slow, such as under CLOBBER_CACHE_ALWAYS. Per buildfarm member friarbird. It turns out that the previous shape of the test doesn't always detect the condition it is supposed to detect (on unpatched reorderbuffer code): the reason is that there is a good chance of encountering a xl_running_xacts record (logged every 15 seconds) before the checkpoint -- and because we advance the xmin when we receive that WAL record, and we *don't* advance the xmin twice consecutively without receiving a client message in between, that means the xmin is not advanced enough for the tuple to be pruned from pg_attribute by VACUUM. So the test would spuriously pass. The reason this test deficiency wasn't detected earlier is that HOT pruning removes the tuple anyway, even if vacuum leaves it in place, so the test correctly fails (detecting the coding mistake), but for the wrong reason. To fix this mess, run the s0_get_changes step twice before vacuum instead of once: this seems to cause the xmin to be advanced reliably, wreaking havoc with more certainty. Author: Arseny Sher Discussion: https://postgr.es/m/87h8lkuxoa.fsf@ars-thinkpad
* Prevent references to invalid relation pages after fresh promotionMichael Paquier2018-07-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a standby crashes after promotion before having completed its first post-recovery checkpoint, then the minimal recovery point which marks the LSN position where the cluster is able to reach consistency may be set to a position older than the first end-of-recovery checkpoint while all the WAL available should be replayed. This leads to the instance thinking that it contains inconsistent pages, causing a PANIC and a hard instance crash even if all the WAL available has not been replayed for certain sets of records replayed. When in crash recovery, minRecoveryPoint is expected to always be set to InvalidXLogRecPtr, which forces the recovery to replay all the WAL available, so this commit makes sure that the local copy of minRecoveryPoint from the control file is initialized properly and stays as it is while crash recovery is performed. Once switching to archive recovery or if crash recovery finishes, then the local copy minRecoveryPoint can be safely updated. Pavan Deolasee has reported and diagnosed the failure in the first place, and the base fix idea to rely on the local copy of minRecoveryPoint comes from Kyotaro Horiguchi, which has been expanded into a full-fledged patch by me. The test included in this commit has been written by Álvaro Herrera and Pavan Deolasee, which I have modified to make it faster and more reliable with sleep phases. Backpatch down to all supported versions where the bug appears, aka 9.3 which is where the end-of-recovery checkpoint is not run by the startup process anymore. The test gets easily supported down to 10, still it has been tested on all branches. Reported-by: Pavan Deolasee Diagnosed-by: Pavan Deolasee Reviewed-by: Pavan Deolasee, Kyotaro Horiguchi Author: Michael Paquier, Kyotaro Horiguchi, Pavan Deolasee, Álvaro Herrera Discussion: https://postgr.es/m/CABOikdPOewjNL=05K5CbNMxnNtXnQjhTx2F--4p4ruorCjukbA@mail.gmail.com
* Use context with correct lifetime in hypothetical_dense_rank_final.Andres Freund2018-07-04
| | | | | | | | | | | The query lifetime expression context created in hypothetical_dense_rank_final() was buggily allocated in the calling memory context. I (Andres) broke that in bf6c614a2f2. Reported-By: Rajkumar Raghuwanshi Author: Amit Langote Discussion: https://postgr.es/m/CAKcux6kmzWmur5HhA_aU6gYVFu0RLQdgJJ+aC9SLdcOvBSrpfA@mail.gmail.com Backpatch: 11-
* Check for interrupts inside the nbtree page deletion code.Andres Freund2018-07-04
| | | | | | | | | | | | | | | | | | | | When deleting pages the nbtree code has to walk through siblings of a tree node. When those sibling links are corrupted that can lead to endless loops - which are currently not interruptible. This is especially problematic if autovacuum is repeatedly blocked on such indexes, as it can be hard to get out of that situation without resorting to single user mode. Thus add interrupt checks to appropriate places in such loops. Unfortunately in one of the cases it's it's not easy to do so. Between 9.3 and 9.4 the page deletion (and page split) code changed significantly. Before it was significantly less robust against interruptions. Therefore don't backpatch to 9.3. Author: Andres Freund Discussion: https://postgr.es/m/20180627191629.wkunw2qbibnvlz53@alap3.anarazel.de Backpatch: 9.4-
* Improve the performance of relation deletes during recovery.Fujii Masao2018-07-05
| | | | | | | | | | | | | | | | | | | | | | When multiple relations are deleted at the same transaction, the files of those relations are deleted by one call to smgrdounlinkall(), which leads to scan whole shared_buffers only one time. OTOH, previously, during recovery, smgrdounlink() (not smgrdounlinkall()) was called for each file to delete, which led to scan shared_buffers multiple times. Obviously this could cause to increase the WAL replay time very much especially when shared_buffers was huge. To alleviate this situation, this commit changes the recovery so that it also calls smgrdounlinkall() only one time to delete multiple relation files. This is just fix for oversight of commit 279628a0a7, not new feature. So, per discussion on pgsql-hackers, we concluded to backpatch this to all supported versions. Author: Fujii Masao Reviewed-by: Michael Paquier, Andres Freund, Thomas Munro, Kyotaro Horiguchi, Takayuki Tsunakawa Discussion: https://postgr.es/m/CAHGQGwHVQkdfDqtvGVkty+19cQakAydXn1etGND3X0PHbZ3+6w@mail.gmail.com
* Remove dead code for temporary relations in partition planningMichael Paquier2018-07-04
| | | | | | | | | | | | | | | | | | | Since recent commit 1c7c317c, temporary relations cannot be mixed with permanent relations within the same partition tree, and the same counts for temporary relations created by other sessions, which the planner simply discarded. Instead be paranoid and issue an error, as those should be blocked at definition time, at least for now. At the same time, a test case is added to stress what has been moved when expand_partitioned_rtentry gets called recursively but bumps on a partitioned relation with no partitions which should be handled the same way as the non-inheritance case. This code may be reworked in a close future, and covering this code path will limit surprises. Reported-by: David Rowley Author: David Rowley Reviewed-by: Amit Langote, Robert Haas, Michael Paquier Discussion: https://postgr.es/m/CAKJS1f_HyV1txn_4XSdH5EOhBMYaCwsXyAj6bHXk9gOu4JKsbw@mail.gmail.com
* Correct commentPeter Eisentraut2018-07-03
|
* Fix libpq example programsPeter Eisentraut2018-07-01
| | | | | | | | When these programs call pg_catalog.set_config, they need to check for PGRES_TUPLES_OK instead of PGRES_COMMAND_OK. Fix for 5770172cb0c9df9e6ce27c507b449557e5b45124. Reported-by: Ideriha, Takeshi <ideriha.takeshi@jp.fujitsu.com>
* perltidy run prior to branchingAndrew Dunstan2018-06-30
|
* pgindent run prior to branchingAndrew Dunstan2018-06-30
|
* Update typedefs listAndrew Dunstan2018-06-30
|
* Documentation spell checking and markup improvementsPeter Eisentraut2018-06-29
|
* doc: Replace non-ASCII lines in psql example outputPeter Eisentraut2018-06-29
|
* psql: show cloned triggers in partitionsAlvaro Herrera2018-06-29
| | | | | | | | | | | | In a partition, row triggers that had been cloned from their parent partitioned table would not be listed at all in psql's \d, which could surprise users, per insistent complaint from Ashutosh Bapat (though his aim was elsewhere). The simplest possible fix, suggested by Peter Eisentraut, seems to be to list triggers marked as internal if they have a row in pg_depend that points to some other trigger. Author: Álvaro Herrera Discussion: https://postgr.es/m/20180618165910.p26vhk7dpq65ix54@alvherre.pgsql
* Fix crash when ALTER TABLE recreates indexes on partitionsAlvaro Herrera2018-06-29
| | | | | | | | | The skip_build flag was not being passed correctly when recursing to indexes on partitions, leading to attempts to rebuild indexes when they were not yet ready to be rebuilt. Reported-by: Rajkumar Raghuwanshi Discussion: https://postgr.es/m/CAKcux6mxNCGsgATwf5CGMF8g4WSupCXicCVMeKUTuWbyxHOMsQ@mail.gmail.com
* Replace search.cpan.org with metacpan.orgMichael Paquier2018-06-29
| | | | | | | | | | search.cpan.org has been EOL'd, with metacpan.org being the official replacement to which URLs now redirect. Update links to match the new URL. Also update links to CPAN to use https as it will redirect from http. Author: Daniel Gustafsson Discussion: https://postgr.es/m/B74C0219-6BA9-46E1-A524-5B9E8CD3BDB3@yesql.se
* Make capitalization of term "OpenSSL" more consistentMichael Paquier2018-06-29
| | | | | | | | This includes code comments and documentation. No backpatch as this is cosmetic even if there are documentation changes which are user-facing. Author: Daniel Gustafsson Discussion: https://postgr.es/m/BB89928E-2BC7-489E-A5E4-6D204B3954CF@yesql.se
* Fix typo in commentAlvaro Herrera2018-06-27
| | | | | Author: Amit Langote Discussion: https://postgr.es/m/b23dc88b-df41-ef07-22c5-12f77cf73b57@lab.ntt.co.jp
* Fix thinko in comments.Amit Kapila2018-06-27
| | | | | | | | | A slot can not be stored in a tuple but it's vice versa. Reported-by: Ashutosh Bapat Author: Ashutosh Bapat Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CAFjFpRcHhNhXdegyJv3KKDWrwO1_NB_KYZM_ZSDeMOZaL1A5jQ@mail.gmail.com
* Change pqformat.h's integer handling functions to take unsigned integers.Andres Freund2018-06-26
| | | | | | | | | | | | | | | | | | | | | | As added in 1de09ad8eb1fa673ee7899d6dfbb2b49ba204818 the new functions all accept signed integers as parameters. That's not great, because it's perfectly reasonable to call them with unsigned parameters. Unfortunately unsigned to signed conversion is not well defined, when exceeding the range of the signed value. That's presently not a practical issue in postgres (among other reasons because we force gcc's hand with -fwrapv). But it's clearly not quite right. Thus change the signatures to accept unsigned integers instead, signed to unsigned conversion is always well defined. Also change the backward compat pq_sendint() - while it's deprecated it seems better to be consistent. Per discussion between Andrew Gierth, Michael Paquier, Alvaro Herrera and Tom Lane. Reported-By: Andrew Gierth Author: Andres Freund Discussion: https://postgr.es/m/87r2m10zm2.fsf@news-spur.riddles.org.uk
* Remove duplicated return statement from llvmjit code.Andres Freund2018-06-26
| | | | | | | | The duplicated return clearly doesn't make sense / isn't reachable. Likely introduced by me (Andres), while revising the code. Author: Rushabh Lathia Discussion: https://postgr.es/m/CAGPqQf2raxWOcbuTP36M1rEF3=Rfo7oD29K3psdyHMeE5swBRg@mail.gmail.com
* Fix whitespacePeter Eisentraut2018-06-27
|
* doc: Improve wording and fix whitespacePeter Eisentraut2018-06-27
|
* doc: Document some nuances of logical replication of TRUNCATEPeter Eisentraut2018-06-27
|
* Cosmetic improvements for faster column addition.Amit Kapila2018-06-27
| | | | | | | | | | Changed the name of few structure members for the sake of clarity and removed spurious whitespace. Reported-by: Amit Kapila Author: Amit Kapila, based on suggestion by Andrew Dunstan Reviewed-by: Alvaro Herrera Discussion: https://postgr.es/m/CAA4eK1K2znsFpC+NQ9A4vxT4uDxADN4RmvHX0L6Y=aHVo9gB4Q@mail.gmail.com
* Fix "base" snapshot handling in logical decodingAlvaro Herrera2018-06-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Two closely related bugs are fixed. First, xmin of logical slots was advanced too early. During xl_running_xacts processing, xmin of the slot was set to the oldest running xid in the record, but that's wrong: actually, snapshots which will be used for not-yet-replayed transactions might consider older txns as running too, so we need to keep xmin back for them. The problem wasn't noticed earlier because DDL which allows to delete tuple (set xmax) while some another not-yet-committed transaction looks at it is pretty rare, if not unique: e.g. all forms of ALTER TABLE which change schema acquire ACCESS EXCLUSIVE lock conflicting with any inserts. The included test case (test_decoding's oldest_xmin) uses ALTER of a composite type, which doesn't have such interlocking. To deal with this, we must be able to quickly retrieve oldest xmin (oldest running xid among all assigned snapshots) from ReorderBuffer. To fix, add another list of ReorderBufferTXNs to the reorderbuffer, where transactions are sorted by base-snapshot-LSN. This is slightly different from the existing (sorted by first-LSN) list, because a transaction can have an earlier LSN but a later Xmin, if its first record does not obtain an xmin (eg. xl_xact_assignment). Note this new list doesn't fully replace the existing txn list: we still need that one to prevent WAL recycling. The second issue concerns SnapBuilder snapshots and subtransactions. SnapBuildDistributeNewCatalogSnapshot never assigned a snapshot to a transaction that is known to be a subtxn, which is good in the common case that the top-level transaction already has one (no point in doing so), but a bug otherwise. To fix, arrange to transfer the snapshot from the subtxn to its top-level txn as soon as the kinship gets known. test_decoding's snapshot_transfer verifies this. Also, fix a minor memory leak: refcount of toplevel's old base snapshot was not decremented when the snapshot is transferred from child. Liberally sprinkle code comments, and rewrite a few existing ones. This part is my (Álvaro's) contribution to this commit, as I had to write all those comments in order to understand the existing code and Arseny's patch. Reported-by: Arseny Sher <a.sher@postgrespro.ru> Diagnosed-by: Arseny Sher <a.sher@postgrespro.ru> Co-authored-by: Arseny Sher <a.sher@postgrespro.ru> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Antonin Houska <ah@cybertec.at> Discussion: https://postgr.es/m/87lgdyz1wj.fsf@ars-thinkpad
* Fix upper limit for vacuum_cleanup_index_scale_factorAlexander Korotkov2018-06-26
| | | | | | | | | | | | | | | 6ca33a88 sets upper limit for vacuum_cleanup_index_scale_factor to DBL_MAX. DBL_MAX appears to be platform-dependent. That causes many buildfarm animals to fail, because we check boundaries of vacuum_cleanup_index_scale_factor in regression tests. This commit changes upper limit from DBL_MAX to just "large enough" limit, which was arbitrary selected as 1e10. Author: Alexander Korotkov Reported-by: Tom Lane, Darafei Praliaskouski Discussion: https://postgr.es/m/CAPpHfdvewmr4PcpRjrkstoNn1n2_6dL-iHRB21CCfZ0efZdBTg%40mail.gmail.com Discussion: https://postgr.es/m/CAC8Q8tLYFOpKNaPS_E7V8KtPdE%3D_TnAn16t%3DA3LuL%3DXjfOO-BQ%40mail.gmail.com
* |--- gitweb/email subject limit -----------------|-------------|Bruce Momjian2018-06-26
| | | | | | doc: PG 11 relnotes: remove channel binding from major features Also move to the source code section, and expand the paragraph
* Correct a comment on logtape.c's leader tape.Peter Geoghegan2018-06-26
| | | | | | | | randomAccess parallel tuplesorts are disallowed because the leader would try to write to its own leader tape, not because the leader would try to write to a worker tape directly. Cleanup from commit 9da0cc35284.
* Remove obsolete comment block in nbtsort.c.Peter Geoghegan2018-06-26
| | | | | | | | | | | Building a new nbtree index through incremental insertions would always be slower than our actual approach of sorting using tuplesort, assembling leaf pages from tuplesort output, and writing and WAL-logging whole pages. Remove a comment block from the Berkeley days claiming that incremental insertions might be slightly faster with presorted input. Discussion: https://postgr.es/m/CAH2-WzmKs4mLAoFgJ3yHMRYc849efc=dw+pNRb3NEog2oJoCNw@mail.gmail.com