aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
...
* Fix ssl tests for when tls-server-end-point is not supportedPeter Eisentraut2018-01-09
| | | | | | | Add a function to TestLib that allows us to check pg_config.h and then decide the expected test outcome based on that. Author: Michael Paquier <michael.paquier@gmail.com>
* Fix race condition during replication origin drop.Tom Lane2018-01-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | replorigin_drop() misunderstood the API for condition variables: it had ConditionVariablePrepareToSleep and ConditionVariableCancelSleep inside its test-and-sleep loop, rather than outside the loop as intended. The net effect is a narrow race-condition window wherein, if the process using a replication slot releases it immediately after replorigin_drop() releases the ReplicationOriginLock, replorigin_drop() would get into the condition variable's wait list too late and then wait indefinitely for a signal that won't come. Because there's a different CV for each replication slot, we can't just move the ConditionVariablePrepareToSleep call to above the test-and-sleep loop. What we can do, in the wake of commit 13db3b936, is drop the ConditionVariablePrepareToSleep call entirely. This fix depends on that commit because (at least in principle) the slot matching the target replication origin might move around, so that once in a blue moon successive loop iterations might involve different CVs. We can now cope with such a scenario, at the cost of an extra trip through the retry loop. (There are ways we could fix this bug without depending on that commit, but they're all a lot more complicated than this way.) While at it, upgrade the rather skimpy comments in this function. Back-patch to v10 where this code came in. Discussion: https://postgr.es/m/19947.1515455433@sss.pgh.pa.us
* Allow ConditionVariable[PrepareTo]Sleep to auto-switch between CVs.Tom Lane2018-01-09
| | | | | | | | | | | | | | | | | | | | | | | | | | The original coding here insisted that callers manually cancel any prepared sleep for one condition variable before starting a sleep on another one. While that's not a huge burden today, it seems like a gotcha that will bite us in future if the use of condition variables increases; anything we can do to make the use of this API simpler and more robust is attractive. Hence, allow these functions to automatically switch their attention to a different CV when required. This is safe for the same reason it was OK for commit aced5a92b to let a broadcast operation cancel any prepared CV sleep: whenever we return to the other test-and-sleep loop, we will automatically re-prepare that CV, paying at most an extra test of that loop's exit condition. Back-patch to v10 where condition variables were introduced. Ordinarily we would probably not back-patch a change like this, but since it does not invalidate any coding pattern that was legal before, it seems safe enough. Furthermore, there's an open bug in replorigin_drop() for which the simplest fix requires this. Even if we chose to fix that in some more complicated way, the hazard would remain that we might back-patch some other bug fix that requires this behavior. Patch by me, reviewed by Thomas Munro. Discussion: https://postgr.es/m/2437.1515368316@sss.pgh.pa.us
* Don't allow VACUUM VERBOSE ANALYZE VERBOSE.Robert Haas2018-01-09
| | | | | | | | | | | | There are plans to extend the syntax for ANALYZE, so we need to break the link between VacuumStmt and AnalyzeStmt. But apart from that, the syntax above is undocumented and, if discovered by users, might give the impression that the VERBOSE option for VACUUM differs from the verbose option from ANALYZE, which it does not. Nathan Bossart, reviewed by Michael Paquier and Masahiko Sawada Discussion: http://postgr.es/m/D3FC73E2-9B1A-4DB4-8180-55F57D116B4E@amazon.com
* Improve scripting language in pgbenchTeodor Sigaev2018-01-09
| | | | | | | | | | | | | | | | | | Added: - variable now might contain integer, double, boolean and null values - functions ln, exp - logical AND/OR/NOT - bitwise AND/OR/NOT/XOR - bit right/left shift - comparison operators - IS [NOT] (NULL|TRUE|FALSE) - conditional choice (in form of when/case/then) New operations and functions allow to implement more complicated test scenario. Author: Fabien Coelho with minor editorization by me Reviewed-By: Pavel Stehule, Jeevan Ladhe, me Discussion: https://www.postgresql.org/message-id/flat/alpine.DEB.2.10.1604030742390.31618@sto
* Fix comment.Robert Haas2018-01-09
| | | | | | RELATION_IS_OTHER_TEMP is tested in the caller, not here. Discussion: http://postgr.es/m/5A5438E4.3090709@lab.ntt.co.jp
* pg_upgrade: prevent check on live cluster from generating errorBruce Momjian2018-01-08
| | | | | | | | | | | Previously an inaccurate but harmless error was generated when running --check on a live server before reporting the servers as compatible. The fix is to split error reporting and exit control in the exec_prog() API. Reported-by: Daniel Westermann Backpatch-through: 10
* Cosmetic improvements in condition_variable.[hc].Tom Lane2018-01-08
| | | | | | Clarify a bunch of comments. Discussion: https://postgr.es/m/CAEepm=0NWKehYw7NDoUSf8juuKOPRnCyY3vuaSvhrEWsOTAa3w@mail.gmail.com
* Improve error detection capability in proclists.Tom Lane2018-01-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, although the initial state of a proclist_node is expected to be next == prev == 0, proclist_delete_offset would reset nodes to next == prev == INVALID_PGPROCNO when removing them from a list. This is the same state that a node in a singleton list has, so that it's impossible to distinguish not-in-a-list from in-a-list. Change proclist_delete_offset to reset removed nodes to next == prev == 0, making it possible to distinguish those cases, and then add Asserts to the list add and delete functions that the supplied node isn't or is in a list at entry. Also tighten assertions about the node being in the particular list (not some other one) where it is possible to check that in O(1) time. In ConditionVariablePrepareToSleep, since we don't expect the process's cvWaitLink to already be in a list, remove the more-or-less-useless proclist_contains check; we'd rather have proclist_push_tail's new assertion fire if that happens. Improve various comments related to proclists, too. Patch by me, reviewed by Thomas Munro. This isn't back-patchable, since there could theoretically be inlined copies of proclist_delete_offset in third-party modules. But it's only improving debuggability anyway. Discussion: https://postgr.es/m/CAEepm=0NWKehYw7NDoUSf8juuKOPRnCyY3vuaSvhrEWsOTAa3w@mail.gmail.com
* Back off chattiness in RemovePgTempFiles().Tom Lane2018-01-07
| | | | | | | | | | | | | | In commit 561885db0, as part of normalizing RemovePgTempFiles's error handling, I removed its behavior of silently ignoring ENOENT failures during directory opens. Thomas Munro points out that this is a bad idea at the top level, because we don't create pgsql_tmp directories until needed. Thus this coding could produce LOG messages in perfectly normal situations, which isn't what I intended. Restore the suppression of ENOENT logging, but only at top level --- it would still be unexpected for a nested temp directory to disappear between seeing it in the parent directory and opening it. Discussion: https://postgr.es/m/CAEepm=2y06SehAkTnd5sU_eVqdv5P-=Srt1y5vYNQk6yVDVaPw@mail.gmail.com
* Add TIMELINE to backup_label fileSimon Riggs2018-01-06
| | | | | | | Allows new test to confirm timelines match Author: Michael Paquier Reviewed-by: David Steele
* Default monitoring roles - errataSimon Riggs2018-01-06
| | | | | | | | | | | | | | | | 25fff40798fc4ac11a241bfd9ab0c45c085e2212 introduced default monitoring roles. Apply these corrections: * Allow access to pg_stat_get_wal_senders() by role pg_read_all_stats * Correct comment in pg_stat_get_wal_receiver() to show it is no longer superuser-only. Author: Feike Steenbergen Reviewed-by: Michael Paquier Apply to HEAD, then later backpatch to 10
* Remove return values of ConditionVariableSignal/Broadcast.Tom Lane2018-01-05
| | | | | | | | | | | | | | | | | | | | | | In the wake of commit aced5a92b, the semantics of these results are a bit squishy: we can tell whether we signaled some other process(es), but we do not know which ones were real waiters versus mere sentinels for ConditionVariableBroadcast operations. It does not help much that ConditionVariableBroadcast will attempt to pass on the signal to the next real waiter, because (a) there might not be one, and (b) that will only happen awhile later, anyway. So these results could overstate how much effect the calls really had. However, no existing caller of either function pays any attention to its result value, so it seems reasonable to just define that as a required property of a correct algorithm. To encourage correctness and save some tiny number of cycles, change both functions to return void. Patch by me, per an observation by Thomas Munro. No back-patch, since if any third parties happen to be using these functions, they might not appreciate an API break in a minor release. Discussion: https://postgr.es/m/CAEepm=0NWKehYw7NDoUSf8juuKOPRnCyY3vuaSvhrEWsOTAa3w@mail.gmail.com
* Reorder steps in ConditionVariablePrepareToSleep for more safety.Tom Lane2018-01-05
| | | | | | | | | | | | | | | In the admittedly-very-unlikely case that AddWaitEventToSet fails, ConditionVariablePrepareToSleep would error out after already having set cv_sleep_target, which is probably bad, and after having already set cv_wait_event_set, which is very bad. Transaction abort might or might not clean up cv_sleep_target properly; but there is nothing that would be aware that the WaitEventSet wasn't fully constructed, so that all future condition variable sleeps would be broken. We can easily guard against these hazards with slight restructuring. Back-patch to v10 where condition_variable.c was introduced. Discussion: https://postgr.es/m/CAEepm=0NWKehYw7NDoUSf8juuKOPRnCyY3vuaSvhrEWsOTAa3w@mail.gmail.com
* Rewrite ConditionVariableBroadcast() to avoid live-lock.Tom Lane2018-01-05
| | | | | | | | | | | | | | | | | | | | | | | | | | The original implementation of ConditionVariableBroadcast was, per its self-description, "the dumbest way possible". Thomas Munro found out it was a bit too dumb. An awakened process may immediately re-queue itself, if the specific condition it's waiting for is not yet satisfied. If this happens before ConditionVariableBroadcast is able to see the wait queue as empty, then ConditionVariableBroadcast will re-awaken the same process, repeating the cycle. Given unlucky timing this back-and-forth can repeat indefinitely; loops lasting thousands of seconds have been seen in testing. To fix, add our own process to the end of the wait queue to serve as a sentinel, and exit the broadcast loop once our process is not there anymore. There are various special considerations described in the comments, the principal disadvantage being that wakers can no longer be sure whether they awakened a real waiter or just a sentinel. But in practice nobody pays attention to the result of ConditionVariableSignal or ConditionVariableBroadcast anyway, so that problem seems hypothetical. Back-patch to v10 where condition_variable.c was introduced. Tom Lane and Thomas Munro Discussion: https://postgr.es/m/CAEepm=0NWKehYw7NDoUSf8juuKOPRnCyY3vuaSvhrEWsOTAa3w@mail.gmail.com
* Factor error generation out of ExecPartitionCheck.Robert Haas2018-01-05
| | | | | | | | | | | At present, we always raise an ERROR if the partition constraint is violated, but a pending patch for UPDATE tuple routing will consider instead moving the tuple to the correct partition. Refactor to make that simpler. Amit Khandekar, reviewed by Amit Langote, David Rowley, and me. Discussion: http://postgr.es/m/CAJ3gD9cue54GbEzfV-61nyGpijvjZgCcghvLsB0_nL8Nm8HzCA@mail.gmail.com
* pg_upgrade: remove C commentBruce Momjian2018-01-05
| | | | | | Revert another part of 959ee6d267fb24e667fc64e9837a376e236e84a5 . Backpatch-through: 10
* pg_upgrade: revert part of patch for ease of translationBruce Momjian2018-01-05
| | | | | | Revert part of 959ee6d267fb24e667fc64e9837a376e236e84a5 . Backpatch-through: 10
* pg_upgrade: simplify code layout in a few placesBruce Momjian2018-01-05
| | | | Backpatch-through: 9.4 (9.3 didn't need improving)
* Fix failure to delete spill files of aborted transactionsAlvaro Herrera2018-01-05
| | | | | | | | | | | | | | | | | | Logical decoding's reorderbuffer.c may spill transaction files to disk when transactions are large. These are supposed to be removed when they become "too old" by xid; but file removal requires the boundary LSNs of the transaction to be known. The final_lsn is only set when we see the commit or abort record for the transaction, but nothing sets the value for transactions that crash, so the removal code misbehaves -- in assertion-enabled builds, it crashes by a failed assertion. To fix, modify the final_lsn of transactions that don't have a value set, to the LSN of the very latest change in the transaction. This causes the spilled files to be removed appropriately. Author: Atsushi Torikoshi Reviewed-by: Kyotaro HORIGUCHI, Craig Ringer, Masahiko Sawada Discussion: https://postgr.es/m/54e4e488-186b-a056-6628-50628e4e4ebc@lab.ntt.co.jp
* Another attempt at fixing build with various OpenSSL versionsPeter Eisentraut2018-01-04
| | | | | | It seems we can't easily work around the lack of X509_get_signature_nid(), so revert the previous attempts and just disable the tls-server-end-point feature if we don't have it.
* Add missing includesPeter Eisentraut2018-01-04
| | | | | <openssl/x509.h> is necessary to look into the X509 struct, used by ac3ff8b1d8f98da38c53a701e6397931080a39cf.
* Minor preparatory refactoring for UPDATE row movement.Robert Haas2018-01-04
| | | | | | | | | | | | Generalize is_partition_attr to has_partition_attrs and make it accessible from outside tablecmds.c. Change map_partition_varattnos to clarify that it can be used for mapping between any two relations in a partitioning hierarchy, not just parent -> child. Amit Khandekar, reviewed by Amit Langote, David Rowley, and me. Some comment changes by me. Discussion: http://postgr.es/m/CAJ3gD9fWfxgKC+PfJZF3hkgAcNOy-LpfPxVYitDEXKHjeieWQQ@mail.gmail.com
* Fix build with older OpenSSL versionsPeter Eisentraut2018-01-04
| | | | | Apparently, X509_get_signature_nid() is only in fairly new OpenSSL versions, so use the lower-level interface it is built on instead.
* Simplify and encapsulate tuple routing support code.Robert Haas2018-01-04
| | | | | | | | | | | | Instead of having ExecSetupPartitionTupleRouting return multiple out parameters, have it return a pointer to a structure containing all of those different things. Also, provide and use a cleanup function, ExecCleanupTupleRouting, instead of cleaning up all of the resources allocated by ExecSetupPartitionTupleRouting individually. Amit Khandekar, reviewed by Amit Langote, David Rowley, and me Discussion: http://postgr.es/m/CAJ3gD9fWfxgKC+PfJZF3hkgAcNOy-LpfPxVYitDEXKHjeieWQQ@mail.gmail.com
* Implement channel binding tls-server-end-point for SCRAMPeter Eisentraut2018-01-04
| | | | | | | | This adds a second standard channel binding type for SCRAM. It is mainly intended for third-party clients that cannot implement tls-unique, for example JDBC. Author: Michael Paquier <michael.paquier@gmail.com>
* Refactor channel binding code to fetch cbind_data only when necessaryPeter Eisentraut2018-01-04
| | | | | | | | | | | | | | | | | | | | As things stand now, channel binding data is fetched from OpenSSL and saved into the SCRAM exchange context for any SSL connection attempted for a SCRAM authentication, resulting in data fetched but not used if no channel binding is used or if a different channel binding type is used than what the data is here for. Refactor the code in such a way that binding data is fetched from the SSL stack only when a specific channel binding is used for both the frontend and the backend. In order to achieve that, save the libpq connection context directly in the SCRAM exchange state, and add a dependency to SSL in the low-level SCRAM routines. This makes the interface in charge of initializing the SCRAM context cleaner as all its data comes from either PGconn* (for frontend) or Port* (for the backend). Author: Michael Paquier <michael.paquier@gmail.com>
* Define LDAPS_PORT if it's missing and disable implicit LDAPS on WindowsPeter Eisentraut2018-01-04
| | | | | | | | | | | | Some versions of Windows don't define LDAPS_PORT. Also, Windows' ldap_sslinit() is documented to use LDAPS even if you said secure=0 when the port number happens to be 636 or 3269. Let's avoid using the port number to imply that you want LDAPS, so that connection strings have the same meaning on Windows and Unix. Author: Thomas Munro Discussion: https://postgr.es/m/CAEepm%3D23B7GV4AUz3MYH1TKpTv030VHxD2Sn%2BLYWDv8d-qWxww%40mail.gmail.com
* Code review for Parallel Append.Robert Haas2018-01-04
| | | | | | | | | | | | | - Remove unnecessary #include mistakenly added in execnodes.h. - Fix mistake in comment in choose_next_subplan_for_leader. - Adjust row estimates in cost_append for a possibly-different parallel divisor. - Clamp row estimates in cost_append after operations that may not produce integers. Amit Kapila, with cosmetic adjustments by me. Discussion: http://postgr.es/m/CAA4eK1+qcbeai3coPpRW=GFCzFeLUsuY4T-AKHqMjxpEGZBPQg@mail.gmail.com
* Tweak parallel hash join test case in hopes of improving stability.Tom Lane2018-01-04
| | | | | | | | | This seems to make things better on gaur, let's see what the rest of the buildfarm thinks. Thomas Munro Discussion: https://postgr.es/m/CAEepm=1uuT8iJxMEsR=jL+3zEi87DB2v0+0H9o_rUXXCZPZT3A@mail.gmail.com
* Clean up tupdesc.c for recent changes.Tom Lane2018-01-03
| | | | | | | | | | | | | | | | | | | TupleDescCopy needs to have the same effects as CreateTupleDescCopy in that, since it doesn't copy constraints, it should clear the per-attribute fields associated with them. Oversight in commit cc5f81366. Since TupleDescCopy has already established the presumption that it can just flat-copy the entire attribute array in one go, propagate that approach into CreateTupleDescCopy and CreateTupleDescCopyConstr. (I'm suspicious that this would lead to valgrind complaints if we had any trailing padding in the struct, but we do not, and anyway fixing that seems like a job for a separate commit.) Add some better comments. Thomas Munro, reviewed by Vik Fearing, some additional hacking by me Discussion: https://postgr.es/m/CAEepm=0NvOGZ8B6GbQyQe2C_c2m3LKJ9w=8OMBaYRLgZ_Gw6Nw@mail.gmail.com
* Fix typoAlvaro Herrera2018-01-03
| | | | | Author: Dagfinn Ilmari Mannsåker Discussion: https://postgr.es/m/d8jefpk4jtd.fsf@dalvik.ping.uio.no
* Revert "Fix isolation test to be less timing-dependent"Alvaro Herrera2018-01-03
| | | | | | | | This reverts commit 2268e6afd596. It turned out that inconsistency in the report is still possible, so go back to the simpler formulation of the test and instead add an alternate expected output. Discussion: https://postgr.es/m/20180103193728.ysqpcp2xjnqpiep7@alvherre.pgsql
* Rename pg_rewind's copy_file_range() to avoid conflict with new linux syscall.Andres Freund2018-01-03
| | | | | | | | | | | | | | | | | Upcoming versions of glibc will contain copy_file_range(2), a wrapper around a new linux syscall for in-kernel copying of data ranges. This conflicts with pg_rewinds function of the same name. Therefore rename pg_rewinds version. As our version isn't a generic copying facility we decided to choose a rewind specific function name. Per buildfarm animal caiman and subsequent discussion with Tom Lane. Author: Andres Freund Discussion: https://postgr.es/m/20180103033425.w7jkljth3e26sduc@alap3.anarazel.de https://postgr.es/m/31122.1514951044@sss.pgh.pa.us Backpatch: 9.5-, where pg_rewind was introduced
* Fix use of config-specific libraries for Windows OpenSSLAndrew Dunstan2018-01-03
| | | | | | | | | | | Commit 614350a3 allowed for an different builds of OpenSSL libraries on Windows, but ignored the fact that the alternative builds don't have config-specific libraries. This patch fixes the Solution file to ask for the correct libraries. per offline discussions with Leonardo Cecchi and Marco Nenciarini, Backpatch to all live branches.
* Make XactLockTableWait work for transactions that are not yet self-lockedAlvaro Herrera2018-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | XactLockTableWait assumed that its xid argument has already added itself to the lock table. That assumption led to another assumption that if locking the xid has succeeded but the xid is reported as still in progress, then the input xid must have been a subtransaction. These assumptions hold true for the original uses of this code in locking related to on-disk tuples, but they break down in logical replication slot snapshot building -- in particular, when a standby snapshot logged contains an xid that's already in ProcArray but not yet in the lock table. This leads to assertion failures that can be reproduced all the way back to 9.4, when logical decoding was introduced. To fix, change SubTransGetParent to SubTransGetTopmostTransaction which has a slightly different API: it returns the argument Xid if there is no parent, and it goes all the way to the top instead of moving up the levels one by one. Also, to avoid busy-waiting, add a 1ms sleep to give the other process time to register itself in the lock table. For consistency, change ConditionalXactLockTableWait the same way. Author: Petr Jelínek Discussion: https://postgr.es/m/1B3E32D8-FCF4-40B4-AEF9-5C0E3AC57969@postgrespro.ru Reported-by: Konstantin Knizhnik Diagnosed-by: Stas Kelvich, Petr Jelínek Reviewed-by: Andres Freund, Robert Haas
* Fix some minor errors in new PHJ code.Tom Lane2018-01-03
| | | | | | | | | | | | Correct ExecParallelHashTuplePrealloc's estimate of whether the space_allowed limit is exceeded. Be more consistent about tuples that are exactly HASH_CHUNK_THRESHOLD in size (they're "small", not "large"). Neither of these things explain the current buildfarm unhappiness, but they're still bugs. Thomas Munro, per gripe by me Discussion: https://postgr.es/m/CAEepm=34PDuR69kfYVhmZPgMdy8pSA-MYbpesEN1SR+2oj3Y+w@mail.gmail.com
* Teach eval_const_expressions() to handle some more cases.Tom Lane2018-01-03
| | | | | | | | | | | | | | | | | | | Add some infrastructure (mostly macros) to make it easier to write typical cases for constant-expression simplification. Add simplification processing for ArrayRef, RowExpr, and ScalarArrayOpExpr node types, which formerly went unsimplified even if all their inputs were constants. Also teach it to simplify FieldSelect from a composite constant. Make use of the new infrastructure to reduce the amount of code needed for the existing ArrayExpr and ArrayCoerceExpr cases. One existing test case changes output as a result of the fact that RowExpr can now be folded to a constant. All the new code is exercised by existing test cases according to gcov, so I feel no need to add additional tests. Tom Lane, reviewed by Dmitry Dolgov Discussion: https://postgr.es/m/3be3b82c-e29c-b674-2163-bf47d98817b1@iki.fi
* Allow ldaps when using ldap authenticationPeter Eisentraut2018-01-03
| | | | | | | | | | | | | | | | | | | | While ldaptls=1 provides an RFC 4513 conforming way to do LDAP authentication with TLS encryption, there was an earlier de facto standard way to do LDAP over SSL called LDAPS. Even though it's not enshrined in a standard, it's still widely used and sometimes required by organizations' network policies. There seems to be no reason not to support it when available in the client library. Therefore, add support when using OpenLDAP 2.4+ or Windows. It can be configured with ldapscheme=ldaps or ldapurl=ldaps://... Add tests for both ways of requesting LDAPS and a test for the pre-existing ldaptls=1. Modify the 001_auth.pl test for "diagnostic messages", which was previously relying on the server rejecting ldaptls=1. Author: Thomas Munro Reviewed-By: Peter Eisentraut Discussion: https://postgr.es/m/CAEepm=1s+pA-LZUjQ-9GQz0Z4rX_eK=DFXAF1nBQ+ROPimuOYQ@mail.gmail.com
* Fix isolation test to be less timing-dependentAlvaro Herrera2018-01-03
| | | | | | | | I did this by adding another locking process, which makes the other two wait. This way the output should be stable enough. Per buildfarm and Andres Freund Discussion: https://postgr.es/m/20180103034445.t3utrtrnrevfsghm@alap3.anarazel.de
* Update copyright for 2018Bruce Momjian2018-01-02
| | | | Backpatch-through: certain files through 9.3
* Simplify representation of aggregate transition values a bit.Andres Freund2018-01-02
| | | | | | | | | | | | | | | | | | | Previously aggregate transition values for hash and other forms of aggregation (i.e. sort and no group by) were represented differently. Hash based aggregation used a grouping set indexed array pointing to an array of transition values, whereas other forms of aggregation used one flattened array with the index being computed out of grouping set and transition offsets. That made upcoming changes hard, so represent both as grouping set indexed array of per-group data. As a nice side-effect this also makes aggregation slightly faster, because computing offsets with `transno + (setno * numTrans)` turns out not to be that cheap (too big for x86 lea for example). Author: Andres Freund Discussion: https://postgr.es/m/20171128003121.nmxbm2ounxzb6n2t@alap3.anarazel.de
* Ensure proper alignment of tuples in HashMemoryChunkData buffers.Tom Lane2018-01-02
| | | | | | | | | | | | | The previous coding relied (without any documentation) on the data[] member of HashMemoryChunkData being at a MAXALIGN'ed offset. If it was not, the tuples would not be maxaligned either, leading to failures on alignment-picky machines. While there seems to be no live bug on any platform we support, this is clearly pretty fragile: any addition to or rearrangement of the fields in HashMemoryChunkData could break it. Let's remove the hazard by getting rid of the data[] member and instead using pointer arithmetic with an explicitly maxalign'ed offset. Discussion: https://postgr.es/m/14483.1514938129@sss.pgh.pa.us
* Fix deadlock hazard in CREATE INDEX CONCURRENTLYAlvaro Herrera2018-01-02
| | | | | | | | | | | | | | | | | | | | | | Multiple sessions doing CREATE INDEX CONCURRENTLY simultaneously are supposed to be able to work in parallel, as evidenced by fixes in commit c3d09b3bd23f specifically to support this case. In reality, one of the sessions would be aborted by a misterious "deadlock detected" error. Jeff Janes diagnosed that this is because of leftover snapshots used for system catalog scans -- this was broken by 8aa3e47510b9 keeping track of (registering) the catalog snapshot. To fix the deadlocks, it's enough to de-register that snapshot prior to waiting. Backpatch to 9.4, which introduced MVCC catalog scans. Include an isolationtester spec that 8 out of 10 times reproduces the deadlock with the unpatched code for me (Álvaro). Author: Jeff Janes Diagnosed-by: Jeff Janes Reported-by: Jeremy Finzel Discussion: https://postgr.es/m/CAMa1XUhHjCv8Qkx0WOr1Mpm_R4qxN26EibwCrj0Oor2YBUFUTg%40mail.gmail.com
* Don't cast between GinNullCategory and boolPeter Eisentraut2018-01-02
| | | | | | | | | | | | | | The original idea was that we could use an isNull-style bool array directly as a GinNullCategory array. However, the existing code already acknowledges that that doesn't really work, because of the possibility that bool as currently defined can have arbitrary bit patterns for true values. So it has to loop through the nullFlags array to set each bool value to an acceptable value. But if we are looping through the whole array anyway, we might as well build a proper GinNullCategory array instead and abandon the type casting. That makes the code much safer in case bool is ever changed to something else. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Fix EXPLAIN ANALYZE output for Parallel Hash.Andres Freund2018-01-01
| | | | | | | | | | | | | | In a race case, EXPLAIN ANALYZE could fail to display correct nbatch and size information. Refactor so that participants report only on batches they worked on rather than trying to report on all of them, and teach explain.c to consider the HashInstrumentation object from all participants instead of picking the first one it can find. This should fix an occasional build farm failure in the "join" regression test. Author: Thomas Munro Reviewed-By: Andres Freund Discussion: https://postgr.es/m/30219.1514428346%40sss.pgh.pa.us
* In tests, await an LSN no later than the recovery target.Noah Misch2017-12-31
| | | | | | | | | | Otherwise, the test fails with "Timed out while waiting for standby to catch up". This happened rarely, perhaps only when autovacuum wrote WAL between our choosing the recovery target and choosing the LSN to await. Commit b26f7fa6ae2b4e5d64525b3d5bc66a0ddccd9e24 fixed one case of this. Fix two more. Back-patch to 9.6, which introduced the affected test. Discussion: https://postgr.es/m/20180101055227.GA2952815@rfd.leadboat.com
* Merge coding of return/exit/continue cases in plpgsql's loop statements.Tom Lane2017-12-31
| | | | | | | | | | | plpgsql's five different loop control statements contained three distinct implementations of the same (or what ought to be the same, at least) logic for handling return/exit/continue result codes from their child statements. At best, that's trouble waiting to happen, and there seems no very good reason for the coding to be so different. Refactor so that all the common logic is expressed in a single macro. Discussion: https://postgr.es/m/26314.1514670401@sss.pgh.pa.us
* Improve regression tests' code coverage for plpgsql control structures.Tom Lane2017-12-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I noticed that our code coverage report showed considerable deficiency in test coverage for PL/pgSQL control statements. Notably, both exec_stmt_block and most of the loop control statements had very poor coverage of handling of return/exit/continue result codes from their child statements; and exec_stmt_fori was seriously lacking in feature coverage, having no test that exercised its BY or REVERSE features, nor verification that its overflow defenses work. Now that we have some infrastructure for plpgsql-specific test scripts, the natural thing to do is make a new script rather than further extend plpgsql.sql. So I created a new script plpgsql_control.sql with the charter to test plpgsql control structures, and moved a few existing tests there because they fell entirely under that charter. I then added new test cases that exercise the bits of code complained of above. Of the five kinds of loop statements, only exec_stmt_while's result code handling is fully exercised by these tests. That would be a deficiency as things stand, but a follow-on commit will merge the loop statements' result code handling into one implementation. So testing each usage of that implementation separately seems redundant. In passing, also add a couple test cases to plpgsql.sql to more fully exercise plpgsql's code related to expanded arrays --- I had thought that area was sufficiently covered already, but the coverage report showed a couple of un-executed code paths. Discussion: https://postgr.es/m/26314.1514670401@sss.pgh.pa.us
* Fix typoAlvaro Herrera2017-12-29
|