aboutsummaryrefslogtreecommitdiff
path: root/src/test/isolation/isolationtester.c
Commit message (Collapse)AuthorAge
* Improve isolationtester's timeout management.Tom Lane2019-12-09
| | | | | | | | | | | | | | | | | | | | | | | isolationtester.c had a hard-wired limit of 3 minutes per test step. It now emerges that this isn't quite enough for some of the slowest buildfarm animals. This isn't the first time we've had to raise this limit (cf. 1db439ad4), so let's make it configurable. This patch raises the default to 5 minutes, and introduces an environment variable PGISOLATIONTIMEOUT that can be set if more time is needed, following the precedent of PGCTLTIMEOUT. Also, modify isolationtester so that when the timeout is hit, it explicitly reports having sent a cancel. This makes the regression failure log considerably more intelligible. (In the worst case, a timed-out test might actually be reported as "passing" without this extra output, so arguably this is a bug fix in itself.) In passing, update the README file, which had apparently not gotten touched when we added "make check" support here. Back-patch to 9.6; older versions don't have comparable timeout logic. Discussion: https://postgr.es/m/22964.1575842935@sss.pgh.pa.us
* Improve test coverage for LISTEN/NOTIFY.Tom Lane2019-11-23
| | | | | | | | | | | | | | | | | Back-patch commit b10f40bf0 into older branches. This adds reporting of NOTIFY messages to isolationtester.c, and extends the async-notify test to include direct tests of basic NOTIFY functionality. This provides useful infrastructure for testing a bug fix I'm about to back-patch, and there seems no good reason not to have better tests of LISTEN/NOTIFY in the back branches. The commit's survived long enough in HEAD to make it unlikely that it will cause problems. Back-patch as far as 9.6. isolationtester.c changed too much in 9.6 to make it sane to try to fix older branches this way, and I don't really want to back-patch those changes too. Discussion: https://postgr.es/m/31304.1564246011@sss.pgh.pa.us
* Don't drop NOTICE messages in isolation tests.Tom Lane2019-09-10
| | | | | | | | | | | | | | | | | | | | | | For its entire existence, isolationtester.c has forced client_min_messages to WARNING, but that seems like a very poor choice of test design. It should be up to individual test scripts to manage whether they emit notices and to ensure that the results are stable. (There were no NOTICE messages in the original set of isolation tests, so this was certainly dead code when committed, but perhaps it was needed at some earlier point.) It's possible that the original motivation was due to platform-dependent variations in the timing of stdout vs. stderr output. That should be moot since commits 73bcb76b7/6eda3e9c2, but just in case, adjust isotesterNoticeProcessor to print to stdout not stderr. (stderr seems like the wrong thing anyway: it should be for error printouts not expected test output.) Back-patch of commit ebd499282 into v12. I'll separately push this into older branches, but this is as much change as v12 needs. Discussion: https://postgr.es/m/14616.1564251339@sss.pgh.pa.us Discussion: https://postgr.es/m/E1i7IqC-0000Uc-5H@gemulon.postgresql.org
* Fix isolationtester race condition for notices sent before blocking.Tom Lane2019-09-09
| | | | | | | | | | | | | | | | | | | | If a test sends a notice just before blocking, it's possible on slow machines for isolationtester to detect the blocked state before it's consumed the notice. (For this to happen, the notice would have to arrive after isolationtester has waited for data for 10ms, so on fast/lightly-loaded machines it's hard to reproduce the failure.) But, if we have seen the backend as blocked, it's certainly already sent any notices it's going to send. Therefore, one more round of PQconsumeInput and PQisBusy should be enough to collect and process any such notices. Back-patch of 30717637c into v12. We're still discussing whether to back-patch this further and/or back-patch some other recent isolationtester fixes, but this much is provably necessary to make the test cases added by 27cc7cd2b stable in v12. Discussion: https://postgr.es/m/14616.1564251339@sss.pgh.pa.us Discussion: https://postgr.es/m/E1i7IqC-0000Uc-5H@gemulon.postgresql.org
* Phase 2 pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | | Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* isolationtester: Use atexit()Peter Eisentraut2019-01-07
| | | | | | | | Replace exit_nicely() calls with standard exit() and register the cleanup actions using atexit(). Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/ec4135ba-84e9-28bf-b584-0e78d47448d5@2ndquadrant.com/
* Raise some timeouts to 180s, in test code.Noah Misch2018-12-10
| | | | | | | | | | | | Slow runs of buildfarm members chipmunk, hornet and mandrill saw the shorter timeouts expire. The 180s timeout in poll_query_until has been trouble-free since 2a0f89cd717ce6d49cdc47850577823682167e87 introduced it two years ago, so use 180s more widely. Back-patch to 9.6, where the first of these timeouts was introduced. Reviewed by Michael Paquier. Discussion: https://postgr.es/m/20181209001601.GC2973271@rfd.leadboat.com
* Add some missing schema qualificationsMichael Paquier2018-12-03
| | | | | | | | | This does not improve the security and reliability of the touched areas, but it makes the style more consistent. Author: Michael Paquier Reviewed-by- Noah Misch Discussion: https://postgr.es/m/20180309075538.GD9376@paquier.xyz
* Indicate session name in isolationtester noticesAlvaro Herrera2018-11-09
| | | | | | | | | | | When a session under isolationtester produces printable notices (NOTICE, WARNING) we were just printing them unadorned, which can be confusing when debugging. Prefix them with the session name, which makes things clearer. Author: Álvaro Herrera Reviewed-by: Hari Babu Kommi Discussion: https://postgr.es/m/20181024213451.75nh3f3dctmcdbfq@alvherre.pgsql
* Fix minor bug in isolationtester.Tom Lane2018-10-17
| | | | | | | | | | | | | | If the lock wait query failed, isolationtester would report the PQerrorMessage from some other connection, meaning there would be no message or an unrelated one. This seems like a pretty unlikely occurrence, but if it did happen, this bug could make it really difficult/confusing to figure out what happened. That seems to justify patching all the way back. In passing, clean up another place where the "wrong" conn was used for an error report. That one's not actually buggy because it's a different alias for the same connection, but it's still confusing to the reader.
* Mark some more functions as pg_attribute_noreturn().Tom Lane2017-11-27
| | | | | | | | Doing this suppresses Coverity warnings and might allow improved code in some cases. The prospects of that are not so bright as to warrant back-patching, though. Michael Paquier, per Coverity
* Phase 3 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Move isolationtester's is-blocked query into C code for speed.Tom Lane2017-04-10
| | | | | | | | | | | | | | | | | | | Commit 4deb41381 modified isolationtester's query to see whether a session is blocked to also check for waits occurring in GetSafeSnapshot. However, it did that in a way that enormously increased the query's runtime under CLOBBER_CACHE_ALWAYS, causing the buildfarm members that use that to run about four times slower than before, and in some cases fail entirely. To fix, push the entire logic into a dedicated backend function. This should actually reduce the CLOBBER_CACHE_ALWAYS runtime from what it was previously, though I've not checked that. In passing, expose a SQL function to check for safe-snapshot blockage, comparable to pg_blocking_pids. This is more or less free given the infrastructure built to solve the other problem, so we might as well. Thomas Munro Discussion: https://postgr.es/m/20170407165749.pstcakbc637opkax@alap3.anarazel.de
* Add isolation test for SERIALIZABLE READ ONLY DEFERRABLE.Kevin Grittner2017-04-05
| | | | | | | | This improves code coverage and lays a foundation for testing similar issues in a distributed environment. Author: Thomas Munro <thomas.munro@enterprisedb.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Remove useless duplicate inclusions of system header files.Tom Lane2017-02-25
| | | | | | | | | | | | | | | | c.h #includes a number of core libc header files, such as <stdio.h>. There's no point in re-including these after having read postgres.h, postgres_fe.h, or c.h; so remove code that did so. While at it, also fix some places that were ignoring our standard pattern of "include postgres[_fe].h, then system header files, then other Postgres header files". While there's not any great magic in doing it that way rather than system headers last, it's silly to have just a few files deviating from the general pattern. (But I didn't attempt to enforce this globally, only in files I was touching anyway.) I'd be the first to say that this is mostly compulsive neatnik-ism, but over time it might save enough compile cycles to be useful.
* Fix a bunch of places that called malloc and friends with no NULL check.Tom Lane2016-08-30
| | | | | | | | | | | | | | | Where possible, use palloc or pg_malloc instead; otherwise, insert explicit NULL checks. Generally speaking, these are places where an actual OOM is quite unlikely, either because they're in client programs that don't allocate all that much, or they're very early in process startup so that we'd likely have had a fork() failure instead. Hence, no back-patch, even though this is nominally a bug fix. Michael Paquier, with some adjustments by me Discussion: <CAB7nPqRu07Ot6iht9i9KRfYLpDaF2ZuUv5y_+72uP23ZAGysRg@mail.gmail.com>
* Use memmove() not memcpy() to slide some pointers down.Tom Lane2016-04-27
| | | | | | | | | The previous coding here was formally undefined, though it seems to accidentally work on most platforms in the buildfarm. Caught by some OpenBSD platforms in which libc contains an assertion check for overlapping areas passed to memcpy(). Thomas Munro
* Handle invalid libpq sockets in more placesPeter Eisentraut2016-03-08
| | | | | | Also, make error messages consistent. From: Michael Paquier <michael.paquier@gmail.com>
* Create a function to reliably identify which sessions block which others.Tom Lane2016-02-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces "pg_blocking_pids(int) returns int[]", which returns the PIDs of any sessions that are blocking the session with the given PID. Historically people have obtained such information using a self-join on the pg_locks view, but it's unreasonably tedious to do it that way with any modicum of correctness, and the addition of parallel queries has pretty much broken that approach altogether. (Given some more columns in the view than there are today, you could imagine handling parallel-query cases with a 4-way join; but ugh.) The new function has the following behaviors that are painful or impossible to get right via pg_locks: 1. Correctly understands which lock modes block which other ones. 2. In soft-block situations (two processes both waiting for conflicting lock modes), only the one that's in front in the wait queue is reported to block the other. 3. In parallel-query cases, reports all sessions blocking any member of the given PID's lock group, and reports a session by naming its leader process's PID, which will be the pg_backend_pid() value visible to clients. The motivation for doing this right now is mostly to fix the isolation tests. Commit 38f8bdcac4982215beb9f65a19debecaf22fd470 lobotomized isolationtester's is-it-waiting query by removing its ability to recognize nonconflicting lock modes, as a crude workaround for the inability to handle soft-block situations properly. But even without the lock mode tests, the old query was excessively slow, particularly in CLOBBER_CACHE_ALWAYS builds; some of our buildfarm animals fail the new deadlock-hard test because the deadlock timeout elapses before they can probe the waiting status of all eight sessions. Replacing the pg_locks self-join with use of pg_blocking_pids() is not only much more correct, but a lot faster: I measure it at about 9X faster in a typical dev build with Asserts, and 3X faster in CLOBBER_CACHE_ALWAYS builds. That should provide enough headroom for the slower CLOBBER_CACHE_ALWAYS animals to pass the test, without having to lengthen deadlock_timeout yet more and thus slow down the test for everyone else.
* Revert "isolationtester: don't repeat the is-it-waiting query when retrying ↵Tom Lane2016-02-12
| | | | | | | | a step." This mostly reverts commit 9c9782f066e0ce5424b8706df2cce147cb78170f. I left in the parts that rearranged removal of completed waiting steps; but the idea of not rechecking a step's blocked-ness isn't working.
* isolationtester: don't repeat the is-it-waiting query when retrying a step.Tom Lane2016-02-12
| | | | | | | | | | | | | | | | | | If we're retrying a step, then we already decided it was blocked on a lock, and there's no need to recheck that. The original coding of commit 38f8bdcac4982215beb9f65a19debecaf22fd470 resulted in a large number of is-it-waiting queries when dealing with multiple concurrently-blocked sessions, which is fairly pointless and also results in test failures in CLOBBER_CACHE_ALWAYS builds, where the is-it-waiting query is quite slow. This definition also permits appending pg_sleep() calls to steps where it's needed to control the order of finish of concurrent steps. Before, that did not work nicely because we'd decide that a step performing a sleep was not blocked and hang up waiting for it to finish, rather than noticing the completion of the concurrent step we're supposed to notice first. In passing, revise handling of removal of completed waiting steps to make it a bit less messy.
* Re-pgindent isolationtester.c.Tom Lane2016-02-12
| | | | | Need to do some more hacking on this, and got annoyed that it's not indent clean.
* Fix whitespacePeter Eisentraut2016-02-12
|
* Code review for isolationtester changes.Tom Lane2016-02-11
| | | | | | | Fix a few oversights in 38f8bdcac4982215beb9f65a19debecaf22fd470: don't leak memory in run_permutation(), remember when we've issued a cancel rather than issuing another one every 10ms, fix some typos in comments.
* Modify the isolation tester so that multiple sessions can wait.Robert Haas2016-02-11
| | | | | | This allows testing of deadlock scenarios. Scenarios that would previously have been considered invalid are now simply taken as a scenario in which more than one backend will wait.
* Reject isolation test specifications with duplicate step names.Robert Haas2015-08-14
| | | | | alter-table-1.spec has such a case, so change one instance of step rx1 to rx3 instead.
* pgindent run for 9.4Bruce Momjian2014-05-06
| | | | | This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
* Centralize getopt-related declarations in a new header file pg_getopt.h.Tom Lane2014-02-15
| | | | | | | | | | | | We used to have externs for getopt() and its API variables scattered all over the place. Now that we find we're going to need to tweak the variable declarations for Cygwin, it seems like a good idea to have just one place to tweak. In this commit, the variables are declared "#ifndef HAVE_GETOPT_H". That may or may not work everywhere, but we'll soon find out. Andres Freund
* isolationtester: Ensure stderr is unbuffered, tooAlvaro Herrera2013-12-19
|
* Make stdout unbufferedAlvaro Herrera2013-12-19
| | | | | | | | This ensures that all stdout output is flushed immediately, to match stderr. This eliminates the need for fflush(stdout) calls sprinkled all over the place. Per Daniel Wood in message 519A79C6.90308@salesforce.com
* Replace appendPQExpBuffer(..., <constant>) with appendPQExpBufferStrHeikki Linnakangas2013-11-18
| | | | | | | Arguably makes the code a bit more readable, and might give a small performance gain. David Rowley
* Fix pg_isolation_regress to work outside its build directory.Robert Haas2013-11-08
| | | | | | | This makes it possible to, for example, use the isolation tester to test a contrib module. Andres Freund
* Replace pg_asprintf() with psprintf().Tom Lane2013-10-22
| | | | | | This eliminates an awkward coding pattern that's also unnecessarily inconsistent with backend coding. psprintf() is now the thing to use everywhere.
* Add use of asprintf()Peter Eisentraut2013-10-13
| | | | | | | | | Add asprintf(), pg_asprintf(), and psprintf() to simplify string allocation and composition. Replacement implementations taken from NetBSD. Reviewed-by: Álvaro Herrera <alvherre@2ndquadrant.com> Reviewed-by: Asif Naeem <anaeem.it@gmail.com>
* isolationtester: Allow tuples to be returned in more placesAlvaro Herrera2013-10-04
| | | | | | Previously, isolationtester would forbid returning tuples in session-specific teardown (but not global teardown), as well as in global setup. Allow these places to return tuples, too.
* In isolationtester, retry after EINTR return from select(2).Tom Lane2013-04-06
| | | | | Per report from Jaime Casanova. Very curious that no one else has seen this failure ... but the code is clearly wrong as-is.
* Minor robustness improvements for isolationtester.Tom Lane2013-04-02
| | | | | | | | | | | | | Notice and complain about PQcancel() failures. Also, don't dump core if an error PGresult doesn't contain severity and message subfields, as it might not if it was generated by libpq itself. (We have a longstanding TODO item to improve that, but in the meantime isolationtester had better cope.) I tripped across the latter item while investigating a trouble report on buildfarm member spoonbill. As for the former, there's no evidence that PQcancel failure is actually involved in spoonbill's problem, but it still seems like a bad idea to ignore an error return code.
* Flush stderr and stdout in isolation tester.Andrew Dunstan2013-02-27
| | | | | This is a possibly vain attempt to fix a buffering issue observed for some MSVC builds.
* isolationtester: add a few fflush(stderr) callsAlvaro Herrera2013-01-23
| | | | | | The lack of them is causing failures in some BF members. Per Andrew Dunstan.
* Improve concurrency of foreign key lockingAlvaro Herrera2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety. Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch. The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers. Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish. Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, also (there was previously a single WAL record for a locked tuple; now there are as many as updated copies of the tuple there exist.) With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple and later aborting caused the weaker lock to be lost, has been fixed. Many new spec files were added for isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests. There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time in it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund. This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids: AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com 1290721684-sup-3951@alvh.no-ip.org 1294953201-sup-2099@alvh.no-ip.org 1320343602-sup-2290@alvh.no-ip.org 1339690386-sup-8927@alvh.no-ip.org 4FE5FF020200002500048A3D@gw.wicourts.gov 4FEAB90A0200002500048B7D@gw.wicourts.gov
* Allow isolation tests to specify multiple setup blocks.Kevin Grittner2012-09-04
| | | | | | | | | | Each setup block is run as a single PQexec submission, and some statements such as VACUUM cannot be combined with others in such a block. Backpatch to 9.2. Kevin Grittner and Tom Lane
* Run pgindent on 9.2 source tree in preparation for first 9.3Bruce Momjian2012-06-10
| | | | commit-fest.
* Add simple tests of EvalPlanQual using the isolationtester infrastructure.Tom Lane2012-01-28
| | | | | | | | | | | | Much more could be done here, but at least now we have *some* automated test coverage of that mechanism. In particular this tests the writable-CTE case reported by Phil Sorber. In passing, remove isolationtester's arbitrary restriction on the number of steps in a permutation list. I used this so that a single spec file could be used to run several related test scenarios, but there are other possible reasons to want a step series that's not exactly a permutation. Improve documentation and fix a couple other nits as well.
* Detect invalid permutations in isolationtesterAlvaro Herrera2012-01-14
| | | | | | | | isolationtester is now able to continue running other permutations when it detects that one of them is invalid, which is useful during initial development of spec files. Author: Alexander Shulgin
* Avoid NULL pointer dereference in isolationtesterAlvaro Herrera2012-01-14
|
* Validate number of steps specified in permutationAlvaro Herrera2012-01-11
| | | | | | A permutation that specifies more steps than defined causes isolationtester to crash, so avoid that. Using less steps than defined should probably not be a problem, but no spec currently does that.
* Unbreak isolationtester on Win32Alvaro Herrera2011-11-04
| | | | | | | I broke it in a previous commit because I neglected to install the necessary incantations to have getopt() work on Windows. Per red blots in buildfarm.
* Implement a dry-run mode for isolationtesterAlvaro Herrera2011-11-03
| | | | | | | | This mode prints out the permutations that would be run by the given spec file, in the same format used by the permutation lines in spec files. This helps in building new spec files. Author: Alexander Shulgin, with some tweaks by me
* Add debugging aid in isolationtesterAlvaro Herrera2011-10-24
|
* Remove dependency on error ordering in isolation testsAlvaro Herrera2011-09-27
| | | | | | We now report errors reported by the just-unblocked and unblocking transactions identically; this should fix relatively common buildfarm failures reported by animals that are failing the "wrong" session.