path: root/src
Commit message (Author, Age)
* Add more debugging information with log checks in TAP tests of pgbench (Michael Paquier, 2021-06-25)
| | | | | | | | | | | | fairywren is not happy with the pattern checks introduced by c13585f. I am not sure if this outlines a bug in pgbench or if the regex patterns used in the tests are too restrictive for this buildfarm member's environment. This adds more debugging information to show the log entries that do not match with the expected pattern, to help in finding out what's happening. That seems like a good addition in the long-term anyway as that may not be the only issue in this area. Discussion: https://postgr.es/m/YNUad2HvgW+6eXyo@paquier.xyz
* doc: Move remove_temp_files_after_crash to section for developer options (Michael Paquier, 2021-06-25)
| | | | | | | | | | | The main goal of this option is to allow inspecting temporary files for debugging purposes, so moving the parameter there is natural. Oversight in cd91de0. Reported-by: Justin Pryzby Author: Euler Taveira Discussion: https://postgr.es/m/20210612004347.GP16435@telsasoft.com
* Prepare for forthcoming LLVM 13 API change. (Thomas Munro, 2021-06-25)
| | | | | | | | | | | | | | | | | LLVM 13 (due out in September) has changed the semantics of LLVMOrcAbsoluteSymbols(), so we need to bump some reference counts to avoid a double-free that causes crashes and bad query results. A proactive change seems necessary to avoid having a window of time where our respective latest releases would interact badly. It's possible that the situation could change before then, though. Thanks to Fabien Coelho for monitoring bleeding edge LLVM and Andres Freund for tracking down the change. Back-patch to 11, where the JIT code arrived. Discussion: https://postgr.es/m/CA%2BhUKGLEy8mgtN7BNp0ooFAjUedDTJj5dME7NxLU-m91b85siA%40mail.gmail.com
* Fix pattern matching logic for logs in TAP tests of pgbench (Michael Paquier, 2021-06-25)
| | | | | | | | | | | | | | | | The logic checking for the format of per-thread logs used grep() with directly "$re", which would cause the test to consider all the logs as a match without caring about their format at all. Using "/$re/" makes grep() perform a regex test, which is what we want here. While on it, improve some of the tests to be more picky with the patterns expected and add more comments to describe the tests. Issue discovered while digging into a separate patch. Author: Fabien Coelho, Michael Paquier Discussion: https://postgr.es/m/YNPsPAUoVDCpPOGk@paquier.xyz Backpatch-through: 11
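The pitfall fixed here is specific to Perl's grep(), which accepts either an expression or a pattern. As a rough analogy only (the actual tests are Perl, and the log line format and all names below are made up for illustration), the same mistake transliterated to Python amounts to filtering on the truthiness of the pattern string instead of on a regex match:

```python
import re

# Hypothetical per-thread log lines; the second one is deliberately malformed.
LOG_LINES = ["0 1 1234 0 1624600000 123456", "not a log line"]
PATTERN = r"^\d+ \d+ \d+ \d+ \d+ \d+$"


def matches_buggy(lines, pattern):
    # Analogous to grep("$re", @lines): the non-empty pattern string is
    # merely truthy, so every line is kept regardless of its content.
    return [line for line in lines if pattern]


def matches_fixed(lines, pattern):
    # Analogous to grep(/$re/, @lines): each line is tested against the regex.
    return [line for line in lines if re.search(pattern, line)]
```

With the buggy form a format check can never fail, which is why the tests passed without looking at the log format at all.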
* Another fix to relmapper race condition. (Heikki Linnakangas, 2021-06-24)
| | | | | | | | In previous commit, I missed that relmap_redo() was also not acquiring the RelationMappingLock. Thanks to Thomas Munro for pointing that out. Backpatch-through: 9.6, like previous commit. Discussion: https://www.postgresql.org/message-id/CA%2BhUKGLev%3DPpOSaL3WRZgOvgk217et%2BbxeJcRr4eR-NttP1F6Q%40mail.gmail.com
* Prevent race condition while reading relmapper file. (Heikki Linnakangas, 2021-06-24)
| | | | | | | | | | | Contrary to the comment here, POSIX does not guarantee atomicity of a read(), if another process calls write() concurrently. Or at least Linux does not. Add locking to load_relmap_file() to avoid the race condition. Fixes bug #17064. Thanks to Alexander Lakhin for the report and test case. Backpatch-through: 9.6, all supported versions. Discussion: https://www.postgresql.org/message-id/17064-bb0d7904ef72add3@postgresql.org
* Allow non-quoted identifiers as isolation test session/step names. (Tom Lane, 2021-06-23)
| | | | | | | | | | | | | | | | | | | | | | | | | | For no obvious reason, isolationtester has always insisted that session and step names be written with double quotes. This is fairly tedious and does little for test readability, especially since the names that people actually choose almost always look like normal identifiers. Hence, let's tweak the lexer to allow SQL-like identifiers not only double-quoted strings. (They're SQL-like, not exactly SQL, because I didn't add any case-folding logic. Also there's no provision for U&"..." names, not that anyone's likely to care.) There is one incompatibility introduced by this change: if you write "foo""bar" with no space, that used to be taken as two identifiers, but now it's just one identifier with an embedded quote mark. I converted all the src/test/isolation/ specfiles to remove unnecessary double quotes, but stopped there because my eyes were glazing over already. Like 741d7f104, back-patch to all supported branches, so that this isn't a stumbling block for back-patching isolation test changes. Discussion: https://postgr.es/m/759113.1623861959@sss.pgh.pa.us
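The lexing rule described above can be sketched as a small tokenizer. This is not the isolationtester lexer: the exact identifier character set is an assumption, and `tokenize_names` is a hypothetical helper written only to show how `"foo""bar"` becomes a single name with an embedded quote mark.

```python
import re

# Bare identifier: letter or underscore, then letters, digits, underscores
# (an assumed character set). Quoted name: double quotes, where "" stands
# for one embedded quote mark.
_TOKEN = re.compile(r'\s*(?:([A-Za-z_][A-Za-z0-9_]*)|"((?:[^"]|"")*)")')


def tokenize_names(text):
    """Split text into names, accepting bare or double-quoted forms."""
    names, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = _TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        bare, quoted = m.group(1), m.group(2)
        # No case folding is applied, in line with the commit's description.
        names.append(bare if bare is not None else quoted.replace('""', '"'))
        pos = m.end()
    return names
```

Writing the two quoted names with a space between them, as in `"foo" "bar"`, still yields two names.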
* Don't assume GSSAPI result strings are null-terminated. (Tom Lane, 2021-06-23)
| | | | | | | | | | | | | | | | | | | | | | | | | Our uses of gss_display_status() and gss_display_name() assumed that the gss_buffer_desc strings returned by those functions are null-terminated. It appears that they generally are, given the lack of field complaints up to now. However, the available documentation does not promise this, and some man pages for gss_display_status() show examples that rely on the gss_buffer_desc.length field instead of expecting null termination. Also, we now have a report that on some implementations, clang's address sanitizer is of the opinion that the byte after the specified length is undefined. Hence, change the code to rely on the length field instead. This might well be cosmetic rather than fixing any real bug, but it's hard to be sure, so back-patch to all supported branches. While here, also back-patch the v12 changes that made pg_GSS_error deal honestly with multiple messages available from gss_display_status. Per report from Sudheer H R. Discussion: https://postgr.es/m/5372B6D4-8276-42C0-B8FB-BD0918826FC3@tekenlight.com
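The difference between trusting a terminator and trusting the length field can be simulated without any GSSAPI calls. In this sketch a bytes object stands in for process memory, and the message text, offsets, and function names are all invented for illustration:

```python
# Simulated memory: a 14-byte message followed by unrelated bytes, with the
# first NUL only appearing later. The message itself is not NUL-terminated.
MEMORY = b"No credentials" + b"\xaa\xbb\x00"
MESSAGE_START = 0
MESSAGE_LENGTH = 14  # what a length field like gss_buffer_desc.length reports


def text_assuming_nul(memory, start):
    # Reads up to the first NUL byte, the way C string functions would;
    # this can run past the end of the declared buffer.
    end = memory.index(b"\0", start)
    return memory[start:end]


def text_using_length(memory, start, length):
    # Reads exactly the number of bytes the length field promises.
    return memory[start:start + length]
```

Only the length-based read is guaranteed to stay inside the buffer, which is the property the commit relies on.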
* Improve display of query results in isolation tests. (Tom Lane, 2021-06-23)
| | | | | | | | | | | | | | | | | | | Previously, isolationtester displayed SQL query results using some ad-hoc code that clearly hadn't had much effort expended on it. Field values longer than 14 characters weren't separated from the next field, and usually caused misalignment of the columns too. Also there was no visual separation of a query's result from subsequent isolationtester output. This made test result files confusing and hard to read. To improve matters, let's use libpq's PQprint() function. Although that's long since unused by psql, it's still plenty good enough for the purpose here. Like 741d7f104, back-patch to all supported branches, so that this isn't a stumbling block for back-patching isolation test changes. Discussion: https://postgr.es/m/582362.1623798221@sss.pgh.pa.us
* Add test case for obsoleting slot with active walsender, take 2 (Alvaro Herrera, 2021-06-23)
| | | | | | | | | | | | | | | | | | | | | | The code to signal a running walsender when its reserved WAL size grows too large is completely uncovered before this commit; this adds coverage for that case. This test involves sending SIGSTOP to walsender and walreceiver, then advancing enough WAL for a checkpoint to trigger, then sending SIGCONT. There's no precedent for STOP signalling in Perl tests, and my reading of relevant manpages says it's likely to fail on Windows. Because of this, this test is always skipped on that platform. This version fixes a couple of rarely hit race conditions in the previous attempt 09126984a263; most notably, both LOG string searches are loops, not just the second one; we acquire the start-of-log position before STOP-signalling; and reference the correct process name in the test description. All per Tom Lane. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/202106102202.mjw4huiix7lo@alvherre.pgsql
* Use annotations to reduce instability of isolation-test results. (Tom Lane, 2021-06-22)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've long contended with isolation test results that aren't entirely stable. Some test scripts insert long delays to try to force stable results, which is not terribly desirable; but other erratic failure modes remain, causing unrepeatable buildfarm failures. I've spent a fair amount of time trying to solve this by improving the server-side support code, without much success: that way is fundamentally unable to cope with diffs that stem from chance ordering of arrival of messages from different server processes. We can improve matters on the client side, however, by annotating the test scripts themselves to show the desired reporting order of events that might occur in different orders. This patch adds three types of annotations to deal with (a) test steps that might or might not complete their waits before the isolationtester can see them waiting; (b) test steps in different sessions that can legitimately complete in either order; and (c) NOTIFY messages that might arrive before or after the completion of a step in another session. We might need more annotation types later, but this seems to be enough to deal with the instabilities we've seen in the buildfarm. It also lets us get rid of all the long delays that were previously used, cutting more than a minute off the runtime of the isolation tests. Back-patch to all supported branches, because the buildfarm instabilities affect all the branches, and because it seems desirable to keep isolationtester's capabilities the same across all branches to simplify possible future back-patching of tests. Discussion: https://postgr.es/m/327948.1623725828@sss.pgh.pa.us
* Restore the portal-level snapshot for simple expressions, too. (Tom Lane, 2021-06-22)
| | | | | | | | | | | | | | | | | | Commits 84f5c2908 et al missed the need to cover plpgsql's "simple expression" code path. If the first thing we execute after a COMMIT/ROLLBACK is one of those, rather than a full-fledged SPI command, we must explicitly do EnsurePortalSnapshotExists() to make sure we have an outer snapshot. Note that it wouldn't be good enough to just push a snapshot for the duration of the expression execution: what comes back might be toasted, so we'd better have a snapshot protecting it. The test case demonstrating this fact cheats a bit by marking a SQL function immutable even though it fetches from a table. That's nothing that users haven't been seen to do, though. Per report from Jim Nasby. Back-patch to v11, like the previous fix. Discussion: https://postgr.es/m/378885e4-f85f-fc28-6c91-c4d1c080bf26@amazon.com
* Add list of ignorable pgindent commits for git-blame. (Peter Geoghegan, 2021-06-22)
Add a .git-blame-ignore-revs file with a list of pgindent, pgperltidy, and reformat-dat-files commit hashes. Postgres hackers who configure git to use the ignore file will get git-blame output that avoids attributing line changes to the ignored indent commits. This makes git-blame output much easier to work with in practice.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAH2-Wz=cVh3GHTP6SdLU-Gnmt2zRdF8vZkcrFdSzXQ=WhbWm9Q@mail.gmail.com
* Use correct horizon when vacuuming catalog relations. (Andres Freund, 2021-06-21)
| | | | | | | | | | | | | | | | | | | | | | | | In dc7420c2c92 I (Andres) accidentally used RelationIsAccessibleInLogicalDecoding() as the sole condition to use the non-shared catalog horizon in GetOldestNonRemovableTransactionId(). That is incorrect, as RelationIsAccessibleInLogicalDecoding() checks whether wal_level is logical. The correct check, as done e.g. in GlobalVisTestFor(), is to check IsCatalogRelation() and RelationIsAccessibleInLogicalDecoding(). The observed misbehavior of this bug was that there could be an endless loop in lazy_scan_prune(), because the horizons used in heap_page_prune() and the individual tuple liveliness checks did not match. Likely there are other potential consequences as well. A later commit will unify the determination which horizon has to be used, and add additional assertions to make it easier to catch a bug like this. Reported-By: Justin Pryzby <pryzby@telsasoft.com> Diagnosed-By: Matthias van de Meent <boekewurm+postgres@gmail.com> Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAEze2Wg32Y9+WJfw=aofkRx1ZRFt_Ev6bNPc4PSaz7PjSFtZgQ@mail.gmail.com
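The condition change can be sketched with plain booleans. This is a simplified model, not the server code: shared relations are left out, the two flags merely stand in for IsCatalogRelation() and RelationIsAccessibleInLogicalDecoding(), and reading the corrected check as "either condition holds" is an interpretation of the message above.

```python
from dataclasses import dataclass


@dataclass
class Rel:
    # Stand-in for IsCatalogRelation().
    is_catalog: bool = False
    # Stand-in for RelationIsAccessibleInLogicalDecoding(), which is false
    # whenever wal_level is below logical.
    accessible_in_logical_decoding: bool = False


def horizon_buggy(rel):
    # Sole condition: a system catalog is misclassified when wal_level is
    # not logical.
    return "catalog" if rel.accessible_in_logical_decoding else "data"


def horizon_fixed(rel):
    # Either condition selects the catalog horizon.
    if rel.is_catalog or rel.accessible_in_logical_decoding:
        return "catalog"
    return "data"
```

A catalog relation on a server without logical decoding is the case where the two versions disagree, matching the mismatch between pruning and tuple checks described above.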
* Fix assert failure in expand_grouping_sets (David Rowley, 2021-06-21)
| | | | | | | | | | | | | | | | | linitial_node() fails in assert enabled builds if the given pointer is not of the specified type. Here the type is IntList. The code thought it should be expecting List, but it was wrong. In the existing tests which run this code the initial list element is always NIL. Since linitial_node() allows NULL, we didn't trigger any assert failures in the existing regression tests. There is still some discussion as to whether we need a few more tests in this area, but for now, since beta2 is looming, fix the bug first. Bug: #17067 Discussion: https://postgr.es/m/17067-665d50fa321f79e0@postgresql.org Reported-by: Yaoguang Chen
* Translation updates (Peter Eisentraut, 2021-06-21)
| | | | | Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 70796ae860c444c764bb591c885f22cac1c168ec
* Remove overzealous VACUUM failsafe assertions. (Peter Geoghegan, 2021-06-20)
| | | | | | | | | | | The failsafe can trigger when index processing is already disabled. This can happen when VACUUM's INDEX_CLEANUP parameter is "off" and the failsafe happens to trigger. Remove assertions that assume that index processing is directly tied to the failsafe. Oversight in commit c242baa4, which made it possible for the failsafe to trigger in a two-pass strategy VACUUM that has yet to make its first call to lazy_vacuum_all_indexes().
* Revert "Add test case for obsoleting slot with active walsender" (Alvaro Herrera, 2021-06-20)
| | | | | | This reverts commit 09126984a263; the test case added there failed once in circumstances that remain mysterious. It seems better to remove the test for now so that 14beta2 doesn't have random failures built in.
* Provide feature-test macros for libpq features added in v14. (Tom Lane, 2021-06-19)
| | | | | | | | | | | | | | | | | | | | We had a request to provide a way to test at compile time for the availability of the new pipeline features. More generally, it seems like a good idea to provide a way to test via #ifdef for all new libpq API features. People have been using the version from pg_config.h for that; but that's more likely to represent the server version than the libpq version, in the increasingly-common scenario where they're different. It's safer if libpq-fe.h itself is the source of truth about what features it offers. Hence, establish a policy that starting in v14 we'll add a suitable feature-is-present macro to libpq-fe.h when we add new API there. (There doesn't seem to be much point in applying this policy retroactively, but it's not too late for v14.) Tom Lane and Alvaro Herrera, per suggestion from Boris Kolpackov. Discussion: https://postgr.es/m/boris.20210617102439@codesynthesis.com
* Handle no replica identity index case in RelationGetIdentityKeyBitmap. (Amit Kapila, 2021-06-19)
Commit e7eea52b2d introduced a new function, RelationGetIdentityKeyBitmap, which failed to handle the case where there is no replica identity index on a relation.
Author: Mark Dilger
Reviewed-by: Takamichi Osumi, Amit Kapila
Discussion: https://www.postgresql.org/message-id/4C99A862-69C8-431F-960A-81B1151F1B89@enterprisedb.com
* Support disabling index bypassing by VACUUM. (Peter Geoghegan, 2021-06-18)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Generalize the INDEX_CLEANUP VACUUM parameter (and the corresponding reloption): make it into a ternary style boolean parameter. It now exposes a third option, "auto". The "auto" option (which is now the default) enables the "bypass index vacuuming" optimization added by commit 1e55e7d1. "VACUUM (INDEX_CLEANUP TRUE)" is redefined to once again make VACUUM simply do any required index vacuuming, regardless of how few dead tuples are encountered during the first scan of the target heap relation (unless there are exactly zero). This gives users a way of opting out of the "bypass index vacuuming" optimization, if for whatever reason that proves necessary. It is also expected to be used by PostgreSQL developers as a testing option from time to time. "VACUUM (INDEX_CLEANUP FALSE)" does the same thing as it always has: it forcibly disables both index vacuuming and index cleanup. It's not expected to be used much in PostgreSQL 14. The failsafe mechanism added by commit 1e55e7d1 addresses the same problem in a simpler way. INDEX_CLEANUP can now be thought of as a testing and compatibility option. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-By: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/CAH2-WznrBoCST4_Gxh_G9hA8NzGUbeBGnOUC8FcXcrhqsv6OHQ@mail.gmail.com
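The three-way behavior described above can be sketched as a small decision function. The accepted spellings and the `bypass_threshold` parameter are hypothetical simplifications; the real bypass criterion inside VACUUM is not stated in this message and is not reproduced here.

```python
from enum import Enum


class IndexCleanup(Enum):
    AUTO = "auto"
    ON = "on"
    OFF = "off"


def parse_index_cleanup(value):
    # Assumed spellings; the server's boolean parser may accept others.
    text = value.strip().lower()
    if text == "auto":
        return IndexCleanup.AUTO
    if text in ("true", "on"):
        return IndexCleanup.ON
    if text in ("false", "off"):
        return IndexCleanup.OFF
    raise ValueError(f"unrecognized INDEX_CLEANUP value: {value!r}")


def will_vacuum_indexes(mode, dead_tuples, bypass_threshold):
    # OFF always skips index vacuuming; zero dead tuples leaves nothing to do.
    if mode is IndexCleanup.OFF or dead_tuples == 0:
        return False
    # ON does the work however few dead tuples were found.
    if mode is IndexCleanup.ON:
        return True
    # AUTO may bypass index vacuuming when there are few dead tuples.
    return dead_tuples >= bypass_threshold
```

For example, with a threshold of 100, five dead tuples are skipped under "auto" but processed under "on".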
* Add test case for obsoleting slot with active walsender (Alvaro Herrera, 2021-06-18)
| | | | | | | | | | | | | | | The code to signal a running walsender when its reserved WAL size grows too large is completely uncovered before this commit; this adds coverage for that case. This test involves sending SIGSTOP to walsender and walreceiver and running a checkpoint while advancing WAL, then sending SIGCONT. There's no precedent for this coding in Perl tests, and my reading of relevant manpages says it's likely to fail on Windows. Because of this, this test is always skipped on that platform. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/202106102202.mjw4huiix7lo@alvherre.pgsql
* Fix misbehavior of DROP OWNED BY with duplicate polroles entries. (Tom Lane, 2021-06-18)
| | | | | | | | | | | | | | | | | | Ordinarily, a pg_policy.polroles array wouldn't list the same role more than once; but CREATE POLICY does not prevent that. If we perform DROP OWNED BY on a role that is listed more than once, RemoveRoleFromObjectPolicy either suffered an assertion failure or encountered a tuple-updated-by-self error. Rewrite it to cope correctly with duplicate entries, and add a CommandCounterIncrement call to prevent the other problem. Per discussion, there's other cleanup that ought to happen here, but this seems like the minimum essential fix. Per bug #17062 from Alexander Lakhin. It's been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/17062-11f471ae3199ca23@postgresql.org
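Coping with duplicate entries comes down to removing every occurrence of the role rather than a single one. A minimal sketch, with plain integers standing in for role OIDs and `remove_role_from_policy` a hypothetical helper, not the rewritten RemoveRoleFromObjectPolicy:

```python
def remove_role_from_policy(polroles, role_id):
    """Return (remaining_roles, no_roles_left) for a policy's role list."""
    # Filter out all occurrences, so a role listed twice is fully removed
    # in one pass and the result never depends on how often it appeared.
    remaining = [role for role in polroles if role != role_id]
    # Signal to the caller when the policy has no roles left at all.
    return remaining, not remaining
```

A version that removes only the first match would leave the role still listed after the drop, which is the kind of inconsistency duplicates can expose.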
* Improve version reporting in pgbench. (Tom Lane, 2021-06-18)
| | | | | | | | | | | Commit 547f04e73 caused pgbench to start printing its version number, which seems like a fine idea, but it needs a bit more work: * Print the server version number too, when different. * Print the PG_VERSION string, not some reconstructed approximation. This patch copies psql's well-tested code for the same purpose. Discussion: https://postgr.es/m/1226654.1624036821@sss.pgh.pa.us
* Centralize the logic for protective copying of utility statements. (Tom Lane, 2021-06-18)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the "simple Query" code path, it's fine for parse analysis or execution of a utility statement to scribble on the statement's node tree, since that'll just be thrown away afterwards. However it's not fine if the node tree is in the plan cache, as then it'd be corrupted for subsequent executions. Up to now we've dealt with that by having individual utility-statement functions apply copyObject() if they were going to modify the tree. But that's prone to errors of omission. Bug #17053 from Charles Samborski shows that CREATE/ALTER DOMAIN didn't get this memo, and can crash if executed repeatedly from plan cache. In the back branches, we'll just apply a narrow band-aid for that, but in HEAD it seems prudent to have a more principled fix that will close off the possibility of other similar bugs in future. Hence, let's hoist the responsibility for doing copyObject up into ProcessUtility from its children, thus ensuring that it happens for all utility statement types. Also, modify ProcessUtility's API so that its callers can tell it whether a copy step is necessary. It turns out that in all cases, the immediate caller knows whether the node tree is transient, so this doesn't involve a huge amount of code thrashing. In this way, while we lose a little bit in the execute-from-cache code path due to sometimes copying node trees that wouldn't be mutated anyway, we gain something in the simple-Query code path by not copying throwaway node trees. Statements that are complex enough to be expensive to copy are almost certainly ones that would have to be copied anyway, so the loss in the cache code path shouldn't be much. (Note that this whole problem applies only to utility statements. Optimizable statements don't have the issue because we long ago made the executor treat Plan trees as read-only. Perhaps someday we will make utility statement execution act likewise, but I'm not holding my breath.) 
Discussion: https://postgr.es/m/931771.1623893989@sss.pgh.pa.us Discussion: https://postgr.es/m/17053-3ca3f501bbc212b4@postgresql.org
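The idea of hoisting the copy into one place can be sketched as follows. A dict stands in for the statement's node tree, `run_utility` is a hypothetical stand-in for ProcessUtility, and the `read_only_tree` flag name is chosen for this sketch to represent the caller telling it whether a copy is needed:

```python
import copy


def run_utility(stmt, read_only_tree):
    # The single, central copy step: taken only when the caller says the
    # tree must survive (for example, because it lives in the plan cache).
    if read_only_tree:
        stmt = copy.deepcopy(stmt)
    # Stand-in for parse analysis or execution scribbling on the tree.
    stmt["constraints"].append("expanded")
    return stmt
```

A cached tree passed with `read_only_tree=True` is left untouched, while a throwaway tree from the simple-Query path is modified in place with no copy made.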
* Don't set a fast default for anything but a plain table (Andrew Dunstan, 2021-06-18)
The fast default code added in release 11 omitted to check that the table a fast default was being added to was a plain table. Thus one could be added to a foreign table, which predictably blows up. Here we perform that check. In addition, on the back branches, since some of these might have escaped into the wild, if we encounter a missing value for an attribute of something other than a plain table we ignore it.
Fixes bug #17056
Backpatch to release 11.
Reviewed by: Andres Freund, Álvaro Herrera and Tom Lane
* Make archiver process handle barrier events. (Fujii Masao, 2021-06-18)
| | | | | | | | | | | Commit d75288fb27 made WAL archiver process an auxiliary process. An auxiliary process needs to handle barrier events but the commit forgot to make archiver process do that. Reported-by: Thomas Munro Author: Fujii Masao Reviewed-by: Thomas Munro Discussion: https://postgr.es/m/CA+hUKGLah2w1pWKHonZP_+EQw69=q56AHYwCgEN8GDzsRG_Hgw@mail.gmail.com
* Tidy up GetMultiXactIdMembers()'s behavior on error (Heikki Linnakangas, 2021-06-17)
| | | | | | | | | | | | | | | | | | | | | | | | | One of the error paths left *members uninitialized. That's not a live bug, because most callers don't look at *members when the function returns -1, but let's be tidy. One caller, in heap_lock_tuple(), does "if (members != NULL) pfree(members)", but AFAICS it never passes an invalid 'multi' value so it should not reach that error case. The callers are also a bit inconsistent in their expectations. heap_lock_tuple() pfrees the 'members' array if it's not-NULL, others pfree() it if "nmembers >= 0", and others if "nmembers > 0". That's not a live bug either, because the function should never return 0, but add an Assert for that to make it more clear. I left the callers alone for now. I also moved the line where we set *nmembers. It wasn't wrong before, but I like to do that right next to the 'return' statement, to make it clear that it's always set on return. Also remove one unreachable return statement after ereport(ERROR), for brevity and for consistency with the similar if-block right after it. Author: Greg Nancarrow with the additional changes by me Backpatch-through: 9.6, all supported versions
* Fix plancache refcount leak after error in ExecuteQuery. (Tom Lane, 2021-06-16)
| | | | | | | | | | | | | | | | | When stuffing a plan from the plancache into a Portal, one is not supposed to risk throwing an error between GetCachedPlan and PortalDefineQuery; if that happens, the plan refcount incremented by GetCachedPlan will be leaked. I managed to break this rule while refactoring code in 9dbf2b7d7. There is no visible consequence other than some memory leakage, and since nobody is very likely to trigger the relevant error conditions many times in a row, it's not surprising we haven't noticed. Nonetheless, it's a bug, so rearrange the order of operations to remove the hazard. Noted on the way to looking for a better fix for bug #17053. This mistake is pretty old, so back-patch to all supported branches.
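The hazard can be modeled with a toy reference count. Every class and function below is hypothetical and only mirrors the roles of GetCachedPlan and PortalDefineQuery; the safe ordering shown is one possible arrangement, not necessarily the one the commit chose.

```python
class CachedPlan:
    def __init__(self):
        self.refcount = 0


class Portal:
    def __init__(self):
        self.plan = None

    def define_query(self, plan):
        # From here on the portal owns the reference.
        self.plan = plan

    def drop(self):
        # Error cleanup releases whatever the portal owns.
        if self.plan is not None:
            self.plan.refcount -= 1
            self.plan = None


def get_cached_plan(plan):
    plan.refcount += 1
    return plan


def execute_risky(plan, portal, validate):
    held = get_cached_plan(plan)
    validate()  # may raise while nobody owns the reference: it leaks
    portal.define_query(held)


def execute_safe(plan, portal, validate):
    held = get_cached_plan(plan)
    portal.define_query(held)  # hand off before anything can fail
    validate()
```

In the risky ordering, an error between the two calls leaves the count incremented even after the portal is dropped.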
* Fix copying data into slots with FDW batching (Tomas Vondra, 2021-06-16)
| | | | | | | | | | | | | | | | Commit b676ac443b optimized handling of tuple slots with bulk inserts into foreign tables, so that the slots are initialized only once and reused for all batches. The data was however copied into the slots only after the initialization, inserting duplicate values when the slot gets reused. Fixed by moving the ExecCopySlot outside the init branch. The existing postgres_fdw tests failed to catch this due to inserting data into foreign tables without unique indexes, and then checking only the number of inserted rows. This adds a new test with both a unique index and a check of inserted values. Reported-by: Alexander Pyhalov Discussion: https://postgr.es/m/7a8cf8d56b3d18e5c0bccd6cd42d04ac%40postgrespro.ru
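The slot-reuse bug is easy to reproduce in miniature. In this sketch plain list entries stand in for tuple slots and the function names are invented; it only illustrates why the copy has to sit outside the one-time initialization branch.

```python
def send_batches_buggy(batches):
    slots, sent = [], []
    for batch in batches:
        for i, row in enumerate(batch):
            if i >= len(slots):
                # Initialization and copy are fused: a reused slot is
                # never refreshed with the new row.
                slots.append(row)
        sent.append(list(slots[:len(batch)]))
    return sent


def send_batches_fixed(batches):
    slots, sent = [], []
    for batch in batches:
        for i, row in enumerate(batch):
            if i >= len(slots):
                slots.append(None)  # one-time initialization only
            slots[i] = row  # copy on every use, including reuse
        sent.append(list(slots[:len(batch)]))
    return sent
```

Counting inserted rows cannot tell the two apart, since both send the same number of rows; only checking the values (or a unique index) reveals the duplicates.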
* Improve SQLSTATE reporting in some replication-related code. (Tom Lane, 2021-06-16)
| | | | | | | | | | | | | | | | I started out with the goal of reporting ERRCODE_CONNECTION_FAILURE when walrcv_connect() fails, but as I looked around I realized that whoever wrote this code was of the opinion that errcodes are purely optional. That's not my understanding of our project policy. Hence, make sure that an errcode is provided in each ereport that (a) is ERROR or higher level and (b) isn't arguably an internal logic error. Also fix some very dubious existing errcode assignments. While this is not per policy, it's also largely cosmetic, since few of these cases could get reported to applications. So I don't feel a need to back-patch. Discussion: https://postgr.es/m/2189704.1623512522@sss.pgh.pa.us
* Fix outdated comment that talked about seek position of WAL file. (Heikki Linnakangas, 2021-06-16)
| | | | | | | | Since commit c24dcd0cfd, we have been using pg_pread() to read the WAL file, which doesn't change the seek position (unless we fall back to the implementation in src/port/pread.c). Update comment accordingly. Backpatch-through: 12, where we started to use pg_pread()
* Update another variant expected-result file. (Tom Lane, 2021-06-15)
| | | | | This should have been updated in 533e9c6b0, but it was overlooked. Given the lack of complaints, I won't bother back-patching.
* Remove another orphan expected-result file. (Tom Lane, 2021-06-15)
| | | | | | | | aborted-keyrevoke_2.out was apparently needed when it was added (in commit 0ac5ad513) to handle the case of serializable transaction mode. However, the output in serializable mode actually matches the regular aborted-keyrevoke.out file, and AFAICT has done so for a long time. There's no need to keep dragging this variant along.
* Further refinement of stuck_on_old_timeline recovery test (Andrew Dunstan, 2021-06-15)
| | | | | | | | | TestLib::perl2host can take a file argument as well as a directory argument, so that code becomes substantially simpler. Also add comments on why we're using forward slashes, and why we're setting PERL_BADLANG=0. Discussion: https://postgr.es/m/e9947bcd-20ee-027c-f0fe-01f736b7e345@dunslane.net
* Revert 29854ee8d1 due to buildfarm failures (Alexander Korotkov, 2021-06-15)
| | | | | Reported-by: Tom Lane Discussion: https://postgr.es/m/CAPpHfdvcnw3x7jdV3r52p4%3D5S4WUxBCzcQKB3JukQHoicv1LSQ%40mail.gmail.com
* Remove unneeded field from VACUUM state. (Peter Geoghegan, 2021-06-15)
| | | | | | | Bugfix commit 5fc89376 effectively made the lock_waiter_detected field from vacuumlazy.c's global state struct into private state owned by lazy_truncate_heap(). Finish this off by replacing the struct field with a local variable.
* Support for unnest(multirange) and cast multirange as an array of ranges (Alexander Korotkov, 2021-06-15)
It has been spotted that multiranges lack the ability to be decomposed into individual ranges. Subscription and proper expanded object representation require substantial work, and it's too late for v14. This commit provides the implementation of unnest(multirange) and of a cast from multirange to an array of ranges, which is quite trivial. unnest(multirange) is defined as a polymorphic procedure. The catalog description of the procedure underlying the cast is duplicated for each multirange type because we don't have an anyrangearray polymorphic type to use here. Catversion is bumped.
Reported-by: Jonathan S. Katz
Discussion: https://postgr.es/m/flat/60258efe-bd7e-4886-82e1-196e0cac5433%40postgresql.org
Author: Alexander Korotkov
Reviewed-by: Justin Pryzby, Jonathan S. Katz, Zhihong Yu
* Fix decoding of speculative aborts. (Amit Kapila, 2021-06-15)
During decoding of speculative inserts, we relied on confirmation records or next-change records to clean up the toast hash. That could lead to multiple problems: (a) a memory leak if there is neither a confirmation record nor any other record after toast insertion for a speculative insert in the transaction, and (b) errors and assertion failures if the next operation is not an insert/update on the same table. The fix is to start queuing the spec abort change and to clean up the toast hash and change record while processing it. Currently, we queue the spec aborts for both the toast and the main table even though we perform cleanup while processing the main table's spec abort record. Later, if we have a way to distinguish between the spec abort record of the toast table and that of the main table, we can avoid queuing the change for spec aborts of toast tables.
Reported-by: Ashutosh Bapat
Author: Dilip Kumar
Reviewed-by: Amit Kapila
Backpatch-through: 9.6, where it was introduced
Discussion: https://postgr.es/m/CAExHW5sPKF-Oovx_qZe4p5oM6Dvof7_P+XgsNAViug15Fm99jA@mail.gmail.com
* Update variant expected-result file. (Tom Lane, 2021-06-14)
| | | | | | | | | | This should have been updated in d2d8a229b, but it was overlooked. According to 31a877f18 which added it, this file is meant to show the results you get under default_transaction_isolation = serializable. We've largely lost track of that goal in other isolation tests, but as long as we've got this one, it should be right. Noted while fooling about with the isolationtester.
* Remove orphaned expected-result file. (Tom Lane, 2021-06-14)
| | | | | | This should have been removed in 43e084197, which removed the corresponding spec file. Noted while fooling about with the isolationtester.
* Remove pg_wait_for_backend_termination(). (Noah Misch, 2021-06-14)
| | | | | | | | | | It was unable to wait on a backend that had already left the procarray. Users tolerant of that limitation can poll pg_stat_activity. Other users can employ the "timeout" argument of pg_terminate_backend(). Reviewed by Bharath Rupireddy. Discussion: https://postgr.es/m/20210605013236.GA208701@rfd.leadboat.com
* Copy-edit text for the pg_terminate_backend() "timeout" parameter. (Noah Misch, 2021-06-14)
| | | | | | | | | | | Revert the pg_description entry to its v13 form, since those messages usually remain shorter and don't discuss individual parameters. No catversion bump, since pg_description content does not impair backend compatibility or application compatibility. Justin Pryzby Discussion: https://postgr.es/m/20210612182743.GY16435@telsasoft.com
* Fix logic bug in 1632ea43682f (Alvaro Herrera, 2021-06-14)
| | | | | | | | | | | | | I overlooked that one condition was logically inverted. The fix is a little bit more involved than simply negating the condition, to make the code easier to read. Fix some outdated comments left by the same commit, while at it. Author: Masahiko Sawada <sawada.mshk@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/YMRlmB3/lZw8YBH+@paquier.xyz
* Improve handling of dropped objects in pg_event_trigger_ddl_commands()Michael Paquier2021-06-14
| | | | | | | | | | | | | | | | | | | | | An object found as dropped when digging into the list of objects returned by pg_event_trigger_ddl_commands() could cause a cache lookup error, as the calls fetching the object address and the type name would fail if the object was missing. Those lookup errors could be seen with combinations of ALTER TABLE sub-commands involving identity columns. The lookup logic is changed in this code path to behave like any other SQL-callable function by ignoring objects that are not found, taking advantage of 2a10fdc. The back-branches are not changed, as the fix depends on that commit, which is too invasive for stable branches. While on it, add test cases to exercise event triggers with identity columns, and stress more cases with the event ddl_command_end for relations. Author: Sven Klemm, Aleksander Alekseev, Michael Paquier Discussion: https://postgr.es/m/CAMCrgp2R1cEXU53iYKtW6yVEp2_yKUz+z=3-CTrYpPP+xryRtg@mail.gmail.com
* Remove forced toast recompression in VACUUM FULL/CLUSTERMichael Paquier2021-06-14
| | | | | | | | | | | | | | | | The extra checks added by the recompression of toast data introduced in bbe0a81 are proving to have a performance impact on VACUUM or CLUSTER even if no recompression is done. This is more noticeable with more toastable columns that contain non-NULL values. Improvements could be done to make those extra checks less expensive, but that's not material for 14 at this stage, and we are not sure either if the code path of VACUUM FULL/CLUSTER is adapted for this job. Per discussion with several people, including Andres Freund, Robert Haas, Álvaro Herrera, Tom Lane and myself. Discussion: https://postgr.es/m/20210527003144.xxqppojoiwurc2iz@alap3.anarazel.de
* Work around portability issue with newer versions of mktime().Tom Lane2021-06-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent glibc versions have made mktime() fail if tm_isdst is inconsistent with the prevailing timezone; in particular it fails for tm_isdst = 1 when the zone is UTC. (This seems wildly inconsistent with the POSIX-mandated treatment of "incorrect" values for the other fields of struct tm, so if you ask me it's a bug, but I bet they'll say it's intentional.) This has been observed to cause cosmetic problems when pg_restore'ing an archive created in a different timezone. To fix, do mktime() using the field values from the archive, and if that fails try again with tm_isdst = -1. This will give a result that's off by the UTC-offset difference from the original zone, but that was true before, too. It's not terribly critical since we don't do anything with the result except possibly print it. (Someday we should flush this entire bit of logic and record a standard-format timestamp in the archive instead. That's not okay for a back-patched bug fix, though.) Also, guard our only other use of mktime() by having initdb's build_time_t() set tm_isdst = -1 not 0. This case could only have an issue in zones that are DST year-round; but I think some do exist, or could in future. Per report from Wells Oliver. Back-patch to all supported versions, since any of them might need to run with a newer glibc. Discussion: https://postgr.es/m/CAOC+FBWDhDHO7G-i1_n_hjRzCnUeFO+H-Czi1y10mFhRWpBrew@mail.gmail.com
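The retry described in the message above can be sketched in C roughly as follows. This is an illustrative sketch only, not the code from the commit: the helper name mktime_with_isdst_fallback() is hypothetical, and it treats a (time_t) -1 return as failure, which cannot be told apart from a legitimate timestamp one second before the epoch.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper: convert broken-down time fields to a time_t.
 *
 * First trust the caller's tm_isdst value.  If mktime() rejects it
 * (as newer glibc may do when tm_isdst disagrees with the prevailing
 * timezone), retry with tm_isdst = -1 so that the C library decides
 * for itself whether DST is in effect.
 *
 * Returns true and stores the result in *result on success.
 */
static bool
mktime_with_isdst_fallback(const struct tm *fields, time_t *result)
{
    struct tm   work = *fields;     /* mktime() may modify its argument */
    time_t      t = mktime(&work);

    if (t == (time_t) -1)
    {
        work = *fields;             /* start again from the original fields */
        work.tm_isdst = -1;         /* let the library determine DST */
        t = mktime(&work);
        if (t == (time_t) -1)
            return false;
    }

    *result = t;
    return true;
}
```

As the message notes, a value obtained through the fallback can be off by the UTC-offset difference from the original zone, so a caller should only use it for display.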
* Further tweaks to stuck_on_old_timeline recovery testAndrew Dunstan2021-06-13
| | | | | | | | | Translate path slashes on target directory path. This was confusing old branches, but is applied to all branches for the sake of uniformity. Perl is perfectly able to understand paths with forward slashes. Along the way, restore the previous archive_wait query, for the sake of uniformity with other tests, per gripe from Tom Lane.
* Ignore more environment variables in pg_regress.cMichael Paquier2021-06-13
| | | | | | | | | | | | | | This is similar to the work done in 8279f68 for TestLib.pm, where environment variables set may cause unwanted failures if using a temporary installation with pg_regress. The list of variables reset is adjusted in each stable branch depending on what is supported. Comments are added to remember that the lists in TestLib.pm and pg_regress.c had better be kept in sync. Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/YMNR9GYDn+fHlMta@paquier.xyz Backpatch-through: 9.6
* Restore robustness of TAP tests that wait for postmaster restart.Tom Lane2021-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several TAP tests use poll_query_until() to wait for the postmaster to restart. They were checking to see if a trivial query (e.g. "SELECT 1") succeeds. However, that's problematic in the wake of commit 11e9caff8, because now that we feed said query to psql via stdin, we risk IPC::Run whining about a SIGPIPE failure if psql quits before reading the query. Hence, we can't use a nonempty query in cases where we need to wait for connection failures to stop happening. Per the precedent of commits c757a3da0 and 6d41dd045, we can pass "undef" as the query in such cases to ensure that IPC::Run has nothing to write. However, then we have to say that the expected output is empty, and this exposes a deficiency in poll_query_until: if psql fails altogether and returns empty stdout, poll_query_until will treat that as a success! That's because, contrary to its documentation, it makes no actual check for psql failure, looking neither at the exit status nor at stderr. To fix that, adjust poll_query_until to insist on empty stderr as well as a stdout match. (I experimented with checking exit status instead, but it seems that psql often does exit(1) in cases that we need to consider successes. That might be something to fix someday, but it would be a non-back-patchable behavior change.) Back-patch to v10. The test cases needing this exist only as far back as v11, but it seems wise to keep poll_query_until's behavior the same in v10, in case we back-patch another such test case in future. (9.6 does not currently need this change, because in that branch poll_query_until can't be told to accept empty stdout as a success case.) Per assorted buildfarm failures, mostly on hoverfly. Discussion: https://postgr.es/m/CAA4eK1+zM6L4QSA1XMvXY_qqWwdUmqkOS1+hWvL8QcYEBGA1Uw@mail.gmail.com
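The stricter success test described above can be illustrated with a small C sketch. It is not the Perl code of poll_query_until: CommandResult, RunCommandFn, poll_until_match() and the fake_run() stub are all hypothetical names invented here to show the logic, namely that an attempt counts as a success only when stderr is empty and stdout matches the expected text.

```c
#include <stdbool.h>
#include <string.h>

/* Captured output of one (hypothetical) command run. */
typedef struct
{
    const char *out;            /* captured stdout */
    const char *err;            /* captured stderr */
} CommandResult;

/* Callback that runs the command once and returns its output. */
typedef CommandResult (*RunCommandFn) (void *state);

/*
 * Retry until the command yields the expected stdout with empty stderr,
 * or until max_attempts runs have been made.  Matching stdout alone is
 * not enough: a failed run with empty stdout would otherwise pass when
 * the expected output is also empty.
 */
static bool
poll_until_match(RunCommandFn run, void *state,
                 const char *expected, int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; attempt++)
    {
        CommandResult res = run(state);

        if (res.err[0] == '\0' && strcmp(res.out, expected) == 0)
            return true;
        /* A real polling loop would sleep here before retrying. */
    }
    return false;
}

/*
 * Demo stub: pretends the server is down for the first *state calls,
 * reporting empty stdout plus an error message on stderr, and then
 * succeeds with both streams empty.
 */
static CommandResult
fake_run(void *state)
{
    int        *failures_left = state;
    CommandResult res;

    res.out = "";
    if (*failures_left > 0)
    {
        (*failures_left)--;
        res.err = "connection refused";
    }
    else
        res.err = "";
    return res;
}
```

With the stub, polling for empty output keeps retrying while stderr is nonempty and succeeds only on the first clean run, which is the behavior the commit asks of poll_query_until.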