aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
...
* Prevent panic during shutdown checkpointPeter Eisentraut2017-05-05
| | | | | | | | | | | | | | | | | | | | | | | | | When the checkpointer writes the shutdown checkpoint, it checks afterwards whether any WAL has been written since it started and throws a PANIC if so. At that point, only walsenders are still active, so one might think this could not happen, but walsenders can also generate WAL, for instance in BASE_BACKUP and certain variants of CREATE_REPLICATION_SLOT. So they can trigger this panic if such a command is run while the shutdown checkpoint is being written. To fix this, divide the walsender shutdown into two phases. First, the postmaster sends a SIGUSR2 signal to all walsenders. The walsenders then put themselves into the "stopping" state. In this state, they reject any new commands. (For simplicity, we reject all new commands, so that in the future we do not have to track meticulously which commands might generate WAL.) The checkpointer waits for all walsenders to reach this state before proceeding with the shutdown checkpoint. After the shutdown checkpoint is done, the postmaster sends SIGINT (previously unused) to the walsenders. This triggers the existing shutdown behavior of sending out the shutdown checkpoint record and then terminating. Author: Michael Paquier <michael.paquier@gmail.com> Reported-by: Fujii Masao <masao.fujii@gmail.com>
* Fix wording in pg_upgrade docsMagnus Hagander2017-05-05
| | | | Author: Daniel Gustafsson
* Build pgoutput.dll in MSVC buildMagnus Hagander2017-05-05
| | | | | | Without this, logical replication obviously does not work on Windows MauMau, with clean.bet additions from me per note from Michael Paquier
* Make SCRAM salts and nonces longer.Heikki Linnakangas2017-05-05
| | | | | | | | | | | | | | | The salt is stored base64-encoded. With the old 10 bytes raw length, it was always padded to 16 bytes after encoding. We might as well use 12 raw bytes for the salt, and it's still encoded into 16 bytes. Similarly for the random nonces, use a raw length that's divisible by 3, so that there's no padding after base64 encoding. Make the nonces longer while we're at it. 10 bytes was probably enough to prevent replay attacks, but there's no reason to be skimpy here. Per suggestion from Álvaro Hernández Tortosa. Discussion: https://www.postgresql.org/message-id/df8c6e27-4d8e-5281-96e5-131a4e638fc8@8kdata.com
* Misc cleanup of SCRAM code.Heikki Linnakangas2017-05-05
| | | | | | | | | | * Remove is_scram_verifier() function. It was unused. * Fix sanitize_char() function, used in error messages on protocol violations, to print bytes >= 0x7F correctly. * Change spelling of scram_MockSalt() function to be more consistent with the surroundings. * Change a few more references to "server proof" to "server signature" that I missed in commit d981074c24.
* Don't use SCRAM-specific "e=invalid-proof" on invalid password.Heikki Linnakangas2017-05-05
| | | | | | | | | | | | | | | | | | | | | | | | | Instead, send the same FATAL message as with other password-based authentication mechanisms. This gives a more user-friendly message: psql: FATAL: password authentication failed for user "test" instead of: psql: error received from server in SASL exchange: invalid-proof Even before this patch, the server sent that FATAL message, after the SCRAM-specific "e=invalid-proof" message. But libpq would stop at the SCRAM error message, and not process the ErrorResponse that would come after that. We could've taught libpq to check for an ErrorResponse after failed authentication, but it's simpler to modify the server to send only the ErrorResponse. The SCRAM specification allows for aborting the authentication at any point, using an application-defined error mechanism, like PostgreSQL's ErrorResponse. Using the e=invalid-proof message is optional. Reported by Jeff Janes. Discussion: https://www.postgresql.org/message-id/CAMkU%3D1w3jQ53M1OeNfN8Cxd9O%2BA_9VONJivTbYoYRRdRsLT6vA@mail.gmail.com
* Change the way pg_dump retrieves partitioning infoStephen Frost2017-05-04
| | | | | | | | | | | | | | | | | | | | This gets rid of the code that issued separate queries to retrieve the partitioning parent-child relationship, parent partition key, and child partition bound information. With this patch, the information is retrieved instead using the queries issued from getTables() and getInherits(), which is both more efficient than the previous approach and doesn't require any new code. Since the partitioning parent-child relationship is now retrieved with the same old code that handles inheritance, partition attributes receive a proper flagInhAttrs() treatment (that it didn't receive before), which is needed so that the inherited NOT NULL constraints are not emitted if we already emitted it for the parent. Also, fix a bug in pg_dump's --binary-upgrade code, which caused pg_dump to emit invalid command to attach a partition to its parent. Author: Amit Langote, with some additional changes by me.
* Fix pfree-of-already-freed-tuple when rescanning a GiST index-only scan.Tom Lane2017-05-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | GiST's getNextNearest() function attempts to pfree the previously-returned tuple if any (that is, scan->xs_hitup in HEAD, or scan->xs_itup in older branches). However, if we are rescanning a plan node after ending a previous scan early, those tuple pointers could be pointing to garbage, because they would be pointing into the scan's pageDataCxt or queueCxt which has been reset. In a debug build this reliably results in a crash, although I think it might sometimes accidentally fail to fail in production builds. To fix, clear the pointer field anyplace we reset a context it might be pointing into. This may be overkill --- I think probably only the queueCxt case is involved in this bug, so that resetting in gistrescan() would be sufficient --- but dangling pointers are generally bad news, so let's avoid them. Another plausible answer might be to just not bother with the pfree in getNextNearest(). The reconstructed tuples would go away anyway in the context resets, and I'm far from convinced that freeing them a bit earlier really saves anything meaningful. I'll stick with the original logic in this patch, but if we find more problems in the same area we should consider that approach. Per bug #14641 from Denis Smirnov. Back-patch to 9.5 where this logic was introduced. Discussion: https://postgr.es/m/20170504072034.24366.57688@wrigleys.postgresql.org
* Fix PQencryptPasswordConn to work with older server versions.Heikki Linnakangas2017-05-04
| | | | | | | | | | password_encryption was a boolean before version 10, so cope with "on" and "off". Also, change the behavior with "plain", to treat it the same as "md5". We're discussing removing the password_encryption='plain' option from the server altogether, which will make this the only reasonable choice, but even if we kept it, it seems best to never send the password in cleartext.
* Fix cursor_to_xml in tableforest false modePeter Eisentraut2017-05-03
| | | | | | | | | | | | | It only produced <row> elements but no wrapping <table> element. By contrast, cursor_to_xmlschema produced a schema that is now correct but did not previously match the XML data produced by cursor_to_xml. In passing, also fix a minor misunderstanding about moving cursors in the tests related to this. Reported-by: filip@jirsak.org Based-on-patch-by: Thomas Munro <thomas.munro@enterprisedb.com>
* Remove useless and rather expensive stanza in matview regression test.Tom Lane2017-05-03
| | | | | | | | | | | | | | | | This removes a test case added by commit b69ec7cc9, which was intended to exercise a corner case involving the rule used at that time that materialized views were unpopulated iff they had physical size zero. We got rid of that rule very shortly later, in commit 1d6c72a55, but kept the test case. However, because the case now asks what VACUUM will do to a zero-sized physical file, it would be pretty surprising if the answer were ever anything but "nothing" ... and if things were indeed that broken, surely we'd find it out from other tests. Since the test involves a table that's fairly large by regression-test standards (100K rows), it's quite slow to run. Dropping it should save some buildfarm cycles, so let's do that. Discussion: https://postgr.es/m/32386.1493831320@sss.pgh.pa.us
* Add pg_dump tests for CREATE STATISTICSAlvaro Herrera2017-05-03
| | | | | | | CREATE STATISTICS pg_dump support code was not covered at all by previous tests. Discussion: https://postgr.es/m/20170503172746.rwftidszir67sgk7@alvherre.pgsql
* pg_dump/t/002: append terminating semicolon to SQL commandsAlvaro Herrera2017-05-03
| | | | | | | | It's easy to overlook the need for one, and its lack is annoying for the next developer wanting to create a new test. Rather than expect every individual command to add the semicolon, just append one automatically. Discussion: http://postgr.es/m/20170503172746.rwftidszir67sgk7@alvherre.pgsql
* Add PQencryptPasswordConn function to libpq, use it in psql and createuser.Heikki Linnakangas2017-05-03
| | | | | | | | | | | | The new function supports creating SCRAM verifiers, in addition to md5 hashes. The algorithm is chosen based on password_encryption, by default. This fixes the issue reported by Jeff Janes, that there was previously no way to create a SCRAM verifier with "\password". Michael Paquier and me Discussion: https://www.postgresql.org/message-id/CAMkU%3D1wfBgFPbfAMYZQE78p%3DVhZX7nN86aWkp0QcCp%3D%2BKxZ%3Dbg%40mail.gmail.com
* Improve performance of timezone loading, especially pg_timezone_names view.Tom Lane2017-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tzparse() would attempt to load the "posixrules" timezone database file on each call. That might seem like it would only be an issue when selecting a POSIX-style zone name rather than a zone defined in the timezone database, but it turns out that each zone definition file contains a POSIX-style zone string and tzload() will call tzparse() to parse that. Thus, when scanning the whole timezone file tree as we do in the pg_timezone_names view, "posixrules" was read repetitively for each zone definition file. Fix that by caching the file on first use within any given process. (We cache other zone definitions for the life of the process, so there seems little reason not to cache this one as well.) This probably won't help much in processes that never run pg_timezone_names, but even one additional SET of the timezone GUC would come out ahead. An even worse problem for pg_timezone_names is that pg_open_tzfile() has an inefficient way of identifying the canonical case of a zone name: it basically re-descends the directory tree to the zone file. That's not awful for an individual "SET timezone" operation, but it's pretty horrid when we're inspecting every zone in the database. And it's pointless too because we already know the canonical spelling, having just read it from the filesystem. Fix by teaching pg_open_tzfile() to avoid the directory search if it's not asked for the canonical name, and backfilling the proper result in pg_tzenumerate_next(). In combination these changes seem to make the pg_timezone_names view about 3x faster to read, for me. Since a scan of pg_timezone_names has up to now been one of the slowest queries in the regression tests, this should help some little bit for buildfarm cycle times. Back-patch to all supported branches, not so much because it's likely that users will care much about the view's performance as because tracking changes in the upstream IANA timezone code is really painful if we don't keep all the branches in sync. Discussion: https://postgr.es/m/27962.1493671706@sss.pgh.pa.us
* Remove create_singleton_array(), hard-coding the case in its sole caller.Tom Lane2017-05-02
| | | | | | | | | | | | | | | | | | | | | | create_singleton_array() was not really as useful as we perhaps thought when we added it. It had never accreted more than one call site, and is only saving a dozen lines of code at that one, which is considerably less bulk than the function itself. Moreover, because of its insistence on using the caller's fn_extra cache space, it's arguably a coding hazard. text_to_array_internal() does not currently use fn_extra in any other way, but if it did it would be subtly broken, since the conflicting fn_extra uses could be needed within a single query, in the seldom-tested case that the field separator varies during the query. The same objection seems likely to apply to any other potential caller. The replacement code is a bit uglier, because it hardwires knowledge of the storage parameters of type TEXT, but it's not like we haven't got dozens or hundreds of other places that do the same. Uglier seems like a good tradeoff for smaller, faster, and safer. Per discussion with Neha Khatri. Discussion: https://postgr.es/m/CAFO0U+_fS5SRhzq6uPG+4fbERhoA9N2+nPrtvaC9mmeWivxbsA@mail.gmail.com
* Ensure commands in extension scripts see the results of preceding DDL.Tom Lane2017-05-02
| | | | | | | | | | | | | Due to a missing CommandCounterIncrement() call, parsing of a non-utility command in an extension script would not see the effects of the immediately preceding DDL command, unless that command's execution ends with CommandCounterIncrement() internally ... which some do but many don't. Report by Philippe Beaudoin, diagnosis by Julien Rouhaud. Rather remarkably, this bug has evaded detection since extensions were invented, so back-patch to all supported branches. Discussion: https://postgr.es/m/2cf7941e-4e41-7714-3de8-37b1a8f74dff@free.fr
* extstats: change output functions to emit valid JSONAlvaro Herrera2017-05-02
| | | | | | | Manipulating extended statistics is more convenient as JSON than the current ad-hoc format, so let's change before it's too late. Discussion: https://postgr.es/m/20170420193828.k3fliiock5hdnehn@alvherre.pgsql
* Fix typos in comments.Robert Haas2017-05-02
| | | | | | Etsuro Fujita Discussion: http://postgr.es/m/00e88999-684d-d79a-70e4-908c937a0126@lab.ntt.co.jp
* Avoid unnecessary catalog updates in ALTER SEQUENCEPeter Eisentraut2017-05-02
| | | | | | | | | | | | | | | ALTER SEQUENCE can do nontransactional changes to the sequence (RESTART clause) and transactional updates to the pg_sequence catalog (most other clauses). When just calling RESTART, the code would still needlessly do a catalog update without any changes. This would entangle that operation in the concurrency issues of a catalog update (causing either locking or concurrency errors, depending on how that issue is to be resolved). Fix by keeping track during options parsing whether a catalog update is needed, and skip it if not. Reported-by: Jason Petersen <jason@citusdata.com>
* Fix perl thinko in commit fed6df486dcaAndrew Dunstan2017-05-02
| | | | | | Report and fix from Vaishnavi Prabakaran Backpatch to 9.4 like original.
* Change hot_standby default value to 'on'Magnus Hagander2017-05-02
| | | | | | | | This goes together with the changes made to enable replication on the sending side by default (wal_level, max_wal_senders etc) by making the receiving stadby node also enable it by default. Huong Dangminh
* Don't wake up logical replication launcher unnecessarilyPeter Eisentraut2017-05-01
| | | | | | | In CREATE SUBSCRIPTION, only wake up the launcher when the subscription is enabled. Author: Fujii Masao <masao.fujii@gmail.com>
* Improve function header comment for create_singleton_array().Tom Lane2017-05-01
| | | | | | | | Mentioning the caller is neither future-proof nor an adequate substitute for giving an API specification. Per gripe from Neha Khatri, though I changed the patch around some. Discussion: https://postgr.es/m/CAFO0U+_fS5SRhzq6uPG+4fbERhoA9N2+nPrtvaC9mmeWivxbsA@mail.gmail.com
* Reduce semijoins with unique inner relations to plain inner joins.Tom Lane2017-05-01
| | | | | | | | | | | | | | | | | | | | If the inner relation can be proven unique, that is it can have no more than one matching row for any row of the outer query, then we might as well implement the semijoin as a plain inner join, allowing substantially more freedom to the planner. This is a form of outer join strength reduction, but it can't be implemented in reduce_outer_joins() because we don't have enough info about the individual relations at that stage. Instead do it much like remove_useless_joins(): once we've built base relations, we can make another pass over the SpecialJoinInfo list and get rid of any entries representing reducible semijoins. This is essentially a followon to the inner-unique patch (commit 9c7f5229a) and makes use of the proof machinery that that patch created. We need only minor refactoring of innerrel_is_unique's API to support this usage. Per performance complaint from Teodor Sigaev. Discussion: https://postgr.es/m/f994fc98-389f-4a46-d1bc-c42e05cb43ed@sigaev.ru
* Fix mis-optimization of semijoins with more than one LHS relation.Tom Lane2017-05-01
| | | | | | | | | | | | | | | | | | | | The inner-unique patch (commit 9c7f5229a) supposed that if we're considering a JOIN_UNIQUE_INNER join path, we can always set inner_unique for the join, because the inner path produced by create_unique_path should be unique relative to the outer relation. However, that's true only if we're considering joining to the whole outer relation --- otherwise we may be applying only some of the join quals, and so the inner path might be non-unique from the perspective of this join. Adjust the test to only believe that we can set inner_unique if we have the whole semijoin LHS on the outer side. There is more that can be done in this area, but this commit is only intended to provide the minimal fix needed to get correct plans. Per report from Teodor Sigaev. Thanks to David Rowley for preliminary investigation. Discussion: https://postgr.es/m/f994fc98-389f-4a46-d1bc-c42e05cb43ed@sigaev.ru
* Update time zone data files to tzdata release 2017b.Tom Lane2017-05-01
| | | | | | | | | | | | | | | | | | DST law changes in Chile, Haiti, and Mongolia. Historical corrections for Ecuador, Kazakhstan, Liberia, and Spain. The IANA crew continue their campaign to replace invented time zone abbrevations with numeric GMT offsets. This update changes numerous zones in South America, the Pacific and Indian oceans, and some Asian and Middle Eastern zones. I kept these abbreviations in the tznames/ data files, however, so that we will still accept them for input. (We may want to start trimming those files someday, but I think we should wait for the upstream dust to settle before deciding what to do.) In passing, add MESZ (Mitteleuropaeische Sommerzeit) to the tznames lists; since we accept MEZ (Mitteleuropaeische Zeit) it seems rather strange not to take the other one. And fix some incorrect, or at least obsolete, comments that certain abbreviations are not traceable to the IANA data.
* libpq: Fix inadvertent change in .pgpass lookup behavior.Robert Haas2017-05-01
| | | | | | | | | | Commit 274bb2b3857cc987cfa21d14775cae9b0dababa5 caused password file lookups to use the hostaddr in preference to the host, but that was not intended and the documented behavior is the opposite. Report and patch by Kyotaro Horiguchi. Discussion: http://postgr.es/m/20170428.165432.60857995.horiguchi.kyotaro@lab.ntt.co.jp
* Allow vcregress.pl to run an arbitrary TAP test setAndrew Dunstan2017-05-01
| | | | | | | | | | | Currently only provision for running the bin checks in a single step is provided for. Now these tests can be run individually, as well as tests in other locations (e.g. src.test/recover). Also provide for suppressing unnecessary temp installs by setting the NO_TEMP_INSTALL environment variable just as the Makefiles do. Backpatch to 9.4.
* Fix logical replication launcher wake up and resetPeter Eisentraut2017-05-01
| | | | | | | | | | | | | After the logical replication launcher was told to wake up at commit (for example, by a CREATE SUBSCRIPTION command), the flag to wake up was not reset, so it would be woken up at every following commit as well. So fix that by resetting the flag. Also, we don't need to wake up anything if the transaction was rolled back. Just reset the flag in that case. Author: Masahiko Sawada <sawada.mshk@gmail.com> Reported-by: Fujii Masao <masao.fujii@gmail.com>
* Fire per-statement triggers on partitioned tables.Robert Haas2017-05-01
| | | | | | | | | | | Even though no actual tuples are ever inserted into a partitioned table (the actual tuples are in the partitions, not the partitioned table itself), we still need to have a ResultRelInfo for the partitioned table, or per-statement triggers won't get fired. Amit Langote, per a report from Rajkumar Raghuwanshi. Reviewed by me. Discussion: http://postgr.es/m/CAKcux6%3DwYospCRY2J4XEFuVy0L41S%3Dfic7rmkbsU-GXhhSbmBg%40mail.gmail.com
* Sync our copy of the timezone library with IANA release tzcode2017b.Tom Lane2017-04-30
| | | | | | | | | | | zic no longer mishandles some transitions in January 2038 when it attempts to work around Qt bug 53071. This fixes a bug affecting Pacific/Tongatapu that was introduced in zic 2016e. localtime.c now contains a workaround, useful when loading a file generated by a buggy zic. There are assorted cosmetic changes as well, notably relocation of a bunch of #defines.
* Fix possible null pointer dereference or invalid warning message.Tom Lane2017-04-30
| | | | | | | | | | | | Thinko in commit de4389712: this warning message references the wrong "LogicalRepWorker *" variable. This would often result in a core dump, but if it didn't, the message would show the wrong subscription OID. In passing, adjust the message text to format a subscription OID similarly to how that's done elsewhere in the function; and fix grammatical issues in some nearby messages. Per Coverity testing.
* Micro-optimize some slower queries in the opr_sanity regression test.Tom Lane2017-04-29
| | | | | | | | | | | | | | | | | | | | | | | | Convert the binary_coercible() and physically_coercible() functions from SQL to plpgsql. It's not that plpgsql is inherently better at doing queries; if you simply convert the previous single SQL query into one RETURN expression, it's no faster. The problem with the existing code is that it fools the plancache into deciding that it's worth re-planning the query every time, since constant-folding with a concrete value for $2 allows elimination of at least one sub-SELECT. In reality that's using the planner to do the equivalent of a few runtime boolean tests, causing the function to run much slower than it should. Splitting the AND/OR logic into separate plpgsql statements allows each if-expression to acquire a static plan. Also, get rid of some uses of obj_description() in favor of explicitly joining to pg_description, allowing the joins to be optimized better. (Someday we might improve the SQL-function-inlining logic enough that this happens automatically, but today is not that day.) Together, these changes reduce the runtime of the opr_sanity regression test by about a factor of two on one of my slower machines. They don't seem to help as much on a fast machine, but this should at least benefit the buildfarm.
* Fix VALIDATE CONSTRAINT to consider NO INHERIT attribute.Robert Haas2017-04-28
| | | | | | | | | | Currently, trying to validate a NO INHERIT constraint on the parent will search for the constraint in child tables (where it is not supposed to exist), wrongly causing a "constraint does not exist" error. Amit Langote, per a report from Hans Buschmann. Discussion: http://postgr.es/m/20170421184012.24362.19@wrigleys.postgresql.org
* psql: Support identity columns in sequence displayPeter Eisentraut2017-04-28
| | | | | | | Where the footer for an owned serial sequence would say "Owned by", put something analogous for a sequence belonging to an identity column. Reported-by: Vitaly Burovoy <vitaly.burovoy@gmail.com>
* In load_relcache_init_file, initialize rd_pdcxt.Robert Haas2017-04-28
| | | | | | Oversight noted by Gao Zeng Qi. Discussion: http://postgr.es/m/CAFmBtr1N3-SbepJbnGpaYp=jw-FvWMnYY7-bTtRgvjvbyB8YJA@mail.gmail.com
* Speed up dropping tables with many partitions.Robert Haas2017-04-28
| | | | | | | | | We need to lock the parent, but we don't need a relcache entry for it. Gao Zeng Qi, reviewed by Amit Langote Discussion: http://postgr.es/m/CAFmBtr0ukqJjRJEhPWL5wt4rNMrJUUxggVAGXPR3SyYh3E+HDQ@mail.gmail.com
* Fix crash when partitioned column specified twice.Robert Haas2017-04-28
| | | | | | Amit Langote, reviewed by Beena Emerson Discussion: http://postgr.es/m/6ed23d3d-c09d-4cbc-3628-0a8a32f750f4@lab.ntt.co.jp
* Wait between tablesync worker restartsPeter Eisentraut2017-04-28
| | | | | | | | | | | | | | | | | | | | | Before restarting a tablesync worker for the same relation, wait wal_retrieve_retry_interval (currently 5s by default). This avoids restarting failing workers in a tight loop. We keep the last start times in a hash table last_start_times that is separate from the table_states list, because that list is cleared out on syscache invalidation, which happens whenever a table finishes syncing. The hash table is kept until all tables have finished syncing. A future project might be to unify these two and keep everything in one data structure, but for now this is a less invasive change to accomplish the original purpose. For the test suite, set wal_retrieve_retry_interval to its minimum value, to not increase the test suite run time. Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
* Misc SCRAM code cleanups.Heikki Linnakangas2017-04-28
| | | | | | | | | | | | | | | | | | | | * Move computation of SaltedPassword to a separate function from scram_ClientOrServerKey(). This saves a lot of cycles in libpq, by computing SaltedPassword only once per authentication. (Computing SaltedPassword is expensive by design.) * Split scram_ClientOrServerKey() into two functions. Improves readability, by making the calling code less verbose. * Rename "server proof" to "server signature", to better match the nomenclature used in RFC 5802. * Rename SCRAM_SALT_LEN to SCRAM_DEFAULT_SALT_LEN, to make it more clear that the salt can be of any length, and the constant only specifies how long a salt we use when we generate a new verifier. Also rename SCRAM_ITERATIONS_DEFAULT to SCRAM_DEFAULT_ITERATIONS, for consistency. These things caught my eye while working on other upcoming changes.
* Remove unnecessairly duplicated gram.y productionsStephen Frost2017-04-27
| | | | | | | | | | | | | | | Declarative partitioning duplicated the TypedTableElement productions, evidently to remove the need to specify WITH OPTIONS when creating partitions. Instead, simply make WITH OPTIONS optional in the TypedTableElement production and remove all of the duplicate PartitionElement-related productions. This change simplifies the syntax and makes WITH OPTIONS optional when adding defaults, constraints or storage parameters to columns when creating either typed tables or partitions. Also update pg_dump to no longer include WITH OPTIONS, since it's not necessary, and update the documentation to reflect that WITH OPTIONS is now optional.
* Don't build full initial logical decoding snapshot if NOEXPORT_SNAPSHOT.Andres Freund2017-04-27
| | | | | | | | | Earlier commits (56e19d938dd14 and 2bef06d5164) make it cheaper to create a logical slot if not exporting the initial snapshot. If NOEXPORT_SNAPSHOT is specified, we can skip the overhead, not just when creating a slot via sql (which can't export snapshots). As NOEXPORT_SNAPSHOT has only recently been introduced, this shouldn't be backpatched.
* Don't use on-disk snapshots for exported logical decoding snapshot.Andres Freund2017-04-27
| | | | | | | | | | | | | | | | | | | | | Logical decoding stores historical snapshots on disk, so that logical decoding can restart without having to reconstruct a snapshot from scratch (for which the resources are not guaranteed to be present anymore). These serialized snapshots were also used when creating a new slot via the walsender interface, which can export a "full" snapshot (i.e. one that can read all tables, not just catalog ones). The problem is that the serialized snapshots are only useful for catalogs and not for normal user tables. Thus the use of such a serialized snapshot could result in an inconsistent snapshot being exported, which could lead to queries returning wrong data. This would only happen if logical slots are created while another logical slot already exists. Author: Petr Jelinek Reviewed-By: Andres Freund Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com Backport: 9.4, where logical decoding was introduced.
* Avoid slow shutdown of pg_basebackup.Tom Lane2017-04-27
| | | | | | | | | | | | | | | | | | | | | | | | | pg_basebackup's child process did not pay any attention to the pipe from its parent while waiting for input from the source server. If no server data was arriving, it would only wake up and check the pipe every standby_message_timeout or so. This creates a problem since the parent process might determine and send the desired stop position only after the server has reached end-of-WAL and stopped sending data. In the src/test/recovery regression tests, the timing is repeatably such that it takes nearly 10 seconds for the child process to realize that it should shut down. It's not clear how often that would happen in real-world cases, but it sure seems like a bug --- and if the user turns off standby_message_timeout or sets it very large, the delay could be a lot worse. To fix, expand the StreamCtl API to allow the pipe input FD to be passed down to the low-level wait routine, and watch both sockets when sleeping. (Note: AFAICS this issue doesn't affect the Windows port, since it doesn't rely on a pipe to transfer the stop position to the child thread.) Discussion: https://postgr.es/m/6456.1493263884@sss.pgh.pa.us
* Fix bug so logical rep launcher saves correctly time of last startup of worker.Fujii Masao2017-04-28
| | | | | | | | | | | | | | | | | | | | | | Previously the logical replication launcher stored the last timestamp when it started the worker, in the local variable "last_start_time", in order to check whether wal_retrive_retry_interval elapsed since the last startup of worker. If it has elapsed, the launcher sees pg_subscription and starts new worker if necessary. This is for limitting the startup of worker to once a wal_retrieve_retry_interval. The bug was that the variable "last_start_time" was defined and always initialized with 0 at the beginning of the launcher's main loop. So even if it's set to the last timestamp in later phase of the loop, it's always reset to 0. Therefore the launcher could not check correctly whether wal_retrieve_retry_interval elapsed since the last startup. This patch moves the variable "last_start_time" outside the main loop so that it will not be reset. Reviewed-by: Petr Jelinek Discussion: http://postgr.es/m/CAHGQGwGJrPO++XM4mFENAwpy1eGXKsGdguYv43GUgLgU-x8nTQ@mail.gmail.com
* Cope with glibc too old to have epoll_create1().Tom Lane2017-04-27
| | | | | | | | | Commit fa31b6f4e supposed that we didn't have to worry about that anymore, but it seems that RHEL5 is like that, and that's still a supported platform. Put back the prior coding under an #ifdef, adding an explicit fcntl() to retain the desired CLOEXEC property. Discussion: https://postgr.es/m/12307.1493325329@sss.pgh.pa.us
* Preserve required !catalog tuples while computing initial decoding snapshot.Andres Freund2017-04-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The logical decoding machinery already preserved all the required catalog tuples, which is sufficient in the course of normal logical decoding, but did not guarantee that non-catalog tuples were preserved during computation of the initial snapshot when creating a slot over the replication protocol. This could cause a corrupted initial snapshot being exported. The time window for issues is usually not terribly large, but on a busy server it's perfectly possible to it hit it. Ongoing decoding is not affected by this bug. To avoid increased overhead for the SQL API, only retain additional tuples when a logical slot is being created over the replication protocol. To do so this commit changes the signature of CreateInitDecodingContext(), but it seems unlikely that it's being used in an extension, so that's probably ok. In a drive-by fix, fix handling of ReplicationSlotsComputeRequiredXmin's already_locked argument, which should only apply to ProcArrayLock, not ReplicationSlotControlLock. Reported-By: Erik Rijkers Analyzed-By: Petr Jelinek Author: Petr Jelinek, heavily editorialized by Andres Freund Reviewed-By: Andres Freund Discussion: https://postgr.es/m/9a897b86-46e1-9915-ee4c-da02e4ff6a95@2ndquadrant.com Backport: 9.4, where logical decoding was introduced.
* Make latch.c more paranoid about child-process cases.Tom Lane2017-04-27
| | | | | | | | | | | | | | | | | | | | | | | | | | Although the postmaster doesn't currently create a self-pipe or any latches, there's discussion of it doing so in future. It's also conceivable that a shared_preload_libraries extension would try to create such a thing in the postmaster process today. In that case the self-pipe FDs would be inherited by forked child processes. latch.c was entirely unprepared for such a case and could suffer an assertion failure, or worse try to use the inherited pipe if somebody called WaitLatch without having called InitializeLatchSupport in that process. Make it keep track of whether InitializeLatchSupport has been called in the *current* process, and do the right thing if state has been inherited from a parent. Apply FD_CLOEXEC to file descriptors created in latch.c (the self-pipe, as well as epoll event sets). This ensures that child processes spawned in backends, the archiver, etc cannot accidentally or intentionally mess with these FDs. It also ensures that we end up with the right state for the self-pipe in EXEC_BACKEND processes, which otherwise wouldn't know to close the postmaster's self-pipe FDs. Back-patch to 9.6, mainly to keep latch.c looking similar in all branches it exists in. Discussion: https://postgr.es/m/8322.1493240739@sss.pgh.pa.us
* Rework handling of subtransactions in 2PC recoverySimon Riggs2017-04-27
| | | | | | | | | | | | | | The bug fixed by 0874d4f3e183757ba15a4b3f3bf563e0393dd9c2 caused us to question and rework the handling of subtransactions in 2PC during and at end of recovery. Patch adds checks and tests to ensure no further bugs. This effectively removes the temporary measure put in place by 546c13e11b29a5408b9d6a6e3cca301380b47f7f. Author: Simon Riggs Reviewed-by: Tom Lane, Michael Paquier Discussion: http://postgr.es/m/CANP8+j+vvXmruL_i2buvdhMeVv5TQu0Hm2+C5N+kdVwHJuor8w@mail.gmail.com