aboutsummaryrefslogtreecommitdiff
path: root/src/backend/commands
Commit message (Collapse)AuthorAge
...
* Support SCRAM-SHA-256 authentication (RFC 5802 and 7677).Heikki Linnakangas2017-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces a new generic SASL authentication method, similar to the GSS and SSPI methods. The server first tells the client which SASL authentication mechanism to use, and then the mechanism-specific SASL messages are exchanged in AuthenticationSASLcontinue and PasswordMessage messages. Only SCRAM-SHA-256 is supported at the moment, but this allows adding more SASL mechanisms in the future, without changing the overall protocol. Support for channel binding, aka SCRAM-SHA-256-PLUS is left for later. The SASLPrep algorithm, for pre-processing the password, is not yet implemented. That could cause trouble, if you use a password with non-ASCII characters, and a client library that does implement SASLprep. That will hopefully be added later. Authorization identities, as specified in the SCRAM-SHA-256 specification, are ignored. SET SESSION AUTHORIZATION provides more or less the same functionality, anyway. If a user doesn't exist, perform a "mock" authentication, by constructing an authentic-looking challenge on the fly. The challenge is derived from a new system-wide random value, "mock authentication nonce", which is created at initdb, and stored in the control file. We go through these motions, in order to not give away the information on whether the user exists, to unauthenticated users. Bumps PG_CONTROL_VERSION, because of the new field in control file. Patch by Michael Paquier and Heikki Linnakangas, reviewed at different stages by Robert Haas, Stephen Frost, David Steele, Aleksander Alekseev, and many others. Discussion: https://www.postgresql.org/message-id/CAB7nPqRbR3GmFYdedCAhzukfKrgBLTLtMvENOmPrVWREsZkF8g%40mail.gmail.com Discussion: https://www.postgresql.org/message-id/CAB7nPqSMXU35g%3DW9X74HVeQp0uvgJxvYOuA4A-A3M%2B0wfEBv-w%40mail.gmail.com Discussion: https://www.postgresql.org/message-id/55192AFE.6080106@iki.fi
* Avoid dangling pointer to relation name in RLS code path in DoCopy().Tom Lane2017-03-06
| | | | | | | | | | | | | | | | | With RLS active, "COPY tab TO ..." failed under -DRELCACHE_FORCE_RELEASE, and would sometimes fail without that, because it used the relation name directly from the relcache as part of the parsetree it's building. That becomes a potentially-dangling pointer as soon as the relcache entry is closed, a bit further down. Typical symptom if the relcache entry chanced to get cleared would be "relation does not exist" error with a garbage relation name, or possibly a core dump; but if you were really truly unlucky, the COPY might copy from the wrong table. Per report from Andrew Dunstan that regression tests fail with -DRELCACHE_FORCE_RELEASE. The core tests now pass for me (but have not tried "make check-world" yet). Discussion: https://postgr.es/m/7b52f900-0579-cda9-ae2e-de5da17090e6@2ndQuadrant.com
* Replace LookupFuncNameTypeNames() with LookupFuncWithArgs()Peter Eisentraut2017-03-06
| | | | | | | | | | | The old function took function name and function argument list as separate arguments. Now that all function signatures are passed around as ObjectWithArgs structs, this is no longer necessary and can be replaced by a function that takes ObjectWithArgs directly. Similarly for aggregates and operators. Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Remove objname/objargs split for referring to objectsPeter Eisentraut2017-03-06
| | | | | | | | | | | | | | | | | | | | | | | In simpler times, it might have worked to refer to all kinds of objects by a list of name components and an optional argument list. But this doesn't work for all objects, which has resulted in a collection of hacks to place various other nodes types into these fields, which have to be unpacked at the other end. This makes it also weird to represent lists of such things in the grammar, because they would have to be lists of singleton lists, to make the unpacking work consistently. The other problem is that keeping separate name and args fields makes it awkward to deal with lists of functions. Change that by dropping the objargs field and have objname, renamed to object, be a generic Node, which can then be flexibly assigned and managed using the normal Node mechanisms. In many cases it will still be a List of names, in some cases it will be a string Value, for types it will be the existing Typename, for functions it will now use the existing ObjectWithArgs node type. Some of the more obscure object types still use somewhat arbitrary nested lists. Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Add operator_with_argtypes grammar rulePeter Eisentraut2017-03-06
| | | | | | | | | | This makes the handling of operators similar to that of functions and aggregates. Rename node FuncWithArgs to ObjectWithArgs, to reflect the expanded use. Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Use class_args field in opclass_dropPeter Eisentraut2017-03-06
| | | | | | | This makes it consistent with the usage in opclass_item. Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Allow partitioned tables to be dropped without CASCADESimon Riggs2017-03-06
| | | | | | | | | | Record partitioned table dependencies as DEPENDENCY_AUTO rather than DEPENDENCY_NORMAL, so that DROP TABLE just works. Remove all the tests for partitioned tables where earlier work had deliberately avoided using CASCADE. Amit Langote, reviewed by Ashutosh Bapat and myself
* In rebuild_relation(), don't access an already-closed relcache entry.Tom Lane2017-03-04
| | | | | | | | | | | | | | | | This reliably fails with -DRELCACHE_FORCE_RELEASE, as reported by Andrew Dunstan, and could sometimes fail in normal operation, resulting in a wrong persistence value being used for the transient table. It's not immediately clear to me what effects that might have beyond the risk of a crash while accessing OldHeap->rd_rel->relpersistence, but it's probably not good. Bug introduced by commit f41872d0c, and made substantially worse by commit 85b506bbf, which added a second such access significantly later than the heap_close. I doubt the first reference could fail in a production scenario, but the second one definitely could. Discussion: https://postgr.es/m/7b52f900-0579-cda9-ae2e-de5da17090e6@2ndQuadrant.com
* Disallow CREATE/DROP SUBSCRIPTION in transaction blockPeter Eisentraut2017-03-03
| | | | | | | | Disallow CREATE SUBSCRIPTION and DROP SUBSCRIPTION in a transaction block when the replication slot is to be created or dropped, since that cannot be rolled back. based on patch by Masahiko Sawada <sawada.mshk@gmail.com>
* Fix typoPeter Eisentraut2017-03-03
|
* Add RENAME support for PUBLICATIONs and SUBSCRIPTIONsPeter Eisentraut2017-03-03
| | | | From: Petr Jelinek <petr.jelinek@2ndquadrant.com>
* Allow vacuums to report oldestxminSimon Riggs2017-03-03
| | | | | | Allow VACUUM and Autovacuum to report the oldestxmin value they used while cleaning tables, helping to make better sense out of the other statistics we report in various cases.
* Don't uselessly rewrite, truncate, VACUUM, or ANALYZE partitioned tables.Robert Haas2017-03-02
| | | | | | | | | | Also, recursively perform VACUUM and ANALYZE on partitions when the command is applied to a partitioned table. In passing, some related documentation updates. Amit Langote, reviewed by Michael Paquier, Ashutosh Bapat, and by me. Discussion: http://postgr.es/m/47288cf1-f72c-dfc2-5ff0-4af962ae5c1b@lab.ntt.co.jp
* Remove useless duplicate inclusions of system header files.Tom Lane2017-02-25
| | | | | | | | | | | | | | | | c.h #includes a number of core libc header files, such as <stdio.h>. There's no point in re-including these after having read postgres.h, postgres_fe.h, or c.h; so remove code that did so. While at it, also fix some places that were ignoring our standard pattern of "include postgres[_fe].h, then system header files, then other Postgres header files". While there's not any great magic in doing it that way rather than system headers last, it's silly to have just a few files deviating from the general pattern. (But I didn't attempt to enforce this globally, only in files I was touching anyway.) I'd be the first to say that this is mostly compulsive neatnik-ism, but over time it might save enough compile cycles to be useful.
* Consistently declare timestamp variables as TimestampTz.Tom Lane2017-02-23
| | | | | | | | | | | | | | | | | | | Twiddle the replication-related code so that its timestamp variables are declared TimestampTz, rather than the uninformative "int64" that was previously used for meant-to-be-always-integer timestamps. This resolves the int64-vs-TimestampTz declaration inconsistencies introduced by commit 7c030783a, though in the opposite direction to what was originally suggested. This required including datatype/timestamp.h in a couple more places than before. I decided it would be a good idea to slim down that header by not having it pull in <float.h> etc, as those headers are no longer at all relevant to its purpose. Unsurprisingly, a small number of .c files turn out to have been depending on those inclusions, so add them back in the .c files as needed. Discussion: https://postgr.es/m/26788.1487455319@sss.pgh.pa.us Discussion: https://postgr.es/m/27694.1487456324@sss.pgh.pa.us
* Remove now-dead code for !HAVE_INT64_TIMESTAMP.Tom Lane2017-02-23
| | | | | | | This is a basically mechanical removal of #ifdef HAVE_INT64_TIMESTAMP tests and the negative-case controlled code. Discussion: https://postgr.es/m/26788.1487455319@sss.pgh.pa.us
* Fix whitespacePeter Eisentraut2017-02-21
|
* Fix connection leak in DROP SUBSCRIPTION command.Fujii Masao2017-02-22
| | | | | Previously the command forgot to close the connection to the publisher when it failed to drop the replication slot.
* Make more use of castNode()Peter Eisentraut2017-02-21
|
* Make partitions automatically inherit OIDs.Robert Haas2017-02-19
| | | | | | | | | | Previously, if the parent was specified as WITH OIDS, each child also had to be explicitly specified as WITH OIDS. Amit Langote, per a report from Simon Riggs. Some additional work on the documentation changes by me. Discussion: http://postgr.es/m/CANP8+jJBpWocfKrbJcaf3iBt9E3U=WPE_NC8YE6rye+YJ1sYnQ@mail.gmail.com
* Add CREATE COLLATION IF NOT EXISTS clausePeter Eisentraut2017-02-15
| | | | | | The core of the functionality was already implemented when pg_import_system_collations was added. This just exposes it as an option in the SQL command.
* Don't disallow dropping NOT NULL for a list partition key.Robert Haas2017-02-14
| | | | | | | Range partitioning doesn't support nulls in the partitioning columns, but list partitioning does. Amit Langote, per a complaint from Amul Sul
* Ignore tablespace ACLs when ignoring schema ACLs.Noah Misch2017-02-12
| | | | | | | | | | | The ALTER TABLE ALTER TYPE implementation can issue DROP INDEX and CREATE INDEX to refit existing indexes for the new column type. Since this CREATE INDEX is an implementation detail of an index alteration, the ensuing DefineIndex() should skip ACL checks specific to index creation. It already skips the namespace ACL check. Make it skip the tablespace ACL check, too. Back-patch to 9.2 (all supported versions). Reviewed by Tom Lane.
* Add CREATE SEQUENCE AS <data type> clausePeter Eisentraut2017-02-10
| | | | | | | | | | | | | | | | | | | | | | | | This stores a data type, required to be an integer type, with the sequence. The sequences min and max values default to the range supported by the type, and they cannot be set to values exceeding that range. The internal implementation of the sequence is not affected. Change the serial types to create sequences of the appropriate type. This makes sure that the min and max values of the sequence for a serial column match the range of values supported by the table column. So the sequence can no longer overflow the table column. This also makes monitoring for sequence exhaustion/wraparound easier, which currently requires various contortions to cross-reference the sequences with the table columns they are used with. This commit also effectively reverts the pg_sequence column reordering in f3b421da5f4addc95812b9db05a24972b8fd9739, because the new seqtypid column allows us to fill the hole in the struct and create a more natural overall column ordering. Reviewed-by: Steve Singer <steve@ssinger.info> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Allow index AMs to cache data across aminsert calls within a SQL command.Tom Lane2017-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | | It's always been possible for index AMs to cache data across successive amgettuple calls within a single SQL command: the IndexScanDesc.opaque field is meant for precisely that. However, no comparable facility exists for amortizing setup work across successive aminsert calls. This patch adds such a feature and teaches GIN, GIST, and BRIN to use it to amortize catalog lookups they'd previously been doing on every call. (The other standard index AMs keep everything they need in the relcache, so there's little to improve there.) For GIN, the overall improvement in a statement that inserts many rows can be as much as 10%, though it seems a bit less for the other two. In addition, this makes a really significant difference in runtime for CLOBBER_CACHE_ALWAYS tests, since in those builds the repeated catalog lookups are vastly more expensive. The reason this has been hard up to now is that the aminsert function is not passed any useful place to cache per-statement data. What I chose to do is to add suitable fields to struct IndexInfo and pass that to aminsert. That's not widening the index AM API very much because IndexInfo is already within the ken of ambuild; in fact, by passing the same info to aminsert as to ambuild, this is really removing an inconsistency in the AM API. Discussion: https://postgr.es/m/27568.1486508680@sss.pgh.pa.us
* Add WAL consistency checking facility.Robert Haas2017-02-08
| | | | | | | | | | | | | | When the new GUC wal_consistency_checking is set to a non-empty value, it triggers recording of additional full-page images, which are compared on the standby against the results of applying the WAL record (without regard to those full-page images). Allowable differences such as hints are masked out, and the resulting pages are compared; any difference results in a FATAL error on the standby. Kuntal Ghosh, based on earlier patches by Michael Paquier and Heikki Linnakangas. Extensively reviewed and revised by Michael Paquier and by me, with additional reviews and comments from Amit Kapila, Álvaro Herrera, Simon Riggs, and Peter Eisentraut.
* Fix typos in comments.Heikki Linnakangas2017-02-06
| | | | | | | | | Backpatch to all supported versions, where applicable, to make backpatching of future fixes go more smoothly. Josh Soref Discussion: https://www.postgresql.org/message-id/CACZqfqCf+5qRztLPgmmosr-B0Ye4srWzzw_mo4c_8_B_mtjmJQ@mail.gmail.com
* Be sure to release LogicalRepLauncherLock in DROP SUBSCRIPTION command.Fujii Masao2017-02-04
| | | | | | | | Previously DROP SUBSCRIPTION command forgot to release the lock at all. Original patches by Kyotaro Horiguchi and Michael Paquier, but I didn't use them. Discussion: http://postgr.es/m/20170201.173623.66249355.horiguchi.kyotaro@lab.ntt.co.jp
* Fix CatalogTupleInsert/Update abstraction for case of shared indstate.Tom Lane2017-02-01
| | | | | | | | | | | | | | | | | | | | | | Add CatalogTupleInsertWithInfo and CatalogTupleUpdateWithInfo to let callers use the CatalogTupleXXX abstraction layer even in cases where we want to share the results of CatalogOpenIndexes across multiple inserts/updates for efficiency. This finishes the job begun in commit 2f5c9d9c9, by allowing some remaining simple_heap_insert/update calls to be replaced. The abstraction layer is now complete enough that we don't have to export CatalogIndexInsert at all anymore. Also, this fixes several places in which 2f5c9d9c9 introduced performance regressions by using retail CatalogTupleInsert or CatalogTupleUpdate even though the previous coding had been able to amortize CatalogOpenIndexes work across multiple tuples. A possible future improvement is to arrange for the indexing.c functions to cache the CatalogIndexState somewhere, maybe in the relcache, in which case we could get rid of CatalogTupleInsertWithInfo and CatalogTupleUpdateWithInfo again. But that's a task for another day. Discussion: https://postgr.es/m/27502.1485981379@sss.pgh.pa.us
* Provide CatalogTupleDelete() as a wrapper around simple_heap_delete().Tom Lane2017-02-01
| | | | | | | | | | | | | | | | This extends the work done in commit 2f5c9d9c9 to provide a more nearly complete abstraction layer hiding the details of index updating for catalog changes. That commit only invented abstractions for catalog inserts and updates, leaving nearby code for catalog deletes still calling the heap-level routines directly. That seems rather ugly from here, and it does little to help if we ever want to shift to a storage system in which indexing work is needed at delete time. Hence, create a wrapper function CatalogTupleDelete(), and replace calls of simple_heap_delete() on catalog tuples with it. There are now very few direct calls of [simple_]heap_delete remaining in the tree. Discussion: https://postgr.es/m/462.1485902736@sss.pgh.pa.us
* Replace isMD5() with a more future-proof way to check if pw is encrypted.Heikki Linnakangas2017-02-01
| | | | | | | | | | | | | | | | | | | The rule is that if pg_authid.rolpassword begins with "md5" and has the right length, it's an MD5 hash, otherwise it's a plaintext password. The idiom has been to use isMD5() to check for that, but that gets awkward, when we add new kinds of verifiers, like the verifiers for SCRAM authentication in the pending SCRAM patch set. Replace isMD5() with a new get_password_type() function, so that when new verifier types are added, we don't need to remember to modify every place that currently calls isMD5(), to also recognize the new kinds of verifiers. Also, use the new plain_crypt_verify function in passwordcheck, so that it doesn't need to know about MD5, or in the future, about other kinds of hashes or password verifiers. Reviewed by Michael Paquier and Peter Eisentraut. Discussion: https://www.postgresql.org/message-id/2d07165c-1793-e243-a2a9-e45b624c7580@iki.fi
* Tweak catalog indexing abstraction for upcoming WARMAlvaro Herrera2017-01-31
| | | | | | | | | | | | | | | | | | | | | Split the existing CatalogUpdateIndexes into two different routines, CatalogTupleInsert and CatalogTupleUpdate, which do both the heap insert/update plus the index update. This removes over 300 lines of boilerplate code all over src/backend/catalog/ and src/backend/commands. The resulting code is much more pleasing to the eye. Also, by encapsulating what happens in detail during an UPDATE, this facilitates the upcoming WARM patch, which is going to add a few more lines to the update case making the boilerplate even more boring. The original CatalogUpdateIndexes is removed; there was only one use left, and since it's just three lines, we can as well expand it in place there. We could keep it, but WARM is going to break all the UPDATE out-of-core callsites anyway, so there seems to be no benefit in doing so. Author: Pavan Deolasee Discussion: https://www.postgr.es/m/CABOikdOcFYSZ4vA2gYfs=M2cdXzXX4qGHeEiW3fu9PCfkHLa2A@mail.gmail.com
* Handle ALTER EXTENSION ADD/DROP with pg_init_privsStephen Frost2017-01-29
| | | | | | | | | | | | | | | | | | | | | | | | In commit 6c268df, pg_init_privs was added to track the initial privileges of catalog objects and extensions. Unfortunately, that commit didn't include understanding of ALTER EXTENSION ADD/DROP, which allows the objects associated with an extension to be changed after the initial CREATE EXTENSION script has been run. The result of this meant that ACLs for objects added through ALTER EXTENSION ADD were not recorded into pg_init_privs and we would end up including those ACLs in pg_dump when we shouldn't have. This commit corrects that by making sure to have pg_init_privs updated when ALTER EXTENSION ADD/DROP is run, recording the permissions as they are at ALTER EXTENSION ADD time, and removing any if/when ALTER EXTENSION DROP is called. This issue was pointed out by Moshe Jacobson as commentary on bug #14456 (which was actually a bug about versions prior to 9.6 not handling custom ACLs on extensions correctly, an issue now addressed with pg_init_privs in 9.6). Back-patch to 9.6 where pg_init_privs was introduced.
* Use castNode() in a bunch of statement-list-related code.Tom Lane2017-01-26
| | | | | | | | | | | | | When I wrote commit ab1f0c822, I really missed the castNode() macro that Peter E. had proposed shortly before. This back-fills the uses I would have put it to. It's probably not all that significant, but there are more assertions here than there were before, and conceivably they will help catch any bugs associated with those representation changes. I left behind a number of usages like "(Query *) copyObject(query_var)". Those could have been converted as well, but Peter has proposed another notational improvement that would handle copyObject cases automatically, so I let that be for now.
* Use the new castNode() macro in a number of places.Andres Freund2017-01-26
| | | | | | | | | This is far from a pervasive conversion, but it's a good starting point. Author: Peter Eisentraut, with some minor changes by me Reviewed-By: Tom Lane Discussion: https://postgr.es/m/c5d387d9-3440-f5e0-f9d4-71d53b9fbe52@2ndquadrant.com
* Update copyright years in some recently added filesPeter Eisentraut2017-01-25
|
* Close replication connection when slot creation errorsPeter Eisentraut2017-01-25
| | | | From: Petr Jelinek <pjmodos@pjmodos.net>
* Reindent table partitioning code.Robert Haas2017-01-24
| | | | | | We've accumulated quite a bit of stuff with which pgindent is not quite happy in this code; clean it up to provide a less-annoying base for future pgindent runs.
* Fix interaction of partitioned tables with BulkInsertState.Robert Haas2017-01-24
| | | | | | | | | | | | | | | | | | | | When copying into a partitioned table, the target heap may change from one tuple to next. We must ask ReadBufferBI() to get a new buffer every time such change occurs. To do that, use new function ReleaseBulkInsertStatePin(). This fixes the bug that tuples ended up being inserted into the wrong partition, which occurred exactly because the wrong buffer was used. Amit Langote, per a suggestion from Robert Haas. Some cosmetic adjustments by me. Reports by 高增琦 (Gao Zengqi), Venkata B Nagothi, and Ragnar Ouchterlony. Discussion: http://postgr.es/m/CAFmBtr32FDOqofo8yG-4mjzL1HnYHxXK5S9OGFJ%3D%3DcJpgEW4vA%40mail.gmail.com Discussion: http://postgr.es/m/CAEyp7J9WiX0L3DoiNcRrY-9iyw%3DqP%2Bj%3DDLsAnNFF1xT2J1ggfQ%40mail.gmail.com Discussion: http://postgr.es/m/16d73804-c9cd-14c5-463e-5caad563ff77%40agama.tv Discussion: http://postgr.es/m/CA+TgmoaiZpDVUUN8LZ4jv1qFE_QyR+H9ec+79f5vNczYarg5Zg@mail.gmail.com
* Fix default minimum value for descending sequencesPeter Eisentraut2017-01-23
| | | | | | | | | | | For some reason that is lost in history, a descending sequence would default its minimum value to -2^63+1 (-PG_INT64_MAX) instead of -2^63 (PG_INT64_MIN), even though explicitly specifying a minimum value of -2^63 would work. Fix this inconsistency by using the full range by default. Reported-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Don't error when no system locales were foundPeter Eisentraut2017-01-23
| | | | | | initdb used to warn about that, but it was changed to an error in pg_import_system_locales, but some build farm members failed because of that. Change it back to a warning.
* Prefetch blocks during lazy vacuum's truncation scanAlvaro Herrera2017-01-23
| | | | | | | | | | | | | | Vacuum truncation scan can be sped up on rotating media by prefetching blocks in forward direction. That makes the blocks already present in memory by the time they are needed, while also letting OS read-ahead kick in. The truncate scan has been measured to be five times faster than without this patch (that was on a slow disk, but it shouldn't hurt on fast disks.) Author: Álvaro Herrera, loosely based on a submission by Claudio Freire Discussion: https://postgr.es/m/CAGTBQpa6NFGO_6g_y_7zQx8L9GcHDSQKYdo1tGuh791z6PYgEg@mail.gmail.com
* Move some things from builtins.h to new header filesPeter Eisentraut2017-01-20
| | | | This avoids that builtins.h has to include additional header files.
* Record dependencies on owners for logical replication objectsAlvaro Herrera2017-01-20
| | | | | | | | | This was forgotten in 665d1fad99e7b11678b0d5fa24d2898424243cd6 and caused the whole buildfarm to become red for a little while. Author: Petr Jelínek Also fix a typo in a nearby error message.
* Logical replicationPeter Eisentraut2017-01-20
| | | | | | | | | | | | | - Add PUBLICATION catalogs and DDL - Add SUBSCRIPTION catalog and DDL - Define logical replication protocol and output plugin - Add logical replication workers From: Petr Jelinek <petr@2ndquadrant.com> Reviewed-by: Steve Singer <steve@ssinger.info> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Erik Rijkers <er@xs4all.nl> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
* Remove obsoleted code relating to targetlist SRF evaluation.Andres Freund2017-01-19
| | | | | | | | | | | | | Since 69f4b9c plain expression evaluation (and thus normal projection) can't return sets of tuples anymore. Thus remove code dealing with that possibility. This will require adjustments in external code using ExecEvalExpr()/ExecProject() - that should neither be hard nor very common. Author: Andres Freund and Tom Lane Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de
* Fix race condition in reading commit timestampsAlvaro Herrera2017-01-19
| | | | | | | | | | | | | | | | | | | | | | | | | If a user requests the commit timestamp for a transaction old enough that its data is concurrently being truncated away by vacuum at just the right time, they would receive an ugly internal file-not-found error message from slru.c rather than the expected NULL return value. In a primary server, the window for the race is very small: the lookup has to occur exactly between the two calls by vacuum, and there's not a lot that happens between them (mostly just a multixact truncate). In a standby server, however, the window is larger because the truncation is executed as soon as the WAL record for it is replayed, but the advance of the oldest-Xid is not executed until the next checkpoint record. To fix in the primary, simply reverse the order of operations in vac_truncate_clog. To fix in the standby, augment the WAL truncation record so that the standby is aware of the new oldest-XID value and can apply the update immediately. WAL version bumped because of this. No backpatch, because of the low importance of the bug and its rarity. Author: Craig Ringer Reviewed-By: Petr Jelínek, Peter Eisentraut Discussion: https://postgr.es/m/CAMsr+YFhVtRQT1VAwC+WGbbxZZRzNou=N9Ed-FrCqkwQ8H8oJQ@mail.gmail.com
* Fix RETURNING to work correctly with partition tuple routing.Robert Haas2017-01-19
| | | | | | | | | | In ExecInsert(), do not switch back to the root partitioned table ResultRelInfo until after we finish ExecProcessReturning(), so that RETURNING projection is done using the partition's descriptor. For the projection to work correctly, we must initialize the same for each leaf partition during ModifyTableState initialization. Amit Langote
* Fix failure to enforce partitioning contraint for internal partitions.Robert Haas2017-01-19
| | | | | | | | | | | | | When a tuple is inherited into a partitioning root, no partition constraints need to be enforced; when it is inserted into a leaf, the parent's partitioning quals needed to be enforced. The previous coding got both of those cases right. When a tuple is inserted into an intermediate level of the partitioning hierarchy (i.e. a table which is both a partition itself and in turn partitioned), it must enforce the partitioning qual inherited from its parent. That case got overlooked; repair. Amit Langote
* Move targetlist SRF handling from expression evaluation to new executor node.Andres Freund2017-01-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Evaluation of set returning functions (SRFs_ in the targetlist (like SELECT generate_series(1,5)) so far was done in the expression evaluation (i.e. ExecEvalExpr()) and projection (i.e. ExecProject/ExecTargetList) code. This meant that most executor nodes performing projection, and most expression evaluation functions, had to deal with the possibility that an evaluated expression could return a set of return values. That's bad because it leads to repeated code in a lot of places. It also, and that's my (Andres's) motivation, made it a lot harder to implement a more efficient way of doing expression evaluation. To fix this, introduce a new executor node (ProjectSet) that can evaluate targetlists containing one or more SRFs. To avoid the complexity of the old way of handling nested expressions returning sets (e.g. having to pass up ExprDoneCond, and dealing with arguments to functions returning sets etc.), those SRFs can only be at the top level of the node's targetlist. The planner makes sure (via split_pathtarget_at_srfs()) that SRF evaluation is only necessary in ProjectSet nodes and that SRFs are only present at the top level of the node's targetlist. If there are nested SRFs the planner creates multiple stacked ProjectSet nodes. The ProjectSet nodes always get input from an underlying node. We also discussed and prototyped evaluating targetlist SRFs using ROWS FROM(), but that turned out to be more complicated than we'd hoped. While moving SRF evaluation to ProjectSet would allow to retain the old "least common multiple" behavior when multiple SRFs are present in one targetlist (i.e. continue returning rows until all SRFs are at the end of their input at the same time), we decided to instead only return rows till all SRFs are exhausted, returning NULL for already exhausted ones. We deemed the previous behavior to be too confusing, unexpected and actually not particularly useful. As a side effect, the previously prohibited case of multiple set returning arguments to a function, is now allowed. Not because it's particularly desirable, but because it ends up working and there seems to be no argument for adding code to prohibit it. Currently the behavior for COALESCE and CASE containing SRFs has changed, returning multiple rows from the expression, even when the SRF containing "arm" of the expression is not evaluated. That's because the SRFs are evaluated in a separate ProjectSet node. As that's quite confusing, we're likely to instead prohibit SRFs in those places. But that's still being discussed, and the code would reside in places not touched here, so that's a task for later. There's a lot of, now superfluous, code dealing with set return expressions around. But as the changes to get rid of those are verbose largely boring, it seems better for readability to keep the cleanup as a separate commit. Author: Tom Lane and Andres Freund Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de