aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
...
* Rename IndexInfo.ii_KeyAttrNumbers arrayTeodor Sigaev2018-04-12
| | | | | | | | Rename ii_KeyAttrNumbers to ii_IndexAttrNumbers to prevent confusion with ii_NumIndexAttrs/ii_NumIndexKeyAttrs. ii_IndexAttrNumbers contains all attributes including "including" columns, not only key attribute. Discussion: https://www.postgresql.org/message-id/13123421-1d52-d0e4-c95c-6d69011e0595%40sigaev.ru
* Set relispartition correctly for index partitionsAlvaro Herrera2018-04-11
| | | | | | | | | | | Oversight in commit 8b08f7d4820f: pg_class.relispartition was not being set for index partitions, which is a bit odd, and was also causing the code to unnecessarily call has_superclass() when simply checking the flag was enough. Author: Álvaro Herrera Reported-by: Amit Langote Discussion: https://postgr.es/m/12085bc4-0bc6-0f3a-4c43-57fe0681772b@lab.ntt.co.jp
* Ignore nextOid when replaying an ONLINE checkpoint.Tom Lane2018-04-11
| | | | | | | | | | | | | | | The nextOid value is from the start of the checkpoint and may well be stale compared to values from more recent XLOG_NEXTOID records. Previously, we adopted it anyway, allowing the OID counter to go backwards during a crash. While this should be harmless, it contributed to the severity of the bug fixed in commit 0408e1ed5, by allowing duplicate TOAST OIDs to be assigned immediately following a crash. Without this error, that issue would only have arisen when TOAST objects just younger than a multiple of 2^32 OIDs were deleted and then not vacuumed in time to avoid a conflict. Pavan Deolasee Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com
* Do not select new object OIDs that match recently-dead entries.Tom Lane2018-04-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When selecting a new OID, we take care to avoid picking one that's already in use in the target table, so as not to create duplicates after the OID counter has wrapped around. However, up to now we used SnapshotDirty when scanning for pre-existing entries. That ignores committed-dead rows, so that we could select an OID matching a deleted-but-not-yet-vacuumed row. While that mostly worked, it has two problems: * If recently deleted, the dead row might still be visible to MVCC snapshots, creating a risk for duplicate OIDs when examining the catalogs within our own transaction. Such duplication couldn't be visible outside the object-creating transaction, though, and we've heard few if any field reports corresponding to such a symptom. * When selecting a TOAST OID, deleted toast rows definitely *are* visible to SnapshotToast, and will remain so until vacuumed away. This leads to a conflict that will manifest in errors like "unexpected chunk number 0 (expected 1) for toast value nnnnn". We've been seeing reports of such errors from the field for years, but the cause was unclear before. The fix is simple: just use SnapshotAny to search for conflicting rows. This results in a slightly longer window before object OIDs can be recycled, but that seems unlikely to create any large problems. Pavan Deolasee Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com
* Allocate enough shared string memory for stats of auxiliary processes.Heikki Linnakangas2018-04-11
| | | | | | | | | | | | | | This fixes a bug whereby the st_appname, st_clienthostname, and st_activity_raw fields for auxiliary processes point beyond the end of their respective shared memory segments. As a result, the application_name of a backend might show up as the client hostname of an auxiliary process. Backpatch to v10, where this bug was introduced, when the auxiliary processes were added to the array. Author: Edmund Horner Reviewed-by: Michael Paquier Discussion: https://www.postgresql.org/message-id/CAMyN-kA7aOJzBmrYFdXcc7Z0NmW%2B5jBaf_m%3D_-77uRNyKC9r%3DA%40mail.gmail.com
* Make local copy of client hostnames in backend status array.Heikki Linnakangas2018-04-11
| | | | | | | | | | | | | The other strings, application_name and query string, were snapshotted to local memory in pgstat_read_current_status(), but we forgot to do that for client hostnames. As a result, the client hostname would appear to change in the local copy, if the client disconnected. Backpatch to all supported versions. Author: Edmund Horner Reviewed-by: Michael Paquier Discussion: https://www.postgresql.org/message-id/CAMyN-kA7aOJzBmrYFdXcc7Z0NmW%2B5jBaf_m%3D_-77uRNyKC9r%3DA%40mail.gmail.com
* Fix ALTER TABLE .. ATTACH PARTITION ... DEFAULTAlvaro Herrera2018-04-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the table being attached contained values that contradict the default partition's partition constraint, it would fail to complain, because CommandCounterIncrement changes in 4dba331cb3dc coupled with some bogus coding in the existing ValidatePartitionConstraints prevented the partition constraint from being validated after all -- or rather, it caused to constraint to become an empty one, always succeeding. Fix by not re-reading the OID of the default partition in ATExecAttachPartition. To forestall similar problems, revise the existing code: * rename routine from ValidatePartitionConstraints() to QueuePartitionConstraintValidation, to better represent what it actually does. * add an Assert() to make sure that when queueing a constraint for a partition we're not overwriting a constraint previously queued. * add an Assert() that we don't try to invoke the special-purpose validation of the default partition when attaching the default partition itself. While at it, change some loops to obtain partition OIDs from partdesc->oids rather than find_all_inheritors; reduce the lock level of partitions being scanned from AccessExclusiveLock to ShareLock; rewrite QueuePartitionConstraintValidation in a recursive fashion rather than repetitive. Author: Álvaro Herrera. Tests written by Amit Langote Reported-by: Rushabh Lathia Diagnosed-by: Kyotaro HORIGUCHI, who also provided the initial fix. Reviewed-by: Kyotaro HORIGUCHI, Amit Langote, Jeevan Ladhe Discussion: https://postgr.es/m/CAGPqQf0W+v-Ci_qNV_5R3A=Z9LsK4+jO7LzgddRncpp_rrnJqQ@mail.gmail.com
* Temporary revert 5c6110c6a960ad6fe1b0d0fec6ae36ef4eb913f5Teodor Sigaev2018-04-11
| | | | It discovers one more bug in CompareIndexInfo(), should be fixed first.
* Fix interference between cavering indexes and partitioned tablesTeodor Sigaev2018-04-11
| | | | | | | | The bug is caused due to the original IndexStmt that DefineIndex receives being overwritten when processing the INCLUDE columns. Use separate list of index params to propagate to child tables. Add tests covering this case. Amit Langote and Alexander Korotkov.
* minor comment fixes in nbtinsert.cAndrew Dunstan2018-04-10
|
* Fix incorrect close() call in dsm_impl_mmap().Tom Lane2018-04-10
| | | | | | | | | | | | One improbable error-exit path in this function used close() where it should have used CloseTransientFile(). This is unlikely to be hit in the field, and I think the consequences wouldn't be awful (just an elog(LOG) bleat later). But a bug is a bug, so back-patch to 9.4 where this code came in. Pan Bian Discussion: https://postgr.es/m/152056616579.4966.583293218357089052@wrigleys.postgresql.org
* Adjustments to the btree fastpath optimization.Andrew Dunstan2018-04-10
| | | | | | | | | | | | | | | This optimization was introduced in commit 2b272734. The changes include some additional comments and documentation, and also these more substantive changes: . ensure the optimization is only applied on the leaf node of a tree whose root is on level 2 or more. It's of little value on small trees. . Delay calling RelationSetTargetBlock() until after the critical section of _bt_insertonpg . ensure the optimization is also applied to unlogged tables. Pavan Deolasee and Peter Geoghegan with some very light editing from me. Discussion: https://postgr.es/m/CABOikdO8jhRarNC60nZLktZYhxt+TK8z_V97+Ny499YQdyAfug@mail.gmail.com
* Fix IndexOnlyScan counter for heap fetches in parallel modeAlvaro Herrera2018-04-10
| | | | | | | | | | | | The HeapFetches counter was using a simple value in IndexOnlyScanState, which fails to propagate values from parallel workers; so the counts are wrong when IndexOnlyScan runs in parallel. Move it to Instrumentation, like all the other counters. While at it, change INSERT ON CONFLICT conflicting tuple counter to use the new ntuples2 instead of nfiltered2, which is a blatant misuse. Discussion: https://postgr.es/m/20180409215851.idwc75ct2bzi6tea@alvherre.pgsql
* Fix comment on B-tree insertion fastpath condition.Heikki Linnakangas2018-04-10
| | | | | | The comment earlier in the function correctly states "and the insertion key is strictly greater than the first key in this page". That is what we check here, not "greater than or equal".
* Fix partial-build problems introduced by having more generated headers.Tom Lane2018-04-09
| | | | | | | | | | | | | | | | | | | | Commit 372728b0d created some problems for usages like building a subdirectory without having first done "make all" at the top level, or for proceeding directly to "make install" without "make all". The only reasonably clean way to fix this seems to be to force the submake-generated-headers rule to fire in *any* "make all" or "make install" command anywhere in the tree. To avoid lots of redundant work, as well as parallel make jobs possibly clobbering each others' output, we still need to be sure that the rule fires only once in a recursive build. For that, adopt the same MAKELEVEL hack previously used for "temp-install". But try to document it a bit better. The submake-errcodes mechanism previously used in src/port/ and src/common/ is subsumed by this, so we can get rid of those special cases. It was inadequate for src/common/ anyway after the aforesaid commit, and it always risked parallel attempts to build errcodes.h. Discussion: https://postgr.es/m/E1f5FAB-0006LU-MB@gemulon.postgresql.org
* Fix incorrect logic for choosing the next Parallel Append subplanAlvaro Herrera2018-04-09
| | | | | | | | | | | | | | | | | In 499be013de support for pruning unneeded Append subnodes was added. The logic in that commit was not correctly checking if the next subplan was in fact a valid subplan. This could cause parallel workers processes to be given a subplan to work on which didn't require any work. Per code review following an otherwise unexplained regression failure in buildfarm member Pademelon. (We haven't been able to reproduce the failure, so this is a bit of a blind fix in terms of whether it'll actually fix it; but it is a clear bug nonetheless). In passing, also add a comment to explain what first_partial_plan means. Author: David Rowley Discussion: https://postgr.es/m/CAKJS1f_E5r05hHUVG3UmCQJ49DGKKHtN=SHybD44LdzBn+CJng@mail.gmail.com
* Reduce chattiness of genbki.pl and Gen_fmgrtab.pl.Tom Lane2018-04-09
| | | | | | | | | | | Make these scripts emit just one log message when they run, not one per output file. The latter is way too verbose in the wake of commit 372728b0d. The specific wording used is what already existed in the MSVC scripts. John Naylor Discussion: https://postgr.es/m/11103.1523208822@sss.pgh.pa.us
* Further cleanup of client dependencies on src/include/catalog headers.Tom Lane2018-04-09
| | | | | | | | | | | | | | | | | In commit 9c0a0de4c, I'd failed to notice that catalog/catalog.h should also be considered a frontend-unsafe header, because it includes (and needs) the full form of pg_class.h, not to mention relcache.h. However, various frontend code was depending on it to get TABLESPACE_VERSION_DIRECTORY, so refactoring of some sort is called for. The cleanest answer seems to be to move TABLESPACE_VERSION_DIRECTORY, as well as the OIDCHARS symbol, to common/relpath.h. Do that, and mop up inclusions as necessary. (I found that quite a few current users of catalog/catalog.h don't seem to need it at all anymore, apparently as a result of the refactorings that created common/relpath.[hc]. And initdb.c needed it only as a route to pg_class_d.h.) Discussion: https://postgr.es/m/6629.1523294509@sss.pgh.pa.us
* Revert "Allow on-line enabling and disabling of data checksums"Magnus Hagander2018-04-09
| | | | | | | | This reverts the backend sides of commit 1fde38beaa0c3e66c340efc7cc0dc272d6254bb0. I have, at least for now, left the pg_verify_checksums tool in place, as this tool can be very valuable without the rest of the patch as well, and since it's a read-only tool that only runs when the cluster is down it should be a lot safer.
* Minor comment updatesAlvaro Herrera2018-04-09
| | | | | | | | Fix a couple of typos, and update a comment about why we set a BMS to NULL. Author: David Rowley Discussion: http://postgr.es/m/CAKJS1f-tux=KdUz6ENJ9GHM_V2qgxysadYiOyQS9Ko9PTteVhQ@mail.gmail.com
* Add missed bms_copy() in perform_pruning_combine_stepAlvaro Herrera2018-04-09
| | | | | | | | | | | We were initializing a BMS to merely reference an existing one, which would cause a double-free (and a crash) when the recursive algorithm tried to intersect it with an empty one. Fix it by creating a copy at initialization time. Reported-by: sqlsmith (by way of Andreas Seltenreich) Author: Amit Langote Discussion: https://postgr.es/m/87in923lyw.fsf@ansel.ydns.eu
* Fix typo in comment.Heikki Linnakangas2018-04-09
| | | | Author: Kyotaro Horiguchi
* Fix additional breakage in covering-index patch.Tom Lane2018-04-08
| | | | | | | | | | | | | | | | | | | | | | | CheckIndexCompatible() misused ComputeIndexAttrs() by not bothering to fill ii_NumIndexAttrs and ii_NumIndexKeyAttrs in the passed IndexInfo. Omission of ii_NumIndexAttrs was previously unimportant, but now this matters because ComputeIndexAttrs depends on ii_NumIndexKeyAttrs to decide how many columns it needs to report on. (BTW, the fact that this oversight wasn't detected earlier implies that we have no regression test verifying whether CheckIndexCompatible ever succeeds. Bad dog. Not the job of this patch to fix it, though.) Also, change the API of ComputeIndexAttrs so that it fills the opclass output array for all column positions, as it does for the options output array; positions for non-key index columns are filled with zeroes. This isn't directly fixing any bug, but it seems like a good idea. Per valgrind failure reports from buildfarm. Alexander Korotkov, tweaked a bit by me Discussion: https://postgr.es/m/CAPpHfduWrysrT-qAhn+3Ea5+Mg6Vhc-oA6o2Z-hRCPRdvf3tiw@mail.gmail.com
* Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.Tom Lane2018-04-08
| | | | | | | | | | | | | Traditionally, include/catalog/pg_foo.h contains extern declarations for functions in backend/catalog/pg_foo.c, in addition to its function as the authoritative definition of the pg_foo catalog's rowtype. In some cases, we'd been forced to split out those extern declarations into separate pg_foo_fn.h headers so that the catalog definitions could be #include'd by frontend code. That problem is gone as of commit 9c0a0de4c, so let's undo the splits to make things less confusing. Discussion: https://postgr.es/m/23690.1523031777@sss.pgh.pa.us
* Replace our traditional initial-catalog-data format with a better design.Tom Lane2018-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Historically, the initial catalog data to be installed during bootstrap has been written in DATA() lines in the catalog header files. This had lots of disadvantages: the format was badly underdocumented, it was very difficult to edit the data in any mechanized way, and due to the lack of any abstraction the data was verbose, hard to read/understand, and easy to get wrong. Hence, move this data into separate ".dat" files and represent it in a way that can easily be read and rewritten by Perl scripts. The new format is essentially "key => value" for each column; while it's a bit repetitive, explicit labeling of each value makes the data far more readable and less error-prone. Provide a way to abbreviate entries by omitting field values that match a specified default value for their column. This allows removal of a large amount of repetitive boilerplate and also lowers the barrier to adding new columns. Also teach genbki.pl how to translate symbolic OID references into numeric OIDs for more cases than just "regproc"-like pg_proc references. It can now do that for regprocedure-like references (thus solving the problem that regproc is ambiguous for overloaded functions), operators, types, opfamilies, opclasses, and access methods. Use this to turn nearly all OID cross-references in the initial data into symbolic form. This represents a very large step forward in readability and error resistance of the initial catalog data. It should also reduce the difficulty of renumbering OID assignments in uncommitted patches. Also, solve the longstanding problem that frontend code that would like to use OID macros and other information from the catalog headers often had difficulty with backend-only code in the headers. To do this, arrange for all generated macros, plus such other declarations as we deem fit, to be placed in "derived" header files that are safe for frontend inclusion. (Once clients migrate to using these pg_*_d.h headers, it will be possible to get rid of the pg_*_fn.h headers, which only exist to quarantine code away from clients. That is left for follow-on patches, however.) The now-automatically-generated macros include the Anum_xxx and Natts_xxx constants that we used to have to update by hand when adding or removing catalog columns. Replace the former manual method of generating OID macros for pg_type entries with an automatic method, ensuring that all built-in types have OID macros. (But note that this patch does not change the way that OID macros for pg_proc entries are built and used. It's not clear that making that match the other catalogs would be worth extra code churn.) Add SGML documentation explaining what the new data format is and how to work with it. Despite being a very large change in the catalog headers, there is no catversion bump here, because postgres.bki and related output files haven't changed at all. John Naylor, based on ideas from various people; review and minor additional coding by me; previous review by Alvaro Herrera Discussion: https://postgr.es/m/CAJVSVGWO48JbbwXkJz_yBFyGYW-M9YWxnPdxJBUosDC9ou_F0Q@mail.gmail.com
* match_clause_to_index should check only key columnsTeodor Sigaev2018-04-08
| | | | | Alexander Korotkov per gripe from Tom Lane noticed on valgrind-enabled buildfarm members
* Remove unused variable in non-assert-enabled buildTeodor Sigaev2018-04-08
| | | | | | Use field of structure in Assert directly Jeff Janes
* Support index INCLUDE in the AM properties interface.Andrew Gierth2018-04-08
| | | | | | | This rectifies an oversight in commit 8224de4f4, by adding a new property 'can_include' for pg_indexam_has_property, and adjusting the results of pg_index_column_has_property to give more appropriate results for INCLUDEd columns.
* Fix EXEC BACKEND + Windows builds for group privsStephen Frost2018-04-07
| | | | | | | | | | | | Under EXEC BACKEND we also need to be going through the group privileges setup since we do support that on Unixy systems, so add that to SubPostmasterMain(). Under Windows, we need to simply return true from GetDataDirectoryCreatePerm(), but that wasn't happening due to a missing #else clause. Per buildfarm.
* Allow group access on PGDATAStephen Frost2018-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | Allow the cluster to be optionally init'd with read access for the group. This means a relatively non-privileged user can perform a backup of the cluster without requiring write privileges, which enhances security. The mode of PGDATA is used to determine whether group permissions are enabled for directory and file creates. This method was chosen as it's simple and works well for the various utilities that write into PGDATA. Changing the mode of PGDATA manually will not automatically change the mode of all the files contained therein. If the user would like to enable group access on an existing cluster then changing the mode of all the existing files will be required. Note that pg_upgrade will automatically change the mode of all migrated files if the new cluster is init'd with the -g option. Tests are included for the backend and all the utilities which operate on the PG data directory to ensure that the correct mode is set based on the data directory permissions. Author: David Steele <david@pgmasters.net> Reviewed-By: Michael Paquier, with discussion amongst many others. Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net
* Refactor dir/file permissionsStephen Frost2018-04-07
| | | | | | | | | | | | | | | | | | | Consolidate directory and file create permissions for tools which work with the PG data directory by adding a new module (common/file_perm.c) that contains variables (pg_file_create_mode, pg_dir_create_mode) and constants to initialize them (0600 for files and 0700 for directories). Convert mkdir() calls in the backend to MakePGDirectory() if the original call used default permissions (always the case for regular PG directories). Add tests to make sure permissions in PGDATA are set correctly by the tools which modify the PG data directory. Authors: David Steele <david@pgmasters.net>, Adam Brightwell <adam.brightwell@crunchydata.com> Reviewed-By: Michael Paquier, with discussion amongst many others. Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net
* Support partition pruning at execution timeAlvaro Herrera2018-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Existing partition pruning is only able to work at plan time, for query quals that appear in the parsed query. This is good but limiting, as there can be parameters that appear later that can be usefully used to further prune partitions. This commit adds support for pruning subnodes of Append which cannot possibly contain any matching tuples, during execution, by evaluating Params to determine the minimum set of subnodes that can possibly match. We support more than just simple Params in WHERE clauses. Support additionally includes: 1. Parameterized Nested Loop Joins: The parameter from the outer side of the join can be used to determine the minimum set of inner side partitions to scan. 2. Initplans: Once an initplan has been executed we can then determine which partitions match the value from the initplan. Partition pruning is performed in two ways. When Params external to the plan are found to match the partition key we attempt to prune away unneeded Append subplans during the initialization of the executor. This allows us to bypass the initialization of non-matching subplans meaning they won't appear in the EXPLAIN or EXPLAIN ANALYZE output. For parameters whose value is only known during the actual execution then the pruning of these subplans must wait. Subplans which are eliminated during this stage of pruning are still visible in the EXPLAIN output. In order to determine if pruning has actually taken place, the EXPLAIN ANALYZE must be viewed. If a certain Append subplan was never executed due to the elimination of the partition then the execution timing area will state "(never executed)". Whereas, if, for example in the case of parameterized nested loops, the number of loops stated in the EXPLAIN ANALYZE output for certain subplans may appear lower than others due to the subplan having been scanned fewer times. This is due to the list of matching subnodes having to be evaluated whenever a parameter which was found to match the partition key changes. This commit required some additional infrastructure that permits the building of a data structure which is able to perform the translation of the matching partition IDs, as returned by get_matching_partitions, into the list index of a subpaths list, as exist in node types such as Append, MergeAppend and ModifyTable. This allows us to translate a list of clauses into a Bitmapset of all the subpath indexes which must be included to satisfy the clause list. Author: David Rowley, based on an earlier effort by Beena Emerson Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi, Jesper Pedersen Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com
* Add bms_prev_member functionAlvaro Herrera2018-04-07
| | | | | | | | | | | | This works very much like the existing bms_last_member function, only it traverses through the Bitmapset in the opposite direction from the most significant bit down to the least significant bit. A special prevbit value of -1 may be used to have the function determine the most significant bit. This is useful for starting a loop. When there are no members less than prevbit, the function returns -2 to indicate there are no more members. Author: David Rowley Discussion: https://postgr.es/m/CAKJS1f-K=3d5MDASNYFJpUpc20xcBnAwNC1-AOeunhn0OtkWbQ@mail.gmail.com
* Raise error when affecting tuple moved into different partition.Andres Freund2018-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an update moves a row between partitions (supported since 2f178441044b), our normal logic for following update chains in READ COMMITTED mode doesn't work anymore. Cross partition updates are modeled as an delete from the old and insert into the new partition. No ctid chain exists across partitions, and there's no convenient space to introduce that link. Not throwing an error in a partitioned context when one would have been thrown without partitioning is obviously problematic. This commit introduces infrastructure to detect when a tuple has been moved, not just plainly deleted. That allows to throw an error when encountering a deletion that's actually a move, while attempting to following a ctid chain. The row deleted as part of a cross partition update is marked by pointing it's t_ctid to an invalid block, instead of self as a normal update would. That was deemed to be the least invasive and most future proof way to represent the knowledge, given how few infomask bits are there to be recycled (there's also some locking issues with using infomask bits). External code following ctid chains should be updated to check for moved tuples. The most likely consequence of not doing so is a missed error. Author: Amul Sul, editorialized by me Reviewed-By: Amit Kapila, Pavan Deolasee, Andres Freund, Robert Haas Discussion: http://postgr.es/m/CAAJ_b95PkwojoYfz0bzXU8OokcTVGzN6vYGCNVUukeUDrnF3dw@mail.gmail.com
* Indexes with INCLUDE columns and their support in B-treeTeodor Sigaev2018-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces INCLUDE clause to index definition. This clause specifies a list of columns which will be included as a non-key part in the index. The INCLUDE columns exist solely to allow more queries to benefit from index-only scans. Also, such columns don't need to have appropriate operator classes. Expressions are not supported as INCLUDE columns since they cannot be used in index-only scans. Index access methods supporting INCLUDE are indicated by amcaninclude flag in IndexAmRoutine. For now, only B-tree indexes support INCLUDE clause. In B-tree indexes INCLUDE columns are truncated from pivot index tuples (tuples located in non-leaf pages and high keys). Therefore, B-tree indexes now might have variable number of attributes. This patch also provides generic facility to support that: pivot tuples contain number of their attributes in t_tid.ip_posid. Free 13th bit of t_info is used for indicating that. This facility will simplify further support of index suffix truncation. The changes of above are backward-compatible, pg_upgrade doesn't need special handling of B-tree indexes for that. Bump catalog version Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes, David Rowley, Alexander Korotkov Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru
* Add json(b)_to_tsvector functionTeodor Sigaev2018-04-07
| | | | | | | | | | | | | | | | Jsonb has a complex nature so there isn't best-for-everything way to convert it to tsvector for full text search. Current to_tsvector(json(b)) suggests to convert only string values, but it's possible to index keys, numerics and even booleans value. To solve that json(b)_to_tsvector has a second required argument contained a list of desired types of json fields. Second argument is a jsonb scalar or array right now with possibility to add new options in a future. Bump catalog version Author: Dmitry Dolgov with some editorization by me Reviewed by: Teodor Sigaev Discussion: https://www.postgresql.org/message-id/CA+q6zcXJQbS1b4kJ_HeAOoOc=unfnOrUEL=KGgE32QKDww7d8g@mail.gmail.com
* Logical replication support for TRUNCATEPeter Eisentraut2018-04-07
| | | | | | | | | | | | | | | | | | Update the built-in logical replication system to make use of the previously added logical decoding for TRUNCATE support. Add the required truncate callback to pgoutput and a new logical replication protocol message. Publications get a new attribute to determine whether to replicate truncate actions. When updating a publication via pg_dump from an older version, this is not set, thus preserving the previous behavior. Author: Simon Riggs <simon@2ndquadrant.com> Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it> Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
* Logical decoding of TRUNCATEPeter Eisentraut2018-04-07
| | | | | | | | | | | | | | Add a new WAL record type for TRUNCATE, which is only used when wal_level >= logical. (For physical replication, TRUNCATE is already replicated via SMGR records.) Add new callback for logical decoding output plugins to receive TRUNCATE actions. Author: Simon Riggs <simon@2ndquadrant.com> Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it> Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
* Predicate locking in hash indexes.Teodor Sigaev2018-04-07
| | | | | | | | | | | | Hash index searches acquire predicate locks on the primary page of a bucket. It acquires a lock on both the old and new buckets for scans that happen concurrently with page splits. During a bucket split, a predicate lock is copied from the primary page of an old bucket to the primary page of a new bucket. Author: Shubham Barai, Amit Kapila Reviewed by: Amit Kapila, Alexander Korotkov, Thomas Munro Discussion: https://www.postgresql.org/message-id/flat/CALxAEPvNsM2GTiXdRgaaZ1Pjd1bs+sxfFsf7Ytr+iq+5JJoYXA@mail.gmail.com
* Document partprune.c a little betterAlvaro Herrera2018-04-07
| | | | | | Author: Amit Langote Reviewed-by: Álvaro Herrera, David Rowley Discussion: https://postgr.es/m/CA+HiwqGzq4D6z=8R0AP+XhbTFCQ-4Ct+t2ekqjE9Fpm84_JUGg@mail.gmail.com
* Fix and improve pg_atomic_flag fallback implementation.Andres Freund2018-04-06
| | | | | | | | | | | | | | | | | | | | The atomics fallback implementation for pg_atomic_flag was broken, returning the inverted value from pg_atomic_test_set_flag(). This was unnoticed because a) atomic flags were unused until recently b) the test code wasn't run when the fallback implementation was in use (because it didn't allow to test for some edge cases). Fix the bug, and improve the fallback so it has the same behaviour as the non-fallback implementation in the problematic edge cases. That breaks ABI compatibility in the back branches when fallbacks are in use, but given they were broken until now... Author: Andres Freund Reported-by: Daniel Gustafsson Discussion: https://postgr.es/m/FB948276-7B32-4B77-83E6-D00167F8EEB4@yesql.se https://postgr.es/m/20180406233854.uni2h3mbnveczl32@alap3.anarazel.de Backpatch: 9.5-, where the atomics abstraction was introduced.
* Fix possible failure in parallel index build.Robert Haas2018-04-06
| | | | | | | Report and proposed fix by David Rowley, put in patch form by Peter Geoghegan. Discussion: http://postgr.es/m/CAKJS1f91kq1wfYR8rnRRfKtxyhU2woEA+=whd640UxMyU+O0EQ@mail.gmail.com
* Allow insert and update tuple routing and COPY for foreign tables.Robert Haas2018-04-06
| | | | | | | | | | | Also enable this for postgres_fdw. Etsuro Fujita, based on an earlier patch by Amit Langote. The larger patch series of which this is a part has been reviewed by Amit Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost, and me. Minor documentation changes to the final version by me. Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@lab.ntt.co.jp
* Faster partition pruningAlvaro Herrera2018-04-06
| | | | | | | | | | | | | | | | | | | | | Add a new module backend/partitioning/partprune.c, implementing a more sophisticated algorithm for partition pruning. The new module uses each partition's "boundinfo" for pruning instead of constraint exclusion, based on an idea proposed by Robert Haas of a "pruning program": a list of steps generated from the query quals which are run iteratively to obtain a list of partitions that must be scanned in order to satisfy those quals. At present, this targets planner-time partition pruning, but there exist further patches to apply partition pruning at execution time as well. This commit also moves some definitions from include/catalog/partition.h to a new file include/partitioning/partbounds.h, in an attempt to rationalize partitioning related code. Authors: Amit Langote, David Rowley, Dilip Kumar Reviewers: Robert Haas, Kyotaro Horiguchi, Ashutosh Bapat, Jesper Pedersen. Discussion: https://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp
* Support new default roles with adminpackStephen Frost2018-04-06
| | | | | | | | | | | | | | | | | | | | | | | This provides a newer version of adminpack which works with the newly added default roles to support GRANT'ing to non-superusers access to read and write files, along with related functions (unlinking files, getting file length, renaming/removing files, scanning the log file directory) which are supported through adminpack. Note that new versions of the functions are required because an environment might have an updated version of the library but still have the old adminpack 1.0 catalog definitions (where EXECUTE is GRANT'd to PUBLIC for the functions). This patch also removes the long-deprecated alternative names for functions that adminpack used to include and which are now included in the backend, in adminpack v1.1. Applications using the deprecated names should be updated to use the backend functions instead. Existing installations which continue to use adminpack v1.0 should continue to function until/unless adminpack is upgraded. Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net
* Add default roles for file/program accessStephen Frost2018-04-06
| | | | | | | | | | | | | | | | | | | This patch adds new default roles named 'pg_read_server_files', 'pg_write_server_files', 'pg_execute_server_program' which allow an administrator to GRANT to a non-superuser role the ability to access server-side files or run programs through PostgreSQL (as the user the database is running as). Having one of these roles allows a non-superuser to use server-side COPY to read, write, or with a program, and to use file_fdw (if installed by a superuser and GRANT'd USAGE on it) to read from files or run a program. The existing misc file functions are also changed to allow a user with the 'pg_read_server_files' default role to read any files on the filesystem, matching the privileges given to that role through COPY and file_fdw from above. Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net
* Remove explicit superuser checks in favor of ACLsStephen Frost2018-04-06
| | | | | | | | | | | This removes the explicit superuser checks in the various file-access functions in the backend, specifically pg_ls_dir(), pg_read_file(), pg_read_binary_file(), and pg_stat_file(). Instead, EXECUTE is REVOKE'd from public for these, meaning that only a superuser is able to run them by default, but access to them can be GRANT'd to other roles. Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net
* Add memory context identifier to portal contextPeter Eisentraut2018-04-06
| | | | Discussion: https://www.postgresql.org/message-id/6421.1522194949@sss.pgh.pa.us
* Rename MemoryContextCopySetIdentifier() for clarityPeter Eisentraut2018-04-06
| | | | | | MemoryContextCopySetIdentifier -> MemoryContextCopyAndSetIdentifier Discussion: https://www.postgresql.org/message-id/6421.1522194949@sss.pgh.pa.us
* Enforce child constraints during COPY TO a partitioned table.Robert Haas2018-04-06
| | | | | | | | | | | | | | The previous coding inadvertently checked the constraints for the partitioned table rather than the target partition, which could lead to data in a partition that fails to satisfy some constraint on that partition. This problem seems to date back to when table partitioning was introduced; prior to that, there was only one target table for a COPY, so the problem didn't occur, and the code just didn't get updated. Etsuro Fujita, reviewed by Amit Langote and Ashutosh Bapat Discussion: https://postgr.es/message-id/5ABA4074.1090500%40lab.ntt.co.jp