aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access/gist
Commit message (Collapse)AuthorAge
...
* Avoid using lcons and list_delete_first where it's easy to do so.Tom Lane2019-07-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Formerly, lcons was about the same speed as lappend, but with the new List implementation, that's not so; with a long List, data movement imposes an O(N) cost on lcons and list_delete_first, but not lappend. Hence, invent list_delete_last with semantics parallel to list_delete_first (but O(1) cost), and change various places to use lappend and list_delete_last where this can be done without much violence to the code logic. There are quite a few places that construct result lists using lcons not lappend. Some have semantic rationales for that; I added comments about it to a couple that didn't have them already. In many such places though, I think the coding is that way only because back in the dark ages lcons was faster than lappend. Hence, switch to lappend where this can be done without causing semantic changes. In ExecInitExprRec(), this results in aggregates and window functions that are in the same plan node being executed in a different order than before. Generally, the executions of such functions ought to be independent of each other, so this shouldn't result in visibly different query results. But if you push it, as one regression test case does, you can show that the order is different. The new order seems saner; it's closer to the order of the functions in the query text. And we never documented or promised anything about this, anyway. Also, in gistfinishsplit(), don't bother building a reverse-order list; it's easy now to iterate backwards through the original list. It'd be possible to go further towards removing uses of lcons and list_delete_first, but it'd require more extensive logic changes, and I'm not convinced it's worth it. Most of the remaining uses deal with queues that probably never get long enough to be worth sweating over. (Actually, I doubt that any of the changes in this patch will have measurable performance effects either. But better to have good examples than bad ones in the code base.) Patch by me, thanks to David Rowley and Daniel Gustafsson for review. Discussion: https://postgr.es/m/21272.1563318411@sss.pgh.pa.us
* Fix inconsistencies and typos in the treeMichael Paquier2019-07-16
| | | | | | | | | | | This is numbered take 7, and addresses a set of issues around: - Fixes for typos and incorrect reference names. - Removal of unneeded comments. - Removal of unreferenced functions and structures. - Fixes regarding variable name consistency. Author: Alexander Lakhin Discussion: https://postgr.es/m/10bfd4ac-3e7c-40ab-2b2e-355ed15495e8@gmail.com
* Add support for <-> (box, point) operator to GiST box_opsAlexander Korotkov2019-07-14
| | | | | | | | | | Index-based calculation of this operator is exact. So, signature of gist_bbox_distance() function is changes so that caller is responsible for setting *recheck flag. Discussion: https://postgr.es/m/f71ba19d-d989-63b6-f04a-abf02ad9345d%40postgrespro.ru Author: Nikita Glukhov Reviewed-by: Tom Lane, Alexander Korotkov
* Fix variable initialization when using buffering build with GiSTMichael Paquier2019-07-10
| | | | | | | | | | | | | | | | This can cause valgrind to complain, as the flag marking a buffer as a temporary copy was not getting initialized. While on it, fill in with zeros newly-created buffer pages. This does not matter when loading a block from a temporary file, but it makes the push of an index tuple into a new buffer page safer. This has been introduced by 1d27dcf, so backpatch all the way down to 9.4. Author: Alexander Lakhin Discussion: https://postgr.es/m/15899-0d24fb273b3dd90c@postgresql.org Backpatch-through: 9.4
* Fix typos in various placesMichael Paquier2019-06-03
| | | | | | Author: Andrea Gelmini Reviewed-by: Michael Paquier, Justin Pryzby Discussion: https://postgr.es/m/20190528181718.GA39034@glet
* Phase 2 pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | | Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Initial pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | This is still using the 2.0 version of pg_bsd_indent. I thought it would be good to commit this separately, so as to document the differences between 2.0 and 2.1 behavior. Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
* Detect internal GiST page splits correctly during index build.Heikki Linnakangas2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we descend the GiST tree during insertion, we modify any downlinks on the way down to include the new tuple we're about to insert (if they don't cover it already). Modifying an existing downlink might cause an internal page to split, if the new downlink tuple is larger than the old one. If that happens, we need to back up to the parent and re-choose a page to insert to. We used to detect that situation, thanks to the NSN-LSN interlock normally used to detect concurrent page splits, but that got broken by commit 9155580fd5. With that commit, we now use a dummy constant LSN value for every page during index build, so the LSN-NSN interlock no longer works. I thought that was OK because there can't be any other backends modifying the index during index build, but missed that the insertion itself can modify the page we're inserting to. The consequence was that we would sometimes insert the new tuple to an incorrect page, one whose downlink doesn't cover the new tuple. To fix, add a flag to the stack that keeps track of the state while descending tree, to indicate that a page was split, and that we need to retry the descend from the parent. Thomas Munro first reported that the contrib/intarray regression test was failing occasionally on the buildfarm after commit 9155580fd5. The failure was intermittent, because the gistchoose() function is not deterministic, and would only occasionally create the right circumstances for this bug to cause the failure. Patch by Anastasia Lubennikova, with some changes by me to make it work correctly also when the internal page split also causes the "grandparent" to be split. Discussion: https://www.postgresql.org/message-id/CA%2BhUKGJRzLo7tZExWfSbwM3XuK7aAK7FhdBV0FLkbUG%2BW0v0zg%40mail.gmail.com
* Convert gist to compute page level xid horizon on primary.Andres Freund2019-04-22
| | | | | | | | | | | | | | | | | Due to parallel development, gist added the missing conflict information in c952eae52a3, while 558a9165e08 moved that computation to the primary for the index types that already had it. Thus adapt gist to also compute on the primary, using index_compute_xid_horizon_for_tuples() instead of its own copy of the logic. This also adds pg_waldump support for XLOG_GIST_DELETE records, which previously was not properly present. Bumps WAL version. Author: Andres Freund Discussion: https://postgr.es/m/20190406050243.bszosdg4buvabfrt@alap3.anarazel.de
* Generate less WAL during GiST, GIN and SP-GiST index build.Heikki Linnakangas2019-04-03
| | | | | | | | | | | | | | | | | | | | Instead of WAL-logging every modification during the build separately, first build the index without any WAL-logging, and make a separate pass through the index at the end, to write all pages to the WAL. This significantly reduces the amount of WAL generated, and is usually also faster, despite the extra I/O needed for the extra scan through the index. WAL generated this way is also faster to replay. For GiST, the LSN-NSN interlock makes this a little tricky. All pages must be marked with a valid (i.e. non-zero) LSN, so that the parent-child LSN-NSN interlock works correctly. We now use magic value 1 for that during index build. Change the fake LSN counter to begin from 1000, so that 1 is safely smaller than any real or fake LSN. 2 would've been enough for our purposes, but let's reserve a bigger range, in case we need more special values in the future. Author: Anastasia Lubennikova, Andrey V. Lepikhov Reviewed-by: Heikki Linnakangas, Dmitry Dolgov
* Report progress of CREATE INDEX operationsAlvaro Herrera2019-04-02
| | | | | | | | | | | | | | | | | | | | This uses the progress reporting infrastructure added by c16dc1aca5e0, adding support for CREATE INDEX and CREATE INDEX CONCURRENTLY. There are two pieces to this: one is index-AM-agnostic, and the other is AM-specific. The latter is fairly elaborate for btrees, including reportage for parallel index builds and the separate phases that btree index creation uses; other index AMs, which are much simpler in their building procedures, have simplistic reporting only, but that seems sufficient, at least for non-concurrent builds. The index-AM-agnostic part is fairly complete, providing insight into the CONCURRENTLY wait phases as well as block-based progress during the index validation table scan. (The index validation index scan requires patching each AM, which has not been included here.) Reviewers: Rahila Syed, Pavan Deolasee, Tatsuro Yamada Discussion: https://postgr.es/m/20181220220022.mg63bhk26zdpvmcj@alvherre.pgsql
* tableam: Support for an index build's initial table scan(s).Andres Freund2019-03-27
| | | | | | | | | | | | | To support building indexes over tables of different AMs, the scans to do so need to be routed through the table AM. While moving a fair amount of code, nearly all the changes are just moving code to below a callback. Currently the range based interface wouldn't make much sense for non block based table AMs. But that seems aceptable for now. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
* Switch some palloc/memset calls to palloc0Michael Paquier2019-03-27
| | | | | | | | | | | Some code paths have been doing some allocations followed by an immediate memset() to initialize the allocated area with zeros, this is a bit overkill as there are already interfaces to do both things in one call. Author: Daniel Gustafsson Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/vN0OodBPkKs7g2Z1uyk3CUEmhdtspHgYCImhlmSxv1Xn6nY1ZnaaGHL8EWUIQ-NEv36tyc4G5-uA3UXUF2l4sFXtK_EQgLN1hcgunlFVKhA=@yesql.se
* Fix bug in the GiST vacuum's 2nd stage.Heikki Linnakangas2019-03-22
| | | | | | | We mustn't assume that the IndexVacuumInfo pointer passed to bulkdelete() stage is still valid in the vacuumcleanup() stage. Per very pink buildfarm.
* Delete empty pages during GiST VACUUM.Heikki Linnakangas2019-03-22
| | | | | | | | | | | | | | | To do this, we scan GiST two times. In the first pass we make note of empty leaf pages and internal pages. At second pass we scan through internal pages, looking for downlinks to the empty pages. Deleting internal pages is still not supported, like in nbtree, the last child of an internal page is never deleted. That means that if you have a workload where new keys are always inserted to different area than where old keys are removed, the index will still grow without bound. But the rate of growth will be an order of magnitude slower than before. Author: Andrey Borodin Discussion: https://www.postgresql.org/message-id/B1E4DF12-6CD3-4706-BDBD-BF3283328F60@yandex-team.ru
* tableam: Add and use scan APIs.Andres Freund2019-03-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Too allow table accesses to be not directly dependent on heap, several new abstractions are needed. Specifically: 1) Heap scans need to be generalized into table scans. Do this by introducing TableScanDesc, which will be the "base class" for individual AMs. This contains the AM independent fields from HeapScanDesc. The previous heap_{beginscan,rescan,endscan} et al. have been replaced with a table_ version. There's no direct replacement for heap_getnext(), as that returned a HeapTuple, which is undesirable for a other AMs. Instead there's table_scan_getnextslot(). But note that heap_getnext() lives on, it's still used widely to access catalog tables. This is achieved by new scan_begin, scan_end, scan_rescan, scan_getnextslot callbacks. 2) The portion of parallel scans that's shared between backends need to be able to do so without the user doing per-AM work. To achieve that new parallelscan_{estimate, initialize, reinitialize} callbacks are introduced, which operate on a new ParallelTableScanDesc, which again can be subclassed by AMs. As it is likely that several AMs are going to be block oriented, block oriented callbacks that can be shared between such AMs are provided and used by heap. table_block_parallelscan_{estimate, intiialize, reinitialize} as callbacks, and table_block_parallelscan_{nextpage, init} for use in AMs. These operate on a ParallelBlockTableScanDesc. 3) Index scans need to be able to access tables to return a tuple, and there needs to be state across individual accesses to the heap to store state like buffers. That's now handled by introducing a sort-of-scan IndexFetchTable, which again is intended to be subclassed by individual AMs (for heap IndexFetchHeap). The relevant callbacks for an AM are index_fetch_{end, begin, reset} to create the necessary state, and index_fetch_tuple to retrieve an indexed tuple. Note that index_fetch_tuple implementations need to be smarter than just blindly fetching the tuples for AMs that have optimizations similar to heap's HOT - the currently alive tuple in the update chain needs to be fetched if appropriate. Similar to table_scan_getnextslot(), it's undesirable to continue to return HeapTuples. Thus index_fetch_heap (might want to rename that later) now accepts a slot as an argument. Core code doesn't have a lot of call sites performing index scans without going through the systable_* API (in contrast to loads of heap_getnext calls and working directly with HeapTuples). Index scans now store the result of a search in IndexScanDesc->xs_heaptid, rather than xs_ctup->t_self. As the target is not generally a HeapTuple anymore that seems cleaner. To be able to sensible adapt code to use the above, two further callbacks have been introduced: a) slot_callbacks returns a TupleTableSlotOps* suitable for creating slots capable of holding a tuple of the AMs type. table_slot_callbacks() and table_slot_create() are based upon that, but have additional logic to deal with views, foreign tables, etc. While this change could have been done separately, nearly all the call sites that needed to be adapted for the rest of this commit also would have been needed to be adapted for table_slot_callbacks(), making separation not worthwhile. b) tuple_satisfies_snapshot checks whether the tuple in a slot is currently visible according to a snapshot. That's required as a few places now don't have a buffer + HeapTuple around, but a slot (which in heap's case internally has that information). Additionally a few infrastructure changes were needed: I) SysScanDesc, as used by systable_{beginscan, getnext} et al. now internally uses a slot to keep track of tuples. While systable_getnext() still returns HeapTuples, and will so for the foreseeable future, the index API (see 1) above) now only deals with slots. The remainder, and largest part, of this commit is then adjusting all scans in postgres to use the new APIs. Author: Andres Freund, Haribabu Kommi, Alvaro Herrera Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
* Support for INCLUDE attributes in GiST indexesAlexander Korotkov2019-03-10
| | | | | | | | | | | | | | | | | Similarly to B-tree, GiST index access method gets support of INCLUDE attributes. These attributes aren't used for tree navigation and aren't present in non-leaf pages. But they are present in leaf pages and can be fetched during index-only scan. The point of having INCLUDE attributes in GiST indexes is slightly different from the point of having them in B-tree. The main point of INCLUDE attributes in B-tree is to define UNIQUE constraint over part of attributes enabled for index-only scan. In GiST the main point of INCLUDE attributes is to use index-only scan for attributes, whose data types don't have GiST opclasses. Discussion: https://postgr.es/m/73A1A452-AD5F-40D4-BD61-978622FF75C1%40yandex-team.ru Author: Andrey Borodin, with small changes by me Reviewed-by: Andreas Karlsson
* Scan GiST indexes in physical order during VACUUM.Heikki Linnakangas2019-03-05
| | | | | | | | | | | | Scanning an index in physical order is faster than walking it in logical order, because sequential I/O is faster than random I/O. The idea and code structure is borrowed from B-tree vacuum code. Patch by Andrey Borodin, with changes by me. Based on early work by Konstantin Kuznetsov, although the patch has been rewritten multiple times since his original version. Discussion: https://www.postgresql.org/message-id/1B9FAC6F-FA19-4A24-8C1B-F4F574844892%40yandex-team.ru
* Refactor planner's header files.Tom Lane2019-01-29
| | | | | | | | | | | | | | | | | | | | | | | | Create a new header optimizer/optimizer.h, which exposes just the planner functions that can be used "at arm's length", without need to access Paths or the other planner-internal data structures defined in nodes/relation.h. This is intended to provide the whole planner API seen by most of the rest of the system; although FDWs still need to use additional stuff, and more thought is also needed about just what selfuncs.c should rely on. The main point of doing this now is to limit the amount of new #include baggage that will be needed by "planner support functions", which I expect to introduce later, and which will be in relevant datatype modules rather than anywhere near the planner. This commit just moves relevant declarations into optimizer.h from other header files (a couple of which go away because everything got moved), and adjusts #include lists to match. There's further cleanup that could be done if we want to decide that some stuff being exposed by optimizer.h doesn't belong in the planner at all, but I'll leave that for another day. Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Check for conflicting queries during replay of gistvacuumpage()Alexander Korotkov2018-12-21
| | | | | | | | | | | | | | | | | | | | | | | 013ebc0a7b implements so-called GiST microvacuum. That is gistgettuple() marks index tuples as dead when kill_prior_tuple is set. Later, when new tuple insertion claims page space, those dead index tuples are physically deleted from page. When this deletion is replayed on standby, it might conflict with read-only queries. But 013ebc0a7b doesn't handle this. That may lead to disappearance of some tuples from read-only snapshots on standby. This commit implements resolving of conflicts between replay of GiST microvacuum and standby queries. On the master we implement new WAL record type XLOG_GIST_DELETE, which comprises necessary information. On stable releases we've to be tricky to keep WAL compatibility. Information required for conflict processing is just appended to data of XLOG_GIST_PAGE_UPDATE record. So, PostgreSQL version, which doesn't know about conflict processing, will just ignore that. Reported-by: Andres Freund Diagnosed-by: Andres Freund Discussion: https://postgr.es/m/20181212224524.scafnlyjindmrbe6%40alap3.anarazel.de Author: Alexander Korotkov Backpatch-through: 9.6
* Remove WITH OIDS support, change oid catalog column visibility.Andres Freund2018-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously tables declared WITH OIDS, including a significant fraction of the catalog tables, stored the oid column not as a normal column, but as part of the tuple header. This special column was not shown by default, which was somewhat odd, as it's often (consider e.g. pg_class.oid) one of the more important parts of a row. Neither pg_dump nor COPY included the contents of the oid column by default. The fact that the oid column was not an ordinary column necessitated a significant amount of special case code to support oid columns. That already was painful for the existing, but upcoming work aiming to make table storage pluggable, would have required expanding and duplicating that "specialness" significantly. WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0). Remove it. Removing includes: - CREATE TABLE and ALTER TABLE syntax for declaring the table to be WITH OIDS has been removed (WITH (oids[ = true]) will error out) - pg_dump does not support dumping tables declared WITH OIDS and will issue a warning when dumping one (and ignore the oid column). - restoring an pg_dump archive with pg_restore will warn when restoring a table with oid contents (and ignore the oid column) - COPY will refuse to load binary dump that includes oids. - pg_upgrade will error out when encountering tables declared WITH OIDS, they have to be altered to remove the oid column first. - Functionality to access the oid of the last inserted row (like plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed. The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false) for CREATE TABLE) is still supported. While that requires a bit of support code, it seems unnecessary to break applications / dumps that do not use oids, and are explicit about not using them. The biggest user of WITH OID columns was postgres' catalog. This commit changes all 'magic' oid columns to be columns that are normally declared and stored. To reduce unnecessary query breakage all the newly added columns are still named 'oid', even if a table's column naming scheme would indicate 'reloid' or such. This obviously requires adapting a lot code, mostly replacing oid access via HeapTupleGetOid() with access to the underlying Form_pg_*->oid column. The bootstrap process now assigns oids for all oid columns in genbki.pl that do not have an explicit value (starting at the largest oid previously used), only oids assigned later by oids will be above FirstBootstrapObjectId. As the oid column now is a normal column the special bootstrap syntax for oids has been removed. Oids are not automatically assigned during insertion anymore, all backend code explicitly assigns oids with GetNewOidWithIndex(). For the rare case that insertions into the catalog via SQL are called for the new pg_nextoid() function can be used (which only works on catalog tables). The fact that oid columns on system tables are now normal columns means that they will be included in the set of columns expanded by * (i.e. SELECT * FROM pg_class will now include the table's oid, previously it did not). It'd not technically be hard to hide oid column by default, but that'd mean confusing behavior would either have to be carried forward forever, or it'd cause breakage down the line. While it's not unlikely that further adjustments are needed, the scope/invasiveness of the patch makes it worthwhile to get merge this now. It's painful to maintain externally, too complicated to commit after the code code freeze, and a dependency of a number of other patches. Catversion bump, for obvious reasons. Author: Andres Freund, with contributions by John Naylor Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
* Add support for nearest-neighbor (KNN) searches to SP-GiSTAlexander Korotkov2018-09-19
| | | | | | | | | | | | | Currently, KNN searches were supported only by GiST. SP-GiST also capable to support them. This commit implements that support. SP-GiST scan stack is replaced with queue, which serves as stack if no ordering is specified. KNN support is provided for three SP-GIST opclasses: quad_point_ops, kd_point_ops and poly_ops (catversion is bumped). Some common parts between GiST and SP-GiST KNNs are extracted into separate functions. Discussion: https://postgr.es/m/570825e8-47d0-4732-2bf6-88d67d2d51c8%40postgrespro.ru Author: Nikita Glukhov, Alexander Korotkov based on GSoC work by Vlad Sterzhanov Review: Andrey Borodin, Alexander Korotkov
* doc: Update uses of the word "procedure"Peter Eisentraut2018-08-22
| | | | | | | | | | | | | | | | | | | Historically, the term procedure was used as a synonym for function in Postgres/PostgreSQL. Now we have procedures as separate objects from functions, so we need to clean up the documentation to not mix those terms. In particular, mentions of "trigger procedures" are changed to "trigger functions", and access method "support procedures" are changed to "support functions". (The latter already used FUNCTION in the SQL syntax anyway.) Also, the terminology in the SPI chapter has been cleaned up. A few tests, examples, and code comments are also adjusted to be consistent with documentation changes, but not everything. Reported-by: Peter Geoghegan <pg@bowt.ie> Reviewed-by: Jonathan S. Katz <jonathan.katz@excoventures.com>
* Use the built-in float datatypes to implement geometric typesTomas Vondra2018-08-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes the geometric operators and functions use the exported function of the float4/float8 datatypes. The main reason of doing so is to check for underflow and overflow, and to handle NaNs consciously. The float datatypes consider NaNs values to be equal and greater than all non-NaN values. This change considers NaNs equal only for equality operators. The placement operators, contains, overlaps, left/right of etc. continue to return false when NaNs are involved. We don't need to worry about them being considered greater than any-NaN because there aren't any basic comparison operators like less/greater than for the geometric datatypes. The changes may be summarised as: * Check for underflow, overflow and division by zero * Consider NaN values to be equal * Return NULL when the distance is NaN for all closest point operators * Favour not-NaN over NaN where it makes sense The patch also replaces all occurrences of "double" as "float8". They are the same, but were used inconsistently in the same file. Author: Emre Hasegeli Reviewed-by: Kyotaro Horiguchi, Tomas Vondra Discussion: https://www.postgresql.org/message-id/CAE2gYzxF7-5djV6-cEvqQu-fNsnt%3DEqbOURx7ZDg%2BVv6ZMTWbg%40mail.gmail.com
* Provide separate header file for built-in float typesTomas Vondra2018-07-29
| | | | | | | | | | | | | | | | | | Some data types under adt/ have separate header files, but most simple ones do not, and their public functions are defined in builtins.h. As the patches improving geometric types will require making additional functions public, this seems like a good opportunity to create a header for floats types. Commit 1acf757255 made _cmp functions public to solve NaN issues locally for GiST indexes. This patch reworks it in favour of a more widely applicable API. The API uses inline functions, as they are easier to use compared to macros, and avoid double-evaluation hazards. Author: Emre Hasegeli Reviewed-by: Kyotaro Horiguchi Discussion: https://www.postgresql.org/message-id/CAE2gYzxF7-5djV6-cEvqQu-fNsnt%3DEqbOURx7ZDg%2BVv6ZMTWbg%40mail.gmail.com
* Rethink how to get float.h in old Windows API for isnan/isinfAlvaro Herrera2018-07-11
| | | | | | | | | | | | | | | | We include <float.h> in every place that needs isnan(), because MSVC used to require it. However, since MSVC 2013 that's no longer necessary (cf. commit cec8394b5ccd), so we can retire the inclusion to a version-specific stanza in win32_port.h, where it doesn't need to pollute random .c files. The header is of course still needed in a few places for other reasons. I (Álvaro) removed float.h from a few more files than in Emre's original patch. This doesn't break the build in my system, but we'll see what the buildfarm has to say about it all. Author: Emre Hasegeli Discussion: https://postgr.es/m/CAE2gYzyc0+5uG+Cd9-BSL7NKC8LSHLNg1Aq2=8ubjnUwut4_iw@mail.gmail.com
* Re-think predicate locking on GIN indexes.Teodor Sigaev2018-05-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The principle behind the locking was not very well thought-out, and not documented. Add a section in the README to explain how it's supposed to work, and change the code so that it actually works that way. This fixes two bugs: 1. If fast update was turned on concurrently, subsequent inserts to the pending list would not conflict with predicate locks that were acquired earlier, on entry pages. The included 'predicate-gin-fastupdate' test demonstrates that. To fix, make all scans acquire a predicate lock on the metapage. That lock represents a scan of the pending list, whether or not there is a pending list at the moment. Forget about the optimization to skip locking/checking for locks, when fastupdate=off. 2. If a scan finds no match, it still needs to lock the entry page. The point of predicate locks is to lock the gabs between values, whether or not there is a match. The included 'predicate-gin-nomatch' test tests that case. In addition to those two bug fixes, this removes some unnecessary locking, following the principle laid out in the README. Because all items in a posting tree have the same key value, a lock on the posting tree root is enough to cover all the items. (With a very large posting tree, it would possibly be better to lock the posting tree leaf pages instead, so that a "skip scan" with a query like "A & B", you could avoid unnecessary conflict if a new tuple is inserted with A but !B. But let's keep this simple.) Also, some spelling fixes. Author: Heikki Linnakangas with some editorization by me Review: Andrey Borodin, Alexander Korotkov Discussion: https://www.postgresql.org/message-id/0b3ad2c2-2692-62a9-3a04-5724f2af9114@iki.fi
* Post-feature-freeze pgindent run.Tom Lane2018-04-26
| | | | Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us
* Indexes with INCLUDE columns and their support in B-treeTeodor Sigaev2018-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces INCLUDE clause to index definition. This clause specifies a list of columns which will be included as a non-key part in the index. The INCLUDE columns exist solely to allow more queries to benefit from index-only scans. Also, such columns don't need to have appropriate operator classes. Expressions are not supported as INCLUDE columns since they cannot be used in index-only scans. Index access methods supporting INCLUDE are indicated by amcaninclude flag in IndexAmRoutine. For now, only B-tree indexes support INCLUDE clause. In B-tree indexes INCLUDE columns are truncated from pivot index tuples (tuples located in non-leaf pages and high keys). Therefore, B-tree indexes now might have variable number of attributes. This patch also provides generic facility to support that: pivot tuples contain number of their attributes in t_tid.ip_posid. Free 13th bit of t_info is used for indicating that. This facility will simplify further support of index suffix truncation. The changes of above are backward-compatible, pg_upgrade doesn't need special handling of B-tree indexes for that. Bump catalog version Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes, David Rowley, Alexander Korotkov Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru
* Add predicate locking for GiSTTeodor Sigaev2018-03-27
| | | | | | | | | | | | Add page-level predicate locking, due to gist's code organization, patch seems close to trivial: add check before page changing, add predicate lock before page scanning. Although choosing right place to check is not simple: it should not be called during index build, it should support insertion of new downlink and so on. Author: Shubham Barai with editorization by me and Alexander Korotkov Reviewed by: Alexander Korotkov, Andrey Borodin, me Discussion: https://www.postgresql.org/message-id/flat/CALxAEPtdcANpw5ePU3LvnTP8HCENFw6wygupQAyNBgD-sG3h0g@mail.gmail.com
* Make gistvacuumcleanup() count the actual number of index tuples.Tom Lane2018-03-02
| | | | | | | | | | | | | | | Previously, it just returned the heap tuple count, which might be only an estimate, and would be completely the wrong thing if the index is partial. Since this function scans every index page anyway to find free pages, it's practically free to count the surviving index tuples. Let's do that and return an accurate count. This is easily visible as a wrong reltuples value for a partial GiST index following VACUUM, so back-patch to all supported branches. Andrey Borodin, reviewed by Michail Nikolaev Discussion: https://postgr.es/m/151956654251.6915.675951950408204404.pgcf@coridan.postgresql.org
* Support parallel btree index builds.Robert Haas2018-02-02
| | | | | | | | | | | | | | | | | | | | | | To make this work, tuplesort.c and logtape.c must also support parallelism, so this patch adds that infrastructure and then applies it to the particular case of parallel btree index builds. Testing to date shows that this can often be 2-3x faster than a serial index build. The model for deciding how many workers to use is fairly primitive at present, but it's better than not having the feature. We can refine it as we get more experience. Peter Geoghegan with some help from Rushabh Lathia. While Heikki Linnakangas is not an author of this patch, he wrote other patches without which this feature would not have been possible, and therefore the release notes should possibly credit him as an author of this feature. Reviewed by Claudio Freire, Heikki Linnakangas, Thomas Munro, Tels, Amit Kapila, me. Discussion: http://postgr.es/m/CAM3SWZQKM=Pzc=CAHzRixKjp2eO5Q0Jg1SoFQqeXFQ647JiwqQ@mail.gmail.com Discussion: http://postgr.es/m/CAH2-Wz=AxWqDoVvGU7dq856S4r6sJAj6DBn7VMtigkB33N5eyg@mail.gmail.com
* Change some bogus PageGetLSN calls to BufferGetLSNAtomicAlvaro Herrera2018-01-09
| | | | | | | | | | | | | | | | | | | | | | | | | | As src/backend/access/transam/README says, PageGetLSN may only be called by processes holding either exclusive lock on buffer, or a shared lock on buffer plus buffer header lock. Therefore any place that only holds a shared buffer lock must use BufferGetLSNAtomic instead of PageGetLSN, which internally obtains buffer header lock prior to reading the LSN. A few callsites failed to comply with this rule. This was detected by running all tests under a new (not committed) assertion that verifies PageGetLSN locking contract. All but one of the callsites that failed the assertion are fixed by this patch. Remaining callsites were inspected manually and determined not to need any change. The exception (unfixed callsite) is in TestForOldSnapshot, which only has a Page argument, making it impossible to access the corresponding Buffer from it. Fixing that seems a much larger patch that will have to be done separately; and that's just as well, since it was only introduced in 9.6 and other bugs are much older. Some of these bugs are ancient; backpatch all the way back to 9.3. Authors: Jacob Champion, Asim Praveen, Ashwin Agrawal Reviewed-by: Michaël Paquier Discussion: https://postgr.es/m/CABAq_6GXgQDVu3u12mK9O5Xt5abBZWQ0V40LZCE+oUf95XyNFg@mail.gmail.com
* Update copyright for 2018Bruce Momjian2018-01-02
| | | | Backpatch-through: certain files through 9.3
* Add some const decorations to prototypesPeter Eisentraut2017-11-10
| | | | Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
* Change TRUE/FALSE to true/falsePeter Eisentraut2017-11-08
| | | | | | | | | | | | | | The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* For wal_consistency_checking, mask page checksum as well as page LSN.Robert Haas2017-09-22
| | | | | | | | If the LSN is different, the checksum will be different, too. Ashwin Agrawal, reviewed by Michael Paquier and Kuntal Ghosh Discussion: http://postgr.es/m/CALfoeis5iqrAU-+JAN+ZzXkpPr7+-0OAGv7QUHwFn=-wDy4o4Q@mail.gmail.com
* Remove no-op GiST support functions in the core GiST opclasses.Tom Lane2017-09-19
| | | | | | | | | | | | The preceding patch allowed us to remove useless GiST support functions. This patch actually does that for all the no-op cases in the core GiST code. This buys us whatever performance gain is to be had, and more importantly exercises the preceding patch. There remain no-op functions in the contrib GiST opclasses, but those will take more work to remove. Discussion: https://postgr.es/m/CAJEAwVELVx9gYscpE=Be6iJxvdW5unZ_LkcAaVNSeOwvdwtD=A@mail.gmail.com
* Allow no-op GiST support functions to be omitted.Tom Lane2017-09-19
| | | | | | | | | | | | | | | | | | | | | | There are common use-cases in which the compress and/or decompress functions can be omitted, with the result being that we make no data transformation when storing or retrieving index values. Previously, you had to provide a no-op function anyway, but this patch allows such opclass support functions to be omitted. Furthermore, if the compress function is omitted, then the core code knows that the stored representation is the same as the original data. This means we can allow index-only scans without requiring a fetch function to be provided either. Previously you had to provide a no-op fetch function if you wanted IOS to work. This reportedly provides a small performance benefit in such cases, but IMO the real reason for doing it is just to reduce the amount of useless boilerplate code that has to be written for GiST opclasses. Andrey Borodin, reviewed by Dmitriy Sarafannikov Discussion: https://postgr.es/m/CAJEAwVELVx9gYscpE=Be6iJxvdW5unZ_LkcAaVNSeOwvdwtD=A@mail.gmail.com
* Change tupledesc->attrs[n] to TupleDescAttr(tupledesc, n).Andres Freund2017-08-20
| | | | | | | | | | | This is a mechanical change in preparation for a later commit that will change the layout of TupleDesc. Introducing a macro to abstract the details of where attributes are stored will allow us to change that in separate step and revise it in future. Author: Thomas Munro, editorialized by Andres Freund Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
* Phase 3 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Phase 2 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Re-run pgindent.Tom Lane2017-06-13
| | | | | | | | This is just to have a clean base state for testing of Piotr Stefaniak's latest version of FreeBSD indent. I fixed up a couple of places where pgindent would have changed format not-nicely. perltidy not included. Discussion: https://postgr.es/m/VI1PR03MB119959F4B65F000CA7CD9F6BF2CC0@VI1PR03MB1199.eurprd03.prod.outlook.com
* Fix wording in amvalidate error messagesAlvaro Herrera2017-05-30
| | | | | | | | Remove some gratuituous message differences by making the AM name previously embedded in each message be a %s instead. While at it, get rid of terminology that's unclear and unnecessary in one message. Discussion: https://postgr.es/m/20170523001557.bq2hbq7hxyvyw62q@alvherre.pgsql
* Fix pfree-of-already-freed-tuple when rescanning a GiST index-only scan.Tom Lane2017-05-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | GiST's getNextNearest() function attempts to pfree the previously-returned tuple if any (that is, scan->xs_hitup in HEAD, or scan->xs_itup in older branches). However, if we are rescanning a plan node after ending a previous scan early, those tuple pointers could be pointing to garbage, because they would be pointing into the scan's pageDataCxt or queueCxt which has been reset. In a debug build this reliably results in a crash, although I think it might sometimes accidentally fail to fail in production builds. To fix, clear the pointer field anyplace we reset a context it might be pointing into. This may be overkill --- I think probably only the queueCxt case is involved in this bug, so that resetting in gistrescan() would be sufficient --- but dangling pointers are generally bad news, so let's avoid them. Another plausible answer might be to just not bother with the pfree in getNextNearest(). The reconstructed tuples would go away anyway in the context resets, and I'm far from convinced that freeing them a bit earlier really saves anything meaningful. I'll stick with the original logic in this patch, but if we find more problems in the same area we should consider that approach. Per bug #14641 from Denis Smirnov. Back-patch to 9.5 where this logic was introduced. Discussion: https://postgr.es/m/20170504072034.24366.57688@wrigleys.postgresql.org
* Put back <float.h> in a few files that need it for _isnan().Tom Lane2017-03-08
| | | | | | | | | | | Further fallout from commit c29aff959: there are some files that need <float.h>, and were getting it from datatype/timestamp.h, but it was not apparent in my (tgl's) testing because the requirement for <float.h> exists only on certain Windows toolchains. Report and patch by David Rowley. Discussion: https://postgr.es/m/CAKJS1f-BHceaFzZScFapDV48gUVM2CAOBfhkgffdqXzFb+kwew@mail.gmail.com
* Allow index AMs to return either HeapTuple or IndexTuple format during IOS.Tom Lane2017-02-27
| | | | | | | | | | | | | | | | | | | | | | Previously, only IndexTuple format was supported for the output data of an index-only scan. This is fine for btree, which is just returning a verbatim index tuple anyway. It's not so fine for SP-GiST, which can return reconstructed data that's much larger than a page. To fix, extend the index AM API so that index-only scan data can be returned in either HeapTuple or IndexTuple format. There's other ways we could have done it, but this way avoids an API break for index AMs that aren't concerned with the issue, and it costs little except a couple more fields in IndexScanDescs. I changed both GiST and SP-GiST to use the HeapTuple method. I'm not very clear on whether GiST can reconstruct data that's too large for an IndexTuple, but that seems possible, and it's not much of a code change to fix. Per a complaint from Vik Fearing. Reviewed by Jason Li. Discussion: https://postgr.es/m/49527f79-530d-0bfe-3dad-d183596afa92@2ndquadrant.fr
* Add optimizer and executor support for parallel index scans.Robert Haas2017-02-15
| | | | | | | | | | | | In combination with 569174f1be92be93f5366212cc46960d28a5c5cd, which taught the btree AM how to perform parallel index scans, this allows parallel index scan plans on btree indexes. This infrastructure should be general enough to support parallel index scans for other index AMs as well, if someone updates them to support parallel scans. Amit Kapila, reviewed and tested by Anastasia Lubennikova, Tushar Ahuja, and Haribabu Kommi, and me.
* Split index xlog headers from other private index headers.Robert Haas2017-02-14
| | | | | | | | | | | | | The xlog-specific headers need to be included in both frontend code - specifically, pg_waldump - and the backend, but the remainder of the private headers for each index are only needed by the backend. By splitting the xlog stuff out into separate headers, pg_waldump pulls in fewer backend headers, which is a good thing. Patch by me, reviewed by Michael Paquier and Andres Freund, per a complaint from Dilip Kumar. Discussion: http://postgr.es/m/CA+TgmoZ=F=GkxV0YEv-A8tb+AEGy_Qa7GSiJ8deBKFATnzfEug@mail.gmail.com