aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access/gist/gistutil.c
Commit message (Collapse)AuthorAge
* Initial pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | This is still using the 2.0 version of pg_bsd_indent. I thought it would be good to commit this separately, so as to document the differences between 2.0 and 2.1 behavior. Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
* Generate less WAL during GiST, GIN and SP-GiST index build.Heikki Linnakangas2019-04-03
| | | | | | | | | | | | | | | | | | | | Instead of WAL-logging every modification during the build separately, first build the index without any WAL-logging, and make a separate pass through the index at the end, to write all pages to the WAL. This significantly reduces the amount of WAL generated, and is usually also faster, despite the extra I/O needed for the extra scan through the index. WAL generated this way is also faster to replay. For GiST, the LSN-NSN interlock makes this a little tricky. All pages must be marked with a valid (i.e. non-zero) LSN, so that the parent-child LSN-NSN interlock works correctly. We now use magic value 1 for that during index build. Change the fake LSN counter to begin from 1000, so that 1 is safely smaller than any real or fake LSN. 2 would've been enough for our purposes, but let's reserve a bigger range, in case we need more special values in the future. Author: Anastasia Lubennikova, Andrey V. Lepikhov Reviewed-by: Heikki Linnakangas, Dmitry Dolgov
* Delete empty pages during GiST VACUUM.Heikki Linnakangas2019-03-22
| | | | | | | | | | | | | | | To do this, we scan GiST two times. In the first pass we make note of empty leaf pages and internal pages. At second pass we scan through internal pages, looking for downlinks to the empty pages. Deleting internal pages is still not supported, like in nbtree, the last child of an internal page is never deleted. That means that if you have a workload where new keys are always inserted to different area than where old keys are removed, the index will still grow without bound. But the rate of growth will be an order of magnitude slower than before. Author: Andrey Borodin Discussion: https://www.postgresql.org/message-id/B1E4DF12-6CD3-4706-BDBD-BF3283328F60@yandex-team.ru
* Support for INCLUDE attributes in GiST indexesAlexander Korotkov2019-03-10
| | | | | | | | | | | | | | | | | Similarly to B-tree, GiST index access method gets support of INCLUDE attributes. These attributes aren't used for tree navigation and aren't present in non-leaf pages. But they are present in leaf pages and can be fetched during index-only scan. The point of having INCLUDE attributes in GiST indexes is slightly different from the point of having them in B-tree. The main point of INCLUDE attributes in B-tree is to define UNIQUE constraint over part of attributes enabled for index-only scan. In GiST the main point of INCLUDE attributes is to use index-only scan for attributes, whose data types don't have GiST opclasses. Discussion: https://postgr.es/m/73A1A452-AD5F-40D4-BD61-978622FF75C1%40yandex-team.ru Author: Andrey Borodin, with small changes by me Reviewed-by: Andreas Karlsson
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Add support for nearest-neighbor (KNN) searches to SP-GiSTAlexander Korotkov2018-09-19
| | | | | | | | | | | | | Currently, KNN searches were supported only by GiST. SP-GiST also capable to support them. This commit implements that support. SP-GiST scan stack is replaced with queue, which serves as stack if no ordering is specified. KNN support is provided for three SP-GIST opclasses: quad_point_ops, kd_point_ops and poly_ops (catversion is bumped). Some common parts between GiST and SP-GiST KNNs are extracted into separate functions. Discussion: https://postgr.es/m/570825e8-47d0-4732-2bf6-88d67d2d51c8%40postgrespro.ru Author: Nikita Glukhov, Alexander Korotkov based on GSoC work by Vlad Sterzhanov Review: Andrey Borodin, Alexander Korotkov
* Provide separate header file for built-in float typesTomas Vondra2018-07-29
| | | | | | | | | | | | | | | | | | Some data types under adt/ have separate header files, but most simple ones do not, and their public functions are defined in builtins.h. As the patches improving geometric types will require making additional functions public, this seems like a good opportunity to create a header for floats types. Commit 1acf757255 made _cmp functions public to solve NaN issues locally for GiST indexes. This patch reworks it in favour of a more widely applicable API. The API uses inline functions, as they are easier to use compared to macros, and avoid double-evaluation hazards. Author: Emre Hasegeli Reviewed-by: Kyotaro Horiguchi Discussion: https://www.postgresql.org/message-id/CAE2gYzxF7-5djV6-cEvqQu-fNsnt%3DEqbOURx7ZDg%2BVv6ZMTWbg%40mail.gmail.com
* Rethink how to get float.h in old Windows API for isnan/isinfAlvaro Herrera2018-07-11
| | | | | | | | | | | | | | | | We include <float.h> in every place that needs isnan(), because MSVC used to require it. However, since MSVC 2013 that's no longer necessary (cf. commit cec8394b5ccd), so we can retire the inclusion to a version-specific stanza in win32_port.h, where it doesn't need to pollute random .c files. The header is of course still needed in a few places for other reasons. I (Álvaro) removed float.h from a few more files than in Emre's original patch. This doesn't break the build in my system, but we'll see what the buildfarm has to say about it all. Author: Emre Hasegeli Discussion: https://postgr.es/m/CAE2gYzyc0+5uG+Cd9-BSL7NKC8LSHLNg1Aq2=8ubjnUwut4_iw@mail.gmail.com
* Update copyright for 2018Bruce Momjian2018-01-02
| | | | Backpatch-through: certain files through 9.3
* Change TRUE/FALSE to true/falsePeter Eisentraut2017-11-08
| | | | | | | | | | | | | | The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Allow no-op GiST support functions to be omitted.Tom Lane2017-09-19
| | | | | | | | | | | | | | | | | | | | | | There are common use-cases in which the compress and/or decompress functions can be omitted, with the result being that we make no data transformation when storing or retrieving index values. Previously, you had to provide a no-op function anyway, but this patch allows such opclass support functions to be omitted. Furthermore, if the compress function is omitted, then the core code knows that the stored representation is the same as the original data. This means we can allow index-only scans without requiring a fetch function to be provided either. Previously you had to provide a no-op fetch function if you wanted IOS to work. This reportedly provides a small performance benefit in such cases, but IMO the real reason for doing it is just to reduce the amount of useless boilerplate code that has to be written for GiST opclasses. Andrey Borodin, reviewed by Dmitriy Sarafannikov Discussion: https://postgr.es/m/CAJEAwVELVx9gYscpE=Be6iJxvdW5unZ_LkcAaVNSeOwvdwtD=A@mail.gmail.com
* Phase 3 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Put back <float.h> in a few files that need it for _isnan().Tom Lane2017-03-08
| | | | | | | | | | | Further fallout from commit c29aff959: there are some files that need <float.h>, and were getting it from datatype/timestamp.h, but it was not apparent in my (tgl's) testing because the requirement for <float.h> exists only on certain Windows toolchains. Report and patch by David Rowley. Discussion: https://postgr.es/m/CAKJS1f-BHceaFzZScFapDV48gUVM2CAOBfhkgffdqXzFb+kwew@mail.gmail.com
* Allow index AMs to return either HeapTuple or IndexTuple format during IOS.Tom Lane2017-02-27
| | | | | | | | | | | | | | | | | | | | | | Previously, only IndexTuple format was supported for the output data of an index-only scan. This is fine for btree, which is just returning a verbatim index tuple anyway. It's not so fine for SP-GiST, which can return reconstructed data that's much larger than a page. To fix, extend the index AM API so that index-only scan data can be returned in either HeapTuple or IndexTuple format. There's other ways we could have done it, but this way avoids an API break for index AMs that aren't concerned with the issue, and it costs little except a couple more fields in IndexScanDescs. I changed both GiST and SP-GiST to use the HeapTuple method. I'm not very clear on whether GiST can reconstruct data that's too large for an IndexTuple, but that seems possible, and it's not much of a code change to fix. Per a complaint from Vik Fearing. Reviewed by Jason Li. Discussion: https://postgr.es/m/49527f79-530d-0bfe-3dad-d183596afa92@2ndquadrant.fr
* Update copyright via script for 2017Bruce Momjian2017-01-03
|
* Suppress compiler warnings in non-cassert builds.Tom Lane2016-08-23
| | | | | | With Asserts off, these variables are set but never used, resulting in warnings from pickier compilers. Fix that with our standard solution. Per report from Jeff Janes.
* Add SQL-accessible functions for inspecting index AM properties.Tom Lane2016-08-13
| | | | | | | | | | | | | | | | | | | | | Per discussion, we should provide such functions to replace the lost ability to discover AM properties by inspecting pg_am (cf commit 65c5fcd35). The added functionality is also meant to displace any code that was looking directly at pg_index.indoption, since we'd rather not believe that the bit meanings in that field are part of any client API contract. As future-proofing, define the SQL API to not assume that properties that are currently AM-wide or index-wide will remain so unless they logically must be; instead, expose them only when inquiring about a specific index or even specific index column. Also provide the ability for an index AM to override the behavior. In passing, document pg_am.amtype, overlooked in commit 473b93287. Andrew Gierth, with kibitzing by me and others Discussion: <87mvl5on7n.fsf@news-spur.riddles.org.uk>
* Revert no-op changes to BufferGetPage()Kevin Grittner2016-04-20
| | | | | | | | | | | | | | | | | | The reverted changes were intended to force a choice of whether any newly-added BufferGetPage() calls needed to be accompanied by a test of the snapshot age, to support the "snapshot too old" feature. Such an accompanying test is needed in about 7% of the cases, where the page is being used as part of a scan rather than positioning for other purposes (such as DML or vacuuming). The additional effort required for back-patching, and the doubt whether the intended benefit would really be there, have indicated it is best just to rely on developers to do the right thing based on comments and existing usage, as we do with many other conventions. This change should have little or no effect on generated executable code. Motivated by the back-patching pain of Tom Lane and Robert Haas
* Modify BufferGetPage() to prepare for "snapshot too old" featureKevin Grittner2016-04-08
| | | | | | | | | | | This patch is a no-op patch which is intended to reduce the chances of failures of omission once the functional part of the "snapshot too old" patch goes in. It adds parameters for snapshot, relation, and an enum to specify whether the snapshot age check needs to be done for the page at this point. This initial patch passes NULL for the first two new parameters and BGP_NO_SNAPSHOT_TEST for the third. The follow-on patch will change the places where the test needs to be made.
* Restructure index access method API to hide most of it at the C level.Tom Lane2016-01-17
| | | | | | | | | | | | | | | | | | | | | | | | This patch reduces pg_am to just two columns, a name and a handler function. All the data formerly obtained from pg_am is now provided in a C struct returned by the handler function. This is similar to the designs we've adopted for FDWs and tablesample methods. There are multiple advantages. For one, the index AM's support functions are now simple C functions, making them faster to call and much less error-prone, since the C compiler can now check function signatures. For another, this will make it far more practical to define index access methods in installable extensions. A disadvantage is that SQL-level code can no longer see attributes of index AMs; in particular, some of the crosschecks in the opr_sanity regression test are no longer possible from SQL. We've addressed that by adding a facility for the index AM to perform such checks instead. (Much more could be done in that line, but for now we're content if the amvalidate functions more or less replace what opr_sanity used to do.) We might also want to expose some sort of reporting functionality, but this patch doesn't do that. Alexander Korotkov, reviewed by Petr Jelínek, and rather heavily editorialized on by me.
* Update copyright for 2016Bruce Momjian2016-01-02
| | | | Backpatch certain files through 9.1
* pgindent run for 9.5Bruce Momjian2015-05-23
|
* Fix GiST index-only scans for opclasses with different storage type.Heikki Linnakangas2015-03-26
| | | | | | | | | We cannot use the index's tuple descriptor directly to describe the index tuples returned in an index-only scan. That's because the index might use a different datatype for the values stored on disk than the type originally indexed. As long as they were both pass-by-ref, it worked, but will not work for pass-by-value types of different sizes. I noticed this as a crash when I started hacking a patch to add fetch methods to btree_gist.
* Add support for index-only scans in GiST.Heikki Linnakangas2015-03-26
| | | | | | | | | | | | | | This adds a new GiST opclass method, 'fetch', which is used to reconstruct the original Datum from the value stored in the index. Also, the 'canreturn' index AM interface function gains a new 'attno' argument. That makes it possible to use index-only scans on a multi-column index where some of the opclasses support index-only scans but some do not. This patch adds support in the box and point opclasses. Other opclasses can added later as follow-on patches (btree_gist would be particularly interesting). Anastasia Lubennikova, with additional fixes and modifications by me.
* Minor cleanup of GiST code, for readability.Heikki Linnakangas2015-03-26
| | | | | Remove the gistcentryinit function, inlining the relevant part of it into the only caller.
* Update copyright for 2015Bruce Momjian2015-01-06
| | | | Backpatch certain files through 9.0
* pgindent run for 9.4Bruce Momjian2014-05-06
| | | | | This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
* Update copyright for 2014Bruce Momjian2014-01-07
| | | | | Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.
* pgindent run for release 9.3Bruce Momjian2013-05-29
| | | | | This is the first run of the Perl-based pgindent script. Also update pgindent instructions.
* Support unlogged GiST index.Heikki Linnakangas2013-02-11
| | | | | | | | | | | | | | | The reason this wasn't supported before was that GiST indexes need an increasing sequence to detect concurrent page-splits. In a regular WAL- logged GiST index, the LSN of the page-split record is used for that purpose, and in a temporary index, we can get away with a backend-local counter. Neither of those methods works for an unlogged relation. To provide such an increasing sequence of numbers, create a "fake LSN" counter that is saved and restored across shutdowns. On recovery, unlogged relations are blown away, so the counter doesn't need to survive that either. Jeevan Chalke, based on discussions with Robert Haas, Tom Lane and me.
* Repair bugs in GiST page splitting code for multi-column indexes.Tom Lane2013-02-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When considering a non-last column in a multi-column GiST index, gistsplit.c tries to improve on the split chosen by the opclass-specific pickSplit function by considering penalties for the next column. However, there were two bugs in this code: it failed to recompute the union keys for the leftmost index columns, even though these might well change after reassigning tuples; and it included the old union keys in the recomputation for the columns it did recompute, so that those keys couldn't get smaller even if they should. The first problem could result in an invalid index in which searches wouldn't find index entries that are in fact present; the second would make the index less efficient to search. Both of these errors were caused by misuse of gistMakeUnionItVec, whose API was designed in a way that just begged such errors to be made. There is no situation in which it's safe or useful to compute the union keys for a subset of the index columns, and there is no caller that wants any previous union keys to be included in the computation; so the undocumented choice to treat the union keys as in/out rather than pure output parameters is a waste of code as well as being dangerous. Hence, rather than just making a minimal patch, I've changed the API of gistMakeUnionItVec to remove the "startkey" parameter (it now always processes all index columns) and treat the attr/isnull arrays as purely output parameters. In passing, also get rid of a couple of unnecessary and dangerous uses of static variables in gistutil.c. It's remarkable that the one in gistMakeUnionKey hasn't given us portability troubles before now, because in addition to posing a re-entrancy hazard, it was unsafely assuming that a static char[] array would have at least Datum alignment. Per investigation of a trouble report from Tomas Vondra. (There are also some bugs in contrib/btree_gist to be fixed, but that seems like material for a separate patch.) Back-patch to all supported branches.
* Add some randomness to the choice of which GiST page to insert to.Heikki Linnakangas2013-01-25
| | | | | | | | | | | | | | | | | When descending the tree for an insert, and there are multiple equally good pages we could insert to, make the choice in random. Previously, we would always choose the tuple with lowest offset number. That meant that when two non-leaf pages overlap - in the extreme case they might have exactly the same key - all but the first such page went unused. That wasn't optimal for space usage; if you deleted some tuples from the non-first pages, the space would never be reused. With this patch, the other pages are sometimes chosen too, although there's still a heavy bias towards low-offset tuples, so that we don't lose cache locality when doing a lot of inserts with similar keys. Original idea by Alexander Korotkov, although this patch version was written by me and copy-edited by Tom Lane.
* Update copyrights for 2013Bruce Momjian2013-01-01
| | | | | Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
* Improve coding of gistchoose and gistRelocateBuildBuffersOnSplit.Tom Lane2012-08-30
| | | | | | | | This is mostly cosmetic, but it does eliminate a speculative portability issue. The previous coding ignored the fact that sum_grow could easily overflow (in fact, it could be summing multiple IEEE float infinities). On a platform where that didn't guarantee to produce a positive result, the code would misbehave. In any case, it was less than readable.
* Fix logic bug in gistchoose and gistRelocateBuildBuffersOnSplit.Robert Haas2012-08-30
| | | | | | | | | | | | Every time the best-tuple-found-so-far changes, we need to reset all the penalty values in which_grow[] to the penalties for the new best tuple. The old code failed to do this, resulting in inferior index quality. The original patch from Alexander Korotkov was just two lines; I took the liberty of fleshing that out by adding a bunch of comments that I hope will make this logic easier for others to understand than it was for me.
* Replace XLogRecPtr struct with a 64-bit integer.Heikki Linnakangas2012-06-24
| | | | | | | | | | | | | | This simplifies code that needs to do arithmetic on XLogRecPtrs. To avoid changing on-disk format of data pages, the LSN on data pages is still stored in the old format. That should keep pg_upgrade happy. However, we have XLogRecPtrs embedded in the control file, and in the structs that are sent over the replication protocol, so this changes breaks compatibility of pg_basebackup and server. I didn't do anything about this in this patch, per discussion on -hackers, the right thing to do would to be to change the replication protocol to be architecture-independent, so that you could use a newer version of pg_receivexlog, for example, against an older server version.
* Assorted comment fixes, mostly just typos, but some obsolete statements.Tom Lane2012-01-29
| | | | YAMAMOTO Takashi
* Update copyright notices for year 2012.Bruce Momjian2012-01-01
|
* Use IEEE infinity, not 1e10, for null-and-not-null case in gistpenalty().Tom Lane2011-11-27
| | | | | | | Use of a randomly chosen large value was never exactly graceful, and now that there are penalty functions that are intentionally using infinity, it doesn't seem like a good idea for null-vs-not-null to be using something less.
* Buffering GiST index build algorithm.Heikki Linnakangas2011-09-08
| | | | | | | | | When building a GiST index that doesn't fit in cache, buffers are attached to some internal nodes in the index. This speeds up the build by avoiding random I/O that would otherwise be needed to traverse all the way down the tree to the find right leaf page for tuple. Alexander Korotkov
* Remove unnecessary #include references, per pgrminclude script.Bruce Momjian2011-09-01
|
* Pgindent run before 9.1 beta2.Bruce Momjian2011-06-09
|
* Protect GIST logic that assumes penalty values can't be negative.Tom Lane2011-05-31
| | | | | | | | | | Apparently sane-looking penalty code might return small negative values, for example because of roundoff error. This will confuse places like gistchoose(). Prevent problems by clamping negative penalty values to zero. (Just to be really sure, I also made it force NaNs to zero.) Back-patch to all supported branches. Alexander Korotkov
* Make GIN and GIST pass the index collation to all their support functions.Tom Lane2011-04-22
| | | | | | | Experimentation with contrib/btree_gist shows that the majority of the GIST support functions potentially need collation information. Safest policy seems to be to pass it to all of them, instead of making assumptions about which ones could possibly need it.
* pgindent run before PG 9.1 beta 1.Bruce Momjian2011-04-10
|
* Stamp copyrights for year 2011.Bruce Momjian2011-01-01
|
* Rewrite the GiST insertion logic so that we don't need the post-recoveryHeikki Linnakangas2010-12-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cleanup stage to finish incomplete inserts or splits anymore. There was two reasons for the cleanup step: 1. When a new tuple was inserted to a leaf page, the downlink in the parent needed to be updated to contain (ie. to be consistent with) the new key. Updating the parent in turn might require recursively updating the parent of the parent. We now handle that by updating the parent while traversing down the tree, so that when we insert the leaf tuple, all the parents are already consistent with the new key, and the tree is consistent at every step. 2. When a page is split, we need to insert the downlink for the new right page(s), and update the downlink for the original page to not include keys that moved to the right page(s). We now handle that by setting a new flag, F_FOLLOW_RIGHT, on the non-rightmost pages in the split. When that flag is set, scans always follow the rightlink, regardless of the NSN mechanism used to detect concurrent page splits. That way the tree is consistent right after split, even though the downlink is still missing. This is very similar to the way B-tree splits are handled. When the downlink is inserted in the parent, the flag is cleared. To keep the insertion algorithm simple, when an insertion sees an incomplete split, indicated by the F_FOLLOW_RIGHT flag, it finishes the split before doing anything else. These changes allow removing the whole "invalid tuple" mechanism, but I retained the scan code to still follow invalid tuples correctly. While we don't create any such tuples anymore, we want to handle them gracefully in case you pg_upgrade a GiST index that has them. If we encounter any on an insert, though, we just throw an error saying that you need to REINDEX. The issue that got me into doing this is that if you did a checkpoint while an insert or split was in progress, and the checkpoint finishes quickly so that there is no WAL record related to the insert between RedoRecPtr and the checkpoint record, recovery from that checkpoint would not know to finish the incomplete insert. IOW, we have the same issue we solved with the rm_safe_restartpoint mechanism during normal operation too. It's highly unlikely to happen in practice, and this fix is far too large to backpatch, so we're just going to live with in previous versions, but this refactoring fixes it going forward. With this patch, you don't get the annoying 'index "FOO" needs VACUUM or REINDEX to finish crash recovery' notices anymore if you crash at an unfortunate moment.
* The GiST scan algorithm uses LSNs to detect concurrent pages splits, butHeikki Linnakangas2010-11-16
| | | | | | | | | | | | | temporary indexes are not WAL-logged. We used a constant LSN for temporary indexes, on the assumption that we don't need to worry about concurrent page splits in temporary indexes because they're only visible to the current session. But that assumption is wrong, it's possible to insert rows and split pages in the same session, while a scan is in progress. For example, by opening a cursor and fetching some rows, and INSERTing new rows before fetching some more. Fix by generating fake increasing LSNs, used in place of real LSNs in temporary GiST indexes.
* Remove cvs keywords from all files.Magnus Hagander2010-09-20
|
* Update copyright for the year 2010.Bruce Momjian2010-01-02
|