aboutsummaryrefslogtreecommitdiff
path: root/src/include/access/gin.h
Commit message (Collapse)AuthorAge
...
* Rewrite the key-combination logic in GIN's keyGetItem() and scanGetItem()Tom Lane2010-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | routines to make them behave better in the presence of "lossy" index pointers. The previous coding was outright incorrect for some cases, as recently reported by Artur Dabrowski: scanGetItem would fail to return index entries in cases where one index key had multiple exact pointers on the same page as another key had a lossy pointer. Also, keyGetItem was extremely inefficient for cases where a single index key generates multiple "entry" streams, such as an @@ operator with a multiple-clause tsquery. The presence of a lossy page pointer in any one stream defeated its ability to use the opclass consistentFn, resulting in probing many heap pages that didn't really need to be visited. In Artur's example case, a query like WHERE tsvector @@ to_tsquery('a & b') was about 50X slower than the theoretically equivalent WHERE tsvector @@ to_tsquery('a') AND tsvector @@ to_tsquery('b') The way that I chose to fix this was to have GIN call the consistentFn twice with both TRUE and FALSE values for the in-doubt entry stream, returning a hit if either call produces TRUE, but not if they both return FALSE. The code handles this for the case of a single in-doubt entry stream, but punts (falling back to the stupid behavior) if there's more than one lossy reference to the same page. The idea could be scaled up to deal with multiple lossy references, but I think that would probably be wasted complexity. At least to judge by Artur's example, such cases don't occur often enough to be worth trying to optimize. Back-patch to 8.4. 8.3 did not have lossy GIN index pointers, so not subject to these problems.
* pgindent run for 9.0Bruce Momjian2010-02-26
|
* Generic implementation of red-black binary tree. It's planned to use inTeodor Sigaev2010-02-11
| | | | | | several places, but for now only GIN uses it during index creation. Using self-balanced tree greatly speeds up index creation in corner cases with preordered data.
* Update copyright for the year 2010.Bruce Momjian2010-01-02
|
* Make sure that GIN fast-insert and regular code paths enforce the sameTom Lane2009-10-02
| | | | | | | | | | | tuple size limit. Improve the error message for index-tuple-too-large so that it includes the actual size, the limit, and the index name. Sync with the btree occurrences of the same error. Back-patch to 8.4 because it appears that the out-of-sync problem is occurring in the field. Teodor and Tom
* 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef listBruce Momjian2009-06-11
| | | | provided by Andrew.
* Fix a serious bug introduced into GIN in 8.4: now that MergeItemPointers()Tom Lane2009-06-06
| | | | | | | | | is supposed to remove duplicate heap TIDs, we have to be sure to reduce the tuple size and posting-item count accordingly in addItemPointersToTuple(). Failing to do so resulted in the effective injection of garbage TIDs into the index contents, ie, whatever happened to be in the memory palloc'd for the new tuple. I'm not sure that this fully explains the index corruption reported by Tatsuo Ishii, but the test case I'm using no longer fails.
* GIN's ItemPointerIsMin, ItemPointerIsMax, and ItemPointerIsLossyPage macrosTom Lane2009-06-05
| | | | | | | | | | | | should use GinItemPointerGetBlockNumber/GinItemPointerGetOffsetNumber, not ItemPointerGetBlockNumber/ItemPointerGetOffsetNumber, because the latter will Assert() on ip_posid == 0, ie a "Min" pointer. (Thus, ItemPointerIsMin has never worked at all, but it seems unused at present.) I'm not certain that the case can occur in normal functioning, but it's blowing up on me while investigating Tatsuo-san's data corruption problem. In any case it seems like a problem waiting to bite someone. Back-patch just in case this really is a problem for somebody in the field.
* Adjust the APIs for GIN opclass support functions to allow the extractQuery()Tom Lane2009-03-25
| | | | | | | | | | | | | | method to pass extra data to the consistent() and comparePartial() methods. This is the core infrastructure needed to support the soon-to-appear contrib/btree_gin module. The APIs are still upward compatible with the definitions used in 8.3 and before, although *not* with the previous 8.4devel function definitions. catversion bump for changes in pg_proc entries (although these are just cosmetic, since GIN doesn't actually look at the function signature before calling it...) Teodor Sigaev and Oleg Bartunov
* Install a search tree depth limit in GIN bulk-insert operations, to preventTom Lane2009-03-24
| | | | | | | | | | | | them from degrading badly when the input is sorted or nearly so. In this scenario the tree is unbalanced to the point of becoming a mere linked list, so insertions become O(N^2). The easiest and most safely back-patchable solution is to stop growing the tree sooner, ie limit the growth of N. We might later consider a rebalancing tree algorithm, but it's not clear that the benefit would be worth the cost and complexity. Per report from Sergey Burladyan and an earlier complaint from Heikki. Back-patch to 8.2; older versions didn't have GIN indexes.
* Implement "fastupdate" support for GIN indexes, in which we try to accumulateTom Lane2009-03-24
| | | | | | | | | | | | | | | | | | multiple index entries in a holding area before adding them to the main index structure. This helps because bulk insert is (usually) significantly faster than retail insert for GIN. This patch also removes GIN support for amgettuple-style index scans. The API defined for amgettuple is difficult to support with fastupdate, and the previously committed partial-match feature didn't really work with it either. We might eventually figure a way to put back amgettuple support, but it won't happen for 8.4. catversion bumped because of change in GIN's pg_am entry, and because the format of GIN indexes changed on-disk (there's a metapage now, and possibly a pending list). Teodor Sigaev
* Revise the TIDBitmap API to support multiple concurrent iterations over aTom Lane2009-01-10
| | | | | | bitmap. This is extracted from Greg Stark's posix_fadvise patch; it seems worth committing separately, since it's potentially useful independently of posix_fadvise.
* Update copyright for 2009.Bruce Momjian2009-01-01
|
* Clean up the messy semantics (not to mention inefficiency) of PageGetTempPageTom Lane2008-11-03
| | | | | | by splitting it into three functions with better-defined behaviors. Zdenek Kotala
* Remove mark/restore support in GIN and GiST indexes.Teodor Sigaev2008-10-20
| | | | | Per Tom's comment. Also revome useless GISTScanOpaque->flags field.
* Change the PageGetContents() macro to guarantee its result is maxalign'd,Tom Lane2008-07-13
| | | | | | | | | thereby forestalling any problems with alignment of the data structure placed there. Since SizeOfPageHeaderData is maxalign'd anyway in 8.3 and HEAD, this does not actually change anything right now, but it is foreseeable that the header size will change again someday. I had to fix a couple of places that were assuming that the content offset is just SizeOfPageHeaderData rather than MAXALIGN(SizeOfPageHeaderData). Per discussion of Zdenek's page-macros patch.
* Multi-column GIN indexes. Teodor SigaevTom Lane2008-07-11
|
* Improve our #include situation by moving pointer types away from theAlvaro Herrera2008-06-19
| | | | | | | corresponding struct definitions. This allows other headers to avoid including certain highly-loaded headers such as rel.h and relscan.h, instead using just relcache.h, heapam.h or genam.h, which are more lightweight and thus cause less unnecessary dependencies.
* Change xlog.h to xlogdefs.h in bufpage.h, and fix fallout.Alvaro Herrera2008-06-06
|
* Extend GIN to support partial-match searches, and extend tsquery to supportTom Lane2008-05-16
| | | | | | prefix matching using this facility. Teodor Sigaev and Oleg Bartunov
* Restructure some header files a bit, in particular heapam.h, by removing someAlvaro Herrera2008-05-12
| | | | | | | | | | | | unnecessary #include lines in it. Also, move some tuple routine prototypes and macros to htup.h, which allows removal of heapam.h inclusion from some .c files. For this to work, a new header file access/sysattr.h needed to be created, initially containing attribute numbers of system columns, for pg_dump usage. While at it, make contrib ltree, intarray and hstore header files more consistent with our header style.
* Fix using too many LWLocks bug, reported by Craig RingerTeodor Sigaev2008-04-22
| | | | | | | | | <craig@postnewspapers.com.au>. It was my mistake, I missed limitation of number of held locks, now GIN doesn't use continiuous locks, but still hold buffers pinned to prevent interference with vacuum's deletion algorithm. Backpatch is needed.
* Replace "amgetmulti" AM functions with "amgetbitmap", in which the wholeTom Lane2008-04-10
| | | | | | | | | | | | | | | | | | indexscan always occurs in one call, and the results are returned in a TIDBitmap instead of a limited-size array of TIDs. This should improve speed a little by reducing AM entry/exit overhead, and it is necessary infrastructure if we are ever to support bitmap indexes. In an only slightly related change, add support for TIDBitmaps to preserve (somewhat lossily) the knowledge that particular TIDs reported by an index need to have their quals rechecked when the heap is visited. This facility is not really used yet; we'll need to extend the forced-recheck feature to plain indexscans before it's useful, and that hasn't been coded yet. The intent is to use it to clean up 8.3's horrid @@@ kluge for text search with weighted queries. There might be other uses in future, but that one alone is sufficient reason. Heikki Linnakangas, with some adjustments by me.
* Update copyrights in source tree to 2008.Bruce Momjian2008-01-01
|
* GIN index build's allocatedMemory counter needs to be long, not uint32.Tom Lane2007-11-16
| | | | | | | | | Else, in a 64-bit machine with maintenance_work_mem set to above 4Gb, the counter overflows and we never recognize having reached the maintenance_work_mem limit. I believe this explains out-of-memory failure recently reported by Sean Davis. This is a bug, so backpatch to 8.2.
* pgindent run for 8.3.Bruce Momjian2007-11-15
|
* Tsearch2 functionality migrates to core. The bulk of this work is byTom Lane2007-08-21
| | | | | | | | Oleg Bartunov and Teodor Sigaev, but I did a lot of editorializing, so anything that's broken is probably my fault. Documentation is nonexistent as yet, but let's land the patch so we can get some portability testing done.
* Rename DLLIMPORT macro to PGDLLIMPORT to avoid conflict withMagnus Hagander2007-07-25
| | | | third party includes (like tcl) that define DLLIMPORT.
* Minor tweaking of index special-space definitions so that the variousTom Lane2007-04-09
| | | | | | | | | | index types can be reliably distinguished by examining the special space on an index page. Per my earlier proposal, plus the realization that there's no need for btree's vacuum cycle ID to cycle through every possible 16-bit value. Restricting its range a little costs nearly nothing and eliminates the possibility of collisions. Memo to self: remember to make bitmap indexes play along with this scheme, assuming that patch ever gets accepted.
* Allow GIN's extractQuery method to signal that nothing can satisfy the query.Teodor Sigaev2007-01-31
| | | | | | | | | | | | | In this case extractQuery should returns -1 as nentries. This changes prototype of extractQuery method to use int32* instead of uint32* for nentries argument. Based on that gincostestimate may see two corner cases: nothing will be found or seqscan should be used. Per proposal at http://archives.postgresql.org/pgsql-hackers/2007-01/msg01581.php PS tsearch_core patch should be sightly modified to support changes, but I'm waiting a verdict about reviewing of tsearch_core patch.
* Make use of qsort_arg in several places that were formerly using klugyTom Lane2006-10-05
| | | | | | static variables. This avoids any risk of potential non-reentrancy, and in particular offers a much cleaner workaround for the Intel compiler bug that was affecting ginutil.c.
* pgindent run for 8.2.Bruce Momjian2006-10-04
|
* If we're going to advertise the array overlap/containment operators,Tom Lane2006-09-10
| | | | | | we probably should make them work reliably for all arrays. Fix code to handle NULLs and multidimensional arrays, move it into arrayfuncs.c. GIN is still restricted to indexing arrays with no null elements, however.
* Make recovery from WAL be restartable, by executing a checkpoint-likeTom Lane2006-08-07
| | | | | | | | | | operation every so often. This improves the usefulness of PITR log shipping for hot standby: formerly, if the standby server crashed, it was necessary to restart it from the last base backup and replay all the WAL since then. Now it will only need to reread about the same amount of WAL as the master server would. The behavior might also come in handy during a long PITR replay sequence. Simon Riggs, with some editorialization by Tom Lane.
* GIN improvementsTeodor Sigaev2006-07-11
| | | | | | | | - Replace sorted array of entries in maintenance_work_mem to binary tree, this should improve create performance. - More precisely calculate allocated memory, eliminate leaks with user-defined extractValue() - Improve wordings in tsearch2
* Allow each C include file to compile on its own by including any neededBruce Momjian2006-07-11
| | | | header files.
* Code review for FILLFACTOR patch. Change WITH grammar as per earlierTom Lane2006-07-03
| | | | | | | | | | | | | | | | discussion (including making def_arg allow reserved words), add missed opt_definition for UNIQUE case. Put the reloptions support code in a less random place (I chose to make a new file access/common/reloptions.c). Eliminate header inclusion creep. Make the index options functions safely user-callable (seems like client apps might like to be able to test validity of options before trying to make an index). Reduce overhead for normal case with no options by allowing rd_options to be NULL. Fix some unmaintainably klugy code, including getting rid of Natts_pg_class_fixed at long last. Some stylistic cleanup too, and pay attention to keeping comments in sync with code. Documentation still needs work, though I did fix the omissions in catalogs.sgml and indexam.sgml.
* Add FILLFACTOR to CREATE INDEX.Bruce Momjian2006-07-02
| | | | ITAGAKI Takahiro
* GIN: Generalized Inverted iNdex.Teodor Sigaev2006-05-02
text[], int4[], Tsearch2 support for GIN.