author | Peter Eisentraut <peter_e@gmx.net> | 2010-11-23 22:27:50 +0200 |
---|---|---|
committer | Peter Eisentraut <peter_e@gmx.net> | 2010-11-23 22:34:55 +0200 |
commit | fc946c39aeacdff7df60c83fca6582985e8546c8 | |
tree | 866145f64c09c0673a4aa3d3a2f5647f0b7afc45 /src/backend/access/gin | |
parent | 44475e782f4674d257b9e5c1a3930218a4b4deea | |
Remove useless whitespace at end of lines
Diffstat (limited to 'src/backend/access/gin')
-rw-r--r-- | src/backend/access/gin/README | 32 |
1 files changed, 16 insertions, 16 deletions
diff --git a/src/backend/access/gin/README b/src/backend/access/gin/README
index 69d5a319413..0f634f83d17 100644
--- a/src/backend/access/gin/README
+++ b/src/backend/access/gin/README
@@ -9,27 +9,27 @@ Gin stands for Generalized Inverted Index and should be considered as a genie,
 not a drink.
 
 Generalized means that the index does not know which operation it accelerates.
-It instead works with custom strategies, defined for specific data types (read
-"Index Method Strategies" in the PostgreSQL documentation). In that sense, Gin
+It instead works with custom strategies, defined for specific data types (read
+"Index Method Strategies" in the PostgreSQL documentation). In that sense, Gin
 is similar to GiST and differs from btree indices, which have predefined,
 comparison-based operations.
 
-An inverted index is an index structure storing a set of (key, posting list)
-pairs, where 'posting list' is a set of documents in which the key occurs.
-(A text document would usually contain many keys.) The primary goal of
+An inverted index is an index structure storing a set of (key, posting list)
+pairs, where 'posting list' is a set of documents in which the key occurs.
+(A text document would usually contain many keys.) The primary goal of
 Gin indices is support for highly scalable, full-text search in PostgreSQL.
 
 Gin consists of a B-tree index constructed over entries (ET, entries tree),
 where each entry is an element of the indexed value (element of array, lexeme
-for tsvector) and where each tuple in a leaf page is either a pointer to a
-B-tree over item pointers (PT, posting tree), or a list of item pointers
+for tsvector) and where each tuple in a leaf page is either a pointer to a
+B-tree over item pointers (PT, posting tree), or a list of item pointers
 (PL, posting list) if the tuple is small enough.
 
 Note: There is no delete operation for ET. The reason for this is that in our
 experience, the set of distinct words in a large corpus changes very rarely.
 This greatly simplifies the code and concurrency algorithms.
 
-Gin comes with built-in support for one-dimensional arrays (eg. integer[],
+Gin comes with built-in support for one-dimensional arrays (eg. integer[],
 text[]), but no support for NULL elements. The following operations are
 available:
 
@@ -59,25 +59,25 @@ Gin Fuzzy Limit
 
 There are often situations when a full-text search returns a very large set of
 results. Since reading tuples from the disk and sorting them could take a
-lot of time, this is unacceptable for production. (Note that the search
+lot of time, this is unacceptable for production. (Note that the search
 itself is very fast.)
 
-Such queries usually contain very frequent lexemes, so the results are not
-very helpful. To facilitate execution of such queries Gin has a configurable
-soft upper limit on the size of the returned set, determined by the
-'gin_fuzzy_search_limit' GUC variable. This is set to 0 by default (no
+Such queries usually contain very frequent lexemes, so the results are not
+very helpful. To facilitate execution of such queries Gin has a configurable
+soft upper limit on the size of the returned set, determined by the
+'gin_fuzzy_search_limit' GUC variable. This is set to 0 by default (no
 limit).
 
 If a non-zero search limit is set, then the returned set is a subset of the
 whole result set, chosen at random.
 
 "Soft" means that the actual number of returned results could slightly differ
-from the specified limit, depending on the query and the quality of the
+from the specified limit, depending on the query and the quality of the
 system's random number generator.
 
 From experience, a value of 'gin_fuzzy_search_limit' in the thousands
 (eg. 5000-20000) works well. This means that 'gin_fuzzy_search_limit' will
-have no effect for queries returning a result set with less tuples than this
+have no effect for queries returning a result set with less tuples than this
 number.
 
 Limitations
@@ -115,5 +115,5 @@ Distant future:
 
 Authors
 -------
-All work was done by Teodor Sigaev (teodor@sigaev.ru) and Oleg Bartunov
+All work was done by Teodor Sigaev (teodor@sigaev.ru) and Oleg Bartunov
 (oleg@sai.msu.su).