author: Tom Lane <tgl@sss.pgh.pa.us> 2004-11-27 00:01:02 +0000
committer: Tom Lane <tgl@sss.pgh.pa.us> 2004-11-27 00:01:02 +0000
commit: b82323e05e57d7c4fb7a8eab9f27eb059d28309a
tree: 9ec63971295146e7dc4033744db129e2921893a1
parent: c2e563176049e514b8af1af4b759ef4e64850cdc
This adds mention of my latest tweak to the tsearch2/pg_trgm
integration. It is much better to create a word list of unstemmed
words than stemmed ones.

Chris K-L
-rw-r--r-- contrib/pg_trgm/README.pg_trgm 14
1 files changed, 9 insertions, 5 deletions
diff --git a/contrib/pg_trgm/README.pg_trgm b/contrib/pg_trgm/README.pg_trgm
index ac2eb012de5..608c30c455c 100644
--- a/contrib/pg_trgm/README.pg_trgm
+++ b/contrib/pg_trgm/README.pg_trgm
@@ -100,11 +100,15 @@ Tsearch2 Integration
The first step is to generate an auxiliary table containing all
the unique words in the Tsearch2 index:
- CREATE TABLE words AS
- SELECT word FROM stat('SELECT vector FROM documents');
-
- Where 'documents' is the table that contains the Tsearch2 index
- column 'vector', of type 'tsvector'.
+ CREATE TABLE words AS SELECT word FROM
+ stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
+
+ Where 'documents' is a table that has a text field 'bodytext'
+ that TSearch2 is used to search. The use of the 'simple' dictionary
+ with the to_tsvector function, instead of just using the already
+ existing vector is to avoid creating a list of already stemmed
+ words. This way, only the original, unstemmed words are added
+ to the word list.
Next, create a trigram index on the word column: