Text Search Functions and Operators
full text search
functions and operators
text search
functions and operators
,
and
summarize the functions and operators that are provided
for full text searching. See for a detailed
explanation of PostgreSQL's text search
facility.
Text Search Operators
Operator
Description
Example(s)
tsvector @@ tsquery
boolean
tsquery @@ tsvector
boolean
Does tsvector match tsquery?
(The arguments can be given in either order.)
to_tsvector('fat cats ate rats') @@ to_tsquery('cat & rat')
t
text @@ tsquery
boolean
Does text string, after implicit invocation
of to_tsvector(), match tsquery?
'fat cats ate rats' @@ to_tsquery('cat & rat')
t
tsvector || tsvector
tsvector
Concatenates two tsvectors. If both inputs contain
lexeme positions, the second input's positions are adjusted
accordingly.
'a:1 b:2'::tsvector || 'c:1 d:2 b:3'::tsvector
'a':1 'b':2,5 'c':3 'd':4
tsquery && tsquery
tsquery
ANDs two tsquerys together, producing a query that
matches documents that match both input queries.
'fat | rat'::tsquery && 'cat'::tsquery
( 'fat' | 'rat' ) & 'cat'
tsquery || tsquery
tsquery
ORs two tsquerys together, producing a query that
matches documents that match either input query.
'fat | rat'::tsquery || 'cat'::tsquery
'fat' | 'rat' | 'cat'
!! tsquery
tsquery
Negates a tsquery, producing a query that matches
documents that do not match the input query.
!! 'cat'::tsquery
!'cat'
tsquery <-> tsquery
tsquery
Constructs a phrase query, which matches if the two input queries
match at successive lexemes.
to_tsquery('fat') <-> to_tsquery('rat')
'fat' <-> 'rat'
tsquery @> tsquery
boolean
Does first tsquery contain the second? (This considers
only whether all the lexemes appearing in one query appear in the
other, ignoring the combining operators.)
'cat'::tsquery @> 'cat & rat'::tsquery
f
tsquery <@ tsquery
boolean
Is first tsquery contained in the second? (This
considers only whether all the lexemes appearing in one query appear
in the other, ignoring the combining operators.)
'cat'::tsquery <@ 'cat & rat'::tsquery
t
'cat'::tsquery <@ '!cat & rat'::tsquery
t
In addition to these specialized operators, the usual comparison
operators shown in are
available for types tsvector and tsquery.
These are not very
useful for text searching but allow, for example, unique indexes to be
built on columns of these types.
Text Search Functions
Function
Description
Example(s)
array_to_tsvector
array_to_tsvector ( text[] )
tsvector
Converts an array of text strings to a tsvector.
The given strings are used as lexemes as-is, without further
processing. Array elements must not be empty strings
or NULL.
array_to_tsvector('{fat,cat,rat}'::text[])
'cat' 'fat' 'rat'
get_current_ts_config
get_current_ts_config ( )
regconfig
Returns the OID of the current default text search configuration
(as set by ).
get_current_ts_config()
english
length
length ( tsvector )
integer
Returns the number of lexemes in the tsvector.
length('fat:2,4 cat:3 rat:5A'::tsvector)
3
numnode
numnode ( tsquery )
integer
Returns the number of lexemes plus operators in
the tsquery.
numnode('(fat & rat) | cat'::tsquery)
5
plainto_tsquery
plainto_tsquery (
config regconfig,
query text )
tsquery
Converts text to a tsquery, normalizing words according to
the specified or default configuration. Any punctuation in the string
is ignored (it does not determine query operators). The resulting
query matches documents containing all non-stopwords in the text.
plainto_tsquery('english', 'The Fat Rats')
'fat' & 'rat'
phraseto_tsquery
phraseto_tsquery (
config regconfig,
query text )
tsquery
Converts text to a tsquery, normalizing words according to
the specified or default configuration. Any punctuation in the string
is ignored (it does not determine query operators). The resulting
query matches phrases containing all non-stopwords in the text.
phraseto_tsquery('english', 'The Fat Rats')
'fat' <-> 'rat'
phraseto_tsquery('english', 'The Cat and Rats')
'cat' <2> 'rat'
websearch_to_tsquery
websearch_to_tsquery (
config regconfig,
query text )
tsquery
Converts text to a tsquery, normalizing words according
to the specified or default configuration. Quoted word sequences are
converted to phrase tests. The word or
is understood
as producing an OR operator, and a dash produces a NOT operator;
other punctuation is ignored.
This approximates the behavior of some common web search tools.
websearch_to_tsquery('english', '"fat rat" or cat dog')
'fat' <-> 'rat' | 'cat' & 'dog'
querytree
querytree ( tsquery )
text
Produces a representation of the indexable portion of
a tsquery. A result that is empty or
just T indicates a non-indexable query.
querytree('foo & ! bar'::tsquery)
'foo'
setweight
setweight ( vector tsvector, weight "char" )
tsvector
Assigns the specified weight to each element
of the vector.
setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A')
'cat':3A 'fat':2A,4A 'rat':5A
setweight
setweight for specific lexeme(s)
setweight ( vector tsvector, weight "char", lexemes text[] )
tsvector
Assigns the specified weight to elements
of the vector that are listed
in lexemes.
The strings in lexemes are taken as lexemes
as-is, without further processing. Strings that do not match any
lexeme in vector are ignored.
setweight('fat:2,4 cat:3 rat:5,6B'::tsvector, 'A', '{cat,rat}')
'cat':3A 'fat':2,4 'rat':5A,6A
strip
strip ( tsvector )
tsvector
Removes positions and weights from the tsvector.
strip('fat:2,4 cat:3 rat:5A'::tsvector)
'cat' 'fat' 'rat'
to_tsquery
to_tsquery (
config regconfig,
query text )
tsquery
Converts text to a tsquery, normalizing words according to
the specified or default configuration. The words must be combined
by valid tsquery operators.
to_tsquery('english', 'The & Fat & Rats')
'fat' & 'rat'
to_tsvector
to_tsvector (
config regconfig,
document text )
tsvector
Converts text to a tsvector, normalizing words according
to the specified or default configuration. Position information is
included in the result.
to_tsvector('english', 'The Fat Rats')
'fat':2 'rat':3
to_tsvector (
config regconfig,
document json )
tsvector
to_tsvector (
config regconfig,
document jsonb )
tsvector
Converts each string value in the JSON document to
a tsvector, normalizing words according to the specified
or default configuration. The results are then concatenated in
document order to produce the output. Position information is
generated as though one stopword exists between each pair of string
values. (Beware that document order
of the fields of a
JSON object is implementation-dependent when the input
is jsonb; observe the difference in the examples.)
to_tsvector('english', '{"aa": "The Fat Rats", "b": "dog"}'::json)
'dog':5 'fat':2 'rat':3
to_tsvector('english', '{"aa": "The Fat Rats", "b": "dog"}'::jsonb)
'dog':1 'fat':4 'rat':5
json_to_tsvector
json_to_tsvector (
config regconfig,
document json,
filter jsonb )
tsvector
jsonb_to_tsvector
jsonb_to_tsvector (
config regconfig,
document jsonb,
filter jsonb )
tsvector
Selects each item in the JSON document that is requested by
the filter and converts each one to
a tsvector, normalizing words according to the specified
or default configuration. The results are then concatenated in
document order to produce the output. Position information is
generated as though one stopword exists between each pair of selected
items. (Beware that document order
of the fields of a
JSON object is implementation-dependent when the input
is jsonb.)
The filter must be a jsonb
array containing zero or more of these keywords:
"string" (to include all string values),
"numeric" (to include all numeric values),
"boolean" (to include all boolean values),
"key" (to include all keys), or
"all" (to include all the above).
As a special case, the filter can also be a
simple JSON value that is one of these keywords.
json_to_tsvector('english', '{"a": "The Fat Rats", "b": 123}'::json, '["string", "numeric"]')
'123':5 'fat':2 'rat':3
json_to_tsvector('english', '{"cat": "The Fat Rats", "dog": 123}'::json, '"all"')
'123':9 'cat':1 'dog':7 'fat':4 'rat':5
ts_delete
ts_delete ( vector tsvector, lexeme text )
tsvector
Removes any occurrence of the given lexeme
from the vector.
The lexeme string is treated as a lexeme as-is,
without further processing.
ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, 'fat')
'cat':3 'rat':5A
ts_delete ( vector tsvector, lexemes text[] )
tsvector
Removes any occurrences of the lexemes
in lexemes
from the vector.
The strings in lexemes are taken as lexemes
as-is, without further processing. Strings that do not match any
lexeme in vector are ignored.
ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, ARRAY['fat','rat'])
'cat':3
ts_filter
ts_filter ( vector tsvector, weights "char"[] )
tsvector
Selects only elements with the given weights
from the vector.
ts_filter('fat:2,4 cat:3b,7c rat:5A'::tsvector, '{a,b}')
'cat':3B 'rat':5A
ts_headline
ts_headline (
config regconfig,
document text,
query tsquery
, options text )
text
Displays, in an abbreviated form, the match(es) for
the query in
the document, which must be raw text not
a tsvector. Words in the document are normalized
according to the specified or default configuration before matching to
the query. Use of this function is discussed in
, which also describes the
available options.
ts_headline('The fat cat ate the rat.', 'cat')
The fat <b>cat</b> ate the rat.
ts_headline (
config regconfig,
document json,
query tsquery
, options text )
text
ts_headline (
config regconfig,
document jsonb,
query tsquery
, options text )
text
Displays, in an abbreviated form, match(es) for
the query that occur in string values
within the JSON document.
See for more details.
ts_headline('{"cat":"raining cats and dogs"}'::jsonb, 'cat')
{"cat": "raining <b>cats</b> and dogs"}
ts_rank
ts_rank (
weights real[],
vector tsvector,
query tsquery
, normalization integer )
real
Computes a score showing how well
the vector matches
the query. See
for details.
ts_rank(to_tsvector('raining cats and dogs'), 'cat')
0.06079271
ts_rank_cd
ts_rank_cd (
weights real[],
vector tsvector,
query tsquery
, normalization integer )
real
Computes a score showing how well
the vector matches
the query, using a cover density
algorithm. See for details.
ts_rank_cd(to_tsvector('raining cats and dogs'), 'cat')
0.1
ts_rewrite
ts_rewrite ( query tsquery,
target tsquery,
substitute tsquery )
tsquery
Replaces occurrences of target
with substitute
within the query.
See for details.
ts_rewrite('a & b'::tsquery, 'a'::tsquery, 'foo|bar'::tsquery)
'b' & ( 'foo' | 'bar' )
ts_rewrite ( query tsquery,
select text )
tsquery
Replaces portions of the query according to
target(s) and substitute(s) obtained by executing
a SELECT command.
See for details.
SELECT ts_rewrite('a & b'::tsquery, 'SELECT t,s FROM aliases')
'b' & ( 'foo' | 'bar' )
tsquery_phrase
tsquery_phrase ( query1 tsquery, query2 tsquery )
tsquery
Constructs a phrase query that searches
for matches of query1
and query2 at successive lexemes (same
as <-> operator).
tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'))
'fat' <-> 'cat'
tsquery_phrase ( query1 tsquery, query2 tsquery, distance integer )
tsquery
Constructs a phrase query that searches
for matches of query1 and
query2 that occur exactly
distance lexemes apart.
tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10)
'fat' <10> 'cat'
tsvector_to_array
tsvector_to_array ( tsvector )
text[]
Converts a tsvector to an array of lexemes.
tsvector_to_array('fat:2,4 cat:3 rat:5A'::tsvector)
{cat,fat,rat}
unnest
for tsvector
unnest ( tsvector )
setof record
( lexeme text,
positions smallint[],
weights text )
Expands a tsvector into a set of rows, one per lexeme.
select * from unnest('cat:3 fat:2,4 rat:5A'::tsvector)
lexeme | positions | weights
--------+-----------+---------
cat | {3} | {D}
fat | {2,4} | {D,D}
rat | {5} | {A}
All the text search functions that accept an optional regconfig
argument will use the configuration specified by
when that argument is omitted.
The functions in
are listed separately because they are not usually used in everyday text
searching operations. They are primarily helpful for development and
debugging of new text search configurations.
Text Search Debugging Functions
Function
Description
Example(s)
ts_debug
ts_debug (
config regconfig,
document text )
setof record
( alias text,
description text,
token text,
dictionaries regdictionary[],
dictionary regdictionary,
lexemes text[] )
Extracts and normalizes tokens from
the document according to the specified or
default text search configuration, and returns information about how
each token was processed.
See for details.
ts_debug('english', 'The Brightest supernovaes')
(asciiword,"Word, all ASCII",The,{english_stem},english_stem,{}) ...
ts_lexize
ts_lexize ( dict regdictionary, token text )
text[]
Returns an array of replacement lexemes if the input token is known to
the dictionary, or an empty array if the token is known to the
dictionary but it is a stop word, or NULL if it is not a known word.
See for details.
ts_lexize('english_stem', 'stars')
{star}
ts_parse
ts_parse ( parser_name text,
document text )
setof record
( tokid integer,
token text )
Extracts tokens from the document using the
named parser.
See for details.
ts_parse('default', 'foo - bar')
(1,foo) ...
ts_parse ( parser_oid oid,
document text )
setof record
( tokid integer,
token text )
Extracts tokens from the document using a
parser specified by OID.
See for details.
ts_parse(3722, 'foo - bar')
(1,foo) ...
ts_token_type
ts_token_type ( parser_name text )
setof record
( tokid integer,
alias text,
description text )
Returns a table that describes each type of token the named parser can
recognize.
See for details.
ts_token_type('default')
(1,asciiword,"Word, all ASCII") ...
ts_token_type ( parser_oid oid )
setof record
( tokid integer,
alias text,
description text )
Returns a table that describes each type of token a parser specified
by OID can recognize.
See for details.
ts_token_type(3722)
(1,asciiword,"Word, all ASCII") ...
ts_stat
ts_stat ( sqlquery text
, weights text )
setof record
( word text,
ndoc integer,
nentry integer )
Executes the sqlquery, which must return a
single tsvector column, and returns statistics about each
distinct lexeme contained in the data.
See for details.
ts_stat('SELECT vector FROM apod')
(foo,10,15) ...