Text Search Functions and Operators

Text Search Functions and Operators full text search functions and operators text search functions and operators , and summarize the functions and operators that are provided for full text searching. See for a detailed explanation of PostgreSQL's text search facility. Text Search Operators Operator Description Example(s) tsvector @@ tsquery boolean tsquery @@ tsvector boolean Does tsvector match tsquery? (The arguments can be given in either order.) to_tsvector('fat cats ate rats') @@ to_tsquery('cat & rat') t text @@ tsquery boolean Does text string, after implicit invocation of to_tsvector(), match tsquery? 'fat cats ate rats' @@ to_tsquery('cat & rat') t tsvector || tsvector tsvector Concatenates two tsvectors. If both inputs contain lexeme positions, the second input's positions are adjusted accordingly. 'a:1 b:2'::tsvector || 'c:1 d:2 b:3'::tsvector 'a':1 'b':2,5 'c':3 'd':4 tsquery && tsquery tsquery ANDs two tsquerys together, producing a query that matches documents that match both input queries. 'fat | rat'::tsquery && 'cat'::tsquery ( 'fat' | 'rat' ) & 'cat' tsquery || tsquery tsquery ORs two tsquerys together, producing a query that matches documents that match either input query. 'fat | rat'::tsquery || 'cat'::tsquery 'fat' | 'rat' | 'cat' !! tsquery tsquery Negates a tsquery, producing a query that matches documents that do not match the input query. !! 'cat'::tsquery !'cat' tsquery <-> tsquery tsquery Constructs a phrase query, which matches if the two input queries match at successive lexemes. to_tsquery('fat') <-> to_tsquery('rat') 'fat' <-> 'rat' tsquery @> tsquery boolean Does first tsquery contain the second? (This considers only whether all the lexemes appearing in one query appear in the other, ignoring the combining operators.) 'cat'::tsquery @> 'cat & rat'::tsquery f tsquery <@ tsquery boolean Is first tsquery contained in the second? (This considers only whether all the lexemes appearing in one query appear in the other, ignoring the combining operators.) 'cat'::tsquery <@ 'cat & rat'::tsquery t 'cat'::tsquery <@ '!cat & rat'::tsquery t

In addition to these specialized operators, the usual comparison operators shown in are available for types tsvector and tsquery. These are not very useful for text searching but allow, for example, unique indexes to be built on columns of these types. Text Search Functions Function Description Example(s) array_to_tsvector array_to_tsvector ( text[] ) tsvector Converts an array of text strings to a tsvector. The given strings are used as lexemes as-is, without further processing. Array elements must not be empty strings or NULL. array_to_tsvector('{fat,cat,rat}'::text[]) 'cat' 'fat' 'rat' get_current_ts_config get_current_ts_config ( ) regconfig Returns the OID of the current default text search configuration (as set by ). get_current_ts_config() english length length ( tsvector ) integer Returns the number of lexemes in the tsvector. length('fat:2,4 cat:3 rat:5A'::tsvector) 3 numnode numnode ( tsquery ) integer Returns the number of lexemes plus operators in the tsquery. numnode('(fat & rat) | cat'::tsquery) 5 plainto_tsquery plainto_tsquery ( config regconfig, query text ) tsquery Converts text to a tsquery, normalizing words according to the specified or default configuration. Any punctuation in the string is ignored (it does not determine query operators). The resulting query matches documents containing all non-stopwords in the text. plainto_tsquery('english', 'The Fat Rats') 'fat' & 'rat' phraseto_tsquery phraseto_tsquery ( config regconfig, query text ) tsquery Converts text to a tsquery, normalizing words according to the specified or default configuration. Any punctuation in the string is ignored (it does not determine query operators). The resulting query matches phrases containing all non-stopwords in the text. phraseto_tsquery('english', 'The Fat Rats') 'fat' <-> 'rat' phraseto_tsquery('english', 'The Cat and Rats') 'cat' <2> 'rat' websearch_to_tsquery websearch_to_tsquery ( config regconfig, query text ) tsquery Converts text to a tsquery, normalizing words according to the specified or default configuration. Quoted word sequences are converted to phrase tests. The word or is understood as producing an OR operator, and a dash produces a NOT operator; other punctuation is ignored. This approximates the behavior of some common web search tools. websearch_to_tsquery('english', '"fat rat" or cat dog') 'fat' <-> 'rat' | 'cat' & 'dog' querytree querytree ( tsquery ) text Produces a representation of the indexable portion of a tsquery. A result that is empty or just T indicates a non-indexable query. querytree('foo & ! bar'::tsquery) 'foo' setweight setweight ( vector tsvector, weight "char" ) tsvector Assigns the specified weight to each element of the vector. setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A') 'cat':3A 'fat':2A,4A 'rat':5A setweight setweight for specific lexeme(s) setweight ( vector tsvector, weight "char", lexemes text[] ) tsvector Assigns the specified weight to elements of the vector that are listed in lexemes. The strings in lexemes are taken as lexemes as-is, without further processing. Strings that do not match any lexeme in vector are ignored. setweight('fat:2,4 cat:3 rat:5,6B'::tsvector, 'A', '{cat,rat}') 'cat':3A 'fat':2,4 'rat':5A,6A strip strip ( tsvector ) tsvector Removes positions and weights from the tsvector. strip('fat:2,4 cat:3 rat:5A'::tsvector) 'cat' 'fat' 'rat' to_tsquery to_tsquery ( config regconfig, query text ) tsquery Converts text to a tsquery, normalizing words according to the specified or default configuration. The words must be combined by valid tsquery operators. to_tsquery('english', 'The & Fat & Rats') 'fat' & 'rat' to_tsvector to_tsvector ( config regconfig, document text ) tsvector Converts text to a tsvector, normalizing words according to the specified or default configuration. Position information is included in the result. to_tsvector('english', 'The Fat Rats') 'fat':2 'rat':3 to_tsvector ( config regconfig, document json ) tsvector to_tsvector ( config regconfig, document jsonb ) tsvector Converts each string value in the JSON document to a tsvector, normalizing words according to the specified or default configuration. The results are then concatenated in document order to produce the output. Position information is generated as though one stopword exists between each pair of string values. (Beware that document order of the fields of a JSON object is implementation-dependent when the input is jsonb; observe the difference in the examples.) to_tsvector('english', '{"aa": "The Fat Rats", "b": "dog"}'::json) 'dog':5 'fat':2 'rat':3 to_tsvector('english', '{"aa": "The Fat Rats", "b": "dog"}'::jsonb) 'dog':1 'fat':4 'rat':5 json_to_tsvector json_to_tsvector ( config regconfig, document json, filter jsonb ) tsvector jsonb_to_tsvector jsonb_to_tsvector ( config regconfig, document jsonb, filter jsonb ) tsvector Selects each item in the JSON document that is requested by the filter and converts each one to a tsvector, normalizing words according to the specified or default configuration. The results are then concatenated in document order to produce the output. Position information is generated as though one stopword exists between each pair of selected items. (Beware that document order of the fields of a JSON object is implementation-dependent when the input is jsonb.) The filter must be a jsonb array containing zero or more of these keywords: "string" (to include all string values), "numeric" (to include all numeric values), "boolean" (to include all boolean values), "key" (to include all keys), or "all" (to include all the above). As a special case, the filter can also be a simple JSON value that is one of these keywords. json_to_tsvector('english', '{"a": "The Fat Rats", "b": 123}'::json, '["string", "numeric"]') '123':5 'fat':2 'rat':3 json_to_tsvector('english', '{"cat": "The Fat Rats", "dog": 123}'::json, '"all"') '123':9 'cat':1 'dog':7 'fat':4 'rat':5 ts_delete ts_delete ( vector tsvector, lexeme text ) tsvector Removes any occurrence of the given lexeme from the vector. The lexeme string is treated as a lexeme as-is, without further processing. ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, 'fat') 'cat':3 'rat':5A ts_delete ( vector tsvector, lexemes text[] ) tsvector Removes any occurrences of the lexemes in lexemes from the vector. The strings in lexemes are taken as lexemes as-is, without further processing. Strings that do not match any lexeme in vector are ignored. ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, ARRAY['fat','rat']) 'cat':3 ts_filter ts_filter ( vector tsvector, weights "char"[] ) tsvector Selects only elements with the given weights from the vector. ts_filter('fat:2,4 cat:3b,7c rat:5A'::tsvector, '{a,b}') 'cat':3B 'rat':5A ts_headline ts_headline ( config regconfig, document text, query tsquery , options text ) text Displays, in an abbreviated form, the match(es) for the query in the document, which must be raw text not a tsvector. Words in the document are normalized according to the specified or default configuration before matching to the query. Use of this function is discussed in , which also describes the available options. ts_headline('The fat cat ate the rat.', 'cat') The fat <b>cat</b> ate the rat. ts_headline ( config regconfig, document json, query tsquery , options text ) text ts_headline ( config regconfig, document jsonb, query tsquery , options text ) text Displays, in an abbreviated form, match(es) for the query that occur in string values within the JSON document. See for more details. ts_headline('{"cat":"raining cats and dogs"}'::jsonb, 'cat') {"cat": "raining <b>cats</b> and dogs"} ts_rank ts_rank ( weights real[], vector tsvector, query tsquery , normalization integer ) real Computes a score showing how well the vector matches the query. See for details. ts_rank(to_tsvector('raining cats and dogs'), 'cat') 0.06079271 ts_rank_cd ts_rank_cd ( weights real[], vector tsvector, query tsquery , normalization integer ) real Computes a score showing how well the vector matches the query, using a cover density algorithm. See for details. ts_rank_cd(to_tsvector('raining cats and dogs'), 'cat') 0.1 ts_rewrite ts_rewrite ( query tsquery, target tsquery, substitute tsquery ) tsquery Replaces occurrences of target with substitute within the query. See for details. ts_rewrite('a & b'::tsquery, 'a'::tsquery, 'foo|bar'::tsquery) 'b' & ( 'foo' | 'bar' ) ts_rewrite ( query tsquery, select text ) tsquery Replaces portions of the query according to target(s) and substitute(s) obtained by executing a SELECT command. See for details. SELECT ts_rewrite('a & b'::tsquery, 'SELECT t,s FROM aliases') 'b' & ( 'foo' | 'bar' ) tsquery_phrase tsquery_phrase ( query1 tsquery, query2 tsquery ) tsquery Constructs a phrase query that searches for matches of query1 and query2 at successive lexemes (same as <-> operator). tsquery_phrase(to_tsquery('fat'), to_tsquery('cat')) 'fat' <-> 'cat' tsquery_phrase ( query1 tsquery, query2 tsquery, distance integer ) tsquery Constructs a phrase query that searches for matches of query1 and query2 that occur exactly distance lexemes apart. tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10) 'fat' <10> 'cat' tsvector_to_array tsvector_to_array ( tsvector ) text[] Converts a tsvector to an array of lexemes. tsvector_to_array('fat:2,4 cat:3 rat:5A'::tsvector) {cat,fat,rat} unnest for tsvector unnest ( tsvector ) setof record ( lexeme text, positions smallint[], weights text ) Expands a tsvector into a set of rows, one per lexeme. select * from unnest('cat:3 fat:2,4 rat:5A'::tsvector) lexeme | positions | weights --------+-----------+--------- cat | {3} | {D} fat | {2,4} | {D,D} rat | {5} | {A}

All the text search functions that accept an optional regconfig argument will use the configuration specified by when that argument is omitted. The functions in are listed separately because they are not usually used in everyday text searching operations. They are primarily helpful for development and debugging of new text search configurations. Text Search Debugging Functions Function Description Example(s) ts_debug ts_debug ( config regconfig, document text ) setof record ( alias text, description text, token text, dictionaries regdictionary[], dictionary regdictionary, lexemes text[] ) Extracts and normalizes tokens from the document according to the specified or default text search configuration, and returns information about how each token was processed. See for details. ts_debug('english', 'The Brightest supernovaes') (asciiword,"Word, all ASCII",The,{english_stem},english_stem,{}) ... ts_lexize ts_lexize ( dict regdictionary, token text ) text[] Returns an array of replacement lexemes if the input token is known to the dictionary, or an empty array if the token is known to the dictionary but it is a stop word, or NULL if it is not a known word. See for details. ts_lexize('english_stem', 'stars') {star} ts_parse ts_parse ( parser_name text, document text ) setof record ( tokid integer, token text ) Extracts tokens from the document using the named parser. See for details. ts_parse('default', 'foo - bar') (1,foo) ... ts_parse ( parser_oid oid, document text ) setof record ( tokid integer, token text ) Extracts tokens from the document using a parser specified by OID. See for details. ts_parse(3722, 'foo - bar') (1,foo) ... ts_token_type ts_token_type ( parser_name text ) setof record ( tokid integer, alias text, description text ) Returns a table that describes each type of token the named parser can recognize. See for details. ts_token_type('default') (1,asciiword,"Word, all ASCII") ... ts_token_type ( parser_oid oid ) setof record ( tokid integer, alias text, description text ) Returns a table that describes each type of token a parser specified by OID can recognize. See for details. ts_token_type(3722) (1,asciiword,"Word, all ASCII") ... ts_stat ts_stat ( sqlquery text , weights text ) setof record ( word text, ndoc integer, nentry integer ) Executes the sqlquery, which must return a single tsvector column, and returns statistics about each distinct lexeme contained in the data. See for details. ts_stat('SELECT vector FROM apod') (foo,10,15) ...