doc/src/sgml/test-parser.sgml


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90

<!-- doc/src/sgml/test-parser.sgml -->

<sect1 id="test-parser" xreflabel="test_parser">
 <title>test_parser</title>

 <indexterm zone="test-parser">
  <primary>test_parser</primary>
 </indexterm>

 <para>
  <filename>test_parser</> is an example of a custom parser for full-text
  search.  It doesn't do anything especially useful, but can serve as
  a starting point for developing your own parser.
 </para>

 <para>
  <filename>test_parser</> recognizes words separated by white space,
  and returns just two token types:

<programlisting>
mydb=# SELECT * FROM ts_token_type('testparser');
 tokid | alias |  description
-------+-------+---------------
     3 | word  | Word
    12 | blank | Space symbols
(2 rows)
</programlisting>

  These token numbers have been chosen to be compatible with the default
  parser's numbering.  This allows us to use its <function>headline()</>
  function, thus keeping the example simple.
 </para>

 <sect2>
  <title>Usage</title>

  <para>
   Installing the <literal>test_parser</> extension creates a text search
   parser <literal>testparser</>.  It has no user-configurable parameters.
  </para>

  <para>
   You can test the parser with, for example,

<programlisting>
mydb=# SELECT * FROM ts_parse('testparser', 'That''s my first own parser');
 tokid | token
-------+--------
     3 | That's
    12 |
     3 | my
    12 |
     3 | first
    12 |
     3 | own
    12 |
     3 | parser
</programlisting>
  </para>

  <para>
   Real-world use requires setting up a text search configuration
   that uses the parser.  For example,

<programlisting>
mydb=# CREATE TEXT SEARCH CONFIGURATION testcfg ( PARSER = testparser );
CREATE TEXT SEARCH CONFIGURATION

mydb=# ALTER TEXT SEARCH CONFIGURATION testcfg
mydb-#   ADD MAPPING FOR word WITH english_stem;
ALTER TEXT SEARCH CONFIGURATION

mydb=#  SELECT to_tsvector('testcfg', 'That''s my first own parser');
          to_tsvector
-------------------------------
 'that':1 'first':3 'parser':5
(1 row)

mydb=# SELECT ts_headline('testcfg', 'Supernovae stars are the brightest phenomena in galaxies',
mydb(#                    to_tsquery('testcfg', 'star'));
                           ts_headline
-----------------------------------------------------------------
 Supernovae &lt;b&gt;stars&lt;/b&gt; are the brightest phenomena in galaxies
(1 row)
</programlisting>
  </para>

 </sect2>

</sect1>