<!-- doc/src/sgml/test-parser.sgml -->
<sect1 id="test-parser" xreflabel="test_parser">
<title>test_parser</title>
<indexterm zone="test-parser">
<primary>test_parser</primary>
</indexterm>
<para>
<filename>test_parser</> is an example of a custom parser for full-text
search. It doesn't do anything especially useful, but can serve as
a starting point for developing your own parser.
</para>
<para>
<filename>test_parser</> recognizes words separated by white space,
and returns just two token types:
<programlisting>
mydb=# SELECT * FROM ts_token_type('testparser');
tokid | alias | description
-------+-------+---------------
3 | word | Word
12 | blank | Space symbols
(2 rows)
</programlisting>
These token numbers were chosen to match the default parser's numbering.
This lets <filename>test_parser</> reuse the default parser's
<function>headline()</> function, which keeps the example simple.
</para>
<sect2>
<title>Usage</title>
<para>
Installing the <literal>test_parser</> extension creates a text search
parser <literal>testparser</>. It has no user-configurable parameters.
</para>
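<para>
As a minimal sketch of the installation step, assuming the module's files
are already present on the server and you are connected to the target
database with sufficient privileges, the extension can be created with:
<programlisting>
-- Assumes the test_parser module has been built and installed on the server
CREATE EXTENSION test_parser;
</programlisting>
</para>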
<para>
You can test the parser with a query such as:
<programlisting>
mydb=# SELECT * FROM ts_parse('testparser', 'That''s my first own parser');
tokid | token
-------+--------
3 | That's
12 |
3 | my
12 |
3 | first
12 |
3 | own
12 |
3 | parser
</programlisting>
</para>
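<para>
The blank tokens are rarely of interest on their own. As a small
illustration (not part of the module itself), filtering on the
<literal>tokid</> column returned by <function>ts_parse</> keeps only
the word tokens:
<programlisting>
-- tokid 3 is the "word" token type reported by ts_token_type('testparser')
SELECT token
FROM ts_parse('testparser', 'That''s my first own parser')
WHERE tokid = 3;
</programlisting>
</para>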
<para>
Real-world use requires setting up a text search configuration
that uses the parser. For example,
<programlisting>
mydb=# CREATE TEXT SEARCH CONFIGURATION testcfg ( PARSER = testparser );
CREATE TEXT SEARCH CONFIGURATION
mydb=# ALTER TEXT SEARCH CONFIGURATION testcfg
mydb-# ADD MAPPING FOR word WITH english_stem;
ALTER TEXT SEARCH CONFIGURATION
mydb=# SELECT to_tsvector('testcfg', 'That''s my first own parser');
to_tsvector
-------------------------------
'that':1 'first':3 'parser':5
(1 row)
mydb=# SELECT ts_headline('testcfg', 'Supernovae stars are the brightest phenomena in galaxies',
mydb(# to_tsquery('testcfg', 'star'));
ts_headline
-----------------------------------------------------------------
Supernovae &lt;b&gt;stars&lt;/b&gt; are the brightest phenomena in galaxies
(1 row)
</programlisting>
</para>
</sect2>
</sect1>