author | Tom Lane <tgl@sss.pgh.pa.us> | 2014-06-30 21:46:29 -0400 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2014-06-30 21:46:29 -0400 |
commit | 1b2488731cc2c87cc9a4cb8d654e4d9981fdf9ac (patch) | |
tree | 45cbe297cbcfcdd475193642056484e7c49c4dbe /doc/src | |
parent | 97c40ce61465582b96944e41ed6ec06c2016b95c (diff) | |
Allow multi-character source strings in contrib/unaccent.
This could be useful in languages where diacritic signs are represented as
separate characters; more generally it supports using unaccent dictionaries
for substring substitutions beyond narrowly conceived "diacritic removal".
In any case, since the rule-file parser doesn't complain about
multi-character source strings, it behooves us to do something unsurprising
with them.
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/unaccent.sgml | 8 |
1 files changed, 8 insertions, 0 deletions
diff --git a/doc/src/sgml/unaccent.sgml b/doc/src/sgml/unaccent.sgml
index aef0031dcbc..1382fafc5ec 100644
--- a/doc/src/sgml/unaccent.sgml
+++ b/doc/src/sgml/unaccent.sgml
@@ -72,6 +72,14 @@
    <listitem>
     <para>
+     Actually, each <quote>character</> can be any string not containing
+     whitespace, so <filename>unaccent</> dictionaries could be used for
+     other sorts of substring substitutions besides diacritic removal.
+    </para>
+   </listitem>
+
+   <listitem>
+    <para>
      As with other <productname>PostgreSQL</> text search configuration files,
      the rules file must be stored in UTF-8 encoding.  The data is
      automatically translated into the current database's encoding when
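The documentation change above says each side of a rule can be any string without whitespace. A minimal Python sketch of what such substring substitution might look like follows. It is an illustration only, not the C code in contrib/unaccent: the function names `parse_rules` and `substitute` are hypothetical, the sample rules are invented for the example, and preferring the longest matching source string at each position is an assumption that the commit message does not spell out.

```python
def parse_rules(text):
    """Parse rule lines of the form 'SOURCE TARGET' (whitespace-separated).

    Lines that do not contain exactly two fields are skipped in this sketch.
    """
    rules = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            src, dst = fields
            rules[src] = dst
    return rules


def substitute(s, rules):
    """Scan left to right, replacing source strings with their targets.

    Assumption: when several source strings match at one position,
    the longest one is used.
    """
    if not rules:
        return s
    max_len = max(len(src) for src in rules)
    out = []
    i = 0
    while i < len(s):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(s) - i), 0, -1):
            piece = s[i:i + n]
            if piece in rules:
                out.append(rules[piece])
                i += n
                break
        else:
            # No rule matched here; copy the character unchanged.
            out.append(s[i])
            i += 1
    return "".join(out)


# Hypothetical rules: a precomposed accent, the same accent written as
# a base letter plus a combining mark (a multi-character source string),
# and a single character that expands to two.
RULES = parse_rules("é e\ne\u0301 e\nß ss\n")

print(substitute("café", RULES))
print(substitute("cafe\u0301", RULES))
print(substitute("Straße", RULES))
```

With these sample rules, both the precomposed and the decomposed spelling of "café" reduce to the same unaccented string, which is the case the commit message describes for languages where diacritic signs are separate characters.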