Diffstat (limited to 'doc/src')
-rw-r--r-- doc/src/sgml/catalogs.sgml | 16
-rw-r--r-- doc/src/sgml/charset.sgml | 73
-rw-r--r-- doc/src/sgml/indices.sgml | 6
-rw-r--r-- doc/src/sgml/ref/create_database.sgml | 45
-rw-r--r-- doc/src/sgml/ref/initdb.sgml | 41
-rw-r--r-- doc/src/sgml/ref/pg_controldata.sgml | 4
-rw-r--r-- doc/src/sgml/ref/pg_resetxlog.sgml | 14
-rw-r--r-- doc/src/sgml/ref/select.sgml | 5
-rw-r--r-- doc/src/sgml/ref/show.sgml | 10
-rw-r--r-- doc/src/sgml/runtime.sgml | 13
-rw-r--r-- doc/src/sgml/textsearch.sgml | 4
11 files changed, 137 insertions, 94 deletions
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 40f1ce568ed..bf1ac314f73 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/catalogs.sgml,v 2.175 2008/09/19 19:03:40 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/catalogs.sgml,v 2.176 2008/09/23 09:20:33 heikki Exp $ -->
<!--
Documentation of the system catalogs, directed toward PostgreSQL developers
-->
@@ -2150,6 +2150,20 @@
</row>
<row>
+ <entry><structfield>datcollate</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry></entry>
+ <entry>LC_COLLATE for this database</entry>
+ </row>
+
+ <row>
+ <entry><structfield>datctype</structfield></entry>
+ <entry><type>name</type></entry>
+ <entry></entry>
+ <entry>LC_CTYPE for this database</entry>
+ </row>
+
+ <row>
<entry><structfield>datistemplate</structfield></entry>
<entry><type>bool</type></entry>
<entry></entry>
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 1f4866b203c..c012294ef81 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/charset.sgml,v 2.87 2008/07/15 17:45:03 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/charset.sgml,v 2.88 2008/09/23 09:20:34 heikki Exp $ -->
<chapter id="charset">
<title>Localization</>
@@ -130,23 +130,23 @@ initdb --locale=sv_SE
<para>
The nature of some locale categories is that their value has to be
- fixed for the lifetime of a database cluster. That is, once
- <command>initdb</command> has run, you cannot change them anymore.
- <literal>LC_COLLATE</literal> and <literal>LC_CTYPE</literal> are
- those categories. They affect the sort order of indexes, so they
- must be kept fixed, or indexes on text columns will become corrupt.
- <productname>PostgreSQL</productname> enforces this by recording
- the values of <envar>LC_COLLATE</> and <envar>LC_CTYPE</> that are
- seen by <command>initdb</>. The server automatically adopts
- those two values when it is started.
+ fixed when the database is created. You can use different settings
+ for different databases, but once a database is created, you cannot
+ change them for that database anymore. <literal>LC_COLLATE</literal>
+ and <literal>LC_CTYPE</literal> are those categories. They affect
+ the sort order of indexes, so they must be kept fixed, or indexes on
+ text columns will become corrupt. The default values for these
+ categories are defined when <command>initdb</command> is run, and
+ those values are used when new databases are created, unless
+ specified otherwise in the <command>CREATE DATABASE</command> command.
</para>
<para>
The other locale categories can be changed as desired whenever the
server is running by setting the run-time configuration variables
that have the same name as the locale categories (see <xref
- linkend="runtime-config-client-format"> for details). The defaults that are
- chosen by <command>initdb</command> are actually only written into
+ linkend="runtime-config-client-format"> for details). The defaults
+ that are chosen by <command>initdb</command> are actually only written into
the configuration file <filename>postgresql.conf</filename> to
serve as defaults when the server is started. If you delete these
assignments from <filename>postgresql.conf</filename> then the
@@ -261,7 +261,7 @@ initdb --locale=sv_SE
<para>
Check that <productname>PostgreSQL</> is actually using the locale
- that you think it is. <envar>LC_COLLATE</> and <envar>LC_CTYPE</>
+ that you think it is. The default <envar>LC_COLLATE</> and <envar>LC_CTYPE</>
settings are determined at <command>initdb</> time and cannot be
changed without repeating <command>initdb</>. Other locale
settings including <envar>LC_MESSAGES</> and <envar>LC_MONETARY</>
@@ -319,17 +319,11 @@ initdb --locale=sv_SE
</para>
<para>
- An important restriction, however, is that each database character set
- must be compatible with the server's <envar>LC_CTYPE</> setting.
+ An important restriction, however, is that each database's character set
+ must be compatible with the database's <envar>LC_CTYPE</> setting.
When <envar>LC_CTYPE</> is <literal>C</> or <literal>POSIX</>, any
character set is allowed, but for other settings of <envar>LC_CTYPE</>
there is only one character set that will work correctly.
- Since the <envar>LC_CTYPE</> setting is frozen by <command>initdb</>, the
- apparent flexibility to use different encodings in different databases
- of a cluster is more theoretical than real, except when you select
- <literal>C</> or <literal>POSIX</> locale (thus disabling any real locale
- awareness). It is likely that these mechanisms will be revisited in future
- versions of <productname>PostgreSQL</productname>.
</para>
<sect2 id="multibyte-charset-supported">
@@ -734,19 +728,19 @@ initdb -E EUC_JP
</para>
<para>
- If you have selected <literal>C</> or <literal>POSIX</> locale,
- you can create a database with a different character set:
+ You can specify a non-default encoding at database creation time,
+ provided that the encoding is compatible with the selected locale:
<screen>
-createdb -E EUC_KR korean
+createdb -E EUC_KR -T template0 --lc-collate=ko_KR.euckr --lc-ctype=ko_KR.euckr korean
</screen>
This will create a database named <literal>korean</literal> that
- uses the character set <literal>EUC_KR</literal>. Another way to
- accomplish this is to use this SQL command:
+ uses the character set <literal>EUC_KR</literal>, and locale <literal>ko_KR</literal>.
+ Another way to accomplish this is to use this SQL command:
<programlisting>
-CREATE DATABASE korean WITH ENCODING 'EUC_KR';
+CREATE DATABASE korean WITH ENCODING 'EUC_KR' COLLATE='ko_KR.euckr' CTYPE='ko_KR.euckr' TEMPLATE=template0;
</programlisting>
The encoding for a database is stored in the system catalog
@@ -756,20 +750,17 @@ CREATE DATABASE korean WITH ENCODING 'EUC_KR';
<screen>
$ <userinput>psql -l</userinput>
- List of databases
- Database | Owner | Encoding
----------------+---------+---------------
- euc_cn | t-ishii | EUC_CN
- euc_jp | t-ishii | EUC_JP
- euc_kr | t-ishii | EUC_KR
- euc_tw | t-ishii | EUC_TW
- mule_internal | t-ishii | MULE_INTERNAL
- postgres | t-ishii | EUC_JP
- regression | t-ishii | SQL_ASCII
- template1 | t-ishii | EUC_JP
- test | t-ishii | EUC_JP
- utf8 | t-ishii | UTF8
-(9 rows)
+ List of databases
+ Name | Owner | Encoding | Collation | Ctype | Access Privileges
+-----------+----------+-----------+-------------+-------------+-------------------------------------
+ clocaledb | hlinnaka | SQL_ASCII | C | C |
+ englishdb | hlinnaka | UTF8 | en_GB.UTF8 | en_GB.UTF8 |
+ japanese | hlinnaka | UTF8 | ja_JP.UTF8 | ja_JP.UTF8 |
+ korean | hlinnaka | EUC_KR | ko_KR.euckr | ko_KR.euckr |
+ postgres | hlinnaka | UTF8 | fi_FI.UTF8 | fi_FI.UTF8 |
+ template0 | hlinnaka | UTF8 | fi_FI.UTF8 | fi_FI.UTF8 | {=c/hlinnaka,hlinnaka=CTc/hlinnaka}
+ template1 | hlinnaka | UTF8 | fi_FI.UTF8 | fi_FI.UTF8 | {=c/hlinnaka,hlinnaka=CTc/hlinnaka}
+(7 rows)
</screen>
</para>
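Reviewer note (not part of the patch): the revised `createdb` example in the hunk above combines four switches, and a short sketch may make that combination easier to see. The helper name `createdb_argv` and its parameter names are hypothetical; only the switches that appear in the documentation text (`-E`, `-T`, `--lc-collate`, `--lc-ctype`) are used, and nothing is executed against a server.

```python
import shlex
from typing import List, Optional


def createdb_argv(dbname: str,
                  encoding: Optional[str] = None,
                  template: Optional[str] = None,
                  lc_collate: Optional[str] = None,
                  lc_ctype: Optional[str] = None) -> List[str]:
    """Build a createdb argument list (hypothetical helper).

    Only the switches shown in the documentation hunk are emitted.
    Options left as None are omitted, so createdb would fall back to
    the values of the template database.
    """
    argv = ["createdb"]
    if encoding is not None:
        argv += ["-E", encoding]
    if template is not None:
        argv += ["-T", template]
    if lc_collate is not None:
        argv.append(f"--lc-collate={lc_collate}")
    if lc_ctype is not None:
        argv.append(f"--lc-ctype={lc_ctype}")
    argv.append(dbname)
    return argv


if __name__ == "__main__":
    # Rebuild the command line from the documentation example.
    argv = createdb_argv("korean", encoding="EUC_KR", template="template0",
                         lc_collate="ko_KR.euckr", lc_ctype="ko_KR.euckr")
    print(shlex.join(argv))
```

Building the command as a list keeps each value a separate argument. Whether a locale name such as `ko_KR.euckr` is accepted still depends on the locales installed on the operating system.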
diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml
index 2ab713c39be..0993a8be03f 100644
--- a/doc/src/sgml/indices.sgml
+++ b/doc/src/sgml/indices.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/indices.sgml,v 1.74 2008/07/11 21:06:28 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/indices.sgml,v 1.75 2008/09/23 09:20:34 heikki Exp $ -->
<chapter id="indexes">
<title id="indexes-title">Indexes</title>
@@ -157,7 +157,7 @@ CREATE INDEX test1_id_index ON test1 (id);
<emphasis>if</emphasis> the pattern is a constant and is anchored to
the beginning of the string &mdash; for example, <literal>col LIKE
'foo%'</literal> or <literal>col ~ '^foo'</literal>, but not
- <literal>col LIKE '%bar'</literal>. However, if your server does not
+ <literal>col LIKE '%bar'</literal>. However, if your database does not
use the C locale you will need to create the index with a special
operator class to support indexing of pattern-matching queries. See
<xref linkend="indexes-opclass"> below. It is also possible to use
@@ -922,7 +922,7 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
according to the locale-specific collation rules. This makes
these operator classes suitable for use by queries involving
pattern matching expressions (<literal>LIKE</literal> or POSIX
- regular expressions) when the server does not use the standard
+ regular expressions) when the database does not use the standard
<quote>C</quote> locale. As an example, you might index a
<type>varchar</type> column like this:
<programlisting>
diff --git a/doc/src/sgml/ref/create_database.sgml b/doc/src/sgml/ref/create_database.sgml
index b1b13332456..5e72768981c 100644
--- a/doc/src/sgml/ref/create_database.sgml
+++ b/doc/src/sgml/ref/create_database.sgml
@@ -1,5 +1,5 @@
<!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/create_database.sgml,v 1.48 2007/09/28 22:25:49 tgl Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/create_database.sgml,v 1.49 2008/09/23 09:20:34 heikki Exp $
PostgreSQL documentation
-->
@@ -24,6 +24,8 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
[ [ WITH ] [ OWNER [=] <replaceable class="parameter">dbowner</replaceable> ]
[ TEMPLATE [=] <replaceable class="parameter">template</replaceable> ]
[ ENCODING [=] <replaceable class="parameter">encoding</replaceable> ]
+ [ COLLATE [=] <replaceable class="parameter">collate</replaceable> ]
+ [ CTYPE [=] <replaceable class="parameter">ctype</replaceable> ]
[ TABLESPACE [=] <replaceable class="parameter">tablespace</replaceable> ]
[ CONNECTION LIMIT [=] <replaceable class="parameter">connlimit</replaceable> ] ]
</synopsis>
@@ -113,6 +115,29 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
</listitem>
</varlistentry>
<varlistentry>
+ <term><replaceable class="parameter">collate</replaceable></term>
+ <listitem>
+ <para>
+ Collation order (<literal>LC_COLLATE</>) to use in the new database.
+ This affects the sort order applied to strings, e.g. in queries with
+ ORDER BY, as well as the order used in indexes on text columns.
+ The default is to use the collation order of the template database.
+ See below for additional restrictions.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><replaceable class="parameter">ctype</replaceable></term>
+ <listitem>
+ <para>
+ Character classification (<literal>LC_CTYPE</>) to use in the new
+ database. This affects the categorization of characters, e.g. lower,
+ upper and digit. The default is to use the character classification of
+ the template database. See below for additional restrictions.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
<term><replaceable class="parameter">tablespace</replaceable></term>
<listitem>
<para>
@@ -180,13 +205,11 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
</para>
<para>
- Any character set encoding specified for the new database must be
- compatible with the server's <envar>LC_CTYPE</> locale setting.
+ The character set encoding specified for the new database must be
+ compatible with the chosen COLLATE and CTYPE settings.
If <envar>LC_CTYPE</> is <literal>C</> (or equivalently
<literal>POSIX</>), then all encodings are allowed, but for other
- locale settings there is only one encoding that will work properly,
- and so the apparent freedom to specify an encoding is illusory if
- you didn't initialize the database cluster in <literal>C</> locale.
+ locale settings there is only one encoding that will work properly.
<command>CREATE DATABASE</> will allow superusers to specify
<literal>SQL_ASCII</> encoding regardless of the locale setting,
but this choice is deprecated and may result in misbehavior of
@@ -195,6 +218,16 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
</para>
<para>
+ The <literal>COLLATE</> and <literal>CTYPE</> settings must match
+ those of the template database, except when template0 is used as
+ template. This is because <literal>COLLATE</> and <literal>CTYPE</>
+ affect the ordering in indexes, so that any indexes copied from the
+ template database would be invalid in the new database with different
+ settings. <literal>template0</literal>, however, is known to not
+ contain any indexes that would be affected.
+ </para>
+
+ <para>
The <literal>CONNECTION LIMIT</> option is only enforced approximately;
if two new sessions start at about the same time when just one
connection <quote>slot</> remains for the database, it is possible that
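Reviewer note (not part of the patch): the restrictions documented in the create_database.sgml hunks above can be modelled as a small check. This is an illustrative sketch only, not the server's implementation. The names `TemplateInfo`, `check_locale_options` and `any_encoding_allowed`, as well as the error messages, are hypothetical. The sketch does not try to decide which encoding matches a non-C locale, since that depends on the operating system.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TemplateInfo:
    """Hypothetical stand-in for a template's pg_database row."""
    name: str
    datcollate: str
    datctype: str


def check_locale_options(template: TemplateInfo,
                         collate: Optional[str] = None,
                         ctype: Optional[str] = None) -> Tuple[str, str]:
    """Return the (collate, ctype) pair the new database would get.

    Models the documented rule: omitted settings default to the
    template's values, and explicit settings must match the template
    unless the template is template0.
    """
    new_collate = collate if collate is not None else template.datcollate
    new_ctype = ctype if ctype is not None else template.datctype
    if template.name != "template0":
        if new_collate != template.datcollate:
            raise ValueError(
                f"COLLATE {new_collate!r} differs from template "
                f"{template.name!r} ({template.datcollate!r})")
        if new_ctype != template.datctype:
            raise ValueError(
                f"CTYPE {new_ctype!r} differs from template "
                f"{template.name!r} ({template.datctype!r})")
    return new_collate, new_ctype


def any_encoding_allowed(ctype: str) -> bool:
    """C and POSIX place no restriction on the database encoding."""
    return ctype in ("C", "POSIX")
```

For example, with a `template1` whose settings are `fi_FI.UTF8`, requesting `collate="C"` raises `ValueError`, while the same request against `template0` is accepted.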
diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index 312da7085a9..110c21eb8c5 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -1,5 +1,5 @@
<!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/initdb.sgml,v 1.43 2007/03/26 17:23:36 tgl Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/initdb.sgml,v 1.44 2008/09/23 09:20:34 heikki Exp $
PostgreSQL documentation
-->
@@ -76,25 +76,34 @@ PostgreSQL documentation
<para>
<command>initdb</command> initializes the database cluster's default
- locale and character set encoding. The collation order
- (<literal>LC_COLLATE</>) and character set classes
- (<literal>LC_CTYPE</>, e.g. upper, lower, digit) are fixed for all
- databases and cannot be changed. Collation orders other than
- <literal>C</> or <literal>POSIX</> also have a performance penalty.
- For these reasons it is important to choose the right locale when
- running <command>initdb</command>. The remaining locale categories
- can be changed later when the server is started. All server locale
- values (<literal>lc_*</>) can be displayed via <command>SHOW ALL</>.
+ locale and character set encoding. The character set encoding,
+ collation order (<literal>LC_COLLATE</>) and character set classes
+ (<literal>LC_CTYPE</>, e.g. upper, lower, digit) can be set separately
+ for a database when it is created. <command>initdb</command> determines
+ those settings for the <literal>template1</literal> database, which will
+ serve as the default for all other databases.
+ </para>
+
+ <para>
+ To alter the default collation order or character set classes, use the
+ <option>--lc-collate</option> and <option>--lc-ctype</option> options.
+ Collation orders other than <literal>C</> or <literal>POSIX</> also have
+ a performance penalty. For these reasons it is important to choose the
+ right locale when running <command>initdb</command>.
+ </para>
+
+ <para>
+ The remaining locale categories can be changed later when the server
+ is started. You can also use <option>--locale</option> to set the
+ default for all locale categories, including collation order and
+ character set classes. All server locale values (<literal>lc_*</>) can
+ be displayed via <command>SHOW ALL</>.
More details can be found in <xref linkend="locale">.
</para>
<para>
- The character set encoding can be set separately for a database when
- it is created. <command>initdb</command> determines the encoding for
- the <literal>template1</literal> database, which will serve as the
- default for all other databases. To alter the default encoding use
- the <option>--encoding</option> option. More details can be found in
- <xref linkend="multibyte">.
+ To alter the default encoding, use the <option>--encoding</option> option.
+ More details can be found in <xref linkend="multibyte">.
</para>
</refsect1>
diff --git a/doc/src/sgml/ref/pg_controldata.sgml b/doc/src/sgml/ref/pg_controldata.sgml
index 466c03e2244..62695963e2b 100644
--- a/doc/src/sgml/ref/pg_controldata.sgml
+++ b/doc/src/sgml/ref/pg_controldata.sgml
@@ -1,5 +1,5 @@
<!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/pg_controldata.sgml,v 1.10 2007/02/20 18:10:58 momjian Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/pg_controldata.sgml,v 1.11 2008/09/23 09:20:35 heikki Exp $
PostgreSQL documentation
-->
@@ -30,7 +30,7 @@ PostgreSQL documentation
<title>Description</title>
<para>
<command>pg_controldata</command> prints information initialized during
- <command>initdb</>, such as the catalog version and server locale.
+ <command>initdb</>, such as the catalog version.
It also shows information about write-ahead logging and checkpoint
processing. This information is cluster-wide, and not specific to any one
database.
diff --git a/doc/src/sgml/ref/pg_resetxlog.sgml b/doc/src/sgml/ref/pg_resetxlog.sgml
index 588ff38c1bb..a9d34298e4c 100644
--- a/doc/src/sgml/ref/pg_resetxlog.sgml
+++ b/doc/src/sgml/ref/pg_resetxlog.sgml
@@ -1,5 +1,5 @@
<!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/pg_resetxlog.sgml,v 1.20 2007/01/31 23:26:04 momjian Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/pg_resetxlog.sgml,v 1.21 2008/09/23 09:20:35 heikki Exp $
PostgreSQL documentation
-->
@@ -62,14 +62,10 @@ PostgreSQL documentation
by specifying the <literal>-f</> (force) switch. In this case plausible
values will be substituted for the missing data. Most of the fields can be
expected to match, but manual assistance might be needed for the next OID,
- next transaction ID and epoch, next multitransaction ID and offset,
- WAL starting address, and database locale fields.
- The first six of these can be set using the switches discussed below.
- <command>pg_resetxlog</command>'s own environment is the source for its
- guess at the locale fields; take care that <envar>LANG</> and so forth
- match the environment that <command>initdb</> was run in.
- If you are not able to determine correct values for all these fields,
- <literal>-f</> can still be used, but
+ next transaction ID and epoch, next multitransaction ID and offset, and
+ WAL starting address fields. These fields can be set using the switches
+ discussed below. If you are not able to determine correct values for all
+ these fields, <literal>-f</> can still be used, but
the recovered database must be treated with even more suspicion than
usual: an immediate dump and reload is imperative. <emphasis>Do not</>
execute any data-modifying operations in the database before you dump,
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 000b5614dd2..d8ed7aef9c6 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -1,5 +1,5 @@
<!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.103 2008/02/15 22:17:06 tgl Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.104 2008/09/23 09:20:35 heikki Exp $
PostgreSQL documentation
-->
@@ -747,8 +747,7 @@ SELECT name FROM distributors ORDER BY code;
<para>
Character-string data is sorted according to the locale-specific
- collation order that was established when the database cluster
- was initialized.
+ collation order that was established when the database was created.
</para>
</refsect2>
diff --git a/doc/src/sgml/ref/show.sgml b/doc/src/sgml/ref/show.sgml
index ebd1acee35a..fdc348053ea 100644
--- a/doc/src/sgml/ref/show.sgml
+++ b/doc/src/sgml/ref/show.sgml
@@ -1,5 +1,5 @@
<!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/show.sgml,v 1.45 2008/01/03 21:23:15 tgl Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/show.sgml,v 1.46 2008/09/23 09:20:35 heikki Exp $
PostgreSQL documentation
-->
@@ -82,8 +82,8 @@ SHOW ALL
<para>
Shows the database's locale setting for collation (text
ordering). At present, this parameter can be shown but not
- set, because the setting is determined at
- <command>initdb</> time.
+ set, because the setting is determined at database creation
+ time.
</para>
</listitem>
</varlistentry>
@@ -94,8 +94,8 @@ SHOW ALL
<para>
Shows the database's locale setting for character
classification. At present, this parameter can be shown but
- not set, because the setting is determined at
- <command>initdb</> time.
+ not set, because the setting is determined at database creation
+ time.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 75c6d266e9d..adde49e1a39 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/runtime.sgml,v 1.416 2008/04/26 22:47:40 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/runtime.sgml,v 1.417 2008/09/23 09:20:34 heikki Exp $ -->
<chapter Id="runtime">
<title>Operating System Environment</title>
@@ -145,11 +145,12 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
Normally, it will just take the locale settings in the environment
and apply them to the initialized database. It is possible to
specify a different locale for the database; more information about
- that can be found in <xref linkend="locale">. The sort order used
- within a particular database cluster is set by
- <command>initdb</command> and cannot be changed later, short of
- dumping all data, rerunning <command>initdb</command>, and reloading
- the data. There is also a performance impact for using locales
+ that can be found in <xref linkend="locale">. The default sort order used
+ within the particular database cluster is set by
+ <command>initdb</command>, and while you can create new databases using a
+ different sort order, the order used in the template databases that initdb
+ creates cannot be changed without dropping and recreating them.
+ There is also a performance impact for using locales
other than <literal>C</> or <literal>POSIX</>. Therefore, it is
important to make this choice correctly the first time.
</para>
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 41db566b6cc..45a9f5a389f 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/textsearch.sgml,v 1.44 2008/05/16 16:31:01 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/textsearch.sgml,v 1.45 2008/09/23 09:20:34 heikki Exp $ -->
<chapter id="textsearch">
<title id="textsearch-title">Full Text Search</title>
@@ -1896,7 +1896,7 @@ LIMIT 10;
<note>
<para>
- The parser's notion of a <quote>letter</> is determined by the server's
+ The parser's notion of a <quote>letter</> is determined by the database's
locale setting, specifically <varname>lc_ctype</>. Words containing
only the basic ASCII letters are reported as a separate token type,
since it is sometimes useful to distinguish them. In most European