1 files changed, 31 insertions, 29 deletions
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 287cabc33b4..eeef7a22c43 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/charset.sgml,v 2.95 2009/05/18 08:59:28 petere Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/charset.sgml,v 2.96 2010/02/03 17:25:05 momjian Exp $ -->
 
 <chapter id="charset">
  <title>Localization</>
@@ -6,8 +6,8 @@
  <para>
   This chapter describes the available localization features from the
   point of view of the administrator.
-  <productname>PostgreSQL</productname> supports localization with
-  two approaches:
+  <productname>PostgreSQL</productname> supports two localization
+  facilities:
 
    <itemizedlist>
     <listitem>
@@ -67,10 +67,10 @@ initdb --locale=sv_SE
     (<literal>sv</>) as spoken
     in Sweden (<literal>SE</>).  Other possibilities might be
     <literal>en_US</> (U.S. English) and <literal>fr_CA</> (French
-    Canadian).  If more than one character set can be useful for a
+    Canadian).  If more than one character set can be used for a
     locale then the specifications look like this:
-    <literal>cs_CZ.ISO8859-2</>. What locales are available under what
-    names on your system depends on what was provided by the operating
+    <literal>cs_CZ.ISO8859-2</>. What locales are available on your
+    system under what names depends on what was provided by the operating
     system vendor and what was installed.  On most Unix systems, the command
     <literal>locale -a</> will provide a list of available locales.
     Windows uses more verbose locale names, such as <literal>German_Germany</>
@@ -80,8 +80,8 @@ initdb --locale=sv_SE
    <para>
     Occasionally it is useful to mix rules from several locales, e.g.,
     use English collation rules but Spanish messages.  To support that, a
-    set of locale subcategories exist that control only a certain
-    aspect of the localization rules:
+    set of locale subcategories exist that control only certain
+    aspects of the localization rules:
 
     <informaltable>
      <tgroup cols="2">
@@ -127,13 +127,13 @@ initdb --locale=sv_SE
    </para>
 
    <para>
-    The nature of some locale categories is that their value has to be
+    Some locale categories must have their values
     fixed when the database is created.  You can use different settings
     for different databases, but once a database is created, you cannot
     change them for that database anymore. <literal>LC_COLLATE</literal>
-    and <literal>LC_CTYPE</literal> are these categories.  They affect
+    and <literal>LC_CTYPE</literal> are categories of this type.  They affect
     the sort order of indexes, so they must be kept fixed, or indexes on
-    text columns will become corrupt.  The default values for these
+    text columns would become corrupt.  The default values for these
     categories are determined when <command>initdb</command> is run, and
     those values are used when new databases are created, unless
     specified otherwise in the <command>CREATE DATABASE</command> command.
@@ -146,7 +146,7 @@ initdb --locale=sv_SE
     linkend="runtime-config-client-format"> for details).  The values
     that are chosen by <command>initdb</command> are actually only written
     into the configuration file <filename>postgresql.conf</filename> to
-    serve as defaults when the server is started.  If you delete these
+    serve as defaults when the server is started.  If you disable these
     assignments from <filename>postgresql.conf</filename> then the
     server will inherit the settings from its execution environment.
    </para>
@@ -178,7 +178,7 @@ initdb --locale=sv_SE
      settings for the purpose of setting the language of messages.  If
      in doubt, please refer to the documentation of your operating
      system, in particular the documentation about
-     <application>gettext</>, for more information.
+     <application>gettext</>.
     </para>
    </note>
 
@@ -320,8 +320,9 @@ initdb --locale=sv_SE
 
   <para>
    An important restriction, however, is that each database's character set
-   must be compatible with the database's <envar>LC_CTYPE</> and
-   <envar>LC_COLLATE</> locale settings. For <literal>C</> or
+   must be compatible with the database's <envar>LC_CTYPE</> (character
+   classification) and <envar>LC_COLLATE</> (string sort order) locale
+   settings. For <literal>C</> or
    <literal>POSIX</> locale, any character set is allowed, but for other
    locales there is only one character set that will work correctly.
    (On Windows, however, UTF-8 encoding can be used with any locale.)
@@ -543,7 +544,7 @@ initdb --locale=sv_SE
          <entry>LATIN1 with Euro and accents</entry>
          <entry>Yes</entry>
          <entry>1</entry>
-         <entry>ISO885915</entry>
+         <entry><literal>ISO885915</></entry>
         </row>
         <row>
          <entry><literal>LATIN10</literal></entry>
@@ -694,7 +695,7 @@ initdb --locale=sv_SE
      </table>
 
      <para>
-      Not all <acronym>API</>s support all the listed character sets. For example, the
+      Not all client <acronym>API</>s support all the listed character sets. For example, the
       <productname>PostgreSQL</>
       JDBC driver does not support <literal>MULE_INTERNAL</>, <literal>LATIN6</>,
       <literal>LATIN8</>, and <literal>LATIN10</>.
@@ -710,7 +711,7 @@ initdb --locale=sv_SE
       much a declaration that a specific encoding is in use, as a declaration
       of ignorance about the encoding.  In most cases, if you are
       working with any non-ASCII data, it is unwise to use the
-      <literal>SQL_ASCII</> setting, because
+      <literal>SQL_ASCII</> setting because
       <productname>PostgreSQL</productname> will be unable to help you by
       converting or validating non-ASCII characters.
      </para>
@@ -720,17 +721,17 @@ initdb --locale=sv_SE
     <title>Setting the Character Set</title>
 
     <para>
-     <command>initdb</> defines the default character set
+     <command>initdb</> defines the default character set (encoding)
      for a <productname>PostgreSQL</productname> cluster. For example,
 
 <screen>
 initdb -E EUC_JP
 </screen>
 
-     sets the default character set (encoding) to
+     sets the default character set to
      <literal>EUC_JP</literal> (Extended Unix Code for Japanese).  You
      can use <option>--encoding</option> instead of
-     <option>-E</option> if you prefer to type longer option strings.
+     <option>-E</option> if you prefer longer option strings.
      If no <option>-E</> or <option>--encoding</option> option is
      given, <command>initdb</> attempts to determine the appropriate
      encoding to use based on the specified or default locale.
@@ -762,8 +763,8 @@ CREATE DATABASE korean WITH ENCODING 'EUC_KR' LC_COLLATE='ko_KR.euckr' LC_CTYPE=
     <para>
      The encoding for a database is stored in the system catalog
      <literal>pg_database</literal>.  You can see it by using the
-     <option>-l</option> option or the <command>\l</command> command
-     of <command>psql</command>.
+     <command>psql</command> <option>-l</option> option or the
+     <command>\l</command> command.
 
 <screen>
 $ <userinput>psql -l</userinput>
@@ -784,11 +785,11 @@ $ <userinput>psql -l</userinput>
     <important>
      <para>
       On most modern operating systems, <productname>PostgreSQL</productname>
-      can determine which character set is implied by an <envar>LC_CTYPE</>
+      can determine which character set is implied by the <envar>LC_CTYPE</>
       setting, and it will enforce that only the matching database encoding is
       used.  On older systems it is your responsibility to ensure that you use
       the encoding expected by the locale you have selected.  A mistake in
-      this area is likely to lead to strange misbehavior of locale-dependent
+      this area is likely to lead to strange behavior of locale-dependent
       operations such as sorting.
      </para>
 
@@ -1190,9 +1191,9 @@ RESET client_encoding;
     <para>
      If the conversion of a particular character is not possible
      &mdash; suppose you chose <literal>EUC_JP</literal> for the
-     server and <literal>LATIN1</literal> for the client, then some
-     Japanese characters do not have a representation in
-     <literal>LATIN1</literal> &mdash; then an error is reported.
+     server and <literal>LATIN1</literal> for the client, and some
+     Japanese characters are returned that do not have a representation in
+     <literal>LATIN1</literal> &mdash; an error is reported.
     </para>
 
     <para>
@@ -1249,7 +1250,8 @@ RESET client_encoding;
 
        <listitem>
         <para>
-         <acronym>UTF</acronym>-8 is defined here.
+         <acronym>UTF</acronym>-8 (8-bit UCS/Unicode Transformation
+         Format) is defined here.
         </para>
        </listitem>
       </varlistentry>