diff options
Diffstat (limited to 'doc/src/sgml/indices.sgml')
-rw-r--r-- | doc/src/sgml/indices.sgml | 93 |
1 files changed, 49 insertions, 44 deletions
diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml index add55501e51..6bf10180698 100644 --- a/doc/src/sgml/indices.sgml +++ b/doc/src/sgml/indices.sgml @@ -1,4 +1,4 @@ -<!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.38 2002/11/11 20:14:03 petere Exp $ --> +<!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.39 2003/03/13 01:30:28 petere Exp $ --> <chapter id="indexes"> <title id="indexes-title">Indexes</title> @@ -83,8 +83,8 @@ CREATE INDEX test1_id_index ON test1 (id); </para> <para> - Indexes can benefit <command>UPDATE</command>s and - <command>DELETE</command>s with search conditions. Indexes can also be + Indexes can also benefit <command>UPDATE</command> and + <command>DELETE</command> commands with search conditions. Indexes can moreover be used in join queries. Thus, an index defined on a column that is part of a join condition can significantly speed up queries with joins. @@ -119,7 +119,7 @@ CREATE INDEX test1_id_index ON test1 (id); By default, the <command>CREATE INDEX</command> command will create a B-tree index, which fits the most common situations. In - particular, the <productname>PostgreSQL</productname> query optimizer + particular, the <productname>PostgreSQL</productname> query planner will consider using a B-tree index whenever an indexed column is involved in a comparison using one of these operators: @@ -146,7 +146,7 @@ CREATE INDEX test1_id_index ON test1 (id); <synopsis> CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING RTREE (<replaceable>column</replaceable>); </synopsis> - The <productname>PostgreSQL</productname> query optimizer will + The <productname>PostgreSQL</productname> query planner will consider using an R-tree index whenever an indexed column is involved in a comparison using one of these operators: @@ -172,7 +172,7 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> <primary>hash</primary> <see>indexes</see> </indexterm> - The query optimizer will consider using a hash index whenever an + The query planner will consider using a hash index whenever an indexed column is involved in a comparison using the <literal>=</literal> operator. The following command is used to create a hash index: @@ -196,9 +196,8 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> standard R-trees using Guttman's quadratic split algorithm. The hash index is an implementation of Litwin's linear hashing. We mention the algorithms used solely to indicate that all of these - access methods are fully dynamic and do not have to be optimized - periodically (as is the case with, for example, static hash access - methods). + index methods are fully dynamic and do not have to be optimized + periodically (as is the case with, for example, static hash methods). </para> </sect1> @@ -242,17 +241,17 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor); </para> <para> - The query optimizer can use a multicolumn index for queries that - involve the first <parameter>n</parameter> consecutive columns in - the index (when used with appropriate operators), up to the total - number of columns specified in the index definition. For example, + The query planner can use a multicolumn index for queries that + involve the leftmost column in the index definition and any number + of columns listed to the right of it without a gap (when + used with appropriate operators). For example, an index on <literal>(a, b, c)</literal> can be used in queries involving all of <literal>a</literal>, <literal>b</literal>, and <literal>c</literal>, or in queries involving both <literal>a</literal> and <literal>b</literal>, or in queries involving only <literal>a</literal>, but not in other combinations. (In a query involving <literal>a</literal> and <literal>c</literal> - the optimizer might choose to use the index for + the planner might choose to use the index for <literal>a</literal> only and treat <literal>c</literal> like an ordinary unindexed column.) </para> @@ -296,7 +295,7 @@ CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</repla <para> When an index is declared unique, multiple table rows with equal - indexed values will not be allowed. NULL values are not considered + indexed values will not be allowed. Null values are not considered equal. </para> @@ -342,7 +341,7 @@ CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</repla SELECT * FROM test1 WHERE lower(col1) = 'value'; </programlisting> This query can use an index, if one has been - defined on the result of the <literal>lower(column)</literal> + defined on the result of the <literal>lower(col1)</literal> operation: <programlisting> CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1)); @@ -353,7 +352,7 @@ CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1)); The function in the index definition can take more than one argument, but they must be table columns, not constants. Functional indexes are always single-column (namely, the function - result) even if the function uses more than one input field; there + result) even if the function uses more than one input column; there cannot be multicolumn indexes that contain function calls. </para> @@ -377,29 +376,32 @@ CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1)); CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> (<replaceable>column</replaceable> <replaceable>opclass</replaceable> <optional>, ...</optional>); </synopsis> The operator class identifies the operators to be used by the index - for that column. For example, a B-tree index on four-byte integers + for that column. For example, a B-tree index on the type <type>int4</type> would use the <literal>int4_ops</literal> class; this operator - class includes comparison functions for four-byte integers. In + class includes comparison functions for values of type <type>int4</type>. In practice the default operator class for the column's data type is usually sufficient. The main point of having operator classes is that for some data types, there could be more than one meaningful ordering. For example, we might want to sort a complex-number data type either by absolute value or by real part. We could do this by defining two operator classes for the data type and then selecting - the proper class when making an index. There are also some - operator classes with special purposes: + the proper class when making an index. + </para> + + <para> + There are also some built-in operator classes besides the default ones: <itemizedlist> <listitem> <para> The operator classes <literal>box_ops</literal> and <literal>bigbox_ops</literal> both support R-tree indexes on the - <literal>box</literal> data type. The difference between them is + <type>box</type> data type. The difference between them is that <literal>bigbox_ops</literal> scales box coordinates down, to avoid floating-point exceptions from doing multiplication, addition, and subtraction on very large floating-point coordinates. If the field on which your rectangles lie is about - 20 000 units square or larger, you should use + 20 000 square units or larger, you should use <literal>bigbox_ops</literal>. </para> </listitem> @@ -409,25 +411,25 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> <para> The following query shows all defined operator classes: - <programlisting> -SELECT am.amname AS acc_method, - opc.opcname AS ops_name +<programlisting> +SELECT am.amname AS index_method, + opc.opcname AS opclass_name FROM pg_am am, pg_opclass opc WHERE opc.opcamid = am.oid - ORDER BY acc_method, ops_name; - </programlisting> + ORDER BY index_method, opclass_name; +</programlisting> It can be extended to show all the operators included in each class: - <programlisting> -SELECT am.amname AS acc_method, - opc.opcname AS ops_name, - opr.oprname AS ops_comp +<programlisting> +SELECT am.amname AS index_method, + opc.opcname AS opclass_name, + opr.oprname AS opclass_operator FROM pg_am am, pg_opclass opc, pg_amop amop, pg_operator opr WHERE opc.opcamid = am.oid AND amop.amopclaid = opc.oid AND amop.amopopr = opr.oid - ORDER BY acc_method, ops_name, ops_comp; - </programlisting> + ORDER BY index_method, opclass_name, opclass_operator; +</programlisting> </para> </sect1> @@ -465,7 +467,7 @@ SELECT am.amname AS acc_method, <para> Suppose you are storing web server access logs in a database. - Most accesses originate from the IP range of your organization but + Most accesses originate from the IP address range of your organization but some are from elsewhere (say, employees on dial-up connections). If your searches by IP are primarily for outside accesses, you probably do not need to index the IP range that corresponds to your @@ -575,16 +577,16 @@ SELECT * FROM orders WHERE order_nr = 3501; predicate must match the conditions used in the queries that are supposed to benefit from the index. To be precise, a partial index can be used in a query only if the system can recognize that - the query's WHERE condition mathematically <firstterm>implies</> - the index's predicate. + the <literal>WHERE</> condition of the query mathematically implies + the predicate of the index. <productname>PostgreSQL</productname> does not have a sophisticated theorem prover that can recognize mathematically equivalent - predicates that are written in different forms. (Not + expressions that are written in different forms. (Not only is such a general theorem prover extremely difficult to create, it would probably be too slow to be of any real use.) The system can recognize simple inequality implications, for example <quote>x < 1</quote> implies <quote>x < 2</quote>; otherwise - the predicate condition must exactly match the query's WHERE condition + the predicate condition must exactly match the query's <literal>WHERE</> condition or the index will not be recognized to be usable. </para> @@ -606,15 +608,18 @@ SELECT * FROM orders WHERE order_nr = 3501; a given subject and target combination, but there might be any number of <quote>unsuccessful</> entries. Here is one way to do it: <programlisting> -CREATE TABLE tests (subject text, - target text, - success bool, - ...); +CREATE TABLE tests ( + subject text, + target text, + success boolean, + ... +); + CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target) WHERE success; </programlisting> This is a particularly efficient way of doing it when there are few - successful trials and many unsuccessful ones. + successful tests and many unsuccessful ones. </para> </example> |