author Tom Lane <tgl@sss.pgh.pa.us> 2006-06-06 17:59:58 +0000
committer Tom Lane <tgl@sss.pgh.pa.us> 2006-06-06 17:59:58 +0000
commit 8a30cc212745681e5328c030f8fc5d210d733b92 (patch)
tree f0795602f61aea976b8fc6ebf70a387716734fbb /doc/src
parent 05631354f35897dc4f35378b8aa4e9a375e19f42 (diff)
Make the planner estimate costs for nestloop inner indexscans on the basis
that the Mackert-Lohmann formula applies across all the repetitions of the
nestloop, not just each scan independently. We use the M-L formula to
estimate the number of pages fetched from the index as well as from the
table; that isn't what it was designed for, but it seems reasonably
applicable anyway. This makes large numbers of repetitions look much
cheaper than before, which accords with many reports we've received of
overestimation of the cost of a nestloop.

Also, change the index access cost model to charge random_page_cost per
index leaf page touched, while explicitly not counting anything for access
to metapage or upper tree pages. This may all need tweaking after we get
some field experience, but in simple tests it seems to be giving saner
results than before. The main thing is to get the infrastructure in place
to let cost_index() and amcostestimate functions take repeated scans into
account at all. Per my recent proposal.

Note: this patch changes pg_proc.h, but I did not force initdb because the
changes are basically cosmetic --- the system does not look into pg_proc to
decide how to call an index amcostestimate function, and there's no way to
call such a function from SQL at all.
Diffstat (limited to 'doc/src')
-rw-r--r-- doc/src/sgml/indexam.sgml 26
1 files changed, 25 insertions, 1 deletions
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 4bf14ba7e60..a001cd4e33a 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/indexam.sgml,v 2.13 2006/06/05 02:49:58 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/indexam.sgml,v 2.14 2006/06/06 17:59:57 tgl Exp $ -->
<chapter id="indexam">
<title>Index Access Method Interface Definition</title>
@@ -344,6 +344,7 @@ void
amcostestimate (PlannerInfo *root,
IndexOptInfo *index,
List *indexQuals,
+ RelOptInfo *outer_rel,
Cost *indexStartupCost,
Cost *indexTotalCost,
Selectivity *indexSelectivity,
@@ -681,6 +682,7 @@ void
amcostestimate (PlannerInfo *root,
IndexOptInfo *index,
List *indexQuals,
+ RelOptInfo *outer_rel,
Cost *indexStartupCost,
Cost *indexTotalCost,
Selectivity *indexSelectivity,
@@ -718,6 +720,20 @@ amcostestimate (PlannerInfo *root,
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term>outer_rel</term>
+ <listitem>
+ <para>
+ If the index is being considered for use in a join inner indexscan,
+ the planner's information about the outer side of the join. Otherwise
+ NULL. When non-NULL, some of the qual clauses will be join clauses
+ with this rel rather than being simple restriction clauses. Also,
+ the cost estimator should expect that the index scan will be repeated
+ for each row of the outer rel.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
@@ -808,6 +824,11 @@ amcostestimate (PlannerInfo *root,
table.
</para>
+ <para>
+ In the join case, the returned numbers should be averages expected for
+ any one scan of the index.
+ </para>
+
<procedure>
<title>Cost Estimation</title>
<para>
@@ -859,6 +880,9 @@ amcostestimate (PlannerInfo *root,
*indexTotalCost = seq_page_cost * numIndexPages +
(cpu_index_tuple_cost + index_qual_cost.per_tuple) * numIndexTuples;
</programlisting>
+
+ However, the above does not account for amortization of index reads
+ across repeated index scans in the join case.
</para>
</step>