author | Tom Lane <tgl@sss.pgh.pa.us> | 2006-06-06 17:59:58 +0000 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2006-06-06 17:59:58 +0000 |
commit | 8a30cc212745681e5328c030f8fc5d210d733b92 (patch) | |
tree | f0795602f61aea976b8fc6ebf70a387716734fbb /doc/src | |
parent | 05631354f35897dc4f35378b8aa4e9a375e19f42 (diff) | |
Make the planner estimate costs for nestloop inner indexscans on the basis
that the Mackert-Lohman formula applies across all the repetitions of the
nestloop, not just each scan independently. We use the M-L formula to
estimate the number of pages fetched from the index as well as from the table;
that isn't what it was designed for, but it seems reasonably applicable
anyway. This makes large numbers of repetitions look much cheaper than
before, which accords with many reports we've received of overestimation
of the cost of a nestloop. Also, change the index access cost model to
charge random_page_cost per index leaf page touched, while explicitly
not counting anything for access to metapage or upper tree pages. This
may all need tweaking after we get some field experience, but in simple
tests it seems to be giving saner results than before. The main thing
is to get the infrastructure in place to let cost_index() and amcostestimate
functions take repeated scans into account at all. Per my recent proposal.
Note: this patch changes pg_proc.h, but I did not force initdb because
the changes are basically cosmetic --- the system does not look into
pg_proc to decide how to call an index amcostestimate function, and
there's no way to call such a function from SQL at all.
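
To make the amortization argument in the message above concrete, here is a minimal standalone sketch of the Mackert-Lohman page-fetch estimate and of applying it once across all repetitions of a nestloop rather than once per scan. The function names `ml_pages_fetched` and `ml_pages_per_scan` are hypothetical and are not the names used in the patch. The sketch assumes the usual three-case form of the formula, with T pages in the relation, b buffer pages available, and Ns tuples fetched; it omits the rounding and edge-case handling a real planner would need.

```c
#include <assert.h>

/*
 * Sketch of the Mackert-Lohman estimate of distinct pages fetched.
 *   tuples_fetched: Ns, number of tuples fetched (tuples times selectivity)
 *   T:              number of pages in the relation
 *   b:              number of buffer pages assumed available
 * Hypothetical helper, not the planner's own function.
 */
double
ml_pages_fetched(double tuples_fetched, double T, double b)
{
    double pages;

    assert(T > 0.0 && b > 0.0 && tuples_fetched >= 0.0);

    if (T <= b)
    {
        /* Whole relation fits in cache: never more than T page reads. */
        pages = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
        if (pages > T)
            pages = T;
    }
    else
    {
        /* Relation exceeds cache: past "lim" tuples, re-reads begin. */
        double lim = (2.0 * T * b) / (2.0 * T - b);

        if (tuples_fetched <= lim)
            pages = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
        else
            pages = b + (tuples_fetched - lim) * (T - b) / T;
    }
    return pages;
}

/*
 * Average pages fetched per scan when the scan is repeated num_scans
 * times: apply the formula to the combined tuple count, then divide.
 */
double
ml_pages_per_scan(double tuples_per_scan, double num_scans,
                  double T, double b)
{
    assert(num_scans >= 1.0);
    return ml_pages_fetched(tuples_per_scan * num_scans, T, b) / num_scans;
}
```

With a 100-page relation, 1000 buffer pages, and 10 tuples per scan, `ml_pages_per_scan(10, 1000, 100, 1000)` charges far less per scan than `ml_pages_fetched(10, 100, 1000)` does for a single independent scan, because the combined estimate is capped at the relation size. This is the effect the message describes as making large numbers of repetitions look much cheaper.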
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/indexam.sgml | 26 |
1 files changed, 25 insertions, 1 deletions
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 4bf14ba7e60..a001cd4e33a 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/indexam.sgml,v 2.13 2006/06/05 02:49:58 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/indexam.sgml,v 2.14 2006/06/06 17:59:57 tgl Exp $ -->
 
 <chapter id="indexam">
  <title>Index Access Method Interface Definition</title>
@@ -344,6 +344,7 @@ void
 amcostestimate (PlannerInfo *root,
                 IndexOptInfo *index,
                 List *indexQuals,
+                RelOptInfo *outer_rel,
                 Cost *indexStartupCost,
                 Cost *indexTotalCost,
                 Selectivity *indexSelectivity,
@@ -681,6 +682,7 @@ void
 amcostestimate (PlannerInfo *root,
                 IndexOptInfo *index,
                 List *indexQuals,
+                RelOptInfo *outer_rel,
                 Cost *indexStartupCost,
                 Cost *indexTotalCost,
                 Selectivity *indexSelectivity,
@@ -718,6 +720,20 @@ amcostestimate (PlannerInfo *root,
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>outer_rel</term>
+    <listitem>
+     <para>
+      If the index is being considered for use in a join inner indexscan,
+      the planner's information about the outer side of the join. Otherwise
+      NULL. When non-NULL, some of the qual clauses will be join clauses
+      with this rel rather than being simple restriction clauses. Also,
+      the cost estimator should expect that the index scan will be repeated
+      for each row of the outer rel.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </para>
 
@@ -808,6 +824,11 @@ amcostestimate (PlannerInfo *root,
    table.
   </para>
 
+  <para>
+   In the join case, the returned numbers should be averages expected for
+   any one scan of the index.
+  </para>
+
   <procedure>
    <title>Cost Estimation</title>
    <para>
@@ -859,6 +880,9 @@ amcostestimate (PlannerInfo *root,
 *indexTotalCost = seq_page_cost * numIndexPages +
     (cpu_index_tuple_cost + index_qual_cost.per_tuple) * numIndexTuples;
      </programlisting>
+
+     However, the above does not account for amortization of index reads
+     across repeated index scans in the join case.
     </para>
    </step>
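
The documentation change above says that a cost estimator receiving a non-NULL outer_rel should expect one index scan per outer row and should return per-scan averages. The following standalone sketch shows one way an estimator might fold that into the documented total-cost formula. Everything named `Sketch...` or `sketch_...` is a hypothetical stand-in: the real RelOptInfo has many more fields, the cost parameters are planner settings rather than a struct, and the page count summed over all scans is assumed to come from some separate estimate.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the planner's RelOptInfo; only the row estimate is modeled. */
typedef struct SketchOuterRel
{
    double rows;            /* estimated number of outer rows */
} SketchOuterRel;

/* Stand-in for the cost parameters used in the documented formula. */
typedef struct SketchCostParams
{
    double seq_page_cost;
    double cpu_index_tuple_cost;
    double qual_cost_per_tuple;
} SketchCostParams;

/* A NULL outer_rel means a plain (non-join) scan, executed once. */
double
sketch_num_scans(const SketchOuterRel *outer_rel)
{
    if (outer_rel != NULL && outer_rel->rows > 1.0)
        return outer_rel->rows;
    return 1.0;
}

/*
 * Per-scan total cost: the I/O charge for all pages read across every
 * repetition is averaged over the scans, while the CPU charge is paid
 * in full by each scan.
 */
double
sketch_per_scan_total_cost(const SketchCostParams *p,
                           const SketchOuterRel *outer_rel,
                           double pages_all_scans,
                           double tuples_per_scan)
{
    double num_scans = sketch_num_scans(outer_rel);
    double io_cost;
    double cpu_cost;

    assert(p != NULL && pages_all_scans >= 0.0 && tuples_per_scan >= 0.0);

    io_cost = p->seq_page_cost * pages_all_scans / num_scans;
    cpu_cost = (p->cpu_index_tuple_cost + p->qual_cost_per_tuple) *
        tuples_per_scan;
    return io_cost + cpu_cost;
}
```

With outer_rel set to NULL this reduces to the documented formula for a single scan. With a non-NULL outer_rel, only the page-read term shrinks, which matches the idea that repeated scans share cached index pages but still process their own tuples.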