aboutsummaryrefslogtreecommitdiff
path: root/src/backend/executor/nodeAgg.c
Commit message (Collapse)AuthorAge
* For some reason access/tupmacs.h has been #including utils/memutils.h,Tom Lane2005-05-06
| | | | | | | which is neither needed by nor related to that header. Remove the bogus inclusion and instead include the header in those C files that actually need it. Also fix unnecessary inclusions and bad inclusion order in tsearch2 files.
* Convert oidvector and int2vector into variable-length arrays. ThisTom Lane2005-03-29
| | | | | | | | | | | | | change saves a great deal of space in pg_proc and its primary index, and it eliminates the former requirement that INDEX_MAX_KEYS and FUNC_MAX_ARGS have the same value. INDEX_MAX_KEYS is still embedded in the on-disk representation (because it affects index tuple header size), but FUNC_MAX_ARGS is not. I believe it would now be possible to increase FUNC_MAX_ARGS at little cost, but haven't experimented yet. There are still a lot of vestigial references to FUNC_MAX_ARGS, which I will clean up in a separate pass. However, getting rid of it altogether would require changing the FunctionCallInfoData struct, and I'm not sure I want to buy into that.
* Use InitFunctionCallInfoData() macro instead of MemSet in performanceTom Lane2005-03-22
| | | | critical places in execQual. By Atsushi Ogawa; some minor cleanup by moi.
* Revise TupleTableSlot code to avoid unnecessary construction and disassemblyTom Lane2005-03-16
| | | | | | | | | | | | | | | | | | | | | of tuples when passing data up through multiple plan nodes. A slot can now hold either a normal "physical" HeapTuple, or a "virtual" tuple consisting of Datum/isnull arrays. Upper plan levels can usually just copy the Datum arrays, avoiding heap_formtuple() and possible subsequent nocachegetattr() calls to extract the data again. This work extends Atsushi Ogawa's earlier patch, which provided the key idea of adding Datum arrays to TupleTableSlots. (I believe however that something like this was foreseen way back in Berkeley days --- see the old comment on ExecProject.) A test case involving many levels of join of fairly wide tables (about 80 columns altogether) showed about 3x overall speedup, though simple queries will probably not be helped very much. I have also duplicated some code in heaptuple.c in order to provide versions of heap_formtuple and friends that use "bool" arrays to indicate null attributes, instead of the old convention of "char" arrays containing either 'n' or ' '. This provides a better match to the convention used by ExecEvalExpr. While I have not made a concerted effort to get rid of uses of the old routines, I think they should be deprecated and eventually removed.
* Adjust the API for aggregate function calls so that a C-coded functionTom Lane2005-03-12
| | | | | | | | | | | | | can tell whether it is being used as an aggregate or not. This allows such a function to avoid re-pallocing a pass-by-reference transition value; normally it would be unsafe for a function to scribble on an input, but in the aggregate case it's safe to reuse the old transition value. Make int8inc() do this. This gets a useful improvement in the speed of COUNT(*), at least on narrow tables (it seems to be swamped by I/O when the table rows are wide). Per a discussion in early December with Neil Conway. I also fixed int_aggregate.c to check this, thereby turning it into something approaching a supportable technique instead of being a crude hack.
* Improve planner's estimation of the space needed for HashAgg plans:Tom Lane2005-01-28
| | | | | | look at the actual aggregate transition datatypes and the actual overhead needed by nodeAgg.c, instead of using pessimistic round numbers. Per a discussion with Michael Tiemann.
* Check that aggregate creator has the right to execute the transitionTom Lane2005-01-27
| | | | functions of the aggregate, at both aggregate creation and execution times.
* Tag appropriate files for rc3PostgreSQL Daemon2004-12-31
| | | | | | | | Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...
* Pgindent run for 8.0.Bruce Momjian2004-08-29
|
* Update copyright to 2004.Bruce Momjian2004-08-29
|
* Test HAVING condition before computing targetlist of an Aggregate node.Tom Lane2004-07-10
| | | | | | | This is required by SQL spec to avoid failures in cases like SELECT sum(win)/sum(lose) FROM ... GROUP BY ... HAVING sum(lose) > 0; AFAICT we have gotten this wrong since day one. Kudos to Holger Jakobs for being the first to notice.
* Infrastructure for I/O of composite types: arrange for the I/O routinesTom Lane2004-06-06
| | | | | | | | | | of a composite type to get that type's OID as their second parameter, in place of typelem which is useless. The actual changes are mostly centralized in getTypeInputInfo and siblings, but I had to fix a few places that were fetching pg_type.typelem for themselves instead of using the lsyscache.c routines. Also, I renamed all the related variables from 'typelem' to 'typioparam' to discourage people from assuming that they necessarily contain array element types.
* Use the new List API function names throughout the backend, and disable theNeil Conway2004-05-30
| | | | | list compatibility API by default. While doing this, I decided to keep the llast() macro around and introduce llast_int() and llast_oid() variants.
* Reimplement the linked list data structure used throughout the backend.Neil Conway2004-05-26
| | | | | | | | | | | | | | | | In the past, we used a 'Lispy' linked list implementation: a "list" was merely a pointer to the head node of the list. The problem with that design is that it makes lappend() and length() linear time. This patch fixes that problem (and others) by maintaining a count of the list length and a pointer to the tail node along with each head node pointer. A "list" is now a pointer to a structure containing some meta-data about the list; the head and tail pointers in that structure refer to ListCell structures that maintain the actual linked list of nodes. The function names of the list API have also been changed to, I hope, be more logically consistent. By default, the old function names are still available; they will be disabled-by-default once the rest of the tree has been updated to use the new API names.
* Repair memory leakage introduced into the non-hashed aggregate case byTom Lane2004-03-13
| | | | | | | 7.4 rewrite for hashed aggregate support. If the transition data type is pass-by-reference, the transValue must be pfreed when starting a new group boundary, else we have a one-value-per-group leakage. Thanks to Rae Steining for providing a reproducible test case.
* Rename SortMem and VacuumMem to work_mem and maintenance_work_mem.Tom Lane2004-02-03
| | | | | | | Make btree index creation and initial validation of foreign-key constraints use maintenance_work_mem rather than work_mem as their memory limit. Add some code to guc.c to allow these variables to be referenced by their old names in SHOW and SET commands, for backwards compatibility.
* $Header: -> $PostgreSQL Changes ...PostgreSQL Daemon2003-11-29
|
* Improve dynahash.c's API so that caller can specify the comparison functionTom Lane2003-08-19
| | | | | | | | | | | | | as well as the hash function (formerly the comparison function was hardwired as memcmp()). This makes it possible to eliminate the special-purpose hashtable management code in execGrouping.c in favor of using dynahash to manage tuple hashtables; which is a win because dynahash knows how to expand a hashtable when the original size estimate was too small, whereas the special-purpose code was too stupid to do that. (See recent gripe from Stephan Szabo about poor performance when hash table size estimate is way off.) Free side benefit: when using string_hash, the default comparison function is now strncmp() instead of memcmp(). This should eliminate some part of the overhead associated with larger NAMEDATALEN values.
* Another pgindent run with updated typedefs.Bruce Momjian2003-08-08
|
* Update copyrights to 2003.Bruce Momjian2003-08-04
|
* pgindent run.Bruce Momjian2003-08-04
|
* Adjust 'permission denied' messages to be more useful and consistent.Tom Lane2003-08-01
|
* Error message editing in backend/executor.Tom Lane2003-07-21
|
* Aggregates can be polymorphic, using polymorphic implementation functions.Tom Lane2003-07-01
| | | | | | It also works to create a non-polymorphic aggregate from polymorphic functions, should you want to do that. Regression test added, docs still lacking. By Joe Conway, with some kibitzing from Tom Lane.
* Back out array mega-patch.Bruce Momjian2003-06-25
| | | | Joe Conway
* Array mega-patch.Bruce Momjian2003-06-24
| | | | Joe Conway
* Revise hash join and hash aggregation code to use the same datatype-Tom Lane2003-06-22
| | | | | | | | specific hash functions used by hash indexes, rather than the old not-datatype-aware ComputeHashFunc routine. This makes it safe to do hash joining on several datatypes that previously couldn't use hashing. The sets of datatypes that are hash indexable and hash joinable are now exactly the same, whereas before each had some that weren't in the other.
* Implement outer-level aggregates to conform to the SQL spec, withTom Lane2003-06-06
| | | | | | | | extensions to support our historical behavior. An aggregate belongs to the closest query level of any of the variables in its argument, or the current query level if there are no variables (e.g., COUNT(*)). The implementation involves adding an agglevelsup field to Aggref, and treating outer aggregates like outer variables at planning time.
* Small performance improvement for hash joins and hash aggregation:Tom Lane2003-05-30
| | | | | | | when the plan is ReScanned, we don't have to rebuild the hash table if there is no parameter change for its child node. This idea has been used for a long time in Sort and Material nodes, but was not in the hash code till now.
* Make further use of new bitmapset code: executor's chgParam, extParam,Tom Lane2003-02-09
| | | | | | | | locParam lists can be converted to bitmapsets to speed updating. Also, replace 'locParam' with 'allParam', which contains all the paramIDs relevant to the node (i.e., the union of extParam and locParam); this saves a step during SetChangedParamList() without costing anything elsewhere.
* Detect duplicate aggregate calls and evaluate only one copy. ThisTom Lane2003-02-04
| | | | | speeds up some useful real-world cases like SELECT x, COUNT(*) FROM t GROUP BY x HAVING COUNT(*) > 100.
* Create a new file executor/execGrouping.c to centralize utility routinesTom Lane2003-01-10
| | | | shared by nodeGroup, nodeAgg, and soon nodeSubplan.
* Revise executor APIs so that all per-query state structure is built inTom Lane2002-12-15
| | | | | | a per-query memory context created by CreateExecutorState --- and destroyed by FreeExecutorState. This provides a final solution to the longstanding problem of memory leaked by various ExecEndNode calls.
* Phase 3 of read-only-plans project: ExecInitExpr now builds expressionTom Lane2002-12-13
| | | | | | | execution state trees, and ExecEvalExpr takes an expression state tree not an expression plan tree. The plan tree is now read-only as far as the executor is concerned. Next step is to begin actually exploiting this property.
* Phase 2 of read-only-plans project: restructure expression-tree nodesTom Lane2002-12-12
| | | | | | | | | so that all executable expression nodes inherit from a common supertype Expr. This is somewhat of an exercise in code purity rather than any real functional advance, but getting rid of the extra Oper or Func node formerly used in each operator or function call should provide at least a little space and speed improvement. initdb forced by changes in stored-rules representation.
* Phase 1 of read-only-plans project: cause executor state nodes to pointTom Lane2002-12-05
| | | | | | | | | | to plan nodes, not vice-versa. All executor state nodes now inherit from struct PlanState. Copying of plan trees has been simplified by not storing a list of SubPlans in Plan nodes (eliminating duplicate links). The executor still needs such a list, but it can build it during ExecutorStart since it has to scan the plan tree anyway. No initdb forced since no stored-on-disk structures changed, but you will need a full recompile because of node-numbering changes.
* Tighten selection of equality and ordering operators for groupingTom Lane2002-11-29
| | | | | | | operations: make sure we use operators that are compatible, as determined by a mergejoin link in pg_operator. Also, add code to planner to ensure we don't try to use hashed grouping when the grouping operators aren't marked hashable.
* Add an at-least-marginally-plausible method of estimating the numberTom Lane2002-11-19
| | | | | | of groups produced by GROUP BY. This improves the accuracy of planning estimates for grouped subselects, and is needed to check whether a hashed aggregation plan risks memory overflow.
* Add new palloc0 call as merge of palloc and MemSet(0).Bruce Momjian2002-11-13
|
* Back out use of palloc0 in place if palloc/MemSet. Seems constant lenBruce Momjian2002-11-11
| | | | to MemSet is a performance boost.
* Merge palloc()/MemSet(0) calls into a single palloc0() call.Bruce Momjian2002-11-10
|
* Phase 2 of hashed-aggregation project. nodeAgg.c now knows how to doTom Lane2002-11-06
| | | | hashed aggregation, but there's not yet planner support for it.
* First phase of implementing hash-based grouping/aggregation. An AGG planTom Lane2002-11-06
| | | | | | | | | | | | | node now does its own grouping of the input rows, and has no need for a preceding GROUP node in the plan pipeline. This allows elimination of the misnamed tuplePerGroup option for GROUP, and actually saves more code in nodeGroup.c than it costs in nodeAgg.c, as well as being presumably faster. Restructure the API of query_planner so that we do not commit to using a sorted or unsorted plan in query_planner; instead grouping_planner makes the decision. (Right now it isn't any smarter than query_planner was, but that will change as soon as it has the option to select a hash- based aggregation step.) Despite all the hackery, no initdb needed since only in-memory node types changed.
* Reduce a couple of debugging messages from LOG to DEBUG1 category.Tom Lane2002-11-01
|
* Tweak a few of the most heavily used function call points to zero outTom Lane2002-10-04
| | | | | | | | | | just the significant fields of FunctionCallInfoData, rather than MemSet'ing the whole struct to zero. Unused positions in the arg[] array will thereby contain garbage rather than zeroes. This buys back some of the performance hit from increasing FUNC_MAX_ARGS. Also tweak tuplesort.c code for more speed by marking some routines 'inline'. All together these changes speed up simple sorts, like count(distinct int4column), by about 25% on a P4 running RH Linux 7.2.
* Make the world at least somewhat safe for zero-column tables, andTom Lane2002-09-28
| | | | | remove the special case in ALTER DROP COLUMN to prohibit dropping a table's last column.
* Extend pg_cast castimplicit column to a three-way value; this allows usTom Lane2002-09-18
| | | | | | | | | | | | | | | | | | | | | | | | to be flexible about assignment casts without introducing ambiguity in operator/function resolution. Introduce a well-defined promotion hierarchy for numeric datatypes (int2->int4->int8->numeric->float4->float8). Change make_const to initially label numeric literals as int4, int8, or numeric (never float8 anymore). Explicitly mark Func and RelabelType nodes to indicate whether they came from a function call, explicit cast, or implicit cast; use this to do reverse-listing more accurately and without so many heuristics. Explicit casts to char, varchar, bit, varbit will truncate or pad without raising an error (the pre-7.2 behavior), while assigning to a column without any explicit cast will still raise an error for wrong-length data like 7.3. This more nearly follows the SQL spec than 7.2 behavior (we should be reporting a 'completion condition' in the explicit-cast cases, but we have no mechanism for that, so just do silent truncation). Fix some problems with enforcement of typmod for array elements; it didn't work at all in 'UPDATE ... SET array[n] = foo', for example. Provide a generalized array_length_coerce() function to replace the specialized per-array-type functions that used to be needed (and were missing for NUMERIC as well as all the datetime types). Add missing conversions int8<->float4, text<->numeric, oid<->int8. initdb forced.
* pgindent run.Bruce Momjian2002-09-04
|
* Update copyright to 2002.Bruce Momjian2002-06-20
|
* Get rid of the last few uses of typeidTypeName() rather thanTom Lane2002-05-17
| | | | format_type_be() in error messages.