path: root/src/backend
* Tom Lane, 2008-09-15: Change hash indexes to store only the hash code rather than the whole indexed value. This means that hash index lookups are always lossy and have to be rechecked when the heap is visited; however, the gain in index compactness outweighs this when the indexed values are wide. Also, we only need to perform datatype comparisons when the hash codes match exactly, rather than for every entry in the hash bucket; so it could also win for datatypes that have expensive comparison functions. A small additional win is gained by keeping hash index pages sorted by hash code and using binary search to reduce the number of index tuples we have to look at.

  Xiao Meng

  This commit also incorporates Zdenek Kotala's patch to isolate hash metapages and hash bitmaps a bit better from the page header datastructures.
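  As a rough illustration of the user-visible side (table and column names here are hypothetical), a hash index is declared the usual way; with this change its entries hold only hash codes, so matches are rechecked against the heap:

      -- the index stores hash codes of wide_col, not the values themselves
      CREATE INDEX orders_wide_col_hash_idx ON orders USING hash (wide_col);
      -- equality lookups look the same from SQL; candidate matches are rechecked at the heap
      SELECT * FROM orders WHERE wide_col = 'some long value';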
* Magnus Hagander, 2008-09-15: Parse pg_hba.conf in postmaster, instead of once in each backend for each connection. This makes it possible to catch errors in the pg_hba file when it's being reloaded, instead of silently reloading a broken file and failing only when a user tries to connect. This patch also makes the "sameuser" argument to ident authentication optional.
* Tom Lane, 2008-09-12: Skip opfamily check in eclass_matches_any_index() when the index isn't a btree. We can't easily tell whether clauses generated from the equivalence class could be used with such an index, so just assume that they might be. This bit of over-optimization prevented use of non-btree indexes for nestloop inner indexscans, in any case where the join uses an equality operator that is also a btree operator --- which in particular is typically true for hash indexes. Noted while trying to test the current hash index patch.
* Tom Lane, 2008-09-11: Tighten up to_date/to_timestamp so that they are more likely to reject erroneous input, rather than silently producing bizarre results as formerly happened.

  Brendan Jurd
* Tom Lane, 2008-09-11: Adjust the parser to accept the typename syntax INTERVAL ... SECOND(n) and the literal syntax INTERVAL 'string' ... SECOND(n), as required by the SQL standard. Our old syntax put (n) directly after INTERVAL, which was a mistake, but will still be accepted for backward compatibility as well as symmetry with the TIMESTAMP cases. Change intervaltypmodout to show it in the spec's way, too. (This could potentially affect clients, if there are any that analyze the typmod of an INTERVAL in any detail.)

  Also fix interval input to handle 'min:sec.frac' properly; I had overlooked this case in my previous patch.

  Document the use of the interval fields qualifier, which up to now we had never mentioned in the docs. (I think the omission was intentional because it didn't work per spec; but it does now, or at least close enough to be credible.)
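  A brief SQL sketch of the two syntaxes described above (table and column names are made up):

      -- typename syntax: the precision follows the trailing field, not INTERVAL itself
      CREATE TABLE runs (elapsed INTERVAL HOUR TO SECOND(2));
      -- literal syntax with a precision on SECOND
      SELECT INTERVAL '10:20:30.4567' HOUR TO SECOND(3);
      -- the old placement, INTERVAL(n), is still accepted for backward compatibility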
* Alvaro Herrera, 2008-09-11: Initialize the minimum frozen Xid in vac_update_datfrozenxid using GetOldestXmin() instead of RecentGlobalXmin; this is safer because we do not depend on the latter being correctly set elsewhere, and while it is more expensive, this code path is not performance-critical.

  This is a real risk for autovacuum, because it can execute whole cycles without doing a single vacuum, which would mean that RecentGlobalXmin would stay at its initialization value, FirstNormalTransactionId, causing a bogus value to be inserted in pg_database. This bug could explain some recent reports of failure to truncate pg_clog.

  At the same time, change the initialization of RecentGlobalXmin to InvalidTransactionId, and ensure that it's set to something else whenever it's going to be used. Using it as FirstNormalTransactionId in HOT page pruning could incur data loss. InitPostgres takes care of setting it to a valid value, but the extra checks are there to prevent "special" backends from behaving in unusual ways.

  Per Tom Lane's detailed problem dissection in 29544.1221061979@sss.pgh.pa.us
* Tom Lane, 2008-09-10: Tweak newly added set_config_sourcefile() so that the target record isn't left corrupt if guc_strdup should fail.
* Tom Lane, 2008-09-10: Make our parsing of INTERVAL literals spec-compliant (or at least a heck of a lot closer than it was before). To do this, tweak coerce_type() to pass through the typmod information when invoking interval_in() on an UNKNOWN constant; then fix DecodeInterval to pay attention to the typmod when deciding how to interpret a units-less integer value. I changed one or two other details as well.

  I believe the code now reacts as expected by spec for all the literal syntaxes that are specifically enumerated in the spec. There are corner cases involving strings that don't exactly match the set of fields called out by the typmod, for which we might want to tweak the behavior some more; but I think this is an area of user friendliness rather than spec compliance.

  There remain some non-compliant details about the SQL syntax (as opposed to what's inside the literal string); but at least we'll throw error rather than silently doing the wrong thing in those cases.
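  A small sketch of how the trailing fields qualifier (the typmod) now steers interpretation of the literal:

      SELECT INTERVAL '5' MINUTE;                  -- a units-less value is read per the qualifier: five minutes
      SELECT INTERVAL '1-2' YEAR TO MONTH;         -- one year two months
      SELECT INTERVAL '10:20.5' MINUTE TO SECOND;  -- the min:sec.frac form mentioned above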
* Add "source file" and "source line" information to each GUC variable.Alvaro Herrera2008-09-10
| | | | | | initdb forced due to changes in the pg_settings view. Magnus Hagander and Alvaro Herrera.
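  For illustration, the new information can be read back through pg_settings; the column names below (sourcefile, sourceline) are an assumption based on the description above:

      SELECT name, setting, sourcefile, sourceline
        FROM pg_settings
       WHERE source = 'configuration file';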
* Tom Lane, 2008-09-09: Improve the plan cache invalidation mechanism to make it invalidate plans when user-defined functions used in a plan are modified. Also invalidate plans when schemas, operators, or operator classes are modified; but for these cases we just invalidate everything rather than tracking exact dependencies, since these types of objects seldom change in a production database.

  Tom Lane; loosely based on a patch by Martin Pihlak.
* Tom Lane, 2008-09-08: Fix a couple of problems pointed out by Fujii Masao in the 2008-Apr-05 patch for pg_stop_backup. First, it is possible that the history file name is not alphabetically later than the last WAL file name, so we should explicitly check that both have been archived. Second, the previous coding would wait forever if a checkpoint had managed to remove the WAL file before we look for it.

  Simon Riggs, plus some code cleanup by me.
* Tom Lane, 2008-09-08: Create a separate grantable privilege for TRUNCATE, rather than having it be always owner-only. The TRUNCATE privilege works identically to the DELETE privilege so far as interactions with the rest of the system go.

  Robert Haas
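  A minimal sketch of the grant syntax this enables (table and role names are hypothetical):

      GRANT TRUNCATE ON TABLE accounts TO batch_loader;
      -- the grantee can now run:  TRUNCATE accounts;
      REVOKE TRUNCATE ON TABLE accounts FROM batch_loader;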
* Tom Lane, 2008-09-08: Support set-returning functions in the target lists of Agg and Group plan nodes. This is a pretty ugly feature but since we don't yet have a plausible substitute, we'd better support it everywhere. Per gripe from Jeff Davis.
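  For example (hypothetical table and columns), a set-returning function can now appear in the target list of a grouped query:

      -- emits one row per counted element within each group
      SELECT customer_id, generate_series(1, count(*)) AS n
        FROM orders
       GROUP BY customer_id;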
* Tom Lane, 2008-09-07: Reimplement text_position and related functions to use Boyer-Moore-Horspool searching instead of naive matching. In the worst case this has the same O(M*N) complexity as the naive method, but the worst case is hard to hit, and the average case is very fast, especially with longer patterns.

  David Rowley
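  The SQL-level functions are unchanged; calls such as the following (arbitrary strings) simply benefit from the faster search:

      SELECT position('needle' IN repeat('haystack', 10000) || 'needle');
      SELECT strpos(repeat('haystack', 10000) || 'needle', 'needle');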
* Tom Lane, 2008-09-06: Adjust psql's new \ef command to present an empty CREATE FUNCTION template for editing if no function name is specified. This seems a much cleaner way to offer that functionality than the original patch had.

  In passing, de-clutter the error displays that are given for a bogus function-name argument, and standardize on "$function$" as the default delimiter for the function body. (The original coding would use the shortest possible dollar-quote delimiter, which seems to create unnecessarily high risk of later conflicts with the user-modified function body.)
* Tom Lane, 2008-09-06: Implement a psql command "\ef" to edit the definition of a function. In support of that, create a backend function pg_get_functiondef(). The psql command is functional but maybe a bit rough around the edges...

  Abhijit Menon-Sen
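  A quick sketch of both pieces; my_func(integer) stands in for any existing user-defined function:

      -- in psql: fetch the CREATE OR REPLACE FUNCTION text into $EDITOR
      \ef my_func(integer)

      -- the backend function it relies on, callable directly
      SELECT pg_get_functiondef('my_func(integer)'::regprocedure);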
* Tom Lane, 2008-09-05: Fix an oversight in the 8.2 patch that improved mergejoin performance by inserting a materialize node above an inner-side sort node, when the sort is expected to spill to disk. (The materialize protects the sort from having to support mark/restore, allowing it to do its final merge pass on-the-fly.) We neglected to teach cost_mergejoin about that hack, so it was failing to include the materialize's costs in the estimated cost of the mergejoin. The materialize's costs are generally going to be pretty negligible in comparison to the sort's, so this is only a small error and probably not worth back-patching; but it's still wrong.

  In the similar case where a materialize is inserted to protect an inner-side node that can't do mark/restore at all, it's still true that the materialize should not spill to disk, and so we should cost it cheaply rather than expensively.

  Noted while thinking about a question from Tom Raney.
* Peter Eisentraut, 2008-09-05: Code coverage testing with gcov. Documentation is in the regression test chapter.

  Author: Michelle Caisse <Michelle.Caisse@Sun.COM>
* Teodor Sigaev, 2008-09-04: Fix strategy propagation to scanEntry for partial match by moving propagation to initialization of scanEntry.
* Tom Lane, 2008-09-03: If a loadable module has wrong values in its magic block, spell out exactly what they are in the complaint message.

  Marko Kreen, some editorialization by me.
* Tom Lane, 2008-09-02: Prevent memory leaks in our various bison parsers when an error occurs during parsing. Formerly the parser's stack was allocated with malloc and so wouldn't be reclaimed; this patch makes it use palloc instead, so that flushing the current context will reclaim the memory. Per Marko Kreen.
* Tom Lane, 2008-09-01: Add a bunch of new error location reports to parse-analysis error messages. There are still some weak spots around JOIN USING and relation alias lists, but most errors reported within backend/parser/ now have locations.
* Heikki Linnakangas, 2008-09-01: HeapTupleHeaderAdjustCmax made the incorrect assumption that the raw command id is the cmin, when it can in fact be a combo cid. That made rows incorrectly invisible to a transaction where a tuple was deleted by multiple aborted subtransactions.

  Report and patch by Karl Schnaitter. Back-patch to 8.3, where combo cids were introduced.
* Tom Lane, 2008-08-30: Fix the raw-parsetree representation of star (as in SELECT * FROM or SELECT foo.*) so that it cannot be confused with a quoted identifier "*". Instead create a separate node type A_Star to represent this notation. Per pgsql-hackers discussion of 2007-Sep-27.
* Tom Lane, 2008-08-29: In GCC-based builds, use a better newNode() macro that relies on GCC-specific syntax to avoid a useless store into a global variable. Per experimentation, this works better than my original thought of trying to push the code into an out-of-line subroutine.
* Tom Lane, 2008-08-29: Suppress gcc warning about possibly-uninitialized variable. It's not clear to me why I'd not seen this message before --- on F-9 it seems to only happen if Asserts are disabled, which ought to be irrelevant. Maybe that affects a decision whether to inline get_ten(), which would be needed to expose the warning condition to the compiler? Anyway, the fix is clear.
* Peter Eisentraut, 2008-08-29: Remove all traces that suggest that a non-Bison yacc might be supported, and change build system to use only Bison. Simplify build rules, make file names uniform. Don't build the token table header file where it is not needed.
* Tom Lane, 2008-08-28: Extend the parser location infrastructure to include a location field in most node types used in expression trees (both before and after parse analysis). This allows us to place an error cursor in many situations where we formerly could not, because the information wasn't available beyond the very first level of parse analysis. There's a fair amount of work still to be done to persuade individual ereport() calls to actually include an error location, but this gets the initdb-forcing part of the work out of the way; and the situation is already markedly better than before for complaints about unimplementable implicit casts, such as CASE and UNION constructs with incompatible alternative data types. Per my proposal of a few days ago.
* Tom Lane, 2008-08-26: Teach eval_const_expressions() to simplify an ArrayCoerceExpr to a constant when its input is constant and the element coercion function is immutable (or nonexistent, ie, binary-coercible case). This is an oversight in the 8.3 implementation of ArrayCoerceExpr, and its result is that certain cases involving IN or NOT IN with constants don't get optimized as they should be. Per experimentation with an example from Ow Mun Heng.
* Tom Lane, 2008-08-25: Move exprType(), exprTypmod(), expression_tree_walker(), and related routines into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside the backend. There's probably more that should be done along this line, but this is a start anyway.
* Tom Lane, 2008-08-25: Get rid of the last remaining uses of var_is_rel(), to wit some debugging checks in ExecIndexBuildScanKeys() that were inadequate anyway: it's better to verify the correct varno on an expected index key, not just reject OUTER and INNER. This makes the entire current contents of nodeFuncs.c dead code. I'll be replacing it with some other stuff later, as per recent proposal.
* Magnus Hagander, 2008-08-25: Unconditionally write the statsfile when SIGHUP is received, to minimize the window during which backends have no statistics file to read.
* Alvaro Herrera, 2008-08-25: Update URL to Ross Williams' paper.

  Devrim Gunduz.
* Magnus Hagander, 2008-08-25: Make stats_temp_directory PGC_SIGHUP, and document how it may cause a temporary "outage" of the statistics views. This requires making the stats collector respond to SIGHUP, like the other utility processes already did.
* Magnus Hagander, 2008-08-25: Convert remaining builtin set-returning functions to use OUT parameters, making it possible to call them without specifying a column list.

  Jaime Casanova
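  The difference in call style, sketched with a stand-in name (some_builtin_srf is hypothetical; the entry above does not say which functions were converted):

      -- before: a record-returning SRF needed an explicit column definition list
      SELECT * FROM some_builtin_srf() AS t(a int, b text);
      -- after: OUT parameters supply the result columns automatically
      SELECT * FROM some_builtin_srf();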
* Bruce Momjian, 2008-08-23: Add missing descriptions for aggregates, functions and conversions.

  Bernd Helmle
* Teodor Sigaev, 2008-08-23: Fix possible duplicate tuples during a GiST scan. Now a page is processed at once and ItemPointers are collected in memory. Remove tuple's killing by killtuple() if the tuple was moved to another page - it could produce unacceptable overhead.

  Backpatch up to 8.1 because the bug was introduced by GiST's concurrency support.
* Make "log_temp_files" super-user set only, like other logging options.Bruce Momjian2008-08-22
| | | | Simon Riggs
* Bruce Momjian, 2008-08-22: Minor patch on pgbench:

  1. The -i option should run vacuum analyze only on pgbench tables, not *all* tables in the database.
  2. The pre-run cleanup step was DELETE FROM HISTORY then VACUUM HISTORY. This is just a slow version of TRUNCATE HISTORY.

  Simon Riggs
* Bruce Momjian, 2008-08-22: Improve wording of error message when a postgresql.conf setting is ignored because it can only be set at server start.
* Tom Lane, 2008-08-22: Arrange to convert EXISTS subqueries that are equivalent to hashable IN subqueries into the same thing you'd have gotten from IN (except always with unknownEqFalse = true, so as to get the proper semantics for an EXISTS). I believe this fixes the last case within CVS HEAD in which an EXISTS could give worse performance than an equivalent IN subquery.

  The tricky part of this is that if the upper query probes the EXISTS for only a few rows, the hashing implementation can actually be worse than the default, and therefore we need to make a cost-based decision about which way to use. But at the time when the planner generates plans for subqueries, it doesn't really know how many times the subquery will be executed. The least invasive solution seems to be to generate both plans and postpone the choice until execution. Therefore, in a query that has been optimized this way, EXPLAIN will show two subplans for the EXISTS, of which only one will actually get executed.

  There is a lot more that could be done based on this infrastructure: in particular it's interesting to consider switching to the hash plan if we start out using the non-hashed plan but find a lot more upper rows going by than we expected. I have therefore left some minor inefficiencies in place, such as initializing both subplans even though we will currently only use one.
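  A sketch of the shape of query this applies to (table and column names are hypothetical); per the text above, EXPLAIN on such a query can show two alternative subplans for the EXISTS:

      EXPLAIN
      SELECT *
        FROM orders o
       WHERE EXISTS (SELECT 1 FROM order_lines l WHERE l.order_id = o.id);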
* Tom Lane, 2008-08-20: Marginal improvement in sublink planning: allow unknownEqFalse optimization to be used for SubLinks that are underneath a top-level OR clause. Just as at the very top level of WHERE, it's not necessary to be accurate about whether the sublink returns FALSE or NULL, because either result has the same impact on whether the WHERE will succeed.
* Tom Lane, 2008-08-20: Fix obsolete comment. It's no longer the case that Param nodes don't carry typmod.
* Tom Lane, 2008-08-19: Cause the output from debug_print_parse, debug_print_rewritten, and debug_print_plan to appear at LOG message level, not DEBUG1 as historically. Make debug_pretty_print default to on. Also, cause plans generated via EXPLAIN to be subject to debug_print_plan. This is all to make debug_print_plan a reasonably comfortable substitute for the former behavior of EXPLAIN VERBOSE.
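  A short sketch of using these settings in a session (the query itself is arbitrary):

      SET client_min_messages = log;   -- plan dumps now arrive at LOG level
      SET debug_print_plan = on;
      SET debug_pretty_print = on;     -- now the default anyway
      EXPLAIN SELECT 1;                -- EXPLAIN output is also subject to debug_print_plan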
* Tom Lane, 2008-08-17: Add some defenses against constant-FALSE outer join conditions. Since eval_const_expressions will generally throw away anything that's ANDed with constant FALSE, what we're left with given an example like

      select * from tenk1 a where (unique1,0) in (select unique2,1 from tenk1 b);

  is a cartesian product computation, which is really not acceptable. This is a regression in CVS HEAD compared to previous releases, which were able to notice the impossible join condition in this case --- though not in some related cases that are also improved by this patch, such as

      select * from tenk1 a left join tenk1 b on (a.unique1=b.unique2 and 0=1);

  Fix by skipping evaluation of the appropriate side of the outer join in cases where it's demonstrably unnecessary.
* Tom Lane, 2008-08-17: Remove prohibition against SubLinks in the WHERE clause of an EXISTS subquery that we're considering pulling up. I hadn't wanted to think through whether that could work during the first pass at this stuff. However, on closer inspection it seems to be safe enough.
* Tom Lane, 2008-08-17: Improve sublink pullup code to handle ANY/EXISTS sublinks that are at top level of a JOIN/ON clause, not only at top level of WHERE. (However, we can't do this in an outer join's ON clause, unless the ANY/EXISTS refers only to the nullable side of the outer join, so that it can effectively be pushed down into the nullable side.) Per request from Kevin Grittner.

  In passing, fix a bug in the initial implementation of EXISTS pullup: it would Assert if the EXISTS's WHERE clause used a join alias variable. Since we haven't yet flattened join aliases when this transformation happens, it's necessary to include join relids in the computed set of RHS relids.
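  A rough example (all names hypothetical) of a sublink at the top level of a JOIN/ON clause that can now be pulled up:

      SELECT *
        FROM customers c
        JOIN orders o
          ON o.customer_id = c.id
         AND o.product_id IN (SELECT id FROM products WHERE discontinued);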
* Tom Lane, 2008-08-16: Clean up the loose ends in selectivity estimation left by my patch for semi and anti joins. To do this, pass the SpecialJoinInfo struct for the current join as an additional optional argument to operator join selectivity estimation functions. This allows the estimator to tell not only what kind of join is being formed, but which variable is on which side of the join; a requirement long recognized but not dealt with till now. This also leaves the door open for future improvements in the estimators, such as accounting for the null-insertion effects of lower outer joins. I didn't do anything about that in the current patch but the information is in principle deducible from what's passed.

  The patch also clarifies the definition of join selectivity for semi/anti joins: it's the fraction of the left input that has (at least one) match in the right input. This allows getting rid of some very fuzzy thinking that I had committed in the original 7.4-era IN-optimization patch. There's probably room to estimate this better than the present patch does, but at least we know what to estimate.

  Since I had to touch CREATE OPERATOR anyway to allow a variant signature for join estimator functions, I took the opportunity to add a couple of additional checks that were missing, per my recent message to -hackers:

  * Check that estimator functions return float8;
  * Require execute permission at the time of CREATE OPERATOR on the operator's function as well as the estimator functions;
  * Require ownership of any pre-existing operator that's modified by the command.

  I also moved the lookup of the functions out of OperatorCreate() and into operatorcmds.c, since that seemed more consistent with most of the other catalog object creation processes, eg CREATE TYPE.
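  For reference, the estimators in question are the ones attached through CREATE OPERATOR's RESTRICT and JOIN clauses; a minimal sketch using existing builtin support functions (the operator name is arbitrary):

      CREATE OPERATOR === (
          LEFTARG   = text,
          RIGHTARG  = text,
          PROCEDURE = texteq,
          RESTRICT  = eqsel,      -- restriction selectivity estimator
          JOIN      = eqjoinsel   -- join selectivity estimator; must return float8
      );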
* Tom Lane, 2008-08-15: Performance fix for new anti-join code in nodeMergejoin.c: after finding a match in antijoin mode, we should advance to next outer tuple not next inner. We know we don't want to return this outer tuple, and there is no point in advancing over matching inner tuples now, because we'd just have to do it again if the next outer tuple has the same merge key. This makes a noticeable difference if there are lots of duplicate keys in both inputs.

  Similarly, after finding a match in semijoin mode, arrange to advance to the next outer tuple after returning the current match; or immediately, if it fails the extra quals. The rationale is the same. (This is a performance bug in existing releases; perhaps worth back-patching? The planner tries to avoid using mergejoin with lots of duplicates, so it may not be a big issue in practice.)

  Nestloop and hash got this right to start with, but I made some cosmetic adjustments there to make the corresponding bits of logic look more similar.
* Magnus Hagander, 2008-08-15: Make the temporary directory for pgstat files configurable by the GUC variable stats_temp_directory, instead of requiring the admin to mount/symlink the pg_stat_tmp directory manually. For now the config variable is PGC_POSTMASTER. Room for further improvement that would allow it to be changed on-the-fly.