aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils/adt
Commit message (Collapse)AuthorAge
* Update copyright notices for year 2012.Bruce Momjian2012-01-01
|
* Standardize treatment of strcmp() return valuePeter Eisentraut2011-12-27
| | | | | Always compare the return value to 0, don't use cute tricks like if (!strcmp(...)).
* Rethink representation of index clauses' mapping to index columns.Tom Lane2011-12-24
| | | | | | | | | | | | | | | | | | | | | In commit e2c2c2e8b1df7dfdb01e7e6f6191a569ce3c3195 I made use of nested list structures to show which clauses went with which index columns, but on reflection that's a data structure that only an old-line Lisp hacker could love. Worse, it adds unnecessary complication to the many places that don't much care which clauses go with which index columns. Revert to the previous arrangement of flat lists of clauses, and instead add a parallel integer list of column numbers. The places that care about the pairing can chase both lists with forboth(), while the places that don't care just examine one list the same as before. The only real downside to this is that there are now two more lists that need to be passed to amcostestimate functions in case they care about column matching (which btcostestimate does, so not passing the info is not an option). Rather than deal with 11-argument amcostestimate functions, pass just the IndexPath and expect the functions to extract fields from it. That gets us down to 7 arguments which is better than 11, and it seems more future-proof against likely additions to the information we keep about an index path.
* Improve planner's handling of duplicated index column expressions.Tom Lane2011-12-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It's potentially useful for an index to repeat the same indexable column or expression in multiple index columns, if the columns have different opclasses. (If they share opclasses too, the duplicate column is pretty useless, but nonetheless we've allowed such cases since 9.0.) However, the planner failed to cope with this, because createplan.c was relying on simple equal() matching to figure out which index column each index qual is intended for. We do have that information available upstream in indxpath.c, though, so the fix is to not flatten the multi-level indexquals list when putting it into an IndexPath. Then we can rely on the sublist structure to identify target index columns in createplan.c. There's a similar issue for index ORDER BYs (the KNNGIST feature), so introduce a multi-level-list representation for that too. This adds a bit more representational overhead, but we might more or less buy that back by not having to search for matching index columns anymore in createplan.c; likewise btcostestimate saves some cycles. Per bug #6351 from Christian Rudolph. Likely symptoms include the "btree index keys must be ordered by attribute" failure shown there, as well as "operator MMMM is not a member of opfamily NNNN". Although this is a pre-existing problem that can be demonstrated in 9.0 and 9.1, I'm not going to back-patch it, because the API changes in the planner seem likely to break things such as index plugins. The corner cases where this matters seem too narrow to justify possibly breaking things in a minor release.
* Add bytea_agg, parallel to string_agg.Robert Haas2011-12-23
| | | | Pavel Stehule
* Add a security_barrier option for views.Robert Haas2011-12-22
| | | | | | | | | | | | | | When a view is marked as a security barrier, it will not be pulled up into the containing query, and no quals will be pushed down into it, so that no function or operator chosen by the user can be applied to rows not exposed by the view. Views not configured with this option cannot provide robust row-level security, but will perform far better. Patch by KaiGai Kohei; original problem report by Heikki Linnakangas (in October 2009!). Review (in earlier versions) by Noah Misch and others. Design advice by Tom Lane and myself. Further review and cleanup by me.
* Shave a few cycles in string_agg().Robert Haas2011-12-21
| | | | Pavel Stehule
* Fix gincostestimate to handle ScalarArrayOpExpr reasonably.Tom Lane2011-12-20
| | | | | | | | | | | | | | | | | | The original coding of this function overlooked the possibility that it could be passed anything except simple OpExpr indexquals. But ScalarArrayOpExpr is possible too, and the code would probably crash (and surely give ridiculous answers) in such a case. Add logic to try to estimate sanely for such cases. In passing, fix the treatment of inner-indexscan cost estimation: it was failing to scale up properly for multiple iterations of a nestloop. (I think somebody might've thought that index_pages_fetched() is linear, but of course it's not.) Report, diagnosis, and preliminary patch by Marti Raudsepp; I refactored it a bit and fixed the cost estimation. Back-patch into 9.1 where the bogus code was introduced.
* Add support for privileges on typesPeter Eisentraut2011-12-20
| | | | | | | | | This adds support for the more or less SQL-conforming USAGE privilege on types and domains. The intent is to be able restrict which users can create dependencies on types, which restricts the way in which owners can alter types. reviewed by Yeb Havinga
* Add SP-GiST (space-partitioned GiST) index access method.Tom Lane2011-12-17
| | | | | | | | | | | | SP-GiST is comparable to GiST in flexibility, but supports non-balanced partitioned search structures rather than balanced trees. As described at PGCon 2011, this new indexing structure can beat GiST in both index build time and query speed for search problems that it is well matched to. There are a number of areas that could still use improvement, but at this point the code seems committable. Teodor Sigaev and Oleg Bartunov, with considerable revisions by Tom Lane
* Revert the behavior of inet/cidr functions to not unpack the arguments.Heikki Linnakangas2011-12-12
| | | | | | | | | | | I forgot to change the functions to use the PG_GETARG_INET_PP() macro, when I changed DatumGetInetP() to unpack the datum, like Datum*P macros usually do. Also, I screwed up the definition of the PG_GETARG_INET_PP() macro, and didn't notice because it wasn't used. This fixes the memory leak when sorting inet values, as reported by Jochen Erwied and debugged by Andres Freund. Backpatch to 8.3, like the previous patch that broke it.
* Miscellaneous cleanup to silence compiler warnings seen on Mingw.Andrew Dunstan2011-12-10
| | | | | Remove some dead code, conditionally declare some items or call some code, and fix one or two declarations.
* Fix corner cases in readlink() usage.Tom Lane2011-12-07
| | | | | | Make sure all calls are protected by HAVE_READLINK, and get the buffer overflow tests right. Be a bit more paranoid about string length in _tarWriteHeader(), too.
* Better error reporting if the link target is too longMagnus Hagander2011-12-07
| | | | | This situation won't set errno, so using %m will give an incorrect error message.
* Remove spclocation field from pg_tablespaceMagnus Hagander2011-12-07
| | | | | | | | Instead, add a function pg_tablespace_location(oid) used to return the same information, and do this by reading the symbolic link. Doing it this way makes it possible to relocate a tablespace when the database is down by simply changing the symbolic link.
* Create a "sort support" interface API for faster sorting.Tom Lane2011-12-07
| | | | | | | | | | | | This patch creates an API whereby a btree index opclass can optionally provide non-SQL-callable support functions for sorting. In the initial patch, we only use this to provide a directly-callable comparator function, which can be invoked with a bit less overhead than the traditional SQL-callable comparator. While that should be of value in itself, the real reason for doing this is to provide a datatype-extensible framework for more aggressive optimizations, as in Peter Geoghegan's recent work. Robert Haas and Tom Lane
* Improve table locking behavior in the face of current DDL.Robert Haas2011-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the previous coding, callers were faced with an awkward choice: look up the name, do permissions checks, and then lock the table; or look up the name, lock the table, and then do permissions checks. The first choice was wrong because the results of the name lookup and permissions checks might be out-of-date by the time the table lock was acquired, while the second allowed a user with no privileges to interfere with access to a table by users who do have privileges (e.g. if a malicious backend queues up for an AccessExclusiveLock on a table on which AccessShareLock is already held, further attempts to access the table will be blocked until the AccessExclusiveLock is obtained and the malicious backend's transaction rolls back). To fix, allow callers of RangeVarGetRelid() to pass a callback which gets executed after performing the name lookup but before acquiring the relation lock. If the name lookup is retried (because invalidation messages are received), the callback will be re-executed as well, so we get the best of both worlds. RangeVarGetRelid() is renamed to RangeVarGetRelidExtended(); callers not wishing to supply a callback can continue to invoke it as RangeVarGetRelid(), which is now a macro. Since the only one caller that uses nowait = true now passes a callback anyway, the RangeVarGetRelid() macro defaults nowait as well. The callback can also be used for supplemental locking - for example, REINDEX INDEX needs to acquire the table lock before the index lock to reduce deadlock possibilities. There's a lot more work to be done here to fix all the cases where this can be a problem, but this commit provides the general infrastructure and fixes the following specific cases: REINDEX INDEX, REINDEX TABLE, LOCK TABLE, and and DROP TABLE/INDEX/SEQUENCE/VIEW/FOREIGN TABLE. Per discussion with Noah Misch and Alvaro Herrera.
* Improve GiST range-contained-by searches by adding a flag for empty ranges.Tom Lane2011-11-27
| | | | | | | | | | | | | | | | | | | | | | | In the original implementation, a range-contained-by search had to scan the entire index because an empty range could be lurking anywhere. Improve that by adding a flag to upper GiST entries that says whether the represented subtree contains any empty ranges. Also, make a simple mod to the penalty function to discourage empty ranges from getting pushed into subtrees without any. This needs more work, and the picksplit function should be taught about it too, but that code can be improved without causing an on-disk compatibility break; so we'll leave it for another day. Since we're breaking on-disk compatibility of range values anyway, I took the opportunity to reorganize the range flags bits; the unused RANGE_xB_NULL bits are now adjacent, which might open the door for using them in some other way later. In passing, remove the GiST range opclass entry for <>, which doesn't seem like it can really be indexed usefully. Alexander Korotkov, with some editorializing by Tom
* Make GiST index searches smarter about queries against empty ranges.Tom Lane2011-11-26
| | | | | | | In the cases where the result of the called proc is negated, we should explicitly test both inputs for empty, to ensure we'll never return "true" for an unsatisfiable query. In other cases we can rely on the called proc to say the right thing.
* Adjust range_adjacent to support different canonicalization rules.Tom Lane2011-11-23
| | | | | | | | | | | The original coding would not work for discrete ranges in which the canonicalization rule is to produce symmetric boundaries (either [] or () style), as noted by Jeff Davis. Florian Pflug pointed out that we could fix that by invoking the canonicalization function to see if the range "between" the two given ranges normalizes to empty. This implementation of Florian's idea is a tad slower than the original code, but only in the case where there actually is a canonicalization function --- if not, it's essentially the same logic as before.
* Remove user-selectable ANALYZE option for range types.Tom Lane2011-11-23
| | | | | | | | | It's not clear that a per-datatype typanalyze function would be any more useful than a generic typanalyze for ranges. What *is* clear is that letting unprivileged users select typanalyze functions is a crash risk or worse. So remove the option from CREATE TYPE AS RANGE, and instead put in a generic typanalyze function for ranges. The generic function does nothing as yet, but hopefully we'll improve that before 9.2 release.
* Remove zero- and one-argument range constructor functions.Tom Lane2011-11-22
| | | | | | | | | | | | Per discussion, the zero-argument forms aren't really worth the catalog space (just write 'empty' instead). The one-argument forms have some use, but they also have a serious problem with looking too much like functional cast notation; to the point where in many real use-cases, the parser would misinterpret what was wanted. Committing this as a separate patch, with the thought that we might want to revert part or all of it if we can think of some way around the cast ambiguity.
* Improve implementation of range-contains-element tests.Tom Lane2011-11-22
| | | | | | | | | | | | Implement these tests directly instead of constructing a singleton range and then applying range-contains. This saves a range serialize/deserialize cycle as well as a couple of redundant bound-comparison steps, and adds very little code on net. Remove elem_contained_by_range from the GiST opclass: it doesn't belong there because there is no way to use it in an index clause (where the indexed column would have to be on the left). Its commutator is in the opclass, and that's what counts.
* Still more review for range-types patch.Tom Lane2011-11-22
| | | | | | | | | | Per discussion, relax the range input/construction rules so that the only hard error is lower bound > upper bound. Cases where the lower bound is <= upper bound, but the range nonetheless normalizes to empty, are now permitted. Fix core dump in range_adjacent when bounds are infinite. Marginal cleanup of regression test cases, some more code commenting.
* More code review for rangetypes patch.Tom Lane2011-11-21
| | | | | | | | | | | Fix up some infelicitous coding in DefineRange, and add some missing error checks. Rearrange operator strategy number assignments for GiST anyrange opclass so that they don't make such a mess of opr_sanity's table of operator names associated with different strategy numbers. Assign hopefully-temporary selectivity estimators to range operators that didn't have one --- poor as the estimates are, they're still a lot better than the default 0.5 estimate, and they'll shut up the opr_sanity test that wants to see selectivity estimators on all built-in operators.
* Fix range_cmp_bounds for the case of equal-valued exclusive bounds.Tom Lane2011-11-17
| | | | | | Also improve its comments and related regression tests. Jeff Davis, with some further adjustments by Tom
* Improve caching in range type I/O functions.Tom Lane2011-11-15
| | | | | Cache the the element type's I/O info across calls, not only the range type's info. In passing, also clean up hash_range a bit more.
* Restructure function-internal caching in the range type code.Tom Lane2011-11-15
| | | | | | | | | | | | | | | | | | | Move the responsibility for caching specialized information about range types into the type cache, so that the catalog lookups only have to occur once per session. Rearrange APIs a bit so that fn_extra caching is actually effective in the GiST support code. (Use of OidFunctionCallN is bad enough for performance in itself, but it also prevents the function from exploiting fn_extra caching.) The range I/O functions are still not very bright about caching repeated lookups, but that seems like material for a separate patch. Also, avoid unnecessary use of memcpy to fetch/store the range type OID and flags, and don't use the full range_deserialize machinery when all we need to see is the flags value. Also fix API error in range_gist_penalty --- it was failing to set *penalty for any case involving an empty range.
* Fix alignment and toasting bugs in range types.Tom Lane2011-11-14
| | | | | | | | | | | | | | | | | | | | | A range type whose element type has 'd' alignment must have 'd' alignment itself, else there is no guarantee that the element value can be used in-place. (Because range_deserialize uses att_align_pointer which forcibly aligns the given pointer, violations of this rule did not lead to SIGBUS but rather to garbage data being extracted, as in one of the added regression test cases.) Also, you can't put a toast pointer inside a range datum, since the referenced value could disappear with the range datum still present. For consistency with the handling of arrays and records, I also forced decompression of in-line-compressed bound values. It would work to store them as-is, but our policy is to avoid situations that might result in double compression. Add assorted regression tests for this, and bump catversion because of fixes to built-in pg_type entries. Also some marginal cleanup of inconsistent/unnecessary error checks.
* Return NULL instead of throwing error when desired bound is not available.Tom Lane2011-11-14
| | | | | | | | Change range_lower and range_upper to return NULL rather than throwing an error when the input range is empty or the relevant bound is infinite. Per discussion, throwing an error seems likely to be unduly hard to work with. Also, this is more consistent with the behavior of the constructors, which treat NULL as meaning an infinite bound.
* Return FALSE instead of throwing error for comparisons with empty ranges.Tom Lane2011-11-14
| | | | | | | | | | | | | | | | Change range_before, range_after, range_adjacent to return false rather than throwing an error when one or both input ranges are empty. The original definition is unnecessarily difficult to use, and also can result in undesirable planner failures since the planner could try to compare an empty range to something else while deriving statistical estimates. (This was, in fact, the cause of repeatable regression test failures on buildfarm member jaguar, as well as intermittent failures elsewhere.) Also tweak rangetypes regression test to not drop all the objects it creates, so that the final state of the regression database contains some rangetype objects for pg_dump testing.
* Fix copyright notices, other minor editing in new range-types code.Tom Lane2011-11-14
| | | | | | | No functional changes in this commit (except I could not resist the temptation to re-word a couple of error messages). This is just manual cleanup after pgindent to make the code look reasonably like other PG code, in preparation for more detailed code review to come.
* Rerun pgindent with updated typedef list.Bruce Momjian2011-11-14
|
* Run pgindent on range type files, per request from Tom.Bruce Momjian2011-11-14
|
* Make DatumGetInetP() unpack inet datums with a 1-byte header, and addHeikki Linnakangas2011-11-08
| | | | | | | a new macro, DatumGetInetPP(), that does not. This brings these macros in line with other DatumGet*P() macros. Backpatch to 8.3, where 1-byte header varlenas were introduced.
* Fix timestamp range subdiff functions, when using float datetimes.Heikki Linnakangas2011-11-07
|
* Support range data types.Heikki Linnakangas2011-11-03
| | | | | | | Selectivity estimation functions are missing for some range type operators, which is a TODO. Jeff Davis
* Support more locale-specific formatting options in cash_out().Tom Lane2011-10-30
| | | | | | | | | | | | | | | The POSIX spec defines locale fields for controlling the ordering of the value, sign, and currency symbol in monetary output, but cash_out only supported a small subset of these options. Fully implement p/n_sign_posn, p/n_cs_precedes, and p/n_sep_by_space per spec. Fix up cash_in so that it will accept all these format variants. Also, make sure that thousands_sep is only inserted to the left of the decimal point, as required by spec. Per bug #6144 from Eduard Kracmar and discussion of bug #6277. This patch includes some ideas from Alexander Lakhin's proposed patch, though it is very different in detail.
* Further improvement of make_greater_string.Tom Lane2011-10-30
| | | | | | | | | Make sure that it considers all the possibilities that the old code did, instead of trying only one possibility per character position. To keep the runtime in bounds, instead tweak the character incrementers to not try every possible multibyte character code. Remove unnecessary logic to restore the old character value on failure. Additional comment and formatting cleanup.
* Fix assorted bogosities in cash_in() and cash_out().Tom Lane2011-10-29
| | | | | | | | | | | | | cash_out failed to handle multiple-byte thousands separators, as per bug #6277 from Alexander Law. In addition, cash_in didn't handle that either, nor could it handle multiple-byte positive_sign. Both routines failed to support multiple-byte mon_decimal_point, which I did not think was worth changing, but at least now they check for the possibility and fall back to using '.' rather than emitting invalid output. Also, make cash_in handle trailing negative signs, which formerly it would reject. Since cash_out generates trailing negative signs whenever the locale tells it to, this last omission represents a fail-to-reload-dumped-data bug. IMO that justifies patching this all the way back.
* Improve make_greater_string() with encoding-specific incrementers.Robert Haas2011-10-29
| | | | | | | | | This infrastructure doesn't in any way guarantee that the character we produce will sort before the one we incremented; but it does at least make it much more likely that we'll end up with something that is a valid character, which improves our chances. Kyotaro Horiguchi, with various adjustments by me.
* Don't trust deferred-unique indexes for join removal.Tom Lane2011-10-23
| | | | | | | | | | | The uniqueness condition might fail to hold intra-transaction, and assuming it does can give incorrect query results. Per report from Marti Raudsepp, though this is not his proposed patch. Back-patch to 9.0, where both these features were introduced. In the released branches, add the new IndexOptInfo field to the end of the struct, to try to minimize ABI breakage for third-party code that may be examining that struct.
* Code review for pgstat_get_crashed_backend_activity patch.Tom Lane2011-10-21
| | | | | | | Avoid possibly dumping core when pgstat_track_activity_query_size has a less-than-default value; avoid uselessly searching for the query string of a successfully-exited backend; don't bother putting out an ERRDETAIL if we don't have a query to show; some other minor stylistic improvements.
* Try to log current the query string when a backend crashes.Robert Haas2011-10-21
| | | | | | | | | | | | | | | | To avoid minimize risk inside the postmaster, we subject this feature to a number of significant limitations. We very much wish to avoid doing any complex processing inside the postmaster, due to the posssibility that the crashed backend has completely corrupted shared memory. To that end, no encoding conversion is done; instead, we just replace anything that doesn't look like an ASCII character with a question mark. We limit the amount of data copied to 1024 characters, and carefully sanity check the source of that data. While these restrictions would doubtless be unacceptable in a general-purpose logging facility, even this limited facility seems like an improvement over the status quo ante. Marti Raudsepp, reviewed by PDXPUG and myself
* Teach btree to handle ScalarArrayOpExpr quals natively.Tom Lane2011-10-16
| | | | | This allows "indexedcol op ANY(ARRAY[...])" conditions to be used in plain indexscans, and particularly in index-only scans.
* Rearrange the implementation of index-only scans.Tom Lane2011-10-11
| | | | | | | | | | | | | | | | | | | | | | This commit changes index-only scans so that data is read directly from the index tuple without first generating a faux heap tuple. The only immediate benefit is that indexes on system columns (such as OID) can be used in index-only scans, but this is necessary infrastructure if we are ever to support index-only scans on expression indexes. The executor is now ready for that, though the planner still needs substantial work to recognize the possibility. To do this, Vars in index-only plan nodes have to refer to index columns not heap columns. I introduced a new special varno, INDEX_VAR, to mark such Vars to avoid confusion. (In passing, this commit renames the two existing special varnos to OUTER_VAR and INNER_VAR.) This allows ruleutils.c to handle them with logic similar to what we use for subplan reference Vars. Since index-only scans are now fundamentally different from regular indexscans so far as their expression subtrees are concerned, I also chose to change them to have their own plan node type (and hence, their own executor source file).
* Improve and simplify CREATE EXTENSION's management of GUC variables.Tom Lane2011-10-05
| | | | | | | | | | | | | | | | | | | | | CREATE EXTENSION needs to transiently set search_path, as well as client_min_messages and log_min_messages. We were doing this by the expedient of saving the current string value of each variable, doing a SET LOCAL, and then doing another SET LOCAL with the previous value at the end of the command. This is a bit expensive though, and it also fails badly if there is anything funny about the existing search_path value, as seen in a recent report from Roger Niederland. Fortunately, there's a much better way, which is to piggyback on the GUC infrastructure previously developed for functions with SET options. We just open a new GUC nesting level, do our assignments with GUC_ACTION_SAVE, and then close the nesting level when done. This automatically restores the prior settings without a re-parsing pass, so (in principle anyway) there can't be an error. And guc.c still takes care of cleanup in event of an error abort. The CREATE EXTENSION code for this was modeled on some much older code in ri_triggers.c, which I also changed to use the better method, even though there wasn't really much risk of failure there. Also improve the comments in guc.c to reflect this additional usage.
* Improve define_custom_variable's handling of pre-existing settings.Tom Lane2011-10-04
| | | | | | | | | | | | | | | | Arrange for any problems with pre-existing settings to be reported as WARNING not ERROR, so that we don't undesirably abort the loading of the incoming add-on module. The bad setting is just discarded, as though it had never been applied at all. (This requires a change in the API of set_config_option. After some thought I decided the most potentially useful addition was to allow callers to just pass in a desired elevel.) Arrange to restore the complete stacked state of the variable, rather than cheesily reinstalling only the active value. This ensures that custom GUCs will behave unsurprisingly even when the module loading operation occurs within nested subtransactions that have changed the active value. Since a module load could occur as a result of, eg, a PL function call, this is not an unlikely scenario.
* Fix recursion into previously planned sub-query in examine_simple_variable.Tom Lane2011-09-29
| | | | | | | | This code was looking at the sub-Query tree as seen in the parent query's RangeTblEntry; but that's the pristine parser output, and what we need to look at is the tree as it stands at the completion of planning. Otherwise we might pick up a Var that references a subquery that got flattened and hence has no RelOptInfo in the subroot. Per report from Peter Geoghegan.
* Suppress "unused function" warning when not HAVE_LOCALE_T.Tom Lane2011-09-20
| | | | Forgot to consider this case ...