aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils/adt
Commit message (Collapse)AuthorAge
* Process variadic arguments consistently in json functionsAndrew Dunstan2017-10-25
| | | | | | | | | | | | json_build_object and json_build_array and the jsonb equivalents did not correctly process explicit VARIADIC arguments. They are modified to use the new extract_variadic_args() utility function which abstracts away the details of the call method. Michael Paquier, reviewed by Tom Lane and Dmitry Dolgov. Backpatch to 9.5 for the jsonb fixes and 9.4 for the json fixes, as that's where they originated.
* Doc: fix missing explanation of default object privileges.Tom Lane2017-10-11
| | | | | | | | | | | | | | The GRANT reference page, which lists the default privileges for new objects, failed to mention that USAGE is granted by default for data types and domains. As a lesser sin, it also did not specify anything about the initial privileges for sequences, FDWs, foreign servers, or large objects. Fix that, and add a comment to acldefault() in the probably vain hope of getting people to maintain this list in future. Noted by Laurenz Albe, though I editorialized on the wording a bit. Back-patch to all supported branches, since they all have this behavior. Discussion: https://postgr.es/m/1507620895.4152.1.camel@cybertec.at
* Fix behavior when converting a float infinity to numeric.Tom Lane2017-09-27
| | | | | | | | | | | | | | | | | | | | | | | | | | float8_numeric() and float4_numeric() failed to consider the possibility that the input is an IEEE infinity. The results depended on the platform-specific behavior of sprintf(): on most platforms you'd get something like ERROR: invalid input syntax for type numeric: "inf" but at least on Windows it's possible for the conversion to succeed and deliver a finite value (typically 1), due to a nonstandard output format from sprintf and lack of syntax error checking in these functions. Since our numeric type lacks the concept of infinity, a suitable conversion is impossible; the best thing to do is throw an explicit error before letting sprintf do its thing. While at it, let's use snprintf not sprintf. Overrunning the buffer should be impossible if sprintf does what it's supposed to, but this is cheap insurance against a stack smash if it doesn't. Problem reported by Taiki Kondo. Patch by me based on fix suggestion from KaiGai Kohei. Back-patch to all supported branches. Discussion: https://postgr.es/m/12A9442FBAE80D4E8953883E0B84E088C8C7A2@BPXM01GP.gisp.nec.co.jp
* Fix datumSerialize infrastructure to not crash on non-varlena data.Tom Lane2017-08-08
| | | | | | | | | | | | | | | | Commit 1efc7e538 did a poor job of emulating existing logic for touching Datums that might be expanded-object pointers. It didn't check for typlen being -1 first, which meant it could crash on fixed-length pass-by-ref values, and probably on cstring values as well. It also didn't use DatumGetPointer before VARATT_IS_EXTERNAL_EXPANDED, which while currently harmless is not according to documentation nor prevailing style. I also think the lack of any explanation as to why datumSerialize makes these particular nonobvious choices is pretty awful, so fix that. Per report from Jarred Ward. Back-patch to 9.6 where this code came in. Discussion: https://postgr.es/m/6F61E6D2-2F5E-4794-9479-A429BE1CEA4B@simple.com
* Ensure that pg_get_ruledef()'s output matches pg_get_viewdef()'s.Tom Lane2017-07-24
| | | | | | | | | | | | | | | | | | | | | Various cases involving renaming of view columns are handled by having make_viewdef pass down the view's current relation tupledesc to get_query_def, which then takes care to use the column names from the tupledesc for the output column names of the SELECT. For some reason though, we'd missed teaching make_ruledef to do similarly when it is printing an ON SELECT rule, even though this is exactly the same case. The results from pg_get_ruledef would then be different and arguably wrong. In particular, this breaks pre-v10 versions of pg_dump, which in some situations would define views by means of emitting a CREATE RULE ... ON SELECT command. Third-party tools might not be happy either. In passing, clean up some crufty code in make_viewdef; we'd apparently modernized the equivalent code in make_ruledef somewhere along the way, and missed this copy. Per report from Gilles Darold. Back-patch to all supported versions. Discussion: https://postgr.es/m/ec05659a-40ff-4510-fc45-ca9d965d0838@dalibo.com
* Fix dumping of outer joins with empty qual lists.Tom Lane2017-07-20
| | | | | | | | | | | | | | | | | Normally, a JoinExpr would have empty "quals" only if it came from CROSS JOIN syntax. However, it's possible to get to this state by specifying NATURAL JOIN between two tables with no common column names, and there might be other ways too. The code previously printed no ON clause if "quals" was empty; that's right for CROSS JOIN but syntactically invalid if it's some type of outer join. Fix by printing ON TRUE in that case. This got broken by commit 2ffa740be, which stopped using NATURAL JOIN syntax in ruleutils output due to its brittleness in the face of column renamings. Back-patch to 9.3 where that commit appeared. Per report from Tushar Ahuja. Discussion: https://postgr.es/m/98b283cd-6dda-5d3f-f8ac-87db8c76a3da@enterprisedb.com
* Fix dumping of FUNCTION RTEs that contain non-function-call expressions.Tom Lane2017-07-13
| | | | | | | | | | | | | | | | | | | | | | | | | | The grammar will only accept something syntactically similar to a function call in a function-in-FROM expression. However, there are various ways to input something that ruleutils.c won't deparse that way, potentially leading to a view or rule that fails dump/reload. Fix by inserting a dummy CAST around anything that isn't going to deparse as a function (which is one of the ways to get something like that in there in the first place). In HEAD, also make use of the infrastructure added by this to avoid emitting unnecessary parentheses in CREATE INDEX deparsing. I did not change that in back branches, thinking that people might find it to be unexpected/unnecessary behavioral change. In HEAD, also fix incorrect logic for when to add extra parens to partition key expressions. Somebody apparently thought they could get away with simpler logic than pg_get_indexdef_worker has, but they were wrong --- a counterexample is PARTITION BY LIST ((a[1])). Ignoring the prettyprint flag for partition expressions isn't exactly a nice solution anyway. This has been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/10477.1499970459@sss.pgh.pa.us
* Fix ruleutils.c for domain-over-array cases, too.Tom Lane2017-07-12
| | | | | | | | | | | | | Further investigation shows that ruleutils isn't quite up to speed either for cases where we have a domain-over-array: it needs to be prepared to look past a CoerceToDomain at the top level of field and element assignments, else it decompiles them incorrectly. Potentially this would result in failure to dump/reload a rule, if it looked like the one in the new test case. (I also added a test for EXPLAIN; that output isn't broken, but clearly we need more test coverage here.) Like commit b1cb32fb6, this bug is reachable in cases we already support, so back-patch all the way.
* Minor code review for parse_phrase_operator().Tom Lane2017-06-26
| | | | | | | | | | | | | | | Fix its header comment, which described the old behavior of the <N> phrase distance operator; we missed updating that in commit 028350f61. Also, reset errno before strtol() call, to defend against the possibility that it was already ERANGE at entry. (The lack of complaints says that it generally isn't, but this is at least a latent bug.) Very minor stylistic improvements as well. Victor Drobny noted the obsolete comment, I noted the errno issue. Back-patch to 9.6 where this code was added, just in case the errno issue is a live bug in some cases. Discussion: https://postgr.es/m/2b5382fdff9b1f79d5eb2c99c4d2cbe2@postgrespro.ru
* Assorted translatable string fixesAlvaro Herrera2017-06-04
| | | | | Mark our rusage reportage string translatable; remove quotes from type names; unify formatting of very similar messages.
* Tighten checks for whitespace in functions that parse identifiers etc.Tom Lane2017-05-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch replaces isspace() calls with scanner_isspace() in functions that are likely to be presented with non-ASCII input. isspace() has the small advantage that it will correctly recognize no-break space in single-byte encodings (such as LATIN1); but it cannot work successfully for any multibyte character, and depending on platform it might return false positive results for some fragments of multibyte characters. That's disastrous for functions that are trying to discard whitespace between valid strings, as noted in bug #14662 from Justin Muise. Even treating no-break space as whitespace is pretty questionable for the usages touched here, because the core scanner would think it is an identifier character. Affected functions are parse_ident(), parseNameAndArgTypes (underlying regprocedurein() and siblings), SplitIdentifierString (used for parsing GUCs and options that are qualified names or lists of names), and SplitDirectoriesString (used for parsing GUCs that are lists of directories). All the functions adjusted here are parsing SQL identifiers and similar constructs, so it's reasonable to insist that their definition of whitespace match the core scanner. So we can hope that this won't cause many backwards-compatibility problems. I've left alone isspace() calls in places that aren't really expecting any non-ASCII input characters, such as float8in(). Back-patch to all supported branches. Discussion: https://postgr.es/m/10129.1495302480@sss.pgh.pa.us
* Fix precision and rounding issues in money multiplication and division.Tom Lane2017-05-21
| | | | | | | | | | | | | | | | | | | | | | | The cash_div_intX functions applied rint() to the result of the division. That's not merely useless (because the result is already an integer) but it causes precision loss for values larger than 2^52 or so, because of the forced conversion to float8. On the other hand, the cash_mul_fltX functions neglected to apply rint() to their multiplication results, thus possibly causing off-by-one outputs. Per C standard, arithmetic between any integral value and a float value is performed in float format. Thus, cash_mul_flt4 and cash_div_flt4 produced answers good to only about six digits, even when the float value is exact. We can improve matters noticeably by widening the float inputs to double. (It's tempting to consider using "long double" arithmetic if available, but that's probably too much of a stretch for a back-patched fix.) Also, document that cash_div_intX operators truncate rather than round. Per bug #14663 from Richard Pistole. Back-patch to all supported branches. Discussion: https://postgr.es/m/22403.1495223615@sss.pgh.pa.us
* Fix typo in comment.Heikki Linnakangas2017-05-18
| | | | Daniel Gustafsson
* Fix new warnings from GCC 7Peter Eisentraut2017-05-16
| | | | | This addresses the new warning types -Wformat-truncation -Wformat-overflow that are part of -Wall, via -Wformat, in GCC 7.
* Further patch rangetypes_selfuncs.c's statistics slot management.Tom Lane2017-05-08
| | | | | | | | | | | | | | | Values in a STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM slot are float8, not of the type of the column the statistics are for. This bug is at least partly the fault of sloppy specification comments for get_attstatsslot()/free_attstatsslot(): the type OID they want is that of the stavalues entries, not of the underlying column. (I double-checked other callers and they seem to get this right.) Adjust the comments to be more correct. Per buildfarm. Security: CVE-2017-7484
* Fix possibly-uninitialized variable.Tom Lane2017-05-08
| | | | | | Oversight in e2d4ef8de et al (my fault not Peter's). Per buildfarm. Security: CVE-2017-7484
* Add security checks to selectivity estimation functionsPeter Eisentraut2017-05-08
| | | | | | | | | | | | | | | | | | | | | Some selectivity estimation functions run user-supplied operators over data obtained from pg_statistic without security checks, which allows those operators to leak pg_statistic data without having privileges on the underlying tables. Fix by checking that one of the following is satisfied: (1) the user has table or column privileges on the table underlying the pg_statistic data, or (2) the function implementing the user-supplied operator is leak-proof. If neither is satisfied, planning will proceed as if there are no statistics available. At least one of these is satisfied in most cases in practice. The only situations that are negatively impacted are user-defined or not-leak-proof operators on a security-barrier view. Reported-by: Robert Haas <robertmhaas@gmail.com> Author: Peter Eisentraut <peter_e@gmx.net> Author: Tom Lane <tgl@sss.pgh.pa.us> Security: CVE-2017-7484
* Fix cursor_to_xml in tableforest false modePeter Eisentraut2017-05-04
| | | | | | | | | | | | | It only produced <row> elements but no wrapping <table> element. By contrast, cursor_to_xmlschema produced a schema that is now correct but did not previously match the XML data produced by cursor_to_xml. In passing, also fix a minor misunderstanding about moving cursors in the tests related to this. Reported-by: filip@jirsak.org Based-on-patch-by: Thomas Munro <thomas.munro@enterprisedb.com>
* Fix integer-overflow problems in interval comparison.Tom Lane2017-04-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using integer timestamps, the interval-comparison functions tried to compute the overall magnitude of an interval as an int64 number of microseconds. As reported by Frazer McLean, this overflows for intervals exceeding about 296000 years, which is bad since we nominally allow intervals many times larger than that. That results in wrong comparison results, and possibly in corrupted btree indexes for columns containing such large interval values. To fix, compute the magnitude as int128 instead. Although some compilers have native support for int128 calculations, many don't, so create our own support functions that can do 128-bit addition and multiplication if the compiler support isn't there. These support functions are designed with an eye to allowing the int128 code paths in numeric.c to be rewritten for use on all platforms, although this patch doesn't do that, or even provide all the int128 primitives that will be needed for it. Back-patch as far as 9.4. Earlier releases did not guard against overflow of interval values at all (commit 146604ec4 fixed that), so it seems not very exciting to worry about overly-large intervals for them. Before 9.6, we did not assume that unreferenced "static inline" functions would not draw compiler warnings, so omit functions not directly referenced by timestamp.c, the only present consumer of int128.h. (We could have omitted these functions in HEAD too, but since they were written and debugged on the way to the present patch, and they look likely to be needed by numeric.c, let's keep them in HEAD.) I did not bother to try to prevent such warnings in a --disable-integer-datetimes build, though. Before 9.5, configure will never define HAVE_INT128, so the part of int128.h that exploits a native int128 implementation is dead code in the 9.4 branch. I didn't bother to remove it, thinking that keeping the file looking similar in different branches is more useful. In HEAD only, add a simple test harness for int128.h in src/tools/. In back branches, this does not change the float-timestamps code path. That's not subject to the same kind of overflow risk, since it computes the interval magnitude as float8. (No doubt, when this code was originally written, overflow was disregarded for exactly that reason.) There is a precision hazard instead :-(, but we'll avert our eyes from that question, since no complaints have been reported and that code's deprecated anyway. Kyotaro Horiguchi and Tom Lane Discussion: https://postgr.es/m/1490104629.422698.918452336.26FA96B7@webmail.messagingengine.com
* Fix support for some operators (&<, &>, $<|, |&>) in box operator classTeodor Sigaev2017-03-21
| | | | | | | | | | | | of SP-GiST. Bug exists since initial commit of box opclass for SP-GiST, so backpath to 9.6 Author: Nikita Glukhov with minor editorization of tests by me Reviewed-by: Kyotaro Horiguchi, Anastasia Lubennikova https://commitfest.postgresql.org/13/981/
* Blind try to fix portability issue in commit 8f93bd851 et al.Tom Lane2017-02-09
| | | | | | | | | | | | | | | | | The S/390 members of the buildfarm are showing failures indicating that they're having trouble with the rint() calls I added yesterday. There's no good reason for that, and I wonder if it is a compiler bug similar to the one we worked around in d9476b838. Try to fix it using the same method as before, namely to store the result of rint() back into a "double" variable rather than immediately converting to int64. (This isn't entirely waving a dead chicken, since on machines with wider-than-double float registers, the extra store forces a width conversion. I don't know if S/390 is like that, but it seems worth trying.) In passing, merge duplicate ereport() calls in float8_timestamptz(). Per buildfarm.
* Fix roundoff problems in float8_timestamptz() and make_interval().Tom Lane2017-02-08
| | | | | | | | | | | | | | | | | | | | | | When converting a float value to integer microseconds, we should be careful to round the value to the nearest integer, typically with rint(); simply assigning to an int64 variable will truncate, causing apparently off-by-one values in cases that should work. Most places in the datetime code got this right, but not these two. float8_timestamptz() is new as of commit e511d878f (9.6). Previous versions effectively depended on interval_mul() to do roundoff correctly, which it does, so this fixes an accuracy regression in 9.6. The problem in make_interval() dates to its introduction in 9.4. Aside from being careful to round not truncate, let's incorporate the hours and minutes inputs into the result with exact integer arithmetic, rather than risk introducing roundoff error where there need not have been any. float8_timestamptz() problem reported by Erik Nordström, though this is not his proposed patch. make_interval() problem found by me. Discussion: https://postgr.es/m/CAHuQZDS76jTYk3LydPbKpNfw9KbACmD=49dC4BrzHcfPv6yA1A@mail.gmail.com
* Fix typos in comments.Heikki Linnakangas2017-02-06
| | | | | | | | | Backpatch to all supported versions, where applicable, to make backpatching of future fixes go more smoothly. Josh Soref Discussion: https://www.postgresql.org/message-id/CACZqfqCf+5qRztLPgmmosr-B0Ye4srWzzw_mo4c_8_B_mtjmJQ@mail.gmail.com
* Ensure that a tsquery like '!foo' matches empty tsvectors.Tom Lane2017-01-26
| | | | | | | | | | | | | | | | | !foo means "the tsvector does not contain foo", and therefore it should match an empty tsvector. ts_match_vq() overenthusiastically supposed that an empty tsvector could never match any query, so it forcibly returned FALSE, the wrong answer. Remove the premature optimization. Our behavior on this point was inconsistent, because while seqscans and GIST index searches both failed to match empty tsvectors, GIN index searches would find them, since GIN scans don't rely on ts_match_vq(). That makes this certainly a bug, not a debatable definition disagreement, so back-patch to all supported branches. Report and diagnosis by Tom Dunstan (bug #14515); added test cases by me. Discussion: https://postgr.es/m/20170126025524.1434.97828@wrigleys.postgresql.org
* Disable transforms that replaced AT TIME ZONE with RelabelType.Tom Lane2017-01-18
| | | | | | | | | | | | | | | | | | These resulted in wrong answers if the relabeled argument could be matched to an index column, as shown in bug #14504 from Evgeniy Kozlov. We might be able to resurrect these optimizations by adjusting the planner's treatment of RelabelType, or by adjusting btree's rules for selecting comparison functions, but either solution will take careful analysis and does not sound like a fit candidate for backpatching. I left the catalog infrastructure in place and just reduced the transform functions to always-return-NULL. This would be necessary anyway in the back branches, and it doesn't seem important to be more invasive in HEAD. Bug introduced by commit b8a18ad48. Back-patch to 9.5 where that came in. Report: https://postgr.es/m/20170118144828.1432.52823@wrigleys.postgresql.org Discussion: https://postgr.es/m/18771.1484759439@sss.pgh.pa.us
* Fix handling of empty arrays in array_fill().Tom Lane2017-01-05
| | | | | | | | | | | | | | | | | array_fill(..., array[0]) produced an empty array, which is probably what users expect, but it was a one-dimensional zero-length array which is not our standard representation of empty arrays. Also, for no very good reason, it rejected empty input arrays; that case should be allowed and produce an empty output array. In passing, remove the restriction that the input array(s) have lower bound 1. That seems rather pointless, and it would have needed extra complexity to make the check deal with empty input arrays. Per bug #14487 from Andrew Gierth. It's been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/20170105152156.10135.64195@wrigleys.postgresql.org
* Fix interval_transform so it doesn't throw away non-no-op casts.Tom Lane2016-12-27
| | | | | | | | | | | | | | | | | | | | | | | | interval_transform() contained two separate bugs that caused it to sometimes mistakenly decide that a cast from interval to restricted interval is a no-op and throw it away. First, it was wrong to rely on dt.h's field type macros to have an ordering consistent with the field's significance; in one case they do not. This led to mistakenly treating YEAR as less significant than MONTH, so that a cast from INTERVAL MONTH to INTERVAL YEAR was incorrectly discarded. Second, fls(1<<k) produces k+1 not k, so comparing its output directly to SECOND was wrong. This led to supposing that a cast to INTERVAL MINUTE was really a cast to INTERVAL SECOND and so could be discarded. To fix, get rid of the use of fls(), and make a function based on intervaltypmodout to produce a field ID code adapted to the need here. Per bug #14479 from Piotr Stefaniak. Back-patch to 9.2 where transform functions were introduced, because this code was born broken. Discussion: https://postgr.es/m/20161227172307.10135.7747@wrigleys.postgresql.org
* Fix handling of expanded objects in CoerceToDomain and CASE execution.Tom Lane2016-12-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | When the input value to a CoerceToDomain expression node is a read-write expanded datum, we should pass a read-only pointer to any domain CHECK expressions and then return the original read-write pointer as the expression result. Previously we were blindly passing the same pointer to all the consumers of the value, making it possible for a function in CHECK to modify or even delete the expanded value. (Since a plpgsql function will absorb a passed-in read-write expanded array as a local variable value, it will in fact delete the value on exit.) A similar hazard of passing the same read-write pointer to multiple consumers exists in domain_check() and in ExecEvalCase, so fix those too. The fix requires adding MakeExpandedObjectReadOnly calls at the appropriate places, which is simple enough except that we need to get the data type's typlen from somewhere. For the domain cases, solve this by redefining DomainConstraintRef.tcache as okay for callers to access; there wasn't any reason for the original convention against that, other than not wanting the API of typcache.c to be any wider than it had to be. For CASE, there's no good solution except to add a syscache lookup during executor start. Per bug #14472 from Marcos Castedo. Back-patch to 9.5 where expanded values were introduced. Discussion: https://postgr.es/m/15225.1482431619@sss.pgh.pa.us
* Fix strange behavior (and possible crashes) in full text phrase search.Tom Lane2016-12-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In an attempt to simplify the tsquery matching engine, the original phrase search patch invented rewrite rules that would rearrange a tsquery so that no AND/OR/NOT operator appeared below a PHRASE operator. But this approach had numerous problems. The rearrangement step was missed by ts_rewrite (and perhaps other places), allowing tsqueries to be created that would cause Assert failures or perhaps crashes at execution, as reported by Andreas Seltenreich. The rewrite rules effectively defined semantics for operators underneath PHRASE that were buggy, or at least unintuitive. And because rewriting was done in tsqueryin() rather than at execution, the rearrangement was user-visible, which is not very desirable --- for example, it might cause unexpected matches or failures to match in ts_rewrite. As a somewhat independent problem, the behavior of nested PHRASE operators was only sane for left-deep trees; queries like "x <-> (y <-> z)" did not behave intuitively at all. To fix, get rid of the rewrite logic altogether, and instead teach the tsquery execution engine to manage AND/OR/NOT below a PHRASE operator by explicitly computing the match location(s) and match widths for these operators. This requires introducing some additional fields into the publicly visible ExecPhraseData struct; but since there's no way for third-party code to pass such a struct to TS_phrase_execute, it shouldn't create an ABI problem as long as we don't move the offsets of the existing fields. Another related problem was that index searches supposed that "!x <-> y" could be lossily approximated as "!x & y", which isn't correct because the latter will reject, say, "x q y" which the query itself accepts. This required some tweaking in TS_execute_ternary along with the main tsquery engine. Back-patch to 9.6 where phrase operators were introduced. While this could be argued to change behavior more than we'd like in a stable branch, we have to do something about the crash hazards and index-vs-seqscan inconsistency, and it doesn't seem desirable to let the unintuitive behaviors induced by the rewriting implementation stand as precedent. Discussion: https://postgr.es/m/28215.1481999808@sss.pgh.pa.us Discussion: https://postgr.es/m/26706.1482087250@sss.pgh.pa.us
* Fix handling of phrase operator removal while removing tsquery stopwords.Tom Lane2016-12-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The distance of a removed phrase operator should propagate up to a parent phrase operator if there is one, but this only worked correctly in left-deep trees. Throwing in a few parentheses confused it completely, as indeed was illustrated by bizarre results in existing regression test cases. To fix, track unaccounted-for distances that should propagate to the left and to the right of the current node, rather than trying to make it work with only one returned distance. Also make some adjustments to behave as well as we can for cases of intermixed phrase and regular (AND/OR) operators. I don't think it's possible to be 100% correct for that without a rethinking of the tsquery representation; for example, maybe we should just not drop stopword nodes at all underneath phrase operators. But this is better than it was, and changing tsquery representation wouldn't be safely back-patchable. While at it, I simplified the API of the clean_fakeval_intree function a bit by getting rid of the "char *result" output parameter; that wasn't doing anything that wasn't redundant with whether the result node is NULL or not, and testing for NULL seems a lot clearer/safer. This is part of a larger project to fix various infelicities in the phrase-search implementation, but this part seems comittable on its own. Back-patch to 9.6 where phrase operators were introduced. Discussion: https://postgr.es/m/28215.1481999808@sss.pgh.pa.us Discussion: https://postgr.es/m/26706.1482087250@sss.pgh.pa.us
* Improve documentation around TS_execute().Tom Lane2016-12-16
| | | | | | | | | | I got frustrated by the lack of commentary in this area, so here is some reverse-engineered documentation, along with minor stylistic cleanup. No code changes more significant than removal of unused variables. Back-patch to 9.6, not because that's useful in itself, but because we have some bugs to fix in phrase search and this would cause merge failures if it's only in HEAD.
* Fix off-by-one in memory allocation for quote_literal_cstr().Heikki Linnakangas2016-12-16
| | | | | | | | | | The calculation didn't take into account the NULL terminator. That lead to overwriting the palloc'd buffer by one byte, if the input consists entirely of backslashes. For example "format('%L', E'\\')". Fixes bug #14468. Backpatch to all supported versions. Report: https://www.postgresql.org/message-id/20161216105001.13334.42819%40wrigleys.postgresql.org
* Prevent crash when ts_rewrite() replaces a non-top-level subtree with null.Tom Lane2016-12-11
| | | | | | | | | | | | | | When ts_rewrite()'s replacement argument is an empty tsquery, it's supposed to simplify any operator nodes whose operand(s) become NULL; but it failed to do that reliably, because dropvoidsubtree() only examined the top level of the result tree. Rather than make a second recursive pass, let's just give the responsibility to dofindsubquery() to simplify while it's doing the main replacement pass. Per report from Andreas Seltenreich. Artur Zakirov, with some cosmetic changes by me. Back-patch to all supported branches. Discussion: https://postgr.es/m/8737i01dew.fsf@credativ.de
* Fix crasher bug in array_position(s)Alvaro Herrera2016-12-09
| | | | | | | | | | | | | | | | | | | array_position and its cousin array_positions were caching the element type equality function's FmgrInfo without being careful enough to put it in a long-lived context. This is obviously broken but it didn't matter in most cases; only when using arrays of records (involving record_eq) it becomes a problem. The fix is to ensure that the type's equality function's FmgrInfo is cached in the array_position's flinfo->fn_mcxt rather than the current memory context. Apart from record types, the only other case that seems complex enough to possibly cause the same problem are range types. I didn't find a way to reproduce the problem with those, so I only include the test case submitted with the bug report as regression test. Bug report and patch: Junseok Yang Discussion: https://postgr.es/m/CAE+byMupUURYiZ6bKYgMZb9pgV1CYAijJGqWj-90W=nS7uEOeA@mail.gmail.com Backpatch to 9.5, where array_position appeared.
* Make sure ALTER TABLE preserves index tablespaces.Tom Lane2016-11-23
| | | | | | | | | | | | | | | | | | | | | | | | When rebuilding an existing index, ALTER TABLE correctly kept the physical file in the same tablespace, but it messed up the pg_class entry if the index had been in the database's default tablespace and "default_tablespace" was set to some non-default tablespace. This led to an inaccessible index. Fix by fixing pg_get_indexdef_string() to always include a tablespace clause, whether or not the index is in the default tablespace. The previous behavior was installed in commit 537e92e41, and I think it just wasn't thought through very clearly; certainly the possible effect of default_tablespace wasn't considered. There's some risk in changing the behavior of this function, but there are no other call sites in the core code. Even if it's being used by some third party extension, it's fairly hard to envision a usage that is okay with a tablespace clause being appended some of the time but can't handle it being appended all the time. Back-patch to all supported versions. Code fix by me, investigation and test cases by Michael Paquier. Discussion: <1479294998857-5930602.post@n3.nabble.com>
* Fix PGLC_localeconv() to handle errors better.Tom Lane2016-11-21
| | | | | | | | | | | | | | | | | | | | | | | The code was intentionally not very careful about leaking strdup'd strings in case of an error. That was forgivable probably, but it also failed to notice strdup() failures, which could lead to subsequent null-pointer-dereference crashes, since many callers unsurprisingly didn't check for null pointers in the struct lconv fields. An even worse problem is that it could throw error while we were setlocale'd to a non-C locale, causing unwanted behavior in subsequent libc calls. Rewrite to ensure that we cannot throw elog(ERROR) until after we've restored the previous locale settings, or at least attempted to. (I'm sorely tempted to make restore failure be a FATAL error, but will refrain for the moment.) Having done that, it's not much more work to ensure that we clean up strdup'd storage on the way out, too. This code is substantially the same in all supported branches, so back-patch all the way. Michael Paquier and Tom Lane Discussion: <CAB7nPqRMbGqa_mesopcn4MPyTs34eqtVEK7ELYxvvV=oqS00YA@mail.gmail.com>
* Fix typo in commentMagnus Hagander2016-11-14
| | | | | The function was renamed in 908e23473, but the comment never learned about it.
* Fix nasty performance problem in tsquery_rewrite().Tom Lane2016-10-30
| | | | | | | | | | | | | | | | | | | | | | | | | | tsquery_rewrite() tries to find matches to subsets of AND/OR conditions; for example, in the query 'a | b | c' the substitution subquery 'a | c' should match and lead to replacement of the first and third items. That's fine, but the matching algorithm apparently takes about O(2^N) for an N-clause query (I say "apparently" because the code is also both unintelligible and uncommented). We could probably do better than that even without any extra assumptions --- but actually, we know that the subclauses are sorted, indeed are depending on that elsewhere in this very same function. So we can just scan the two lists a single time to detect matches, as though we were doing a merge join. Also do a re-flattening call (QTNTernary()) in tsquery_rewrite_query, just to make sure that the tree fits the expectations of the next search cycle. I didn't try to devise a test case for this, but I'm pretty sure that the oversight could have led to failure to match in some cases where a match would be expected. Improve comments, and also stick a CHECK_FOR_INTERRUPTS into dofindsubquery, just in case it's still too slow for somebody. Per report from Andreas Seltenreich. Back-patch to all supported branches. Discussion: <8760oasf2y.fsf@credativ.de>
* Fix bogus tree-flattening logic in QTNTernary().Tom Lane2016-10-30
| | | | | | | | | | | | | | QTNTernary() contains logic to flatten, eg, '(a & b) & c' into 'a & b & c', which is all well and good, but it tries to do that to NOT nodes as well, so that '!!a' gets changed to '!a'. Explicitly restrict the conversion to be done only on AND and OR nodes, and add a test case illustrating the bug. In passing, provide some comments for the sadly naked functions in tsquery_util.c, and simplify some baroque logic in QTNFree(), which I think may have been leaking some items it intended to free. Noted while investigating a complaint from Andreas Seltenreich. Back-patch to all supported versions.
* Improve speed of aggregates that use array_append as transition function.Tom Lane2016-10-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the previous coding, if an aggregate's transition function returned an expanded array, nodeAgg.c and nodeWindowAgg.c would always copy it and thus force it into the flat representation. This led to ping-ponging between flat and expanded formats, which costs a lot. For an aggregate using array_append as transition function, I measured about a 15X slowdown compared to the pre-9.5 code, when working on simple int[] arrays. Of course, the old code was already O(N^2) in this usage due to copying flat arrays all the time, but it wasn't quite this inefficient. To fix, teach nodeAgg.c and nodeWindowAgg.c to allow expanded transition values without copying, so long as the transition function takes care to return the transition value already properly parented under the aggcontext. That puts a bit of extra responsibility on the transition function, but doing it this way allows us to not need any extra logic in the fast path of advance_transition_function (ie, with a pass-by-value transition value, or with a modified-in-place pass-by-reference value). We already know that that's a hot spot so I'm loath to add any cycles at all there. Also, while only array_append currently knows how to follow this convention, this solution allows other transition functions to opt-in without needing to have a whitelist in the core aggregation code. (The reason we would need a whitelist is that currently, if you pass a R/W expanded-object pointer to an arbitrary function, it's allowed to do anything with it including deleting it; that breaks the core agg code's assumption that it should free discarded values. Returning a value under aggcontext is the transition function's signal that it knows it is an aggregate transition function and will play nice. Possibly the API rules for expanded objects should be refined, but that would not be a back-patchable change.) With this fix, an aggregate using array_append is no longer O(N^2), so it's much faster than pre-9.5 code rather than much slower. It's still a bit slower than the bespoke infrastructure for array_agg, but the differential seems to be only about 10%-20% rather than orders of magnitude. Discussion: <6315.1477677885@sss.pgh.pa.us>
* Avoid testing tuple visibility without buffer lock in RI_FKey_check().Tom Lane2016-10-23
| | | | | | | | | | | | | | | | Despite the argumentation I wrote in commit 7a2fe85b0, it's unsafe to do this, because in corner cases it's possible for HeapTupleSatisfiesSelf to try to set hint bits on the target tuple; and at least since 8.2 we have required the buffer content lock to be held while setting hint bits. The added regression test exercises one such corner case. Unpatched, it causes an assertion failure in assert-enabled builds, or otherwise would cause a hint bit change in a buffer we don't hold lock on, which given the right race condition could result in checksum failures or other data consistency problems. The odds of a problem in the field are probably pretty small, but nonetheless back-patch to all supported branches. Report: <19391.1477244876@sss.pgh.pa.us>
* Suppress "Factory" zone in pg_timezone_names view for tzdata >= 2016g.Tom Lane2016-10-19
| | | | | | IANA got rid of the really silly "abbreviation" and replaced it with one that's only moderately silly. But it's still pointless, so keep on not showing it.
* Fix cidin() to handle values above 2^31 platform-independently.Tom Lane2016-10-18
| | | | | | | | | | | | | | | | | | | | | | CommandId is declared as uint32, and values up to 4G are indeed legal. cidout() handles them properly by treating the value as unsigned int. But cidin() was just using atoi(), which has platform-dependent behavior for values outside the range of signed int, as reported by Bart Lengkeek in bug #14379. Use strtoul() instead, as xidin() does. In passing, make some purely cosmetic changes to make xidin/xidout look more like cidin/cidout; the former didn't have a monopoly on best practice IMO. Neither xidin nor cidin make any attempt to throw error for invalid input. I didn't change that here, and am not sure it's worth worrying about since neither is really a user-facing type. The point is just to ensure that indubitably-valid inputs work as expected. It's been like this for a long time, so back-patch to all supported branches. Report: <20161018152550.1413.6439@wrigleys.postgresql.org>
* Fix assorted integer-overflow hazards in varbit.c.Tom Lane2016-10-14
| | | | | | | | | | | | | | | | | bitshiftright() and bitshiftleft() would recursively call each other infinitely if the user passed INT_MIN for the shift amount, due to integer overflow in negating the shift amount. To fix, clamp to -VARBITMAXLEN. That doesn't change the results since any shift distance larger than the input bit string's length produces an all-zeroes result. Also fix some places that seemed inadequately paranoid about input typmods exceeding VARBITMAXLEN. While a typmod accepted by anybit_typmodin() will certainly be much less than that, at least some of these spots are reachable with user-chosen integer values. Andreas Seltenreich and Tom Lane Discussion: <87d1j2zqtz.fsf@credativ.de>
* Fix broken jsonb_set() logic for replacing array elements.Tom Lane2016-10-13
| | | | | | | | | | | Commit 0b62fd036 did a fairly sloppy job of refactoring setPath() to support jsonb_insert() along with jsonb_set(). In its defense, though, there was no regression test case exercising the case of replacing an existing element in a jsonb array. Per bug #14366 from Peng Sun. Back-patch to 9.6 where bug was introduced. Report: <20161012065349.1412.47858@wrigleys.postgresql.org>
* Fix miserable coding in pg_stat_get_activity().Tom Lane2016-09-10
| | | | | | | | | | | | | | | | | | | | | | | | Commit dd1a3bccc replaced a test on whether a subroutine returned a null pointer with a test on whether &pointer->backendStatus was null. This accidentally failed to fail, at least on common compilers, because backendStatus is the first field in the struct; but it was surely trouble waiting to happen. Commit f91feba87 then messed things up further, changing the logic to local_beentry = pgstat_fetch_stat_local_beentry(curr_backend); if (!local_beentry) continue; beentry = &local_beentry->backendStatus; if (!beentry) { where the second "if" is now dead code, so that the intended behavior of printing a row with "<backend information not available>" cannot occur. I suspect this is all moot because pgstat_fetch_stat_local_beentry will never actually return null in this function's usage, but it's still very poor coding. Repair back to 9.4 where the original problem was introduced.
* Don't require dynamic timezone abbreviations to match underlying time zone.Tom Lane2016-09-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, we threw an error if a dynamic timezone abbreviation did not match any abbreviation recorded in the referenced IANA time zone entry. That seemed like a good consistency check at the time, but it turns out that a number of the abbreviations in the IANA database are things that Olson and crew made up out of whole cloth. Their current policy is to remove such names in favor of using simple numeric offsets. Perhaps unsurprisingly, a lot of these made-up abbreviations have varied in meaning over time, which meant that our commit b2cbced9e and later changes made them into dynamic abbreviations. So with newer IANA database versions that don't mention these abbreviations at all, we fail, as reported in bug #14307 from Neil Anderson. It's worse than just a few unused-in-the-wild abbreviations not working, because the pg_timezone_abbrevs view stops working altogether (since its underlying function tries to compute the whole view result in one call). We considered deleting these abbreviations from our abbreviations list, but the problem with that is that we can't stay ahead of possible future IANA changes. Instead, let's leave the abbreviations list alone, and treat any "orphaned" dynamic abbreviation as just meaning the referenced time zone. It will behave a bit differently than it used to, in that you can't any longer override the zone's standard vs. daylight rule by using the "wrong" abbreviation of a pair, but that's better than failing entirely. (Also, this solution can be interpreted as adding a small new feature, which is that any abbreviation a user wants can be defined as referencing a time zone name.) Back-patch to all supported branches, since this problem affects all of them when using tzdata 2016f or newer. Report: <20160902031551.15674.67337@wrigleys.postgresql.org> Discussion: <6189.1472820913@sss.pgh.pa.us>
* Add macros to make AllocSetContextCreate() calls simpler and safer.Tom Lane2016-08-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls had typos in the context-sizing parameters. While none of these led to especially significant problems, they did create minor inefficiencies, and it's now clear that expecting people to copy-and-paste those calls accurately is not a great idea. Let's reduce the risk of future errors by introducing single macros that encapsulate the common use-cases. Three such macros are enough to cover all but two special-purpose contexts; those two calls can be left as-is, I think. While this patch doesn't in itself improve matters for third-party extensions, it doesn't break anything for them either, and they can gradually adopt the simplified notation over time. In passing, change TopMemoryContext to use the default allocation parameters. Formerly it could only be extended 8K at a time. That was probably reasonable when this code was written; but nowadays we create many more contexts than we did then, so that it's not unusual to have a couple hundred K in TopMemoryContext, even without considering various dubious code that sticks other things there. There seems no good reason not to let it use growing blocks like most other contexts. Back-patch to 9.6, mostly because that's still close enough to HEAD that it's easy to do so, and keeping the branches in sync can be expected to avoid some future back-patching pain. The bugs fixed by these changes don't seem to be significant enough to justify fixing them further back. Discussion: <21072.1472321324@sss.pgh.pa.us>
* Suppress compiler warnings in non-cassert builds.Tom Lane2016-08-23
| | | | | | With Asserts off, these variables are set but never used, resulting in warnings from pickier compilers. Fix that with our standard solution. Per report from Jeff Janes.
* Suppress -Wunused-result warning for strtol().Tom Lane2016-08-16
| | | | | | | | I'm not sure which bozo thought it's a problem to use strtol() only for its endptr result, but silence the warning using same method used elsewhere. Report: <f845d3a6-5328-3e2a-924f-f8e91aa2b6d2@2ndquadrant.com>