aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils/adt
Commit message (Collapse)AuthorAge
* Accept fractional seconds in jsonpath's datetime() method.Tom Lane2023-06-12
| | | | | | | | | | | | | | | | | | | | | | Commit 927d9abb6 purported to make datetime() accept any string that could be output for a datetime value by to_jsonb(). But it overlooked the possibility of fractional seconds being present, so that cases as simple as to_jsonb(now()) would defeat it. Fix by adding formats that include ".US" to the list in executeDateTimeMethod(). (Note that while this is nominally microseconds, it'll do the right thing for fractions with fewer than six digits.) In passing, re-order the list to restore the datatype ordering specified in its comment. The violation accidentally did not break anything; but the next edit might be less lucky, so add more comments. Per report from Tim Field. Back-patch to v13 where datetime() was added, like the previous patch. Discussion: https://postgr.es/m/014A028B-5CE6-4FDF-AC24-426CA6FC9CEE@mohiohio.com
* pageinspect: Fix gist_page_items() with included columnsMichael Paquier2023-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Non-leaf pages of GiST indexes contain key attributes, leaf pages contain both key and non-key attributes, and gist_page_items() ignored the handling of non-key attributes. This caused a few problems when using gist_page_items() on a GiST index with INCLUDE: - On a non-leaf page, the function would crash. - On a leaf page, the function would work, but miss to display all the values for included attributes. This commit fixes gist_page_items() to handle such cases in a more appropriate way, and now displays the values of key and non-key attributes for each item separately in a style consistent with what ruleutils.c would generate for the attribute list, depending on the page type dealt with. In a way similar to how a record is displayed, values would be double-quoted for key or non-key attributes if required. ruleutils.c did not provide a routine able to control if non-key attributes should be displayed, so an extended() routine for index definitions is added to work around the leaf and non-leaf page differences. While on it, this commit fixes a third problem related to the amount of data reported for key attributes. The code originally relied on BuildIndexValueDescription() (used for error reports on constraints) that would not print all the data stored in the index but the index opclass's input type, so this limited the amount of information available. This switch makes gist_page_items() much cheaper as there is no need to run ACL checks for each item printed, which is not an issue anyway as superuser rights are required to execute the functions of pageinspect. Opclasses whose data cannot be displayed can rely on gist_page_items_bytea(). The documentation of this function was slightly incorrect for the output results generated on HEAD and v15, so adjust it on these branches. Author: Alexander Lakhin, Michael Paquier Discussion: https://postgr.es/m/17884-cb8c326522977acb@postgresql.org Backpatch-through: 14
* Add ruleutils support for decompiling MERGE commands.Tom Lane2023-05-07
| | | | | | | | | This was overlooked when MERGE was added, but it's essential support for MERGE in new-style SQL functions. Alvaro Herrera Discussion: https://postgr.es/m/3579737.1683293801@sss.pgh.pa.us
* In array_position()/array_positions(), beware of empty input array.Tom Lane2023-05-04
| | | | | | | | | | | | These functions incautiously fetched the array's first lower bound even when the array is zero-dimensional, thus fetching the word after the allocated array space. While almost always harmless, with very bad luck this could result in SIGSEGV. Fix by adding an early exit for empty input. Per bug #17920 from Alexander Lakhin. Discussion: https://postgr.es/m/17920-f7c228c627b6d02e%40postgresql.org
* Remove duplicate lines of codeDaniel Gustafsson2023-04-24
| | | | | | | | | | | | | | | Commit 6df7a9698bb accidentally included two identical prototypes for default_multirange_selectivi() and commit 086cf1458c6 added a break; statement where one was already present, thus duplicating it. While there is no bug caused by this, fix by removing the duplicated lines as they provide no value. Backpatch the fix for duplicate prototypes to v14 and the duplicate break statement fix to all supported branches to avoid backpatching hazards due to the removal. Reported-by: Anton Voloshin <a.voloshin@postgrespro.ru> Discussion: https://postgr.es/m/0e69cb60-0176-f6d0-7e15-6478b7d85724@postgrespro.ru
* Fix oversights in array manipulation.Tom Lane2023-03-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nested-arrays code path in ExecEvalArrayExpr() used palloc to allocate the result array, whereas every other array-creating function has used palloc0 since 18c0b4ecc. This mostly works, but unused bits past the end of the nulls bitmap may end up undefined. That causes valgrind complaints with -DWRITE_READ_PARSE_PLAN_TREES, and could cause planner misbehavior as cited in 18c0b4ecc. There seems no very good reason why we should strive to avoid palloc0 in just this one case, so fix it the easy way with s/palloc/palloc0/. While looking at that I noted that we also failed to check for overflow of "nbytes" and "nitems" while summing the sizes of the sub-arrays, potentially allowing a crash due to undersized output allocation. For "nbytes", follow the policy used by other array-munging code of checking for overflow after each addition. (As elsewhere, the last addition of the array's overhead space doesn't need an extra check, since palloc itself will catch a value between 1Gb and 2Gb.) For "nitems", there's no very good reason to sum the inputs at all, since we can perfectly well use ArrayGetNItems' result instead of ignoring it. Per discussion of this bug, also remove redundant zeroing of the nulls bitmap in array_set_element and array_set_slice. Patch by Alexander Lakhin and myself, per bug #17858 from Alexander Lakhin; thanks also to Richard Guo. These bugs are a dozen years old, so back-patch to all supported branches. Discussion: https://postgr.es/m/17858-8fd287fd3663d051@postgresql.org
* Fix t_isspace(), etc., when datlocprovider=i and datctype=C.Jeff Davis2023-03-17
| | | | | | | | | | | | | Check whether the datctype is C to determine whether t_isspace() and related functions use isspace() or iswspace(). Previously, t_isspace() checked whether the database default collation was C; which is incorrect when the default collation uses the ICU provider. Discussion: https://postgr.es/m/79e4354d9eccfdb00483146a6b9f6295202e7890.camel@j-davis.com Reviewed-by: Peter Eisentraut Backpatch-through: 15
* Work around spurious compiler warning in inet operatorsAndres Freund2023-03-16
| | | | | | | | | | | | | | | | | | | | | | | | | | gcc 12+ has complaints like the following: ../../../../../pgsql/src/backend/utils/adt/network.c: In function 'inetnot': ../../../../../pgsql/src/backend/utils/adt/network.c:1893:34: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=] 1893 | pdst[nb] = ~pip[nb]; | ~~~~~~~~~^~~~~~~~~~ ../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 into destination object 'ipaddr' of size 16 27 | unsigned char ipaddr[16]; /* up to 128 bits of address */ | ^~~~~~ ../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 into destination object 'ipaddr' of size 16 This is due to a compiler bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104986 It has been a year since the bug has been reported without getting fixed. As the warnings are verbose and use of gcc 12 is becoming more common, it seems worth working around the bug. Particularly because a simple reformulation of the loop condition fixes the issue and isn't any less readable. Author: Tom Lane <tgl@sss.pgh.pa.us> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/144536.1648326206@sss.pgh.pa.us Backpatch: 11-
* Fix corner case bug in numeric to_char() some more.Tom Lane2023-03-14
| | | | | | | | | | | | | | | The band-aid applied in commit f0bedf3e4 turns out to still need some work: it made sure we didn't set Np->last_relevant too small (to the left of the decimal point), but it didn't prevent setting it too large (off the end of the partially-converted string). This could result in fetching data beyond the end of the allocated space, which with very bad luck could cause a SIGSEGV, though I don't see any hazard of interesting memory disclosure. Per bug #17839 from Thiago Nunes. The bug's pretty ancient, so back-patch to all supported versions. Discussion: https://postgr.es/m/17839-aada50db24d7b0da@postgresql.org
* Fix JSON error reporting for many cases of erroneous string values.Tom Lane2023-03-13
| | | | | | | | | | | | | | | | | | | | | | | The majority of error exit cases in json_lex_string() failed to set lex->token_terminator, causing problems for the error context reporting code: it would see token_terminator less than token_start and do something more or less nuts. In v14 and up the end result could be as bad as a crash in report_json_context(). Older versions accidentally avoided that fate; but all versions produce error context lines that are far less useful than intended, because they'd stop at the end of the prior token instead of continuing to where the actually-bad input is. To fix, invent some macros that make it less notationally painful to do the right thing. Also add documentation about what the function is actually required to do; and in >= v14, add an assertion in report_json_context about token_terminator being sufficiently far advanced. Per report from Nikolay Shaplov. Back-patch to all supported versions. Discussion: https://postgr.es/m/7332649.x5DLKWyVIX@thinkpad-pgpro
* Avoid fetching one past the end of translate()'s "to" parameter.Tom Lane2023-03-01
| | | | | | | | | | | | | | | | | This is usually harmless, but if you were very unlucky it could provoke a segfault due to the "to" string being right up against the end of memory. Found via valgrind testing (so we might've found it earlier, except that our regression tests lacked any exercise of translate()'s deletion feature). Fix by switching the order of the test-for-end-of-string and advance-pointer steps. While here, compute "to_ptr + tolen" just once. (Smarter compilers might figure that out for themselves, but let's just make sure.) Report and fix by Daniil Anisimov, in bug #17816. Discussion: https://postgr.es/m/17816-70f3d2764e88a108@postgresql.org
* Fix parsing of ISO-8601 interval fields with exponential notation.Tom Lane2023-02-20
| | | | | | | | | | | | | | | | | | | | | | | | Historically we've accepted interval input like 'P.1e10D'. This is probably an accident of having used strtod() to do the parsing, rather than something anyone intended, but it's been that way for a long time. Commit e39f99046 broke this by trying to parse the integer and fractional parts separately, without accounting for the possibility of an exponent. In principle that coding allowed for precise conversions of field values wider than 15 decimal digits, but that does not seem like a goal worth sweating bullets for. So, rather than trying to manage an exponent on top of the existing complexity, let's just revert to the previous coding that used strtod() by itself. We can still improve on the old code to the extent of allowing the value to range up to 1.0e15 rather than only INT_MAX. (Allowing more than that risks creating problems due to precision loss: the converted fractional part might have absolute value more than 1. Perhaps that could be dealt with in some way, but it really does not seem worth additional effort.) Per bug #17795 from Alexander Lakhin. Back-patch to v15 where the faulty code came in. Discussion: https://postgr.es/m/17795-748d6db3ed95d313@postgresql.org
* Print the correct aliases for DML target tables in ruleutils.Tom Lane2023-02-17
| | | | | | | | | | | | | | | | | | | | | | | | | ruleutils.c blindly printed the user-given alias (or nothing if there hadn't been one) for the target table of INSERT/UPDATE/DELETE queries. That works a large percentage of the time, but not always: for queries appearing in WITH, it's possible that we chose a different alias to avoid conflict with outer-scope names. Since the chosen alias would be used in any Var references to the target table, this'd lead to an inconsistent printout with consequences such as dump/restore failures. The correct logic for printing (or not) a relation alias was embedded in get_from_clause_item. Factor it out to a separate function so that we don't need a jointree node to use it. (Only a limited part of that function can be reached from these new call sites, but this seems like the cleanest non-duplicative factorization.) In passing, I got rid of a redundant "\d+ rules_src" step in rules.sql. Initial report from Jonathan Katz; thanks to Vignesh C for analysis. This has been broken for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/e947fa21-24b2-f922-375a-d4f763ef3e4b@postgresql.org Discussion: https://postgr.es/m/CALDaNm1MMntjmT_NJGp-Z=xbF02qHGAyuSHfYHias3TqQbPF2w@mail.gmail.com
* Avoid dereferencing an undefined pointer in DecodeInterval().Tom Lane2023-02-12
| | | | | | | | | | | | | | | Commit e39f99046 moved some code up closer to the start of DecodeInterval(), without noticing that it had been implicitly relying on previous checks to reject the case of empty input. Given empty input, we'd now dereference a pointer that hadn't been set, possibly leading to a core dump. (But if we fail to provoke a SIGSEGV, nothing bad happens, and the expected syntax error is thrown a bit later.) Per bug #17788 from Alexander Lakhin. Back-patch to v15 where the fault was introduced. Discussion: https://postgr.es/m/17788-dabac9f98f7eafd5@postgresql.org
* Make int64_div_fast_to_numeric() more robust.Dean Rasheed2023-02-03
| | | | | | | | | | | | | | | | | | The prior coding of int64_div_fast_to_numeric() had a number of bugs that would cause it to fail under different circumstances, such as with log10val2 <= 0, or log10val2 a multiple of 4, or in the "slow" numeric path with log10val2 >= 10. None of those could be triggered by any of our current code, which only uses log10val2 = 3 or 6. However, they made it a hazard for any future code that might use it. Also, since this is exported by numeric.c, users writing their own C code might choose to use it. Therefore fix, and back-patch to v14, where it was introduced. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCW8gXgW0tgPxPgHDPhVX71%2BSWFRkhnXy%2BTfGDsKLepu2g%40mail.gmail.com
* Remove arbitrary FUNC_MAX_ARGS limit in int2vectorin and oidvectorin.Tom Lane2023-01-15
| | | | | | | | | | | | | | | | | | | | | | | | | int2vectorin limited the number of array elements it'd take to FUNC_MAX_ARGS, which is probably fine for the traditional use-cases. But now that pg_publication_rel.prattrs is an int2vector, it's not fine at all: it's easy to construct cases where that can have up to about MaxTupleAttributeNumber entries. Trying to replicate such tables leads to logical-replication failures. As long as we have to touch this code anyway, let's just remove the a-priori limit altogether, and let it accept any size that'll be allowed by repalloc. (Note that since int2vector isn't toastable, we cannot store arrays longer than about BLCKSZ/2; but there is no good excuse for letting int2vectorin depend on that. Perhaps we will lift the no-toast restriction someday.) While at it, also improve the equivalent logic in oidvectorin. I don't know of any practical use-case for long oidvectors right now, but doing it right actually makes the code shorter. Per report from Erik Rijkers. Back-patch to v15 where pg_publication_rel.prattrs was added. Discussion: https://postgr.es/m/668ba539-33c5-8190-ca11-def2913cb94b@xs4all.nl
* Fix jsonpath existense checking of missing variablesAlexander Korotkov2023-01-12
| | | | | | | | | | | | | | | | | The current jsonpath code assumes that the referenced variable always exists. It could only throw an error at the value valuation time. At the same time existence checking assumes variable is present without valuation, and error suppression doesn't work for missing variables. This commit makes existense checking trigger an error for missing variables. This makes the overall behavior consistent. Backpatch to 12 where jsonpath was introduced. Reported-by: David G. Johnston Discussion: https://postgr.es/m/CAKFQuwbeytffJkVnEqDyLZ%3DrQsznoTh1OgDoOF3VmOMkxcTMjA%40mail.gmail.com Author: Alexander Korotkov, David G. Johnston Backpatch-through: 12
* Fix typos in comments, code and documentationMichael Paquier2023-01-03
| | | | | | | | | | While on it, newlines are removed from the end of two elog() strings. The others are simple grammar mistakes. One comment in pg_upgrade referred incorrectly to sequences since a7e5457. Author: Justin Pryzby Discussion: https://postgr.es/m/20221230231257.GI1153@telsasoft.com Backpatch-through: 11
* Fix jsonb subscripting to cope with toasted subscript values.Tom Lane2022-12-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | jsonb_get_element() was incautious enough to use VARDATA() and VARSIZE() directly on an arbitrary text Datum. That of course fails if the Datum is short-header, compressed, or out-of-line. The typical result would be failing to match any element of a jsonb object, though matching the wrong one seems possible as well. setPathObject() was slightly brighter, in that it used VARDATA_ANY and VARSIZE_ANY_EXHDR, but that only kept it out of trouble for short-header Datums. push_path() had the same issue. This could result in faulty subscripted insertions, though keys long enough to cause a problem are likely rare in the wild. Having seen these, I looked around for unsafe usages in the rest of the adt/json* files. There are a couple of places where it's not immediately obvious that the Datum can't be compressed or out-of-line, so I added pg_detoast_datum_packed() to cope if it is. Also, remove some other usages of VARDATA/VARSIZE on Datums we just extracted from a text array. Those aren't actively broken, but they will become so if we ever start allowing short-header array elements, which does not seem like a terribly unreasonable thing to do. In any case they are not great coding examples, and they could also do with comments pointing out that we're assuming we don't need pg_detoast_datum_packed. Per report from exe-dealer@yandex.ru. Patch by me, but thanks to David Johnston for initial investigation. Back-patch to v14 where jsonb subscripting was introduced. Discussion: https://postgr.es/m/205321670615953@mail.yandex.ru
* Fix psql's \sf and \ef for new-style SQL functions.Tom Lane2022-12-02
| | | | | | | | | | | | | | Some options of these commands need to be able to identify the start of the function body within the output of pg_get_functiondef(). It used to be that that always began with "AS", but since the introduction of new-style SQL functions, it might also start with "BEGIN" or "RETURN". Fix that on the psql side, and add some regression tests. Noted by me awhile ago, but I didn't do anything about it. Thanks to David Johnston for a nag. Discussion: https://postgr.es/m/AM9PR01MB8268D5CDABDF044EE9F42173FE8C9@AM9PR01MB8268.eurprd01.prod.exchangelabs.com
* Fix memory leak for hashing with nondeterministic collations.Jeff Davis2022-12-01
| | | | | | | Backpatch through 12, where nondeterministic collations were introduced (5e1963fb76). Backpatch-through: 12
* Fix under-parenthesized display of AT TIME ZONE constructs.Tom Lane2022-12-01
| | | | | | | | | | | | | | | | | | In commit 40c24bfef, I forgot to use get_rule_expr_paren() for the arguments of AT TIME ZONE, resulting in possibly not printing parens for expressions that need it. But get_rule_expr_paren() wouldn't have gotten it right anyway, because isSimpleNode() hadn't been taught that COERCE_SQL_SYNTAX parent nodes don't guarantee sufficient parentheses. Improve all that. Also use this methodology for F_IS_NORMALIZED, so that we don't print useless parens for that. In passing, remove a comment that was obsoleted later. Per report from Duncan Sands. Back-patch to v14 where this code came in. (Before that, we didn't try to print AT TIME ZONE that way, so there was no bug just ugliness.) Discussion: https://postgr.es/m/f41566aa-a057-6628-4b7c-b48770ecb84a@deepbluecap.com
* YA attempt at taming worst-case behavior of get_actual_variable_range.Tom Lane2022-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We've made multiple attempts at preventing get_actual_variable_range from taking an unreasonable amount of time (3ca930fc3, fccebe421). But there's still an issue for the very first planning attempt after deletion of a large number of extremal-valued tuples. While that planning attempt will set "killed" bits on the tuples it visits and thereby reduce effort for next time, there's still a lot of work it has to do to visit the heap and then set those bits. It's (usually?) not worth it to do that much work at plan time to have a slightly better estimate, especially in a context like this where the table contents are known to be mutating rapidly. Therefore, let's bound the amount of work to be done by giving up after we've visited 100 heap pages. Giving up just means we'll fall back on the extremal value recorded in pg_statistic, so it shouldn't mean that planner estimates suddenly become worthless. Note that this means we'll still gradually whittle down the problem by setting a few more index "killed" bits in each planning attempt; so eventually we'll reach a good state (barring further deletions), even in the absence of VACUUM. Simon Riggs, per a complaint from Jakub Wartak (with cosmetic adjustments by me). Back-patch to all supported branches. Discussion: https://postgr.es/m/CAKZiRmznOwi0oaV=4PHOCM4ygcH4MgSvt8=5cu_vNCfc8FSUug@mail.gmail.com
* Add comments and a missing CHECK_FOR_INTERRUPTS in ts_headline.Tom Lane2022-11-21
| | | | | | | | | | | | | | | | I just spent an annoying amount of time reverse-engineering the 100%-undocumented API between ts_headline and the text search parser's prsheadline function. Add some commentary about that while it's fresh in mind. Also remove some unused macros in wparser_def.c. While at it, I noticed that when commit 78e73e875 added a CHECK_FOR_INTERRUPTS call in TS_execute_recurse, it missed doing so in the parallel function TS_phrase_execute, which surely needs one just as much. Back-patch because of the missing CHECK_FOR_INTERRUPTS. Might as well back-patch the rest of this too.
* Fix planner failure with extended statistics on partitioned tables.Tom Lane2022-11-01
| | | | | | | | | | Some cases would result in "cache lookup failed for statistics object", due to trying to fetch inherited statistics when only non-inherited ones are available or vice versa. Richard Guo and Justin Pryzby Discussion: https://postgr.es/m/20221030170520.GM16921@telsasoft.com
* Rename SetSingleFuncCall() to InitMaterializedSRF()Michael Paquier2022-10-18
| | | | | | | | | | | | | | | | | | Per discussion, the existing routine name able to initialize a SRF function with materialize mode is unpopular, so rename it. Equally, the flags of this function are renamed, as of: - SRF_SINGLE_USE_EXPECTED -> MAT_SRF_USE_EXPECTED_DESC - SRF_SINGLE_BLESS -> MAT_SRF_BLESS The previous function and flags introduced in 9e98583 are kept around for compatibility purposes, so as any extension code already compiled with v15 continues to work as-is. The declarations introduced here for compatibility will be removed from HEAD in a follow-up commit. The new names have been suggested by Andres Freund and Melanie Plageman. Discussion: https://postgr.es/m/20221013194820.ciktb2sbbpw7cljm@awork3.anarazel.de Backpatch-through: 15
* Fix EXPLAIN of SEARCH BREADTH FIRST with a constant initial value.Tom Lane2022-10-16
| | | | | | | | | | | | | | | | | | | | | | If the non-recursive term of a SEARCH BREADTH FIRST recursive query has only constants in its target list, the planner will fold the starting RowExpr added by rewrite into a simple Const of type RECORD. The executor doesn't have any problem with that --- but EXPLAIN VERBOSE will encounter the Const as the ultimate source of truth about what the field names of the SET column are, and it didn't know what to do with that. Fortunately, we can pull the identifying typmod out of the Const, in much the same way that record_out would. For reasons that remain a bit obscure to me, this only fails with SEARCH BREADTH FIRST, not SEARCH DEPTH FIRST or CYCLE. But I added regression test cases for both of those options too, just to make sure we don't break it in future. Per bug #17644 from Matthijs van der Vleuten. Back-patch to v14 where these constructs were added. Discussion: https://postgr.es/m/17644-3bd1f3036d6d7a16@postgresql.org
* Revert "Optimize order of GROUP BY keys".Tom Lane2022-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit db0d67db2401eb6238ccc04c6407a4fd4f985832 and several follow-on fixes. The idea of making a cost-based choice of the order of the sorting columns is not fundamentally unsound, but it requires cost information and data statistics that we don't really have. For example, relying on procost to distinguish the relative costs of different sort comparators is pretty pointless so long as most such comparator functions are labeled with cost 1.0. Moreover, estimating the number of comparisons done by Quicksort requires more than just an estimate of the number of distinct values in the input: you also need some idea of the sizes of the larger groups, if you want an estimate that's good to better than a factor of three or so. That's data that's often unknown or not very reliable. Worse, to arrive at estimates of the number of calls made to the lower-order-column comparison functions, the code needs to make estimates of the numbers of distinct values of multiple columns, which are necessarily even less trustworthy than per-column stats. Even if all the inputs are perfectly reliable, the cost algorithm as-implemented cannot offer useful information about how to order sorting columns beyond the point at which the average group size is estimated to drop to 1. Close inspection of the code added by db0d67db2 shows that there are also multiple small bugs. These could have been fixed, but there's not much point if we don't trust the estimates to be accurate in-principle. Finally, the changes in cost_sort's behavior made for very large changes (often a factor of 2 or so) in the cost estimates for all sorting operations, not only those for multi-column GROUP BY. That naturally changes plan choices in many situations, and there's precious little evidence to show that the changes are for the better. Given the above doubts about whether the new estimates are really trustworthy, it's hard to summon much confidence that these changes are better on the average. Since we're hard up against the release deadline for v15, let's revert these changes for now. We can always try again later. Note: in v15, I left T_PathKeyInfo in place in nodes.h even though it's unreferenced. Removing it would be an ABI break, and it seems a bit late in the release cycle for that. Discussion: https://postgr.es/m/TYAPR01MB586665EB5FB2C3807E893941F5579@TYAPR01MB5866.jpnprd01.prod.outlook.com
* Change some errdetail() to errdetail_internal()Alvaro Herrera2022-09-28
| | | | | | | | | | | | This prevents marking the argument string for translation for gettext, and it also prevents the given string (which is already translated) from being translated at runtime. Also, mark the strings used as arguments to check_rolespec_name for translation. Backpatch all the way back as appropriate. None of this is caught by any tests (necessarily so), so I verified it manually.
* Suppress variable-set-but-not-used warnings from clang 15.Tom Lane2022-09-20
| | | | | | | | | | | | | | | | | | | | | | | | clang 15+ will issue a set-but-not-used warning when the only use of a variable is in autoincrements (e.g., "foo++;"). That's perfectly sensible, but it detects a few more cases that we'd not noticed before. Silence the warnings with our usual methods, such as PG_USED_FOR_ASSERTS_ONLY, or in one case by actually removing a useless variable. One thing that we can't nicely get rid of is that with %pure-parser, Bison emits "yynerrs" as a local variable that falls foul of this warning. To silence those, I inserted "(void) yynerrs;" in the top-level productions of affected grammars. Per recently-established project policy, this is a candidate for back-patching into out-of-support branches: it suppresses annoying compiler warnings but changes no behavior. Hence, back-patch to 9.5, which is as far as these patches go without issues. (A preliminary check shows that the prior branches need some other set-but-not-used cleanups too, so I'll leave them for another day.) Discussion: https://postgr.es/m/514615.1663615243@sss.pgh.pa.us
* Fix NaN comparison in circle_same testDaniel Gustafsson2022-09-12
| | | | | | | | | | | | | | | | | | | | | | Commit c4c340088 changed geometric operators to use float4 and float8 functions, and handle NaN's in a better way. The circle sameness test had a typo in the code which resulted in all comparisons with the left circle having a NaN radius considered same. postgres=# select '<(0,0),NaN>'::circle ~= '<(0,0),1>'::circle; ?column? ---------- t (1 row) This fixes the sameness test to consider the radius of both the left and right circle. Backpatch to v12 where this was introduced. Author: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CAEudQAo8dK=yctg2ZzjJuzV4zgOPBxRU5+Kb+yatFiddtQk6Rw@mail.gmail.com Backpatch-through: v12
* Revert SQL/JSON featuresAndrew Dunstan2022-09-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The reverts the following and makes some associated cleanups: commit f79b803dc: Common SQL/JSON clauses commit f4fb45d15: SQL/JSON constructors commit 5f0adec25: Make STRING an unreserved_keyword. commit 33a377608: IS JSON predicate commit 1a36bc9db: SQL/JSON query functions commit 606948b05: SQL JSON functions commit 49082c2cc: RETURNING clause for JSON() and JSON_SCALAR() commit 4e34747c8: JSON_TABLE commit fadb48b00: PLAN clauses for JSON_TABLE commit 2ef6f11b0: Reduce running time of jsonb_sqljson test commit 14d3f24fa: Further improve jsonb_sqljson parallel test commit a6baa4bad: Documentation for SQL/JSON features commit b46bcf7a4: Improve readability of SQL/JSON documentation. commit 112fdb352: Fix finalization for json_objectagg and friends commit fcdb35c32: Fix transformJsonBehavior commit 4cd8717af: Improve a couple of sql/json error messages commit f7a605f63: Small cleanups in SQL/JSON code commit 9c3d25e17: Fix JSON_OBJECTAGG uniquefying bug commit a79153b7a: Claim SQL standard compliance for SQL/JSON features commit a1e7616d6: Rework SQL/JSON documentation commit 8d9f9634e: Fix errors in copyfuncs/equalfuncs support for JSON node types. commit 3c633f32b: Only allow returning string types or bytea from json_serialize commit 67b26703b: expression eval: Fix EEOP_JSON_CONSTRUCTOR and EEOP_JSONEXPR size. The release notes are also adjusted. Backpatch to release 15. Discussion: https://postgr.es/m/40d2c882-bcac-19a9-754d-4299e1d87ac7@postgresql.org
* Defend against stack overrun in a few more places.Tom Lane2022-08-24
| | | | | | | | | | | | | | | | | | | SplitToVariants() in the ispell code, lseg_inside_poly() in geo_ops.c, and regex_selectivity_sub() in selectivity estimation could recurse until stack overflow; fix by adding check_stack_depth() calls. So could next() in the regex compiler, but that case is better fixed by converting its tail recursion to a loop. (We probably get better code that way too, since next() can now be inlined into its sole caller.) There remains a reachable stack overrun in the Turkish stemmer, but we'll need some advice from the Snowball people about how to fix that. Per report from Egor Chindyaskin and Alexander Lakhin. These mistakes are old, so back-patch to all supported branches. Richard Guo and Tom Lane Discussion: https://postgr.es/m/1661334672.728714027@f473.i.mail.ru
* Remove shadowed local variables that are new in v15David Rowley2022-08-20
| | | | | | | | | | | | | | | | | | | | | | | | | Compiling with -Wshadow=compatible-local yields quite a few warnings about local variables being shadowed by compatible local variables in an inner scope. Of course, this is perfectly valid in C, but we have had bugs in the past as a result of developers failing to notice this. af7d270dd is a recent example. Here we do a cleanup of warnings we receive from -Wshadow=compatible-local for code which is new to PostgreSQL 15. We've yet to have the discussion about if we actually ever want to run that as a standard compilation flag. We'll need to at least get the number of warnings down to something easier to manage before we can realistically consider if we want this or not. This commit is the first step towards reducing the warnings. The changes being made here are all fairly trivial. Because of that, and the fact that v15 is still in beta, this is being back-patched into 15. It seems more risky not to do this as the risk of future bugs is increased by the additional conflicts that this commit could cause for any future bug fixes touching the same areas as this commit. Author: Justin Pryzby Discussion: https://postgr.es/m/20220817145434.GC26426%40telsasoft.com Backpatch-through: 15
* Preserve memory context of VarStringSortSupport buffers.Tom Lane2022-08-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When enlarging the work buffers of a VarStringSortSupport object, varstrfastcmp_locale was careful to keep them in the ssup_cxt memory context; but varstr_abbrev_convert just used palloc(). The latter creates a hazard that the buffers could be freed out from under the VarStringSortSupport object, resulting in stomping on whatever gets allocated in that memory later. In practice, because we only use this code for ICU collations (cf. 3df9c374e), the problem is confined to use of ICU collations. I believe it may have been unreachable before the introduction of incremental sort, too, as traditional sorting usually just uses one context for the duration of the sort. We could fix this by making the broken stanzas in varstr_abbrev_convert match the non-broken ones in varstrfastcmp_locale. However, it seems like a better idea to dodge the issue altogether by replacing the pfree-and-allocate-anew coding with repalloc, which automatically preserves the chunk's memory context. This fix does add a few cycles because repalloc will copy the chunk's content, which the existing coding assumes is useless. However, we don't expect that these buffer enlargement operations are performance-critical. Besides that, it's far from obvious that copying the buffer contents isn't required, since these stanzas make no effort to mark the buffers invalid by resetting last_returned, cache_blob, etc. That seems to be safe upon examination, but it's fragile and could easily get broken in future, which wouldn't get revealed in testing with short-to-moderate-size strings. Per bug #17584 from James Inform. Whether or not the issue is reachable in the older branches, this code has been broken on its own terms from its introduction, so patch all the way back. Discussion: https://postgr.es/m/17584-95c79b4a7d771f44@postgresql.org
* Change type "char"'s I/O format for non-ASCII characters.Tom Lane2022-08-02
| | | | | | | | | | | | | | | | | | | | | | | | | | Previously, a byte with the high bit set was just transmitted as-is by charin() and charout(). This is problematic if the database encoding is multibyte, because the result of charout() won't be validly encoded, which breaks various stuff that expects all text strings to be validly encoded. We've previously decided to enforce encoding validity rather than try to individually harden each place that might have a problem with such strings, so it's time to do something about "char". To fix, represent high-bit-set characters as \ooo (backslash and three octal digits), following the ancient "escape" format for bytea. charin() will continue to accept the old way as well, though that is only reachable in single-byte encodings. Add some test cases just so there is coverage for this code. We'll otherwise leave this question undocumented as it was before, because we don't really want to encourage end-user use of "char". For the moment, back-patch into v15 so that this change appears in 15beta3. If there's not great pushback we should consider absorbing this change into the older branches. Discussion: https://postgr.es/m/2318797.1638558730@sss.pgh.pa.us
* Fix trim_array() for zero-dimensional array argument.Tom Lane2022-07-31
| | | | | | | | | | | | | | The code tried to access ARR_DIMS(v)[0] and ARR_LBOUND(v)[0] whether or not those values exist. This made the range check on the "n" argument unstable --- it might or might not fail, and if it did it would report garbage for the allowed upper limit. These bogus accesses would probably annoy Valgrind, and if you were very unlucky even lead to SIGSEGV. Report and fix by Martin Kalcher. Back-patch to v14 where this function was added. Discussion: https://postgr.es/m/baaeb413-b8a8-4656-5757-ef347e5ec11f@aboutsource.net
* Fix path reference when parsing pg_ident.conf for pg_ident_file_mappingsMichael Paquier2022-07-26
| | | | | | | | | | | | | | | | | | | | | | | | Since a2c8499, HbaFileName (default pg_hba.conf) was getting used instead of IdentFileName (default pg_ident.conf) as the parent file to use as reference when parsing the contents of pg_ident.conf, with pg_ident.conf correctly opened, when feeding this information to pg_ident_file_mappings. This had two consequences: - On an I/O error when reading pg_ident.conf, the user would get an ERROR message referring to pg_hba.conf and not pg_ident.conf. - When reading an external file with a relative path using '@' in pg_ident.conf, the directory used to look at the file to load would be the base directory of pg_hba.conf rather than the one of pg_ident.conf, leading to errors in pg_ident_file_mappings inconsistent with what gets loaded at startup when pg_ident.conf and pg_hba.conf are located in different directories. This error only impacted the SQL view pg_ident_file_mappings that uses a logic new to v15 to fill the view with the parsed information, not the code paths loading these authentication files at startup. Author: Julien Rouhaud Discussion: https://postgr.es/m/20220726050402.vsr6fmz7rsgpmdz3@jrouhaud Backpatch-through: 15
* Close old gap in dependency checks for functions returning composite.Tom Lane2022-07-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The dependency logic failed to register a column-level dependency when a view or rule contains a reference to a specific column of the result of a function-returning-composite. That meant you could drop the column from the composite type, causing trouble for future executions of the view. We've known about this for years, but never summoned the energy to actually fix it, instead installing various low-level defenses to prevent crashing on references to dropped columns. We had to do that to plug the hole in stable branches, where there might be pre-existing broken references; but let's fix the root cause today. To do that, add some logic (borrowed from get_rte_attribute_is_dropped) to find_expr_references_walker, to check whether a Var referencing an RTE_FUNCTION RTE is referencing a column of a composite type, and if so add the proper dependency. However ... it seems mighty unwise to remove said low-level defenses, since there could be other bugs now or in the future that allow reaching them. By the same token, letting those defenses go untested seems unwise. Hence, rather than just dropping the associated test cases, hack them to continue working by the expedient of manually dropping the pg_depend entries that this fix installs. Back-patch into v15. I don't want to risk changing this behavior in stable branches, but it seems not too late for v15. (Since we have already forced initdb for beta3, we can be sure that all production v15 installations will have these added dependencies.) Discussion: https://postgr.es/m/182492.1658431155@sss.pgh.pa.us
* Fix ruleutils issues with dropped cols in functions-returning-composite.Tom Lane2022-07-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to lack of concern for the case in the dependency code, it's possible to drop a column of a composite type even though stored queries have references to the dropped column via functions-in-FROM that return the composite type. There are "soft" references, namely FROM-clause aliases for such columns, and "hard" references, that is actual Vars referring to them. The right fix for hard references is to add dependencies preventing the drop; something we've known for many years and not done (and this commit still doesn't address it). A "soft" reference shouldn't prevent a drop though. We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but nobody had noticed that the current behavior can result in dump/reload failures, because ruleutils.c can print more column aliases than the underlying composite type now has. So we need to rejigger the column-alias-handling code to treat such columns as dropped and not print aliases for them. Rather than writing new code for this, I used expandRTE() which already knows how to figure out which function result columns are dropped. I'd initially thought maybe we could use expandRTE() in all cases, but that fails for EXPLAIN's purposes, because the planner strips a lot of RTE infrastructure that expandRTE() needs. So this patch just uses it for unplanned function RTEs and otherwise does things the old way. If there is a hard reference (Var), then removing the column alias causes us to fail to print the Var, since there's no longer a name to print. Failing seems less desirable than printing a made-up name, so I made it print "?dropped?column?" instead. Per report from Timo Stolz. Back-patch to all supported branches. Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
* Tweak detail and hint messages to be consistent with project policyMichael Paquier2022-07-20
| | | | | | | | | | | Detail and hint messages should be full sentences and should end with a period, but some of the messages newly-introduced in v15 did not follow that. Author: Justin Pryzby Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/20220719120948.GF12702@telsasoft.com Backpatch-through: 15
* Fix omissions in support for the "regcollation" type.Tom Lane2022-07-17
| | | | | | | | | | | | | The patch that added regcollation doesn't seem to have been too thorough about supporting it everywhere that other reg* types are supported. Fix that. (The find_expr_references omission is moderately serious, since it could result in missing expression dependencies. The others are less exciting.) Noted while fixing bug #17483. Back-patch to v13 where regcollation was added. Discussion: https://postgr.es/m/1423433.1652722406@sss.pgh.pa.us
* Invent qsort_interruptible().Tom Lane2022-07-12
| | | | | | | | | | | | | | | | | | | | | | | | Justin Pryzby reported that some scenarios could cause gathering of extended statistics to spend many seconds in an un-cancelable qsort() operation. To fix, invent qsort_interruptible(), which is just like qsort_arg() except that it will also do CHECK_FOR_INTERRUPTS every so often. This bloats the backend by a couple of kB, which seems like a good investment. (We considered just enabling CHECK_FOR_INTERRUPTS in the existing qsort and qsort_arg functions, but there are some callers for which that'd demonstrably be unsafe. Opt-in seems like a better way.) For now, just apply qsort_interruptible() in statistics collection. There's probably more places where it could be useful, but we can always change other call sites as we find problems. Back-patch to v14. Before that we didn't have extended stats on expressions, so that the problem was less severe. Also, this patch depends on the sort_template infrastructure introduced in v14. Tom Lane and Justin Pryzby Discussion: https://postgr.es/m/20220509000108.GQ28830@telsasoft.com
* Remove %error-verbose directive from jsonpath parserAndrew Dunstan2022-07-03
| | | | | | | | | | | None of the other bison parsers contains this directive, and it gives rise to some unfortunate and impenetrable messages, so just remove it. Backpatch to release 12, where it was introduced. Per gripe from Erik Rijkers Discussion: https://postgr.es/m/ba069ce2-a98f-dc70-dc17-2ccf2a9bf7c7@xs4all.nl
* Fix visibility check when XID is committed in CLOG but not in procarray.Heikki Linnakangas2022-06-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TransactionIdIsInProgress had a fast path to return 'false' if the single-item CLOG cache said that the transaction was known to be committed. However, that was wrong, because a transaction is first marked as committed in the CLOG but doesn't become visible to others until it has removed its XID from the proc array. That could lead to an error: ERROR: t_xmin is uncommitted in tuple to be updated or for an UPDATE to go ahead without blocking, before the previous UPDATE on the same row was made visible. The window is usually very short, but synchronous replication makes it much wider, because the wait for synchronous replica happens in that window. Another thing that makes it hard to hit is that it's hard to get such a commit-in-progress transaction into the single item CLOG cache. Normally, if you call TransactionIdIsInProgress on such a transaction, it determines that the XID is in progress without checking the CLOG and without populating the cache. One way to prime the cache is to explicitly call pg_xact_status() on the XID. Another way is to use a lot of subtransactions, so that the subxid cache in the proc array is overflown, making TransactionIdIsInProgress rely on pg_subtrans and CLOG checks. This has been broken ever since it was introduced in 2008, but the race condition is very hard to hit, especially without synchronous replication. There were a couple of reports of the error starting from summer 2021, but no one was able to find the root cause then. TransactionIdIsKnownCompleted() is now unused. In 'master', remove it, but I left it in place in backbranches in case it's used by extensions. Also change pg_xact_status() to check TransactionIdIsInProgress(). Previously, it only checked the CLOG, and returned "committed" before the transaction was actually made visible to other queries. Note that this also means that you cannot use pg_xact_status() to reproduce the bug anymore, even if the code wasn't fixed. Report and analysis by Konstantin Knizhnik. Patch by Simon Riggs, with the pg_xact_status() change added by me. Author: Simon Riggs Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/flat/4da7913d-398c-e2ad-d777-f752cf7f0bbb%40garret.ru
* Avoid ERRCODE_INTERNAL_ERROR in oracle_compat.c functions.Tom Lane2022-05-26
| | | | | | | | | | | | | | | | | | | repeat() checked for integer overflow during its calculation of the required output space, but it just passed the resulting integer to palloc(). This meant that result sizes between 1GB and 2GB led to ERRCODE_INTERNAL_ERROR, "invalid memory alloc request size" rather than ERRCODE_PROGRAM_LIMIT_EXCEEDED, "requested length too large". That seems like a bit of a wart, so add an explicit AllocSizeIsValid check to make these error cases uniform. Do likewise in the sibling functions lpad() etc. While we're here, also modernize their overflow checks to use pg_mul_s32_overflow() etc instead of expensive divisions. Per complaint from Japin Li. This is basically cosmetic, so I don't feel a need to back-patch. Discussion: https://postgr.es/m/ME3P282MB16676ED32167189CB0462173B6D69@ME3P282MB1667.AUSP282.PROD.OUTLOOK.COM
* Show 'AS "?column?"' explicitly when it's important.Tom Lane2022-05-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | ruleutils.c was coded to suppress the AS label for a SELECT output expression if the column name is "?column?", which is the parser's fallback if it can't think of something better. This is fine, and avoids ugly clutter, so long as (1) nothing further up in the parse tree relies on that column name or (2) the same fallback would be assigned when the rule or view definition is reloaded. Unfortunately (2) is far from certain, both because ruleutils.c might print the expression in a different form from how it was originally written and because FigureColname's rules might change in future releases. So we shouldn't rely on that. Detecting exactly whether there is any outer-level use of a SELECT column name would be rather expensive. This patch takes the simpler approach of just passing down a flag indicating whether there *could* be any outer use; for example, the output column names of a SubLink are not referenceable, and we also do not care about the names exposed by the right-hand side of a setop. This is sufficient to suppress unwanted clutter in all but one case in the regression tests. That seems like reasonable evidence that it won't be too much in users' faces, while still fixing the cases we need to fix. Per bug #17486 from Nicolas Lutic. This issue is ancient, so back-patch to all supported branches. Discussion: https://postgr.es/m/17486-1ad6fd786728b8af@postgresql.org
* Rename JsonIsPredicate.value_type, fix JSON backend/nodes/ infrastructure.Tom Lane2022-05-13
| | | | | | | | | | | | | | | | | | | | | I started out with the intention to rename value_type to item_type to avoid a collision with a typedef name that appears on some platforms. Along the way, I noticed that the adjacent field "format" was not being correctly handled by the backend/nodes/ infrastructure functions: copyfuncs.c erroneously treated it as a scalar, while equalfuncs, outfuncs, and readfuncs omitted handling it at all. This looks like it might be cosmetic at the moment because the field is always NULL after parse analysis; but that's likely a bug in itself, and the code's certainly not very future-proof. Let's fix it while we can still do so without forcing an initdb on beta testers. Further study found a few other inconsistencies in the backend/nodes/ infrastructure for the recently-added JSON node types, so fix those too. catversion bumped because of potential change in stored rules. Discussion: https://postgr.es/m/526703.1652385613@sss.pgh.pa.us
* Indent C code in flex and bison filesPeter Eisentraut2022-05-13
| | | | | | In the style of pgindent, done semi-manually. Discussion: https://www.postgresql.org/message-id/flat/7d062ecc-7444-23ec-a159-acd8adf9b586%40enterprisedb.com
* Pre-beta mechanical code beautification.Tom Lane2022-05-12
| | | | | Run pgindent, pgperltidy, and reformat-dat-files. I manually fixed a couple of comments that pgindent uglified.