aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils/adt
Commit message (Collapse)AuthorAge
* Improve error message about valid value for distance in phrase operator.Fujii Masao2021-08-25
| | | | | | | | | | | | | | The distance in phrase operator must be an integer value between zero and MAXENTRYPOS inclusive. But previously the error message about its valid value included the information about its upper limit but not lower limit (i.e., zero). This commit improves the error message so that it also includes the information about its lower limit. Back-patch to v9.6 where full-text phrase search was supported. Author: Kyotaro Horiguchi Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/20210819.170315.1413060634876301811.horikyota.ntt@gmail.com
* Revert refactoring of hex code to src/common/Michael Paquier2021-08-19
| | | | | | | | | | | | | | | | | | | | | | | This is a combined revert of the following commits: - c3826f8, a refactoring piece that moved the hex decoding code to src/common/. This code was cleaned up by aef8948, as it originally included no overflow checks in the same way as the base64 routines in src/common/ used by SCRAM, making it unsafe for its purpose. - aef8948, a more advanced refactoring of the hex encoding/decoding code to src/common/ that added sanity checks on the result buffer for hex decoding and encoding. As reported by Hans Buschmann, those overflow checks are expensive, and it is possible to see a performance drop in the decoding/encoding of bytea or LOs the longer they are. Simple SQLs working on large bytea values show a clear difference in perf profile. - ccf4e27, a cleanup made possible by aef8948. The reverts of all those commits bring back the performance of hex decoding and encoding back to what it was in ~13. Fow now and post-beta3, this is the simplest option. Reported-by: Hans Buschmann Discussion: https://postgr.es/m/1629039545467.80333@nidsa.net Backpatch-through: 14
* Let regexp_replace() make use of REG_NOSUB when feasible.Tom Lane2021-08-09
| | | | | | | | | | | | | | | | | | If the replacement string doesn't contain \1...\9, then we don't need sub-match locations, so we can use the REG_NOSUB optimization here too. There's already a pre-scan of the replacement string to look for backslashes, so extend that to check for digits, and refactor to allow that to happen before we compile the regexp. While at it, try to speed up the pre-scan by using memchr() instead of a handwritten loop. It's likely that this is lost in the noise compared to the regexp processing proper, but maybe not. In any case, this coding is shorter. Also, add some test cases to improve the poor coverage of appendStringInfoRegexpSubstr(). Discussion: https://postgr.es/m/3534632.1628536485@sss.pgh.pa.us
* Avoid determining regexp subexpression matches, when possible.Tom Lane2021-08-09
| | | | | | | | | | | | | | | | | | | | | Identifying the precise match locations for parenthesized subexpressions is a fairly expensive task given the way our regexp engine works, both at regexp compile time (where we must create an optimized NFA for each parenthesized subexpression) and at runtime (where determining exact match locations requires laborious search). Up to now we've made little attempt to optimize this situation. This patch identifies cases where we know at compile time that we won't need to know subexpression match locations, and teaches the regexp compiler to not bother creating per-subexpression regexps for parenthesis pairs that are not referenced by backrefs elsewhere in the regexp. (To preserve semantics, we obviously still have to pin down the match locations of backref references.) Users could have obtained the same results before this by being careful to write "non capturing" parentheses wherever possible, but few people bother with that. Discussion: https://postgr.es/m/2219936.1628115334@sss.pgh.pa.us
* Remove some unnecessary casts in format argumentsPeter Eisentraut2021-08-08
| | | | | | We can use %zd or %zu directly, no need to cast to int. Conversely, some code was casting away from int when it could be using %d directly.
* Adjust the integer overflow tests in the numeric code.Dean Rasheed2021-08-06
| | | | | | | | | | | | | | | | | | | | Formerly, the numeric code tested whether an integer value of a larger type would fit in a smaller type by casting it to the smaller type and then testing if the reverse conversion produced the original value. That's perfectly fine, except that it caused a test failure on buildfarm animal castoroides, most likely due to a compiler bug. Instead, do these tests by comparing against PG_INT16/32_MIN/MAX. That matches existing code in other places, such as int84(), which is more widely tested, and so is less likely to go wrong. While at it, add regression tests covering the numeric-to-int8/4/2 conversions, and adjust the recently added tests to the style of 434ddfb79a (on the v11 branch) to make failures easier to diagnose. Per buildfarm via Tom Lane, reviewed by Tom Lane. Discussion: https://postgr.es/m/2394813.1628179479%40sss.pgh.pa.us
* Fix wordingPeter Eisentraut2021-08-06
|
* Fix division-by-zero error in to_char() with 'EEEE' format.Dean Rasheed2021-08-05
| | | | | | | | | | | | | | | | | | | | This fixes a long-standing bug when using to_char() to format a numeric value in scientific notation -- if the value's exponent is less than -NUMERIC_MAX_DISPLAY_SCALE-1 (-1001), it produced a division-by-zero error. The reason for this error was that get_str_from_var_sci() divides its input by 10^exp, which it produced using power_var_int(). However, the underflow test in power_var_int() causes it to return zero if the result scale is too small. That's not a problem for power_var_int()'s only other caller, power_var(), since that limits the rscale to 1000, but in get_str_from_var_sci() the exponent can be much smaller, requiring a much larger rscale. Fix by introducing a new function to compute 10^exp directly, with no rscale limit. This also allows 10^exp to be computed more efficiently, without any numeric multiplication, division or rounding. Discussion: https://postgr.es/m/CAEZATCWhojfH4whaqgUKBe8D5jNHB8ytzemL-PnRx+KCTyMXmg@mail.gmail.com
* pgstat: split reporting/fetching of bgwriter and checkpointer stats.Andres Freund2021-08-04
| | | | | | | | | | | | | | | | These have been unrelated since bgwriter and checkpointer were split into two processes in 806a2aee379. As there several pending patches (shared memory stats, extending the set of tracked IO / buffer statistics) that are made a bit more awkward by the grouping, split them. Done separately to make reviewing easier. This does *not* change the contents of pg_stat_bgwriter or move fields out of bgwriter/checkpointer stats that arguably do not belong in either. However pgstat_fetch_global() was renamed and split into pgstat_fetch_stat_checkpointer() and pgstat_fetch_stat_bgwriter(). Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de
* Add assorted new regexp_xxx SQL functions.Tom Lane2021-08-03
| | | | | | | | | | | | | | | | This patch adds new functions regexp_count(), regexp_instr(), regexp_like(), and regexp_substr(), and extends regexp_replace() with some new optional arguments. All these functions follow the definitions used in Oracle, although there are small differences in the regexp language due to using our own regexp engine -- most notably, that the default newline-matching behavior is different. Similar functions appear in DB2 and elsewhere, too. Aside from easing portability, these functions are easier to use for certain tasks than our existing regexp_match[es] functions. Gilles Darold, heavily revised by me Discussion: https://postgr.es/m/fc160ee0-c843-b024-29bb-97b5da61971f@darold.net
* interval: round values when spilling to monthsBruce Momjian2021-08-03
| | | | | | | | | | | Previously spilled units greater than months were truncated to months. Also document the spill behavior. Reported-by: Bryn Llewelly Discussion: https://postgr.es/m/BDAE4B56-3337-45A2-AC8A-30593849D6C0@yugabyte.com Backpatch-through: master
* Fix corner-case errors and loss of precision in numeric_power().Dean Rasheed2021-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a couple of related problems that arise when raising numbers to very large powers. Firstly, when raising a negative number to a very large integer power, the result should be well-defined, but the previous code would only cope if the exponent was small enough to go through power_var_int(). Otherwise it would throw an internal error, attempting to take the logarithm of a negative number. Fix this by adding suitable handling to the general case in power_var() to cope with negative bases, checking for integer powers there. Next, when raising a (positive or negative) number whose absolute value is slightly less than 1 to a very large power, the result should approach zero as the power is increased. However, in some cases, for sufficiently large powers, this would lose all precision and return 1 instead of 0. This was due to the way that the local_rscale was being calculated for the final full-precision calculation: local_rscale = rscale + (int) val - ln_dweight + 8 The first two terms on the right hand side are meant to give the number of significant digits required in the result ("val" being the estimated result weight). However, this failed to account for the fact that rscale is clipped to a maximum of NUMERIC_MAX_DISPLAY_SCALE (1000), and the result weight might be less then -1000, causing their sum to be negative, leading to a loss of precision. Fix this by forcing the number of significant digits calculated to be nonnegative. It's OK for it to be zero (when the result weight is less than -1000), since the local_rscale value then includes a few extra digits to ensure an accurate result. Finally, add additional underflow checks to exp_var() and power_var(), so that they consistently return zero for cases like this where the result is indistinguishable from zero. Some paths through this code already returned zero in such cases, but others were throwing overflow errors. Dean Rasheed, reviewed by Yugo Nagata. Discussion: http://postgr.es/m/CAEZATCW6Dvq7+3wN3tt5jLj-FyOcUgT5xNoOqce5=6Su0bCR0w@mail.gmail.com
* Disallow negative strides in date_bin()John Naylor2021-07-28
| | | | | | | | | | | It's not clear what the semantics of negative strides would be, so throw an error instead. Per report from Bauyrzhan Sakhariyev Reviewed-by: Tom Lane, Michael Paquier Discussion: https://www.postgresql.org/message-id/CAKpL73vZmLuFVuwF26FJ%2BNk11PVHhAnQRoREFcA03x7znRoFvA%40mail.gmail.com Backpatch to v14
* Avoid using ambiguous word "non-negative" in error messages.Fujii Masao2021-07-28
| | | | | | | | | | | | | | | | | | | | | | The error messages using the word "non-negative" are confusing because it's ambiguous about whether it accepts zero or not. This commit improves those error messages by replacing it with less ambiguous word like "greater than zero" or "greater than or equal to zero". Also this commit added the note about the word "non-negative" to the error message style guide, to help writing the new error messages. When postgres_fdw option fetch_size was set to zero, previously the error message "fetch_size requires a non-negative integer value" was reported. This error message was outright buggy. Therefore back-patch to all supported versions where such buggy error message could be thrown. Reported-by: Hou Zhijie Author: Bharath Rupireddy Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/OS0PR01MB5716415335A06B489F1B3A8194569@OS0PR01MB5716.jpnprd01.prod.outlook.com
* Use the "pg_temp" schema alias in EXPLAIN and related output.Tom Lane2021-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch causes EXPLAIN output to refer to objects that are in the current session's temp schema with the "pg_temp" schema alias rather than that schema's actual name. This is useful for our own testing purposes since it will stabilize EXPLAIN VERBOSE output for such cases, allowing us to use that in regression tests. It should be less confusing for end users too. Since ruleutils.c needs to change behavior for this, the change also leaks into a few other users of ruleutils.c, for example pg_get_viewdef(). AFAICS that won't cause any problems. We did find that aggressively trying to change this behavior across-the-board would cause issues, but as long as "pg_temp" only appears within generated SQL text, I think it'll be fine. Along the way, make get_namespace_name_or_temp conform to the same API as get_namespace_name, ie that it returns a palloc'd string or NULL. The current behavior hasn't caused any bugs since no callers attempt to pfree the result, but if it gets more widespread usage that could become a problem. Amul Sul, reviewed and extended by me Discussion: https://postgr.es/m/CAAJ_b97W=QaGmag9AhWNbmx3uEYsNkXWL+OVW1_E1D3BtgWvtw@mail.gmail.com
* Allow numeric scale to be negative or greater than precision.Dean Rasheed2021-07-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Formerly, when specifying NUMERIC(precision, scale), the scale had to be in the range [0, precision], which was per SQL spec. This commit extends the range of allowed scales to [-1000, 1000], independent of the precision (whose valid range remains [1, 1000]). A negative scale implies rounding before the decimal point. For example, a column might be declared with a scale of -3 to round values to the nearest thousand. Note that the display scale remains non-negative, so in this case the display scale will be zero, and all digits before the decimal point will be displayed. A scale greater than the precision supports fractional values with zeros immediately after the decimal point. Take the opportunity to tidy up the code that packs, unpacks and validates the contents of a typmod integer, encapsulating it in a small set of new inline functions. Bump the catversion because the allowed contents of atttypmod have changed for numeric columns. This isn't a change that requires a re-initdb, but negative scale values in the typmod would confuse old backends. Dean Rasheed, with additional improvements by Tom Lane. Reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCWdNLgpKihmURF8nfofP0RFtAKJ7ktY6GcZOPnMfUoRqA@mail.gmail.com
* Fix division by zero error in date_binJohn Naylor2021-07-22
| | | | | | Bauyrzhan Sakhariyev, via Github Backpatch to v14
* Use l*_node() family of functions where appropriatePeter Eisentraut2021-07-19
| | | | | | | Instead of castNode(…, lfoo(…)) Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/flat/87eecahraj.fsf@wibble.ilmari.org
* Support for unnest(multirange)Alexander Korotkov2021-07-18
| | | | | | | | | | | | | | | | It has been spotted that multiranges lack of ability to decompose them into individual ranges. Subscription and proper expanded object representation require substantial work, and it's too late for v14. This commit provides the implementation of unnest(multirange), which is quite trivial. unnest(multirange) is defined as a polymorphic procedure. Catversion is bumped. Reported-by: Jonathan S. Katz Discussion: https://postgr.es/m/flat/60258efe-bd7e-4886-82e1-196e0cac5433%40postgresql.org Author: Alexander Korotkov Reviewed-by: Justin Pryzby, Jonathan S. Katz, Zhihong Yu, Tom Lane Reviewed-by: Alvaro Herrera
* Rename debug_invalidate_system_caches_always to debug_discard_caches.Tom Lane2021-07-13
| | | | | | | | The name introduced by commit 4656e3d66 was agreed to be unreasonably long. To match this change, rename initdb's recently-added --clobber-cache option to --discard-caches. Discussion: https://postgr.es/m/1374320.1625430433@sss.pgh.pa.us
* Fix numeric_mul() overflow due to too many digits after decimal point.Dean Rasheed2021-07-10
| | | | | | | | | This fixes an overflow error when using the numeric * operator if the result has more than 16383 digits after the decimal point by rounding the result. Overflow errors should only occur if the result has too many digits *before* the decimal point. Discussion: https://postgr.es/m/CAEZATCUmeFWCrq2dNzZpRj5+6LfN85jYiDoqm+ucSXhb9U2TbA@mail.gmail.com
* Teach pg_size_pretty and pg_size_bytes about petabytesDavid Rowley2021-07-09
| | | | | | | | | | | | | | | | There was talk about adding units all the way up to yottabytes but it seems quite far-fetched that anyone would need those. Since such large units are not exactly commonplace, it seems unlikely that having pg_size_pretty outputting unit any larger than petabytes would actually be helpful to anyone. Since petabytes are on the horizon, let's just add those only. Maybe one day we'll get to add additional units, but it will likely be a while before we'll need to think beyond petabytes in regards to the size of a database. Author: David Christensen Discussion: https://postgr.es/m/CAOxo6XKmHc_WZip-x5QwaOqFEiCq_SVD0B7sbTZQk+qqcn2qaw@mail.gmail.com
* Use a lookup table for units in pg_size_pretty and pg_size_bytesDavid Rowley2021-07-09
| | | | | | | | | | | | | | | | | | | | | We've grown 2 versions of pg_size_pretty over the years, one for BIGINT and one for NUMERIC. Both should output the same, but keeping them in sync is harder than needed due to neither function sharing a source of truth about which units to use and how to transition to the next largest unit. Here we add a static array which defines the units that we recognize and have both pg_size_pretty and pg_size_pretty_numeric use it. This will make adding any units in the future a very simple task. The table contains all information required to allow us to also modify pg_size_bytes to use the lookup table, so adjust that too. There are no behavioral changes here. Author: David Rowley Reviewed-by: Dean Rasheed, Tom Lane, David Christensen Discussion: https://postgr.es/m/CAApHDvru1F7qsEVL-iOHeezJ+5WVxXnyD_Jo9nht+Eh85ekK-Q@mail.gmail.com
* Fix incorrect return value in pg_size_pretty(bigint)David Rowley2021-07-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to how pg_size_pretty(bigint) was implemented, it's possible that when given a negative number of bytes that the returning value would not match the equivalent positive return value when given the equivalent positive number of bytes. This was due to two separate issues. 1. The function used bit shifting to convert the number of bytes into larger units. The rounding performed by bit shifting is not the same as dividing. For example -3 >> 1 = -2, but -3 / 2 = -1. These two operations are only equivalent with positive numbers. 2. The half_rounded() macro rounded towards positive infinity. This meant that negative numbers rounded towards zero and positive numbers rounded away from zero. Here we fix #1 by dividing the values instead of bit shifting. We fix #2 by adjusting the half_rounded macro always to round away from zero. Additionally, adjust the pg_size_pretty(numeric) function to be more explicit that it's using division rather than bit shifting. A casual observer might have believed bit shifting was used due to a static function being named numeric_shift_right. However, that function was calculating the divisor from the number of bits and performed division. Here we make that more clear. This change is just cosmetic and does not affect the return value of the numeric version of the function. Here we also add a set of regression tests both versions of pg_size_pretty() which test the values directly before and after the function switches to the next unit. This bug was introduced in 8a1fab36a. Prior to that negative values were always displayed in bytes. Author: Dean Rasheed, David Rowley Discussion: https://postgr.es/m/CAEZATCXnNW4HsmZnxhfezR5FuiGgp+mkY4AzcL5eRGO4fuadWg@mail.gmail.com Backpatch-through: 9.6, where the bug was introduced.
* Prevent numeric overflows in parallel numeric aggregates.Dean Rasheed2021-07-05
| | | | | | | | | | | | | | | | | | | | | | | | | Formerly various numeric aggregate functions supported parallel aggregation by having each worker convert partial aggregate values to Numeric and use numeric_send() as part of serializing their state. That's problematic, since the range of Numeric is smaller than that of NumericVar, so it's possible for it to overflow (on either side of the decimal point) in cases that would succeed in non-parallel mode. Fix by serializing NumericVars instead, to avoid the overflow risk and ensure that parallel and non-parallel modes work the same. A side benefit is that this improves the efficiency of the serialization/deserialization code, which can make a noticeable difference to performance with large numbers of parallel workers. No back-patch due to risk from changing the binary format of the aggregate serialization states, as well as lack of prior field complaints and low probability of such overflows in practice. Patch by me. Thanks to David Rowley for review and performance testing, and Ranier Vilela for an additional suggestion. Discussion: https://postgr.es/m/CAEZATCUmeFWCrq2dNzZpRj5+6LfN85jYiDoqm+ucSXhb9U2TbA@mail.gmail.com
* Replace magic constants used in pg_stat_get_replication_slot().Amit Kapila2021-06-30
| | | | | | | | | | A few variables have been using 10 as a magic constant while PG_STAT_GET_REPLICATION_SLOT_COLS can be used instead. Author: Masahiko Sawada Reviewed-By: Amit Kapila Backpatch-through: 14, where it was introduced Discussion: https://postgr.es/m/CAD21AoBvqODDfmD17DkEuPCvV2KbruukXQ2Vwrv5Xi-TsAsTJA@mail.gmail.com
* Fixes for multirange selectivity estimationAlexander Korotkov2021-06-29
| | | | | | | | | | | * Fix enumeration of the multirange operators in calc_multirangesel() and calc_multirangesel() switches. * Add more regression tests for matching to empty ranges/multiranges. Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/c5269c65-f967-77c5-ff7c-15e621c47f6a%40gmail.com Author: Alexander Korotkov Backpatch-through: 14, where multiranges were introduced
* Message style improvementsPeter Eisentraut2021-06-28
|
* Error message refactoringPeter Eisentraut2021-06-27
| | | | | | Take some untranslatable things out of the message and replace by format placeholders, to reduce translatable strings and reduce translation mistakes.
* Revert 29854ee8d1 due to buildfarm failuresAlexander Korotkov2021-06-15
| | | | | Reported-by: Tom Lane Discussion: https://postgr.es/m/CAPpHfdvcnw3x7jdV3r52p4%3D5S4WUxBCzcQKB3JukQHoicv1LSQ%40mail.gmail.com
* Support for unnest(multirange) and cast multirange as an array of rangesAlexander Korotkov2021-06-15
| | | | | | | | | | | | | | | | | | | It has been spotted that multiranges lack of ability to decompose them into individual ranges. Subscription and proper expanded object representation require substantial work, and it's too late for v14. This commit provides the implementation of unnest(multirange) and cast multirange as an array of ranges, which is quite trivial. unnest(multirange) is defined as a polymorphic procedure. The catalog description of the cast underlying procedure is duplicated for each multirange type because we don't have anyrangearray polymorphic type to use here. Catversion is bumped. Reported-by: Jonathan S. Katz Discussion: https://postgr.es/m/flat/60258efe-bd7e-4886-82e1-196e0cac5433%40postgresql.org Author: Alexander Korotkov Reviewed-by: Justin Pryzby, Jonathan S. Katz, Zhihong Yu
* Ensure pg_filenode_relation(0, 0) returns NULL.Tom Lane2021-06-12
| | | | | | | | | | | | | | Previously, a zero value for the relfilenode resulted in a confusing error message about "unexpected duplicate". This function returns NULL for other invalid relfilenode values, so zero should be treated likewise. It's been like this all along, so back-patch to all supported branches. Justin Pryzby Discussion: https://postgr.es/m/20210612023324.GT16435@telsasoft.com
* Reconsider the handling of procedure OUT parameters.Tom Lane2021-06-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 2453ea142 redefined pg_proc.proargtypes to include the types of OUT parameters, for procedures only. While that had some advantages for implementing the SQL-spec behavior of DROP PROCEDURE, it was pretty disastrous from a number of other perspectives. Notably, since the primary key of pg_proc is name + proargtypes, this made it possible to have multiple procedures with identical names + input arguments and differing output argument types. That would make it impossible to call any one of the procedures by writing just NULL (or "?", or any other data-type-free notation) for the output argument(s). The change also seems likely to cause grave confusion for client applications that examine pg_proc and expect the traditional definition of proargtypes. Hence, revert the definition of proargtypes to what it was, and undo a number of complications that had been added to support that. To support the SQL-spec behavior of DROP PROCEDURE, when there are no argmode markers in the command's parameter list, we perform the lookup both ways (that is, matching against both proargtypes and proallargtypes), succeeding if we get just one unique match. In principle this could result in ambiguous-function failures that would not happen when using only one of the two rules. However, overloading of procedure names is thought to be a pretty rare usage, so this shouldn't cause many problems in practice. Postgres-specific code such as pg_dump can defend against any possibility of such failures by being careful to specify argmodes for all procedure arguments. This also fixes a few other bugs in the area of CALL statements with named parameters, and improves the documentation a little. catversion bump forced because the representation of procedures with OUT arguments changes. Discussion: https://postgr.es/m/3742981.1621533210@sss.pgh.pa.us
* Reorder superuser check in pg_log_backend_memory_contexts()Michael Paquier2021-06-08
| | | | | | | | | | | | | | | The use of this function is limited to superusers and the code includes a hardcoded check for that. However, the code would look for the PGPROC entry to signal for the memory dump before checking if the user is a superuser or not, which does not make sense if we know that an error will be returned. Note that the code would let one know if a process was a PostgreSQL process or not even for non-authorized users, which is not the case now, but this avoids taking ProcArrayLock that will most likely finish by being unnecessary. Thanks to Julien Rouhaud and Tom Lane for the discussion. Discussion: https://postgr.es/m/YLxw1uVGIAP5uMPl@paquier.xyz
* Adjust locations which have an incorrect copyright yearDavid Rowley2021-06-04
| | | | | | | A few patches committed after ca3b37487 mistakenly forgot to make the copyright year 2021. Fix these. Discussion: https://postgr.es/m/CAApHDvqyLmd9P2oBQYJ=DbrV8QwyPRdmXtCTFYPE08h+ip0UJw@mail.gmail.com
* Reject SELECT ... GROUP BY GROUPING SETS (()) FOR UPDATE.Tom Lane2021-06-01
| | | | | | | | | | | | | | | | | This case should be disallowed, just as FOR UPDATE with a plain GROUP BY is disallowed; FOR UPDATE only makes sense when each row of the query result can be identified with a single table row. However, we missed teaching CheckSelectLocking() to check groupingSets as well as groupClause, so that it would allow degenerate grouping sets. That resulted in a bad plan and a null-pointer dereference in the executor. Looking around for other instances of the same bug, the only one I found was in examine_simple_variable(). That'd just lead to silly estimates, but it should be fixed too. Per private report from Yaoguang Chen. Back-patch to all supported branches.
* Improve some error wording with multirange type parsingMichael Paquier2021-05-31
| | | | | | | | | | | | Braces were referred in some error messages as only brackets (not curly brackets or curly braces), which can be confusing as other types of brackets could be used. While on it, add one test to check after the case of junk characters detected after a right brace. Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/20210514.153153.1814935914483287479.horikyota.ntt@gmail.com
* Initial pgindent and pgperltidy run for v14.Tom Lane2021-05-12
| | | | | | | | Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.
* Refactor some error messages for easier translationPeter Eisentraut2021-05-12
|
* Prevent integer overflows in array subscripting calculations.Tom Lane2021-05-10
| | | | | | | | | | | | | | | | | | | | | | | | While we were (mostly) careful about ensuring that the dimensions of arrays aren't large enough to cause integer overflow, the lower bound values were generally not checked. This allows situations where lower_bound + dimension overflows an integer. It seems that that's harmless so far as array reading is concerned, except that array elements with subscripts notionally exceeding INT_MAX are inaccessible. However, it confuses various array-assignment logic, resulting in a potential for memory stomps. Fix by adding checks that array lower bounds aren't large enough to cause lower_bound + dimension to overflow. (Note: this results in disallowing cases where the last subscript position would be exactly INT_MAX. In principle we could probably allow that, but there's a lot of code that computes lower_bound + dimension and would need adjustment. It seems doubtful that it's worth the trouble/risk to allow it.) Somewhat independently of that, array_set_element() was careless about possible overflow when checking the subscript of a fixed-length array, creating a different route to memory stomps. Fix that too. Security: CVE-2021-32027
* Make pg_get_statisticsobjdef_expressions return NULLTomas Vondra2021-05-07
| | | | | | | | | The usual behavior for functions in ruleutils.c is to return NULL when the object does not exist. pg_get_statisticsobjdef_expressions raised an error instead, so correct that. Reported-by: Justin Pryzby Discussion: https://postgr.es/m/20210505210947.GA27406%40telsasoft.com
* Revert per-index collation version tracking feature.Thomas Munro2021-05-07
| | | | | | | | | | | | | | | | | | | | | | | Design problems were discovered in the handling of composite types and record types that would cause some relevant versions not to be recorded. Misgivings were also expressed about the use of the pg_depend catalog for this purpose. We're out of time for this release so we'll revert and try again. Commits reverted: 1bf946bd: Doc: Document known problem with Windows collation versions. cf002008: Remove no-longer-relevant test case. ef387bed: Fix bogus collation-version-recording logic. 0fb0a050: Hide internal error for pg_collation_actual_version(<bad OID>). ff942057: Suppress "warning: variable 'collcollate' set but not used". d50e3b1f: Fix assertion in collation version lookup. f24b1569: Rethink extraction of collation dependencies. 257836a7: Track collation versions for indexes. cd6f479e: Add pg_depend.refobjversion. 7d1297df: Remove pg_collation.collversion. Discussion: https://postgr.es/m/CA%2BhUKGLhj5t1fcjqAu8iD9B3ixJtsTNqyCCD4V0aTO9kAKAjjA%40mail.gmail.com
* Make websearch_to_tsquery() parse text in quotes as a single tokenAlexander Korotkov2021-05-03
| | | | | | | | | | | | | | | | | | | | | | | | websearch_to_tsquery() splits text in quotes into tokens and connects them with phrase operator on its own. However, that leads to surprising results when the token contains no words. For instance, websearch_to_tsquery('"aaa: bbb"') is 'aaa <2> bbb', because it is equivalent of to_tsquery(E'aaa <-> \':\' <-> bbb'). But websearch_to_tsquery('"aaa: bbb"') has to be 'aaa <-> bbb' in order to match to_tsvector('aaa: bbb'). Since 0c4f355c6a, we anyway connect lexemes of complex tokens with phrase operators. Thus, let's just websearch_to_tsquery() parse text in quotes as a single token. Therefore, websearch_to_tsquery() should process the quoted text in the same way phraseto_tsquery() does. This solution is what we exactly need and also simplifies the code. This commit is an incompatible change, so we don't backpatch it. Reported-by: Valentin Gatien-Baron Discussion: https://postgr.es/m/CA%2B0DEqiZs7gdOd4ikmg%3D0UWG%2BSwWOLxPsk_JW-sx9WNOyrb0KQ%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Tom Lane, Zhihong Yu
* Revert use singular for -1 (commits 9ee7d533da and 5da9868ed9Bruce Momjian2021-05-01
| | | | | | | | | | | | | | Turns out you can specify negative values using plurals: https://english.stackexchange.com/questions/9735/is-1-followed-by-a-singular-or-plural-noun so the previous code was correct enough, and consistent with other usage in our code. Also add comment in the two places where this could be confused. Reported-by: Noah Misch Diagnosed-by: 20210425115726.GA2353095@rfd.leadboat.com
* Use HTAB for replication slot statistics.Amit Kapila2021-04-27
| | | | | | | | | | | | | | | | | | | | | | | Previously, we used to use the array of size max_replication_slots to store stats for replication slots. But that had two problems in the cases where a message for dropping a slot gets lost: 1) the stats for the new slot are not recorded if the array is full and 2) writing beyond the end of the array if the user reduces the max_replication_slots. This commit uses HTAB for replication slot statistics, resolving both problems. Now, pgstat_vacuum_stat() search for all the dead replication slots in stats hashtable and tell the collector to remove them. To avoid showing the stats for the already-dropped slots, pg_stat_replication_slots view searches slot stats by the slot name taken from pg_replication_slots. Also, we send a message for creating a slot at slot creation, initializing the stats. This reduces the possibility that the stats are accumulated into the old slot stats when a message for dropping a slot gets lost. Reported-by: Andres Freund Author: Sawada Masahiko, test case by Vignesh C Reviewed-by: Amit Kapila, Vignesh C, Dilip Kumar Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de
* Mark multirange_constructor0() and multirange_constructor2() strictAlexander Korotkov2021-04-23
| | | | | | | | | | | | | | | | These functions shouldn't receive null arguments: multirange_constructor0() doesn't have any arguments while multirange_constructor2() has a single array argument, which is never null. But mark them strict anyway for the sake of uniformity. Also, make checks for null arguments use elog() instead of ereport() as these errors should normally be never thrown. And adjust corresponding comments. Catversion is bumped. Reported-by: Peter Eisentraut Discussion: https://postgr.es/m/0f783a96-8d67-9e71-996b-f34a7352eeef%40enterprisedb.com
* adjust query id feature to use pg_stat_activity.query_idBruce Momjian2021-04-20
| | | | | | | | | | | Previously, it was pg_stat_activity.queryid to match the pg_stat_statements queryid column. This is an adjustment to patch 4f0b0966c8. This also adjusts some of the internal function calls to match. Catversion bumped. Reported-by: Álvaro Herrera, Julien Rouhaud Discussion: https://postgr.es/m/20210408032704.GA7498@alvherre.pgsql
* Fix typos and grammar in comments and docsMichael Paquier2021-04-19
| | | | | Author: Justin Pryzby Discussion: https://postgr.es/m/20210416070310.GG3315@telsasoft.com
* Add information of total data processed to replication slot stats.Amit Kapila2021-04-16
| | | | | | | | | | | | This adds the statistics about total transactions count and total transaction data logically sent to the decoding output plugin from ReorderBuffer. Users can query the pg_stat_replication_slots view to check these stats. Suggested-by: Andres Freund Author: Vignesh C and Amit Kapila Reviewed-by: Sawada Masahiko, Amit Kapila Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de
* Use NameData datatype for slotname in stats.Amit Kapila2021-04-14
| | | | | | | | | | | This will make it consistent with the other usage of slotname in the code. In the passing, change pgstat_report_replslot signature to use a structure rather than multiple parameters. Reported-by: Andres Freund Author: Vignesh C Reviewed-by: Sawada Masahiko, Amit Kapila Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de