aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils/adt/arrayfuncs.c
Commit message (Collapse)AuthorAge
* Detect integer overflow while computing new array dimensions.Tom Lane2023-11-06
| | | | | | | | | | | | | | | | | | | | | | | | | | array_set_element() and related functions allow an array to be enlarged by assigning to subscripts outside the current array bounds. While these places were careful to check that the new bounds are allowable, they neglected to consider the risk of integer overflow in computing the new bounds. In edge cases, we could compute new bounds that are invalid but get past the subsequent checks, allowing bad things to happen. Memory stomps that are potentially exploitable for arbitrary code execution are possible, and so is disclosure of server memory. To fix, perform the hazardous computations using overflow-detecting arithmetic routines, which fortunately exist in all still-supported branches. The test cases added for this generate (after patching) errors that mention the value of MaxArraySize, which is platform-dependent. Rather than introduce multiple expected-files, use psql's VERBOSITY parameter to suppress the printing of the message text. v11 psql lacks that parameter, so omit the tests in that branch. Our thanks to Pedro Gallegos for reporting this problem. Security: CVE-2023-5869
* Revert "Add USER SET parameter values for pg_db_role_setting"Alexander Korotkov2023-05-17
| | | | | | | | | This reverts commit 096dd80f3ccc and its fixups beecbe8e5001, afdd9f7f0e00, 529da086ba, db93e739ac61. Catversion is bumped. Discussion: https://postgr.es/m/d46f9265-ff3c-6743-2278-6772598233c2%40pgmasters.net
* Fix oversights in array manipulation.Tom Lane2023-03-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nested-arrays code path in ExecEvalArrayExpr() used palloc to allocate the result array, whereas every other array-creating function has used palloc0 since 18c0b4ecc. This mostly works, but unused bits past the end of the nulls bitmap may end up undefined. That causes valgrind complaints with -DWRITE_READ_PARSE_PLAN_TREES, and could cause planner misbehavior as cited in 18c0b4ecc. There seems no very good reason why we should strive to avoid palloc0 in just this one case, so fix it the easy way with s/palloc/palloc0/. While looking at that I noted that we also failed to check for overflow of "nbytes" and "nitems" while summing the sizes of the sub-arrays, potentially allowing a crash due to undersized output allocation. For "nbytes", follow the policy used by other array-munging code of checking for overflow after each addition. (As elsewhere, the last addition of the array's overhead space doesn't need an extra check, since palloc itself will catch a value between 1Gb and 2Gb.) For "nitems", there's no very good reason to sum the inputs at all, since we can perfectly well use ArrayGetNItems' result instead of ignoring it. Per discussion of this bug, also remove redundant zeroing of the nulls bitmap in array_set_element and array_set_slice. Patch by Alexander Lakhin and myself, per bug #17858 from Alexander Lakhin; thanks also to Richard Guo. These bugs are a dozen years old, so back-patch to all supported branches. Discussion: https://postgr.es/m/17858-8fd287fd3663d051@postgresql.org
* Allow parallel aggregate on string_agg and array_aggDavid Rowley2023-01-23
| | | | | | | | | | | | | | This adds combine, serial and deserial functions for the array_agg() and string_agg() aggregate functions, thus allowing these aggregates to partake in partial aggregations. This allows both parallel aggregation to take place when these aggregates are present and also allows additional partition-wise aggregation plan shapes to include plans that require additional aggregation once the partially aggregated results from the partitions have been combined. Author: David Rowley Reviewed-by: Andres Freund, Tomas Vondra, Stephen Frost, Tom Lane Discussion: https://postgr.es/m/CAKJS1f9sx_6GTcvd6TMuZnNtCh0VhBzhX6FZqw17TgVFH-ga_A@mail.gmail.com
* Update copyright for 2023Bruce Momjian2023-01-02
| | | | Backpatch-through: 11
* Convert a few datatype input functions to use "soft" error reporting.Tom Lane2022-12-09
| | | | | | | | | | | | | | | This patch converts the input functions for bool, int2, int4, int8, float4, float8, numeric, and contrib/cube to the new soft-error style. array_in and record_in are also converted. There's lots more to do, but this is enough to provide proof-of-concept that the soft-error API is usable, as well as reference examples for how to convert input functions. This patch is mostly by me, but it owes very substantial debt to earlier work by Nikita Glukhov, Andrew Dunstan, and Amul Sul. Thanks to Andres Freund for review. Discussion: https://postgr.es/m/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru
* Add USER SET parameter values for pg_db_role_settingAlexander Korotkov2022-12-09
| | | | | | | | | | | | | | | | The USER SET flag specifies that the variable should be set on behalf of an ordinary role. That lets ordinary roles set placeholder variables, which permission requirements are not known yet. Such a value wouldn't be used if the variable finally appear to require superuser privileges. The new flags are stored in the pg_db_role_setting.setuser array. Catversion is bumped. This commit is inspired by the previous work by Steve Chavez. Discussion: https://postgr.es/m/CAPpHfdsLd6E--epnGqXENqLP6dLwuNZrPMcNYb3wJ87WR7UBOQ%40mail.gmail.com Author: Alexander Korotkov, Steve Chavez Reviewed-by: Pavel Borisov, Steve Chavez
* Fix trim_array() for zero-dimensional array argument.Tom Lane2022-07-31
| | | | | | | | | | | | | | The code tried to access ARR_DIMS(v)[0] and ARR_LBOUND(v)[0] whether or not those values exist. This made the range check on the "n" argument unstable --- it might or might not fail, and if it did it would report garbage for the allowed upper limit. These bogus accesses would probably annoy Valgrind, and if you were very unlucky even lead to SIGSEGV. Report and fix by Martin Kalcher. Back-patch to v14 where this function was added. Discussion: https://postgr.es/m/baaeb413-b8a8-4656-5757-ef347e5ec11f@aboutsource.net
* Replace many MemSet calls with struct initializationPeter Eisentraut2022-07-16
| | | | | | | | | | | | | | This replaces all MemSet() calls with struct initialization where that is easily and obviously possible. (For example, some cases have to worry about padding bits, so I left those.) (The same could be done with appropriate memset() calls, but this patch is part of an effort to phase out MemSet(), so it doesn't touch memset() calls.) Reviewed-by: Ranier Vilela <ranier.vf@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://www.postgresql.org/message-id/9847b13c-b785-f4e2-75c3-12ec77a3b05c@enterprisedb.com
* Add construct_array_builtin, deconstruct_array_builtinPeter Eisentraut2022-07-01
| | | | | | | | | | | | | | | There were many calls to construct_array() and deconstruct_array() for built-in types, for example, when dealing with system catalog columns. These all hardcoded the type attributes necessary to pass to these functions. To simplify this a bit, add construct_array_builtin(), deconstruct_array_builtin() as wrappers that centralize this hardcoded knowledge. This simplifies many call sites and reduces the amount of hardcoded stuff that is spread around. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/2914356f-9e5f-8c59-2995-5997fc48bcba%40enterprisedb.com
* Pre-beta mechanical code beautification.Tom Lane2022-05-12
| | | | | Run pgindent, pgperltidy, and reformat-dat-files. I manually fixed a couple of comments that pgindent uglified.
* Update copyright for 2022Bruce Momjian2022-01-07
| | | | Backpatch-through: 10
* Fix hash_arrayPeter Eisentraut2021-09-15
| | | | | | | | | | | Commit a3d2b1bbe904b0ca8d9fdde20f25295ff3e21f79 neglected to initialize the type_id field of the synthesized type cache entry, so it would make a new one on every call. Also, better use the per-function memory context for this; otherwise it leaks memory. Discussion: https://www.postgresql.org/message-id/flat/17158-8a2ba823982537a4%40postgresql.org
* Disable anonymous record hash support except in special casesPeter Eisentraut2021-09-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 01e658fa74 added hash support for row types. This also added support for hashing anonymous record types, using the same approach that the type cache uses for comparison support for record types: It just reports that it works, but it might fail at run time if a component type doesn't actually support the operation. We get away with that for comparison because most types support that. But some types don't support hashing, so the current state can result in failures at run time where the planner chooses hashing over sorting, whereas that previously worked if only sorting was an option. We do, however, want the record hashing support for path tracking in recursive unions, and the SEARCH and CYCLE clauses built on that. In that case, hashing is the only plan option. So enable that, this commit implements the following approach: The type cache does not report that hashing is available for the record type. This undoes that part of 01e658fa74. Instead, callers that require hashing no matter what can override that result themselves. This patch only touches the callers to make the aforementioned recursive query cases work, namely the parse analysis of unions, as well as the hash_array() function. Reported-by: Sait Talha Nisanci <sait.nisanci@microsoft.com> Bug: #17158 Discussion: https://www.postgresql.org/message-id/flat/17158-8a2ba823982537a4%40postgresql.org
* Prevent integer overflows in array subscripting calculations.Tom Lane2021-05-10
| | | | | | | | | | | | | | | | | | | | | | | | While we were (mostly) careful about ensuring that the dimensions of arrays aren't large enough to cause integer overflow, the lower bound values were generally not checked. This allows situations where lower_bound + dimension overflows an integer. It seems that that's harmless so far as array reading is concerned, except that array elements with subscripts notionally exceeding INT_MAX are inaccessible. However, it confuses various array-assignment logic, resulting in a potential for memory stomps. Fix by adding checks that array lower bounds aren't large enough to cause lower_bound + dimension to overflow. (Note: this results in disallowing cases where the last subscript position would be exactly INT_MAX. In principle we could probably allow that, but there's a lot of code that computes lower_bound + dimension and would need adjustment. It seems doubtful that it's worth the trouble/risk to allow it.) Somewhat independently of that, array_set_element() was careless about possible overflow when checking the subscript of a fixed-length array, creating a different route to memory stomps. Fix that too. Security: CVE-2021-32027
* Add trim_array() function.Tom Lane2021-03-03
| | | | | | | | | | This has been in the SQL spec since 2008. It's a pretty thin wrapper around the array slice functionality, but the spec says we should have it, so here it is. Vik Fearing, reviewed by Dian Fay Discussion: https://postgr.es/m/fc92ce17-9655-8ff1-c62a-4dc4c8ccd815@postgresfriends.org
* Rethink the "read/write parameter" mechanism in pl/pgsql.Tom Lane2021-01-04
| | | | | | | | | | | | | | | | | | | | Performance issues with the preceding patch to re-implement array element assignment within pl/pgsql led me to realize that the read/write parameter mechanism is misdesigned. Instead of requiring the assignment source expression to be such that *all* its references to the target variable could be passed as R/W, we really want to identify *one* reference to the target variable to be passed as R/W, allowing any other ones to be passed read/only as they would be by default. As long as the R/W reference is a direct argument to the top-level (hence last to be executed) function in the expression, there is no harm in R/O references being passed to other lower parts of the expression. Nor is there any use-case for more than one argument of the top-level function being R/W. Hence, rewrite that logic to identify one single Param that references the target variable, and make only that Param pass a read/write reference, not any other Params referencing the target variable. Discussion: https://postgr.es/m/4165684.1607707277@sss.pgh.pa.us
* Update copyright for 2021Bruce Momjian2021-01-02
| | | | Backpatch-through: 9.5
* Support subscripting of arbitrary types, not only arrays.Tom Lane2020-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch generalizes the subscripting infrastructure so that any data type can be subscripted, if it provides a handler function to define what that means. Traditional variable-length (varlena) arrays all use array_subscript_handler(), while the existing fixed-length types that support subscripting use raw_array_subscript_handler(). It's expected that other types that want to use subscripting notation will define their own handlers. (This patch provides no such new features, though; it only lays the foundation for them.) To do this, move the parser's semantic processing of subscripts (including coercion to whatever data type is required) into a method callback supplied by the handler. On the execution side, replace the ExecEvalSubscriptingRef* layer of functions with direct calls to callback-supplied execution routines. (Thus, essentially no new run-time overhead should be caused by this patch. Indeed, there is room to remove some overhead by supplying specialized execution routines. This patch does a little bit in that line, but more could be done.) Additional work is required here and there to remove formerly hard-wired assumptions about the result type, collation, etc of a SubscriptingRef expression node; and to remove assumptions that the subscript values must be integers. One useful side-effect of this is that we now have a less squishy mechanism for identifying whether a data type is a "true" array: instead of wiring in weird rules about typlen, we can look to see if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER. For this to be bulletproof, we have to forbid user-defined types from using that handler directly; but there seems no good reason for them to do so. This patch also removes assumptions that the number of subscripts is limited to MAXDIM (6), or indeed has any hard-wired limit. That limit still applies to types handled by array_subscript_handler or raw_array_subscript_handler, but to discourage other dependencies on this constant, I've moved it from c.h to utils/array.h. Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov, Peter Eisentraut, Pavel Stehule Discussion: https://postgr.es/m/CA+q6zcVDuGBv=M0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w@mail.gmail.com Discussion: https://postgr.es/m/CA+q6zcVovR+XY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA@mail.gmail.com
* Enable hash partitioning of text arraysPeter Eisentraut2020-11-04
| | | | | | | | | | | | | | | | hash_array_extended() needs to pass PG_GET_COLLATION() to the hash function of the element type. Otherwise, the hash function of a collation-aware data type such as text will error out, since the introduction of nondeterministic collation made hash functions require a collation, too. The consequence of this is that before this change, hash partitioning using an array over text in the partition key would not work. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/32c1fdae-95c6-5dc6-058a-a90330a3b621%40enterprisedb.com
* Weaken type-OID-matching checks in array_recv and record_recv.Tom Lane2020-07-21
| | | | | | | | | | | | | | | | | | | | | Rather than always insisting on an exact match of the type OID in the data to the element type or column type we expect, complain only when both OIDs fall within the manually-assigned range. This acknowledges the reality that user-defined types don't have stable OIDs, while still preserving some of the mistake-detection value of the old test. (It's not entirely clear whether to error if one OID is manually assigned and the other isn't. But perhaps that case could arise in cross-version cases where a former extension type has been imported into core, so I let it pass.) This change allows us to remove the prohibition on binary transfer of user-defined arrays and composites in the recently-landed support for binary logical replication (commit 9de77b545). We can just unconditionally drop that check, since if the client has asked for binary transfer it must be >= v14 and must have this change. Discussion: https://postgr.es/m/CADK3HH+R3xMn=8t3Ct+uD+qJ1KD=Hbif5NFMJ+d5DkoCzp6Vgw@mail.gmail.com
* Fix minor violations of FunctionCallInvoke usage protocol.Tom Lane2020-04-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Working on commit 1c455078b led me to check through FunctionCallInvoke call sites to see if every one was being honest about (a) making sure that fcinfo.isnull is initially false, and (b) checking its state after the call. Sure enough, I found some violations. The main one is that finalize_partialaggregate re-used serialfn_fcinfo without resetting isnull, even though it clearly intends to cater for serialfns that return NULL. There would only be an issue with a non-strict serialfn, since it's unlikely that a serialfn would return NULL for non-null input. We have no non-strict serialfns in core, and there may be none in the wild either, which would account for the lack of complaints. Still, it's clearly wrong, so back-patch that fix to 9.6 where finalize_partialaggregate was introduced. Also, arrayfuncs.c and rowtypes.c contained various callers that were not bothering to check for result nulls. While what's being called is a comparison or hash function that probably *shouldn't* return null, that's a lousy excuse for not having any check at all. There are existing places that just Assert(!fcinfo->isnull) in comparable situations, so I added that to the places that were calling btree comparison or hash support functions. In the places calling boolean-returning equality functions, it's quite cheap to have them treat isnull as FALSE, so make those places do that. Also remove some "locfcinfo->isnull = false" assignments that are unnecessary given the assumption that no previous call returned null. These changes seem like mostly neatnik-ism or debugging support, so I didn't back-patch.
* Modify additional power 2 calculations to use new helper functionsDavid Rowley2020-04-08
| | | | | | | | | | | | 2nd pass of modifying various places which obtain the next power of 2 of a number and make them use the new functions added in f0705bb62. In passing, also modify num_combinations(). This can be implemented using simple bitshifting rather than looping. Reviewed-by: John Naylor Discussion: https://postgr.es/m/20200114173553.GE32763%40fetter.org
* Update copyrights for 2020Bruce Momjian2020-01-01
| | | | Backpatch-through: update all files in master, backpatch legal files through 9.4
* Fix inconsistencies and typos in the tree, take 11Michael Paquier2019-08-19
| | | | | | | | This fixes various typos in docs and comments, and removes some orphaned definitions. Author: Alexander Lakhin Discussion: https://postgr.es/m/5da8e325-c665-da95-21e0-c8a99ea61fbf@gmail.com
* Fix inconsistencies and typos in the treeMichael Paquier2019-07-29
| | | | | | | | This is numbered take 8, and addresses again a set of issues with code comments, variable names and unreferenced variables. Author: Alexander Lakhin Discussion: https://postgr.es/m/b137b5eb-9c95-9c2f-586e-38aba7d59788@gmail.com
* Fix many typos and inconsistenciesMichael Paquier2019-07-01
| | | | | Author: Alexander Lakhin Discussion: https://postgr.es/m/af27d1b3-a128-9d62-46e0-88f424397f44@gmail.com
* Don't read fields of a misaligned ExpandedObjectHeader or AnyArrayType.Noah Misch2019-06-30
| | | | | | | | | | | UBSan complains about this. Instead, cast to a suitable type requiring only 4-byte alignment. DatumGetAnyArrayP() already assumes one can cast between AnyArrayType and ArrayType, so this doesn't introduce a new assumption. Back-patch to 9.5, where AnyArrayType was introduced. Reviewed by Tom Lane. Discussion: https://postgr.es/m/20190629210334.GA1244217@rfd.leadboat.com
* Phase 2 pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | | Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Collations with nondeterministic comparisonPeter Eisentraut2019-03-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a flag "deterministic" to collations. If that is false, such a collation disables various optimizations that assume that strings are equal only if they are byte-wise equal. That then allows use cases such as case-insensitive or accent-insensitive comparisons or handling of strings with different Unicode normal forms. This functionality is only supported with the ICU provider. At least glibc doesn't appear to have any locales that work in a nondeterministic way, so it's not worth supporting this for the libc provider. The term "deterministic comparison" in this context is from Unicode Technical Standard #10 (https://unicode.org/reports/tr10/#Deterministic_Comparison). This patch makes changes in three areas: - CREATE COLLATION DDL changes and system catalog changes to support this new flag. - Many executor nodes and auxiliary code are extended to track collations. Previously, this code would just throw away collation information, because the eventually-called user-defined functions didn't use it since they only cared about equality, which didn't need collation information. - String data type functions that do equality comparisons and hashing are changed to take the (non-)deterministic flag into account. For comparison, this just means skipping various shortcuts and tie breakers that use byte-wise comparison. For hashing, we first need to convert the input string to a canonical "sort key" using the ICU analogue of strxfrm(). Reviewed-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/flat/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com
* Move hash_any prototype from access/hash.h to utils/hashutils.hAlvaro Herrera2019-03-11
| | | | | | | | | | | | | | | | | | | | | ... as well as its implementation from backend/access/hash/hashfunc.c to backend/utils/hash/hashfn.c. access/hash is the place for the hash index AM, not really appropriate for generic facilities, which is what hash_any is; having things the old way meant that anything using hash_any had to include the AM's include file, pointlessly polluting its namespace with unrelated, unnecessary cruft. Also move the HTEqual strategy number to access/stratnum.h from access/hash.h. To avoid breaking third-party extension code, add an #include "utils/hashutils.h" to access/hash.h. (An easily removed line by committers who enjoy their asbestos suits to protect them from angry extension authors.) Discussion: https://postgr.es/m/201901251935.ser5e4h6djt2@alvherre.pgsql
* Build out the planner support function infrastructure.Tom Lane2019-02-09
| | | | | | | | | | | | | | | | | | | | | | | | Add support function requests for estimating the selectivity, cost, and number of result rows (if a SRF) of the target function. The lack of a way to estimate selectivity of a boolean-returning function in WHERE has been a recognized deficiency of the planner since Berkeley days. This commit finally fixes it. In addition, non-constant estimates of cost and number of output rows are now possible. We still fall back to looking at procost and prorows if the support function doesn't service the request, of course. To make concrete use of the possibility of estimating output rowcount for SRFs, this commit adds support functions for array_unnest(anyarray) and the integer variants of generate_series; the lack of plausible rowcount estimates for those, even when it's obvious to a human, has been a repeated subject of complaints. Obviously, much more could now be done in this line, but I'm mostly just trying to get the infrastructure in place. Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
* Change function call information to be variable length.Andres Freund2019-01-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this change FunctionCallInfoData, the struct arguments etc for V1 function calls are stored in, always had space for FUNC_MAX_ARGS/100 arguments, storing datums and their nullness in two arrays. For nearly every function call 100 arguments is far more than needed, therefore wasting memory. Arg and argnull being two separate arrays also guarantees that to access a single argument, two cachelines have to be touched. Change the layout so there's a single variable-length array with pairs of value / isnull. That drastically reduces memory consumption for most function calls (on x86-64 a two argument function now uses 64bytes, previously 936 bytes), and makes it very likely that argument value and its nullness are on the same cacheline. Arguments are stored in a new NullableDatum struct, which, due to padding, needs more memory per argument than before. But as usually far fewer arguments are stored, and individual arguments are cheaper to access, that's still a clear win. It's likely that there's other places where conversion to NullableDatum arrays would make sense, e.g. TupleTableSlots, but that's for another commit. Because the function call information is now variable-length allocations have to take the number of arguments into account. For heap allocations that can be done with SizeForFunctionCallInfoData(), for on-stack allocations there's a new LOCAL_FCINFO(name, nargs) macro that helps to allocate an appropriately sized and aligned variable. Some places with stack allocation function call information don't know the number of arguments at compile time, and currently variably sized stack allocations aren't allowed in postgres. Therefore allow for FUNC_MAX_ARGS space in these cases. They're not that common, so for now that seems acceptable. Because of the need to allocate FunctionCallInfo of the appropriate size, older extensions may need to update their code. To avoid subtle breakages, the FunctionCallInfoData struct has been renamed to FunctionCallInfoBaseData. Most code only references FunctionCallInfo, so that shouldn't cause much collateral damage. This change is also a prerequisite for more efficient expression JIT compilation (by allocating the function call information on the stack, allowing LLVM to optimize it away); previously the size of the call information caused problems inside LLVM's optimizer. Author: Andres Freund Reviewed-By: Tom Lane Discussion: https://postgr.es/m/20180605172952.x34m5uz6ju6enaem@alap3.anarazel.de
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Fix over-allocation of space for array_out()'s result string.Tom Lane2018-09-24
| | | | | | | | | | | | | | | | | | | | | | | | array_out overestimated the space needed for its output, possibly by a very substantial amount if the array is multi-dimensional, because of wrong order of operations in the loop that counts the number of curly-brace pairs needed. While the output string is normally short-lived, this could still cause problems in extreme cases. An additional minor error was that it counted one more delimiter than is actually needed. Repair those errors, add an Assert that the space is now correctly calculated, and make some minor improvements in the comments. I also failed to resist the temptation to get rid of an integer modulus operation per array element; a simple comparison is sufficient. This bug dates clear back to Berkeley days, so back-patch to all supported versions. Keiichi Hirobe, minor additional work by me Discussion: https://postgr.es/m/CAH=EFxE9W0tRvQkixR2XJRRCToUYUEDkJZk6tnADXugPBRdcdg@mail.gmail.com
* Rethink how to get float.h in old Windows API for isnan/isinfAlvaro Herrera2018-07-11
| | | | | | | | | | | | | | | | We include <float.h> in every place that needs isnan(), because MSVC used to require it. However, since MSVC 2013 that's no longer necessary (cf. commit cec8394b5ccd), so we can retire the inclusion to a version-specific stanza in win32_port.h, where it doesn't need to pollute random .c files. The header is of course still needed in a few places for other reasons. I (Álvaro) removed float.h from a few more files than in Emre's original patch. This doesn't break the build in my system, but we'll see what the buildfarm has to say about it all. Author: Emre Hasegeli Discussion: https://postgr.es/m/CAE2gYzyc0+5uG+Cd9-BSL7NKC8LSHLNg1Aq2=8ubjnUwut4_iw@mail.gmail.com
* Update copyright for 2018Bruce Momjian2018-01-02
| | | | Backpatch-through: certain files through 9.3
* Change TRUE/FALSE to true/falsePeter Eisentraut2017-11-08
| | | | | | | | | | | | | | The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Replace remaining uses of pq_sendint with pq_sendint{8,16,32}.Andres Freund2017-10-11
| | | | | | | pq_sendint() remains, so extension code doesn't unnecessarily break. Author: Andres Freund Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de
* Support arrays over domains.Tom Lane2017-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allowing arrays with a domain type as their element type was left un-done in the original domain patch, but not for any very good reason. This omission leads to such surprising results as array_agg() not working on a domain column, because the parser can't identify a suitable output type for the polymorphic aggregate. In order to fix this, first clean up the APIs of coerce_to_domain() and some internal functions in parse_coerce.c so that we consistently pass around a CoercionContext along with CoercionForm. Previously, we sometimes passed an "isExplicit" boolean flag instead, which is strictly less information; and coerce_to_domain() didn't even get that, but instead had to reverse-engineer isExplicit from CoercionForm. That's contrary to the documentation in primnodes.h that says that CoercionForm only affects display and not semantics. I don't think this change fixes any live bugs, but it makes things more consistent. The main reason for doing it though is that now build_coercion_expression() receives ccontext, which it needs in order to be able to recursively invoke coerce_to_target_type(). Next, reimplement ArrayCoerceExpr so that the node does not directly know any details of what has to be done to the individual array elements while performing the array coercion. Instead, the per-element processing is represented by a sub-expression whose input is a source array element and whose output is a target array element. This simplifies life in parse_coerce.c, because it can build that sub-expression by a recursive invocation of coerce_to_target_type(). The executor now handles the per-element processing as a compiled expression instead of hard-wired code. The main advantage of this is that we can use a single ArrayCoerceExpr to handle as many as three successive steps per element: base type conversion, typmod coercion, and domain constraint checking. The old code used two stacked ArrayCoerceExprs to handle type + typmod coercion, which was pretty inefficient, and adding yet another array deconstruction to do domain constraint checking seemed very unappetizing. In the case where we just need a single, very simple coercion function, doing this straightforwardly leads to a noticeable increase in the per-array-element runtime cost. Hence, add an additional shortcut evalfunc in execExprInterp.c that skips unnecessary overhead for that specific form of expression. The runtime speed of simple cases is within 1% or so of where it was before, while cases that previously required two levels of array processing are significantly faster. Finally, create an implicit array type for every domain type, as we do for base types, enums, etc. Everything except the array-coercion case seems to just work without further effort. Tom Lane, reviewed by Andrew Dunstan Discussion: https://postgr.es/m/9852.1499791473@sss.pgh.pa.us
* Make construct_[md_]array return a valid empty array for zero-size input.Tom Lane2017-09-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If construct_array() or construct_md_array() were given a dimension of zero, they'd produce an array that contains no elements but has positive dimension. This violates a general expectation that empty arrays should have ndims = 0; in particular, while arrays like this print as empty, they don't compare equal to other empty arrays. Up to now we've expected callers to avoid making such calls and instead be careful to call construct_empty_array() if there would be no elements. But this has always been an easily missed case, and we've repeatedly had to fix callers to do it right. In bug #14826, Erwin Brandstetter pointed out yet another such oversight, in ts_lexize(); and a bit of examination of other call sites found at least two more with similar issues. So let's fix the problem centrally and permanently by changing these two functions to construct a proper zero-D empty array whenever the array would be empty. This renders a few explicit calls of construct_empty_array() redundant, but the only such place I found that really seemed worth changing was in ExecEvalArrayExpr(). Although this fixes some very old bugs, no back-patch: the problem is pretty minor and the risk of changing behavior seems to outweigh the benefit in stable branches. Discussion: https://postgr.es/m/20170923125723.1448.39412@wrigleys.postgresql.org Discussion: https://postgr.es/m/20570.1506198383@sss.pgh.pa.us
* Make DatumGetFoo/PG_GETARG_FOO/PG_RETURN_FOO macro names more consistent.Tom Lane2017-09-18
| | | | | | | | | | | | | | | | | | | | | By project convention, these names should include "P" when dealing with a pointer type; that is, if the result of a GETARG macro is of type FOO *, it should be called PG_GETARG_FOO_P not just PG_GETARG_FOO. Some newer types such as JSONB and ranges had not followed the convention, and a number of contrib modules hadn't gotten that memo either. Rename the offending macros to improve consistency. In passing, fix a few places that thought PG_DETOAST_DATUM() returns a Datum; it does not, it returns "struct varlena *". Applying DatumGetPointer to that happens not to cause any bad effects today, but it's formally wrong. Also, adjust an ltree macro that was designed without any thought for what pgindent would do with it. This is all cosmetic and shouldn't have any impact on generated code. Mark Dilger, some further tweaks by me Discussion: https://postgr.es/m/EA5676F4-766F-4F38-8348-ECC7DB427C6A@gmail.com
* Fix assorted carelessness about Datum vs. int64 vs. uint64Robert Haas2017-09-01
| | | | Bugs introduced by commit 81c5e46c490e2426db243eada186995da5bb0ba7
* Introduce 64-bit hash functions with a 64-bit seed.Robert Haas2017-08-31
| | | | | | | | | | | | | | | | This will be useful for hash partitioning, which needs a way to seed the hash functions to avoid problems such as a hash index on a hash partitioned table clumping all values into a small portion of the bucket space; it's also useful for anything that wants a 64-bit hash value rather than a 32-bit hash value. Just in case somebody wants a 64-bit hash value that is compatible with the existing 32-bit hash values, make the low 32-bits of the 64-bit hash value match the 32-bit hash value when the seed is 0. Robert Haas and Amul Sul Discussion: http://postgr.es/m/CA+Tgmoafx2yoJuhCQQOL5CocEi-w_uG4S2xT0EtgiJnPGcHW3g@mail.gmail.com
* Phase 3 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Phase 2 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Initial pgindent run with pg_bsd_indent version 2.0.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are: * Nicer formatting of function-pointer declarations. * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof. * No longer wants to add a space in "struct structname *varname", as well as some similar cases for const- or volatile-qualified pointers. * Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely. * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code. * Fixes some odd decisions for comments following case labels. * Fixes some cases where comments following code were indented to less than the expected column 33. On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself. There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Fix handling of empty arrays in array_fill().Tom Lane2017-01-05
| | | | | | | | | | | | | | | | | array_fill(..., array[0]) produced an empty array, which is probably what users expect, but it was a one-dimensional zero-length array which is not our standard representation of empty arrays. Also, for no very good reason, it rejected empty input arrays; that case should be allowed and produce an empty output array. In passing, remove the restriction that the input array(s) have lower bound 1. That seems rather pointless, and it would have needed extra complexity to make the check deal with empty input arrays. Per bug #14487 from Andrew Gierth. It's been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/20170105152156.10135.64195@wrigleys.postgresql.org
* Update copyright via script for 2017Bruce Momjian2017-01-03
|
* Add macros to make AllocSetContextCreate() calls simpler and safer.Tom Lane2016-08-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls had typos in the context-sizing parameters. While none of these led to especially significant problems, they did create minor inefficiencies, and it's now clear that expecting people to copy-and-paste those calls accurately is not a great idea. Let's reduce the risk of future errors by introducing single macros that encapsulate the common use-cases. Three such macros are enough to cover all but two special-purpose contexts; those two calls can be left as-is, I think. While this patch doesn't in itself improve matters for third-party extensions, it doesn't break anything for them either, and they can gradually adopt the simplified notation over time. In passing, change TopMemoryContext to use the default allocation parameters. Formerly it could only be extended 8K at a time. That was probably reasonable when this code was written; but nowadays we create many more contexts than we did then, so that it's not unusual to have a couple hundred K in TopMemoryContext, even without considering various dubious code that sticks other things there. There seems no good reason not to let it use growing blocks like most other contexts. Back-patch to 9.6, mostly because that's still close enough to HEAD that it's easy to do so, and keeping the branches in sync can be expected to avoid some future back-patching pain. The bugs fixed by these changes don't seem to be significant enough to justify fixing them further back. Discussion: <21072.1472321324@sss.pgh.pa.us>