aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils/adt
Commit message (Collapse)AuthorAge
...
* Revert 0f5ca02f53Alexander Korotkov2020-04-08
| | | | | | | | 0f5ca02f53 introduces 3 new keywords. It appears to be too much for relatively small feature. Given now we past feature freeze, it's already late for discussion of the new syntax. So, revert. Discussion: https://postgr.es/m/28209.1586294824%40sss.pgh.pa.us
* Modify additional power 2 calculations to use new helper functionsDavid Rowley2020-04-08
| | | | | | | | | | | | 2nd pass of modifying various places which obtain the next power of 2 of a number and make them use the new functions added in f0705bb62. In passing, also modify num_combinations(). This can be implemented using simple bitshifting rather than looping. Reviewed-by: John Naylor Discussion: https://postgr.es/m/20200114173553.GE32763%40fetter.org
* Fix circle_in to accept "(x,y),r" as it's advertised to do.Tom Lane2020-04-07
| | | | | | | | | | | Our documentation describes four allowed input syntaxes for circles, but the regression tests tried only three ... with predictable consequences. Remarkably, this has been wrong since the circle datatype was added in 1997, but nobody noticed till now. David Zhang, with some help from me Discussion: https://postgr.es/m/332c47fa-d951-7574-b5cc-a8f7f7201202@highgo.ca
* Implement waiting for given lsn at transaction startAlexander Korotkov2020-04-07
| | | | | | | | | | | | | | | | | | | | This commit adds following optional clause to BEGIN and START TRANSACTION commands. WAIT FOR LSN lsn [ TIMEOUT timeout ] New clause pospones transaction start till given lsn is applied on standby. This clause allows user be sure, that changes previously made on primary would be visible on standby. New shared memory struct is used to track awaited lsn per backend. Recovery process wakes up backend once required lsn is applied. Author: Ivan Kartyshov, Anna Akenteva Reviewed-by: Craig Ringer, Thomas Munro, Robert Haas, Kyotaro Horiguchi Reviewed-by: Masahiko Sawada, Ants Aasma, Dmitry Ivanov, Simon Riggs Reviewed-by: Amit Kapila, Alexander Korotkov Discussion: https://postgr.es/m/0240c26c-9f84-30ea-fca9-93ab2df5f305%40postgrespro.ru
* Support FETCH FIRST WITH TIESAlvaro Herrera2020-04-07
| | | | | | | | | | | | | | | | | | WITH TIES is an option to the FETCH FIRST N ROWS clause (the SQL standard's spelling of LIMIT), where you additionally get rows that compare equal to the last of those N rows by the columns in the mandatory ORDER BY clause. There was a proposal by Andrew Gierth to implement this functionality in a more powerful way that would yield more features, but the other patch had not been finished at this time, so we decided to use this one for now in the spirit of incremental development. Author: Surafel Temesgen <surafel3000@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Discussion: https://postgr.es/m/CALAY4q9ky7rD_A4vf=FVQvCGngm3LOes-ky0J6euMrg=_Se+ag@mail.gmail.com Discussion: https://postgr.es/m/87o8wvz253.fsf@news-spur.riddles.org.uk
* Adjust bytea get_bit/set_bit to use int8 not int4 for bit numbering.Tom Lane2020-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | Since the existing bit number argument can't exceed INT32_MAX, it's not possible for these functions to manipulate bits beyond the first 256MB of a bytea value. Lift that restriction by redeclaring the bit number arguments as int8 (which requires a catversion bump, hence is not back-patchable). The similarly-named functions for bit/varbit don't really have a problem because we restrict those types to at most VARBITMAXLEN bits; hence leave them alone. While here, extend the encode/decode functions in utils/adt/encode.c to allow dealing with values wider than 1GB. This is not a live bug or restriction in current usage, because no input could be more than 1GB, and since none of the encoders can expand a string more than 4X, the result size couldn't overflow uint32. But it might be desirable to support more in future, so make the input length values size_t and the potential-output-length values uint64. Also add some test cases to improve the miserable code coverage of these functions. Movead Li, editorialized some by me; also reviewed by Ashutosh Bapat Discussion: https://postgr.es/m/20200312115135445367128@highgo.ca
* Introduce xid8-based functions to replace txid_XXX.Thomas Munro2020-04-07
| | | | | | | | | | | | | | | | | | | | | The txid_XXX family of fmgr functions exposes 64 bit transaction IDs to users as int8. Now that we have an SQL type xid8 for FullTransactionId, define a new set of functions including pg_current_xact_id() and pg_current_snapshot() based on that. Keep the old functions around too, for now. It's a bit sneaky to use the same C functions for both, but since the binary representation is identical except for the signedness of the type, and since older functions are the ones using the wrong signedness, and since we'll presumably drop the older ones after a reasonable period of time, it seems reasonable to switch to FullTransactionId internally and share the code for both. Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Takao Fujii <btfujiitkp@oss.nttdata.com> Reviewed-by: Yoshikazu Imai <imai.yoshikazu@fujitsu.com> Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com> Discussion: https://postgr.es/m/20190725000636.666m5mad25wfbrri%40alap3.anarazel.de
* Add SQL type xid8 to expose FullTransactionId to users.Thomas Munro2020-04-07
| | | | | | | | | | | Similar to xid, but 64 bits wide. This new type is suitable for use in various system views and administration functions. Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Takao Fujii <btfujiitkp@oss.nttdata.com> Reviewed-by: Yoshikazu Imai <imai.yoshikazu@fujitsu.com> Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com> Discussion: https://postgr.es/m/20190725000636.666m5mad25wfbrri%40alap3.anarazel.de
* Add SQL functions for Unicode normalizationPeter Eisentraut2020-04-02
| | | | | | | | | | | | | | | | | This adds SQL expressions NORMALIZE() and IS NORMALIZED to convert and check Unicode normal forms, per SQL standard. To support fast IS NORMALIZED tests, we pull in a new data file DerivedNormalizationProps.txt from Unicode and build a lookup table from that, using techniques similar to ones already used for other Unicode data. make update-unicode will keep it up to date. We only build and use these tables for the NFC and NFKC forms, because they are too big for NFD and NFKD and the improvement is not significant enough there. Reviewed-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Andreas Karlsson <andreas@proxel.se> Discussion: https://www.postgresql.org/message-id/flat/c1909f27-c269-2ed9-12f8-3ab72c8caf7a@2ndquadrant.com
* Collect statistics about SLRU cachesTomas Vondra2020-04-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a number of SLRU caches used to access important data like clog, commit timestamps, multixact, asynchronous notifications, etc. Until now we had no easy way to monitor these shared caches, compute hit ratios, number of reads/writes etc. This commit extends the statistics collector to track this information for a predefined list of SLRUs, and also introduces a new system view pg_stat_slru displaying the data. The list of built-in SLRUs is fixed, but additional SLRUs may be defined in extensions. Unfortunately, there's no suitable registry of SLRUs, so this patch simply defines a fixed list of SLRUs with entries for the built-in ones and one entry for all additional SLRUs. Extensions adding their own SLRU are fairly rare, so this seems acceptable. This patch only allows monitoring of SLRUs, not tuning. The SLRU sizes are still fixed (hard-coded in the code) and it's not entirely clear which of the SLRUs might need a GUC to tune size. In a way, allowing us to determine that is one of the goals of this patch. Bump catversion as the patch introduces new functions and system view. Author: Tomas Vondra Reviewed-by: Alvaro Herrera Discussion: https://www.postgresql.org/message-id/flat/20200119143707.gyinppnigokesjok@development
* Improve selectivity estimation for assorted match-style operators.Tom Lane2020-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Quite a few matching operators such as JSONB's @> used "contsel" and "contjoinsel" as their selectivity estimators. That was a bad idea, because (a) contsel is only a stub, yielding a fixed default estimate, and (b) that default is 0.001, meaning we estimate these operators as five times more selective than equality, which is surely pretty silly. There's a good model for improving this in ltree's ltreeparentsel(): for any "var OP constant" query, we can try applying the operator to all of the column's MCV and histogram values, taking the latter as being a random sample of the non-MCV values. That code is actually 100% generic, except for the question of exactly what default selectivity ought to be plugged in when we don't have stats. Hence, migrate the guts of ltreeparentsel() into the core code, provide wrappers "matchingsel" and "matchingjoinsel" with a more-appropriate default estimate, and use those for the non-geometric operators that formerly used contsel (mostly JSONB containment operators and tsquery matching). Also apply this code to some match-like operators in hstore, ltree, and pg_trgm, including the former users of ltreeparentsel as well as ones that improperly used contsel. Since commit 911e70207 just created new versions of those extensions that we haven't released yet, we can sneak this change into those new versions instead of having to create an additional generation of update scripts. Patch by me, reviewed by Alexey Bashtanov Discussion: https://postgr.es/m/12237.1582833074@sss.pgh.pa.us
* Teach pg_ls_dir_files() to ignore ENOENT failures from stat().Tom Lane2020-03-31
| | | | | | | | | | | | | | | | Buildfarm experience shows that this function can fail with ENOENT if some other process unlinks a file between when we read the directory entry and when we try to stat() it. The problem is old but we had not noticed it until 085b6b667 added regression test coverage. To fix, just ignore ENOENT failures. There is one other case that this might hide: a symlink that points to nowhere. That seems okay though, at least better than erroring. Back-patch to v10 where this function was added, since the regression test cases were too. Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
* Fix assorted typosMagnus Hagander2020-03-31
| | | | Author: Daniel Gustafsson <daniel@yesql.se>
* Implement operator class parametersAlexander Korotkov2020-03-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PostgreSQL provides set of template index access methods, where opclasses have much freedom in the semantics of indexing. These index AMs are GiST, GIN, SP-GiST and BRIN. There opclasses define representation of keys, operations on them and supported search strategies. So, it's natural that opclasses may be faced some tradeoffs, which require user-side decision. This commit implements opclass parameters allowing users to set some values, which tell opclass how to index the particular dataset. This commit doesn't introduce new storage in system catalog. Instead it uses pg_attribute.attoptions, which is used for table column storage options but unused for index attributes. In order to evade changing signature of each opclass support function, we implement unified way to pass options to opclass support functions. Options are set to fn_expr as the constant bytea expression. It's possible due to the fact that opclass support functions are executed outside of expressions, so fn_expr is unused for them. This commit comes with some examples of opclass options usage. We parametrize signature length in GiST. That applies to multiple opclasses: tsvector_ops, gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and gist_hstore_ops. Also we parametrize maximum number of integer ranges for gist__int_ops. However, the main future usage of this feature is expected to be json, where users would be able to specify which way to index particular json parts. Catversion is bumped. Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru Author: Nikita Glukhov, revised by me Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
* Improve the performance and accuracy of numeric sqrt() and ln().Dean Rasheed2020-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using Newton's method to compute numeric square roots, use the Karatsuba square root algorithm, which performs better for numbers of all sizes. In practice, this is 3-5 times faster for inputs with just a few digits and up to around 10 times faster for larger inputs. Also, the new algorithm guarantees that the final digit of the result is correctly rounded, since it computes an integer square root with truncation, containing at least 1 extra decimal digit before rounding. The former algorithm would occasionally round the wrong way because it rounded both the intermediate and final results. In addition, arrange for sqrt_var() to explicitly support negative rscale values (rounding before the decimal point). This allows the argument reduction phase of ln_var() to be optimised for large inputs, since it only needs to compute square roots with a few more digits than the final ln() result, rather than computing all the digits before the decimal point. For very large inputs, this can be many thousands of times faster. In passing, optimise div_var_fast() in a couple of places where it was doing unnecessary work. Patch be me, reviewed by Tom Lane and Tels. Discussion: https://postgr.es/m/CAEZATCV1A7+jD3P30Zu31KjaxeSEyOn3v9d6tYegpxcq3cQu-g@mail.gmail.com
* Trigger autovacuum based on number of INSERTsDavid Rowley2020-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Traditionally autovacuum has only ever invoked a worker based on the estimated number of dead tuples in a table and for anti-wraparound purposes. For the latter, with certain classes of tables such as insert-only tables, anti-wraparound vacuums could be the first vacuum that the table ever receives. This could often lead to autovacuum workers being busy for extended periods of time due to having to potentially freeze every page in the table. This could be particularly bad for very large tables. New clusters, or recently pg_restored clusters could suffer even more as many large tables may have the same relfrozenxid, which could result in large numbers of tables requiring an anti-wraparound vacuum all at once. Here we aim to reduce the work required by anti-wraparound and aggressive vacuums in general, by triggering autovacuum when the table has received enough INSERTs. This is controlled by adding two new GUCs and reloptions; autovacuum_vacuum_insert_threshold and autovacuum_vacuum_insert_scale_factor. These work exactly the same as the existing scale factor and threshold controls, only base themselves off the number of inserts since the last vacuum, rather than the number of dead tuples. New controls were added rather than reusing the existing controls, to allow these new vacuums to be tuned independently and perhaps even completely disabled altogether, which can be done by setting autovacuum_vacuum_insert_threshold to -1. We make no attempt to skip index cleanup operations on these vacuums as they may trigger for an insert-mostly table which continually doesn't have enough dead tuples to trigger an autovacuum for the purpose of removing those dead tuples. If we were to skip cleaning the indexes in this case, then it is possible for the index(es) to become bloated over time. There are additional benefits to triggering autovacuums based on inserts, as tables which never contain enough dead tuples to trigger an autovacuum are now more likely to receive a vacuum, which can mark more of the table as "allvisible" and encourage the query planner to make use of Index Only Scans. Currently, we still obey vacuum_freeze_min_age when triggering these new autovacuums based on INSERTs. For large insert-only tables, it may be beneficial to lower the table's autovacuum_freeze_min_age so that tuples are eligible to be frozen sooner. Here we've opted not to zero that for these types of vacuums, since the table may just be insert-mostly and we may otherwise freeze tuples that are still destined to be updated or removed in the near future. There was some debate to what exactly the new scale factor and threshold should default to. For now, these are set to 0.2 and 1000, respectively. There may be some motivation to adjust these before the release. Author: Laurenz Albe, Darafei Praliaskouski Reviewed-by: Alvaro Herrera, Masahiko Sawada, Chris Travers, Andres Freund, Justin Pryzby Discussion: https://postgr.es/m/CAC8Q8t%2Bj36G_bLF%3D%2B0iMo6jGNWnLnWb1tujXuJr-%2Bx8ZCCTqoQ%40mail.gmail.com
* Go back to returning int from ereport auxiliary functions.Tom Lane2020-03-25
| | | | | | | | | | | | | | | | | | | This reverts the parts of commit 17a28b03645e27d73bf69a95d7569b61e58f06eb that changed ereport's auxiliary functions from returning dummy integer values to returning void. It turns out that a minority of compilers complain (not entirely unreasonably) about constructs such as (condition) ? errdetail(...) : 0 if errdetail() returns void rather than int. We could update those call sites to say "(void) 0" perhaps, but the expectation for this patch set was that ereport callers would not have to change anything. And this aspect of the patch set was already the most invasive and least compelling part of it, so let's just drop it. Per buildfarm. Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
* Add collation versions for Windows.Thomas Munro2020-03-25
| | | | | | | | On Vista and later, use GetNLSVersionEx() to request collation version information. Reviewed-by: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGJvqup3s%2BJowVTcacZADO6dOhfdBmvOPHLS3KXUJu41Jw%40mail.gmail.com
* Allow NULL version for individual collations.Thomas Munro2020-03-25
| | | | | | | | | | | Remove the documented restriction that collation providers must either return NULL for all collations or non-NULL for all collations. Use NULL for glibc collations like "C.UTF-8", which might otherwise lead future proposed commits to force unnecessary index rebuilds. Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Discussion: https://postgr.es/m/CA%2BhUKGJvqup3s%2BJowVTcacZADO6dOhfdBmvOPHLS3KXUJu41Jw%40mail.gmail.com
* Improve the internal implementation of ereport().Tom Lane2020-03-24
| | | | | | | | | | | | | | | | | | | | | | Change all the auxiliary error-reporting routines to return void, now that we no longer need to pretend they are passing something useful to errfinish(). While this probably doesn't save anything significant at the machine-code level, it allows detection of some additional types of mistakes. Pass the error location details (__FILE__, __LINE__, PG_FUNCNAME_MACRO) to errfinish not errstart. This shaves a few cycles off the case where errstart decides we're not going to emit anything. Re-implement elog() as a trivial wrapper around ereport(), removing the separate support infrastructure it used to have. Aside from getting rid of some now-surplus code, this means that elog() now really does have exactly the same semantics as ereport(), in particular that it can skip evaluation work if the message is not to be emitted. Andres Freund and Tom Lane Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
* Add object names to partition integrity violations.Amit Kapila2020-03-23
| | | | | | | | | | | | All errors of SQLSTATE class 23 should include the name of an object associated with the error in separate fields of the error report message. We do this so that applications need not try to extract them from the possibly-localized human-readable text of the message. Reported-by: Chris Bandy Author: Chris Bandy Reviewed-by: Amit Kapila and Amit Langote Discussion: https://postgr.es/m/0aa113a3-3c7f-db48-bcd8-f9290b2269ae@gmail.com
* Introduce "anycompatible" family of polymorphic types.Tom Lane2020-03-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the pseudo-types anycompatible, anycompatiblearray, anycompatiblenonarray, and anycompatiblerange. They work much like anyelement, anyarray, anynonarray, and anyrange respectively, except that the actual input values need not match precisely in type. Instead, if we can find a common supertype (using the same rules as for UNION/CASE type resolution), then the parser automatically promotes the input values to that type. For example, "myfunc(anycompatible, anycompatible)" can match a call with one integer and one bigint argument, with the integer automatically promoted to bigint. With anyelement in the definition, the user would have had to cast the integer explicitly. The new types also provide a second, independent set of type variables for function matching; thus with "myfunc(anyelement, anyelement, anycompatible) returns anycompatible" the first two arguments are constrained to be the same type, but the third can be some other type, and the result has the type of the third argument. The need for more than one set of type variables was foreseen back when we first invented the polymorphic types, but we never did anything about it. Pavel Stehule, revised a bit by me Discussion: https://postgr.es/m/CAFj8pRDna7VqNi8gR+Tt2Ktmz0cq5G93guc3Sbn_NVPLdXAkqA@mail.gmail.com
* Implement type regcollationPeter Eisentraut2020-03-18
| | | | | | | | | This will be helpful for a following commit and it's also just generally useful, like the other reg* types. Author: Julien Rouhaud Reviewed-by: Thomas Munro and Michael Paquier Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
* Remove useless pfree()s at the ends of various ValuePerCall SRFs.Tom Lane2020-03-16
| | | | | | | | | | | | | | | | | | | | We don't need to manually clean up allocations in a SRF's multi_call_memory_ctx, because the SRF_RETURN_DONE infrastructure takes care of that (and also ensures that it will happen even if the function never gets a final call, which simple manual cleanup cannot do). Hence, the code removed by this patch is a waste of code and cycles. Worse, it gives the impression that cleaning up manually is a thing, which can lead to more serious errors such as those fixed in commits 085b6b667 and b4570d33a. So we should get rid of it. These are not quite actual bugs though, so I couldn't muster the enthusiasm to back-patch. Fix in HEAD only. Justin Pryzby Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
* Avoid holding a directory FD open across assorted SRF calls.Tom Lane2020-03-16
| | | | | | | | | | | | | | | | | This extends the fixes made in commit 085b6b667 to other SRFs with the same bug, namely pg_logdir_ls(), pgrowlocks(), pg_timezone_names(), pg_ls_dir(), and pg_tablespace_databases(). Also adjust various comments and documentation to warn against expecting to clean up resources during a ValuePerCall SRF's final call. Back-patch to all supported branches, since these functions were all born broken. Justin Pryzby, with cosmetic tweaks by me Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
* Rearrange pseudotypes.c to get rid of duplicative code.Tom Lane2020-03-14
| | | | | | | | | | | | | | | | | | | | Commit a5954de10 replaced a lot of manually-coded stub I/O routines with code generated by macros. That was a good idea but it didn't go far enough, because there were still manually-coded stub input routines for types that had live output routines. Refactor the macro so that we can generate just a stub input routine at need. Also create similar macros to generate stub binary I/O routines, since we have some of those now. The only stub functions that remain hand-coded are shell_in() and shell_out(), which need to be separate because they use different error messages. While here, rearrange the commentary to discuss each type not each function. This provides a better way to explain the *why* of which types need which support, rather than just duplicatively annotating the functions. Discussion: https://postgr.es/m/24137.1584139352@sss.pgh.pa.us
* Unify several ways to tracking backend typePeter Eisentraut2020-03-13
| | | | | | | | | | | | | | Add a new global variable MyBackendType that uses the same BackendType enum that was previously only used by the stats collector. That way several duplicate ways of checking what type a particular process is can be simplified. Since it's no longer just for stats, move to miscinit.c and rename existing functions to match the expanded purpose. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/c65e5196-4f04-4ead-9353-6088c19615a3@2ndquadrant.com
* Avoid holding a directory FD open across pg_ls_dir_files() calls.Tom Lane2020-03-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This coding technique is undesirable because (a) it leaks the FD for the rest of the transaction if the SRF is not run to completion, and (b) allocated FDs are a scarce resource, but multiple interleaved uses of the relevant functions could eat many such FDs. In v11 and later, a query such as "SELECT pg_ls_waldir() LIMIT 1" yields a warning about the leaked FD, and the only reason there's no warning in earlier branches is that fd.c didn't whine about such leaks before commit 9cb7db3f0. Even disregarding the warning, it wouldn't be too hard to run a backend out of FDs with careless use of these SQL functions. Hence, rewrite the function so that it reads the directory within a single call, returning the results as a tuplestore rather than via value-per-call mode. There are half a dozen other built-in SRFs with similar problems, but let's fix this one to start with, just to see if the buildfarm finds anything wrong with the code. In passing, fix bogus error report for stat() failure: it was whining about the directory when it should be fingering the individual file. Doubtless a copy-and-paste error. Back-patch to v10 where this function was added. Justin Pryzby, with cosmetic tweaks and test cases by me Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
* Remove utils/acl.h from catalog/objectaddress.hPeter Eisentraut2020-03-10
| | | | | | | | | | | | | | | | | | The need for this was removed by 8b9e9644dc6a9bd4b7a97950e6212f63880cf18b. A number of files now need to include utils/acl.h or parser/parse_node.h explicitly where they previously got it indirectly somehow. Since parser/parse_node.h already includes nodes/parsenodes.h, the latter is then removed where the former was added. Also, remove nodes/pg_list.h from objectaddress.h, since that's included via nodes/parsenodes.h. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/7601e258-26b2-8481-36d0-dc9dca6f28f1%402ndquadrant.com
* Add tg_updatedcols to TriggerDataPeter Eisentraut2020-03-09
| | | | | | | | | | This allows a trigger function to determine for an UPDATE trigger which columns were actually updated. This allows some optimizations in generic trigger functions such as lo_manage and tsvector_update_trigger. Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/flat/11c5f156-67a9-0fb5-8200-2a8018eb2e0c@2ndquadrant.com
* Allow Unicode escapes in any server encoding, not only UTF-8.Tom Lane2020-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | SQL includes provisions for numeric Unicode escapes in string literals and identifiers. Previously we only accepted those if they represented ASCII characters or the server encoding was UTF-8, making the conversion to internal form trivial. This patch adjusts things so that we'll call the appropriate encoding conversion function in less-trivial cases, allowing the escape sequence to be accepted so long as it corresponds to some character available in the server encoding. This also applies to processing of Unicode escapes in JSONB. However, the old restriction still applies to client-side JSON processing, since that hasn't got access to the server's encoding conversion infrastructure. This patch includes some lexer infrastructure that simplifies throwing errors with error cursors pointing into the middle of a string (or other complex token). For the moment I only used it for errors relating to Unicode escapes, but we might later expand the usage to some other cases. Patch by me, reviewed by John Naylor. Discussion: https://postgr.es/m/2393.1578958316@sss.pgh.pa.us
* Remove the "opaque" pseudo-type and associated compatibility hacks.Tom Lane2020-03-05
| | | | | | | | | | | | | | | | | | | A long time ago, it was necessary to declare datatype I/O functions, triggers, and language handler support functions in a very type-unsafe way involving a single pseudo-type "opaque". We got rid of those conventions in 7.3, but there was still support in various places to automatically convert such functions to the modern declaration style, to be able to transparently re-load dumps from pre-7.3 servers. It seems unnecessary to continue to support that anymore, so take out the hacks; whereupon the "opaque" pseudo-type itself is no longer needed and can be dropped. This is part of a group of patches removing various server-side kluges for transparently upgrading pre-8.0 dump files. Since we've had few complaints about dropping pg_dump's support for dumping from pre-8.0 servers (commit 64f3524e2), it seems okay to now remove these kluges. Discussion: https://postgr.es/m/4110.1583255415@sss.pgh.pa.us
* Remove RangeIOData->typiofuncAlvaro Herrera2020-03-05
| | | | | | | | | | | We used to carry the I/O function OID in RangeIOData, but it's not used for anything. Since the struct is not exposed to the world anyway, we can simplify it a bit. Also, rename the FmgrInfo member to match the accompanying 'typioparam' and put them in a more sensible order. Reviewed by Tom Lane and Paul Jungwirth. Discussion: https://postgr.es/m/20200304215711.GA8732@alvherre.pgsql
* Introduce macros for typalign and typstorage constants.Tom Lane2020-03-04
| | | | | | | | | | | | | | | | | | | | | Our usual practice for "poor man's enum" catalog columns is to define macros for the possible values and use those, not literal constants, in C code. But for some reason lost in the mists of time, this was never done for typalign/attalign or typstorage/attstorage. It's never too late to make it better though, so let's do that. The reason I got interested in this right now is the need to duplicate some uses of the TYPSTORAGE constants in an upcoming ALTER TYPE patch. But in general, this sort of change aids greppability and readability, so it's a good idea even without any specific motivation. I may have missed a few places that could be converted, and it's even more likely that pending patches will re-introduce some hard-coded references. But that's not fatal --- there's no expectation that we'd actually change any of these values. We can clean up stragglers over time. Discussion: https://postgr.es/m/16457.1583189537@sss.pgh.pa.us
* Allow to_date/to_timestamp to recognize non-English month/day names.Tom Lane2020-03-03
| | | | | | | | | | | | | | | | | | | | | | | | | to_char() has long allowed the TM (translation mode) prefix to specify output of translated month or day names; but that prefix had no effect in input format strings. Now it does. to_date() and to_timestamp() will now recognize the same month or day names that to_char() would output for the same format code. Matching is case-insensitive (per the active collation's notion of what that means), just as it has always been for English month/day names without the TM prefix. (As per the discussion thread, there are lots of cases that this feature will not handle, such as alternate day names. But being able to accept what to_char() will output seems useful enough.) In passing, fix some shaky English and violations of message style guidelines in jsonpath errors for the .datetime() method, which depends on this code. Juan José Santamaría Flecha, reviewed and modified by me, with other commentary from Alvaro Herrera, Tomas Vondra, Arthur Zakirov, Peter Eisentraut, Mark Dilger. Discussion: https://postgr.es/m/CAC+AXB3u1jTngJcoC1nAHBf=M3v-jrEfo86UFtCqCjzbWS9QhA@mail.gmail.com
* Report progress of streaming base backup.Fujii Masao2020-03-03
| | | | | | | | | | | | | This commit adds pg_stat_progress_basebackup view that reports the progress while an application like pg_basebackup is taking a base backup. This uses the progress reporting infrastructure added by c16dc1aca5e0, adding support for streaming base backup. Bump catversion. Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi, Amit Langote, Sergei Kornilov Discussion: https://postgr.es/m/9ed8b801-8215-1f3d-62d7-65bff53f6e94@oss.nttdata.com
* Fix corner-case loss of precision in numeric ln().Dean Rasheed2020-03-01
| | | | | | | | | | | | | | | | | | | | | | When deciding on the local rscale to use for the Taylor series expansion, ln_var() neglected to account for the fact that the result is subsequently multiplied by a factor of 2^(nsqrt+1), where nsqrt is the number of square root operations performed in the range reduction step, which can be as high as 22 for very large inputs. This could result in a loss of precision, particularly when combined with large rscale values, for which a large number of Taylor series terms is required (up to around 400). Fix by computing a few extra digits in the Taylor series, based on the weight of the multiplicative factor log10(2^(nsqrt+1)). It remains to be proven whether or not the other 8 extra digits used for the Taylor series is appropriate, but this at least deals with the obvious oversight of failing to account for the effects of the final multiplication. Per report from Justin AnyhowStep. Reviewed by Tom Lane. Discussion: https://postgr.es/m/16280-279f299d9c06e56f@postgresql.org
* Move src/backend/utils/hash/hashfn.c to src/commonRobert Haas2020-02-27
| | | | | | | | | | | | | | This also involves renaming src/include/utils/hashutils.h, which becomes src/include/common/hashfn.h. Perhaps an argument can be made for keeping the hashutils.h name, but it seemed more consistent to make it match the name of the file, and also more descriptive of what is actually going on here. Patch by me, reviewed by Suraj Kharage and Mark Dilger. Off-list advice on how not to break the Windows build from Davinder Singh and Amit Kapila. Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
* Add equalimage B-Tree support functions.Peter Geoghegan2020-02-26
| | | | | | | | | | | | | | | | | | | | | | Invent the concept of a B-Tree equalimage ("equality implies image equality") support function, registered as support function 4. This indicates whether it is safe (or not safe) to apply optimizations that assume that any two datums considered equal by an operator class's order method must be interchangeable without any loss of semantic information. This is static information about an operator class and a collation. Register an equalimage routine for almost all of the existing B-Tree opclasses. We only need two trivial routines for all of the opclasses that are included with the core distribution. There is one routine for opclasses that index non-collatable types (which returns 'true' unconditionally), plus another routine for collatable types (which returns 'true' when the collation is a deterministic collation). This patch is infrastructure for an upcoming patch that adds B-Tree deduplication. Author: Peter Geoghegan, Anastasia Lubennikova Discussion: https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com
* Assume that we have <wchar.h>.Tom Lane2020-02-21
| | | | | | | | | | | | | Windows has this, and so do all other live platforms according to the buildfarm; it's been required by POSIX since SUSv2. So remove the configure probe and tests of HAVE_WCHAR_H. This is part of a series of commits to get rid of no-longer-relevant configure checks and dead src/port/ code. I'm committing them separately to make it easier to back out individual changes if they prove less portable than I expect. Discussion: https://postgr.es/m/15379.1582221614@sss.pgh.pa.us
* Assume that we have cbrt().Tom Lane2020-02-21
| | | | | | | | | | | | Windows has this, and so do all other live platforms according to the buildfarm, so remove the configure probe and float.c's substitute code. This is part of a series of commits to get rid of no-longer-relevant configure checks and dead src/port/ code. I'm committing them separately to make it easier to back out individual changes if they prove less portable than I expect. Discussion: https://postgr.es/m/15379.1582221614@sss.pgh.pa.us
* Remove duplicated words in commentsMichael Paquier2020-02-18
| | | | | | Author: Daniel Gustafsson Reviewed-by: Vik Fearing Discussion: https://postgr.es/m/EBC3BFEB-664C-4063-81ED-29F1227DB012@yesql.se
* Avoid a performance regression in float overflow/underflow detection.Tom Lane2020-02-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 6bf0bc842 replaced float.c's CHECKFLOATVAL() macro with static inline subroutines, but that wasn't too well thought out. In the original coding, the unlikely condition (isinf(result) or result == 0) was checked first, and the inf_is_valid or zero_is_valid condition only afterwards. The inline-subroutine coding caused that to be swapped around, which is pretty horrid for performance because (a) in common cases the is_valid condition is twice as expensive to evaluate (e.g., requiring two isinf() calls not one) and (b) in common cases the is_valid condition is false, requiring us to perform the unlikely-condition check anyway. Net result is that one isinf() call becomes two or three, resulting in visible performance loss as reported by Keisuke Kuroda. The original fix proposal was to revert the replacement of the macro, but on second thought, that macro was just a bad idea from the beginning: if anything it's a net negative for readability of the code. So instead, let's just open-code all the overflow/underflow tests, being careful to test the unlikely condition first (and mark it unlikely() to help the compiler get the point). Also, rather than having N copies of the actual ereport() calls, collapse those into out-of-line error subroutines to save some code space. This does mean that the error file/line numbers won't be very helpful for figuring out where the issue really is --- but we'd already burned that bridge by putting the ereports into static inlines. In HEAD, check_float[48]_val() are gone altogether. In v12, leave them present in float.h but unused in the core code, just in case some extension is depending on them. Emre Hasegeli, with some kibitzing from me and Andres Freund Discussion: https://postgr.es/m/CANDwggLe1Gc1OrRqvPfGE=kM9K0FSfia0hbeFCEmwabhLz95AA@mail.gmail.com
* Fix bug in Tid scan.Fujii Masao2020-02-07
| | | | | | | | | | | | | | | | | | | | | | | Commit 147e3722f7 changed Tid scan so that it calls table_beginscan() and uses the scan option for seq scan. This change caused two issues. (1) The change caused Tid scan to take a predicate lock on the entire relation in serializable transaction even when relation-level lock is not necessary. This could lead to an unexpected serialization error. (2) The change caused Tid scan to increment the number of seq_scan in pg_stat_*_tables views even though it's not seq scan. This could confuse the users. This commit adds the scan option for Tid scan and makes Tid scan use it, to avoid those issues. Back-patch to v12, where the bug was introduced. Author: Tatsuhito Kasahara Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Fujii Masao Discussion: https://postgr.es/m/CAP0=ZVKy+gTbFmB6X_UW0pP3WaeJ-fkUWHoD-pExS=at3CY76g@mail.gmail.com
* Refactor hash_agg_entry_size().Jeff Davis2020-02-06
| | | | | | Consolidate the calculations for hash table size estimation. This will help with upcoming Hash Aggregation work that will add additional call sites.
* Add leader_pid to pg_stat_activityMichael Paquier2020-02-06
| | | | | | | | | | | | | | | This new field tracks the PID of the group leader used with parallel query. For parallel workers and the leader, the value is set to the PID of the group leader. So, for the group leader, the value is the same as its own PID. Note that this reflects what PGPROC stores in shared memory, so as leader_pid is NULL if a backend has never been involved in parallel query. If the backend is using parallel query or has used it at least once, the value is set until the backend exits. Author: Julien Rouhaud Reviewed-by: Sergei Kornilov, Guillaume Lelarge, Michael Paquier, Tomas Vondra Discussion: https://postgr.es/m/CAOBaU_Yy5bt0vTPZ2_LUM6cUcGeqmYNoJ8-Rgto+c2+w3defYA@mail.gmail.com
* Add declaration-level assertions for compile-time checksMichael Paquier2020-02-03
| | | | | | | | | | | Those new assertions can be used at file scope, outside of any function for compilation checks. This commit provides implementations for C and C++, and fallback implementations. Author: Peter Smith Reviewed-by: Andres Freund, Kyotaro Horiguchi, Dagfinn Ilmari Mannsåker, Michael Paquier Discussion: https://postgr.es/m/201DD0641B056142AC8C6645EC1B5F62014B8E8030@SYD1217
* Optimizations for integer to decimal output.Andrew Gierth2020-02-01
| | | | | | | | | | Using a lookup table of digit pairs reduces the number of divisions needed, and calculating the length upfront saves some work; these ideas are taken from the code previously committed for floats. David Fetter, reviewed by Kyotaro Horiguchi, Tels, and me. Discussion: https://postgr.es/m/20190924052620.GP31596%40fetter.org
* Fix not-quite-right string comparison in parse_jsonb_index_flags().Tom Lane2020-01-31
| | | | | | | | | | | | | This code would accept "strinX", where X is any 1-byte character, as meaning "string". Clearly it wasn't meant to do that. No back-patch, since this doesn't affect correct queries and there's some tiny chance we'd break somebody's incorrect query in a minor release. Report and patch by Dominik Czarnota. Discussion: https://postgr.es/m/CABEVAa1dU0mDCAfaT8WF2adVXTDsLVJy_izotg6ze_hh-cn8qQ@mail.gmail.com
* Clean up newlines following left parenthesesAlvaro Herrera2020-01-30
| | | | | | | | | | | | We used to strategically place newlines after some function call left parentheses to make pgindent move the argument list a few chars to the left, so that the whole line would fit under 80 chars. However, pgindent no longer does that, so the newlines just made the code vertically longer for no reason. Remove those newlines, and reflow some of those lines for some extra naturality. Reviewed-by: Michael Paquier, Tom Lane Discussion: https://postgr.es/m/20200129200401.GA6303@alvherre.pgsql