postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
...
*	Reconsider pg_stat_subscription_workers view.	Amit Kapila	2022-03-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It was decided (refer to the Discussion link below) that the stats collector is not an appropriate place to store the error information of subscription workers. This patch changes the pg_stat_subscription_workers view (introduced by commit 8d74fc96db) so that it stores only statistics counters: apply_error_count and sync_error_count, and has one entry for each subscription. The removed error information such as error-XID and the error message would be stored in another way in the future which is more reliable and persistent. After removing these error details, there is no longer any relation information, so the subscription statistics are now a cluster-wide statistics. The patch also changes the view name to pg_stat_subscription_stats since the word "worker" is an implementation detail that we use one worker for one tablesync and one apply. Author: Masahiko Sawada, based on suggestions by Andres Freund Reviewed-by: Peter Smith, Haiying Tang, Takamichi Osumi, Amit Kapila Discussion: https://postgr.es/m/20220125063131.4cmvsxbz2tdg6g65@alap3.anarazel.de
*	Handle integer overflow in interval justification functions.	Tom Lane	2022-02-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	justify_interval, justify_hours, and justify_days didn't check for overflow when promoting hours to days or days to months; but that's possible when the upper field's value is already large. Detect and report any such overflow. Also, we can avoid unnecessary overflow in some cases in justify_interval by pre-justifying the days field. (Thanks to Nathan Bossart for this idea.) Joe Koshakow Discussion: https://postgr.es/m/CAAvxfHeNqsJ2xYFbPUf_8nNQUiJqkag04NW6aBQQ0dbZsxfWHA@mail.gmail.com
*	Optimise numeric division for one and two base-NBASE digit divisors.	Dean Rasheed	2022-02-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Formerly div_var() had "fast path" short division code that was significantly faster when the divisor was just one base-NBASE digit, but otherwise used long division. This commit adds a new function div_var_int() that divides by an arbitrary 32-bit integer, using the fast short division algorithm, and updates both div_var() and div_var_fast() to use it for one and two digit divisors. In the case of div_var(), this is slightly faster in the one-digit case, because it avoids some digit array copying, and is much faster in the two-digit case where it replaces long division. For div_var_fast(), it is much faster in both cases because the main div_var_fast() algorithm is optimised for larger inputs. Additionally, optimise exp() and ln() by using div_var_int(), allowing a NumericVar to be replaced by an int in a couple of places, most notably in the Taylor series code. This produces a significant speedup of exp(), ln() and the numeric_big regression test. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCVwsBi-ND-t82Cuuh1=8ee6jdOpzsmGN+CUZB6yjLg9jw@mail.gmail.com
*	Simplify the inner loop of numeric division in div_var().	Dean Rasheed	2022-02-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the standard numeric division algorithm, the inner loop multiplies the divisor by the next quotient digit and subtracts that from the working dividend. As suggested by the original code comment, the separate "carry" and "borrow" variables (from the multiplication and subtraction steps respectively) can be folded together into a single variable. Doing so significantly improves performance, as well as simplifying the code. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCVwsBi-ND-t82Cuuh1=8ee6jdOpzsmGN+CUZB6yjLg9jw@mail.gmail.com
*	Apply auto-vectorization to the inner loop of div_var_fast().	Dean Rasheed	2022-02-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This loop is basically the same as the inner loop of mul_var(), which was auto-vectorized in commit 8870917623, but the compiler will only consider auto-vectorizing the div_var_fast() loop if the assignment target div[qi + i] is replaced by div_qi[i], where div_qi = &div[qi]. Additionally, since the compiler doesn't know that qdigit is guaranteed to fit in a 16-bit NumericDigit, cast it to NumericDigit before multiplying to make the resulting auto-vectorized code more efficient (avoiding unnecessary multiplication of the high 16 bits). While at it, per suggestion from Tom Lane, change var1digit in mul_var() to be a NumericDigit rather than an int for the same reason. This actually makes no difference with modern gcc, but it might help other compilers generate more efficient assembly. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCVwsBi-ND-t82Cuuh1=8ee6jdOpzsmGN+CUZB6yjLg9jw@mail.gmail.com
*	Simplify more checks related to set-returning functions	Michael Paquier	2022-02-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This makes more consistent the SRF-related checks in the area of PL/pgSQL, PL/Perl, PL/Tcl, pageinspect and some of the JSON worker functions, making it easier to grep for the same error patterns through the code, reducing a bit the translation work. It is worth noting that each_worker_jsonb()/each_worker() in jsonfuncs.c and pageinspect's brin_page_items() were doing a check on expectedDesc that is not required as they fetch their tuple descriptor directly from get_call_result_type(). This looks like a set of copy-paste errors that have spread over the years. This commit is a continuation of the changes begun in 07daca5, for any remaining code paths on sight. Like fcc2817, this makes the code more consistent, easing the integration of a larger patch that will refactor the way tuplestores are created and checked in a good portion of the set-returning functions present in core. I have worked my way through the changes of this patch by myself, and Ranier has proposed the same changes in a different thread in parallel, though there were some inconsistencies related in expectedDesc in what was proposed by him. Author: Michael Paquier, Ranier Vilela Discussion: https://postgr.es/m/CAAKRu_azyd1Z3W_r7Ou4sorTjRCs+PxeHw1CWJeXKofkE6TuZg@mail.gmail.com Discussion: https://postgr.es/m/CAEudQApm=AFuJjEHLBjBcJbxcw4pBMwg2sHwXyCXYcbBOj3hpg@mail.gmail.com
*	Use bitwise rotate functions in more places	John Naylor	2022-02-20
\| \| \| \| \| \| \| \| \| \| \| \| \|	There were a number of places in the code that used bespoke bit-twiddling expressions to do bitwise rotation. While we've had pg_rotate_right32() for a while now, we hadn't gotten around to standardizing on that. Do so now. Since many potential call sites look more natural with the "left" equivalent, add that function too. Reviewed by Tom Lane and Yugo Nagata Discussion: https://www.postgresql.org/message-id/CAFBsxsH7c1LC0CGZ0ADCBXLHU5-%3DKNXx-r7tHYPAW51b2HK4Qw%40mail.gmail.com
*	Fix inconsistencies in SRF checks of pg_config() and string_to_table()	Michael Paquier	2022-02-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The execution paths of those functions have been using a set of checks inconsistent with any other SRF function: - string_to_table() missed a check on expectedDesc, the tuple descriptor expected by the caller, that should never be NULL. Introduced in 66f1630. - pg_config() should check for a ReturnSetInfo, and expectedDesc cannot be NULL. Its error messages were also inconsistent. Introduced in a5c43b8. Extracted from a larger patch by the same author, in preparation for a larger patch set aimed at refactoring the way tuplestores are created and checked in SRF functions. Author: Melanie Plageman Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/CAAKRu_azyd1Z3W_r7Ou4sorTjRCs+PxeHw1CWJeXKofkE6TuZg@mail.gmail.com
*	Remove all traces of tuplestore_donestoring() in the C code	Michael Paquier	2022-02-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This routine is a no-op since dd04e95 from 2003, with a macro kept around for compatibility purposes. This has led to the same code patterns being copy-pasted around for no effect, sometimes in confusing ways like in pg_logical_slot_get_changes_guts() from logical.c where the code was actually incorrect. This issue has been discussed on two different threads recently, so rather than living with this legacy, remove any uses of this routine in the C code to simplify things. The compatibility macro is kept to avoid breaking any out-of-core modules that depend on it. Reported-by: Tatsuhito Kasahara, Justin Pryzby Author: Tatsuhito Kasahara Discussion: https://postgr.es/m/20211217200419.GQ17618@telsasoft.com Discussion: https://postgr.es/m/CAP0=ZVJeeYfAeRfmzqAF2Lumdiv4S4FewyBnZd4DPTrsSQKJKw@mail.gmail.com
*	Ensure that length argument of memcmp() isn't seen as negative.	Tom Lane	2022-02-15
\| \| \| \| \| \| \| \| \| \|	I think this will shut up a weird warning from buildfarm member serinus. Perhaps it'd be better to change tsCompareString's length arguments to unsigned, but that seems more invasive than is justified. Part of a general push to remove off-the-beaten-track warnings where we can easily do so.
*	Remove pg_atoi()	Peter Eisentraut	2022-02-15
\| \| \| \| \| \| \| \| \|	The last caller was int2vectorin(), and having such a general function for one user didn't seem useful, so just put the required parts inline and remove the function. Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com
*	Remove one use of pg_atoi()	Peter Eisentraut	2022-02-14
\| \| \| \| \| \| \|	There was no real need to use this here instead of a simpler API. Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com
*	Move scanint8() to numutils.c	Peter Eisentraut	2022-02-14
\| \| \| \| \| \| \| \| \| \| \|	Move scanint8() to numutils.c and rename to pg_strtoint64(). We already have a "16" and "32" version of that, and the code inside the functions was aligned, so this move makes all three versions consistent. The API is also changed to no longer provide the errorOK case. Users that need the error checking can use strtoi64(). Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com
*	Update comment	Peter Eisentraut	2022-02-10
\| \| \| \| \|	Update a comment that assumed that libc collations don't support versioning. Also improve an adjacent error message a bit.
*	Add min() and max() aggregates for xid8.	Fujii Masao	2022-02-10
\| \| \| \| \| \| \| \|	Bump catalog version. Author: Ken Kato Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/47d77b18c44f87f8222c4c7a3e2dee6b@oss.nttdata.com
*	Reduce more the number of calls to GetMaxBackends()	Michael Paquier	2022-02-10
\| \| \| \| \| \| \| \| \| \| \| \| \|	Some of the code paths changed by aa64f23 can reduce the number of times GetMaxBackends() is called. The performance gain is marginal, but most of the code changed by this commit already did that. Hence, let's be clean and apply the same rule everywhere, for consistency. Some of the code paths, like in deadlock.c, involve only assertions. These are left unchanged. Reviewed-by: Nathan Bossart, Robert Haas Discussion: https://postgr.es/m/YgMpGZhPOjNfS7er@paquier.xyz
*	Remove MaxBackends variable in favor of GetMaxBackends() function.	Robert Haas	2022-02-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, it was really easy to write code that accessed MaxBackends before we'd actually initialized it, especially when coding up an extension. To make this less error-prune, introduce a new function GetMaxBackends() which should be used to obtain the correct value. This will ERROR if called too early. Demote the global variable to a file-level static, so that nobody can peak at it directly. Nathan Bossart. Idea by Andres Freund. Review by Greg Sabino Mullane, by Michael Paquier (who had doubts about the approach), and by me. Discussion: http://postgr.es/m/20210802224204.bckcikl45uezv5e4@alap3.anarazel.de
*	Improve worst-case performance of text_position_get_match_pos()	John Naylor	2022-02-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This function converts a byte position to a character position after a successful string match. Rather than calling pg_mblen() in a loop, use pg_mbstrlen_with_len() since the latter can inline its own call to pg_mblen(). When the string match is at the end of the haystack text, this change results in 10-20% performance improvement, depending on platform and typical character length in bytes. This also simplifies the code a little. Specializing for UTF-8 could result in further improvement, but the performance gain was not found to be reliable between platforms. The modest gain in this commit is stable between platforms and usable by all server encodings. Discussion: https://www.postgresql.org/message-id/CAFBsxsH1Yutrmu+6LLHKK8iXY+vG--Do6zN+2900spHXQNNQKQ@mail.gmail.com
*	Add UNIQUE null treatment option	Peter Eisentraut	2022-02-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The SQL standard has been ambiguous about whether null values in unique constraints should be considered equal or not. Different implementations have different behaviors. In the SQL:202x draft, this has been formalized by making this implementation-defined and adding an option on unique constraint definitions UNIQUE [ NULLS [NOT] DISTINCT ] to choose a behavior explicitly. This patch adds this option to PostgreSQL. The default behavior remains UNIQUE NULLS DISTINCT. Making this happen in the btree code is pretty easy; most of the patch is just to carry the flag around to all the places that need it. The CREATE UNIQUE INDEX syntax extension is not from the standard, it's my own invention. I named all the internal flags, catalog columns, etc. in the negative ("nulls not distinct") so that the default PostgreSQL behavior is the default if the flag is false. Reviewed-by: Maxim Orlov <orlovmg@gmail.com> Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/84e5ee1b-387e-9a54-c326-9082674bde78@enterprisedb.com
*	Simplify coding around path_contains_parent_reference().	Tom Lane	2022-01-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Given the existing stipulation that path_contains_parent_reference() must only be invoked on canonicalized paths, we can simplify things in the wake of commit c10f830c5. It is now only possible to see ".." at the start of a relative path. That means we can simplify path_contains_parent_reference() itself quite a bit, and it makes the two existing outside call sites dead code, since they'd already checked that the path is absolute. We could now fold path_contains_parent_reference() into its only remaining caller path_is_relative_and_below_cwd(). But it seems better to leave it as a separately callable function, in case any extensions are using it. Also document the pre-existing requirement for path_is_relative_and_below_cwd's input to be likewise canonicalized. Shenhao Wang and Tom Lane Discussion: https://postgr.es/m/OSBPR01MB4214FA221FFE046F11F2AD74F2D49@OSBPR01MB4214.jpnprd01.prod.outlook.com
*	Fix ordering of XIDs in ProcArrayApplyRecoveryInfo	Tomas Vondra	2022-01-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 8431e296ea reworked ProcArrayApplyRecoveryInfo to sort XIDs before adding them to KnownAssignedXids. But the XIDs are sorted using xidComparator, which compares the XIDs simply as uint32 values, not logically. KnownAssignedXidsAdd() however expects XIDs in logical order, and calls TransactionIdFollowsOrEquals() to enforce that. If there are XIDs for which the two orderings disagree, an error is raised and the recovery fails/restarts. Hitting this issue is fairly easy - you just need two transactions, one started before the 4B limit (e.g. XID 4294967290), the other sometime after it (e.g. XID 1000). Logically (4294967290 <= 1000) but when compared using xidComparator we try to add them in the opposite order. Which makes KnownAssignedXidsAdd() fail with an error like this: ERROR: out-of-order XID insertion in KnownAssignedXids This only happens during replica startup, while processing RUNNING_XACTS records to build the snapshot. Once we reach STANDBY_SNAPSHOT_READY, we skip these records. So this does not affect already running replicas, but if you restart (or create) a replica while there are transactions with XIDs for which the two orderings disagree, you may hit this. Long-running transactions and frequent replica restarts increase the likelihood of hitting this issue. Once the replica gets into this state, it can't be started (even if the old transactions are terminated). Fixed by sorting the XIDs logically - this is fine because we're dealing with normal XIDs (because it's XIDs assigned to backends) and from the same wraparound epoch (otherwise the backends could not be running at the same time on the primary node). So there are no problems with the triangle inequality, which is why xidComparator compares raw values. Investigation and root cause analysis by Abhijit Menon-Sen. Patch by me. This issue is present in all releases since 9.4, however releases up to 9.6 are EOL already so backpatch to 10 only. Reviewed-by: Abhijit Menon-Sen Reviewed-by: Alvaro Herrera Backpatch-through: 10 Discussion: https://postgr.es/m/36b8a501-5d73-277c-4972-f58a4dce088a%40enterprisedb.com
*	Change collate and ctype fields to type text	Peter Eisentraut	2022-01-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This changes the data type of the catalog fields datcollate, datctype, collcollate, and collctype from name to text. There wasn't ever a really good reason for them to be of type name; presumably this was just carried over from when they were fixed-size fields in pg_control, first into the corresponding pg_database fields, and then to pg_collation. The values are not identifiers or object names, and we don't ever look them up that way. Changing to type text saves space in the typical case, since locale names are typically only a few bytes long. But it is also possible that an ICU locale name with several customization options appended could be longer than 63 bytes, so this also enables that case, which was previously probably broken. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a@2ndquadrant.com
*	Call pg_newlocale_from_collation() also with default collation	Peter Eisentraut	2022-01-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, callers of pg_newlocale_from_collation() did not call it if the collation was DEFAULT_COLLATION_OID and instead proceeded with a pg_locale_t of 0. Instead, now we call it anyway and have it return 0 if the default collation was passed. It already did this, so we just have to adjust the callers. This simplifies all the call sites and also makes future enhancements easier. After discussion and testing, the previous comment in pg_locale.c about avoiding this for performance reasons may have been mistaken since it was testing a very different patch version way back when. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/ed3baa81-7fac-7788-cc12-41e3f7917e34@enterprisedb.com
*	pg_upgrade: Preserve relfilenodes and tablespace OIDs.	Robert Haas	2022-01-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, database OIDs, relfilenodes, and tablespace OIDs can all change when a cluster is upgraded using pg_upgrade. It seems better to preserve them, because (1) it makes troubleshooting pg_upgrade easier, since you don't have to do a lot of work to match up files in the old and new clusters, (2) it allows 'rsync' to save bandwidth when used to re-sync a cluster after an upgrade, and (3) if we ever encrypt or sign blocks, we would likely want to use a nonce that depends on these values. This patch only arranges to preserve relfilenodes and tablespace OIDs. The task of preserving database OIDs is left for another patch, since it involves some complexities that don't exist in these cases. Database OIDs have a similar issue, but there are some tricky points in that case that do not apply to these cases, so that problem is left for another patch. Shruthi KC, based on an earlier patch from Antonin Houska, reviewed and with some adjustments by me. Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com
*	Introduce log_destination=jsonlog	Michael Paquier	2022-01-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"jsonlog" is a new value that can be added to log_destination to provide logs in the JSON format, with its output written to a file, making it the third type of destination of this kind, after "stderr" and "csvlog". The format is convenient to feed logs to other applications. There is also a plugin external to core that provided this feature using the hook in elog.c, but this had to overwrite the output of "stderr" to work, so being able to do both at the same time was not possible. The files generated by this log format are suffixed with ".json", and use the same rotation policies as the other two formats depending on the backend configuration. This takes advantage of the refactoring work done previously in ac7c807, bed6ed3, 8b76f89 and 2d77d83 for the backend parts, and 72b76f7 for the TAP tests, making the addition of any new file-based format rather straight-forward. The documentation is updated to list all the keys and the values that can exist in this new format. pg_current_logfile() also required a refresh for the new option. Author: Sehrope Sarkuni, Michael Paquier Reviewed-by: Nathan Bossart, Justin Pryzby Discussion: https://postgr.es/m/CAH7T-aqswBM6JWe4pDehi1uOiufqe06DJWaU5=X7dDLyqUExHg@mail.gmail.com
*	Add stxdinherit flag to pg_statistic_ext_data	Tomas Vondra	2022-01-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add pg_statistic_ext_data.stxdinherit flag, so that for each extended statistics definition we can store two versions of data - one for the relation alone, one for the whole inheritance tree. This is analogous to pg_statistic.stainherit, but we failed to include such flag in catalogs for extended statistics, and we had to work around it (see commits 859b3003de, 36c4bc6e72 and 20b9fa308e). This changes the relationship between the two catalogs storing extended statistics objects (pg_statistic_ext and pg_statistic_ext_data). Until now, there was a simple 1:1 mapping - for each definition there was one pg_statistic_ext_data row, and this row was inserted while creating the statistics (and then updated during ANALYZE). With the stxdinherit flag, we don't know how many rows there will be (child relations may be added after the statistics object is defined), so there may be up to two rows. We could make CREATE STATISTICS to always create both rows, but that seems wasteful - without partitioning we only need stxdinherit=false rows, and declaratively partitioned tables need only stxdinherit=true. So we no longer initialize pg_statistic_ext_data in CREATE STATISTICS, and instead make that a responsibility of ANALYZE. Which is what we do for regular statistics too. Patch by me, with extensive improvements and fixes by Justin Pryzby. Author: Tomas Vondra, Justin Pryzby Reviewed-by: Tomas Vondra, Justin Pryzby Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
*	Build inherited extended stats on partitioned tables	Tomas Vondra	2022-01-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 859b3003de disabled building of extended stats for inheritance trees, to prevent updating the same catalog row twice. While that resolved the issue, it also means there are no extended stats for declaratively partitioned tables, because there are no data in the non-leaf relations. That also means declaratively partitioned tables were not affected by the issue 859b3003de addressed, which means this is a regression affecting queries that calculate estimates for the whole inheritance tree as a whole (which includes e.g. GROUP BY queries). But because partitioned tables are empty, we can invert the condition and build statistics only for the case with inheritance, without losing anything. And we can consider them when calculating estimates. It may be necessary to run ANALYZE on partitioned tables, to collect proper statistics. For declarative partitioning there should no prior statistics, and it might take time before autoanalyze is triggered. For tables partitioned by inheritance the statistics may include data from child relations (if built 859b3003de), contradicting the current code. Report and patch by Justin Pryzby, minor fixes and cleanup by me. Backpatch all the way back to PostgreSQL 10, where extended statistics were introduced (same as 859b3003de). Author: Justin Pryzby Reported-by: Justin Pryzby Backpatch-through: 10 Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
*	Ignore extended statistics for inheritance trees	Tomas Vondra	2022-01-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit 859b3003de we only build extended statistics for individual relations, ignoring the child relations. This resolved the issue with updating catalog tuple twice, but we still tried to use the statistics when calculating estimates for the whole inheritance tree. When the relations contain very distinct data, it may produce bogus estimates. This is roughly the same issue 427c6b5b9 addressed ~15 years ago, and we fix it the same way - by ignoring extended statistics when calculating estimates for the inheritance tree as a whole. We still consider extended statistics when calculating estimates for individual child relations, of course. This may result in plan changes due to different estimates, but if the old statistics were not describing the inheritance tree particularly well it's quite likely the new plans is actually better. Report and patch by Justin Pryzby, minor fixes and cleanup by me. Backpatch all the way back to PostgreSQL 10, where extended statistics were introduced (same as 859b3003de). Author: Justin Pryzby Reported-by: Justin Pryzby Backpatch-through: 10 Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
*	Rename value node fields	Peter Eisentraut	2022-01-14
\| \| \| \| \| \| \| \| \|	For the formerly-Value node types, rename the "val" field to a name specific to the node type, namely "ival", "fval", "sval", and "bsval". This makes some code clearer and catches mixups better. Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/8c1a2e37-c68d-703c-5a83-7a6077f4f997@enterprisedb.com
*	Fix ruleutils.c's dumping of whole-row Vars in more contexts.	Tom Lane	2022-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 7745bc352 intended to ensure that whole-row Vars would be printed with "::type" decoration in all contexts where plain "var.*" notation would result in star-expansion, notably in ROW() and VALUES() constructs. However, it missed the case of INSERT with a single-row VALUES, as reported by Timur Khanjanov. Nosing around ruleutils.c, I found a second oversight: the code for RowCompareExpr generates ROW() notation without benefit of an actual RowExpr, and naturally it wasn't in sync :-(. (The code for FieldStore also does this, but we don't expect that to generate strictly parsable SQL anyway, so I left it alone.) Back-patch to all supported branches. Discussion: https://postgr.es/m/efaba6f9-4190-56be-8ff2-7a1674f9194f@intrans.baku.az
*	Enhance pg_log_backend_memory_contexts() for auxiliary processes.	Fujii Masao	2022-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously pg_log_backend_memory_contexts() could request to log the memory contexts of backends, but not of auxiliary processes such as checkpointer. This commit enhances the function so that it can also send the request to auxiliary processes. It's useful to look at the memory contexts of those processes for debugging purpose and better understanding of the memory usage pattern of them. Note that pg_log_backend_memory_contexts() cannot send the request to logger or statistics collector. Because this logging request mechanism is based on shared memory but those processes aren't connected to that. Author: Bharath Rupireddy Reviewed-by: Vignesh C, Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/CALj2ACU1nBzpacOK2q=a65S_4+Oaz_rLTsU1Ri0gf7YUmnmhfQ@mail.gmail.com
*	Improve error handling of cryptohash computations	Michael Paquier	2022-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The existing cryptohash facility was causing problems in some code paths related to MD5 (frontend and backend) that relied on the fact that the only type of error that could happen would be an OOM, as the MD5 implementation used in PostgreSQL ~13 (the in-core implementation is used when compiling with or without OpenSSL in those older versions), could fail only under this circumstance. The new cryptohash facilities can fail for reasons other than OOMs, like attempting MD5 when FIPS is enabled (upstream OpenSSL allows that up to 1.0.2, Fedora and Photon patch OpenSSL 1.1.1 to allow that), so this would cause incorrect reports to show up. This commit extends the cryptohash APIs so as callers of those routines can fetch more context when an error happens, by using a new routine called pg_cryptohash_error(). The error states are stored within each implementation's internal context data, so as it is possible to extend the logic depending on what's suited for an implementation. The default implementation requires few error states, but OpenSSL could report various issues depending on its internal state so more is needed in cryptohash_openssl.c, and the code is shaped so as we are always able to grab the necessary information. The core code is changed to adapt to the new error routine, painting more "const" across the call stack where the static errors are stored, particularly in authentication code paths on variables that provide log details. This way, any future changes would warn if attempting to free these strings. The MD5 authentication code was also a bit blurry about the handling of "logdetail" (LOG sent to the postmaster), so improve the comments related that, while on it. The origin of the problem is 87ae969, that introduced the centralized cryptohash facility. Extra changes are done for pgcrypto in v14 for the non-OpenSSL code path to cope with the improvements done by this commit. Reported-by: Michael Mühlbeyer Author: Michael Paquier Reviewed-by: Tom Lane Discussion: https://postgr.es/m/89B7F072-5BBE-4C92-903E-D83E865D9367@trivadis.com Backpatch-through: 14
*	Make pg_get_expr() more bulletproof.	Tom Lane	2022-01-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since this function is defined to accept pg_node_tree values, it could get applied to any nodetree that can appear in a cataloged pg_node_tree column. Some such cases can't be supported --- for example, its API doesn't allow providing referents for more than one relation --- but we should try to throw a user-facing error rather than an internal error when encountering such a case. In support of this, extend expression_tree_walker/mutator to be sure they'll work on any such node tree (which basically means adding support for relpartbound node types). That allows us to run pull_varnos and check for the case of multiple relations before we start processing the tree. The alternative of changing the low-level error thrown for an out-of-range varno isn't appealing, because that could mask actual bugs in other usages of ruleutils. Per report from Justin Pryzby. This is basically cosmetic, so no back-patch. Discussion: https://postgr.es/m/20211219205422.GT17618@telsasoft.com
*	Update copyright for 2022	Bruce Momjian	2022-01-07
\| \| \| \|	Backpatch-through: 10
*	Clean up error messages related to bad datetime units.	Tom Lane	2022-01-03
\| \| \| \| \| \| \| \| \| \| \| \| \|	Adjust the error texts used for unrecognized/unsupported datetime units so that there are just two strings to translate, not two per datatype. Along the way, follow our usual error message style of not double-quoting type names, and instead making sure that we say the name is a type. Fix a couple of places in date.c that were using the wrong one of "unrecognized" and "unsupported". Nikhil Benesch, with a bit more editing by me Discussion: https://postgr.es/m/CAPWqQZTURGixmbMH2_Z3ZtWGA0ANjUb9bwtkkxSxSfDeFHuM6Q@mail.gmail.com
*	Simplify the general-purpose 64-bit integer parsing APIs	Peter Eisentraut	2021-12-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pg_strtouint64() is a wrapper around strtoull/strtoul/_strtoui64, but it seems no longer necessary to have this indirection. msvc/Solution.pm claims HAVE_STRTOULL, so the "MSVC only" part seems unnecessary. Also, we have code in c.h to substitute alternatives for strtoull() if not found, and that would appear to cover all currently supported platforms, so having a further fallback in pg_strtouint64() seems unnecessary. Therefore, we could remove pg_strtouint64(), and use strtoull() directly in all call sites. However, it seems useful to keep a separate notation for parsing exactly 64-bit integers, matching the type definition int64/uint64. For that, add new macros strtoi64() and strtou64() in c.h as thin wrappers around strtol()/strtoul() or strtoll()/stroull(). This makes these functions available everywhere instead of just in the server code, and it makes the function naming notably different from the pg_strtointNN() functions in numutils.c, which have a different API. Discussion: https://www.postgresql.org/message-id/flat/a3df47c9-b1b4-29f2-7e91-427baf8b75a3%40enterprisedb.com
*	Always use ReleaseTupleDesc after lookup_rowtype_tupdesc et al.	Tom Lane	2021-12-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The API spec for lookup_rowtype_tupdesc previously said you could use either ReleaseTupleDesc or DecrTupleDescRefCount. However, the latter choice means the caller must be certain that the returned tupdesc is refcounted. I don't recall right now whether that was always true when this spec was written, but it's certainly not always true since we introduced shared record typcaches for parallel workers. That means that callers using DecrTupleDescRefCount are dependent on typcache behavior details that they probably shouldn't be. Hence, change the API spec to say that you must call ReleaseTupleDesc, and fix the half-dozen callers that weren't. AFAICT this is just future-proofing, there's no live bug here. So no back-patch. Per gripe from Chapman Flack. Discussion: https://postgr.es/m/61B901A4.1050808@anastigmatix.net
*	Remove unimplemented/undocumented geometric functions & operators.	Tom Lane	2021-12-13
\| \| \| \| \| \| \| \| \| \| \|	Nobody has filled in these stubs for upwards of twenty years, so it's time to drop the idea that they might get implemented any day now. The associated pg_operator and pg_proc entries are just confusing wastes of space. Per complaint from Anton Voloshin. Discussion: https://postgr.es/m/3426566.1638832718@sss.pgh.pa.us
*	Implement poly_distance().	Tom Lane	2021-12-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	geo_ops.c contains half a dozen functions that are just stubs throwing ERRCODE_FEATURE_NOT_SUPPORTED. Since it's been like that for more than twenty years, there's clearly not a lot of interest in filling in the stubs. However, I'm uncomfortable with deleting poly_distance(), since every other geometric type supports a distance-to-another-object- of-the-same-type function. We can easily add this capability by cribbing from poly_overlap() and path_distance(). It's possible that the (existing) test case for this will show some numeric instability, but hopefully the buildfarm will expose it if so. In passing, improve the documentation to try to explain why polygons are distinct from closed paths in the first place. Discussion: https://postgr.es/m/3426566.1638832718@sss.pgh.pa.us
*	Fix alignment in multirange_get_range() function	Alexander Korotkov	2021-12-13
\| \| \| \| \| \| \| \| \|	The multirange_get_range() function fails when two boundaries of the same range have different alignments. Fix that by adding proper pointer alignment. Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/17300-dced2d01ddeb1f2f%40postgresql.org Backpatch-through: 14
*	Allow specifying column list for foreign key ON DELETE SET actions	Peter Eisentraut	2021-12-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Extend the foreign key ON DELETE actions SET NULL and SET DEFAULT by allowing the specification of a column list, like CREATE TABLE posts ( ... FOREIGN KEY (tenant_id, author_id) REFERENCES users ON DELETE SET NULL (author_id) ); If a column list is specified, only those columns are set to null/default, instead of all the columns in the foreign-key constraint. This is useful for multitenant or sharded schemas, where the tenant or shard ID is included in the primary key of all tables but shouldn't be set to null. Author: Paul Martinez <paulmtz@google.com> Discussion: https://www.postgresql.org/message-id/flat/CACqFVBZQyMYJV=njbSMxf+rbDHpx=W=B7AEaMKn8dWn9OZJY7w@mail.gmail.com
*	Fix inappropriate uses of PG_GETARG_UINT32()	Peter Eisentraut	2021-12-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The chr() function used PG_GETARG_UINT32() even though the argument is declared as (signed) integer. As a result, you can pass negative arguments to this function and it internally interprets them as positive. Ultimately ends up being harmless, but it seems wrong, so fix this and rearrange the internal error checking a bit to accommodate this. Another case was in the documentation, where example code used PG_GETARG_UINT32() with an argument declared as signed integer. Reviewed-by: Nathan Bossart <bossartn@amazon.com> Discussion: https://www.postgresql.org/message-id/flat/7e43869b-d412-8f81-30a3-809783edc9a3%40enterprisedb.com
*	Some RELKIND macro refactoring	Peter Eisentraut	2021-12-03
\| \| \| \| \| \| \| \| \| \| \| \|	Add more macros to group some RELKIND_* macros: - RELKIND_HAS_PARTITIONS() - RELKIND_HAS_TABLESPACE() - RELKIND_HAS_TABLE_AM() Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://www.postgresql.org/message-id/flat/a574c8f1-9c84-93ad-a9e5-65233d6fc00f%40enterprisedb.com
*	Remove unused includes	Peter Eisentraut	2021-12-01
\| \| \| \| \| \| \|	These haven't been needed for a long time. Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com
*	Add a view to show the stats of subscription workers.	Amit Kapila	2021-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit adds a new system view pg_stat_subscription_workers, that shows information about any errors which occur during the application of logical replication changes as well as during performing initial table synchronization. The subscription statistics entries are removed when the corresponding subscription is removed. It also adds an SQL function pg_stat_reset_subscription_worker() to reset single subscription errors. The contents of this view can be used by an upcoming patch that skips the particular transaction that conflicts with the existing data on the subscriber. This view can be extended in the future to track other xact related statistics like the number of xacts committed/aborted for subscription workers. Author: Masahiko Sawada Reviewed-by: Greg Nancarrow, Hou Zhijie, Tang Haiying, Vignesh C, Dilip Kumar, Takamichi Osumi, Amit Kapila Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
*	Fix typos	Michael Paquier	2021-11-30
\| \| \| \| \|	Author: Lingjie Qiang Discussion: https://postgr.es/m/OSAPR01MB71654E773F62AC88DC1FC8CC80669@OSAPR01MB7165.jpnprd01.prod.outlook.com
*	Replace random(), pg_erand48(), etc with a better PRNG API and algorithm.	Tom Lane	2021-11-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Standardize on xoroshiro128 as our basic PRNG algorithm, eliminating a bunch of platform dependencies as well as fundamentally-obsolete PRNG code. In addition, this API replacement will ease replacing the algorithm again in future, should that become necessary. xoroshiro128 is a few percent slower than the drand48 family, but it can produce full-width 64-bit random values not only 48-bit, and it should be much more trustworthy. It's likely to be noticeably faster than the platform's random(), depending on which platform you are thinking about; and we can have non-global state vectors easily, unlike with random(). It is not cryptographically strong, but neither are the functions it replaces. Fabien Coelho, reviewed by Dean Rasheed, Aleksander Alekseev, and myself Discussion: https://postgr.es/m/alpine.DEB.2.22.394.2105241211230.165418@pseudo
*	Allow Memoize to operate in binary comparison mode	David Rowley	2021-11-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Memoize would always use the hash equality operator for the cache key types to determine if the current set of parameters were the same as some previously cached set. Certain types such as floating points where -0.0 and +0.0 differ in their binary representation but are classed as equal by the hash equality operator may cause problems as unless the join uses the same operator it's possible that whichever join operator is being used would be able to distinguish the two values. In which case we may accidentally return in the incorrect rows out of the cache. To fix this here we add a binary mode to Memoize to allow it to the current set of parameters to previously cached values by comparing bit-by-bit rather than logically using the hash equality operator. This binary mode is always used for LATERAL joins and it's used for normal joins when any of the join operators are not hashable. Reported-by: Tom Lane Author: David Rowley Discussion: https://postgr.es/m/3004308.1632952496@sss.pgh.pa.us Backpatch-through: 14, where Memoize was added
*	Add SQL functions to monitor the directory contents of replication slots	Michael Paquier	2021-11-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit adds a set of functions able to look at the contents of various paths related to replication slots: - pg_ls_logicalsnapdir, for pg_logical/snapshots/ - pg_ls_logicalmapdir, for pg_logical/mappings/ - pg_ls_replslotdir, for pg_replslot/<slot_name>/ These are intended to be used by monitoring tools. Unlike pg_ls_dir(), execution permission can be granted to non-superusers. Roles members of pg_monitor gain have access to those functions. Bump catalog version. Author: Bharath Rupireddy Reviewed-by: Nathan Bossart, Justin Pryzby Discussion: https://postgr.es/m/CALj2ACWsfizZjMN6bzzdxOk1ADQQeSw8HhEjhmVXn_Pu+7VzLw@mail.gmail.com
*	Add a planner support function for starts_with().	Tom Lane	2021-11-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fills in some gaps in planner support for starts_with() and the equivalent ^@ operator: * A condition such as "textcol ^@ constant" can now use a regular btree index, not only an SP-GiST index, so long as the index's collation is C. (This works just like "textcol LIKE 'foo%'".) * "starts_with(textcol, constant)" can be optimized the same as "textcol ^@ constant". * Fixed-prefix LIKE and regex patterns are now more like starts_with() in another way: if you apply one to an SPGiST-indexed column, you'll get an index condition using ^@ rather than two index conditions with >= and <. Per a complaint from Shay Rojansky. Patch by me; thanks to Nathan Bossart for review. Discussion: https://postgr.es/m/232599.1633800229@sss.pgh.pa.us