The previous approach didn't really work on Windows, because the PATH
separator there is ';', not ':'. Instead of making the PATH change more
complicated, reference the binary using the TESTDIR environment
variable.
Reported-By: Andres Freund <andres@anarazel.de>
Suggested-By: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/20210930214040.odkdd42vknvzifm6@alap3.anarazel.de
Backpatch: 14-, where the test was introduced.
|
Regexps like "(.){0}...\1" drew an "invalid backreference number".
That's not unreasonable on its face, since the capture group will
never be matched if it's iterated zero times. However, other engines
such as Perl's don't complain about this, nor do we throw an error for
related cases such as "(.)|\1", even though that backref can never
succeed either. Also, if the zero-iterations case happens at runtime
rather than compile time --- say, "(x)*...\1" when there's no "x" to
be found --- that's not an error, we just deem the backref to not
match. Making this even less defensible, no error was thrown for
nested cases such as "((.)){0}...\2"; and to add insult to injury,
those cases could result in assertion failures instead. (It seems
that nothing especially bad happened in non-assert builds, though.)
Let's just fix it so that no error is thrown and instead the backref
is deemed to never match, so that compile-time detection of no
iterations behaves the same as run-time detection.
Per report from Mark Dilger. This appears to be an aboriginal error
in Spencer's library, so back-patch to all supported versions.
Pre-v14, it turns out to also be necessary to back-patch one aspect of
commits cb76fbd7e/00116dee5, namely to create capture-node subREs with
the begin/end states of their subexpressions, not the current lp/rp
of the outer parseqatom invocation. Otherwise delsub complains that
we're trying to disconnect a state from itself. This is a bit scary
but code examination shows that it's safe: in the pre-v14 code, if we
want to wrap iteration around the subexpression, the first thing we do
is overwrite the atom's begin/end fields with new states. So the
bogus values didn't survive long enough to be used for anything, except
if no iteration is required, in which case it doesn't matter.
Discussion: https://postgr.es/m/A099E4A8-4377-4C64-A98C-3DEDDC075502@enterprisedb.com
|
The recursion in cdissect() was careless about clearing match data
for capturing parentheses after rejecting a partial match. This
could allow a later back-reference to succeed when by rights it
should fail for lack of a defined referent.
To fix, think a little more rigorously about what the contract
between different levels of cdissect's recursion needs to be.
With the right spec, we can fix this using fewer rather than more
resets of the match data; the key decision being that a failed
sub-match is now explicitly responsible for clearing any matches
it may have set.
There are enough other cross-checks and optimizations in the code
that it's not especially easy to exhibit this problem; usually, the
match will fail as expected. Plus, regexps that are even potentially
vulnerable are most likely user errors, since there's just not much
point in writing a back-ref that doesn't always have a referent.
These facts perhaps explain why the issue hasn't been detected,
even though it's almost certainly a couple of decades old.
Discussion: https://postgr.es/m/151435.1629733387@sss.pgh.pa.us
|
Commit cebc1d34e taught parseqatom() to optimize cases where a branch
contains only one, "messy", atom by getting rid of excess subRE nodes.
The way we really should do that is to keep the subRE built for the
"messy" child atom; but to avoid changing parseqatom's nominal API,
I made it delete that node after copying its fields to the outer subRE
made by parsebranch(). It seems that that actually worked at the time;
but it became dangerous after ea1268f63, because that later commit
allowed the lower invocation of parse() to return a subRE that was also
pointed to by some v->subs[] entry. This meant we could wind up with a
dangling pointer in v->subs[], allowing a later backref to misbehave,
but only if that subRE struct had been reused in between. So the damage
seems confined to cases like '((...))...(...\2'.
To fix, do what I should have done before and modify parseqatom's API
to make it possible for it to remove the caller's subRE instead of the
callee's. That's safer because we know that subRE isn't complete yet,
so nothing else will have a pointer to it.
Per report from Mark Dilger. Back-patch to v14 where the problematic
patches came in.
Discussion: https://postgr.es/m/0203588E-E609-43AF-9F4F-902854231EE7@enterprisedb.com
|
Error messages using the word "non-negative" are confusing because it
is ambiguous about whether zero is accepted or not. This commit
improves those error messages by replacing that word with less
ambiguous wording such as "greater than zero" or "greater than or
equal to zero".
This commit also adds a note about the word "non-negative" to the
error message style guide, to help with writing new error messages.
Previously, when the postgres_fdw option fetch_size was set to zero,
the error message "fetch_size requires a non-negative integer value"
was reported. That error message was outright buggy, so back-patch to
all supported versions where such a buggy error message could be
thrown.
Reported-by: Hou Zhijie
Author: Bharath Rupireddy
Reviewed-by: Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/OS0PR01MB5716415335A06B489F1B3A8194569@OS0PR01MB5716.jpnprd01.prod.outlook.com
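For illustration only, an option-validation check written in the new
style might look like this minimal sketch (the exact message text and
variable names here are assumptions, not the committed code):

    if (fetch_size <= 0)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("\"%s\" requires an integer value greater than zero",
                        "fetch_size")));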
|
When sending queries in pipeline mode, we were careless about leaving
the connection in the right state so that PQgetResult would behave
correctly; trying to read further results after sending a query after
having read a result with an error would sometimes hang. Fix by
ensuring internal libpq state is changed properly. All the state
changes were being done by the callers of pqAppendCmdQueueEntry(); it
would have become too repetitious to have this logic in each of them, so
instead put it all in that function and relieve callers of the
responsibility.
Add a test to verify this case. Without the code fix, this new test
hangs sometimes.
Also, document that PQisBusy() returns false when no query is awaiting
a result. This is not intuitively obvious, since PQgetResult() would
return NULL at that point, which is confusing.
Wording by Boris Kolpackov.
In passing, fix bogus use of "false" to mean "0", per Ranier Vilela.
Backpatch to 14.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Boris Kolpackov <boris@codesynthesis.com>
Discussion: https://postgr.es/m/boris.20210624103805@codesynthesis.com
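As a sketch of the scenario being fixed (assuming an established
PGconn *conn already in pipeline mode; error checks omitted):

    /* Two queries in one pipeline; the first one fails. */
    PQsendQueryParams(conn, "SELECT 1/0", 0, NULL, NULL, NULL, NULL, 0);
    PQsendQueryParams(conn, "SELECT 1", 0, NULL, NULL, NULL, NULL, 0);
    PQpipelineSync(conn);

    PGresult *res;

    res = PQgetResult(conn);     /* PGRES_FATAL_ERROR: division by zero */
    PQclear(res);
    PQclear(PQgetResult(conn));  /* NULL: end of first query's results */

    res = PQgetResult(conn);     /* PGRES_PIPELINE_ABORTED for the second */
    PQclear(res);
    PQclear(PQgetResult(conn));  /* NULL again; must not hang here */

    res = PQgetResult(conn);     /* PGRES_PIPELINE_SYNC: usable again */
    PQclear(res);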
|
The original coding required that PQpipelineSync had been called before
the first call to PQgetResult, and failure to do that would result in an
unexpected NULL result being returned. Fix by setting the right state
when a query is sent, rather than leaving it unchanged and having
PQpipelineSync apply the necessary state change.
A new test case to verify the behavior is added, which relies on the new
PQsendFlushRequest() function added by commit a7192326c74d.
Backpatch to 14, where pipeline mode was added.
Reported-by: Boris Kolpackov <boris@codesynthesis.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/boris.20210616110321@codesynthesis.com
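A sketch of the usage the new test exercises (assuming a PGconn *conn
already in pipeline mode):

    PQsendQueryParams(conn, "SELECT 42", 0, NULL, NULL, NULL, NULL, 0);

    /* No PQpipelineSync() yet; just ask the server to flush output. */
    PQsendFlushRequest(conn);

    PGresult *res = PQgetResult(conn);  /* arrives without a sync point */
    PQclear(res);
    PQclear(PQgetResult(conn));         /* NULL terminator for this query */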
|
Previously, isolationtester displayed SQL query results using some
ad-hoc code that clearly hadn't had much effort expended on it.
Field values longer than 14 characters weren't separated from
the next field, and usually caused misalignment of the columns
too. Also there was no visual separation of a query's result
from subsequent isolationtester output. This made test result
files confusing and hard to read.
To improve matters, let's use libpq's PQprint() function. Although
that's long since unused by psql, it's still plenty good enough
for the purpose here.
Like 741d7f104, back-patch to all supported branches, so that this
isn't a stumbling block for back-patching isolation test changes.
Discussion: https://postgr.es/m/582362.1623798221@sss.pgh.pa.us
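For reference, a minimal PQprint() call looks roughly like this (a
sketch; the option values are illustrative and a valid PGresult *res
is assumed):

    PQprintOpt opt = {0};

    opt.header = 1;      /* print column headers */
    opt.align = 1;       /* pad fields so columns line up */
    opt.fieldSep = "|";  /* separator between fields */

    PQprint(stdout, res, &opt);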
|
Commit acb7e4eb6b1c added a new implementation for PQsendQuery so that
it works in pipeline mode (by using extended query protocol), but it
behaves differently from the 'Q' message (in simple query protocol) used
by the regular implementation: the new one doesn't close the unnamed portal.
Change the new code to have identical behavior to the old.
Reported-by: Yura Sokolov <y.sokolov@postgrespro.ru>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/202106072107.d4i55hdscxqj@alvherre.pgsql
|
We have a dozen PQset*() functions. PQresultSetInstanceData() and this
one were the only libpq setter functions with a different word order.
Adopt the majority word order.
Reviewed by Alvaro Herrera and Robert Haas, though this choice of name
was not unanimous.
Discussion: https://postgr.es/m/20210605060555.GA216695@rfd.leadboat.com
|
The path needs to be set to refer to the build directory, not the
current directory, because that's actually the source directory at
that point.
Fix for 6abc8c2596dbfcb24f9b4d954a1465b8015118c3.
|
Apply this in the libpq_pipeline test makefile, so that the test file is not
installed into tmp_install.
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/cb9d16a6-760f-cd44-28d6-b091d5fb6ca7%40enterprisedb.com
|
Formerly we just relied on operator classes that assert longValuesOK
to eventually shorten the leaf value enough to fit on an index page.
That fails since the introduction of INCLUDE-column support (commit
09c1c6ab4), because the INCLUDE columns might alone take up more
than a page, meaning no amount of leaf-datum compaction will get
the job done. At least with spgtextproc.c, that leads to an infinite
loop, since spgtextproc.c won't throw an error for not being able
to shorten the leaf datum anymore.
To fix without breaking cases that would otherwise work, add logic
to spgdoinsert() to verify that the leaf tuple size is decreasing
after each "choose" step. Some opclasses might not decrease the
size on every single cycle, and in any case, alignment roundoff
of the tuple size could obscure small gains. Therefore, allow
up to 10 cycles without additional savings before throwing an
error. (Perhaps this number will need adjustment, but it seems
quite generous right now.)
As long as we've developed this logic, let's back-patch it.
The back branches don't have INCLUDE columns to worry about, but
this seems like a good defense against possible bugs in operator
classes. We already know that an infinite loop here is pretty
unpleasant, so having a defense seems to outweigh the risk of
breaking things. (Note that spgtextproc.c is actually the only
known opclass with longValuesOK support, so that this is all moot
for known non-core opclasses anyway.)
Per report from Dilip Kumar.
Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com
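Reduced to a generic sketch, the defense looks like this (all names
here are illustrative, not the actual spgdoinsert() code):

    /* Allow at most 10 consecutive cycles with no size reduction. */
    #define MAX_STALLED_CYCLES 10

    int best_size = leaf_size;
    int stalled = 0;

    while (leaf_size > size_limit)
    {
        leaf_size = perform_choose_step();  /* hypothetical helper */

        if (leaf_size < best_size)
        {
            best_size = leaf_size;
            stalled = 0;                    /* progress was made */
        }
        else if (++stalled > MAX_STALLED_CYCLES)
            elog(ERROR, "SP-GiST leaf tuple is not shrinking");
    }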
|
Also "make reformat-dat-files".
The only change worthy of note is that pgindent messed up the formatting
of launcher.c's struct LogicalRepWorkerId, which led me to notice that
that struct wasn't used at all anymore, so I just took it out.
|
6568cef, which introduced the option, behaved inconsistently when it
comes to configuration tables set up by pg_extension_config_dump, as
the data of all configuration tables would be included in a dump even
for extensions not listed by a set of --extension switches.
The contents dumped also changed depending on the schema where an
unlisted extension was installed. For example, an extension installed
under the public schema would have its configuration data dumped even
when not listed with --extension, which was inconsistent with the case
of an extension installed in a non-public schema, where the
configuration would not be dumped.
Per discussion with Noah, we have settled on the simple rule of
dumping the configuration data of an extension only if it is listed in
--extension (the default is unchanged and backward-compatible: dump
everything on sight if no extensions are directly listed). This avoids
some weird cases where the dump contents depended on --schema as well.
More tests are added to cover the gap, cross-checking behavior
depending on --schema when an extension is not listed.
Reported-by: Noah Misch
Reviewed-by: Noah Misch
Discussion: https://postgr.es/m/20210404220802.GA728316@rfd.leadboat.com
|
These got introduced in 6568cef.
Reported-by: Noah Misch
Discussion: https://postgr.es/m/20210404220802.GA728316@rfd.leadboat.com
|
A (relatively minor) annoyance of ErrorResponse/NoticeResponse messages
as printed by PQtrace() is that their length might vary when we move
error messages from one source file to another, one function to another,
or even when the number of digits in their location line numbers changes.
To avoid having to adjust expected files for some tests, make the
regress mode of PQtrace() suppress the length word of NoticeResponse and
ErrorResponse messages.
Discussion: https://postgr.es/m/20210402023010.GA13563@alvherre.pgsql
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
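A sketch of enabling this from client code, using the flag names from
PG 14's libpq-fe.h (function name as after the later rename to
PQsetTraceFlags, covered above; details stated from memory):

    FILE *trace_out = fopen("trace.out", "w");

    PQtrace(conn, trace_out);
    PQsetTraceFlags(conn,
                    PQTRACE_SUPPRESS_TIMESTAMPS | PQTRACE_REGRESS_MODE);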
|
DEBUG was already defined by the MSVC toolchain for "Debug" builds. On
these systems the unconditional #define DEBUG was causing a 'DEBUG': macro
redefinition warning.
Here we rename DEBUG to DEBUG_OUTPUT and also get rid of the
unconditional #define that defined this constant; it appears to have
been left in the code by mistake.
Discussion: https://postgr.es/m/CAApHDvqTTgDm38s4HRj03nhzhzQ1oMOj-RXFUB1pE6Bj07jyuQ@mail.gmail.com
|
Not much to say here: does what it says on the tin.
We steal a previously-always-zero bit from the nextOffset
field of leaf index tuples in order to track whether there
is a nulls bitmap. Otherwise it works about like included
columns in other index types.
Pavel Borisov, reviewed by Andrey Borodin and Anastasia Lubennikova,
and rather heavily editorialized on by me
Discussion: https://postgr.es/m/CALT9ZEFi-vMp4faht9f9Junb1nO3NOSjhpxTmbm1UGLMsLqiEQ@mail.gmail.com
|
According to the documentation, the attType passed to the opclass
config function (and also relied on by the core code) is the type
of the heap column or expression being indexed. But what was
actually being passed was the type stored for the index column.
This made no difference for user-defined SP-GiST opclasses,
because we weren't allowing the STORAGE clause of CREATE OPCLASS
to be used, so the two types would be the same. But it's silly
not to allow that, seeing that the built-in poly_ops opclass
has a different value for opckeytype than opcintype, and that if you
want to do lossy storage then the types must really be different.
(Thus, user-defined opclasses doing lossy storage had to lie about
what type is in the index.) Hence, remove the restriction, and make
sure that we use the input column type, not opckeytype, where relevant.
For reasons of backwards compatibility with existing user-defined
opclasses, we can't quite insist that the specified leafType match
the STORAGE clause; instead just add an amvalidate() warning if
they don't match.
Also fix some bugs that would only manifest when trying to return
index entries when attType is different from attLeafType. It's not
too surprising that these have not been reported, because the only
usual reason for such a difference is to store the leaf value
lossily, rendering index-only scans impossible.
Add a src/test/modules module to exercise cases where attType is
different from attLeafType and yet index-only scan is supported.
Discussion: https://postgr.es/m/3728741.1617381471@sss.pgh.pa.us
|
Windows doesn't like setvbuf(..., _IOLBF) and crashes if you use it,
which has been causing the libpq_pipeline failures all along ... and our
own port.h has known about it for a long time: it offers PG_IOLBF that's
defined to _IONBF on that platform. Follow its advice.
While at it, get rid of a bogus bitshift that used a constant of the
wrong size. Decorate the constant as LL to fix. While at it, remove a
pointless addition that only confused matters.
All as diagnosed by Tom Lane.
Discussion: https://postgr.es/m/3458958.1617302154@sss.pgh.pa.us
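The port.h pattern being followed looks like this, reproduced here as
a self-contained sketch (not the exact header text):

    #include <stdio.h>

    #ifdef WIN32
    #define PG_IOLBF _IONBF   /* setvbuf(..., _IOLBF, ...) crashes on Windows */
    #else
    #define PG_IOLBF _IOLBF
    #endif

    int main(void)
    {
        setvbuf(stdout, NULL, PG_IOLBF, 0);
        printf("line-buffered where the platform supports it\n");
        return 0;
    }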
|
I forgot to strdup() when processing argv[]. Apparently many platforms
hide this mistake from users, but in those that don't you may get a
program crash. Repair.
Per buildfarm member drongo, which is the only one in all the buildfarm
manifesting a problem here.
While at it, move "numrows" processing out of the line of special cases,
and make it getopt's -r instead. (A similar thing could be done to
'conninfo', but current use of the program doesn't warrant spending time
on that -- nowhere else do we use conninfo in so simplistic a manner.)
Discussion: https://postgr.es/m/20210401124850.GA19247@alvherre.pgsql
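The fixed pattern, as a standalone sketch (option letters and variable
names are illustrative, not the program's actual ones):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        char *conninfo = NULL;
        int   numrows = 10000;
        int   c;

        while ((c = getopt(argc, argv, "r:c:")) != -1)
        {
            switch (c)
            {
                case 'r':
                    numrows = atoi(optarg);
                    break;
                case 'c':
                    conninfo = strdup(optarg);  /* copy; don't keep optarg */
                    break;
            }
        }
        printf("%d rows via \"%s\"\n", numrows, conninfo ? conninfo : "");
        free(conninfo);
        return 0;
    }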
|
It's misplaced there -- it's not libpq's output stream to tweak in that
way. In particular, POSIX says that it has to be called before any
other operation on the file, so if the stream was previously used by
the calling application, bad things may happen.
Put setvbuf() in libpq_pipeline for good measure.
Also, reduce fopen(..., "w+") to just fopen(..., "w") in
libpq_pipeline.c. It's not clear that this fixes anything, but we don't
use w+ anywhere.
Per complaints from Tom Lane.
Discussion: https://postgr.es/m/3337422.1617229905@sss.pgh.pa.us
|
Some buildfarm animals with force_parallel_mode=regress were failing
this test because the error is reported by a parallel worker sooner
than the successful rows arrive.
Take the opportunity to move the SET of lc_messages out of the traced
section, because it's not very interesting.
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3304521.1617221724@sss.pgh.pa.us
|
Some compilers seem to be concerned about the possibility that
recv_step is not any of the defined enum values. Silence
warnings about uninitialized cmdtag in a different way than
I did in 9fb9691a8.
|
Test pipeline_abort was not checking that it got the rows it expected in
one mode; make it do so. This doesn't fix the actual problem (no idea
what that is, yet) but at least it should make it more obvious rather
than being visible only as a difference in the trace output.
While at it, fix other infelicities in the test:
* I reversed the order of result vs. expected in like().
* The output traces from -t are being put in the log dir, which means
the buildfarm script uselessly captures them. Put them in a separate
dir tmp_check/traces instead, to avoid cluttering the buildfarm results.
* Test pipelined_insert was using too large a row count. Reduce that a
tad and add a filler column to make each insert a little bulkier, while
still keeping enough that a buffer is filled and we have to switch mode.
|
When specified, only extensions matching the given pattern are included
in dumps. Similarly to --table and --schema, when --strict-names is
used, a perfect match is required. Also, like the two other options,
this new option offers no guarantee that dependent objects have been
dumped, so a restore may fail on a clean database.
Tests are added in test_pg_dump/, checking a set of positive and
negative cases, with or without an extension's contents included in
the generated dump.
Author: Guillaume Lelarge
Reviewed-by: David Fetter, Tom Lane, Michael Paquier, Asif Rehman,
Julien Rouhaud
Discussion: https://postgr.es/m/CAECtzeXOt4cnMU5+XMZzxBPJ_wu76pNy6HZKPRBL-j7yj1E4+g@mail.gmail.com
|
The libpq_pipeline program recently introduced by commit acb7e4eb6b1c
is well equipped to test the PQtrace() functionality, so let's make it
do that.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20210327192812.GA25115@alvherre.pgsql
|
Allow a partition to be detached from its partitioned table without
blocking concurrent queries, by running in two transactions and only
requiring ShareUpdateExclusive on the partitioned table.
Because it runs in two transactions, it cannot be used in a transaction
block. This is the main reason to use dedicated syntax: so that users
can choose to use the original mode if they need it. But also, it
doesn't work when a default partition exists (because an exclusive lock
would still need to be obtained on it, in order to change its partition
constraint.)
In case the second transaction is cancelled or a crash occurs, there's
ALTER TABLE .. DETACH PARTITION .. FINALIZE, which executes the final
steps.
The main trick to make this work is the addition of column
pg_inherits.inhdetachpending, initially false; can only be set true in
the first part of this command. Once that is committed, concurrent
transactions that use a PartitionDirectory will include or ignore
partitions so marked: in the optimizer they are ignored if the row is
marked committed for the snapshot; in the executor they are always
included. As a
result, and because of the way PartitionDirectory caches partition
descriptors, queries that were planned before the detach will see the
rows in the detached partition and queries that are planned after the
detach, won't.
A CHECK constraint is created that duplicates the partition constraint.
This is probably not strictly necessary, and some users will prefer to
remove it afterwards, but if the partition is re-attached to a
partitioned table, the constraint needn't be rechecked.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200803234854.GA24158@alvherre.pgsql
|
Compilers that don't understand that elog(ERROR) doesn't return
issued warnings here. In the cases in libpq_pipeline.c, we were
not exactly helping things by failing to mark pg_fatal() as noreturn.
Per buildfarm.
|
I forgot that Windows represents newlines as \r\n, so splitting a string
at /\s/ creates additional empty strings. Let's rewrite that as /\s+/
to see if that avoids those. (There's precedent for using that pattern
on Windows in other scripts.)
Previously: 91bdf499b37b, 8ed428dc977f, 650b96707672.
Per buildfarm, via Tom Lane.
Discussion: https://postgr.es/m/3144460.1615860259@sss.pgh.pa.us
|
Pipeline mode in libpq lets an application avoid the Sync messages in
the FE/BE protocol that are implicit in the old libpq API after each
query. The application can then insert Sync at its leisure with a new
libpq function PQpipelineSync. This can lead to substantial reductions
in query latency.
Co-authored-by: Craig Ringer <craig.ringer@enterprisedb.com>
Co-authored-by: Matthieu Garrigues <matthieu.garrigues@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Aya Iwata <iwata.aya@jp.fujitsu.com>
Reviewed-by: Daniel Vérité <daniel@manitou-mail.org>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Kirk Jamison <k.jamison@fujitsu.com>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
Reviewed-by: Nikhil Sontakke <nikhils@2ndquadrant.com>
Reviewed-by: Vaishnavi Prabakaran <VaishnaviP@fast.au.fujitsu.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Discussion: https://postgr.es/m/CAMsr+YFUjJytRyV4J-16bEoiZyH=4nj+sQ7JP9ajwz=B4dMMZw@mail.gmail.com
Discussion: https://postgr.es/m/CAJkzx4T5E-2cQe3dtv2R78dYFvz+in8PY7A8MArvLhs_pg75gg@mail.gmail.com
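A minimal usage sketch (error handling omitted; assumes an established
PGconn *conn):

    PQenterPipelineMode(conn);

    PQsendQueryParams(conn, "SELECT 1", 0, NULL, NULL, NULL, NULL, 0);
    PQsendQueryParams(conn, "SELECT 2", 0, NULL, NULL, NULL, NULL, 0);
    PQpipelineSync(conn);            /* one sync for the whole batch */

    for (;;)
    {
        PGresult *res = PQgetResult(conn);

        if (res == NULL)
            continue;                /* boundary between queries' results */
        if (PQresultStatus(res) == PGRES_PIPELINE_SYNC)
        {
            PQclear(res);
            break;                   /* reached the sync point */
        }
        PQclear(res);                /* an ordinary query result */
    }

    PQexitPipelineMode(conn);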
|
POSIX defines the behavior of back-references thus:
The back-reference expression '\n' shall match the same (possibly
empty) string of characters as was matched by a subexpression
enclosed between "\(" and "\)" preceding the '\n'.
As far as I can see, the back-reference is supposed to consider only
the data characters matched by the referenced subexpression. However,
because our engine copies the NFA constructed from the referenced
subexpression, it effectively enforces any constraints therein, too.
As an example, '(^.)\1' ought to match 'xx', or any other string
starting with two occurrences of the same character; but in our code
it does not, and indeed can't match anything, because the '^' anchor
constraint is included in the backref's copied NFA. If POSIX intended
that, you'd think they'd mention it. Perl for one doesn't act that
way, so it's hard to conclude that this isn't a bug.
Fix by modifying the backref's NFA immediately after it's copied from
the reference, replacing all constraint arcs by EMPTY arcs so that the
constraints are treated as automatically satisfied. This still allows
us to enforce matching rules that depend only on the data characters;
for example, in '(^\d+).*\1' the NFA matching step will still know
that the backref can only match strings of digits.
Perhaps surprisingly, this change does not affect the results of any
of a rather large corpus of real-world regexes. Nonetheless, I would
not consider back-patching it, since it's a clear compatibility break.
Patch by me, reviewed by Joel Jacobson
Discussion: https://postgr.es/m/661609.1614560029@sss.pgh.pa.us
|
Newline is certainly not a digit, nor a word character, so it is
sensible that it should match these complemented character classes.
Previously, \D and \W acted that way by default, but in
newline-sensitive mode ('n' or 'p' flag) they did not match newlines.
This behavior was previously forced because explicit complemented
character classes don't match newlines in newline-sensitive mode;
but as of the previous commit that implementation constraint no
longer exists. It seems useful to change this because the primary
real-world use for newline-sensitive mode seems to be to match the
default behavior of other regex engines such as Perl and Javascript
... and their default behavior is that these match newlines.
The old behavior can be kept by writing an explicit complemented
character class, i.e. [^[:digit:]] or [^[:word:]]. (This means
that \D and \W are not exactly equivalent to those strings, but
they weren't anyway.)
Discussion: https://postgr.es/m/3220564.1613859619@sss.pgh.pa.us
|
The complement-class escapes \D, \S, \W are now allowed within
bracket expressions. There is no semantic difficulty with doing
that, but the rather hokey macro-expansion-based implementation
previously used here couldn't cope.
Also, invent "word" as an allowed character class name, thus "\w"
is now equivalent to "[[:word:]]" outside brackets, or "[:word:]"
within brackets. POSIX allows such implementation-specific
extensions, and the same name is used in e.g. bash.
One surprising compatibility issue this raises is that constructs
such as "[\w-_]" are now disallowed, as our documentation has always
said they should be: character classes can't be endpoints of a range.
Previously, because \w was just a macro for "[:alnum:]_", such a
construct was read as "[[:alnum:]_-_]", so it was accepted so long as
the character after "-" was numerically greater than or equal to "_".
Some implementation cleanup along the way:
* Remove the lexnest() hack, and in consequence clean up wordchrs()
to not interact with the lexer.
* Fix colorcomplement() to not be O(N^2) in the number of colors
involved.
* Get rid of useless-as-far-as-I-can-see calls of element()
on single-character character element names in brackpart().
element() always maps these to the character itself, and things
would be quite broken if it didn't --- should "[a]" match something
different than "a" does? Besides, the shortcut path in brackpart()
wasn't doing this anyway, making it even more inconsistent.
Discussion: https://postgr.es/m/2845172.1613674385@sss.pgh.pa.us
Discussion: https://postgr.es/m/3220564.1613859619@sss.pgh.pa.us
|
While poking at the regex code, I happened to notice that the bug
squashed in commit afcc8772e had a sibling: next() failed to return
a specific value associated with the '}' token for a "\{m,n\}"
quantifier when parsing in basic RE mode. Again, this could result
in treating the quantifier as non-greedy, which it never should be in
basic mode. For that to happen, the last character before "\}" that
sets "nextvalue" would have to set it to zero, or it'd have to have
accidentally been zero from the start. The failure can be provoked
repeatably with, for example, a bound ending in digit "0".
Like the previous patch, back-patch all the way.
|
Push some hopefully-uncontroversial bits extracted from an upcoming
patch series, to remove non-relevant clutter from the main patches.
In compact(), return immediately after setting REG_ASSERT error;
continuing the loop would just lead to assertion failure below.
(Ask me how I know.)
In parseqatom(), remove assertion that moresubs() did its job.
When moresubs actually did its job, this is redundant with that
function's final assert; but when it failed on OOM, this is an
assertion crash. We could avoid the crash by adding a NOERR()
check before the assertion, but it seems better to subtract code
than add it. (Note that there's a NOERR exit a few lines further
down, and nothing else between here and there requires moresubs
to have succeeded. So we don't really need an extra error exit.)
This is a live bug in assert-enabled builds, but given the very
low likelihood of OOM in moresub's tiny allocation, I don't think
it's worth back-patching.
On the other hand, it seems worthwhile to add an assertion that
our intended v->subs[subno] target is still null by the time we
are ready to insert into it, since there's a recursion in between.
In pg_regexec, ensure we fflush any debug output on the way out,
and try to make MDEBUG messages more uniform and helpful. (In
particular, ensure that all of them are prefixed with the subre's
id number, so one can match up entry and exit reports.)
Add some test cases in test_regex to improve coverage of lookahead
and lookbehind constraints. Adding these now is mainly to establish
that this is indeed the existing behavior.
Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us
|
This is a replacement for the existing --with-openssl, extending the
logic to make it easier to add new SSL libraries. The grammar is
chosen to be similar to --with-uuid, where multiple values can be
chosen, with "openssl" as the only supported value for now.
The original switch, --with-openssl, is kept for compatibility.
Author: Daniel Gustafsson, Michael Paquier
Reviewed-by: Jacob Champion
Discussion: https://postgr.es/m/FAB21FC8-0F62-434F-AA78-6BD9336D630A@yesql.se
|
For those system catalogs that have unique indexes, make a primary
key and unique constraints, using ALTER TABLE ... PRIMARY KEY/UNIQUE
USING INDEX.
This can be helpful for GUI tools that look for a primary key, and it
might in the future allow declaring foreign keys, for making schema
diagrams.
The constraint creation statements are automatically created by
genbki.pl from DECLARE_UNIQUE_INDEX directives. To specify which one
of the available unique indexes is the primary key, use the new
directive DECLARE_UNIQUE_INDEX_PKEY instead. By convention, we
usually make a catalog's OID column its primary key, if it has one.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/dc5f44d9-5ec1-a596-0251-dadadcdede98@2ndquadrant.com
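In a catalog header, such a directive looks roughly like the following
sketch (argument list written from memory; consult the real catalog
headers for the authoritative form):

    /* in src/include/catalog/pg_class.h (illustrative) */
    DECLARE_UNIQUE_INDEX_PKEY(pg_class_oid_index, 2662, ClassOidIndexId,
                              on pg_class using btree(oid oid_ops));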
|
The stanza to report a "partial" match could overrun the initially
allocated output array, so it needs its own copy of the array-resizing
logic that's in the main loop. I overlooked the need for this in
ca8217c10.
Per report from Alexander Lakhin.
Discussion: https://postgr.es/m/3206aace-50db-e02a-bbea-76d5cdaa2cb6@gmail.com
|
The context is an object that no longer bears some aclitem that it bore
initially. (A user issued REVOKE or GRANT statements upon the object.)
pg_dump is forming SQL to reproduce the object ACL. Since initdb
creates no ACL bearing GRANT OPTION, reaching this bug requires an
extension where the creation script establishes such an ACL. No PGXN
extension does that. If an installation did reach the bug, pg_dump
would have omitted a semicolon, causing a REVOKE and the next SQL
statement to fail. Separately, since the affected code exists to
eliminate an entire aclitem, it wants plain REVOKE, not REVOKE GRANT
OPTION FOR. Back-patch to 9.6, where commit
23f34fa4ba358671adab16773e79c17c92cbc870 first appeared.
Discussion: https://postgr.es/m/20210109102423.GA160022@rfd.leadboat.com
|
Add an executor aminsert() hint mechanism that informs index AMs that
the incoming index tuple (the tuple that accompanies the hint) is not
being inserted by execution of an SQL statement that logically modifies
any of the index's key columns.
The hint is received by indexes when an UPDATE takes place that does not
apply an optimization like heapam's HOT (though only for indexes where
all key columns are logically unchanged). Any index tuple that receives
the hint on insert is expected to be a duplicate of at least one
existing older version that is needed for the same logical row. Related
versions will typically be stored on the same index page, at least
within index AMs that apply the hint.
Recognizing the difference between MVCC version churn duplicates and
true logical row duplicates at the index AM level can help with cleanup
of garbage index tuples. Cleanup can intelligently target tuples that
are likely to be garbage, without wasting too many cycles on less
promising tuples/pages (index pages with little or no version churn).
This is infrastructure for an upcoming commit that will teach nbtree to
perform bottom-up index deletion. No index AM actually applies the hint
just yet.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Victor Yegorov <vyegorov@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=CEKFa74EScx_hFVshCOn6AA5T-ajFASTdzipdkLTNQQ@mail.gmail.com
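Concretely, the hint travels as a new boolean argument to the index AM
insert callback; its shape is approximately the following (reproduced
from memory of PG 14's amapi.h):

    typedef bool (*aminsert_function) (Relation indexRelation,
                                       Datum *values,
                                       bool *isnull,
                                       ItemPointer heap_tid,
                                       Relation heapRelation,
                                       IndexUniqueCheck checkUnique,
                                       bool indexUnchanged, /* the new hint */
                                       IndexInfo *indexInfo);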
|
brenext(), when parsing a '*' quantifier, forgot to return any "value"
for the token; per the equivalent case in next(), it should return
value 1 to indicate that greedy rather than non-greedy behavior is
wanted. The result is that the compiled regexp could behave like 'x*?'
rather than the intended 'x*', if we were unlucky enough to have
a zero in v->nextvalue at this point. That seems to happen with some
reliability if we have '.*' at the beginning of a BRE-mode regexp,
although that depends on the initial contents of a stack-allocated
struct, so it's not guaranteed to fail.
Found by Alexander Lakhin using valgrind testing. This bug seems
to be aboriginal in Spencer's code, so back-patch all the way.
Discussion: https://postgr.es/m/16814-6c5e3edd2bdf0d50@postgresql.org
|
I apparently copied-and-pasted the wrong link in commit ca8217c10.
Point it where it was meant to go.
|
This module provides a function test_regex() that is functionally
rather like regexp_matches(), but with additional debugging-oriented
options and additional output. The debug options are somewhat obscure;
they are chosen to match the API of the test harness that Henry Spencer
wrote way-back-when for use in Tcl. With this, we can import all the
test cases that Spencer wrote originally, even for regex functionality
that we don't currently expose in Postgres. This seems necessary
because we can no longer rely on Tcl to act as upstream and verify
any fixes or improvements that we make.
In addition to Spencer's tests, I added a few for lookbehind
constraints (which we added in 2015, and Tcl still hasn't absorbed)
that are modeled on his tests for lookahead constraints. After looking
at code coverage reports, I also threw in a couple of tests to more
fully exercise our "high colormap" logic.
According to my testing, this brings the check-world coverage
for src/backend/regex/ from 71.1% to 86.7% of lines.
(coverage.postgresql.org shows a slightly different number,
which I think is because it measures a non-assert build.)
Discussion: https://postgr.es/m/2873268.1609732164@sss.pgh.pa.us
|
Backpatch-through: 9.5
|
Any subdirectory that's ignoring /output_iso/ should also
ignore /tmp_check_iso/, which could be left behind by a
failed pg_isolation_regress_check run.
I think these have been wrong for a while, but it doesn't
seem important to fix in back branches.
|
Previously, the test_shm_mq worker used a stripped-down version of
die() as its SIGTERM signal handler. This commit makes it use die()
instead, to simplify the code.
In terms of the code, the difference between die() and the
stripped-down version previously used is whether the signal handler
may call ProcessInterrupts() directly or not. But this difference
doesn't exist in a background worker because, in a bgworker, the
DoingCommandRead flag will never be true and die() will never call
ProcessInterrupts() directly. Therefore the test_shm_mq worker can
safely use die(), like other bgworker processes (e.g., the logical
replication apply launcher or the autoprewarm worker) currently do.
Thanks to Craig Ringer for the report and investigation of the issue.
Author: Bharath Rupireddy
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/CAGRY4nxsAe_1k_9g5b47orA0S011iBoHsXHFMH7cg7HV0O1bwQ@mail.gmail.com
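The resulting pattern in a background worker is simply this (a sketch;
the worker function name is hypothetical):

    #include "postgres.h"
    #include "miscadmin.h"
    #include "postmaster/bgworker.h"
    #include "tcop/tcopprot.h"      /* declares die() */

    void
    test_worker_main(Datum main_arg)
    {
        /*
         * Safe in a bgworker: DoingCommandRead is never true here, so
         * die() will not call ProcessInterrupts() from the handler.
         */
        pqsignal(SIGTERM, die);
        BackgroundWorkerUnblockSignals();

        /* ... main loop, calling CHECK_FOR_INTERRUPTS() regularly ... */
    }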