| Commit message | Author | Age |
|
Backpatch-through: 13
|
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz
Backpatch-through: 12
|
Dagfinn Ilmari Mannsåker, reviewed by Shubham Khanna.
Discussion: http://postgr.es/m/87le9fmi01.fsf@wibble.ilmari.org
|
Backpatch-through: 11
|
Backpatch-through: 10
|
Backpatch-through: 9.5
|
Backpatch-through: update all files in master, backpatch legal files through 9.4
|
Backpatch-through: certain files through 9.4
|
Backpatch-through: certain files through 9.3
|
This makes almost all core code follow the policy introduced in the
previous commit. Specific decisions:
- Text search support functions with char* and length arguments, such as
prsstart and lexize, may receive unaligned strings. I doubt
maintainers of non-core text search code will notice.
- Use plain VARDATA() on values detoasted or synthesized earlier in the
same function. Use VARDATA_ANY() on varlenas sourced outside the
function, even if they happen to always have four-byte headers. As an
exception, retain the universal practice of using VARDATA() on return
values of SendFunctionCall().
- Retain PG_GETARG_BYTEA_P() in pageinspect. (Page images are too large
for a one-byte header, so this misses no optimization.) Sites that do
not call get_page_from_raw() typically need the four-byte alignment.
- For now, do not change btree_gist. Its use of four-byte headers in
memory is partly entangled with storage of 4-byte headers inside
GBT_VARKEY, on disk.
- For now, do not change gtrgm_consistent() or gtrgm_distance(). They
incorporate the varlena header into a cache, and there are multiple
credible implementation strategies to consider.
|
When ts_rewrite()'s replacement argument is an empty tsquery, it's supposed
to simplify any operator nodes whose operand(s) become NULL; but it failed
to do that reliably, because dropvoidsubtree() only examined the top level
of the result tree. Rather than make a second recursive pass, let's just
give the responsibility to dofindsubquery() to simplify while it's doing
the main replacement pass. Per report from Andreas Seltenreich.
Artur Zakirov, with some cosmetic changes by me. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/8737i01dew.fsf@credativ.de
|
tsquery_rewrite() tries to find matches to subsets of AND/OR conditions;
for example, in the query 'a | b | c' the substitution subquery 'a | c'
should match and lead to replacement of the first and third items.
That's fine, but the matching algorithm apparently takes about O(2^N)
for an N-clause query (I say "apparently" because the code is also both
unintelligible and uncommented). We could probably do better than that
even without any extra assumptions --- but actually, we know that the
subclauses are sorted, indeed are depending on that elsewhere in this very
same function. So we can just scan the two lists a single time to detect
matches, as though we were doing a merge join.
Also do a re-flattening call (QTNTernary()) in tsquery_rewrite_query, just
to make sure that the tree fits the expectations of the next search cycle.
I didn't try to devise a test case for this, but I'm pretty sure that the
oversight could have led to failure to match in some cases where a match
would be expected.
Improve comments, and also stick a CHECK_FOR_INTERRUPTS into
dofindsubquery, just in case it's still too slow for somebody.
Per report from Andreas Seltenreich. Back-patch to all supported branches.
Discussion: <8760oasf2y.fsf@credativ.de>
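
The single-pass scan described above can be sketched as follows. This is a
standalone illustration under simplifying assumptions, not the code in
tsquery_rewrite(): clauses are modeled as plain integer keys, both lists are
assumed sorted ascending without duplicates, and the function name
match_sorted_subset is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical sketch: look for every element of the sorted array "sub"
 * in the sorted array "query" using one merge-style pass over both.
 * On success, matched[i] is true for each query position that was matched.
 * Returns false if some element of "sub" does not occur in "query"
 * (matched[] may then be only partially filled in).
 */
bool
match_sorted_subset(const int *query, size_t nquery,
                    const int *sub, size_t nsub,
                    bool *matched)
{
    size_t i = 0;
    size_t j = 0;

    assert(matched != NULL || nquery == 0);

    for (size_t k = 0; k < nquery; k++)
        matched[k] = false;

    while (i < nquery && j < nsub)
    {
        if (query[i] < sub[j])
            i++;                /* query item is not part of sub: skip it */
        else if (query[i] > sub[j])
            return false;       /* sub item can no longer appear in query */
        else
        {
            matched[i] = true;  /* same key in both lists */
            i++;
            j++;
        }
    }

    return j == nsub;           /* did we consume all of sub? */
}
```

With query keys {1, 2, 3} standing in for 'a | b | c' and sub keys {1, 3}
standing in for 'a | c', the first and third positions are marked as matched,
and each list is walked only once.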
|
This patch widens SPI_processed, EState's es_processed field, PortalData's
portalPos field, FuncCallContext's call_cntr and max_calls fields,
ExecutorRun's count argument, PortalRunFetch's result, and the max number
of rows in a SPITupleTable to uint64, and deals with (I hope) all the
ensuing fallout. Some of these values were declared uint32 before, and
others "long".
I also removed PortalData's posOverflow field, since that logic seems
pretty useless given that portalPos is now always 64 bits.
The user-visible results are that command tags for SELECT etc will
correctly report tuple counts larger than 4G, as will plpgsql's
GET DIAGNOSTICS ... ROW_COUNT command. Queries processing more tuples
than that are still not exactly the norm, but they're becoming more
common.
Most values associated with FETCH/MOVE distances, such as PortalRun's count
argument and the count argument of most SPI functions that have one, remain
declared as "long". It's not clear whether it would be worth promoting
those to int64; but it would definitely be a large dollop of additional
API churn on top of this, and it would only help 32-bit platforms which
seem relatively less likely to see any benefit.
Andreas Scherbaum, reviewed by Christian Ullrich, additional hacking by me
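
To illustrate why the counter width matters, here is a minimal standalone
sketch. The helper names are hypothetical and are not part of the patch; the
point is only that an unsigned 32-bit count wraps around once it passes
2^32 - 1, while a 64-bit count does not.

```c
#include <stdint.h>

/* Hypothetical helper: add "nrows" to a 32-bit running row count. */
uint32_t
add_rows_u32(uint32_t processed, uint64_t nrows)
{
    /* The conversion back to 32 bits silently drops the high bits. */
    return (uint32_t) (processed + nrows);
}

/* Hypothetical helper: the same bookkeeping with a 64-bit count. */
uint64_t
add_rows_u64(uint64_t processed, uint64_t nrows)
{
    return processed + nrows;
}
```

Adding one row to a count of 4294967295 yields 0 with the 32-bit helper and
4294967296 with the 64-bit one.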
|
Backpatch certain files through 9.1
|
Backpatch certain files through 9.0
|
Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.
|
This is the first run of the Perl-based pgindent script. Also update
pgindent instructions.
|
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
|
Aaron Marcuse-Kubitza <aaronmk@blackducksoftware.com>
|
This alters various incidental uses of C++ key words to use other similar
identifiers, so that a C++ compiler won't choke outright. You still
(probably) need extern "C" { }; around the inclusion of backend headers.
Based on a patch by Kurt Harriman <harriman@acm.org>.
Also add a script cpluspluscheck to check for C++ compatibility in the
future. As of right now, this passes without error for me.
|
not include postgres.h nor anything else it doesn't directly need. Add
#includes to calling files as needed to compensate. Per my proposal of
yesterday.
This should be noted as a source code change in the 8.4 release notes,
since it's likely to require changes in add-on modules.
|
strings. This patch introduces four support functions cstring_to_text,
cstring_to_text_with_len, text_to_cstring, and text_to_cstring_buffer, and
two macros CStringGetTextDatum and TextDatumGetCString. A number of
existing macros that provided variants on these themes were removed.
Most of the places that need to make such conversions now require just one
function or macro call, in place of the multiple notational layers that used
to be needed. There are no longer any direct calls of textout or textin,
and we got most of the places that were using handmade conversions via
memcpy (there may be a few still lurking, though).
This commit doesn't make any serious effort to eliminate transient memory
leaks caused by detoasting toasted text objects before they reach
text_to_cstring. We changed PG_GETARG_TEXT_P to PG_GETARG_TEXT_PP in a few
places where it was easy, but much more could be done.
Brendan Jurd and Tom Lane
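
The kind of conversion these support functions perform can be sketched with a
toy model. Everything below is an assumption made for illustration: ToyText is
a simplified stand-in rather than PostgreSQL's varlena layout, the toy_
functions are hypothetical, and malloc() is used where backend code would
allocate differently.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Toy stand-in for a text value: a byte count followed by the bytes,
 * with no terminating NUL. Not the real on-disk or in-memory format.
 */
typedef struct ToyText
{
    size_t len;
    char   data[];              /* flexible array member */
} ToyText;

/* Build a ToyText from the first "len" bytes of "s". */
ToyText *
toy_cstring_to_text_with_len(const char *s, size_t len)
{
    ToyText *t = malloc(sizeof(ToyText) + len);

    if (t == NULL)
        return NULL;
    t->len = len;
    memcpy(t->data, s, len);
    return t;
}

/* Build a ToyText from a NUL-terminated C string. */
ToyText *
toy_cstring_to_text(const char *s)
{
    return toy_cstring_to_text_with_len(s, strlen(s));
}

/* Copy a ToyText into a freshly allocated NUL-terminated C string. */
char *
toy_text_to_cstring(const ToyText *t)
{
    char *s = malloc(t->len + 1);

    if (s == NULL)
        return NULL;
    memcpy(s, t->data, t->len);
    s[t->len] = '\0';
    return s;
}
```

Each direction is a single call, which is the notational simplification the
message describes.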
|
avoid this problem in the future.)
|
and put it into contrib/tsearch2 compatibility module.
|
and ts_stat(), per my recent suggestion. Also add a possibly-not-needed-
but-can't-hurt check for NULL SPI_tuptable, before we try to dereference
same.
|
if there are zero rows to aggregate over, and the API seems both conceptually
and notationally ugly anyway. We should look for something that improves
on the tsquery-and-text-SELECT version (which is also pretty ugly but at
least it works...), but it seems that will take query infrastructure that
doesn't exist today. (Hm, I wonder if there's anything in or near SQL2003
window functions that would help?) Per discussion.
|
a later rewrite rule should change a subtree modified by an earlier one.
Per my gripe of a few days ago.
|
- change the alignment requirement of lexemes in TSVector slightly.
Lexeme strings were always padded to 2-byte aligned length to make sure
that if there's a position array (uint16[]) it has the right alignment.
The patch changes that so that the padding is not done when there are no
positions. That makes the storage of tsvectors without positions
slightly more compact.
- added some #include "miscadmin.h" lines I missed in the earlier patch when I
added calls to check_stack_depth().
- Reimplement the send/recv functions, and added a comment
above them describing the on-wire format. The CRC is now recalculated in
tsquery as well per previous discussion.
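
The padding rule from the first item can be sketched as a small helper. This
is a simplified illustration, not the tsvector code: the function name
lexeme_storage_len is hypothetical, and it assumes the lexeme string itself
starts at an even offset.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of bytes occupied by a lexeme string of
 * "len" bytes. Rounding up to an even length is only needed when a
 * uint16 position array follows the string.
 */
size_t
lexeme_storage_len(size_t len, bool has_positions)
{
    if (has_positions)
        return (len + 1) & ~(size_t) 1;     /* round up to a multiple of 2 */
    return len;                             /* no positions, no padding */
}
```

A 3-byte lexeme takes 4 bytes when positions follow it and 3 bytes when they
do not, which is where the saving comes from.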
|
- add code to check that the query tree is well-formed. It was indeed
possible to send malformed queries in binary mode, which produced all
kinds of strange results.
- make the left-field a uint32. There's no reason to
arbitrarily limit it to 16 bits, and it won't increase the disk/memory
footprint either now that QueryOperator and QueryOperand are separate
structs.
- add a check_stack_depth() call to all recursive functions I found.
Some of them might have a natural limit so that you can't force
arbitrarily deep recursions, but check_stack_depth() is cheap enough
that it seems best to just stick it into anything that might be a problem.
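
The idea of guarding every recursive function can be sketched in standalone
form. This is only an analogy: it tracks an explicit depth counter against an
assumed limit (TOY_MAX_DEPTH) and returns false, whereas the backend's
check_stack_depth() takes no such argument and reports an error itself. All
toy names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_DEPTH 1000      /* assumed limit for this sketch */

typedef struct ToyNode
{
    struct ToyNode *left;
    struct ToyNode *right;
} ToyNode;

/*
 * Count the nodes of a binary tree recursively. The depth check comes
 * first in the function, so a maliciously deep tree is rejected before
 * it can exhaust the stack.
 */
bool
toy_count_nodes(const ToyNode *node, int depth, size_t *count)
{
    if (depth > TOY_MAX_DEPTH)
        return false;           /* stand-in for raising an error */
    if (node == NULL)
        return true;

    (*count)++;
    return toy_count_nodes(node->left, depth + 1, count) &&
           toy_count_nodes(node->right, depth + 1, count);
}
```

The check costs one comparison per call, so adding it even to functions that
probably have a natural depth limit is cheap.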
|
small editorializing by me
- Break the QueryItem struct into QueryOperator and QueryOperand.
Type was really the only common field between them. QueryItem still
exists, and is used in the TSQuery struct as before, but it's now a
union of the two. Many other changes fell from that, like separation
of pushval_asis function into pushValue, pushOperator and pushStop.
- Moved some structs that were for internal use only from header files
to the right .c-files.
- Moved tsvector parser to a new tsvector_parser.c file. Parser code was
about half of the size of tsvector.c, it's also used from tsquery.c, and
it has some data structures of its own, so it seems better to separate
it. Cleaned up the API so that TSVectorParserState is not accessed from
outside tsvector_parser.c.
- Separated enumerations (#defines, really) used for QueryItem.type
field and as return codes from gettoken_query. It was just accidental
code sharing.
- Removed ParseQueryNode struct used internally by makepol and friends.
push*-functions now construct QueryItems directly.
- Changed int4 variables to just ints for variables like "i" or "array
size", where the storage-size was not significant.
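
The struct split described in the first item can be sketched as below. The
field lists are simplified assumptions chosen for illustration and the Toy
names are hypothetical; only the overall shape (two structs sharing a leading
type field, combined in a union) follows the description above.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum
{
    TOY_QI_VAL = 1,             /* item is an operand */
    TOY_QI_OPR = 2              /* item is an operator */
} ToyItemType;

typedef struct
{
    ToyItemType type;           /* always TOY_QI_OPR */
    char        oper;           /* which operator, e.g. '&' or '|' */
    uint32_t    left;           /* offset to the left operand */
} ToyQueryOperator;

typedef struct
{
    ToyItemType type;           /* always TOY_QI_VAL */
    uint32_t    length;         /* length of the operand string */
    uint32_t    offset;         /* where the string is stored */
} ToyQueryOperand;

/* "type" is the common first member, so it can be read before deciding
 * which of the two structs an item really is. */
typedef union
{
    ToyItemType      type;
    ToyQueryOperator qoperator;
    ToyQueryOperand  qoperand;
} ToyQueryItem;

/* Count the operands in an array of items by inspecting the shared field. */
size_t
toy_count_operands(const ToyQueryItem *items, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (items[i].type == TOY_QI_VAL)
            count++;
    }
    return count;
}
```

Code that walks an array of items only needs the shared type field to
dispatch, while operator-specific and operand-specific fields no longer have
to coexist in one struct.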
|
Oleg Bartunov and Teodor Sigaev, but I did a lot of editorializing,
so anything that's broken is probably my fault.
Documentation is nonexistent as yet, but let's land the patch so we can
get some portability testing done.
|