Commit message log
---
hash table is allocated in a child context of the agg node's memory
context, MemoryContextReset() will reset but *not* delete the child
context. Since ExecReScanAgg() proceeds to build a new hash table
from scratch (in a new sub-context), this results in leaking the
header for the previous memory context. Therefore, use
MemoryContextResetAndDeleteChildren() instead.
Credit: My colleague Sailesh Krishnamurthy at Truviso for isolating
the cause of the leak.
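
A minimal sketch of the distinction (not the actual nodeAgg.c code; the
struct field and function names around the call are simplified):

    #include "postgres.h"
    #include "nodes/execnodes.h"    /* AggState */
    #include "utils/memutils.h"     /* memory-context API */

    /* Hypothetical rescan path, assuming the hash table lives in a
     * sub-context of node->aggcontext. */
    static void
    agg_rescan_sketch(AggState *node)
    {
        /*
         * MemoryContextReset() would free the memory held by aggcontext
         * and its children, but the child-context *headers* survive;
         * since the rebuild below creates yet another sub-context, one
         * header would leak per rescan.  Deleting the children as well
         * avoids that.
         */
        MemoryContextResetAndDeleteChildren(node->aggcontext);

        /* ... rebuild the hash table in a fresh sub-context ... */
    }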
---
is in progress on the same hashtable. This seems the least invasive way to
fix the recently-recognized problem that a split could cause the scan to
visit entries twice or (with much lower probability) miss them entirely.
The only field-reported problem caused by this is the "failed to re-find
shared lock object" PANIC in COMMIT PREPARED reported by Michel Dorochevsky,
which was caused by multiply visited entries. However, it seems certain
that mdsync() is vulnerable to missing required fsync's due to missed
entries, and I am fearful that RelationCacheInitializePhase2() might be at
risk as well. Because of that and the generalized hazard presented by this
bug, back-patch into all the supported branches.
Along the way, fix pg_prepared_statement() and pg_cursor() to not assume
that the hashtables they are examining will stay static between calls.
This is risky regardless of the newly noted dynahash problem, because
hash_seq_search() has never promised to cope with deletion of table entries
other than the just-returned one. There may be no bug here because the only
supported way to call these functions is via ExecMakeTableFunctionResult()
which will cycle them to completion before doing anything very interesting,
but it seems best to get rid of the assumption. This affects 8.2 and HEAD
only, since those functions weren't there earlier.
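
For reference, the dynahash sequential-scan pattern whose guarantees are
at issue looks like this (HASH_SEQ_STATUS, hash_seq_init, hash_seq_search,
and hash_search are the real dynahash interface; the entry type is
hypothetical):

    #include "postgres.h"
    #include "utils/hsearch.h"

    typedef struct MyEntry
    {
        Oid         key;        /* hash key */
        int         payload;
    } MyEntry;

    /* Assumes htab was created with keysize = sizeof(Oid). */
    static void
    scan_sketch(HTAB *htab)
    {
        HASH_SEQ_STATUS status;
        MyEntry    *entry;

        hash_seq_init(&status, htab);
        while ((entry = (MyEntry *) hash_seq_search(&status)) != NULL)
        {
            /*
             * Deleting the just-returned entry is the only concurrent
             * modification hash_seq_search() has ever promised to cope
             * with; insertions that split a bucket mid-scan are what
             * caused entries to be visited twice or missed.
             */
            if (entry->payload < 0)
                (void) hash_search(htab, &entry->key, HASH_REMOVE, NULL);
        }
    }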
---
we should check that the function code returns the claimed result datatype
every time we parse the function for execution. Formerly, for simple
scalar result types we assumed the creation-time check was sufficient, but
this fails if the function selects from a table that's been redefined since
then, and even more obviously fails if check_function_bodies had been OFF.
This is a significant security hole: not only can one trivially crash the
backend, but with appropriate misuse of pass-by-reference datatypes it is
possible to read out arbitrary locations in the server process's memory,
which could allow retrieving database content the user should not be able
to see. Our thanks to Jeff Trout for the initial report.
Security: CVE-2007-0555
---
involving unions of types having typmods. Variants of the failure are known
to occur in 8.1 and up; not sure if it's possible in 8.0 and 7.4, but since
the code exists that far back, I'll just patch 'em all. Per report from
Brian Hurt.
---
ps_TupFromTlist in plan nodes that make use of it. This was being done
correctly in join nodes and Result nodes but not in any relation-scan nodes.
Bug would lead to bogus results if a set-returning function appeared in the
targetlist of a subquery that could be rescanned after partial execution,
for example a subquery within EXISTS(). Bug has been around forever :-(
... surprising it wasn't reported before.
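
The shape of the fix in each relation-scan node's ReScan routine is
roughly as follows (a sketch, shown for sequential scans; the field name
matches that era's PlanState):

    #include "postgres.h"
    #include "nodes/execnodes.h"

    static void
    seqscan_rescan_sketch(SeqScanState *node)
    {
        /* Forget any partially-returned set from a targetlist SRF so
         * the rescan restarts targetlist evaluation from scratch. */
        node->ps.ps_TupFromTlist = false;

        /* ... existing rescan logic ... */
    }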
---
sub-arrays. Per discussion, if all inputs are empty arrays then result
must be an empty array too, whereas a mix of empty and nonempty arrays
should (and already did) draw an error. In the back branches, the
construct was strict: any NULL input immediately yielded a NULL output;
so I left that behavior alone. HEAD was simply ignoring NULL sub-arrays,
which doesn't seem very sensible. For lack of a better idea it now
treats NULL sub-arrays the same as empty ones.
---
a multiple (OR'ed) indexscan. It was checking for duplicate
tuple->t_data->t_ctid, when what it should be checking is tuple->t_self.
The trouble situation occurs when a live tuple has t_ctid not pointing to
itself, which can happen if an attempted UPDATE was rolled back. After a
VACUUM, an unrelated tuple could be installed where the failed update tuple
was, leading to one live tuple's t_ctid pointing to an unrelated tuple.
If one of these tuples is fetched by an earlier OR'ed indexscan and the other
by a later indexscan, nodeIndexscan.c would incorrectly ignore the second
tuple. The bug exists in all 7.4.* and 8.0.* versions, but not in earlier
or later branches because this code was only used in those releases. Per
trouble report from Rafael Martinez Guerrero.
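
The essence of the fix, as a sketch (simplified from the duplicate
detection in nodeIndexscan.c):

    #include "postgres.h"
    #include "access/htup.h"        /* HeapTuple */
    #include "storage/itemptr.h"    /* ItemPointerData */

    /* Key to use when checking whether an earlier scan in the OR'ed
     * list already returned this tuple. */
    static ItemPointerData
    dedup_key_sketch(HeapTuple tuple)
    {
        /*
         * t_self is the tuple's own TID.  t_data->t_ctid is the update-
         * chain forward link: after a rolled-back UPDATE followed by a
         * VACUUM, it can point at an unrelated live tuple, so keying on
         * it conflated distinct tuples.
         */
        return tuple->t_self;
    }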
---
our own command (or more generally, xmin = our xact and cmin >= current
command ID) should not be seen as good. Else we may try to update rows
we already updated. This error was inserted last August while fixing the
even bigger problem that the old coding wouldn't see *any* tuples inserted
by our own transaction as good. Per report from Euler Taveira de Oliveira.
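
Structurally, the restored rule looks like this (a sketch only; the real
change is in the tuple-time-qualification code, and these accessor names
are approximations for that era):

    /* Inside a hypothetical visibility test returning bool: a tuple
     * inserted by our own transaction at or after the current command
     * must not be seen as good, else an UPDATE can revisit rows it has
     * already updated. */
    if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple)) &&
        HeapTupleHeaderGetCmin(tuple) >= GetCurrentCommandId())
        return false;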
---
and with insufficient paranoia in code that follows t_ctid links.
This patch covers the 7.4 branch.
---
for all SPI-created cursors.
---
functions of the aggregate, at both aggregate creation and execution times.
---
was large enough to be batched and the tuples fell into a batch where
there were no inner tuples at all. Thanks to Xiaoyu Wang for finding a
test case that exposed this long-standing bug.
---
Not sure why this isn't causing serious problems in some simple tests,
but it definitely isn't going to do anything desirable...
---
This is required by SQL spec to avoid failures in cases like
SELECT sum(win)/sum(lose) FROM ... GROUP BY ... HAVING sum(lose) > 0;
AFAICT we have gotten this wrong since day one. Kudos to Holger Jakobs
for being the first to notice.
---
Per example from Jiang Wei.
---
7.4 rewrite for hashed aggregate support. If the transition data type
is pass-by-reference, the transValue must be pfreed at each new
group boundary, else we leak one value per group. Thanks to
Rae Steining for providing a reproducible test case.
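
The shape of the fix (a sketch patterned on nodeAgg.c's per-group reset;
member names approximate that era's structures):

    /* On entering a new group, release the previous group's transition
     * value if the transition datatype is pass-by-reference. */
    if (!peraggstate->transtypeByVal &&
        !pergroupstate->transValueIsNull &&
        DatumGetPointer(pergroupstate->transValue) != NULL)
        pfree(DatumGetPointer(pergroupstate->transValue));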
---
simplistic; it recognized "SELECT * FROM ..." but not "SELECT * FROM ... LIMIT ...".
Per bug report from Jeff Bohmer.
---
when scanning a table that we need all the columns from. In case of
SELECT INTO, we have to check that the hasoids flag matches the desired
output type, too. Per report from Mike Mascari.
---
number-of-buckets that exceeds the size we actually plan to allow
the hash table to grow to. Per trouble report from Sean Shanny.
---
evaluating a set-valued function. This fixes some additional problems
with rescanning partially-evaluated SRFs.
---
shut down cleanly if the plan node is ReScanned before the SRFs are run
to completion. This fixes the problem for SQL-language functions, but
functions using the SRF_XXX() macros still need work.
---
per report from Andrew Holm-Hansen. The difficulty arises from the fact
that the planner allowed a Hash node's hashkeys to share substructure
with the parent HashJoin node's hashclauses, plus some rather bizarre
choices about who initializes what during executor startup. A cleaner
but more invasive solution is to not store hashkeys separately in the
plan tree at all, but let the HashJoin node deconstruct hashclauses
during executor startup. I plan to fix it that way in HEAD.
---
rather than allocating them on the stack.
Fixes complaint from gcc 3.3.1.
---
when -fstrict-aliasing is turned on, as it is in the latest gcc when you
use -O2.
Andrew Dunstan
---
discussion on pgsql-hackers: in READ COMMITTED mode we just have to force
a QuerySnapshot update in the trigger, but in SERIALIZABLE mode we have
to run the scan under a current snapshot and then complain if any rows
would be updated/deleted that are not visible in the transaction snapshot.
---
don't deal well with tuples having dropped columns. The attached fixes
the issue. Please apply.
Joe Conway
---
during ExecInitTidScan, because the rest of the executor isn't ready.
---
to allow es_snapshot to be set to SnapshotNow rather than a query snapshot.
This solves a bug reported by Wade Klaver, wherein triggers fired as a
result of RI cascade updates could misbehave.
---
terms, add some clarifications, fix some untranslatable attempts at dynamic
message building.
---
now able to cope with assigning new relfilenode values to nailed-in-cache
indexes, so they can be reindexed using the fully crash-safe method. This
leaves only shared system indexes as special cases. Remove the 'index
deactivation' code, since it provides no useful protection in the shared-
index case. Require reindexing of shared indexes to be done in standalone
mode, but remove other restrictions on REINDEX. -P (IgnoreSystemIndexes)
now prevents using indexes for lookups, but does not disable index updates.
It is therefore safe to allow it from PGOPTIONS. Upshot: reindexing system catalogs
can be done without a standalone backend for all cases except
shared catalogs.
---
_SPI_begin_call. Per gripe from Tomasz Myrta.
---
changed, it should allow a zero value (implying no changes to make).
---
really general fix might be difficult, I believe the only case where
AtCommit_Notify could see an uncommitted tuple is where the other guy
has just unlistened and not yet committed. The best solution seems to
be to just skip updating that tuple, on the assumption that the other
guy does not want to hear about the notification anyway. This is not
perfect --- if the other guy rolls back his unlisten instead of committing,
then he really should have gotten this notify. But to do that, we'd have
to wait to see if he commits or not, or make UNLISTEN hold exclusive lock
on pg_listener until commit. Either of these answers is deadlock-prone,
not to mention horrible for interactive performance. Do it this way
for now. (What happened to that project to do LISTEN/NOTIFY in memory
with no table, anyway?)
---
feature they complain about isn't a feature or cannot be implemented without
definitional changes.
---
handling many-way scans: instead of re-evaluating all prior indexscan
quals to see if a tuple has been fetched more than once, use a hash table
indexed by tuple CTID. But fall back to the old way if the hash table
grows to exceed SortMem.
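
As a sketch, the new bookkeeping looks roughly like this (the dynahash
calls are the real API; the table name and initial size are illustrative):

    #include "postgres.h"
    #include "storage/itemptr.h"
    #include "utils/hsearch.h"

    static HTAB *
    create_tid_set_sketch(void)
    {
        HASHCTL     ctl;

        MemSet(&ctl, 0, sizeof(ctl));
        ctl.keysize = sizeof(ItemPointerData);
        ctl.entrysize = sizeof(ItemPointerData);
        ctl.hash = tag_hash;

        return hash_create("OR-indexscan returned TIDs", 1024, &ctl,
                           HASH_ELEM | HASH_FUNCTION);
    }

    /* Enter a fetched tuple's TID; true means an earlier indexscan in
     * the OR'ed list already returned it, so skip it. */
    static bool
    tid_seen_before_sketch(HTAB *tidtab, ItemPointer tid)
    {
        bool        found;

        (void) hash_search(tidtab, tid, HASH_ENTER, &found);
        return found;
    }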
---
as well as the hash function (formerly the comparison function was hardwired
as memcmp()). This makes it possible to eliminate the special-purpose
hashtable management code in execGrouping.c in favor of using dynahash to
manage tuple hashtables; which is a win because dynahash knows how to expand
a hashtable when the original size estimate was too small, whereas the
special-purpose code was too stupid to do that. (See recent gripe from
Stephan Szabo about poor performance when hash table size estimate is way
off.) Free side benefit: when using string_hash, the default comparison
function is now strncmp() instead of memcmp(). This should eliminate some
part of the overhead associated with larger NAMEDATALEN values.
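
A sketch of the new knob (HASH_COMPARE and the HASHCTL match field are
what this change adds; the other values are illustrative):

    #include "postgres.h"
    #include "utils/hsearch.h"

    static HTAB *
    create_name_table_sketch(void)
    {
        HASHCTL     ctl;

        MemSet(&ctl, 0, sizeof(ctl));
        ctl.keysize = NAMEDATALEN;
        ctl.entrysize = NAMEDATALEN;
        ctl.hash = string_hash;
        /* Caller-specified key comparison to pair with the hash
         * function; with string_hash the default is now strncmp(),
         * which stops at the NUL instead of comparing all NAMEDATALEN
         * bytes. */
        ctl.match = (HashCompareFunc) strncmp;

        return hash_create("name table", 256, &ctl,
                           HASH_ELEM | HASH_FUNCTION | HASH_COMPARE);
    }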
---
be anything yielding an array of the proper kind, not only sub-ARRAY[]
constructs; do subscript checking at runtime not parse time. Also,
adjust array_cat to make array || array comply with the SQL99 spec.
Joe Conway
---
yet, though). Avoid using nth() to fetch tlist entries; provide a
common routine get_tle_by_resno() to search a tlist for a particular
resno. This replaces a couple uses of nth() and a dozen hand-coded
search loops. Also, replace a few uses of nth(length-1, list) with
llast().
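
Typical usage of the new helper (get_tle_by_resno is the routine this
change adds; the surrounding code is a sketch):

    #include "postgres.h"
    #include "optimizer/tlist.h"    /* get_tle_by_resno() */

    static TargetEntry *
    fetch_tle_sketch(List *tlist, AttrNumber resno)
    {
        /* Replaces nth(resno - 1, tlist) and hand-coded search loops. */
        TargetEntry *tle = get_tle_by_resno(tlist, resno);

        if (tle == NULL)
            elog(ERROR, "no tlist entry for resno %d", resno);
        return tle;
    }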
---
Per report from Mendola Gaetano.
---
macros in some platforms' sys/socket.h.
---
so it won't miss 'em again.