| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like the similar logic for arrays and records, it's necessary to examine
the range's subtype to decide whether the range type can support hashing.
We can omit checking the subtype for btree-defined operations, though,
since range subtypes are required to have those operations. (Possibly
that simplification for btree cases led us to overlook that it does
not apply for hash cases.)
This is only an issue if the subtype lacks hash support, which is not
true of any built-in range type, but it's easy to demonstrate a problem
with a range type over, eg, money: you can get a "could not identify
a hash function" failure when the planner is misled into thinking that
hash join or aggregation would work.
This was born broken, so back-patch to all supported branches.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit 8abb3cda0ddc00a0ab98977a1633a95b97068d4e I attempted to cache
the expression state trees constructed for domain CHECK constraints for
the life of the backend (assuming the domain's constraints don't get
redefined). However, this turns out not to work very well, because
execQual.c will run those state trees with ecxt_per_query_memory pointing
to a query-lifespan context, and in some situations we'll end up with
pointers into that context getting stored into the state trees. This
happens in particular with SQL-language functions, as reported by
Emre Hasegeli, but there are many other cases.
To fix, keep only the expression plan trees for domain CHECK constraints
in the typcache's data structure, and revert to performing ExecInitExpr
(at least) once per query to set up expression state trees in the query's
context.
Eventually it'd be nice to undo this, but that will require some careful
thought about memory management for expression state trees, and it seems
far too late for any such redesign in 9.5. This way is still much more
efficient than what happened before 8abb3cda0.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, CHECK constraints of the same scope were checked in whatever
order they happened to be read from pg_constraint. (Usually, but not
reliably, this would be creation order for domain constraints and reverse
creation order for table constraints, because of differing implementation
details.) Nondeterministic results of this sort are problematic at least
for testing purposes, and in discussion it was agreed to be a violation of
the principle of least astonishment. Therefore, borrow the principle
already established for triggers, and apply such checks in name order
(using strcmp() sort rules). This lets users control the check order
if they have a mind to.
Domain CHECK constraints still follow the rule of checking lower nested
domains' constraints first; the name sort only applies to multiple
constraints attached to the same domain.
In passing, I failed to resist the temptation to wordsmith a bit in
create_domain.sgml.
Apply to HEAD only, since this could result in a behavioral change in
existing applications, and the potential regression test failures have
not actually been observed in our buildfarm.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we cached domain constraints for the life of a query, or
really for the life of the FmgrInfo struct that was used to invoke
domain_in() or domain_check(). But plpgsql (and probably other places)
are set up to cache such FmgrInfos for the whole lifespan of a session,
which meant they could be enforcing really stale sets of constraints.
On the other hand, searching pg_constraint once per query gets kind of
expensive too: testing says that as much as half the runtime of a
trivial query such as "SELECT 0::domaintype" went into that.
To fix this, delegate the responsibility for tracking a domain's
constraints to the typcache, which has the infrastructure needed to
detect syscache invalidation events that signal possible changes.
This not only removes unnecessary repeat reads of pg_constraint,
but ensures that we never apply stale constraint data: whatever we
use is the current data according to syscache rules.
Unfortunately, the current configuration of the system catalogs means
we have to flush cached domain-constraint data whenever either pg_type
or pg_constraint changes, which happens rather a lot (eg, creation or
deletion of a temp table will do it). It might be worth rearranging
things to split pg_constraint into two catalogs, of which the domain
constraint one would probably be very low-traffic. That's a job for
another patch though, and in any case this patch should improve matters
materially even with that handicap.
This patch makes use of the recently-added memory context reset callback
feature to manage the lifespan of domain constraint caches, so that we
don't risk deleting a cache that might be in the midst of evaluation.
Although this is a bug fix as well as a performance improvement, no
back-patch. There haven't been many if any field complaints about
stale domain constraint checks, so it doesn't seem worth taking the
risk of modifying data structures as basic as MemoryContexts in back
branches.
|
|
|
|
|
|
| |
Fix a batch of structs that are only visible within individual .c files.
Michael Paquier
|
|
|
|
| |
Backpatch certain files through 9.0
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, if you wanted anything besides C-string hash keys, you had to
specify a custom hashing function to hash_create(). Nearly all such
callers were specifying tag_hash or oid_hash; which is tedious, and rather
error-prone, since a caller could easily miss the opportunity to optimize
by using hash_uint32 when appropriate. Replace this with a design whereby
callers using simple binary-data keys just specify HASH_BLOBS and don't
need to mess with specific support functions. hash_create() itself will
take care of optimizing when the key size is four bytes.
This nets out saving a few hundred bytes of code space, and offers
a measurable performance improvement in tidbitmap.c (which was not
exploiting the opportunity to use hash_uint32 for its 4-byte keys).
There might be some wins elsewhere too, I didn't analyze closely.
In future we could look into offering a similar optimized hashing function
for 8-byte keys. Under this design that could be done in a centralized
and machine-independent fashion, whereas getting it right for keys of
platform-dependent sizes would've been notationally painful before.
For the moment, the old way still works fine, so as not to break source
code compatibility for loadable modules. Eventually we might want to
remove tag_hash and friends from the exported API altogether, since there's
no real need for them to be explicitly referenced from outside dynahash.c.
Teodor Sigaev and Tom Lane
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, if the typcache had for example tried and failed to find a hash
opclass for a given data type, it would nonetheless repeat the unsuccessful
catalog lookup each time it was asked again. This can lead to a
significant amount of useless bufmgr traffic, as in a recent report from
Scott Marlowe. Like the catalog caches, typcache should be able to cache
negative results. This patch arranges that by making use of separate flag
bits to remember whether a particular item has been looked up, rather than
treating a zero OID as an indicator that no lookup has been done.
Also, install a credible invalidation mechanism, namely watching for inval
events in pg_opclass. The sole advantage of the lack of negative caching
was that the code would cope if operators or opclasses got added for a type
mid-session; to preserve that behavior we have to be able to invalidate
stale lookup results. Updates in pg_opclass should be pretty rare in
production systems, so it seems sufficient to just invalidate all the
dependent data whenever one happens.
Adding proper invalidation also means that this code will now react sanely
if an opclass is dropped mid-session. Arguably, that's a back-patchable
bug fix, but in view of the lack of complaints from the field I'll refrain
from back-patching. (Probably, in most cases where an opclass is dropped,
the data type itself is dropped soon after, so that this misfeasance has
no bad consequences.)
|
|
|
|
|
| |
This includes removing tabs after periods in C comments, which was
applied to back branches, so this change should not effect backpatching.
|
|
|
|
|
| |
Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SnapshotNow scans have the undesirable property that, in the face of
concurrent updates, the scan can fail to see either the old or the new
versions of the row. In many cases, we work around this by requiring
DDL operations to hold AccessExclusiveLock on the object being
modified; in some cases, the existing locking is inadequate and random
failures occur as a result. This commit doesn't change anything
related to locking, but will hopefully pave the way to allowing lock
strength reductions in the future.
The major issue has held us back from making this change in the past
is that taking an MVCC snapshot is significantly more expensive than
using a static special snapshot such as SnapshotNow. However, testing
of various worst-case scenarios reveals that this problem is not
severe except under fairly extreme workloads. To mitigate those
problems, we avoid retaking the MVCC snapshot for each new scan;
instead, we take a new snapshot only when invalidation messages have
been processed. The catcache machinery already requires that
invalidation messages be sent before releasing the related heavyweight
lock; else other backends might rely on locally-cached data rather
than scanning the catalog at all. Thus, making snapshot reuse
dependent on the same guarantees shouldn't break anything that wasn't
already subtly broken.
Patch by me. Review by Michael Paquier and Andres Freund.
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to increment the refcount on the composite type's cached tuple
descriptor while we do lookups of its column types. Otherwise a cache
flush could occur and release the tuple descriptor before we're done with
it. This fails reliably with -DCLOBBER_CACHE_ALWAYS, but the odds of a
failure in a production build seem rather low (since the pfree'd descriptor
typically wouldn't get scribbled on immediately). That may explain the
lack of any previous reports. Buildfarm issue noted by Christian Ullrich.
Back-patch to 9.1 where the bogus code was added.
|
|
|
|
|
| |
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reduces unnecessary exposure of other headers through htup.h, which
is very widely included by many files.
I have chosen to move the function prototypes to the new file as well,
because that means htup.h no longer needs to include tupdesc.h. In
itself this doesn't have much effect in indirect inclusion of tupdesc.h
throughout the tree, because it's also required by execnodes.h; but it's
something to explore in the future, and it seemed best to do the htup.h
change now while I'm busy with it.
|
|
|
|
|
| |
This lets files that are mere users of ResourceOwner not automatically
include the headers for stuff that is managed by the resowner mechanism.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When (re) loading the typcache comparison cache for an enum type's values,
use an up-to-date MVCC snapshot, not the transaction's existing snapshot.
This avoids problems if we encounter an enum OID that was created since our
transaction started. Per report from Andres Freund and diagnosis by Robert
Haas.
To ensure this is safe even if enum comparison manages to get invoked
before we've set a transaction snapshot, tweak GetLatestSnapshot to
redirect to GetTransactionSnapshot instead of throwing error when
FirstSnapshotSet is false. The existing uses of GetLatestSnapshot (in
ri_triggers.c) don't care since they couldn't be invoked except in a
transaction that's already done some work --- but it seems just conceivable
that this might not be true of enums, especially if we ever choose to use
enums in system catalogs.
Note that the comparable coding in enum_endpoint and enum_range_internal
remains GetTransactionSnapshot; this is perhaps debatable, but if we
changed it those functions would have to be marked volatile, which doesn't
seem attractive.
Back-patch to 9.1 where ALTER TYPE ADD VALUE was added.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move the responsibility for caching specialized information about range
types into the type cache, so that the catalog lookups only have to occur
once per session. Rearrange APIs a bit so that fn_extra caching is
actually effective in the GiST support code. (Use of OidFunctionCallN is
bad enough for performance in itself, but it also prevents the function
from exploiting fn_extra caching.)
The range I/O functions are still not very bright about caching repeated
lookups, but that seems like material for a separate patch.
Also, avoid unnecessary use of memcpy to fetch/store the range type OID and
flags, and don't use the full range_deserialize machinery when all we need
to see is the flags value.
Also fix API error in range_gist_penalty --- it was failing to set *penalty
for any case involving an empty range.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existence of a btree opclass accepting composite types caused us to
assume that every composite type is sortable. This isn't true of course;
we need to check if the column types are all sortable. There was logic
for this for the case of array comparison (ie, check that the element
type is sortable), but we missed the point for rowtypes. Per Teodor's
report of an ANALYZE failure for an unsortable composite type.
Rather than just add some more ad-hoc logic for this, I moved knowledge of
the issue into typcache.c. The typcache will now only report out array_eq,
record_cmp, and friends as usable operators if the array or composite type
will work with those functions.
Unfortunately we don't have enough info to do this for anonymous RECORD
types; in that case, just assume it will work, and take the runtime failure
as before if it doesn't.
This patch might be a candidate for back-patching at some point, but
given the lack of complaints from the field, I'd rather just test it in
HEAD for now.
Note: most of the places touched in this patch will need further work
when we get around to supporting hashing of record types.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The core of this patch is hash_array() and associated typcache
infrastructure, which works just about exactly like the existing support
for array comparison.
In addition I did some work to ensure that the planner won't think that an
array type is hashable unless its element type is hashable, and similarly
for sorting. This includes adding a datatype parameter to op_hashjoinable
and op_mergejoinable, and adding an explicit "hashable" flag to
SortGroupClause. The lack of a cross-check on the element type was a
pre-existing bug in mergejoin support --- but it didn't matter so much
before, because if you couldn't sort the element type there wasn't any good
alternative to failing anyhow. Now that we have the alternative of hashing
the array type, there are cases where we can avoid a failure by being picky
at the planner stage, so it's time to be picky.
The issue of exactly how to combine the per-element hash values to produce
an array hash is still open for discussion, but the rest of this is pretty
solid, so I'll commit it as-is.
|
|
|
|
|
|
|
| |
After much expenditure of effort, we've got this to the point where the
performance penalty is pretty minimal in typical cases.
Andrew Dunstan, reviewed by Brendan Jurd, Dean Rasheed, and Tom Lane
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SI invalidation events, rather than indirectly through the relcache.
In the previous coding, we had to flush a composite-type typcache entry
whenever we discarded the corresponding relcache entry. This caused problems
at least when testing with RELCACHE_FORCE_RELEASE, as shown in recent report
from Jeff Davis, and might result in real-world problems given the kind of
unexpected relcache flush that that test mechanism is intended to model.
The new coding decouples relcache and typcache management, which is a good
thing anyway from a structural perspective. The cost is that we have to
search the typcache linearly to find entries that need to be flushed. There
are a couple of ways we could avoid that, but at the moment it's not clear
it's worth any extra trouble, because the typcache contains very few entries
in typical operation.
Back-patch to 8.2, the same as some other recent fixes in this general area.
The patch could be carried back to 8.0 with some additional work, but given
that it's only hypothetical whether we're fixing any problem observable in
the field, it doesn't seem worth the work now.
|
|
|
|
|
|
|
|
|
|
|
|
| |
The purpose of this change is to eliminate the need for every caller
of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists,
GetSysCacheOid, and SearchSysCacheList to know the maximum number
of allowable keys for a syscache entry (currently 4). This will
make it far easier to increase the maximum number of keys in a
future release should we choose to do so, and it makes the code
shorter, too.
Design and review by Tom Lane.
|
| |
|
|
|
|
|
|
|
| |
probably got there via blind copy-and-paste from one of the legitimate
callers, so rearrange and comment that code a bit to make it clearer that
this isn't a necessary prerequisite to hash_create. Per observation
from Robert Haas.
|
| |
|
|
|
|
|
|
|
| |
corresponding struct definitions. This allows other headers to avoid including
certain highly-loaded headers such as rel.h and relscan.h, instead using just
relcache.h, heapam.h or genam.h, which are more lightweight and thus cause less
unnecessary dependencies.
|
| |
|
| |
|
|
|
|
|
| |
pg_type.typtype whereever practical. Tom Dunstan, with some kibitzing
from Tom Lane.
|
|
|
|
| |
back-stamped for this.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cases. Operator classes now exist within "operator families". While most
families are equivalent to a single class, related classes can be grouped
into one family to represent the fact that they are semantically compatible.
Cross-type operators are now naturally adjunct parts of a family, without
having to wedge them into a particular opclass as we had done originally.
This commit restructures the catalogs and cleans up enough of the fallout so
that everything still works at least as well as before, but most of the work
needed to actually improve the planner's behavior will come later. Also,
there are not yet CREATE/DROP/ALTER OPERATOR FAMILY commands; the only way
to create a new family right now is to allow CREATE OPERATOR CLASS to make
one by default. I owe some more documentation work, too. But that can all
be done in smaller pieces once this infrastructure is in place.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
by creating a reference-count mechanism, similar to what we did a long time
ago for catcache entries. The back branches have an ugly solution involving
lots of extra copies, but this way is more efficient. Reference counting is
only applied to tupdescs that are actually in caches --- there seems no need
to use it for tupdescs that are generated in the executor, since they'll go
away during plan shutdown by virtue of being in the per-query memory context.
Neil Conway and Tom Lane
|
| |
|
|
|
|
|
|
|
|
|
|
| |
regardless of the current schema search path. Since CREATE OPERATOR CLASS
only allows one default opclass per datatype regardless of schemas, this
should have minimal impact, and it fixes problems with failure to find a
desired opclass while restoring dump files. Per discussion at
http://archives.postgresql.org/pgsql-hackers/2006-02/msg00284.php.
Remove now-redundant-or-unused code in typcache.c and namespace.c,
and backpatch as far as 8.0.
|
|
|
|
|
|
|
|
|
| |
comment line where output as too long, and update typedefs for /lib
directory. Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).
Backpatch to 8.1.X.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
spotted by Qingqing Zhou. The HASH_ENTER action now automatically
fails with elog(ERROR) on out-of-memory --- which incidentally lets
us eliminate duplicate error checks in quite a bunch of places. If
you really need the old return-NULL-on-out-of-memory behavior, you
can ask for HASH_ENTER_NULL. But there is now an Assert in that path
checking that you aren't hoping to get that behavior in a palloc-based
hash table.
Along the way, remove the old HASH_FIND_SAVE/HASH_REMOVE_SAVED actions,
which were not being used anywhere anymore, and were surely too ugly
and unsafe to want to see revived again.
|
|
|
|
|
|
| |
whose keys are OIDs. The only one that looks particularly performance
critical is the relcache hashtable, but as long as we've got the function
we may as well use it wherever it's applicable.
|
|
|
|
|
|
|
| |
indexes. Replace all heap_openr and index_openr calls by heap_open
and index_open. Remove runtime lookups of catalog OID numbers in
various places. Remove relcache's support for looking up system
catalogs by name. Bulky but mostly very boring patch ...
|
|
|
|
|
|
|
|
| |
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
|
| |
|