| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Several functions, mostly type input functions, calculated an allocation
size such that the calculation wrapped to a small positive value when
arguments implied a sufficiently-large requirement. Writes past the end
of the inadvertent small allocation followed shortly thereafter.
Coverity identified the path_in() vulnerability; code inspection led to
the rest. In passing, add check_stack_depth() to prevent stack overflow
in related functions.
Back-patch to 8.4 (all supported versions). The non-comment hstore
changes touch code that did not exist in 8.4, so that part stops at 9.0.
Noah Misch and Heikki Linnakangas, reviewed by Tom Lane.
Security: CVE-2014-0064
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Many server functions use the MAXDATELEN constant to size a buffer for
parsing or displaying a datetime value. It was much too small for the
longest possible interval output and slightly too small for certain
valid timestamp input, particularly input with a long timezone name.
The long input was rejected needlessly; the long output caused
interval_out() to overrun its buffer. ECPG's pgtypes library has a copy
of the vulnerable functions, which bore the same vulnerabilities along
with some of its own. In contrast to the server, certain long inputs
caused stack overflow rather than failing cleanly. Back-patch to 8.4
(all supported versions).
Reported by Daniel Schüssler, reviewed by Tom Lane.
Security: CVE-2014-0063
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we were piggybacking on transaction ID parameters to freeze
multixacts; but since there isn't necessarily any relationship between
rates of Xid and multixact consumption, this turns out not to be a good
idea.
Therefore, we now have multixact-specific freezing parameters:
vacuum_multixact_freeze_min_age: when to remove multis as we come across
them in vacuum (default to 5 million, i.e. early in comparison to Xid's
default of 50 million)
vacuum_multixact_freeze_table_age: when to force whole-table scans
instead of scanning only the pages marked as not all visible in
visibility map (default to 150 million, same as for Xids). Whichever of
both which reaches the 150 million mark earlier will cause a whole-table
scan.
autovacuum_multixact_freeze_max_age: when for cause emergency,
uninterruptible whole-table scans (default to 400 million, double as
that for Xids). This means there shouldn't be more frequent emergency
vacuuming than previously, unless multixacts are being used very
rapidly.
Backpatch to 9.3 where multixacts were made to persist enough to require
freezing. To avoid an ABI break in 9.3, VacuumStmt has a couple of
fields in an unnatural place, and StdRdOptions is split in two so that
the newly added fields can go at the end.
Patch by me, reviewed by Robert Haas, with additional input from Andres
Freund and Tom Lane.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prevent handle_sig_alarm from losing control partway through due to a query
cancel (either an asynchronous SIGINT, or a cancel triggered by one of the
timeout handler functions). That would at least result in failure to
schedule any required future interrupt, and might result in actual
corruption of timeout.c's data structures, if the interrupt happened while
we were updating those.
We could still lose control if an asynchronous SIGINT arrives just as the
function is entered. This wouldn't break any data structures, but it would
have the same effect as if the SIGALRM interrupt had been silently lost:
we'd not fire any currently-due handlers, nor schedule any new interrupt.
To forestall that scenario, forcibly reschedule any pending timer interrupt
during AbortTransaction and AbortSubTransaction. We can avoid any extra
kernel call in most cases by not doing that until we've allowed
LockErrorCleanup to kill the DEADLOCK_TIMEOUT and LOCK_TIMEOUT events.
Another hazard is that some platforms (at least Linux and *BSD) block a
signal before calling its handler and then unblock it on return. When we
longjmp out of the handler, the unblock doesn't happen, and the signal is
left blocked indefinitely. Again, we can fix that by forcibly unblocking
signals during AbortTransaction and AbortSubTransaction.
These latter two problems do not manifest when the longjmp reaches
postgres.c, because the error recovery code there kills all pending timeout
events anyway, and it uses sigsetjmp(..., 1) so that the appropriate signal
mask is restored. So errors thrown outside any transaction should be OK
already, and cleaning up in AbortTransaction and AbortSubTransaction should
be enough to fix these issues. (We're assuming that any code that catches
a query cancel error and doesn't re-throw it will do at least a
subtransaction abort to clean up; but that was pretty much required already
by other subsystems.)
Lastly, ProcSleep should not clear the LOCK_TIMEOUT indicator flag when
disabling that event: if a lock timeout interrupt happened after the lock
was granted, the ensuing query cancel is still going to happen at the next
CHECK_FOR_INTERRUPTS, and we want to report it as a lock timeout not a user
cancel.
Per reports from Dan Wood.
Back-patch to 9.3 where the new timeout handling infrastructure was
introduced. We may at some point decide to back-patch the signal
unblocking changes further, but I'll desist from that until we hear
actual field complaints about it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've seen multiple cases of people looking at the postmaster's original
stderr output to try to diagnose problems, not realizing/remembering that
their logging configuration is set up to send log messages somewhere else.
This seems particularly likely to happen in prepackaged distributions,
since many packagers patch the code to change the factory-standard logging
configuration to something more in line with their platform conventions.
In hopes of reducing confusion, emit a LOG message about this at the point
in startup where we are about to switch log output away from the original
stderr, providing a pointer to where to look instead. This message will
appear as the last thing in the original stderr output. (We might later
also try to emit such link messages when logging parameters are changed
on-the-fly; but that case seems to be both noticeably harder to do nicely,
and much less frequently a problem in practice.)
Per discussion, back-patch to 9.3 but not further.
|
|
|
|
|
|
|
| |
The new JSON API uses a bit of an unusual typedef scheme, where for
example OkeysState is a pointer to okeysState. And that's not applied
consistently either. Change that to the more usual PostgreSQL style
where struct typedefs are upper case, and use pointers explicitly.
|
|
|
|
|
|
| |
Looks like a cut/paste error in the original addition of the file.
Andres Freund
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extend the FDW API (which we already changed for 9.3) so that an FDW can
report whether specific foreign tables are insertable/updatable/deletable.
The default assumption continues to be that they're updatable if the
relevant executor callback function is supplied by the FDW, but finer
granularity is now possible. As a test case, add an "updatable" option to
contrib/postgres_fdw.
This patch also fixes the information_schema views, which previously did
not think that foreign tables were ever updatable, and fixes
view_is_auto_updatable() so that a view on a foreign table can be
auto-updatable.
initdb forced due to changes in information_schema views and the functions
they rely on. This is a bit unfortunate to do post-beta1, but if we don't
change this now then we'll have another API break for FDWs when we do
change it.
Dean Rasheed, somewhat editorialized on by Tom Lane
|
|
|
|
|
| |
This is the first run of the Perl-based pgindent script. Also update
pgindent instructions.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously this state was represented by whether the view's disk file had
zero or nonzero size, which is problematic for numerous reasons, since it's
breaking a fundamental assumption about heap storage. This was done to
allow unlogged matviews to revert to unpopulated status after a crash
despite our lack of any ability to update catalog entries post-crash.
However, this poses enough risk of future problems that it seems better to
not support unlogged matviews until we can find another way. Accordingly,
revert that choice as well as a number of existing kluges forced by it
in favor of creating a pg_class.relispopulated flag column.
|
|
|
|
|
|
| |
In most cases, these were just references to the SQL standard in
general. In a few cases, a contrast was made between SQL92 and later
standards -- those have been kept unchanged.
|
|
|
|
|
|
|
| |
This saves some memory from each index relcache entry. At least on a 64-bit
machine, it saves just enough to shrink a typical relcache entry's memory
usage from 2k to 1k. That's nice if you have a lot of backends and a lot of
indexes.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The intent was that being populated would, long term, be just one
of the conditions which could affect whether a matview was
scannable; being populated should be necessary but not always
sufficient to scan the relation. Since only CREATE and REFRESH
currently determine the scannability, names and comments
accidentally conflated these concepts, leading to confusion.
Also add missing locking for the SQL function which allows a
test for scannability, and fix a modularity violatiion.
Per complaints from Tom Lane, although its not clear that these
will satisfy his concerns. Hopefully this will at least better
frame the discussion.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The JSON parser is converted into a recursive descent parser, and
exposed for use by other modules such as extensions. The API provides
hooks for all the significant parser event such as the beginning and end
of objects and arrays, and providing functions to handle these hooks
allows for fairly simple construction of a wide variety of JSON
processing functions. A set of new basic processing functions and
operators is also added, which use this API, including operations to
extract array elements, object fields, get the length of arrays and the
set of keys of a field, deconstruct an object into a set of key/value
pairs, and create records from JSON objects and arrays of objects.
Catalog version bumped.
Andrew Dunstan, with some documentation assistance from Merlin Moncure.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This event takes place just before ddl_command_end, and is fired if and
only if at least one object has been dropped by the command. (For
instance, DROP TABLE IF EXISTS of a table that does not in fact exist
will not lead to such a trigger firing). Commands that drop multiple
objects (such as DROP SCHEMA or DROP OWNED BY) will cause a single event
to fire. Some firings might be surprising, such as
ALTER TABLE DROP COLUMN.
The trigger is fired after the drop has taken place, because that has
been deemed the safest design, to avoid exposing possibly-inconsistent
internal state (system catalogs as well as current transaction) to the
user function code. This means that careful tracking of object
identification is required during the object removal phase.
Like other currently existing events, there is support for tag
filtering.
To support the new event, add a new pg_event_trigger_dropped_objects()
set-returning function, which returns a set of rows comprising the
objects affected by the command. This is to be used within the user
function code, and is mostly modelled after the recently introduced
pg_identify_object() function.
Catalog version bumped due to the new function.
Dimitri Fontaine and Álvaro Herrera
Review by Robert Haas, Tom Lane
|
| |
|
|
|
|
|
| |
If required, recovery.conf can now be located outside of the data directory.
Server needs read/write permissions on this directory.
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce pg_identify_object(oid,oid,int4), which is similar in spirit
to pg_describe_object but instead produces a row of machine-readable
information to uniquely identify the given object, without resorting to
OIDs or other internal representation. This is intended to be used in
the event trigger implementation, to report objects being operated on;
but it has usefulness of its own.
Catalog version bumped because of the new function.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This GUC allows limiting the time spent waiting to acquire any one
heavyweight lock.
In support of this, improve the recently-added timeout infrastructure
to permit efficiently enabling or disabling multiple timeouts at once.
That reduces the performance hit from turning on lock_timeout, though
it's still not zero.
Zoltán Böszörményi, reviewed by Tom Lane,
Stephen Frost, and Hari Babu
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the following:
json_agg(anyrecord) -> json
to_json(any) -> json
hstore_to_json(hstore) -> json (also used as a cast)
hstore_to_json_loose(hstore) -> json
The last provides heuristic treatment of numbers and booleans.
Also, in json generation, if any non-builtin type has a cast to json,
that function is used instead of the type's output function.
Andrew Dunstan, reviewed by Steve Singer.
Catalog version bumped.
|
|
|
|
| |
Alexander Korotkov, reviewed by Jeff Davis.
|
|
|
|
|
|
|
| |
This saves several catalog lookups per reference. It's not all that
exciting right now, because we'd managed to minimize the number of places
that need to fetch the data; but the upcoming writable-foreign-tables patch
needs this info in a lot more places.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
formatting.c used locale-dependent case folding rules in some code paths
where the result isn't supposed to be locale-dependent, for example
to_char(timestamp, 'DAY'). Since the source data is always just ASCII
in these cases, that usually didn't matter ... but it does matter in
Turkish locales, which have unusual treatment of "i" and "I". To confuse
matters even more, the misbehavior was only visible in UTF8 encoding,
because in single-byte encodings we used pg_toupper/pg_tolower which
don't have locale-specific behavior for ASCII characters. Fix by providing
intentionally ASCII-only case-folding functions and using these where
appropriate. Per bug #7913 from Adnan Dursun. Back-patch to all active
branches, since it's been like this for a long time.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A materialized view has a rule just like a view and a heap and
other physical properties like a table. The rule is only used to
populate the table, references in queries refer to the
materialized data.
This is a minimal implementation, but should still be useful in
many cases. Currently data is only populated "on demand" by the
CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW statements.
It is expected that future releases will add incremental updates
with various timings, and that a more refined concept of defining
what is "fresh" data will be developed. At some point it may even
be possible to have queries use a materialized in place of
references to underlying tables, but that requires the other
above-mentioned features to be working first.
Much of the documentation work by Robert Haas.
Review by Noah Misch, Thom Brown, Robert Haas, Marko Tiikkaja
Security review by KaiGai Kohei, with a decision on how best to
implement sepgsql still pending.
|
|
|
|
|
|
|
|
| |
This program relies on rm_desc backend routines and the xlogreader
infrastructure to emit human-readable rendering of WAL records.
Author: Andres Freund, with many reworks by Álvaro
Reviewed (in a much earlier version) by Peter Eisentraut
|
|
|
|
|
|
|
|
| |
This generalizes the existing ALTER ROLE ... SET and ALTER DATABASE
... SET functionality to allow creating settings that apply to all users
in all databases.
reviewed by Pavel Stehule
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libpgcommon is a new static library to allow sharing code among the
various frontend programs and backend; this lets us eliminate duplicate
implementations of common routines. We avoid libpgport, because that's
intended as a place for porting issues; per discussion, it seems better
to keep them separate.
The first use case, and the only implemented by this patch, is pg_malloc
and friends, which many frontend programs were already using.
At the same time, we can use this to provide palloc emulation functions
for the frontend; this way, some palloc-using files in the backend can
also be used by the frontend cleanly. To do this, we change palloc() in
the backend to be a function instead of a macro on top of
MemoryContextAlloc(). This was previously believed to cause loss of
performance, but this implementation has been tweaked by Tom and Andres
so that on modern compilers it provides a slight improvement over the
previous one.
This lets us clean up some places that were already with
localized hacks.
Most of the pg_malloc/palloc changes in this patch were authored by
Andres Freund. Zoltán Böszörményi also independently provided a form of
that. libpgcommon infrastructure was authored by Álvaro.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch addresses the problem that applications currently have to
extract object names from possibly-localized textual error messages,
if they want to know for example which index caused a UNIQUE_VIOLATION
failure. It adds new error message fields to the wire protocol, which
can carry the name of a table, table column, data type, or constraint
associated with the error. (Since the protocol spec has always instructed
clients to ignore unrecognized field types, this should not create any
compatibility problem.)
Support for providing these new fields has been added to just a limited set
of error reports (mainly, those in the "integrity constraint violation"
SQLSTATE class), but we will doubtless add them to more calls in future.
Pavel Stehule, reviewed and extensively revised by Peter Geoghegan, with
additional hacking by Tom Lane.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan. The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes. Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids. So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.
This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution. There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse. We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces two additional lock modes for tuples: "SELECT FOR
KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each
other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
FOR UPDATE". UPDATE commands that do not modify the values stored in
the columns that are part of the key of the tuple now grab a SELECT FOR
NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
with tuple locks of the FOR KEY SHARE variety.
Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
means the concurrency improvement applies to them, which is the whole
point of this patch.
The added tuple lock semantics require some rejiggering of the multixact
module, so that the locking level that each transaction is holding can
be stored alongside its Xid. Also, multixacts now need to persist
across server restarts and crashes, because they can now represent not
only tuple locks, but also tuple updates. This means we need more
careful tracking of lifetime of pg_multixact SLRU files; since they now
persist longer, we require more infrastructure to figure out when they
can be removed. pg_upgrade also needs to be careful to copy
pg_multixact files over from the old server to the new, or at least part
of multixact.c state, depending on the versions of the old and new
servers.
Tuple time qualification rules (HeapTupleSatisfies routines) need to be
careful not to consider tuples with the "is multi" infomask bit set as
being only locked; they might need to look up MultiXact values (i.e.
possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
whereas they previously were assured to only use information readily
available from the tuple header. This is considered acceptable, because
the extra I/O would involve cases that would previously cause some
commands to block waiting for concurrent transactions to finish.
Another important change is the fact that locking tuples that have
previously been updated causes the future versions to be marked as
locked, too; this is essential for correctness of foreign key checks.
This causes additional WAL-logging, also (there was previously a single
WAL record for a locked tuple; now there are as many as updated copies
of the tuple there exist.)
With all this in place, contention related to tuples being checked by
foreign key rules should be much reduced.
As a bonus, the old behavior that a subtransaction grabbing a stronger
tuple lock than the parent (sub)transaction held on a given tuple and
later aborting caused the weaker lock to be lost, has been fixed.
Many new spec files were added for isolation tester framework, to ensure
overall behavior is sane. There's probably room for several more tests.
There were several reviewers of this patch; in particular, Noah Misch
and Andres Freund spent considerable time in it. Original idea for the
patch came from Simon Riggs, after a problem report by Joel Jacobson.
Most code is from me, with contributions from Marti Raudsepp, Alexander
Shulgin, Noah Misch and Andres Freund.
This patch was discussed in several pgsql-hackers threads; the most
important start at the following message-ids:
AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
1290721684-sup-3951@alvh.no-ip.org
1294953201-sup-2099@alvh.no-ip.org
1320343602-sup-2290@alvh.no-ip.org
1339690386-sup-8927@alvh.no-ip.org
4FE5FF020200002500048A3D@gw.wicourts.gov
4FEAB90A0200002500048B7D@gw.wicourts.gov
|
|
|
|
| |
Dimitri Fontaine, with slight changes by me
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code in PostPrepare_Locks supposed that it could reassign locks to
the prepared transaction's dummy PGPROC by deleting the PROCLOCK table
entries and immediately creating new ones. This was safe when that code
was written, but since we invented partitioning of the shared lock table,
it's not safe --- another process could steal away the PROCLOCK entry in
the short interval when it's on the freelist. Then, if we were otherwise
out of shared memory, PostPrepare_Locks would have to PANIC, since it's
too late to back out of the PREPARE at that point.
Fix by inventing a dynahash.c function to atomically update a hashtable
entry's key. (This might possibly have other uses in future.)
This is an ancient bug that in principle we ought to back-patch, but the
odds of someone hitting it in the field seem really tiny, because (a) the
risk window is small, and (b) nobody runs servers with maxed-out lock
tables for long, because they'll be getting non-PANIC out-of-memory errors
anyway. So fixing it in HEAD seems sufficient, at least until the new
code has gotten some testing.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit 71450d7fd6c7cf7b3e38ac56e363bff6a681973c, we added code to inform
suitably-intelligent compilers that ereport() doesn't return if the elevel
is ERROR or higher. This patch extends that to elog(), and also fixes a
double-evaluation hazard that the previous commit created in ereport(),
as well as reducing the emitted code size.
The elog() improvement requires the compiler to support __VA_ARGS__, which
should be available in just about anything nowadays since it's required by
C99. But our minimum language baseline is still C89, so add a configure
test for that.
The previous commit assumed that ereport's elevel could be evaluated twice,
which isn't terribly safe --- there are already counterexamples in xlog.c.
On compilers that have __builtin_constant_p, we can use that to protect the
second test, since there's no possible optimization gain if the compiler
doesn't know the value of elevel. Otherwise, use a local variable inside
the macros to prevent double evaluation. The local-variable solution is
inferior because (a) it leads to useless code being emitted when elevel
isn't constant, and (b) it increases the optimization level needed for the
compiler to recognize that subsequent code is unreachable. But it seems
better than not teaching non-gcc compilers about unreachability at all.
Lastly, if the compiler has __builtin_unreachable(), we can use that
instead of abort(), resulting in a noticeable code savings since no
function call is actually emitted. However, it seems wise to do this only
in non-assert builds. In an assert build, continue to use abort(), so that
the behavior will be predictable and debuggable if the "impossible"
happens.
These changes involve making the ereport and elog macros emit do-while
statement blocks not just expressions, which forces small changes in
a few call sites.
Andres Freund, Tom Lane, Heikki Linnakangas
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
|
|
|
|
|
| |
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
|
|
|
|
|
|
|
|
| |
Ensure comments accurately reflect state of code
given new understanding, and recent changes.
Include example code from Noah Misch to
illustrate how rd_newRelfilenodeSubid can be
reset deterministically. No code changes.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During crash recovery, we remove disk files belonging to temporary tables,
but the system catalog entries for such tables are intentionally not
cleaned up right away. Instead, the first backend that uses a temp schema
is expected to clean out any leftover objects therein. This approach
requires that we be careful to ignore leftover temp tables (since any
actual access attempt would fail), *even if their BackendId matches our
session*, if we have not yet established use of the session's corresponding
temp schema. That worked fine in the past, but was broken by commit
debcec7dc31a992703911a9953e299c8d730c778 which incorrectly removed the
rd_islocaltemp relcache flag. Put it back, and undo various changes
that substituted tests like "rel->rd_backend == MyBackendId" for use
of a state-aware flag. Per trouble report from Heikki Linnakangas.
Back-patch to 9.1 where the erroneous change was made. In the back
branches, be careful to add rd_islocaltemp in a spot in the struct that
was alignment padding before, so as not to break existing add-on code.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes "simple" views automatically updatable, without the need
to create either INSTEAD OF triggers or INSTEAD rules. "Simple" views
are those classified as updatable according to SQL-92 rules. The rewriter
transforms INSERT/UPDATE/DELETE commands on such views directly into an
equivalent command on the underlying table, which will generally have
noticeably better performance than is possible with either triggers or
user-written rules. A view that has INSTEAD OF triggers or INSTEAD rules
continues to operate the same as before.
For the moment, security_barrier views are not considered simple.
Also, we do not support WITH CHECK OPTION. These features may be
added in future.
Dean Rasheed, reviewed by Amit Kapila
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a relfilenode is created in this subtransaction or
a committed child transaction and it cannot otherwise
be seen by our own process, mark tuples committed ahead
of transaction commit for all COPY commands in same
transaction. If FREEZE specified on COPY
and pre-conditions met then rows will also be frozen.
Both options designed to avoid revisiting rows after commit,
increasing performance of subsequent commands after
data load and upgrade. pg_restore changes later.
Simon Riggs, review comments from Heikki Linnakangas, Noah Misch and design
input from Tom Lane, Robert Haas and Kevin Grittner
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
errcontext() is typically used in an error context callback function, not
within an ereport() invocation like e.g errmsg and errdetail are. That means
that the message domain that the TEXTDOMAIN magic in ereport() determines
is not the right one for the errcontext() calls. The message domain needs to
be determined by the C file containing the errcontext() call, not the file
containing the ereport() call.
Fix by turning errcontext() into a macro that passes the TEXTDOMAIN to use
for the errcontext message. "errcontext" was used in a few places as a
variable or struct field name, I had to rename those out of the way, now
that errcontext is a macro.
We've had this problem all along, but this isn't doesn't seem worth
backporting. It's a fairly minor issue, and turning errcontext from a
function to a macro requires at least a recompile of any external code that
calls errcontext().
|
|
|
|
|
|
|
|
|
|
|
| |
We used to send structs wrapped in CopyData messages, which works as long as
the client and server agree on things like endianess, timestamp format and
alignment. That's good enough for running a standby server, which has to run
on the same platform anyway, but it's useful for tools like pg_receivexlog
to work across platforms.
This breaks protocol compatibility of streaming replication, but we never
promised that to be compatible across versions, anyway.
|
|
|
|
|
|
|
|
| |
dlist_delete, dlist_insert_after, dlist_insert_before, slist_insert_after
do not need access to the list header, and indeed insisting on that negates
one of the main advantages of a doubly-linked list.
In consequence, revert addition of "cache_bucket" field to CatCTup.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Provide a common implementation of embedded singly-linked and
doubly-linked lists. "Embedded" in the sense that the nodes'
next/previous pointers exist within some larger struct; this design
choice reduces memory allocation overhead.
Most of the implementation uses inlineable functions (where supported),
for performance.
Some existing uses of both types of lists have been converted to the new
code, for demonstration purposes. Other uses can (and probably will) be
converted in the future. Since dllist.c is unused after this conversion,
it has been removed.
Author: Andres Freund
Some tweaks by me
Reviewed by Tom Lane, Peter Geoghegan
|
|
|
|
|
|
| |
The former name was too likely to conflict with symbols from external
headers; and, as seen in recent buildfarm failures in member spoonbill,
it has now happened at least in plpython.
|
|
|
|
|
|
|
| |
We already had those, but they forced modules to spell out the function
bodies twice. Eliminate some duplicates we had already grown.
Extracted from a somewhat larger patch from Andres Freund.
|
|
|
|
|
|
|
| |
This allows easily splitting configuration into many files, deployed in a
directory.
Magnus Hagander, Greg Smith, Selena Deckelmann, reviewed by Noah Misch.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous scheme had bugs in some corner cases involving tables that had
been renamed since a view was made. This could result in dumped views that
failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd
Albin, as well as in some pgsql-hackers discussion back in January. Also,
its behavior for printing EXPLAIN plans was sometimes confusing because of
willingness to use the same alias for multiple RTEs (it was Ashutosh
Bapat's complaint about that aspect that started the January thread).
To fix, ensure that each RTE in the query has a unique unqualified alias,
by modifying the alias if necessary (we add "_" and digits as needed to
create a non-conflicting name). Then we can just print its variables with
that alias, avoiding the confusing and bug-prone scheme of sometimes
schema-qualifying variable names. In EXPLAIN, it proves to be expedient to
take the further step of only assigning such aliases to RTEs that are
actually referenced in the query, since the planner has a habit of
generating extra RTEs with the same alias in situations such as
inheritance-tree expansion.
Although this fixes a bug of very long standing, I'm hesitant to back-patch
such a noticeable behavioral change. My experiments while creating a
regression test convinced me that actually incorrect output (as opposed to
confusing output) occurs only in very narrow cases, which is backed up by
the lack of previous complaints from the field. So we may be better off
living with it in released branches; and in any case it'd be smart to let
this ripen awhile in HEAD before we consider back-patching it.
|
|
|
|
|
|
|
| |
the week via ISO or Gregorian designations. The fix is to store the
day-of-week consistently as 1-7, Sunday = 1.
Fixes bug reported by Marc Munro
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reduces unnecessary exposure of other headers through htup.h, which
is very widely included by many files.
I have chosen to move the function prototypes to the new file as well,
because that means htup.h no longer needs to include tupdesc.h. In
itself this doesn't have much effect in indirect inclusion of tupdesc.h
throughout the tree, because it's also required by execnodes.h; but it's
something to explore in the future, and it seemed best to do the htup.h
change now while I'm busy with it.
|
|
|
|
|
| |
Instead, place a forward struct declaration for struct catclist in
syscache.h. This reduces header proliferation somewhat.
|