postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
...
*	Fix CREATE TABLE / LIKE with bigint identity column	Peter Eisentraut	2018-03-13
\| \| \| \| \| \| \| \| \| \| \|	CREATE TABLE / LIKE with a bigint identity column would fail on platforms where long is 32 bits. Copying the sequence values used makeInteger(), which would truncate the 64-bit sequence data to 32 bits. To fix, use makeFloat() instead, like the parser. (This does not actually make use of floats, but stores the values as strings.) Bug: #15096 Reviewed-by: Michael Paquier <michael@paquier.xyz>
*	Update copyright for 2018	Bruce Momjian	2018-01-02
\| \| \| \|	Backpatch-through: certain files through 9.3
*	Remove unnecessary PG_TRY overhead for CurrentResourceOwner changes.	Tom Lane	2017-10-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	resowner/README contained advice to use a PG_TRY block to restore the old CurrentResourceOwner value anywhere that that variable is transiently changed. That advice was only inconsistently followed, however, and on reflection it seems like unnecessary overhead. We don't bother with such a convention for transient CurrentMemoryContext changes, on the grounds that any (sub)transaction abort will start out by resetting CurrentMemoryContext to what it wants. But the same is true of CurrentResourceOwner, so there seems no need to treat it differently. Hence, remove PG_TRY blocks that exist only to restore CurrentResourceOwner before re-throwing the error. There are a couple of places that restore it along with some other actions, and I left those alone; the restore is probably unnecessary but no noticeable gain will result from removing it. Discussion: https://postgr.es/m/5236.1507583529@sss.pgh.pa.us
*	For wal_consistency_checking, mask page checksum as well as page LSN.	Robert Haas	2017-09-22
\| \| \| \| \| \| \| \|	If the LSN is different, the checksum will be different, too. Ashwin Agrawal, reviewed by Michael Paquier and Kuntal Ghosh Discussion: http://postgr.es/m/CALfoeis5iqrAU-+JAN+ZzXkpPr7+-0OAGv7QUHwFn=-wDy4o4Q@mail.gmail.com
*	Fix typo	Peter Eisentraut	2017-08-14
\| \| \| \|	Author: Masahiko Sawada <sawada.mshk@gmail.com>
*	Phase 3 of pgindent updates.	Tom Lane	2017-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
*	Phase 2 of pgindent updates.	Tom Lane	2017-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
*	Use RangeVarGetRelidExtended() in AlterSequence()	Peter Eisentraut	2017-06-16
\| \| \| \| \| \|	This allows us to combine the opening and the ownership check. Reported-by: Robert Haas <robertmhaas@gmail.com>
*	Fix ALTER SEQUENCE OWNED BY to not rewrite the sequence relation.	Tom Lane	2017-06-12
\| \| \| \| \| \| \| \| \| \| \|	It's not necessary for it to do that, since OWNED BY requires only ordinary catalog updates and doesn't affect future sequence values. And pg_upgrade needs to use OWNED BY without having it change the sequence's relfilenode. Commit 3d79013b9 broke this by making all forms of ALTER SEQUENCE change the relfilenode; that seems to be the explanation for the hard-to-reproduce buildfarm failures we've been seeing since then. Discussion: https://postgr.es/m/19785.1497215827@sss.pgh.pa.us
*	Modify sequence catalog tuple before invoking post alter hook.	Andres Freund	2017-06-01
\| \| \| \| \| \| \| \| \|	This seems to have been broken in the commit (1753b1b027035029) that moved the sequence definition into pg_sequence. Author: Andres Freund Discussion: https://postgr.es/m/20170601000716.qxg7c46ukkiljjb3@alap3.anarazel.de Backpatch: Bug is in master/v10 only
*	Make ALTER SEQUENCE, including RESTART, fully transactional.	Andres Freund	2017-06-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously the changes to the "data" part of the sequence, i.e. the one containing the current value, were not transactional, whereas the definition, including minimum and maximum value were. That leads to odd behaviour if a schema change is rolled back, with the potential that out-of-bound sequence values can be returned. To avoid the issue create a new relfilenode fork whenever ALTER SEQUENCE is executed, similar to how TRUNCATE ... RESTART IDENTITY already is already handled. This commit also makes ALTER SEQUENCE RESTART transactional, as it seems to be too confusing to have some forms of ALTER SEQUENCE behave transactionally, some forms not. This way setval() and nextval() are not transactional, but DDL is, which seems to make sense. This commit also rolls back parts of the changes made in 3d092fe540 and f8dc1985f as they're now not needed anymore. Author: Andres Freund Discussion: https://postgr.es/m/20170522154227.nvafbsm62sjpbxvd@alap3.anarazel.de Backpatch: Bug is in master/v10 only
*	Post-PG 10 beta1 pgindent run	Bruce Momjian	2017-05-17
\| \| \| \|	perltidy run not included.
*	Fix ALTER SEQUENCE locking	Peter Eisentraut	2017-05-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In 1753b1b027035029c2a2a1649065762fafbf63f3, the pg_sequence system catalog was introduced. This made sequence metadata changes transactional, while the actual sequence values are still behaving nontransactionally. This requires some refinement in how ALTER SEQUENCE, which operates on both, locks the sequence and the catalog. The main problems were: - Concurrent ALTER SEQUENCE causes "tuple concurrently updated" error, caused by updates to pg_sequence catalog. - Sequence WAL writes and catalog updates are not protected by same lock, which could lead to inconsistent recovery order. - nextval() disregarding uncommitted ALTER SEQUENCE changes. To fix, nextval() and friends now lock the sequence using RowExclusiveLock instead of AccessShareLock. ALTER SEQUENCE locks the sequence using ShareRowExclusiveLock. This means that nextval() and ALTER SEQUENCE block each other, and ALTER SEQUENCE on the same sequence blocks itself. (This was already the case previously for the OWNER TO, RENAME, and SET SCHEMA variants.) Also, rearrange some code so that the entire AlterSequence is protected by the lock on the sequence. As an exception, use reduced locking for ALTER SEQUENCE ... RESTART. Since that is basically a setval(), it does not require the full locking of other ALTER SEQUENCE actions. So check whether we are only running a RESTART and run with less locking if so. Reviewed-by: Michael Paquier <michael.paquier@gmail.com> Reported-by: Jason Petersen <jason@citusdata.com> Reported-by: Andres Freund <andres@anarazel.de>
*	Avoid unnecessary catalog updates in ALTER SEQUENCE	Peter Eisentraut	2017-05-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ALTER SEQUENCE can do nontransactional changes to the sequence (RESTART clause) and transactional updates to the pg_sequence catalog (most other clauses). When just calling RESTART, the code would still needlessly do a catalog update without any changes. This would entangle that operation in the concurrency issues of a catalog update (causing either locking or concurrency errors, depending on how that issue is to be resolved). Fix by keeping track during options parsing whether a catalog update is needed, and skip it if not. Reported-by: Jason Petersen <jason@citusdata.com>
*	Fix crash when partitioned column specified twice.	Robert Haas	2017-04-28
\| \| \| \| \| \|	Amit Langote, reviewed by Beena Emerson Discussion: http://postgr.es/m/6ed23d3d-c09d-4cbc-3628-0a8a32f750f4@lab.ntt.co.jp
*	Identity columns	Peter Eisentraut	2017-04-06
\| \| \| \| \| \| \| \| \| \| \| \| \|	This is the SQL standard-conforming variant of PostgreSQL's serial columns. It fixes a few usability issues that serial columns have: - CREATE TABLE / LIKE copies default but refers to same sequence - cannot add/drop serialness with ALTER TABLE - dropping default does not drop sequence - need to grant separate privileges to sequence - other slight weirdnesses because serial is some kind of special macro Reviewed-by: Vitaly Burovoy <vitaly.burovoy@gmail.com>
*	Adjust min/max values when changing sequence type	Peter Eisentraut	2017-04-04
\| \| \| \| \| \| \| \| \| \|	When changing the type of a sequence, adjust the min/max values of the sequence if it looks like the previous values were the default values. Previously, it would leave the old values in place, requiring manual adjustments even in the usual/default cases. Reviewed-by: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: Vitaly Burovoy <vitaly.burovoy@gmail.com>
*	Use wrappers of PG_DETOAST_DATUM_PACKED() more.	Noah Misch	2017-03-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This makes almost all core code follow the policy introduced in the previous commit. Specific decisions: - Text search support functions with char* and length arguments, such as prsstart and lexize, may receive unaligned strings. I doubt maintainers of non-core text search code will notice. - Use plain VARDATA() on values detoasted or synthesized earlier in the same function. Use VARDATA_ANY() on varlenas sourced outside the function, even if they happen to always have four-byte headers. As an exception, retain the universal practice of using VARDATA() on return values of SendFunctionCall(). - Retain PG_GETARG_BYTEA_P() in pageinspect. (Page images are too large for a one-byte header, so this misses no optimization.) Sites that do not call get_page_from_raw() typically need the four-byte alignment. - For now, do not change btree_gist. Its use of four-byte headers in memory is partly entangled with storage of 4-byte headers inside GBT_VARKEY, on disk. - For now, do not change gtrgm_consistent() or gtrgm_distance(). They incorporate the varlena header into a cache, and there are multiple credible implementation strategies to consider.
*	Use SQL standard error code for nextval	Peter Eisentraut	2017-03-09
\|
*	Add CREATE SEQUENCE AS <data type> clause	Peter Eisentraut	2017-02-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This stores a data type, required to be an integer type, with the sequence. The sequences min and max values default to the range supported by the type, and they cannot be set to values exceeding that range. The internal implementation of the sequence is not affected. Change the serial types to create sequences of the appropriate type. This makes sure that the min and max values of the sequence for a serial column match the range of values supported by the table column. So the sequence can no longer overflow the table column. This also makes monitoring for sequence exhaustion/wraparound easier, which currently requires various contortions to cross-reference the sequences with the table columns they are used with. This commit also effectively reverts the pg_sequence column reordering in f3b421da5f4addc95812b9db05a24972b8fd9739, because the new seqtypid column allows us to fill the hole in the struct and create a more natural overall column ordering. Reviewed-by: Steve Singer <steve@ssinger.info> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
*	Add WAL consistency checking facility.	Robert Haas	2017-02-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the new GUC wal_consistency_checking is set to a non-empty value, it triggers recording of additional full-page images, which are compared on the standby against the results of applying the WAL record (without regard to those full-page images). Allowable differences such as hints are masked out, and the resulting pages are compared; any difference results in a FATAL error on the standby. Kuntal Ghosh, based on earlier patches by Michael Paquier and Heikki Linnakangas. Extensively reviewed and revised by Michael Paquier and by me, with additional reviews and comments from Amit Kapila, Álvaro Herrera, Simon Riggs, and Peter Eisentraut.
*	Provide CatalogTupleDelete() as a wrapper around simple_heap_delete().	Tom Lane	2017-02-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This extends the work done in commit 2f5c9d9c9 to provide a more nearly complete abstraction layer hiding the details of index updating for catalog changes. That commit only invented abstractions for catalog inserts and updates, leaving nearby code for catalog deletes still calling the heap-level routines directly. That seems rather ugly from here, and it does little to help if we ever want to shift to a storage system in which indexing work is needed at delete time. Hence, create a wrapper function CatalogTupleDelete(), and replace calls of simple_heap_delete() on catalog tuples with it. There are now very few direct calls of [simple_]heap_delete remaining in the tree. Discussion: https://postgr.es/m/462.1485902736@sss.pgh.pa.us
*	Tweak catalog indexing abstraction for upcoming WARM	Alvaro Herrera	2017-01-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Split the existing CatalogUpdateIndexes into two different routines, CatalogTupleInsert and CatalogTupleUpdate, which do both the heap insert/update plus the index update. This removes over 300 lines of boilerplate code all over src/backend/catalog/ and src/backend/commands. The resulting code is much more pleasing to the eye. Also, by encapsulating what happens in detail during an UPDATE, this facilitates the upcoming WARM patch, which is going to add a few more lines to the update case making the boilerplate even more boring. The original CatalogUpdateIndexes is removed; there was only one use left, and since it's just three lines, we can as well expand it in place there. We could keep it, but WARM is going to break all the UPDATE out-of-core callsites anyway, so there seems to be no benefit in doing so. Author: Pavan Deolasee Discussion: https://www.postgr.es/m/CABOikdOcFYSZ4vA2gYfs=M2cdXzXX4qGHeEiW3fu9PCfkHLa2A@mail.gmail.com
*	Fix default minimum value for descending sequences	Peter Eisentraut	2017-01-23
\| \| \| \| \| \| \| \| \| \| \|	For some reason that is lost in history, a descending sequence would default its minimum value to -2^63+1 (-PG_INT64_MAX) instead of -2^63 (PG_INT64_MIN), even though explicitly specifying a minimum value of -2^63 would work. Fix this inconsistency by using the full range by default. Reported-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
*	Move some things from builtins.h to new header files	Peter Eisentraut	2017-01-20
\| \| \| \|	This avoids that builtins.h has to include additional header files.
*	Update copyright via script for 2017	Bruce Momjian	2017-01-03
\|
*	Reorder pg_sequence columns to avoid alignment issue	Peter Eisentraut	2016-12-21
\| \| \| \| \| \| \| \| \| \| \| \|	On AIX, doubles are aligned at 4 bytes, but int64 is aligned at 8 bytes. Our code assumes that doubles have alignment that can also be applied to int64, but that fails in this case. One effect is that heap_form_tuple() writes tuples in a different layout than Form_pg_sequence expects. Rather than rewrite the whole alignment code, work around the issue by reordering the columns in pg_sequence so that the first int64 column naturally comes out at an 8-byte boundary.
*	Add pg_sequence system catalog	Peter Eisentraut	2016-12-20
\| \| \| \| \| \| \| \| \| \|	Move sequence metadata (start, increment, etc.) into a proper system catalog instead of storing it in the sequence heap object. This separates the metadata from the sequence data. Sequence metadata is now operated on transactionally by DDL commands, whereas previously rollbacks of sequence-related DDL commands would be ignored. Reviewed-by: Andreas Karlsson <andreas@proxel.se>
*	Implement table partitioning.	Robert Haas	2016-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
*	Add pg_sequences view	Peter Eisentraut	2016-11-18
\| \| \| \| \| \| \| \| \| \| \| \| \|	Like pg_tables, pg_views, and others, this view contains information about sequences in a way that is independent of the system catalog layout but more comprehensive than the information schema. To help implement the view, add a new internal function pg_sequence_last_value() to return the last value of a sequence. This is kept separate from pg_sequence_parameters() to separate querying run-time state from catalog-like information. Reviewed-by: Andreas Karlsson <andreas@proxel.se>
*	Add location field to DefElem	Peter Eisentraut	2016-09-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a location field to the DefElem struct, used to parse many utility commands. Update various error messages to supply error position information. To propogate the error position information in a more systematic way, create a ParseState in standard_ProcessUtility() and pass that to interested functions implementing the utility commands. This seems better than passing the query string and then reassembling a parse state ad hoc, which violates the encapsulation of the ParseState type. Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
*	Revert no-op changes to BufferGetPage()	Kevin Grittner	2016-04-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The reverted changes were intended to force a choice of whether any newly-added BufferGetPage() calls needed to be accompanied by a test of the snapshot age, to support the "snapshot too old" feature. Such an accompanying test is needed in about 7% of the cases, where the page is being used as part of a scan rather than positioning for other purposes (such as DML or vacuuming). The additional effort required for back-patching, and the doubt whether the intended benefit would really be there, have indicated it is best just to rely on developers to do the right thing based on comments and existing usage, as we do with many other conventions. This change should have little or no effect on generated executable code. Motivated by the back-patching pain of Tom Lane and Robert Haas
*	Modify BufferGetPage() to prepare for "snapshot too old" feature	Kevin Grittner	2016-04-08
\| \| \| \| \| \| \| \| \| \| \|	This patch is a no-op patch which is intended to reduce the chances of failures of omission once the functional part of the "snapshot too old" patch goes in. It adds parameters for snapshot, relation, and an enum to specify whether the snapshot age check needs to be done for the page at this point. This initial patch passes NULL for the first two new parameters and BGP_NO_SNAPSHOT_TEST for the third. The follow-on patch will change the places where the test needs to be made.
*	Update copyright for 2016	Bruce Momjian	2016-01-02
\| \| \| \|	Backpatch certain files through 9.1
*	pgindent run for 9.5	Bruce Momjian	2015-05-23
\|
*	Collection of typo fixes.	Heikki Linnakangas	2015-05-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use "a" and "an" correctly, mostly in comments. Two error messages were also fixed (they were just elogs, so no translation work required). Two function comments in pg_proc.h were also fixed. Etsuro Fujita reported one of these, but I found a lot more with grep. Also fix a few other typos spotted while grepping for the a/an typos. For example, "consists out of ..." -> "consists of ...". Plus a "though"/ "through" mixup reported by Euler Taveira. Many of these typos were in old code, which would be nice to backpatch to make future backpatching easier. But much of the code was new, and I didn't feel like crafting separate patches for each branch. So no backpatching.
*	Create an infrastructure for parallel computation in PostgreSQL.	Robert Haas	2015-04-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This does four basic things. First, it provides convenience routines to coordinate the startup and shutdown of parallel workers. Second, it synchronizes various pieces of state (e.g. GUCs, combo CID mappings, transaction snapshot) from the parallel group leader to the worker processes. Third, it prohibits various operations that would result in unsafe changes to that state while parallelism is active. Finally, it propagates events that would result in an ErrorResponse, NoticeResponse, or NotifyResponse message being sent to the client from the parallel workers back to the master, from which they can then be sent on to the client. Robert Haas, Amit Kapila, Noah Misch, Rushabh Lathia, Jeevan Chalke. Suggestions and review from Andres Freund, Heikki Linnakangas, Noah Misch, Simon Riggs, Euler Taveira, and Jim Nasby.
*	Change many routines to return ObjectAddress rather than OID	Alvaro Herrera	2015-03-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The changed routines are mostly those that can be directly called by ProcessUtilitySlow; the intention is to make the affected object information more precise, in support for future event trigger changes. Originally it was envisioned that the OID of the affected object would be enough, and in most cases that is correct, but upon actually implementing the event trigger changes it turned out that ObjectAddress is more widely useful. Additionally, some command execution routines grew an output argument that's an object address which provides further info about the executed command. To wit: * for ALTER DOMAIN / ADD CONSTRAINT, it corresponds to the address of the new constraint * for ALTER OBJECT / SET SCHEMA, it corresponds to the address of the schema that originally contained the object. * for ALTER EXTENSION {ADD, DROP} OBJECT, it corresponds to the address of the object added to or dropped from the extension. There's no user-visible change in this commit, and no functional change either. Discussion: 20150218213255.GC6717@tamriel.snowman.net Reviewed-By: Stephen Frost, Andres Freund
*	Reconsider when to wait for WAL flushes/syncrep during commit.	Andres Freund	2015-02-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Up to now RecordTransactionCommit() waited for WAL to be flushed (if synchronous_commit != off) and to be synchronously replicated (if enabled), even if a transaction did not have a xid assigned. The primary reason for that is that sequence's nextval() did not assign a xid, but are worthwhile to wait for on commit. This can be problematic because sometimes read only transactions do write WAL, e.g. HOT page prune records. That then could lead to read only transactions having to wait during commit. Not something people expect in a read only transaction. This lead to such strange symptoms as backends being seemingly stuck during connection establishment when all synchronous replicas are down. Especially annoying when said stuck connection is the standby trying to reconnect to allow syncrep again... This behavior also is involved in a rather complicated <= 9.4 bug where the transaction started by catchup interrupt processing waited for syncrep using latches, but didn't get the wakeup because it was already running inside the same overloaded signal handler. Fix the issue here doesn't properly solve that issue, merely papers over the problems. In 9.5 catchup interrupts aren't processed out of signal handlers anymore. To fix all this, make nextval() acquire a top level xid, and only wait for transaction commit if a transaction both acquired a xid and emitted WAL records. If only a xid has been assigned we don't uselessly want to wait just because of writes to temporary/unlogged tables; if only WAL has been written we don't want to wait just because of HOT prunes. The xid assignment in nextval() is unlikely to cause overhead in real-world workloads. For one it only happens SEQ_LOG_VALS/32 values anyway, for another only usage of nextval() without using the result in an insert or similar is affected. Discussion: 20150223165359.GF30784@awork2.anarazel.de, 369698E947874884A77849D8FE3680C2@maumau, 5CF4ABBA67674088B3941894E22A0D25@maumau Per complaint from maumau and Thom Brown Backpatch all the way back; 9.0 doesn't have syncrep, but it seems better to be consistent behavior across all maintained branches.
*	Update copyright for 2015	Bruce Momjian	2015-01-06
\| \| \| \|	Backpatch certain files through 9.0
*	Improve hash_create's API for selecting simple-binary-key hash functions.	Tom Lane	2014-12-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, if you wanted anything besides C-string hash keys, you had to specify a custom hashing function to hash_create(). Nearly all such callers were specifying tag_hash or oid_hash; which is tedious, and rather error-prone, since a caller could easily miss the opportunity to optimize by using hash_uint32 when appropriate. Replace this with a design whereby callers using simple binary-data keys just specify HASH_BLOBS and don't need to mess with specific support functions. hash_create() itself will take care of optimizing when the key size is four bytes. This nets out saving a few hundred bytes of code space, and offers a measurable performance improvement in tidbitmap.c (which was not exploiting the opportunity to use hash_uint32 for its 4-byte keys). There might be some wins elsewhere too, I didn't analyze closely. In future we could look into offering a similar optimized hashing function for 8-byte keys. Under this design that could be done in a centralized and machine-independent fashion, whereas getting it right for keys of platform-dependent sizes would've been notationally painful before. For the moment, the old way still works fine, so as not to break source code compatibility for loadable modules. Eventually we might want to remove tag_hash and friends from the exported API altogether, since there's no real need for them to be explicitly referenced from outside dynahash.c. Teodor Sigaev and Tom Lane
*	Revamp the WAL record format.	Heikki Linnakangas	2014-11-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Each WAL record now carries information about the modified relation and block(s) in a standardized format. That makes it easier to write tools that need that information, like pg_rewind, prefetching the blocks to speed up recovery, etc. There's a whole new API for building WAL records, replacing the XLogRecData chains used previously. The new API consists of XLogRegister* functions, which are called for each buffer and chunk of data that is added to the record. The new API also gives more control over when a full-page image is written, by passing flags to the XLogRegisterBuffer function. This also simplifies the XLogReadBufferForRedo() calls. The function can dig the relation and block number from the WAL record, so they no longer need to be passed as arguments. For the convenience of redo routines, XLogReader now disects each WAL record after reading it, copying the main data part and the per-block data into MAXALIGNed buffers. The data chunks are not aligned within the WAL record, but the redo routines can assume that the pointers returned by XLogRecGet* functions are. Redo routines are now passed the XLogReaderState, which contains the record in the already-disected format, instead of the plain XLogRecord. The new record format also makes the fixed size XLogRecord header smaller, by removing the xl_len field. The length of the "main data" portion is now stored at the end of the WAL record, and there's a separate header after XLogRecord for it. The alignment padding at the end of XLogRecord is also removed. This compansates for the fact that the new format would otherwise be more bulky than the old format. Reviewed by Andres Freund, Amit Kapila, Michael Paquier, Alvaro Herrera, Fujii Masao.
*	Fix relpersistence setting in reindex_index	Alvaro Herrera	2014-11-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Buildfarm members with CLOBBER_CACHE_ALWAYS advised us that commit 85b506bbfc2937 was mistaken in setting the relpersistence value of the index directly in the relcache entry, within reindex_index. The reason for the failure is that an invalidation message that comes after mucking with the relcache entry directly, but before writing it to the catalogs, would cause the entry to become rebuilt in place from catalogs with the old contents, losing the update. Fix by passing the correct persistence value to RelationSetNewRelfilenode instead; this routine also writes the updated tuple to pg_class, avoiding the problem. Suggested by Tom Lane.
*	Move the backup-block logic from XLogInsert to a new file, xloginsert.c.	Heikki Linnakangas	2014-11-06
\| \| \| \| \| \| \| \| \| \| \| \|	xlog.c is huge, this makes it a little bit smaller, which is nice. Functions related to putting together the WAL record are in xloginsert.c, and the lower level stuff for managing WAL buffers and such are in xlog.c. Also move the definition of XLogRecord to a separate header file. This causes churn in the #includes of all the files that write WAL records, and redo routines, but it avoids pulling in xlog.h into most places. Reviewed by Michael Paquier, Alvaro Herrera, Andres Freund and Amit Kapila.
*	Minimize calls of pg_class_aclcheck to minimum necessary	Peter Eisentraut	2014-10-22
\| \| \| \| \| \| \| \| \| \|	In a couple of code paths, pg_class_aclcheck is called in succession with multiple different modes set. This patch combines those modes to have a single call of this function and reduce a bit process overhead for permission checking. Author: Michael Paquier <michael@otacoo.com> Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
*	Implement IF NOT EXISTS for CREATE SEQUENCE.	Heikki Linnakangas	2014-08-26
\| \| \| \|	Fabrízio de Royes Mello
*	pgindent run for 9.4	Bruce Momjian	2014-05-06
\| \| \| \| \|	This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
*	Avoid transient bogus page contents when creating a sequence.	Heikki Linnakangas	2014-04-22
\| \| \| \| \| \| \| \| \| \| \| \|	Don't use simple_heap_insert to insert the tuple to a sequence relation. simple_heap_insert creates a heap insertion WAL record, and replaying that will create a regular heap page without the special area containing the sequence magic constant, which is wrong for a sequence. That was not a bug because we always created a sequence WAL record after that, and replaying that overwrote the bogus heap page, and the transient state could never be seen by another backend because it was only done when creating a new sequence relation. But it's simpler and cleaner to avoid that in the first place.
*	Update copyright for 2014	Bruce Momjian	2014-01-07
\| \| \| \| \|	Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.
*	Change the way we mark tuples as frozen.	Robert Haas	2013-12-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of changing the tuple xmin to FrozenTransactionId, the combination of HEAP_XMIN_COMMITTED and HEAP_XMIN_INVALID, which were previously never set together, is now defined as HEAP_XMIN_FROZEN. A variety of previous proposals to freeze tuples opportunistically before vacuum_freeze_min_age is reached have foundered on the objection that replacing xmin by FrozenTransactionId might hinder debugging efforts when things in this area go awry; this patch is intended to solve that problem by keeping the XID around (but largely ignoring the value to which it is set). Third-party code that checks for HEAP_XMIN_INVALID on tuples where HEAP_XMIN_COMMITTED might be set will be broken by this change. To fix, use the new accessor macros in htup_details.h rather than consulting the bits directly. HeapTupleHeaderGetXmin has been modified to return FrozenTransactionId when the infomask bits indicate that the tuple is frozen; use HeapTupleHeaderGetRawXmin when you already know that the tuple isn't marked commited or frozen, or want the raw value anyway. We currently do this in routines that display the xmin for user consumption, in tqual.c where it's known to be safe and important for the avoidance of extra cycles, and in the function-caching code for various procedural languages, which shouldn't invalidate the cache just because the tuple gets frozen. Robert Haas and Andres Freund