postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
...
*	Check for interrupts during tuple-insertion loops.	Robert Haas	2014-06-23
\| \| \| \| \| \| \| \|	Normally, this won't matter too much; but if I/O is really slow, for example because the system is overloaded, we might write many pages before checking for interrupts. A single toast insertion might write up to 1GB of data, and a multi-insert could write hundreds of tuples (and their corresponding TOAST data).
*	Fix bug in WAL_DEBUG.	Heikki Linnakangas	2014-06-23
\| \| \| \| \|	The record header was not copied correctly to the buffer that was passed to the rm_desc function. Broken by my rm_desc signature refactoring patch.
*	Don't allow to disable backend assertions via the debug_assertions GUC.	Andres Freund	2014-06-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The existance of the assert_enabled variable (backing the debug_assertions GUC) reduced the amount of knowledge some static code checkers (like coverity and various compilers) could infer from the existance of the assertion. That could have been solved by optionally removing the assertion_enabled variable from the Assert() et al macros at compile time when some special macro is defined, but the resulting complication doesn't seem to be worth the gain from having debug_assertions. Recompiling is fast enough. The debug_assertions GUC is still available, but readonly, as it's useful when diagnosing problems. The commandline/client startup option -A, which previously also allowed to enable/disable assertions, has been removed as it doesn't serve a purpose anymore. While at it, reduce code duplication in bufmgr.c and localbuf.c assertions checking for spurious buffer pins. That code had to be reindented anyway to cope with the assert_enabled removal.
*	Change the signature of rm_desc so that it's passed a XLogRecord.	Heikki Linnakangas	2014-06-14
\| \| \| \|	Just feels more natural, and is more consistent with rm_redo.
*	Consistency improvements for slot and decoding code.	Andres Freund	2014-06-12
\| \| \| \| \| \| \| \|	Change the order of checks in similar functions to be the same; remove a parameter that's not needed anymore; rename a memory context and expand a couple of comments. Per review comments from Amit Kapila
*	Fix infinite loop when splitting inner tuples in SPGiST text indexes.	Tom Lane	2014-06-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, the code used a node label of zero both for strings that contain no bytes beyond the inner tuple's prefix, and for cases where an "allTheSame" inner tuple has to be split to allow a string with a different next byte to be inserted into it. Failing to distinguish these cases meant that if a string ending with the current prefix needed to be inserted into an allTheSame tuple, we got into an infinite loop, because after splitting the tuple we'd descend into the child allTheSame tuple and then find we need to split again. To fix, instead use -1 and -2 as the node labels for these two cases. This requires widening the node label type from "char" to int2, but fortunately SPGiST stores all pass-by-value node label types in their Datum representation, which means that this change is transparently upward compatible so far as the on-disk representation goes. We continue to recognize zero as a dummy node label for reading purposes, but will not attempt to push new index entries down into such a label, so that the loop won't occur even when dealing with an existing index. Per report from Teodor Sigaev. Back-patch to 9.2 where the faulty code was introduced.
*	Wrap multixact/members correctly during extension, take 2	Alvaro Herrera	2014-06-09
\| \| \| \| \| \| \| \|	In a50d97625497b7 I already changed this, but got it wrong for the case where the number of members is larger than the number of entries that fit in the last page of the last segment. As reported by Serge Negodyuck in a followup to bug #8673.
*	Add defenses against running with a wrong selection of LOBLKSIZE.	Tom Lane	2014-06-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's critical that the backend's idea of LOBLKSIZE match the way data has actually been divided up in pg_largeobject. While we don't provide any direct way to adjust that value, doing so is a one-line source code change and various people have expressed interest recently in changing it. So, just as with TOAST_MAX_CHUNK_SIZE, it seems prudent to record the value in pg_control and cross-check that the backend's compiled-in setting matches the on-disk data. Also tweak the code in inv_api.c so that fetches from pg_largeobject explicitly verify that the length of the data field is not more than LOBLKSIZE. Formerly we just had Asserts() for that, which is no protection at all in production builds. In some of the call sites an overlength data value would translate directly to a security-relevant stack clobber, so it seems worth one extra runtime comparison to be sure. In the back branches, we can't change the contents of pg_control; but we can still make the extra checks in inv_api.c, which will offer some amount of protection against running with the wrong value of LOBLKSIZE.
*	Consistently spell a replication slot's name as slot_name.	Andres Freund	2014-06-05
\| \| \| \| \| \| \| \| \| \| \|	Previously there's been a mix between 'slotname' and 'slot_name'. It's not nice to be unneccessarily inconsistent in a new feature. As a post beta1 initdb now is required in the wake of eeca4cd35e, fix the inconsistencies. Most the changes won't affect usage of replication slots because the majority of changes is around function parameter names. The prominent exception to that is that the recovery.conf parameter 'primary_slotname' is now named 'primary_slot_name'.
*	Adjust SP-GiST WAL record formats to reduce alignment padding.	Heikki Linnakangas	2014-06-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The way the code was written, the padding was copied from uninitialized memory areas.. Because the structs are local variables in the code where the WAL records are constructed, making them larger and zeroing the padding bytes would not make the code very pretty, so rather than fixing this directly by zeroing out the padding bytes, it seems more clear to not try to align the tuples in the WAL records. The redo functions are taught to copy the tuple header to a local variable to avoid unaligned access. Stable-branches have the same problem, but we can't change the WAL format there, so fix in master only. Reading a few random extra bytes at the stack is harmless in practice, so it's not worth crafting a different back-patchable fix. Per reports from Kevin Grittner and Andres Freund, using clang static analyzer and Valgrind, respectively.
*	Fix error when trying to delete page with half-dead left sibling.	Heikki Linnakangas	2014-05-25
\| \| \| \| \| \| \| \| \| \| \| \|	The new page deletion code didn't cope with the case the target page's right sibling was marked half-dead. It failed a sanity check which checked that the downlinks in the parent page match the lower level, because a half-dead page has no downlink. To cope, check for that condition, and just give up on the deletion if it happens. The vacuum will finish the deletion of the half-dead page when it gets there, and on the next vacuum after that the empty can be deleted. Reported by Jeff Janes.
*	Fix backup-block numbering in redo of b-tree split.	Heikki Linnakangas	2014-05-19
\| \| \| \| \| \| \| \| \| \| \| \|	I got the backup block numbers off-by-one in the commit that changed the way incomplete-splits are handled. I blame the comments, which said "backup block 1" and "backup block 2", even though the backup blocks are numbered starting from 0, in the macros and functions used in replay. Fix the comments and the code. Per Jeff Janes' bug report about corruption caused by torn page writes. The incorrect code is new in git master, but backpatch the comment change down to 9.0, where the numbering in the redo-side macros was changed.
*	Fix a bunch of functions that were declared static then defined not-static.	Tom Lane	2014-05-17
\| \| \| \|	Per testing with a compiler that whines about this.
*	Update README, we don't do post-recovery cleanup actions anymore.	Heikki Linnakangas	2014-05-17
\| \| \| \| \| \| \|	transam/README explained how B-tree incomplete splits were tracked and fixed after recovery, as an example of handling complex actions that need multiple WAL records, but that's not how it works anymore. Explain the new paradigm.
*	Initialize tsId and dbId fields in WAL record of COMMIT PREPARED.	Heikki Linnakangas	2014-05-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit dd428c79 added dbId and tsId to the xl_xact_commit struct but missed that prepared transaction commits reuse that struct. Fix that. Because those fields were left unitialized, replaying a commit prepared WAL record in a hot standby node would fail to remove the relcache init file. That can lead to "could not open file" errors on the standby. Relcache init file only needs to be removed when a system table/index is rewritten in the transaction using two phase commit, so that should be rare in practice. In HEAD, the incorrect dbId/tsId values are also used for filtering in logical replication code, causing the transaction to always be filtered out. Analysis and fix by Andres Freund. Backpatch to 9.0 where hot standby was introduced.
*	Fix race condition in preparing a transaction for two-phase commit.	Heikki Linnakangas	2014-05-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To lock a prepared transaction's shared memory entry, we used to mark it with the XID of the backend. When the XID was no longer active according to the proc array, the entry was implicitly considered as not locked anymore. However, when preparing a transaction, the backend's proc array entry was cleared before transfering the locks (and some other state) to the prepared transaction's dummy PGPROC entry, so there was a window where another backend could finish the transaction before it was in fact fully prepared. To fix, rewrite the locking mechanism of global transaction entries. Instead of an XID, just have simple locked-or-not flag in each entry (we store the locking backend's backend id rather than a simple boolean, but that's just for debugging purposes). The backend is responsible for explicitly unlocking the entry, and to make sure that that happens, install a callback to unlock it on abort or process exit. Backpatch to all supported versions.
*	Code review for recent changes in relcache.c.	Tom Lane	2014-05-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rd_replidindex should be managed the same as rd_oidindex, and rd_keyattr and rd_idattr should be managed like rd_indexattr. Omissions in this area meant that the bitmapsets computed for rd_keyattr and rd_idattr would be leaked during any relcache flush, resulting in a slow but permanent leak in CacheMemoryContext. There was also a tiny probability of relcache entry corruption if we ran out of memory at just the wrong point in RelationGetIndexAttrBitmap. Otherwise, the fields were not zeroed where expected, which would not bother the code any AFAICS but could greatly confuse anyone examining the relcache entry while debugging. Also, create an API function RelationGetReplicaIndex rather than letting non-relcache code be intimate with the mechanisms underlying caching of that value (we won't even mention the memory leak there). Also, fix a relcache flush hazard identified by Andres Freund: RelationGetIndexAttrBitmap must not assume that rd_replidindex stays valid across index_open. The aspects of this involving rd_keyattr date back to 9.3, so back-patch those changes.
*	Rename min_recovery_apply_delay to recovery_min_apply_delay.	Tom Lane	2014-05-10
\| \| \| \| \| \| \|	Per discussion, this seems like a more consistent choice of name. Fabrízio de Royes Mello, after a suggestion by Peter Eisentraut; some additional documentation wordsmithing by me
*	Fix bug in lossy-page handling in GIN	Heikki Linnakangas	2014-05-10
\| \| \| \| \| \| \| \| \| \| \|	When returning rows from a bitmap, as done with partial match queries, we would get stuck in an infinite loop if the bitmap contained a lossy page reference. This bug is new in master, it was introduced by the patch to allow skipping items refuted by other entries in GIN scans. Report and fix by Alexander Korotkov
*	Remove overeager assertion in logical_heap_begin_rewrite.	Robert Haas	2014-05-09
\| \| \| \| \| \| \|	It's legal to configure wal_level=logical and max_replication_slots=0 simultaneously. Andres Freund
*	Protect against torn pages when deleting GIN list pages.	Heikki Linnakangas	2014-05-08
\| \| \| \| \| \| \| \| \|	To-be-deleted list pages contain no useful information, as they are being deleted, but we must still protect the writes from being torn by a crash after a partial write. To do that, re-initialize the pages on WAL replay. Jeff Janes caught this with a test program to test partial writes. Backpatch to all supported versions.
*	pgindent run for 9.4	Bruce Momjian	2014-05-06
\| \| \| \| \|	This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
*	Correct comment in Hot Standby nbtree handling	Simon Riggs	2014-05-06
\| \| \| \|	Logic is correct, matching handling of LP_DEAD elsewhere.
*	Assert that pre/post-fix updated tuples are on the same page during replay.	Heikki Linnakangas	2014-05-05
\| \| \| \| \| \| \| \| \| \| \|	If they were not 'oldtup.t_data' would be dereferenced while set to NULL in case of a full page image for block 0. Do so primarily to silence coverity; but also to make sure this prerequisite isn't changed without adapting the replay routine as that would appear to work in many cases. Andres Freund
*	Fix failure to detoast fields in composite elements of structured types.	Tom Lane	2014-05-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we have an array of records stored on disk, the individual record fields cannot contain out-of-line TOAST pointers: the tuptoaster.c mechanisms are only prepared to deal with TOAST pointers appearing in top-level fields of a stored row. The same applies for ranges over composite types, nested composites, etc. However, the existing code only took care of expanding sub-field TOAST pointers for the case of nested composites, not for other structured types containing composites. For example, given a command such as UPDATE tab SET arraycol = ARRAY[(ROW(x,42)::mycompositetype] ... where x is a direct reference to a field of an on-disk tuple, if that field is long enough to be toasted out-of-line then the TOAST pointer would be inserted as-is into the array column. If the source record for x is later deleted, the array field value would become a dangling pointer, leading to errors along the line of "missing chunk number 0 for toast value ..." when the value is referenced. A reproducible test case for this was provided by Jan Pecek, but it seems likely that some of the "missing chunk number" reports we've heard in the past were caused by similar issues. Code-wise, the problem is that PG_DETOAST_DATUM() is not adequate to produce a self-contained Datum value if the Datum is of composite type. Seen in this light, the problem is not just confined to arrays and ranges, but could also affect some other places where detoasting is done in that way, for example form_index_tuple(). I tried teaching the array code to apply toast_flatten_tuple_attribute() along with PG_DETOAST_DATUM() when the array element type is composite, but this was messy and imposed extra cache lookup costs whether or not any TOAST pointers were present, indeed sometimes when the array element type isn't even composite (since sometimes it takes a typcache lookup to find that out). The idea of extending that approach to all the places that currently use PG_DETOAST_DATUM() wasn't attractive at all. This patch instead solves the problem by decreeing that composite Datum values must not contain any out-of-line TOAST pointers in the first place; that is, we expand out-of-line fields at the point of constructing a composite Datum, not at the point where we're about to insert it into a larger tuple. This rule is applied only to true composite Datums, not to tuples that are being passed around the system as tuples, so it's not as invasive as it might sound at first. With this approach, the amount of code that has to be touched for a full solution is greatly reduced, and added cache lookup costs are avoided except when there actually is a TOAST pointer that needs to be inlined. The main drawback of this approach is that we might sometimes dereference a TOAST pointer that will never actually be used by the query, imposing a rather large cost that wasn't there before. On the other side of the coin, if the field value is used multiple times then we'll come out ahead by avoiding repeat detoastings. Experimentation suggests that common SQL coding patterns are unaffected either way, though. Applications that are very negatively affected could be advised to modify their code to not fetch columns they won't be using. In future, we might consider reverting this solution in favor of detoasting only at the point where data is about to be stored to disk, using some method that can drill down into multiple levels of nested structured types. That will require defining new APIs for structured types, though, so it doesn't seem feasible as a back-patchable fix. Note that this patch changes HeapTupleGetDatum() from a macro to a function call; this means that any third-party code using that macro will not get protection against creating TOAST-pointer-containing Datums until it's recompiled. The same applies to any uses of PG_RETURN_HEAPTUPLEHEADER(). It seems likely that this is not a big problem in practice: most of the tuple-returning functions in core and contrib produce outputs that could not possibly be toasted anyway, and the same probably holds for third-party extensions. This bug has existed since TOAST was invented, so back-patch to all supported branches.
*	Rationalize common/relpath.[hc].	Tom Lane	2014-04-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit a73018392636ce832b09b5c31f6ad1f18a4643ea created rather a mess by putting dependencies on backend-only include files into include/common. We really shouldn't do that. To clean it up: * Move TABLESPACE_VERSION_DIRECTORY back to its longtime home in catalog/catalog.h. We won't consider this symbol part of the FE/BE API. * Push enum ForkNumber from relfilenode.h into relpath.h. We'll consider relpath.h as the source of truth for fork numbers, since relpath.c was already partially serving that function, and anyway relfilenode.h was kind of a random place for that enum. * So, relfilenode.h now includes relpath.h rather than vice-versa. This direction of dependency is fine. (That allows most, but not quite all, of the existing explicit #includes of relpath.h to go away again.) * Push forkname_to_number from catalog.c to relpath.c, just to centralize fork number stuff a bit better. * Push GetDatabasePath from catalog.c to relpath.c; it was rather odd that the previous commit didn't keep this together with relpath(). * To avoid needing relfilenode.h in common/, redefine the underlying function (now called GetRelationPath) as taking separate OID arguments, and make the APIs using RelFileNode or RelFileNodeBackend into macro wrappers. (The macros have a potential multiple-eval risk, but none of the existing call sites have an issue with that; one of them had such a risk already anyway.) * Fix failure to follow the directions when "init" fork type was added; specifically, the errhint in forkname_to_number wasn't updated, and neither was the SGML documentation for pg_relation_size(). * Fix tablespace-path-too-long check in CreateTableSpace() to account for fork-name component of maximum-length pathnames. This requires putting FORKNAMECHARS into a header file, but it was rather useless (and actually unreferenced) where it was. The last couple of items are potentially back-patchable bug fixes, if anyone is sufficiently excited about them; but personally I'm not. Per a gripe from Christoph Berg about how include/common wasn't self-contained.
*	Fix two bugs in WAL-logging of GIN pending-list pages.	Heikki Linnakangas	2014-04-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In writeListPage, never take a full-page image of the page, because we have all the information required to re-initialize in the WAL record anyway. Before this fix, a full-page image was always generated, unless full_page_writes=off, because when the page is initialized its LSN is always 0. In stable-branches, keep the code to restore the backup blocks if they exist, in case that the WAL is generated with an older minor version, but in master Assert that there are no full-page images. In the redo routine, add missing "off++". Otherwise the tuples are added to the page in reverse order. That happens to be harmless because we always scan and remove all the tuples together, but it was clearly wrong. Also, it was masked by the first bug unless full_page_writes=off, because the page was always restored from a full-page image. Backpatch to all supported versions.
*	Improve generation algorithm for database system identifier.	Tom Lane	2014-04-26
\| \| \| \| \| \| \| \| \| \| \| \| \|	As noted some time ago, the original coding had a typo ("\|" for "^") that made the result less unique than intended. Even the intended behavior is obsolete since it was based on wanting to produce a usable value even if we didn't have int64 arithmetic --- a limitation we stopped supporting years ago. Instead, let's redefine the system identifier as tv_sec in the upper 32 bits (same as before), tv_usec in the next 20 bits, and the low 12 bits of getpid() in the remaining bits. This is still hardly guaranteed-universally-unique, but it's noticeably better than before. Per my proposal at <29019.1374535940@sss.pgh.pa.us>
*	Fix race when updating a tuple concurrently locked by another process	Alvaro Herrera	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a tuple is locked, and this lock is later upgraded either to an update or to a stronger lock, and in the meantime some other process tries to lock, update or delete the same tuple, it (the tuple) could end up being updated twice, or having conflicting locks held. The reason for this is that the second updater checks for a change in Xmax value, or in the HEAP_XMAX_IS_MULTI infomask bit, after noticing the first lock; and if there's a change, it restarts and re-evaluates its ability to update the tuple. But it neglected to check for changes in lock strength or in lock-vs-update status when those two properties stayed the same. This would lead it to take the wrong decision and continue with its own update, when in reality it shouldn't do so but instead restart from the top. This could lead to either an assertion failure much later (when a multixact containing multiple updates is detected), or duplicate copies of tuples. To fix, make sure to compare the other relevant infomask bits alongside the Xmax value and HEAP_XMAX_IS_MULTI bit, and restart from the top if necessary. Also, in the belt-and-suspenders spirit, add a check to MultiXactCreateFromMembers that a multixact being created does not have two or more members that are claimed to be updates. This should protect against other bugs that might cause similar bogus situations. Backpatch to 9.3, where the possibility of multixacts containing updates was introduced. (In prior versions it was possible to have the tuple lock upgraded from shared to exclusive, and an update would not restart from the top; yet we're protected against a bug there because there's always a sleep to wait for the locking transaction to complete before continuing to do anything. Really, the fact that tuple locks always conflicted with concurrent updates is what protected against bugs here.) Per report from Andrew Dunstan and Josh Berkus in thread at http://www.postgresql.org/message-id/534C8B33.9050807@pgexperts.com Bug analysis by Andres Freund.
*	Reset pg_stat_activity.xact_start during PREPARE TRANSACTION.	Tom Lane	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Once we've completed a PREPARE, our session is not running a transaction, so its entry in pg_stat_activity should show xact_start as null, rather than leaving the value as the start time of the now-prepared transaction. I think possibly this oversight was triggered by faulty extrapolation from the adjacent comment that says PrepareTransaction should not call AtEOXact_PgStat, so tweak the wording of that comment. Noted by Andres Freund while considering bug #10123 from Maxim Boguk, although this error doesn't seem to explain that report. Back-patch to all active branches.
*	Update obsolete comments.	Heikki Linnakangas	2014-04-23
\| \| \| \|	We no longer have a TLI field in the page header.
*	Fix typos in comment.	Heikki Linnakangas	2014-04-23
\|
*	Cleanup of new b-tree page deletion code.	Heikki Linnakangas	2014-04-23
\| \| \| \| \| \| \| \| \| \| \| \|	When marking a branch as half-dead, a pointer to the top of the branch is stored in the leaf block's hi-key. During normal operation, the high key was left in place, and the block number was just stored in the ctid field of the high key tuple, but in WAL replay, the high key was recreated as a truncated tuple with zero columns. For the sake of easier debugging, also truncate the tuple in normal operation, so that the page is identical after WAL replay. Also, rename the 'downlink' field in the WAL record to 'topparent', as that seems like a more descriptive name. And make sure it's set to invalid when unlinking the leaf page.
*	Fix broken logic in logical_heap_rewrite_flush_mappings().	Tom Lane	2014-04-22
\| \| \| \| \|	It's blatantly obvious that commit 4d0d607a454ee832574afd52a3c515099cc85eb3 wasn't tested. The leak's real enough, though.
*	revert 4d0d607a454ee832574afd52a3c515099cc85eb3	Bruce Momjian	2014-04-22
\| \| \| \|	Revert due to contrib/test_decoding regression failure
*	release memory used while flushing logical mappings	Bruce Momjian	2014-04-22
\| \| \| \|	Patch by Ants Aasma
*	Fix bug in the new B-tree incomplete-split code.	Heikki Linnakangas	2014-04-22
\| \| \| \| \| \|	Forgot to update LSN of left sibling's page, when creating a new root. I fixed this for regular insertions and page splits earlier, but missed new root creation.
*	Fix Gin README.	Heikki Linnakangas	2014-04-22
\| \| \| \| \| \| \|	The README incorrectly claimed that GIN posting tree pages contain an array of uncompressed items in addition to compressed posting lists. Earlier versions of the GIN posting list compression patch worked that way, but not the one that was committed.
*	Fix bug in new B-tree page deletion code.	Heikki Linnakangas	2014-04-22
\| \| \| \| \|	When modifying a page, must hold an exclusive lock. A shared lock is obviously not good enough.
*	Retain original physical order of tuples in redo of b-tree splits.	Heikki Linnakangas	2014-04-22
\| \| \| \| \|	It makes no difference to the system, but minimizing the differences between a master and standby makes debugging simpler.
*	Fix rm_desc routine of b-tree page delete records.	Heikki Linnakangas	2014-04-22
\| \| \| \|	A couple of typos from my refactoring of the page deletion patch.
*	Fix typo.	Robert Haas	2014-04-20
\| \| \| \|	Etsuro Fujita
*	Fix typo	Magnus Hagander	2014-04-18
\| \| \| \|	Amit Langote
*	report stat() error in trigger file check	Bruce Momjian	2014-04-17
\| \| \| \| \| \| \|	Permissions might prevent the existence of the trigger file from being checked. Per report from Andres Freund
*	Use correctly-sized buffer when zero-filling a WAL file.	Heikki Linnakangas	2014-04-16
\| \| \| \| \| \|	I mixed up BLCKSZ and XLOG_BLCKSZ when I changed the way the buffer is allocated a couple of weeks ago. With the default settings, they are both 8k, but they can be changed at compile-time.
*	Set pd_lower on internal GIN posting tree pages.	Heikki Linnakangas	2014-04-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This allows squeezing out the unused space in full-page writes. And more importantly, it can be a useful debugging aid. In hindsight we should've done this back when GIN was added - we wouldn't need the 'maxoff' field in the page opaque struct if we had used pd_lower and pd_upper like on normal pages. But as long as there can be pages in the index that have been binary-upgraded from pre-9.4 versions, we can't rely on that, and have to continue using 'maxoff'. Most of the code churn comes from renaming some macros, now that they're used on internal pages, too. This change is completely backwards-compatible, no effect on pg_upgrade.
*	Fix bogus handling of bad strategy number in GIST consistent() functions.	Tom Lane	2014-04-14
\| \| \| \| \| \| \| \| \| \| \|	Make sure we throw an error instead of silently doing the wrong thing when fed a strategy number we don't recognize. Also, in the places that did already throw an error, spell the error message in a way more consistent with our message style guidelines. Per report from Paul Jones. Although this is a bug, it won't occur unless a superuser tries to do something he shouldn't, so it doesn't seem worth back-patching.
*	Remove dead checks for invalid left page in ginDeletePage.	Heikki Linnakangas	2014-04-14
\| \| \| \| \|	In some places, the function assumes the left page is valid, and in others, it checks if it is valid. Remove all the checks.
*	GIN entry pages follow the standard page layout - tell XLogInsert.	Heikki Linnakangas	2014-04-14
\| \| \| \| \| \|	The entry B-tree pages all follow the standard page layout. The 9.3 code has this right. I inadvertently changed this at some point during the big refactorings in git master.
*	Fix bugs in GIN "fast scan" with partial match.	Heikki Linnakangas	2014-04-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	There were a couple of bugs here. First, if the fuzzy limit was exceeded, the loop in entryGetItem might drop out too soon if a whole block needs to be skipped because it's < advancePast ("continue" in a while-loop checks the loop condition too). Secondly, the loop checked when stepping to a new page that there is at least one offset on the page < advancePast, but we cannot rely on that on subsequent calls of entryGetItem, because advancePast might change in between. That caused the skipping loop to read bogus items in the TbmIterateResult's offset array. First item and fix by Alexander Korotkov, second bug pointed out by Fabrízio de Royes Mello, by a small variation of Alexander's test query.