postgresql - postgresql mirror

	Commit message	Author	Age
*	Refactor the index AM API slightly: move currentItemData and	Neil Conway	2007-01-20
	currentMarkData from IndexScanDesc to the opaque structs for the AMs that need this information (currently gist and hash). Patch from Heikki Linnakangas, fixes by Neil Conway.
*	Remove remains of old depend target.	Peter Eisentraut	2007-01-20
*	Arrange for autovacuum to be killed when another operation wants to be alone	Alvaro Herrera	2007-01-16
	accessing it, like DROP DATABASE. This allows the regression tests to pass with autovacuum enabled, which opens the gates for finally enabling autovacuum by default.
*	Add some notes about the basic mathematical laws that the system presumes	Tom Lane	2007-01-12
	hold true for operators in a btree operator family. This is mostly to clarify my own thinking about what the planner can assume for optimization purposes. (blowing dust off an old abstract-algebra textbook...)
*	Enable another five tuple status bits by using the high bits of the	Bruce Momjian	2007-01-09
	nattr field, and rename the field. Heikki Linnakangas
*	Add a citation to Seltzer and Yigit's Usenix '91 paper about hash table	Tom Lane	2007-01-09
	management. The paper clearly describes many of the ideas embodied in our current hashing code, but as far as I could find out there is not a direct code heritage. (Mike Olsen recalls discussion of this paper at Postgres meetings but believes it "informed the Postgres implementation probably just at the design level". Margo herself says she wasn't involved with Postgres' hash code.) Credit where credit is due 'n all that, even if fifteen years after the fact.
*	Support ORDER BY ... NULLS FIRST/LAST, and add ASC/DESC/NULLS FIRST/NULLS LAST	Tom Lane	2007-01-09
	per-column options for btree indexes. The planner's support for this is still pretty rudimentary; it does not yet know how to plan mergejoins with nondefault ordering options. The documentation is pretty rudimentary, too. I'll work on improving that stuff later. Note incompatible change from prior behavior: ORDER BY ... USING will now be rejected if the operator is not a less-than or greater-than member of some btree opclass. This prevents less-than-sane behavior if an operator that doesn't actually define a proper sort ordering is selected.
*	Update CVS HEAD for 2007 copyright. Back branches are typically not	Bruce Momjian	2007-01-05
	back-stamped for this.
*	Fix some small typos in comments. Greg Stark	Tom Lane	2007-01-04
*	Clean up smgr.c/md.c APIs as per discussion a couple months ago. Instead of	Tom Lane	2007-01-03
	having md.c return a success/failure boolean to smgr.c, which was just going to elog anyway, let md.c issue the elog messages itself. This allows better error reporting, particularly in cases such as "short read" or "short write" which Peter was complaining of. Also, remove the kluge of allowing mdread() to return zeroes from a read-beyond-EOF: this is now an error condition except when InRecovery or zero_damaged_pages = true. (Hash indexes used to require that behavior, but no more.) Also, enforce that mdwrite() is to be used for rewriting existing blocks while mdextend() is to be used for extending the relation EOF. This restriction lets us get rid of the old ad-hoc defense against creating huge files by an accidental reference to a bogus block number: we'll only create new segments in mdextend() not mdwrite() or mdread(). (Again, when InRecovery we allow it anyway, since we need to allow updates of blocks that were later truncated away.) Also, clean up the original makeshift patch for bug #2737: move the responsibility for padding relation segments to full length into md.c.
*	Support type modifiers for user-defined types, and pull most knowledge	Tom Lane	2006-12-30
	about typmod representation for standard types out into type-specific typmod I/O functions. Teodor Sigaev, with some editorialization by Tom Lane.
*	Fix up btree's initial scankey processing to be able to detect redundant	Tom Lane	2006-12-28
	or contradictory keys even in cross-data-type scenarios. This is another benefit of the opfamily rewrite: we can find the needed comparison operators now.
*	Restructure operator classes to allow improved handling of cross-data-type	Tom Lane	2006-12-23
	cases. Operator classes now exist within "operator families". While most families are equivalent to a single class, related classes can be grouped into one family to represent the fact that they are semantically compatible. Cross-type operators are now naturally adjunct parts of a family, without having to wedge them into a particular opclass as we had done originally. This commit restructures the catalogs and cleans up enough of the fallout so that everything still works at least as well as before, but most of the work needed to actually improve the planner's behavior will come later. Also, there are not yet CREATE/DROP/ALTER OPERATOR FAMILY commands; the only way to create a new family right now is to allow CREATE OPERATOR CLASS to make one by default. I owe some more documentation work, too. But that can all be done in smaller pieces once this infrastructure is in place.
*	Remove the logId/logSeg fields from pg_control, because they are not needed	Tom Lane	2006-12-08
	in normal operation, and we can avoid rewriting pg_control at every log segment switch if we don't insist that these values be valid. Reducing the number of pg_control updates is a good idea for both performance and reliability. It does make pg_resetxlog's life a bit harder, but that seems a good tradeoff; and anyway the change to pg_resetxlog amounts to automating something people formerly needed to do by hand, namely look at the existing pg_xlog files to make sure the new WAL start point was past them. In passing, change the wording of xlog.c's "database system was interrupted" messages: describe the pg_control timestamp as "last known up at" rather than implying it is the exact time of service interruption. With this change the timestamp will generally be the time of the last checkpoint, which could be many minutes before the failure; and we've already seen indications that people tend to misinterpret the old wording. initdb forced due to change in pg_control layout. Simon Riggs and Tom Lane
*	Add a txn_start column to pg_stat_activity. This makes it easier to	Neil Conway	2006-12-06
	identify long-running transactions. Since we already need to record the transaction-start time (e.g. for now()), we don't need any additional system calls to report this information. Catversion bumped, initdb required.
*	Minor adjustments to make failures in startup/shutdown behave more cleanly.	Tom Lane	2006-11-30
	StartupXLOG and ShutdownXLOG no longer need to be critical sections, because in all contexts where they are invoked, elog(ERROR) would be translated to elog(FATAL) anyway. (One change in bgwriter.c is needed to make this true: set ExitOnAnyError before trying to exit. This is a good fix anyway since the existing code would have gone into an infinite loop on elog(ERROR) during shutdown.) That avoids a misleading report of PANIC during semi-orderly failures. Modify the postmaster to include the startup process in the set of processes that get SIGTERM when a fast shutdown is requested, and also fix it to not try to restart the bgwriter if the bgwriter fails while trying to write the shutdown checkpoint. Net result is that "pg_ctl stop -m fast" does something reasonable for a system in warm standby mode, and so should Unix system shutdown (ie, universal SIGTERM). Per gripe from Stephen Harris and some corner-case testing of my own.
*	Fix bug with page deletion. If an inner page is removed and it tries to	Teodor Sigaev	2006-11-30
	remove a page on the next level linked from the next inner page, ginScanToDelete() wrongly sets the parent page. The bug shows up when many item pointers (several hundred thousand) are deleted from the index. Bug discovered by hubert depesz lubaczewski <depesz@gmail.com>. I suppose we need an rc2 before release...
*	Add a comment noting that heap_copytuple_with_tuple() results in a	Neil Conway	2006-11-23
	HeapTuple that is no longer allocated as a single palloc() block; if used carelessly, this might result in a subsequent memory leak after heap_freetuple().
*	Several changes to reduce the probability of running out of memory during	Tom Lane	2006-11-23
	AbortTransaction, which would lead to recursion and eventual PANIC exit as illustrated in recent report from Jeff Davis. First, in xact.c create a special dedicated memory context for AbortTransaction to run in. This solves the problem as long as AbortTransaction doesn't need more than 32K (or whatever other size we create the context with). But in corner cases it might. Second, in trigger.c arrange to keep pending after-trigger event records in separate contexts that can be freed near the beginning of AbortTransaction, rather than having them persist until CleanupTransaction as before. Third, in portalmem.c arrange to free executor state data earlier as well. These two changes should result in backing off the out-of-memory condition before AbortTransaction needs any significant amount of memory, at least in typical cases such as memory overrun due to too many trigger events or too big an executor hash table. And all the same for subtransaction abort too, of course.
*	On systems that have setsid(2) (which should be just about everything except	Tom Lane	2006-11-21
	Windows), arrange for each postmaster child process to be its own process group leader, and deliver signals SIGINT, SIGTERM, SIGQUIT to the whole process group not only the direct child process. This provides saner behavior for archive and recovery scripts; in particular, it's possible to shut down a warm-standby recovery server using "pg_ctl stop -m immediate", since delivery of SIGQUIT to the startup subprocess will result in killing the waiting recovery_command. Also, this makes Query Cancel and statement_timeout apply to scripts being run from backends via system(). (There is no support in the core backend for that, but it's widely done using untrusted PLs.) Per gripe from Stephen Harris and subsequent discussion.
*	Repair problems with hash indexes that span multiple segments: the hash code's	Tom Lane	2006-11-19
	preference for filling pages out-of-order tends to confuse the sanity checks in md.c, as per report from Balazs Nagy in bug #2737. The fix is to ensure that the smgr-level code always has the same idea of the logical EOF as the hash index code does, by using ReadBuffer(P_NEW) where we are adding a single page to the end of the index, and using smgrextend() to reserve a large batch of pages when creating a new splitpoint. The patch is a bit ugly because it avoids making any changes in md.c, which seems the most prudent approach for a backpatchable beta-period fix. After 8.3 development opens, I'll take a look at a cleaner but more invasive patch, in particular getting rid of the now unnecessary hack to allow reading beyond EOF in mdread(). Backpatch as far as 7.4. The bug likely exists in 7.3 as well, but because of the magnitude of the 7.3-to-7.4 changes in hash, the later-version patch doesn't even begin to apply. Given the other known bugs in the 7.3-era hash code, it does not seem worth trying to develop a separate patch for 7.3.
*	Repair two related errors in heap_lock_tuple: it was failing to recognize	Tom Lane	2006-11-17
	cases where we already hold the desired lock "indirectly", either via membership in a MultiXact or because the lock was originally taken by a different subtransaction of the current transaction. These cases must be accounted for to avoid needless deadlocks and/or inappropriate replacement of an exclusive lock with a shared lock. Per report from Clarence Gardner and subsequent investigation.
*	String fix	Peter Eisentraut	2006-11-16
*	Fix some typos in comments.	Neil Conway	2006-11-12
*	Suppress a few 'uninitialized variable' warnings that gcc emits only at	Tom Lane	2006-11-11
	-O3 or higher (presumably because it inlines more things). Per gripe from Mark Mielke.
*	Clean up some misleading references to %p being a full path, per Simon.	Tom Lane	2006-11-10
*	Change Windows rename and unlink substitutes so that they time out after	Tom Lane	2006-11-08
	30 seconds instead of retrying forever. Also modify xlog.c so that if it fails to rename an old xlog segment up to a future slot, it will unlink the segment instead. Per discussion of bug #2712, in which it became apparent that Windows can handle unlinking a file that's being held open, but not renaming it.
*	Fix recently-understood problems with handling of XID freezing, particularly	Tom Lane	2006-11-05
	in PITR scenarios. We now WAL-log the replacement of old XIDs with FrozenTransactionId, so that such replacement is guaranteed to propagate to PITR slave databases. Also, rather than relying on hint-bit updates to be preserved, pg_clog is not truncated until all instances of an XID are known to have been replaced by FrozenTransactionId. Add new GUC variables and pg_autovacuum columns to allow management of the freezing policy, so that users can trade off the size of pg_clog against the amount of freezing work done. Revise the already-existing code that forces autovacuum of tables approaching the wraparound point to make it more bulletproof; also, revise the autovacuum logic so that anti-wraparound vacuuming is done per-table rather than per-database. initdb forced because of changes in pg_class, pg_database, and pg_autovacuum catalogs. Heikki Linnakangas, Simon Riggs, and Tom Lane.
*	Fix "failed to re-find parent key" btree VACUUM failure by revising page	Tom Lane	2006-11-01
	deletion code to avoid the case where an upper-level btree page remains "half dead" for a significant period of time, and to block insertions into a key range that is in process of being re-assigned to the right sibling of the deleted page's parent. This prevents the scenario reported by Ed L. wherein index keys could become out-of-order in the grandparent index level. Since this is a moderately invasive fix, I'm applying it only to HEAD. The bug exists back to 7.4, but the back branches will get a different patch.
*	Add some code to CREATE DATABASE to check for pre-existing subdirectories	Tom Lane	2006-10-18
	that conflict with the OID that we want to use for the new database. This avoids the risk of trying to remove files that maybe we shouldn't remove. Per gripe from Jon Lapham and subsequent discussion of 27-Sep.
*	Message style improvements	Peter Eisentraut	2006-10-06
*	Cleanup for pglz_compress code: remove dead code, const-ify API of	Tom Lane	2006-10-05
	remaining functions, simplify pglz_compress's API to not require a useless data copy when compression fails. Also add a check in pglz_decompress that the expected amount of data was decompressed.
*	Make use of qsort_arg in several places that were formerly using klugy	Tom Lane	2006-10-05
	static variables. This avoids any risk of potential non-reentrancy, and in particular offers a much cleaner workaround for the Intel compiler bug that was affecting ginutil.c.
*	pgindent run for 8.2.	Bruce Momjian	2006-10-04
*	Make some sentences consistent with similar ones.	Bruce Momjian	2006-10-03
	Euler Taveira de Oliveira
*	Degrade the transaction-id wraparound point message from LOG to DEBUG1, per	Alvaro Herrera	2006-09-26
	discussion. Patch from Simon Riggs.
*	Fix free space map to correctly track the total amount of FSM space needed	Tom Lane	2006-09-21
	even when a single relation requires more than max_fsm_pages pages. Also, make VACUUM emit a warning in this case, since it likely means that VACUUM FULL or other drastic corrective measure is needed. Per reports from Jeff Frost and others of unexpected changes in the claimed max_fsm_pages need.
*	Improve error message. Per discussion	Teodor Sigaev	2006-09-14
	http://archives.postgresql.org/pgsql-general/2006-09/msg00186.php
*	Remove unnecessary brace pair.	Bruce Momjian	2006-09-10
*	If we're going to advertise the array overlap/containment operators,	Tom Lane	2006-09-10
	we probably should make them work reliably for all arrays. Fix code to handle NULLs and multidimensional arrays, move it into arrayfuncs.c. GIN is still restricted to indexing arrays with no null elements, however.
*	Rename contains/contained-by operators to @> and <@, per discussion that	Tom Lane	2006-09-10
	agreed these symbols are less easily confused. I made new pg_operator entries (with new OIDs) for the old names, so as to provide backward compatibility while making it pretty easy to remove the old names in some future release cycle. This commit only touches the core datatypes; contrib will be fixed separately.
*	Fix Intel compiler bug. Per discussion	Teodor Sigaev	2006-09-05
	'GIN FailedAssertions on Itanium2 with Intel compiler' in pgsql-hackers, http://archives.postgresql.org/pgsql-hackers/2006-08/msg01914.php
*	Arrange for GetSnapshotData to copy live-subtransaction XIDs from the	Tom Lane	2006-09-03
	PGPROC array into snapshots, and use this information to avoid visits to pg_subtrans in HeapTupleSatisfiesSnapshot. This appears to solve the pg_subtrans-related context swap storm problem that's been reported by several people for 8.1. While at it, modify GetSnapshotData to not take an exclusive lock on ProcArrayLock, as closer analysis shows that shared lock is always sufficient. Itagaki Takahiro and Tom Lane
*	Fix BUG #2594: Gin Indexes cause server to crash when it builds on empty table	Teodor Sigaev	2006-08-29
*	Move xact.c's partial support for Lists of TransactionIds into pg_list.h.	Tom Lane	2006-08-27
	Needed because lock.c is now going to use the same type of list.
*	Add the ability to create indexes 'concurrently', that is, without	Tom Lane	2006-08-25
	blocking concurrent writes to the table. Greg Stark, with a little help from Tom Lane.
*	Optimize the case where a btree indexscan has current and mark positions	Tom Lane	2006-08-24
	on the same index page; we can avoid data copying as well as buffer refcount manipulations in this common case. Makes for a small but noticeable improvement in mergejoin speed. Heikki Linnakangas
*	Make the server track an 'XID epoch', that is, maintain higher-order bits	Tom Lane	2006-08-21
	of the transaction ID counter. Nothing is done with the epoch except to store it in checkpoint records, but this provides a foundation with which add-on code can pretend that XIDs never wrap around. This is a severely trimmed and rewritten version of the xxid patch submitted by Marko Kreen. Per discussion, the epoch counter seems the only part of xxid that really needs to be in the core server.
*	Now that we've rearranged relation open to get a lock before touching	Tom Lane	2006-08-18
	the rel, it's easy to get rid of the narrow race-condition window that used to exist in VACUUM and CLUSTER. Did some minor code-beautification work in the same area, too.
*	Implement archive_timeout feature to force xlog file switches to occur no more	Tom Lane	2006-08-17
	than N seconds apart. This allows a simple, if not very high performance, means of guaranteeing that a PITR archive is no more than N seconds behind real time. Also make pg_current_xlog_location return the WAL Write pointer, add pg_current_xlog_insert_location to return the Insert pointer, and fix pg_xlogfile_name_offset to return its results as a two-element record instead of a smashed-together string, as per recent discussion. Simon Riggs
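
Illustration for the 'XID epoch' entry of 2006-08-21: that message says the epoch holds the higher-order bits of the transaction ID counter, so that add-on code can pretend XIDs never wrap around. The sketch below shows one way such add-on code might combine the two values. It is not taken from the PostgreSQL source: the function names `widen_xid` and `split_wide_xid` are hypothetical, and the 32-bit XID width and 64-bit combined layout are assumptions made for the example.

```python
# Hypothetical helpers, not PostgreSQL code: combine an epoch counter with a
# wrapping transaction ID to get a value that keeps increasing across wraps.

XID_BITS = 32                      # assumed width of the wrapping XID counter
XID_MASK = (1 << XID_BITS) - 1


def widen_xid(epoch: int, xid: int) -> int:
    """Place the epoch in the high bits and the XID in the low 32 bits."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    if not 0 <= xid <= XID_MASK:
        raise ValueError("xid must fit in 32 bits")
    return (epoch << XID_BITS) | xid


def split_wide_xid(wide: int) -> tuple[int, int]:
    """Inverse of widen_xid: return (epoch, xid)."""
    if wide < 0:
        raise ValueError("wide xid must be non-negative")
    return wide >> XID_BITS, wide & XID_MASK


if __name__ == "__main__":
    before_wrap = widen_xid(0, XID_MASK)   # last XID of epoch 0
    after_wrap = widen_xid(1, 3)           # small XID, but one epoch later
    # The widened values order correctly even though 3 < XID_MASK.
    print(before_wrap < after_wrap)
```

Comparing widened values avoids the circular comparison that raw 32-bit XIDs need; the caller still has to know which epoch a given XID belongs to, which this sketch takes as an input.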