postgresql - postgresql mirror

	Commit message	Author	Age
*	Restructure LOCKTAG as per discussions of a couple months ago.	Tom Lane	2005-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Essentially, we shoehorn in a lockable-object-type field by taking a byte away from the lockmethodid, which can surely fit in one byte instead of two. This allows less artificial definitions of all the other fields of LOCKTAG; we can get rid of the special pg_xactlock pseudo-relation, and also support locks on individual tuples and general database objects (including shared objects). None of those possibilities are actually exploited just yet, however. I removed pg_xactlock from pg_class, but did not force initdb for that change. At this point, relkind 's' (SPECIAL) is unused and could be removed entirely.
*	Implement sharable row-level locks, and use them for foreign key references	Tom Lane	2005-04-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to eliminate unnecessary deadlocks. This commit adds SELECT ... FOR SHARE paralleling SELECT ... FOR UPDATE. The implementation uses a new SLRU data structure (managed much like pg_subtrans) to represent multiple- transaction-ID sets. When more than one transaction is holding a shared lock on a particular row, we create a MultiXactId representing that set of transactions and store its ID in the row's XMAX. This scheme allows an effectively unlimited number of row locks, just as we did before, while not costing any extra overhead except when a shared lock actually has to be shared. Still TODO: use the regular lock manager to control the grant order when multiple backends are waiting for a row lock. Alvaro Herrera and Tom Lane.
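The XMAX scheme described in this message can be illustrated with a toy model. This is only a sketch in Python (the real implementation is C and uses an SLRU-backed store); `MultiXactTable`, `RowHeader`, `lock_shared`, and `share_lockers` are hypothetical names, and the model ignores whether the earlier lockers are still running.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


class MultiXactTable:
    """Hypothetical stand-in for the store that maps a multi-ID to its members."""

    def __init__(self) -> None:
        self._next_id = 1
        self._members: Dict[int, FrozenSet[int]] = {}

    def create(self, xids: FrozenSet[int]) -> int:
        multi_id = self._next_id
        self._next_id += 1
        self._members[multi_id] = xids
        return multi_id

    def members(self, multi_id: int) -> FrozenSet[int]:
        return self._members[multi_id]


@dataclass
class RowHeader:
    xmax: Optional[int] = None   # None means nobody holds a lock on the row
    xmax_is_multi: bool = False  # True means xmax names a set in the table


def share_lockers(row: RowHeader, table: MultiXactTable) -> FrozenSet[int]:
    """Return the set of transactions currently recorded as share-locking the row."""
    if row.xmax is None:
        return frozenset()
    if row.xmax_is_multi:
        return table.members(row.xmax)
    return frozenset({row.xmax})


def lock_shared(row: RowHeader, xid: int, table: MultiXactTable) -> None:
    """Record a shared lock by xid; a multi-ID is created only when sharing occurs."""
    holders = share_lockers(row, table)
    if xid in holders:
        return
    if not holders:
        # Sole locker: store the plain transaction ID, no extra bookkeeping.
        row.xmax, row.xmax_is_multi = xid, False
        return
    row.xmax = table.create(holders | {xid})
    row.xmax_is_multi = True
```

The point of the design shows in `lock_shared`: a single locker costs nothing beyond writing its own ID, and the side table is touched only when a second transaction joins.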
*	Add comment about checkpoint panic behavior during shutdown, per	Tom Lane	2005-04-23
\| \| \| \|	suggestion from Qingqing Zhou.
*	Recent changes got the sense of the notnull bit backwards in the 2.0	Tom Lane	2005-04-23
\| \| \| \|	protocol output routines. Mea culpa :-(. Per report from Kris Jurka.
*	Fix comment typo.	Bruce Momjian	2005-04-17
*	Reduce PANIC to ERROR in several xlog routines that are used in both	Tom Lane	2005-04-15
\| \| \| \| \| \| \| \| \| \|	critical and noncritical contexts (an example of noncritical being post-checkpoint removal of dead xlog segments). In the critical cases the CRIT_SECTION mechanism will cause ERROR to be promoted to PANIC anyway, and in the noncritical cases we shouldn't let an error take down the entire database. Arguably there should be no explicit PANIC errors in this module, only more START/END_CRIT_SECTION calls, but I didn't go that far. (Yet.)
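The promotion rule this message relies on can be sketched as follows. This is an illustrative Python model, not the backend's C code; the `Backend` class and its method names are hypothetical.

```python
from contextlib import contextmanager

ERROR = "ERROR"
PANIC = "PANIC"


class Backend:
    """Hypothetical model of a process that tracks critical-section nesting."""

    def __init__(self) -> None:
        self.crit_depth = 0

    @contextmanager
    def critical_section(self):
        # Stand-in for a START_CRIT_SECTION / END_CRIT_SECTION pair.
        self.crit_depth += 1
        try:
            yield
        finally:
            self.crit_depth -= 1

    def effective_level(self, level: str) -> str:
        """An ERROR raised inside a critical section is treated as PANIC."""
        if level == ERROR and self.crit_depth > 0:
            return PANIC
        return level
```

Under this rule a routine can report a plain ERROR and let the calling context decide the severity, which is why the explicit PANICs become unnecessary.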
*	Modify MoveOfflineLogs/InstallXLogFileSegment to avoid O(N^2) behavior	Tom Lane	2005-04-15
\| \| \| \| \| \| \|	when recycling a large number of xlog segments during checkpoint. The former behavior searched from the same start point each time, requiring O(checkpoint_segments^2) stat() calls to relocate all the segments. Instead keep track of where we stopped last time through.
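A small model makes the difference in probe counts concrete. This is a sketch in Python under simplifying assumptions: segment slots are plain integers, one set-membership check stands in for one stat() call, and both function names are hypothetical.

```python
from typing import List, Tuple


def recycle_from_fixed_start(n_files: int, start: int) -> Tuple[List[int], int]:
    """Old behavior: every search for a free slot begins at the same start point."""
    occupied = set()
    installed = []
    probes = 0
    for _ in range(n_files):
        slot = start
        while True:
            probes += 1              # one existence check, like one stat() call
            if slot not in occupied:
                break
            slot += 1
        occupied.add(slot)
        installed.append(slot)
    return installed, probes


def recycle_with_cursor(n_files: int, start: int) -> Tuple[List[int], int]:
    """New behavior: remember where the previous search stopped and resume there."""
    occupied = set()
    installed = []
    probes = 0
    cursor = start
    for _ in range(n_files):
        while True:
            probes += 1
            if cursor not in occupied:
                break
            cursor += 1
        occupied.add(cursor)
        installed.append(cursor)
        cursor += 1                  # the slot just filled cannot be free
    return installed, probes
```

Both functions install the files into the same slots; only the number of existence checks differs, growing quadratically in the first case and linearly in the second.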
*	Make equalTupleDescs() compare attlen/attbyval/attalign rather than	Tom Lane	2005-04-14
\| \| \| \| \| \| \| \| \| \|	assuming comparison of atttypid is sufficient. In a dropped column atttypid will be 0, and we'd better check the physical-storage data to make sure the tupdescs are physically compatible. I do not believe there is a real risk before 8.0, since before that we only used this routine to compare successive states of the tupdesc for a particular relation. But 8.0's typcache.c might be comparing arbitrary tupdescs so we'd better play it safer.
*	Completion of project to use fixed OIDs for all system catalogs and	Tom Lane	2005-04-14
\| \| \| \| \| \| \|	indexes. Replace all heap_openr and index_openr calls by heap_open and index_open. Remove runtime lookups of catalog OID numbers in various places. Remove relcache's support for looking up system catalogs by name. Bulky but mostly very boring patch ...
*	Simplify initdb-time assignment of OIDs as I proposed yesterday, and	Tom Lane	2005-04-13
\| \| \| \| \| \| \| \|	avoid encroaching on the 'user' range of OIDs by allowing automatic OID assignment to use values below 16k until we reach normal operation. initdb not forced since this doesn't make any incompatible change; however a lot of stuff will have different OIDs after your next initdb.
*	Fix interaction between materializing holdable cursors and firing	Tom Lane	2005-04-11
\| \| \| \| \| \|	deferred triggers: either one can create more work for the other, so we have to loop till it's all gone. Per example from andrew@supernews. Add a regression test to help spot trouble in this area in future.
*	Merge Resdom nodes into TargetEntry nodes to simplify code and save a	Tom Lane	2005-04-06
\| \| \| \| \| \| \| \| \|	few palloc's. I also chose to eliminate the restype and restypmod fields entirely, since they are redundant with information stored in the node's contained expression; re-examining the expression at need seems simpler and more reliable than trying to keep restype/restypmod up to date. initdb forced due to change in contents of stored rules.
*	First phase of OUT-parameters project. We can now define and use SQL	Tom Lane	2005-03-31
\| \| \| \| \|	functions with OUT parameters. The various PLs still need work, as does pg_dump. Rudimentary docs and regression tests included.
*	Officially decouple FUNC_MAX_ARGS from INDEX_MAX_KEYS, and set the	Tom Lane	2005-03-29
\| \| \| \| \| \|	former to 100 by default. Clean up some of the less necessary dependencies on FUNC_MAX_ARGS; however, the biggie (FunctionCallInfoData) remains.
*	Convert oidvector and int2vector into variable-length arrays. This	Tom Lane	2005-03-29
\| \| \| \| \| \| \| \| \| \| \| \| \|	change saves a great deal of space in pg_proc and its primary index, and it eliminates the former requirement that INDEX_MAX_KEYS and FUNC_MAX_ARGS have the same value. INDEX_MAX_KEYS is still embedded in the on-disk representation (because it affects index tuple header size), but FUNC_MAX_ARGS is not. I believe it would now be possible to increase FUNC_MAX_ARGS at little cost, but haven't experimented yet. There are still a lot of vestigial references to FUNC_MAX_ARGS, which I will clean up in a separate pass. However, getting rid of it altogether would require changing the FunctionCallInfoData struct, and I'm not sure I want to buy into that.
*	Remove dead push/pop rollback code. Vadim once planned to implement	Tom Lane	2005-03-28
\| \| \| \| \| \|	transaction rollback via UNDO but I think that's highly unlikely to happen, so we may as well remove the stubs. (Someday we ought to rip out the stub xxx_undo routines, too.) Per Alvaro.
*	First steps towards index scans with heap access decoupled from index	Tom Lane	2005-03-27
\| \| \| \| \| \| \| \| \| \|	access: define new index access method functions 'amgetmulti' that can fetch multiple TIDs per call. (The functions exist but are totally untested as yet.) Since I was modifying pg_am anyway, remove the no-longer-needed 'rel' parameter from amcostestimate functions, and also remove the vestigial amowner column that was creating useless work for Alvaro's shared-object-dependencies project. Initdb forced due to changes in pg_am.
*	Eliminate duplicate hasnulls bit testing in index tuple access, and	Tom Lane	2005-03-27
\| \| \| \|	clean up itup.h a little bit.
*	Change Win32 O_SYNC method to O_DSYNC because that is what the method	Bruce Momjian	2005-03-24
\| \| \| \| \| \| \| \| \| \| \|	currently does. This is now the default Win32 wal sync method because we prefer o_datasync to fsync. Also, change Win32 fsync to a new wal sync method called fsync_writethrough because that is the behavior of _commit, which is what is used for fsync on Win32. Backpatch to 8.0.X.
*	Create a routine PageIndexMultiDelete() that replaces a loop around	Tom Lane	2005-03-22
\| \| \| \| \| \| \| \| \|	PageIndexTupleDelete() with a single pass of compactification --- logic mostly lifted from PageRepairFragmentation. I noticed while profiling that a VACUUM that's cleaning up a whole lot of deleted tuples would spend as much as a third of its CPU time in PageIndexTupleDelete; not too surprising considering the loop method was roughly O(N^2) in the number of tuples involved.
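The contrast between the two deletion strategies can be sketched with a list standing in for the page's item array. This is a simplified Python illustration, not the page-level C code; both function names are hypothetical.

```python
from typing import Iterable, List, Sequence, TypeVar

T = TypeVar("T")


def delete_one_at_a_time(items: Sequence[T], doomed: Iterable[int]) -> List[T]:
    """Loop method: each deletion shifts every later item, roughly O(N^2) overall."""
    result = list(items)
    # Work from the highest index down so the remaining indexes stay valid.
    for idx in sorted(set(doomed), reverse=True):
        del result[idx]
    return result


def delete_in_one_pass(items: Sequence[T], doomed: Iterable[int]) -> List[T]:
    """Single compaction pass: copy each surviving item exactly once."""
    doomed_set = set(doomed)
    return [item for idx, item in enumerate(items) if idx not in doomed_set]
```

The two functions return the same survivors in the same order; the second touches each item once regardless of how many are being removed.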
*	Convert index-related tuple handling routines from char 'n'/' ' to bool	Tom Lane	2005-03-21
\| \| \| \| \| \| \| \| \| \|	convention for isnull flags. Also, remove the useless InsertIndexResult return struct from index AM aminsert calls --- there is no reason for the caller to know where in the index the tuple was inserted, and we were wasting a palloc cycle per insert to deliver this uninteresting value (plus nontrivial complexity in some AMs). I forced initdb because of the change in the signature of the aminsert routines, even though nothing really looks at those pg_proc entries...
*	Change the return value of HeapTupleSatisfiesUpdate() to be an enum,	Neil Conway	2005-03-20
\| \| \| \| \|	rather than an integer, and fix the associated fallout. From Alvaro Herrera.
*	Remove unnecessary calls of FlushRelationBuffers: there is no need	Tom Lane	2005-03-20
\| \| \| \| \| \| \| \| \| \|	to write out data that we are about to tell the filesystem to drop. smgr_internal_unlink already had a DropRelFileNodeBuffers call to get rid of dead buffers without a write after it's no longer possible to roll back the deleting transaction. Adding a similar call in smgrtruncate simplifies callers and makes the overall division of labor clearer. This patch removes the former behavior that VACUUM would write all dirty buffers of a relation unconditionally.
*	Revise TupleTableSlot code to avoid unnecessary construction and disassembly	Tom Lane	2005-03-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	of tuples when passing data up through multiple plan nodes. A slot can now hold either a normal "physical" HeapTuple, or a "virtual" tuple consisting of Datum/isnull arrays. Upper plan levels can usually just copy the Datum arrays, avoiding heap_formtuple() and possible subsequent nocachegetattr() calls to extract the data again. This work extends Atsushi Ogawa's earlier patch, which provided the key idea of adding Datum arrays to TupleTableSlots. (I believe however that something like this was foreseen way back in Berkeley days --- see the old comment on ExecProject.) A test case involving many levels of join of fairly wide tables (about 80 columns altogether) showed about 3x overall speedup, though simple queries will probably not be helped very much. I have also duplicated some code in heaptuple.c in order to provide versions of heap_formtuple and friends that use "bool" arrays to indicate null attributes, instead of the old convention of "char" arrays containing either 'n' or ' '. This provides a better match to the convention used by ExecEvalExpr. While I have not made a concerted effort to get rid of uses of the old routines, I think they should be deprecated and eventually removed.
*	Avoid O(N^2) overhead in repeated nocachegetattr calls when columns of	Tom Lane	2005-03-14
\| \| \| \| \| \| \| \|	a tuple are being accessed via ExecEvalVar and the attcacheoff shortcut isn't usable (due to nulls and/or varlena columns). To do this, cache Datums extracted from a tuple in the associated TupleTableSlot. Also some code cleanup in and around the TupleTable handling. Atsushi Ogawa with some kibitzing by Tom Lane.
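The caching idea can be illustrated with a toy tuple format. This is a Python sketch under stated assumptions: fields are length-prefixed byte strings shorter than 256 bytes, so a field's offset is only known after walking all earlier fields; `RawTuple`, `getattr_nocache`, and `Slot` are hypothetical names.

```python
from typing import List, Tuple


class RawTuple:
    """Variable-width fields packed back to back, each with a 1-byte length prefix."""

    def __init__(self, fields: List[bytes]) -> None:
        self.data = b"".join(bytes([len(f)]) + f for f in fields)
        self.natts = len(fields)


def getattr_nocache(tup: RawTuple, attnum: int) -> Tuple[bytes, int]:
    """Walk from the first field every time; returns (value, fields walked)."""
    offset = 0
    value = b""
    for _ in range(attnum + 1):
        length = tup.data[offset]
        value = tup.data[offset + 1 : offset + 1 + length]
        offset += 1 + length
    return value, attnum + 1


class Slot:
    """Caches extracted values and the walk position between accesses."""

    def __init__(self, tup: RawTuple) -> None:
        self.tup = tup
        self.values: List[bytes] = []
        self.offset = 0
        self.steps = 0  # total fields walked, for comparison with the uncached path

    def getattr(self, attnum: int) -> bytes:
        # Extend the cache only as far as needed; never re-walk earlier fields.
        while len(self.values) <= attnum:
            length = self.tup.data[self.offset]
            self.values.append(self.tup.data[self.offset + 1 : self.offset + 1 + length])
            self.offset += 1 + length
            self.steps += 1
        return self.values[attnum]
```

Reading all N columns through `getattr_nocache` walks 1 + 2 + ... + N fields, while the slot walks each field once.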
*	Adjust creation/destruction of TupleDesc data structure to reduce the	Tom Lane	2005-03-07
\| \| \| \| \| \|	number of palloc calls. This has a salutary impact on plpgsql operations with record variables (which create and destroy tupdescs constantly) and probably helps a bit in some other cases too.
*	Remove some no-longer-needed kluges for bootstrapping, in particular	Tom Lane	2005-02-20
\| \| \| \| \| \| \| \|	the AMI_OVERRIDE flag. The fact that TransactionLogFetch treats BootstrapTransactionId as always committed is sufficient to make bootstrap work, and getting rid of extra tests in heavily used code paths seems like a win. The files produced by initdb are demonstrably the same after this change.
*	Add code to prevent transaction ID wraparound by enforcing a safe limit	Tom Lane	2005-02-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	in GetNewTransactionId(). Since the limit value has to be computed before we run any real transactions, this requires adding code to database startup to scan pg_database and determine the oldest datfrozenxid. This can conveniently be combined with the first stage of an attack on the problem that the 'flat file' copies of pg_shadow and pg_group are not properly updated during WAL recovery. The code I've added to startup resides in a new file src/backend/utils/init/flatfiles.c, and it is responsible for rewriting the flat files as well as initializing the XID wraparound limit value. This will eventually allow us to get rid of GetRawDatabaseInfo too, but we'll need an initdb so we can add a trigger to pg_database.
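The kind of check such a limit implies can be sketched with modulo-2^32 arithmetic. This is an illustrative Python model only: the circular comparison mirrors how 32-bit transaction IDs are ordered, but `stop_limit`, `get_new_xid`, and the `safety_margin` value are hypothetical, and the reserved special XIDs are ignored.

```python
XID_SPACE = 1 << 32
HALF_SPACE = 1 << 31


def xid_precedes(a: int, b: int) -> bool:
    """True if a is 'older' than b on the 32-bit circle (signed difference < 0)."""
    return (a - b) % XID_SPACE >= HALF_SPACE


def stop_limit(oldest_frozen_xid: int, safety_margin: int) -> int:
    """Last safe point: half the circle past the oldest frozen XID, minus a margin.

    The margin is an assumed parameter, not a value taken from the commit.
    """
    return (oldest_frozen_xid + HALF_SPACE - safety_margin) % XID_SPACE


def get_new_xid(next_xid: int, oldest_frozen_xid: int, safety_margin: int) -> int:
    """Hand out next_xid only while it still precedes the stop limit."""
    limit = stop_limit(oldest_frozen_xid, safety_margin)
    if not xid_precedes(next_xid, limit):
        raise RuntimeError("refusing to assign XID: wraparound limit reached")
    return next_xid
```

Because the limit depends on the oldest datfrozenxid across all databases, it has to be computed before any XID is assigned, which matches the startup scan the message describes.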
*	Move plpgsql DEBUG from DEBUG2 to DEBUG1 because it is a user-requested	Bruce Momjian	2005-02-12
\| \| \| \| \| \|	DEBUG. Fix a few places where DEBUG1 crept in that should have been DEBUG2.
*	Marginal hack to merge adjacent ReleaseBuffer/ReadBuffer calls into	Tom Lane	2005-02-05
\| \| \| \| \|	ReleaseAndReadBuffer during GIST index searches. We already did this in btree and rtree, might as well do it here too.
*	Change heap_modifytuple() to require a TupleDesc rather than a	Neil Conway	2005-01-27
\| \| \| \| \|	Relation. Patch from Alvaro Herrera, minor editorializing by Neil Conway.
*	Fix memory leak in rtdosplit, per report from Clive Page.	Tom Lane	2005-01-24
*	This patch makes some improvements to the rtree index implementation:	Neil Conway	2005-01-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(1) Keep a pin on the scan's current buffer and mark buffer. This avoids the need to do a ReadBuffer() for each tuple produced by the scan. Since ReadBuffer() is expensive, this is a significant win. (2) Convert a ReleaseBuffer(); ReadBuffer() pair into ReleaseAndReadBuffer(). Surely not a huge win, but it saves a lock acquire/release... (3) Remove a bunch of duplicated code in rtget.c; make rtnext() handle both the "initial result" and "subsequent result" cases. (4) Add support for index tuple killing. (5) Remove rtscancache(): it is dead code, for the same reason that gistscancache() is dead code (an index scan ought not be invoked with NoMovementScanDirection). The end result is about a 10% improvement in rtree index scan perf, according to contrib/rtree_gist/bench.
*	Phase 1 of fix for 'SMgrRelation hashtable corrupted' problem. This	Tom Lane	2005-01-10
\| \| \| \| \| \|	is the minimum required fix. I want to look next at taking advantage of it by simplifying the message semantics in the shared inval message queue, but that part can be held over for 8.1 if it turns out too ugly.
*	Update copyrights that were missed.	Bruce Momjian	2005-01-01
*	Tag appropriate files for rc3	PostgreSQL Daemon	2004-12-31
\| \| \| \| \| \| \| \|	Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...
*	Awhile back I added some code to StartupCLOG() to forcibly zero out	Tom Lane	2004-12-22
\| \| \| \| \| \| \| \| \| \|	the remainder of the current clog page during system startup. While this was a good idea, it turns out the code fails if nextXid is exactly at a page boundary, because we won't have created the "current" clog page yet in that case. Since the page will be correctly zeroed when we execute the first transaction on it, the solution is just to do nothing when exactly at a page boundary. Per trouble report from Dave Hartwig.
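The boundary case is easy to see with a little arithmetic. This Python sketch assumes 2 status bits per transaction and 8192-byte pages, giving 32768 transactions per clog page; `entries_to_zero` is a hypothetical helper, not the name of any backend function.

```python
from typing import Optional, Tuple

BLOCK_SIZE = 8192                 # assumed default page size in bytes
XACTS_PER_BYTE = 4                # 2 status bits per transaction
XACTS_PER_PAGE = BLOCK_SIZE * XACTS_PER_BYTE


def entries_to_zero(next_xid: int) -> Optional[Tuple[int, int]]:
    """Return the [start, end) range of entries to clear on the current page.

    Returns None when next_xid sits exactly on a page boundary: the page it
    would belong to has not been created yet, so there is nothing to zero.
    """
    offset = next_xid % XACTS_PER_PAGE
    if offset == 0:
        return None
    return offset, XACTS_PER_PAGE
```

When the function returns None, nothing is lost: as the message notes, the page is zeroed when the first transaction on it runs.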
*	Fix is-it-time-for-a-checkpoint logic so that checkpoint_segments can	Tom Lane	2004-12-17
\| \| \| \|	usefully be larger than 255. Per gripe from Simon Riggs.
*	Calculation of keys_are_unique flag was wrong for cases involving	Tom Lane	2004-12-15
\| \| \| \|	redundant cross-datatype comparisons. Per example from Merlin Moncure.
*	Change planner to use the current true disk file size as its estimate of	Tom Lane	2004-12-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	a relation's number of blocks, rather than the possibly-obsolete value in pg_class.relpages. Scale the value in pg_class.reltuples correspondingly to arrive at a hopefully more accurate number of rows. When pg_class contains 0/0, estimate a tuple width from the column datatypes and divide that into current file size to estimate number of rows. This improved methodology allows us to jettison the ancient hacks that put bogus default values into pg_class when a table is first created. Also, per a suggestion from Simon, make VACUUM (but not VACUUM FULL or ANALYZE) adjust the value it puts into pg_class.reltuples to try to represent the mean tuple density instead of the minimal density that actually prevails just after VACUUM. These changes alter the plans selected for certain regression tests, so update the expected files accordingly. (I removed join_1.out because it's not clear if it still applies; we can add back any variant versions as they are shown to be needed.)
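The estimation arithmetic described here can be written out as a short sketch. This is a simplified Python illustration: `estimate_rows` is a hypothetical name, the 8192-byte block size is the assumed default, and per-page and per-tuple header overhead is ignored.

```python
def estimate_rows(
    current_blocks: int,
    relpages: int,
    reltuples: float,
    est_tuple_width: int,
    block_size: int = 8192,
) -> int:
    """Estimate a relation's row count from its current on-disk size.

    current_blocks  -- the file's true size in blocks right now
    relpages        -- possibly stale page count stored in pg_class
    reltuples       -- possibly stale tuple count stored in pg_class
    est_tuple_width -- width guessed from column datatypes, used when the
                       catalog holds 0/0
    """
    if relpages > 0:
        # Scale the stored tuple density to the current file size.
        tuples_per_block = reltuples / relpages
    else:
        # No statistics yet: derive a density from the estimated tuple width.
        tuples_per_block = block_size / est_tuple_width
    return round(tuples_per_block * current_blocks)
```

For example, a table recorded as 5000 tuples in 100 pages that has since grown to 200 blocks is estimated at 10000 rows instead of the stale 5000.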
*	Minor adjustment of message style.	Tom Lane	2004-11-17
*	Micro-optimization of markpos() and restrpos() in btree and hash indexes.	Neil Conway	2004-11-17
\| \| \| \| \| \|	Rather than using ReadBuffer() to increment the reference count on an already-pinned buffer, we should use IncrBufferRefCount() as it is faster and does not require acquiring the BufMgrLock.
*	Don't allow pg_start_backup() to be invoked if archive_command has not	Neil Conway	2004-11-17
\| \| \| \|	been defined. Patch from Gavin Sherry, editorializing by Neil Conway.
*	There is no need for ReadBuffer() call sites to check that the returned	Neil Conway	2004-11-14
\| \| \| \| \| \|	buffer is valid, as ReadBuffer() will elog on error. Most of the call sites of ReadBuffer() got this right, but this patch fixes those call sites that did not.
*	Remove obsolete comment from btbuild() and hashbuild(): we no longer use	Neil Conway	2004-11-11
\| \| \| \|	a global variable to control building indexes.
*	Small message clarifications	Peter Eisentraut	2004-11-05
*	Change COMMIT back to the old behavior of emitting command tag COMMIT,	Tom Lane	2004-10-30
\| \| \| \| \|	not ROLLBACK, for the case of COMMIT outside a transaction block. Alvaro Herrera
*	Rearrange order of pre-commit operations: must close cursors before doing	Tom Lane	2004-10-29
\| \| \| \|	ON COMMIT actions. Per bug report from Michael Guerin.
*	Add DEBUG1-level logging of checkpoint start and end. Also, reduce the	Tom Lane	2004-10-29
\| \| \| \| \| \|	'recycled log files' and 'removed log files' messages from DEBUG1 to DEBUG2, replacing them with a count of files added/removed/recycled in the checkpoint end message, as per suggestion from Simon Riggs.
*	Make heap_fetch API more consistent by having the buffer remain pinned	Tom Lane	2004-10-26
\| \| \| \| \| \|	in all cases when keep_buf = true. This allows ANALYZE's inner loop to use heap_release_fetch, which saves multiple buffer lookups for the same page and avoids overestimation of cost by the vacuum cost mechanism.