postgresql - postgresql mirror

	Commit message	Author	Age
...
*	Change hashscan.c to keep its list of active hash index scans in	Tom Lane	2008-03-07
	TopMemoryContext, rather than scattered through executor per-query contexts. This poses no danger of memory leak since the ResourceOwner mechanism guarantees release of no-longer-needed items. It is needed because the per-query context might already be released by the time we try to clean up the hash scan list. Report by ykhuang, diagnosis by Heikki. Back-patch to 8.0, where the ResourceOwner-based cleanup was introduced. The given test case does not fail before 8.2, probably because we rearranged transaction abort processing somehow; but this coding is undoubtedly risky so I'll patch 8.0 and 8.1 anyway.
*	Fix PREPARE TRANSACTION to reject the case where the transaction has dropped a	Tom Lane	2008-03-04
	temporary table; we can't support that because there's no way to clean up the source backend's internal state if the eventual COMMIT PREPARED is done by another backend. This was checked correctly in 8.1 but I broke it in 8.2 :-(. Patch by Heikki Linnakangas, original trouble report by John Smith.
*	Reducing the assumed alignment of struct varlena means that the compiler	Tom Lane	2008-02-29
	is also licensed to put a local variable declared that way at an unaligned address. Which will not work if the variable is then manipulated with SET_VARSIZE or other macros that assume alignment. So the previous patch is not an unalloyed good, but on balance I think it's still a win, since we have very few places that do that sort of thing. Fix the one place in tuptoaster.c that does it. Per buildfarm results from gypsy_moth (I'm a bit surprised that only one machine showed a failure).
*	Change the declaration of struct varlena so that the length word is	Tom Lane	2008-02-23
	represented as "char ...[4]" not "int32". Since the length word is never supposed to be accessed via this struct member anyway, this won't break any existing code that is following the rules. The advantage is that C compilers will no longer assume that a pointer to struct varlena is word-aligned, which prevents incorrect optimizations in TOAST-pointer access and perhaps other places. gcc doesn't seem to do this (at least not at -O2), but the problem is demonstrable on some other compilers. I changed struct inet as well, but didn't bother to touch a lot of other struct definitions in which it wouldn't make any difference because there were other fields forcing int alignment anyway. Hopefully none of those struct definitions are used for accessing unaligned Datums.
*	Remove another target I forgot during the refactoring	Peter Eisentraut	2008-02-19
*	Refactor backend makefiles to remove lots of duplicate code	Peter Eisentraut	2008-02-19
*	Replace time_t with pg_time_t (same values, but always int64) in on-disk	Tom Lane	2008-02-17
	data structures and backend internal APIs. This solves problems we've seen recently with inconsistent layout of pg_control between machines that have 32-bit time_t and those that have already migrated to 64-bit time_t. Also, we can get out from under the problem that Windows' Unix-API emulation is not consistent about the width of time_t. There are a few remaining places where local time_t variables are used to hold the current or recent result of time(NULL). I didn't bother changing these since they do not affect any cross-module APIs and surely all platforms will have 64-bit time_t before overflow becomes an actual risk. time_t should be avoided for anything visible to extension modules, however.
*	Add a GUC variable "synchronize_seqscans" to allow clients to disable the new	Tom Lane	2008-01-30
	synchronized-scanning behavior, and make pg_dump disable sync scans so that it will reliably preserve row ordering. Per recent discussions.
*	Provide a clearer error message if the pg_control version number looks	Peter Eisentraut	2008-01-21
	wrong because of mismatched byte ordering.
*	Revise memory management for libxml calls. Instead of keeping libxml's data	Tom Lane	2008-01-15
	in whichever context happens to be current during a call of an xml.c function, use a dedicated context that will not go away until we explicitly delete it (which we do at transaction end or subtransaction abort). This makes recovery after an error much simpler --- we don't have to individually delete the data structures created by libxml. Also, we need to initialize and cleanup libxml only once per transaction (if there's no error) instead of once per function call, so it should be a bit faster. We'll need to keep an eye out for intra-transaction memory leaks, though. Alvaro and Tom.
*	Fix CREATE INDEX CONCURRENTLY so that it won't use synchronized scan for	Tom Lane	2008-01-14
	its second pass over the table. It has to start at block zero, else the "merge join" logic for detecting which TIDs are already in the index doesn't work. Hence, extend heapam.c's API so that callers can enable or disable syncscan. (I put in an option to disable buffer access strategy, too, just in case somebody needs it.) Per report from Hannes Dorbath.
*	Make standard maintenance operations (including VACUUM, ANALYZE, REINDEX,	Tom Lane	2008-01-03
	and CLUSTER) execute as the table owner rather than the calling user, using the same privilege-switching mechanism already used for SECURITY DEFINER functions. The purpose of this change is to ensure that user-defined functions used in index definitions cannot acquire the privileges of a superuser account that is performing routine maintenance. While a function used in an index is supposed to be IMMUTABLE and thus not able to do anything very interesting, there are several easy ways around that restriction; and even if we could plug them all, there would remain a risk of reading sensitive information and broadcasting it through a covert channel such as CPU usage. To prevent bypassing this security measure, execution of SET SESSION AUTHORIZATION and SET ROLE is now forbidden within a SECURITY DEFINER context. Thanks to Itagaki Takahiro for reporting this vulnerability. Security: CVE-2007-6600
*	Update copyrights in source tree to 2008.	Bruce Momjian	2008-01-01
*	Improve a number of elog messages for not-supposed-to-happen cases in btrees,	Tom Lane	2007-12-31
	since these seem to happen after all in corrupted indexes. Make sure we supply the index name in all cases, and provide relevant block numbers where available. Also consistently identify the index name as such. Back-patch to 8.2, in hopes that this might help Mason Hale figure out his problem.
*	Code review for LIKE ... INCLUDING INDEXES patch. Fix failure to propagate	Tom Lane	2007-12-01
	constraint status of copied indexes (bug #3774), as well as various other small bugs such as failure to pstrdup when needed. Allow INCLUDING INDEXES indexes to be merged with identical declared indexes (perhaps not real useful, but the code is there and having it not apply to LIKE indexes seems pretty unorthogonal). Avoid useless work in generateClonedIndexStmt(). Undo some poorly chosen API changes, and put a couple of routines in modules that seem to be better places for them.
*	Avoid incrementing the CommandCounter when CommandCounterIncrement is called	Tom Lane	2007-11-30
	but no database changes have been made since the last CommandCounterIncrement. This should result in a significant improvement in the number of "commands" that can typically be performed within a transaction before hitting the 2^32 CommandId size limit. In particular this buys back (and more) the possible adverse consequences of my previous patch to fix plan caching behavior. The implementation requires tracking whether the current CommandCounter value has been "used" to mark any tuples. CommandCounter values stored into snapshots are presumed not to be used for this purpose. This requires some small executor changes, since the executor used to conflate the curcid of the snapshot it was using with the command ID to mark output tuples with. Separating these concepts allows some small simplifications in executor APIs. Something for the TODO list: look into having CommandCounterIncrement not do AcceptInvalidationMessages. It seems fairly bogus to be doing it there, but exactly where to do it instead isn't clear, and I'm disinclined to mess with asynchronous behavior during late beta.
*	Improve GIN index build's tracking of memory usage by using	Tom Lane	2007-11-16
	GetMemoryChunkSpace, not just the palloc request size. This brings the allocatedMemory counter close enough to reality (as measured by MemoryContextStats printouts) that I think we can get rid of the arbitrary factor-of-2 adjustment that was put into the code initially. Given the sensitivity of GIN build to work memory size, not using as much of work memory as we're allowed to seems a pretty bad idea.
*	Repair still another bug in the btree page split WAL reduction patch:	Tom Lane	2007-11-16
	it failed for splits of non-leaf pages because in such pages the first data key on a page is suppressed, and so we can't just copy the first key from the right page to reconstitute the left page's high key. Problem found by Koichi Suzuki, patch by Heikki.
*	Small comment spacing improvement.	Bruce Momjian	2007-11-16
*	Fix pgindent to properly handle 'else' and single-line comments on the	Bruce Momjian	2007-11-15
	same line; previous fix was only partial. Re-run pgindent on files that need it.
*	Re-run pgindent with updated list of typedefs. (Updated README should	Bruce Momjian	2007-11-15
	avoid this problem in the future.)
*	When logging the recovery.conf parameters, show them quoted as they would	Peter Eisentraut	2007-11-15
	appear in the configuration file.
*	pgindent run for 8.3.	Bruce Momjian	2007-11-15
*	Prevent re-use of a deleted relation's relfilenode until after the next	Tom Lane	2007-11-15
	checkpoint. This guards against an unlikely data-loss scenario in which we re-use the relfilenode, then crash, then replay the deletion and recreation of the file. Even then we'd be OK if all insertions into the new relation had been WAL-logged ... but that's not guaranteed given all the no-WAL-logging optimizations that have recently been added. Patch by Heikki Linnakangas, per a discussion last month.
*	Clean up some stray references to tsearch2.	Tom Lane	2007-11-13
*	Ensure that typmod decoration on a datatype name is validated in all cases,	Tom Lane	2007-11-11
	even in code paths where we don't pay any subsequent attention to the typmod value. This seems needed in view of the fact that 8.3's generalized typmod support will accept a lot of bogus syntax, such as "timestamp(foo)" or "record(int, 42)" --- if we allow such things to pass without comment, users will get confused. Per a recent example from Greg Stark. To implement this in a way that's not very vulnerable to future bugs-of-omission, refactor the API of parse_type.c's TypeName lookup routines so that typmod validation is folded into the base lookup operation. Callers can still choose not to receive the encoded typmod, but we'll check the decoration anyway if it's present.
*	Reduce error level of ROLLBACK outside a transaction from WARNING to	Bruce Momjian	2007-11-10
	NOTICE.
*	Use "alternative" instead of "alternate" where it is clearer.	Peter Eisentraut	2007-11-07
*	- Add a check for an already-changed page while replaying WAL. This touches only	Teodor Sigaev	2007-10-29
	ginRedoInsert(), because the other ginRedo* functions rewrite the whole page or make changes that can be applied several times without loss of consistency.
	- Remove the check that identifies the corresponding split record: WAL replay may start after the actual page split but before that split is removed from the incomplete-splits list, in which case the check causes a FATAL error.
	Per stress test which reproduces the bug reported by Craig McElroy <craig.mcelroy@contegix.com>
*	Fix coredump during WAL replay after a crash. Change entrySplitPage() to prevent	Teodor Sigaev	2007-10-29
	usage of any information from the system catalog, because it could be called during replay of WAL. Per bug report from Craig McElroy <craig.mcelroy@contegix.com>. The patch doesn't change on-disk storage.
*	Rearrange vacuum-related bits in PGPROC as a bitmask, to better support	Alvaro Herrera	2007-10-24
	having several of them. Add two more flags: whether the process is executing an ANALYZE, and whether a vacuum is for Xid wraparound (which is obviously only set by autovacuum). Sneakily move the worker's recently-acquired PostAuthDelay to a more useful place.
*	Keep heap_page_prune from marking the buffer dirty when it didn't	Tom Lane	2007-10-24
	really change anything. Per report from Itagaki Takahiro. Fix by Pavan Deolasee.
*	Tweak toast-related logic in heapam.c so that the toaster is only invoked	Tom Lane	2007-10-16
	when relkind = RELKIND_RELATION. This syncs these tests with the Asserts in tuptoaster.c, and ensures that we won't ever try to, for example, compress a sequence's tuple. Problem found by Greg Stark while stress-testing with much-smaller-than-normal page sizes.
*	When telling the bgwriter that we need a checkpoint because too much xlog	Tom Lane	2007-10-12
	has been consumed, recheck against the latest value of RedoRecPtr before really sending the signal. This avoids useless checkpoint activity if XLogWrite is executed when we have a very stale local copy of RedoRecPtr. The potential for useless checkpoint is very much worse in 8.3 because of the walwriter process (which never does XLogInsert), so while this behavior was intentional, it needs to be changed. Per report from Itagaki Takahiro.
*	Remove incorrect use of VARSIZE() on a toasted datum. We can just remove it	Tom Lane	2007-10-11
	instead of fixing it, since once we've set toast_action[i] to 'p' it no longer matters what toast_sizes[i] is. Greg Stark
*	Avoid assuming that struct varattrib_pointer doesn't get padded by the	Tom Lane	2007-10-01
	compiler --- at least on ARM, it does. I suspect that the varvarlena patch has been creating larger-than-intended toast pointers all along on ARM, but it wasn't exposed until the latest tweak added some Asserts that calculated the expected size in a different way. We could probably have fixed this by adding __attribute__((packed)) as is done for ItemPointerData, but struct varattrib_pointer isn't really all that useful anyway, so it seems cleanest to just get rid of it and have only struct varattrib_1b_e. Per results from buildfarm member quagga.
*	Add an extra header byte to TOAST-pointer datums to represent their size	Tom Lane	2007-09-30
	explicitly. This means a TOAST pointer takes 18 bytes instead of 17 --- still smaller than in 8.2 --- which seems a good tradeoff to ensure we won't have painted ourselves into a corner if we want to support multiple types of TOAST pointer later on. Per discussion with Greg Stark.
*	Adjust recovery PS display as agreed with Simon: 'waiting for XXX'	Tom Lane	2007-09-30
	while the restore_command does its thing, then 'recovering XXX' while processing the segment file. These operations are heavyweight enough that an extra PS display set shouldn't bother anyone.
*	Make recovery show the current input WAL segment name in the startup	Tom Lane	2007-09-29
	process' PS display. After a suggestion by Simon (not exactly his patch though).
*	Make archive recovery always start a new timeline, rather than only when a	Tom Lane	2007-09-29
	recovery stop time was used. This avoids a corner-case risk of trying to overwrite an existing archived copy of the last WAL segment, and seems simpler and cleaner all around than the original definition. Per example from Jon Colverson and subsequent analysis by Simon.
*	Some small tuptoaster improvements from Greg Stark. Avoid unnecessary	Tom Lane	2007-09-26
	decompression of an already-compressed external value when we have to copy it; save a few cycles when a value is too short for compression; and annotate various lines that are currently unreachable.
*	Minor improvements in backup and recovery:	Tom Lane	2007-09-26
	- create a separate archive_mode GUC, on which archive_command is dependent
	- %r option in recovery.conf sends last restartpoint to recovery command
	- %r used in pg_standby, updated README
	- minor other code cleanup in pg_standby
	- doc on Warm Standby now mentions pg_standby and %r
	- log_restartpoints recovery option emits LOG message at each restartpoint
	- end of recovery now displays last transaction end time, as requested by Warren Little; also shown at each restartpoint
	- restart archiver if needed to carry away WAL files at shutdown
	Simon Riggs
*	Fix regex, LIKE, and some other second-rank text-manipulation functions	Tom Lane	2007-09-21
	to not cause needless copying of text datums that have 1-byte headers. Greg Stark, in response to performance gripe from Guillaume Smet and ITAGAKI Takahiro.
*	Improve handling of prune/no-prune decisions by storing a page's oldest	Tom Lane	2007-09-21
	unpruned XMAX in its header. At the cost of 4 bytes per page, this keeps us from performing heap_page_prune when there's no chance of pruning anything. Seems to be necessary per Heikki's preliminary performance testing.
*	Fix comments that misspelled TransactionIdIsInProgress, per Heikki.	Tom Lane	2007-09-21
*	HOT updates. When we update a tuple without changing any of its indexed	Tom Lane	2007-09-20
	columns, and the new version can be stored on the same heap page, we no longer generate extra index entries for the new version. Instead, index searches follow the HOT-chain links to ensure they find the correct tuple version. In addition, this patch introduces the ability to "prune" dead tuples on a per-page basis, without having to do a complete VACUUM pass to recover space. VACUUM is still needed to clean up dead index entries, however. Pavan Deolasee, with help from a bunch of other people.
*	Remove GIN interface section, which is now documented in SGML.	Bruce Momjian	2007-09-14
	Heikki Linnakangas
*	Redefine the lp_flags field of item pointers as having four states, rather	Tom Lane	2007-09-12
	than two independent bits (one of which was never used in heap pages anyway, or at least hadn't been in a very long time). This gives us flexibility to add the HOT notions of redirected and dead item pointers without requiring anything so klugy as magic values of lp_off and lp_len. The state values are chosen so that for the states currently in use (pre-HOT) there is no change in the physical representation.
*	Rename recently-added pg_stat_activity column from txn_start to xact_start,	Tom Lane	2007-09-11
	for consistency with other column names such as in pg_stat_database.
*	Replace the former method of determining snapshot xmax --- to wit, calling	Tom Lane	2007-09-08
	ReadNewTransactionId from GetSnapshotData --- with a "latestCompletedXid" variable that is updated during transaction commit or abort. Since latestCompletedXid is written only in places that had to lock ProcArrayLock exclusively anyway, and is read only in places that had to lock ProcArrayLock shared anyway, it adds no new locking requirements to the system despite being cluster-wide. Moreover, removing ReadNewTransactionId from snapshot acquisition eliminates the need to take both XidGenLock and ProcArrayLock at the same time. Since XidGenLock is sometimes held across I/O this can be a significant win. Some preliminary benchmarking suggested that this patch has no effect on average throughput but can significantly improve the worst-case transaction times seen in pgbench. Concept by Florian Pflug, implementation by Tom Lane.