aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access
Commit message (Collapse)AuthorAge
...
* Add a field to the first page of each WAL file to indicate theTom Lane2006-04-05
| | | | | | XLOG_BLCKSZ. This ought to help in preventing configuration mismatch problems if anyone tries to ship PITR files between servers compiled with different XLOG_BLCKSZ settings. Simon Riggs
* Don't use BLCKSZ for the physical length of the pg_control file, butTom Lane2006-04-04
| | | | | | instead a dedicated symbol. This probably makes no functional difference for likely values of BLCKSZ, but it makes the intent clearer. Simon Riggs, minor editorialization by Tom Lane.
* Modify all callers of datatype input and receive functions so that if theseTom Lane2006-04-04
| | | | | | | | | | | | | | | functions are not strict, they will be called (passing a NULL first parameter) during any attempt to input a NULL value of their datatype. Currently, all our input functions are strict and so this commit does not change any behavior. However, this will make it possible to build domain input functions that centralize checking of domain constraints, thereby closing numerous holes in our domain support, as per previous discussion. While at it, I took the opportunity to introduce convenience functions InputFunctionCall, OutputFunctionCall, etc to use in code that calls I/O functions. This eliminates a lot of grotty-looking casts, but the main motivation is to make it easier to grep for these places if we ever need to touch them again.
* Define a separately configurable XLOG_BLCKSZ symbol for the page sizeTom Lane2006-04-03
| | | | | | | | | | | used within WAL files. Historically this was the same as the data file BLCKSZ, but there's no necessary connection, and it's possible that performance gains might ensue from reducing XLOG_BLCKSZ. In any case distinguishing two symbols should improve code clarity. This commit does not actually change the page size, only provide the infrastructure to make it possible to do so. initdb forced because of addition of a field to pg_control. Mark Wong, with some help from Simon Riggs and Tom Lane.
* Fix thinko in gistRedoPageUpdateRecord: if XLR_BKP_BLOCK_1 is set, weTom Lane2006-04-03
| | | | | don't have anything to do to the page, but we still have to adjust the incomplete_inserts list that we're maintaining in memory.
* Eliminate ajust scan code. Since concurrent GiST it doesn'tTeodor Sigaev2006-04-03
| | | | do real work. That was missed during concurrence development.
* Remove the 'slow' path for btree index build, which built the btreeTom Lane2006-04-01
| | | | | | | | | | | incrementally by successive inserts rather than by sorting the data. We were only using the slow path during bootstrap, apparently because when first written it failed during bootstrap --- but it works fine now AFAICT. Removing it saves a hundred or so lines of code and produces noticeably (~10%) smaller initial states of the system catalog indexes. While that won't make much difference for heavily-modified catalogs, for the more static ones there may be a useful long-term performance improvement.
* Clean up WAL/buffer interactions as per my recent proposal. Get rid of theTom Lane2006-03-31
| | | | | | | | | | | | | | | | misleadingly-named WriteBuffer routine, and instead require routines that change buffer pages to call MarkBufferDirty (which does exactly what it says). We also require that they do so before calling XLogInsert; this takes care of the synchronization requirement documented in SyncOneBuffer. Note that because bufmgr takes the buffer content lock (in shared mode) while writing out any buffer, it doesn't matter whether MarkBufferDirty is executed before the buffer content change is complete, so long as the content change is completed before releasing exclusive lock on the buffer. So it's OK to set the dirtybit before we fill in the LSN. This eliminates the former kluge of needing to set the dirtybit in LockBuffer. Aside from making the code more transparent, we can also add some new debugging assertions, in particular that the caller of MarkBufferDirty must hold the buffer content lock, not merely a pin.
* Improve gist XLOG code to follow the coding rules needed to preventTom Lane2006-03-30
| | | | | | | torn-page problems. This introduces some issues of its own, mainly that there are now some critical sections of unreasonably broad scope, but it's a step forward anyway. Further cleanup will require some code refactoring that I'd prefer to get Oleg and Teodor involved in.
* Clean up and document the API for XLogOpenRelation and XLogReadBuffer.Tom Lane2006-03-29
| | | | | | | | This commit doesn't make much functional change, but it does eliminate some duplicated code --- for instance, PageIsNew tests are now done inside XLogReadBuffer rather than by each caller. The GIST xlog code still needs a lot of love, but I'll worry about that separately.
* Disable full_page_writes, because turning it off risks causing crash-recoveryTom Lane2006-03-28
| | | | | | | | | | failures even when the hardware and OS did nothing wrong. Per recent analysis of a problem report from Alex Bahdushka. For the moment I've just diked out the test of the parameter, rather than removing the GUC infrastructure and documentation, in case we conclude that there's something salvageable there. There seems no chance of it being resurrected in the 8.1 branch though.
* Repair longstanding error in btree xlog replay: XLogReadBuffer should beTom Lane2006-03-28
| | | | | | | | | | passed extend = true whenever we are reading a page we intend to reinitialize completely, even if we think the page "should exist". This is because it might indeed not exist, if the relation got truncated sometime after the current xlog record was made and before the crash we're trying to recover from. These two thinkos appear to explain both of the old bug reports discussed here: http://archives.postgresql.org/pgsql-hackers/2005-05/msg01369.php
* Arrange to emit a description of the current XLOG record as error contextTom Lane2006-03-24
| | | | | | | | | when an error occurs during xlog replay. Also, replace the former risky 'write into a fixed-size buffer with no overflow detection' API for XLOG record description routines; use an expansible StringInfo instead. (The latter accounts for most of the patch bulk.) Qingqing Zhou
* Clean up representation of function RTEs for functions returning RECORD.Tom Lane2006-03-16
| | | | | | | | | | | | | | | | The original coding stored the raw parser output (ColumnDef and TypeName nodes) which was ugly, bulky, and wrong because it failed to create any dependency on the referenced datatype --- and in fact would not track type renamings and suchlike. Instead store a list of column type OIDs in the RTE. Also fix up general failure of recordDependencyOnExpr to do anything sane about recording dependencies on datatypes. While there are many cases where there will be an indirect dependency (eg if an operator returns a datatype, the dependency on the operator is enough), we do have to record the datatype as a separate dependency in examples like CoerceToDomain. initdb forced because of change of stored rules.
* Improve parser so that we can show an error cursor position for errorsTom Lane2006-03-14
| | | | | | | | | | | during parse analysis, not only errors detected in the flex/bison stages. This is per my earlier proposal. This commit includes all the basic infrastructure, but locations are only tracked and reported for errors involving column references, function calls, and operators. More could be done later but this seems like a good set to start with. I've also moved the ReportSyntaxErrorPosition logic out of psql and into libpq, which should make it available to more people --- even within psql this is an improvement because warnings weren't handled by ReportSyntaxErrorPosition.
* Add a CHECK_FOR_INTERRUPTS() in _bt_buildadd(). This fixes problemTom Lane2006-03-10
| | | | | with not responding to query cancel during the last stage of btree index creation.
* Update copyright for 2006. Update scripts.Bruce Momjian2006-03-05
|
* This patch makes the error message strings throughout the backendNeil Conway2006-03-01
| | | | | | | | more compliant with the error message style guide. In particular, errdetail should begin with a capital letter and end with a period, whereas errmsg should not. I also fixed a few related issues in passing, such as fixing the repeated misspelling of "lexeme" in contrib/tsearch2 (per Tom's suggestion).
* Cleanup the usage of ScanDirection: use the symbolic names for theNeil Conway2006-02-21
| | | | | | | | | possible ScanDirection alternatives rather than magic numbers (-1, 0, 1). Also, use the ScanDirection macros in a few places rather than directly checking whether `dir == ForwardScanDirection' and the like. Per patch from James William Pye. His patch also changed ScanDirection to be a "char" rather than an enum, which I haven't applied.
* Move btbulkdelete's vacuum_delay_point() call to a place in the loop whereTom Lane2006-02-14
| | | | | | | | we are not holding a buffer content lock; where it was, InterruptHoldoffCount is positive and so we'd not respond to cancel signals as intended. Also add missing vacuum_delay_point() call in btvacuumcleanup. This should fix complaint from Evgeny Gridasov about failure to respond to SIGINT/SIGTERM in a timely fashion (bug #2257).
* Add some missing vacuum_delay_point calls in GIST vacuuming.Tom Lane2006-02-14
|
* Actually there's a better way to do this, which is to count tuplesTom Lane2006-02-12
| | | | | | | | | during the vacuumcleanup scan that we're going to do anyway. Should save a few cycles (one calculation per page, not per tuple) as well as not having to depend on assumptions about heap and index being in step. I think this could probably be made to work for GIST too, but that code looks messy enough that I'm disinclined to try right now.
* Skip ambulkdelete scan if there's nothing to delete and the index is notTom Lane2006-02-11
| | | | | | | | | | | partial. None of the existing AMs do anything useful except counting tuples when there's nothing to delete, and we can get a tuple count from the heap as long as it's not a partial index. (hash actually can skip anyway because it maintains a tuple count in the index metapage.) GIST is not currently able to exploit this optimization because, due to failure to index NULLs, GIST is always effectively partial. Possibly we should fix that sometime. Simon Riggs w/ some review by Tom Lane.
* Revert based on Tom's recommendation:Bruce Momjian2006-02-11
| | | | | > Allow VACUUM to complete faster by avoiding scanning the indexes when no > rows were removed from the heap by the VACUUM.
* Allow VACUUM to complete faster by avoiding scanning the indexes when noBruce Momjian2006-02-11
| | | | | | rows were removed from the heap by the VACUUM. Simon Riggs
* Remove the no-longer-useful HashItem/HashItemData level of structure.Tom Lane2006-01-25
| | | | Same motivation as for BTItem.
* Remove the no-longer-useful BTItem/BTItemData level of structure, andTom Lane2006-01-25
| | | | | | | just refer to btree index entries as plain IndexTuples, which is what they have been for a very long time. This is mostly just an exercise in removing extraneous notation, but it does save a palloc/pfree cycle per index insertion.
* Allow row comparisons to be used as indexscan qualifications.Tom Lane2006-01-25
| | | | This completes the project to upgrade our handling of row comparisons.
* Instead of using a numberOfRequiredKeys count to distinguish requiredTom Lane2006-01-23
| | | | | | | | | and non-required keys in a btree index scan, mark the required scankeys with private flag bits SK_BT_REQFWD and/or SK_BT_REQBKWD. This seems at least marginally clearer to me, and it eliminates a wired-into-the- data-structure assumption that required keys are consecutive. Even though that assumption will remain true for the foreseeable future, having it in there makes the code seem more complex than necessary.
* Repair longstanding bug in slru/clog logic: it is possible for two backendsTom Lane2006-01-21
| | | | | | | | | | to try to create a log segment file concurrently, but the code erroneously specified O_EXCL to open(), resulting in a needless failure. Before 7.4, it was even a PANIC condition :-(. Correct code is actually simpler than what we had, because we can just say O_CREAT to start with and not need a second open() call. I believe this accounts for several recent reports of hard-to-reproduce "could not create file ...: File exists" errors in both pg_clog and pg_subtrans.
* Improve comments about btree's use of ScanKey data structures: thereTom Lane2006-01-17
| | | | | | | | are two basically different kinds of scankeys, and we ought to try harder to indicate which is used in each place in the code. I've chosen the names "search scankey" and "insertion scankey", though you could make about as good an argument for "operator scankey" and "comparison function scankey".
* Some minor code cleanup, falling out from the removal of rtree. SK_NEGATETom Lane2006-01-14
| | | | | | | | | | | | isn't being used anywhere anymore, and there seems no point in a generic index_keytest() routine when two out of three remaining access methods aren't using it. Also, add a comment documenting a convention for letting access methods define private flag bits in ScanKey sk_flags. There are no such flags at the moment but I'm thinking about changing btree's handling of "required keys" to use flag bits in the keys rather than a count of required key positions. Also, if some AM did still want SK_NEGATE then it would be reasonable to treat it as a private flag bit.
* Cosmetic code cleanup: fix a bunch of places that used "return (expr);"Neil Conway2006-01-11
| | | | | | rather than "return expr;" -- the latter style is used in most of the tree. I kept the parentheses when they were necessary or useful because the return expression was complex.
* Add RelationOpenSmgr() calls to ensure rd_smgr is valid when we try toTom Lane2006-01-07
| | | | | | use it. While it normally has been opened earlier during btree index build, testing shows that it's possible for the link to be closed again if an sinval reset occurs while the index is being built.
* Convert Assert checking for empty page into a regular test and elog.Tom Lane2006-01-06
| | | | | The consequences of overwriting a non-empty page are bad enough that we should not omit this test in production builds.
* Make all command-line options of postmaster and postgres the same. SeePeter Eisentraut2006-01-05
| | | | | http://archives.postgresql.org/pgsql-hackers/2006-01/msg00151.php for the complete plan.
* Get rid of the SpinLockAcquire/SpinLockAcquire_NoHoldoff distinctionTom Lane2005-12-29
| | | | | | | | | | in favor of having just one set of macros that don't do HOLD/RESUME_INTERRUPTS (hence, these correspond to the old SpinLockAcquire_NoHoldoff case). Given our coding rules for spinlock use, there is no reason to allow CHECK_FOR_INTERRUPTS to be done while holding a spinlock, and also there is no situation where ImmediateInterruptOK will be true while holding a spinlock. Therefore doing HOLD/RESUME_INTERRUPTS while taking/releasing a spinlock is just a waste of cycles. Qingqing Zhou and Tom Lane.
* Arrange to set the LC_XXX environment variables to match our localeTom Lane2005-12-28
| | | | | | setup. This protects against undesired changes in locale behavior if someone carelessly does setlocale(LC_ALL, "") (and we know who you are, perl guys).
* I have added these macros to c.h:Bruce Momjian2005-12-25
| | | | | | | | | #define HIGHBIT (0x80) #define IS_HIGHBIT_SET(ch) ((unsigned char)(ch) & HIGHBIT) and removed CSIGNBIT and mapped it uses to HIGHBIT. I have also added uses for IS_HIGHBIT_SET where appropriate. This change is purely for code clarity.
* Adjust string comparison so that only bitwise-equal strings are consideredTom Lane2005-12-22
| | | | | | | | | | | | equal: if strcoll claims two strings are equal, check it with strcmp, and sort according to strcmp if not identical. This fixes inconsistent behavior under glibc's hu_HU locale, and probably under some other locales as well. Also, take advantage of the now-well-defined behavior to speed up texteq, textne, bpchareq, bpcharne: they may as well just do a bitwise comparison and not bother with strcoll at all. NOTE: affected databases may need to REINDEX indexes on text columns to be sure they are self-consistent.
* Divide the lock manager's shared state into 'partitions', so as toTom Lane2005-12-11
| | | | | | | reduce contention for the former single LockMgrLock. Per my recent proposal. I set it up for 16 partitions, but on a pgbench test this gives only a marginal further improvement over 4 partitions --- we need to test more scenarios to choose the number of partitions.
* Push the responsibility for handling ignore_killed_tuples down intoTom Lane2005-12-07
| | | | | | _bt_checkkeys(), instead of checking it in the top-level nbtree.c routines as formerly. This saves a little bit of loop overhead, but more importantly it lets us skip performing the index key comparisons for dead tuples.
* A couple of tiny performance hacks in _bt_step(). Remove PageIsEmptyTom Lane2005-12-07
| | | | | | | | | checks, which were once needed because PageGetMaxOffsetNumber would fail on empty pages, but are now just redundant. Also, don't set up local variables that aren't needed in the fast path --- most of the time, we only need to advance offnum and not step across a page boundary. Motivated by noticing _bt_step at the top of OProfile profile for a pgbench run.
* Get rid of slru.c's hardwired insistence on a fixed number of slots perTom Lane2005-12-06
| | | | | | | | | SLRU area. The number of slots is still a compile-time constant (someday we might want to change that), but at least it's a different constant for each SLRU area. Increase number of subtrans buffers to 32 based on experimentation with a heavily subtrans-bashing test case, and increase number of multixact member buffers to 16, since it's obviously silly for it not to be at least twice the number of multixact offset buffers.
* Arrange for read-only accesses to SLRU page buffers to take only a sharedTom Lane2005-12-06
| | | | | | | | | | lock, not exclusive, if the desired page is already in memory. This can be demonstrated to be a significant win on the pg_subtrans cache when there is a large window of open transactions. It should be useful for pg_clog as well. I didn't try to make GetMultiXactIdMembers() use the code, as that would have taken some restructuring, and what with the local cache for multixact contents it probably wouldn't really make a difference. Per my recent proposal.
* Tweak indexscan machinery to avoid taking an AccessShareLock on an indexTom Lane2005-12-03
| | | | | | | if we already have a stronger lock due to the index's table being the update target table of the query. Same optimization I applied earlier at the table level. There doesn't seem to be much interest in the more radical idea of not locking indexes at all, so do what we can ...
* Some marginal additional hacking to shave a few more cycles offTom Lane2005-11-26
| | | | heapgettup.
* Change seqscan logic so that we check visibility of all tuples on a pageTom Lane2005-11-26
| | | | | | | | | | | | | | when we first read the page, rather than checking them one at a time. This allows us to take and release the buffer content lock just once per page, instead of once per tuple. Since it's a shared lock the contention penalty for holding the lock longer shouldn't be too bad. We can safely do this only when using an MVCC snapshot; else the assumption that visibility won't change over time is uncool. Therefore there are now two code paths depending on the snapshot type. I also made the same change in nodeBitmapHeapscan.c, where it can be done always because we only support MVCC snapshots for bitmap scans anyway. Also make some incidental cleanups in the APIs of these functions. Per a suggestion from Qingqing Zhou.
* Re-run pgindent, fixing a problem where comment lines after a blankBruce Momjian2005-11-22
| | | | | | | | | comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.
* Remove the t_datamcxt field of HeapTupleData. This was introduced forTom Lane2005-11-20
| | | | | the convenience of tuptoaster.c and is no longer needed, so may as well get rid of some small amount of overhead.