aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access
Commit message (Collapse)AuthorAge
* XLOG (and related) changes:Tom Lane2001-03-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Store two past checkpoint locations, not just one, in pg_control. On startup, we fall back to the older checkpoint if the newer one is unreadable. Also, a physical copy of the newest checkpoint record is kept in pg_control for possible use in disaster recovery (ie, complete loss of pg_xlog). Also add a version number for pg_control itself. Remove archdir from pg_control; it ought to be a GUC parameter, not a special case (not that it's implemented yet anyway). * Suppress successive checkpoint records when nothing has been entered in the WAL log since the last one. This is not so much to avoid I/O as to make it actually useful to keep track of the last two checkpoints. If the things are right next to each other then there's not a lot of redundancy gained... * Change CRC scheme to a true 64-bit CRC, not a pair of 32-bit CRCs on alternate bytes. Polynomial borrowed from ECMA DLT1 standard. * Fix XLOG record length handling so that it will work at BLCKSZ = 32k. * Change XID allocation to work more like OID allocation. (This is of dubious necessity, but I think it's a good idea anyway.) * Fix a number of minor bugs, such as off-by-one logic for XLOG file wraparound at the 4 gig mark. * Add documentation and clean up some coding infelicities; move file format declarations out to include files where planned contrib utilities can get at them. * Checkpoint will now occur every CHECKPOINT_SEGMENTS log segments or every CHECKPOINT_TIMEOUT seconds, whichever comes first. It is also possible to force a checkpoint by sending SIGUSR1 to the postmaster (undocumented feature...) * Defend against kill -9 postmaster by storing shmem block's key and ID in postmaster.pid lockfile, and checking at startup to ensure that no processes are still connected to old shmem block (if it still exists). * Switch backends to accept SIGQUIT rather than SIGUSR1 for emergency stop, for symmetry with postmaster and xlog utilities. Clean up signal handling in bootstrap.c so that xlog utilities launched by postmaster will react to signals better. * Standalone bootstrap now grabs lockfile in target directory, as added insurance against running it in parallel with live postmaster.
* Repair a number of places that didn't bother to check whether PageAddItemTom Lane2001-03-07
| | | | | | | | | | succeeds or not. Revise rtree page split algorithm to take care about making a feasible split --- ie, will the incoming tuple actually fit? Failure to make a feasible split, combined with failure to notice the failure, account for Jim Stone's recent bug report. I suspect that hash and gist indices may have the same type of bug, but at least now we'll get error messages rather than silent failures if so. Also clean up rtree code to use Datum rather than char* where appropriate.
* Implement COMMIT_SIBLINGS parameter to allow pre-commit delay to occurTom Lane2001-02-26
| | | | | | | | | | | only if at least N other backends currently have open transactions. This is not a great deal of intelligence about whether a delay might be profitable ... but it beats no intelligence at all. Note that the default COMMIT_DELAY is still zero --- this new code does nothing unless that setting is changed. Also, mark ENABLEFSYNC as a system-wide setting. It's no longer safe to allow that to be set per-backend, since we may be relying on some other backend's fsync to have synced the WAL log.
* Clean up index/btree comments/macros, as approved.Bruce Momjian2001-02-22
|
* Avoid 'FATAL: out of free buffers: time to abort !" errorHiroshi Inoue2001-02-22
| | | | during WAL recovery. Recovery failure is always serious.
* Change default commit_delay to zero, update documentation.Tom Lane2001-02-18
|
* Change s_lock to not use any zero-delay select() calls; these are just aTom Lane2001-02-18
| | | | | | | | | | waste of cycles on single-CPU machines, and of dubious utility on multi-CPU machines too. Tweak s_lock_stuck so that caller can specify timeout interval, and increase interval before declaring stuck spinlock for buffer locks and XLOG locks. On systems that have fdatasync(), use that rather than fsync() to sync WAL log writes. Ensure that WAL file is entirely allocated during XLogFileInit.
* Although we can't support out-of-line TOAST storage in indexes (yet),Tom Lane2001-02-15
| | | | | | | | compressed storage works perfectly well. Might as well have a coherent strategy for applying it, rather than the haphazard store-what-you-get approach that was in the code before. The strategy I've set up here is to attempt compression of any compressible index value exceeding BLCKSZ/16, or about 500 bytes by default.
* Comments about GetFreeXLBuffer().Vadim B. Mikheev2001-02-13
| | | | | GetFreeXLBuffer(): use Insert->LgwrResult instead of private LgwrResult copy if it's more fresh (attempt to avoid acquiring info_lck/lgwr_lck).
* Removed abort() in XLogFileOpen.Vadim B. Mikheev2001-02-13
|
* When updating a tuple containing compressed-in-line fields, do notTom Lane2001-02-09
| | | | decompress the existing fields unnecessarily.
* Runtime btree recovery is now ON by default.Vadim B. Mikheev2001-02-07
|
* Runtime tree recovery is implemented, just testing is left -:)Vadim B. Mikheev2001-02-02
|
* Couple additional functions to fix tree at runtime.Vadim B. Mikheev2001-01-31
| | | | | Need in one more function to handle "my bits moved..." case. FixBTree is still FALSE.
* Call _bt_fixroot() from _bt_insertonpg.Vadim B. Mikheev2001-01-29
|
* Clean up handling of tuple descriptors so that result-tuple descriptorsTom Lane2001-01-29
| | | | | | | | allocated by plan nodes are not leaked at end of query. This doesn't really matter for normal queries, but it sure does for queries invoked repetitively inside SQL functions. Clean up some other grotty code associated with tupdescs, and fix a few other memory leaks exposed by tests with simple SQL functions.
* First step in attempt to fix tree at runtime: create upper levelsVadim B. Mikheev2001-01-26
| | | | | | | and new root page if old root one was splitted but new root page wasn't created. New code is protected by FixBTree bool flag setted to FALSE, so nothing should be affected by this untested approach.
* Change Copyright from PostgreSQL, Inc to PostgreSQL Global Development Group.Bruce Momjian2001-01-24
|
* Do _bt_wrtbuf() outside critical section, per discussion with Vadim 1/19.Tom Lane2001-01-23
|
* Fix all the places that called heap_update() and heap_delete() withoutTom Lane2001-01-23
| | | | | | | | | | | bothering to check the return value --- which meant that in case the update or delete failed because of a concurrent update, you'd not find out about it, except by observing later that the transaction produced the wrong outcome. There are now subroutines simple_heap_update and simple_heap_delete that should be used anyplace that you're not prepared to do the full nine yards of coping with concurrent updates. In practice, that seems to mean absolutely everywhere but the executor, because *noplace* else was checking.
* Make critical sections (elog->crash) and interrupt holdoff sectionsTom Lane2001-01-19
| | | | into distinct concepts, per recent discussion on pghackers.
* Comment out xlrec in xact_redo - no support for file unlinking onVadim B. Mikheev2001-01-18
| | | | commit yet.
* Tweak heap_update/delete so that we do not hold the buffer context lockTom Lane2001-01-15
| | | | on the old tuple's page while we are doing TOAST pushups.
* Restructure backend SIGINT/SIGTERM handling so that 'die' interruptsTom Lane2001-01-14
| | | | | | | are treated more like 'cancel' interrupts: the signal handler sets a flag that is examined at well-defined spots, rather than trying to cope with an interrupt that might happen anywhere. See pghackers discussion of 1/12/01.
* Add more critical-section calls: all code sections that hold spinlocksTom Lane2001-01-12
| | | | | | | | | | | are now critical sections, so as to ensure die() won't interrupt us while we are munging shared-memory data structures. Avoid insecure intermediate states in some code that proc_exit will call, like palloc/pfree. Rename START/END_CRIT_CODE to START/END_CRIT_SECTION, since that seems to be what people tend to call them anyway, and make them be called with () like a function call, in hopes of not confusing pg_indent. I doubt that this is sufficient to make SIGTERM safe anywhere; there's just too much code that could get invoked during proc_exit().
* New feature:Marc G. Fournier2001-01-12
| | | | | | | | | | | | | | | | | | | | | | 1. Support of variable size keys - new algorithm of insertion to tree (GLI - gist layrered insertion). Previous algorithm was implemented as described in paper by Joseph M. Hellerstein et.al "Generalized Search Trees for Database Systems". This (old) algorithm was not suitable for variable size keys and could be not effective ( walking up-down ) in case of multiple levels split Bug fixed: 1. fixed bug in gistPageAddItem - key values were written to disk uncompressed. This caused failure if decompression function does real job. 2. NULLs handling - we keep NULLs in tree. Right way is to remove them, but we don't know how to inform vacuum about index statistics. This is just cosmetic warning message (like in case with R-Tree), but I'm not sure how to recognize real problem if we remove NULLs and suppress this warning as Tom suggested. 3. various memory leaks This work was done by Teodor Sigaev (teodor@stack.net) and Oleg Bartunov (oleg@sai.msu.su).
* 1. Checkpoint.undo may be after checkpoint itself:Vadim B. Mikheev2001-01-09
| | | | | | | | - no more elog(STOP) in StartupXLOG(); - both checkpoint' undo & redo are used to define oldest on-line log file. 2. Ability to pre-allocate a few log files at checkpoint time (wal_files option). Off by default.
* Correct nasty error in heap_update: it was releasing the buffer refcountTom Lane2001-01-07
| | | | | | | | | before calling RelationInvalidateHeapTuple(), which is bad because the latter needs to look at the tuple data, which is in the shared disk buffer. If another backend manages to recycle the buffer while this is going on, we will compute the wrong hashindex for the tuple or maybe even crash outright. Must hold buffer refcount until afterwards. (This bug is not in 7.0.*; seems to be have introduced during WAL changes.)
* Clean up non-reentrant interface for hash_seq/HashTableWalk, so thatTom Lane2001-01-02
| | | | | | | | starting a new hashtable search no longer clobbers any other search active anywhere in the system. Fix RelationCacheInvalidate() so that it will not crash or go into an infinite loop if invoked recursively, as for example by a second SI Reset message arriving while we are still processing a prior one.
* 1. WAL needs in zero-ed content of newly initialized page.Vadim B. Mikheev2000-12-30
| | | | | 2. Log record for PageRepaireFragmentation now keeps array of !LP_USED offnums to redo cleanup properly.
* Fixed misprint in heap update WALoging.Vadim B. Mikheev2000-12-30
|
* Fix failure in CreateCheckPoint on some Alpha boxes --- it's not OK toTom Lane2000-12-29
| | | | | | | assume that TAS() will always succeed the first time, even if the lock is known to be free. Also, make sure that code will eventually time out and report a stuck spinlock, rather than looping forever. Small cleanups in s_lock.h, too.
* MUST update (in-memory) data page BEFORE XLogInsert to logVadim B. Mikheev2000-12-29
| | | | NEW page content if WAL will decide to backup page.
* nbtree_xlog_newroot: set meta flag in meta page opaque.Vadim B. Mikheev2000-12-29
|
* New WAL version - CRC and data blocks backup.Vadim B. Mikheev2000-12-28
|
* Fix portability problems recently exposed by regression tests on Alphas.Tom Lane2000-12-27
| | | | | | | | | | 1. Distinguish cases where a Datum representing a tuple datatype is an OID from cases where it is a pointer to TupleTableSlot, and make sure we use the right typlen in each case. 2. Make fetchatt() and related code support 8-byte by-value datatypes on machines where Datum is 8 bytes. Centralize knowledge of the available by-value datatype sizes in two macros in tupmacs.h, so that this will be easier if we ever have to do it again.
* Revise lock manager to support "session level" locks as well as "transactionTom Lane2000-12-22
| | | | | | | | | | | | | | | | level" locks. A session lock is not released at transaction commit (but it is released on transaction abort, to ensure recovery after an elog(ERROR)). In VACUUM, use a session lock to protect the master table while vacuuming a TOAST table, so that the TOAST table can be done in an independent transaction. I also took this opportunity to do some cleanup and renaming in the lock code. The previously noted bug in ProcLockWakeup, that it couldn't wake up any waiters beyond the first non-wakeable waiter, is now fixed. Also found a previously unknown bug of the same kind (failure to scan all members of a lock queue in some cases) in DeadLockCheck. This might have led to failure to detect a deadlock condition, resulting in indefinite waits, but it's difficult to characterize the conditions required to trigger a failure.
* >> Here is a patch for the beos port (All regression tests are OK).Bruce Momjian2000-12-18
| | | | | | | | | | | | | | | | | | | | | | >> xlog.c : special case for beos to avoid 'link' which does not work yet >> beos/sem.c : implementation of new sem_ctl call (GETPID) and a new >sem_op >> flag (IPCNOWAIT) >> dynloader/beos.c : add a verification of symbol validity (seem that the >> loader sometime return OK with an invalid symbol) >> postmaster.c : add beos forking support for the new checkpoint process >> postgres.c : remove beos special case for getrusage >> beos.h : Correction of a bas definition of AF_UNIX, misc defnitions >> >> >> thanks >> >> >> cyril Cyril VELTER
* Clean up backend-exit-time cleanup behavior. Use on_shmem_exit callbacksTom Lane2000-12-18
| | | | | | to ensure that we have released buffer refcounts and so forth, rather than putting ad-hoc operations before (some of the calls to) proc_exit. Add commentary to discourage future hackers from repeating that mistake.
* Remove elog for online log files.Vadim B. Mikheev2000-12-11
|
* elog(LOG)-->elog(DEBUG) for skipped logs.Vadim B. Mikheev2000-12-11
|
* Resolve complie error(was my fault).Hiroshi Inoue2000-12-11
|
* *redo: Heap move* neglects to set t_cmin for MOVED_IN tuples.Hiroshi Inoue2000-12-11
|
* Repair erroneous use of hashvarlena() for MACADDR, which is not aTom Lane2000-12-08
| | | | | | | varlena type. (I did not force initdb, but you won't see the fix unless you do one.) Also, make sure all index support operators and functions are careful not to leak memory for toasted inputs; I had missed some hash and rtree support ops on this point before.
* Resurrect -F switch: it controls fsyncs again, though the fsyncs areTom Lane2000-12-08
| | | | | mostly just on the WAL logfile nowadays. But if people want to disable fsync for performance, why should we say no?
* RecordTransactionAbort() shouldn't log XLOG_XACT_ABORTHiroshi Inoue2000-12-07
| | | | if the transaction has already been committed ?
* Silence compiler warning.Tom Lane2000-12-07
|
* Disable elog(ERROR|FATAL) in signal handlers inVadim B. Mikheev2000-12-03
| | | | critical sections of code.
* Make tuple receive/print routines TOAST-aware. Formerly, printtup wouldTom Lane2000-12-01
| | | | | leak memory when printing a toasted attribute, and printtup_internal didn't work at all...
* Remove VARLENA_FIXED_SIZE hack, which is irreversibly broken now thatTom Lane2000-11-30
| | | | | | | | both MULTIBYTE and TOAST prevent char(n) from being truly fixed-size. Simplify and speed up fastgetattr() and index_getattr() macros by eliminating special cases for attnum=1. It's just as fast to handle the first attribute by presetting its attcacheoff to zero; so do that instead when loading the tupledesc in relcache.c.