aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access
Commit message (Collapse)AuthorAge
* Rewrite gather-write patch into something less obviously bolted onTom Lane2005-08-22
| | | | | | | | | after the fact. Fix bug with incorrect test for whether we are at end of logfile segment. Arrange for writes triggered by XLogInsert's is-cache-more-than-half-full test to synchronize with the cache boundaries, so that in long transactions we tend to write alternating halves of the cache rather than randomly chosen portions of it; this saves one more write syscall per cache load.
* Improve xid wraparound message (the server isn't really shut down, justBruce Momjian2005-08-22
| | | | | | | | | not accepting queries). errmsg("database is not accepting queries to avoid wraparound data loss in database \"%s\"", errhint("Stop the postmaster and use a standalone backend to VACUUM database \"%s\".",
* Fix some inconsistent choices of datatypes in xlog.c. Make bufferTom Lane2005-08-22
| | | | | indexes all be int, rather than variously int, uint16 and uint32; add some casts where necessary to support large buffer arrays.
* Seems that the childXids list would be better based on Oid lists thanTom Lane2005-08-20
| | | | integer lists.
* Convert the arithmetic for shared memory size calculation from 'int'Tom Lane2005-08-20
| | | | | | | | | | | to 'Size' (that is, size_t), and install overflow detection checks in it. This allows us to remove the former arbitrary restrictions on NBuffers etc. It won't make any difference in a 32-bit machine, but in a 64-bit machine you could theoretically have terabytes of shared buffers. (How efficiently we could manage 'em remains to be seen.) Similarly, num_temp_buffers, work_mem, and maintenance_work_mem can be set above 2Gb on a 64-bit machine. Original patch from Koichi Suzuki, additional work by moi.
* Make GetMultiXactIdMembers() a public function.Tatsuo Ishii2005-08-20
|
* Repair problems with VACUUM destroying t_ctid chains too soon, and withTom Lane2005-08-20
| | | | | | | | | | | | insufficient paranoia in code that follows t_ctid links. (We must do both because even with VACUUM doing it properly, the intermediate state with a dangling t_ctid link is visible concurrently during lazy VACUUM, and could be seen afterwards if either type of VACUUM crashes partway through.) Also try to improve documentation about what's going on. Patch is a bit bulky because passing the XMAX information around required changing the APIs of some low-level heapam.c routines, but it's not conceptually very complicated. Per trouble report from Teodor and subsequent analysis. This needs to be back-patched, but I'll do that after 8.1 beta is out.
* Avoid an Assert failure if OuterUserId hasn't been set yet duringTom Lane2005-08-17
| | | | | AbortTransaction. This can happen if a backend's InitPostgres transaction fails (eg, because the given username is invalid). Per Alvaro.
* Change a couple of "can't happen" error messages to be a shade moreTom Lane2005-08-12
| | | | | verbose when they do happen. The "left link changed unexpectedly" one in particular has been seen more than once in the field.
* Solve the problem of OID collisions by probing for duplicate OIDsTom Lane2005-08-12
| | | | | | | whenever we generate a new OID. This prevents occasional duplicate-OID errors that can otherwise occur once the OID counter has wrapped around. Duplicate relfilenode values are also checked for when creating new physical files. Per my recent proposal.
* Autovacuum loose end mop-up. Provide autovacuum-specific vacuum costTom Lane2005-08-11
| | | | | | | delay and limit, both as global GUCs and as table-specific entries in pg_autovacuum. stats_reset_on_server_start is now OFF by default, but a reset is forced if we did WAL replay. XID-wrap vacuums do not ANALYZE, but do FREEZE if it's a template database. Alvaro Herrera
* Mention MD5 function index for indexing long values.Bruce Momjian2005-08-11
|
* Make new hints follow style guide.Tom Lane2005-08-10
|
* Add hints to cases where indexes fail because of values that are too long.Bruce Momjian2005-08-10
|
* Modify AtEOXact_CatCache and AtEOXact_RelationCache to assume that theTom Lane2005-08-08
| | | | | | | | | | | ResourceOwner mechanism already released all reference counts for the cache entries; therefore, we do not need to scan the catcache or relcache at transaction end, unless we want to do it as a debugging crosscheck. Do the crosscheck only in Assert mode. This is the same logic we had previously installed in AtEOXact_Buffers to avoid overhead with large numbers of shared buffers. I thought it'd be a good idea to do it here too, in view of Kari Lavikka's recent report showing a real-world case where AtEOXact_CatCache is taking a significant fraction of runtime.
* Code and docs review for pg_column_size() patch.Tom Lane2005-08-02
|
* Add NOWAIT option to SELECT FOR UPDATE/SHARE.Tom Lane2005-08-01
| | | | | Original patch by Hans-Juergen Schoenig, revisions by Karel Zak and Tom Lane.
* Add per-user and per-database connection limit options.Tom Lane2005-07-31
| | | | | This patch also includes preliminary update of pg_dumpall for roles. Petr Jelinek, with review by Bruce Momjian and Tom Lane.
* Fix compile for no O_SYNC, but introduced with O_DIRECT.Bruce Momjian2005-07-30
|
* Clean up a number of autovacuum loose ends. Make the stats collectorTom Lane2005-07-29
| | | | | | | | track shared relations in a separate hashtable, so that operations done from different databases are counted correctly. Add proper support for anti-XID-wraparound vacuuming, even in databases that are never connected to and so have no stats entries. Miscellaneous other bug fixes. Alvaro Herrera, some additional fixes by Tom Lane.
* Update O_DIRECT comment.Bruce Momjian2005-07-29
|
* Use O_DIRECT if available when using O_SYNC for wal_sync_method.Bruce Momjian2005-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Also, write multiple WAL buffers out in one write() operation. ITAGAKI Takahiro --------------------------------------------------------------------------- > If we disable writeback-cache and use open_sync, the per-page writing > behavior in WAL module will show up as bad result. O_DIRECT is similar > to O_DSYNC (at least on linux), so that the benefit of it will disappear > behind the slow disk revolution. > > In the current source, WAL is written as: > for (i = 0; i < N; i++) { write(&buffers[i], BLCKSZ); } > Is this intentional? Can we rewrite it as follows? > write(&buffers[0], N * BLCKSZ); > > In order to achieve it, I wrote a 'gather-write' patch (xlog.gw.diff). > Aside from this, I'll also send the fixed direct io patch (xlog.dio.diff). > These two patches are independent, so they can be applied either or both. > > > I tested them on my machine and the results as follows. It shows that > direct-io and gather-write is the best choice when writeback-cache is off. > Are these two patches worth trying if they are used together? > > > | writeback | fsync= | fdata | open_ | fsync_ | open_ > patch | cache | false | sync | sync | direct | direct > ------------+-----------+--------+-------+-------+--------+--------- > direct io | off | 124.2 | 105.7 | 48.3 | 48.3 | 48.2 > direct io | on | 129.1 | 112.3 | 114.1 | 142.9 | 144.5 > gather-write| off | 124.3 | 108.7 | 105.4 | (N/A) | (N/A) > both | off | 131.5 | 115.5 | 114.4 | 145.4 | 145.2 > > - 20runs * pgbench -s 100 -c 50 -t 200 > - with tuning (wal_buffers=64, commit_delay=500, checkpoint_segments=8) > - using 2 ATA disks: > - hda(reiserfs) includes system and wal. > - hdc(jfs) includes database files. writeback-cache is always on. > > --- > ITAGAKI Takahiro
* Add SET ROLE. This is a partial commit of Stephen Frost's recent patch;Tom Lane2005-07-25
| | | | I'm still working on the has_role function and information_schema changes.
* Remove unintended code addition.Bruce Momjian2005-07-23
|
* Macro alignment cleanup.Bruce Momjian2005-07-23
|
* Fix a couple of bogus comments, per Alvaro.Tom Lane2005-07-13
|
* Even though I'd like to see full_page_writes go away before 8.1,Tom Lane2005-07-08
| | | | | a minimum requirement is that it not completely break the system meanwhile. Put the test in the right place.
* Add pg_column_size() to return storage size of a column, includingBruce Momjian2005-07-06
| | | | | | possible compression. Mark Kirkwood
* Add GUC full_page_writes to control writing full pages to WAL.Bruce Momjian2005-07-05
|
* Arrange for the postmaster (and standalone backends, initdb, etc) toTom Lane2005-07-04
| | | | | | | | chdir into PGDATA and subsequently use relative paths instead of absolute paths to access all files under PGDATA. This seems to give a small performance improvement, and it should make the system more robust against naive DBAs doing things like moving a database directory that has a live postmaster in it. Per recent discussion.
* Migrate rtree_gist functionality into the core system, and add someTom Lane2005-07-01
| | | | | | | basic regression tests for GiST to the standard regression tests. I took the opportunity to add an rtree-equivalent gist opclass for circles; the contrib version only covered boxes and polygons, but indexing circles is very handy for distance searches.
* Improve error messages and add commentTeodor Sigaev2005-07-01
|
* Bug fixes for GiST crash recovery.Teodor Sigaev2005-06-30
| | | | | | - add forgotten check of lsn for insert completion - remove level of pages: hard to check in recovery - some cleanups
* Improve the checkpoint signaling mechanism so that the bgwriter can tellTom Lane2005-06-30
| | | | | | | | the difference between checkpoints forced due to WAL segment consumption and checkpoints forced for other reasons (such as CREATE DATABASE). Avoid generating 'checkpoints are occurring too frequently' messages when the checkpoint wasn't caused by WAL segment consumption. Per gripe from Chris K-L.
* Clean up the rather historically encumbered interface to now() andTom Lane2005-06-29
| | | | | | | | current time: provide a GetCurrentTimestamp() function that returns current time in the form of a TimestampTz, instead of separate time_t and microseconds fields. This is what all the callers really want anyway, and it eliminates low-level dependencies on AbsoluteTime, which is a deprecated datatype that will have to disappear eventually.
* Cleanup, remove unneeded pallocsTeodor Sigaev2005-06-29
|
* Code cleanup. gistfillbuffer accepts InvalidOffsetNumber.Teodor Sigaev2005-06-28
|
* Replace pg_shadow and pg_group by new role-capable catalogs pg_authidTom Lane2005-06-28
| | | | | | | | and pg_auth_members. There are still many loose ends to finish in this patch (no documentation, no regression tests, no pg_dump support for instance). But I'm going to commit it now anyway so that Alvaro can make some progress on shared dependencies. The catalog changes should be pretty much done.
* Concurrency for GiSTTeodor Sigaev2005-06-27
| | | | | | | | | | | | | | | | | | - full concurrency for insert/update/select/vacuum: - select and vacuum never locks more than one page simultaneously - select (gettuple) hasn't any lock across it's calls - insert never locks more than two page simultaneously: - during search of leaf to insert it locks only one page simultaneously - while walk upward to the root it locked only parent (may be non-direct parent) and child. One of them X-lock, another may be S- or X-lock - 'vacuum full' locks index - improve gistgetmulti - simplify XLOG records Fix bug in index_beginscan_internal: LockRelation may clean rd_aminfo structure, so move GET_REL_PROCEDURE after LockRelation
* Extend r-tree operator classes to handle Y-direction tests equivalentTom Lane2005-06-24
| | | | | | | | | | | | | | | to the existing X-direction tests. An rtree class now includes 4 actual 2-D tests, 4 1-D X-direction tests, and 4 1-D Y-direction tests. This involved adding four new Y-direction test operators for each of box and polygon; I followed the PostGIS project's lead as to the names of these operators. NON BACKWARDS COMPATIBLE CHANGE: the poly_overleft (&<) and poly_overright (&>) operators now have semantics comparable to box_overleft and box_overright. This is necessary to make r-tree indexes work correctly on polygons. Also, I changed circle_left and circle_right to agree with box_left and box_right --- formerly they allowed the boundaries to touch. This isn't actually essential given the lack of any r-tree opclass for circles, but it seems best to sync all the definitions while we are at it.
* Fix rtree and contrib/rtree_gist search behavior for the 1-D box andTom Lane2005-06-24
| | | | | | | | | | polygon operators (<<, &<, >>, &>). Per ideas originally put forward by andrew@supernews and later rediscovered by moi. This patch just fixes the existing opclasses, and does not add any new behavior as I proposed earlier; that can be sorted out later. In principle this could be back-patched, since it changes only search behavior and not system catalog entries nor rtree index contents. I'm not currently planning to do that, though, since I think it could use more testing.
* Fix the mechanism for reporting the original table OID and column numberTom Lane2005-06-22
| | | | | of columns of a query result so that it can "see through" cursors and prepared statements. Per gripe a couple months back from John DeSoi.
* Avoid WAL-logging individual tuple insertions during CREATE TABLE ASTom Lane2005-06-20
| | | | | | (a/k/a SELECT INTO). Instead, flush and fsync the whole relation before committing. We do still need the WAL log when PITR is active, however. Simon Riggs and Tom Lane.
* fix founded hole in recovery after crash, add vacuum_delay_point()Teodor Sigaev2005-06-20
|
* 1. full functional WAL for GiSTTeodor Sigaev2005-06-20
| | | | | | | | | | | 2. improve vacuum for gist - use FSM - full vacuum: - reforms parent tuple if it's needed ( tuples was deleted on child page or parent tuple remains invalid after crash recovery ) - truncate index file if possible 3. fixes bugs and mistakes
* Avoid unnecessary palloc overhead in _bt_first(). The temporaryTom Lane2005-06-19
| | | | | | | scankeys arrays that it needs can never have more than INDEX_MAX_KEYS entries, so it's reasonable to just allocate them as fixed-size local arrays, and save the cost of palloc/pfree. Not a huge savings, but a cycle saved is a cycle earned ...
* Need #include <time.h> on some platforms.Tom Lane2005-06-19
|
* Simplify uses of readdir() by creating a function ReadDir() thatTom Lane2005-06-19
| | | | | | | includes error checking and an appropriate ereport(ERROR) message. This gets rid of rather tedious and error-prone manipulation of errno, as well as a Windows-specific bug workaround, at more than a dozen call sites. After an idea in a recent patch by Heikki Linnakangas.
* Arrange to fsync two-phase-commit state files only during checkpoints;Tom Lane2005-06-19
| | | | | | given reasonably short lifespans for prepared transactions, this should mean that only a small minority of state files ever need to be fsynced at all. Per discussion with Heikki Linnakangas.
* Add a time-of-preparation column to the pg_prepared_xacts view, per anTom Lane2005-06-18
| | | | | | | | | | old suggestion by Oliver Jowett. Also, add a transaction column to the pg_locks view to show the xid of each transaction holding or awaiting locks; this allows prepared transactions to be properly associated with the locks they own. There was already a column named 'transaction', and I chose to rename it to 'transactionid' --- since this column is new in the current devel cycle there should be no backwards compatibility issue to worry about.