aboutsummaryrefslogtreecommitdiff
path: root/src/backend/commands
Commit message (Collapse)AuthorAge
* Make CLUSTER and REINDEX silently skip remote temp tables in theirAlvaro Herrera2007-09-10
| | | | | | database-wide editions. Per report from bitsandbytes88 <at> hotmail.com and subsequent discussion.
* Release the exclusive lock on the table early after truncating it in lazyAlvaro Herrera2007-09-10
| | | | vacuum, instead of waiting till commit.
* Remove the vacuum_delay_point call in count_nondeletable_pages, because we holdAlvaro Herrera2007-09-10
| | | | | | | | | | | | an exclusive lock on the table at this point, which we want to release as soon as possible. This is called in the phase of lazy vacuum where we truncate the empty pages at the end of the table. An alternative solution would be to lower the vacuum delay settings before starting the truncating phase, but this doesn't work very well in autovacuum due to the autobalancing code (which can cause other processes to change our cost delay settings). This case could be considered in the balancing code, but it is simpler this way.
* Replace the former method of determining snapshot xmax --- to wit, callingTom Lane2007-09-08
| | | | | | | | | | | | | | | ReadNewTransactionId from GetSnapshotData --- with a "latestCompletedXid" variable that is updated during transaction commit or abort. Since latestCompletedXid is written only in places that had to lock ProcArrayLock exclusively anyway, and is read only in places that had to lock ProcArrayLock shared anyway, it adds no new locking requirements to the system despite being cluster-wide. Moreover, removing ReadNewTransactionId from snapshot acquisition eliminates the need to take both XidGenLock and ProcArrayLock at the same time. Since XidGenLock is sometimes held across I/O this can be a significant win. Some preliminary benchmarking suggested that this patch has no effect on average throughput but can significantly improve the worst-case transaction times seen in pgbench. Concept by Florian Pflug, implementation by Tom Lane.
* Don't take ProcArrayLock while exiting a transaction that has no XID; there isTom Lane2007-09-07
| | | | | | | | | | no need for serialization against snapshot-taking because the xact doesn't affect anyone else's snapshot anyway. Per discussion. Also, move various info about the interlocking of transactions and snapshots out of code comments and into a hopefully-more-cohesive discussion in access/transam/README. Also, remove a couple of now-obsolete comments about having to force some WAL to be written to persuade RecordTransactionCommit to do its thing.
* Allow CREATE INDEX CONCURRENTLY to disregard transactions in otherTom Lane2007-09-07
| | | | | databases, per gripe from hubert depesz lubaczewski. Patch from Simon Riggs.
* Make eval_const_expressions() preserve typmod when simplifying something likeTom Lane2007-09-06
| | | | | | | | | | | null::char(3) to a simple Const node. (It already worked for non-null values, but not when we skipped evaluation of a strict coercion function.) This prevents loss of typmod knowledge in situations such as exhibited in bug #3598. Unfortunately there seems no good way to fix that bug in 8.1 and 8.2, because they simply don't carry a typmod for a plain Const node. In passing I made all the other callers of makeNullConst supply "real" typmod values too, though I think it probably doesn't matter anywhere else.
* Implement lazy XID allocation: transactions that do not modify any databaseTom Lane2007-09-05
| | | | | | | | | | | | | | | rows will normally never obtain an XID at all. We already did things this way for subtransactions, but this patch extends the concept to top-level transactions. In applications where there are lots of short read-only transactions, this should improve performance noticeably; not so much from removal of the actual XID-assignments, as from reduction of overhead that's driven by the rate of XID consumption. We add a concept of a "virtual transaction ID" so that active transactions can be uniquely identified even if they don't have a regular XID. This is a much lighter-weight concept: uniqueness of VXIDs is only guaranteed over the short term, and no on-disk record is made about them. Florian Pflug, with some editorialization by Tom.
* Provide for binary input/output of enums, to fix complaint from Merlin Moncure.Andrew Dunstan2007-09-04
| | | | | This just provides text values, we're not exposing the underlying Oid representation. Catalog version bumped.
* Support SET FROM CURRENT in CREATE/ALTER FUNCTION, ALTER DATABASE, ALTER ROLE.Tom Lane2007-09-03
| | | | | | | (Actually, it works as a plain statement too, but I didn't document that because it seems a bit useless.) Unify VariableResetStmt with VariableSetStmt, and clean up some ancient cruft in the representation of same.
* Implement function-local GUC parameter settings, as per recent discussion.Tom Lane2007-09-03
| | | | | | | There are still some loose ends: I didn't do anything about the SET FROM CURRENT idea yet, and it's not real clear whether we are happy with the interaction of SET LOCAL with function-local settings. The documentation is a bit spartan, too.
* Fix a couple of misbehaviors rooted in the fact that the default creationTom Lane2007-08-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | namespace isn't necessarily first in the search path (there could be implicit schemas ahead of it). Examples are test=# set search_path TO s1; test=# create view pg_timezone_names as select * from pg_timezone_names(); ERROR: "pg_timezone_names" is already a view test=# create table pg_class (f1 int primary key); ERROR: permission denied: "pg_class" is a system catalog You'd expect these commands to create the requested objects in s1, since names beginning with pg_ aren't supposed to be reserved anymore. What is happening is that we create the requested base table and then execute additional commands (here, CREATE RULE or CREATE INDEX), and that code is passed the same RangeVar that was in the original command. Since that RangeVar has schemaname = NULL, the secondary commands think they should do a path search, and that means they find system catalogs that are implicitly in front of s1 in the search path. This is perilously close to being a security hole: if the secondary command failed to apply a permission check then it'd be possible for unprivileged users to make schema modifications to system catalogs. But as far as I can find, there is no code path in which a check doesn't occur. Which makes it just a weird corner-case bug for people who are silly enough to want to name their tables the same as a system catalog. The relevant code has changed quite a bit since 8.2, which means this patch wouldn't work as-is in the back branches. Since it's a corner case no one has reported from the field, I'm not going to bother trying to back-patch.
* Fix brain fade in DefineIndex(): it was continuing to access the table'sTom Lane2007-08-25
| | | | | | | | | | | | | | | relcache entry after having heap_close'd it. This could lead to misbehavior if a relcache flush wiped out the cache entry meanwhile. In 8.2 there is a very real risk of CREATE INDEX CONCURRENTLY using the wrong relid for locking and waiting purposes. I think the bug is only cosmetic in 8.0 and 8.1, because their transgression is limited to using RelationGetRelationName(rel) in an ereport message immediately after heap_close, and there's no way (except with special debugging options) for a cache flush to occur in that interval. Not quite sure that it's cosmetic in 7.4, but seems best to patch anyway. Found by trying to run the regression tests with CLOBBER_CACHE_ALWAYS enabled. Maybe we should try to do that on a regular basis --- it's awfully slow, but perhaps some fast buildfarm machine could do it once in awhile.
* Suppress testing the options of CREATE TEXT SEARCH DICTIONARY duringTom Lane2007-08-22
| | | | | initdb. We should create all the standard dictionaries even though some of them may not work in template1's encoding. Per Teodor.
* Remove option to change parser of an existing text search configuration.Tom Lane2007-08-22
| | | | | | This prevents needing to do complex and poorly-defined updates of the mapping table if the new parser has different token types than the old. Per discussion.
* Simplify the syntax of CREATE/ALTER TEXT SEARCH DICTIONARY by treating theTom Lane2007-08-22
| | | | | | | | | | | | init options of the template as top-level options in the syntax. This also makes ALTER a bit easier to use, since options can be replaced individually. I also made these statements verify that the tmplinit method will accept the new settings before they get stored; in the original coding you didn't find out about mistakes until the dictionary got invoked. Under the hood, init methods now get options as a List of DefElem instead of a raw text string --- that lets tsearch use existing options-pushing code instead of duplicating functionality.
* Simplify CREATE TEXT SEARCH CONFIGURATION by eliminating the separateTom Lane2007-08-21
| | | | | | | | | 'with map' parameter; as things now stand there's really not much point in specifying a config-to-copy if you don't copy its map. Also, use COPY instead of TEMPLATE as the key word for a config-to-copy, so as to avoid confusion with text search templates. Per discussion; the just-committed reference page for the command already describes it this way.
* Tsearch2 functionality migrates to core. The bulk of this work is byTom Lane2007-08-21
| | | | | | | | Oleg Bartunov and Teodor Sigaev, but I did a lot of editorializing, so anything that's broken is probably my fault. Documentation is nonexistent as yet, but let's land the patch so we can get some portability testing done.
* Arrange to cache a ResultRelInfo in the executor's EState for relations thatTom Lane2007-08-15
| | | | | | | | | | | | | are not one of the query's defined result relations, but nonetheless have triggers fired against them while the query is active. This was formerly impossible but can now occur because of my recent patch to fix the firing order for RI triggers. Caching a ResultRelInfo avoids duplicating work by repeatedly opening and closing the same relation, and also allows EXPLAIN ANALYZE to "see" and report on these extra triggers. Use the same mechanism to cache open relations when firing deferred triggers at transaction shutdown; this replaces the former one-element-cache strategy used in that case, and should improve performance a bit when there are deferred triggers on a number of relations.
* Repair problems occurring when multiple RI updates have to be done to the sameTom Lane2007-08-15
| | | | | | | | | row within one query: we were firing check triggers before all the updates were done, leading to bogus failures. Fix by making the triggers queued by an RI update go at the end of the outer query's trigger event list, thereby effectively making the processing "breadth-first". This was indeed how it worked pre-8.0, so the bug does not occur in the 7.x branches. Per report from Pavel Stehule.
* Fix two bugs induced in VACUUM FULL by async-commit patch.Tom Lane2007-08-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | First, we cannot assume that XLogAsyncCommitFlush guarantees hint bits will be settable, because clog.c's inexact LSN bookkeeping results in windows where a previously flushed transaction is considered unhintable because it shares an LSN slot with a later unflushed transaction. But repair_frag requires XMIN_COMMITTED to be correct so that it can distinguish tuples moved by the current vacuum. Since not being able to set the bit is an uncommon corner case, the most practical way of dealing with it seems to be to abandon shrinking (ie, don't invoke repair_frag) when we find a non-dead tuple whose XMIN_COMMITTED bit couldn't be set. Second, it is possible for the same reason that a RECENTLY_DEAD tuple does not get its XMAX_COMMITTED bit set during scan_heap. But by the time repair_frag examines the tuple it might be possible to set the bit. We therefore must take buffer content lock when calling HeapTupleSatisfiesVacuum a second time, else we can get an Assert failure in SetBufferCommitInfoNeedsSave. This latter bug is latent in existing releases, but I think it cannot actually occur without async commit, since the first HeapTupleSatisfiesVacuum call should always have set the bit. So I'm not going to back-patch it. In passing, reduce the existing "cannot shrink relation" messages from NOTICE to LOG level. The new message must be no higher than LOG if we don't want unpredictable regression test failures, and consistency seems like a good idea. Also arrange that only one such message is reported per VACUUM FULL; in typical scenarios you could get spammed with many such messages, which seems a bit useless.
* Switch over to using the src/timezone functions for formatting timestampsTom Lane2007-08-04
| | | | | | | | | | | | | | displayed in the postmaster log. This avoids Windows-specific problems with localized time zone names that are in the wrong encoding, and generally seems like a good idea to forestall other potential platform-dependent issues. To preserve the existing behavior that all backends will log in the same time zone, create a new GUC variable log_timezone that can only be changed on a system-wide basis, and reference log-related calculations to that zone instead of the TimeZone variable. This fixes the issue reported by Hiroshi Saito that timestamps printed by xlog.c startup could be improperly localized on Windows. We still need a simpler patch for that problem in the back branches, however.
* Support an optional asynchronous commit mode, in which we don't flush WALTom Lane2007-08-01
| | | | | | before reporting a transaction committed. Data consistency is still guaranteed (unlike setting fsync = off), but a crash may lose the effects of the last few transactions. Patch by Simon, some editorialization by Tom.
* Fix incorrect optimization of foreign-key checks. When an UPDATE on theTom Lane2007-07-17
| | | | | | | | | | | | | | referencing table does not change the tuple's FK column(s), we don't bother to check the PK table since the constraint was presumably already valid. However, the check is still necessary if the tuple was inserted by our own transaction, since in that case the INSERT trigger will conclude it need not make the check (since its version of the tuple has been deleted). We got this right for simple cases, but not when the insert and update are in different subtransactions of the current top-level transaction; in such cases the FK check would never be made at all. (Hence, problem dates back to 8.0 when subtransactions were added --- it's actually the subtransaction version of a bug fixed in 7.3.5.) Fix, and add regression test cases. Report and fix by Affan Salman.
* Implement CREATE TABLE LIKE ... INCLUDING INDEXES. Patch from NikhilS,Neil Conway2007-07-17
| | | | | based in part on an earlier patch from Trevor Hardcastle, and reviewed by myself.
* Add ALTER VIEW ... RENAME TO, and a RENAME TO clause to ALTER SEQUENCE.Neil Conway2007-07-03
| | | | | | | Sequences and views could previously be renamed using ALTER TABLE, but this was a repeated source of confusion for users. Update the docs, and psql tab completion. Patch from David Fetter; various minor fixes by myself.
* Avoid memory leakage when a series of subtransactions invoke AFTER triggersTom Lane2007-07-01
| | | | | | | | | | | | | that are fired at end-of-statement (as is the normal case for foreign keys, for example). In this situation the per-subxact deferred trigger context is always empty when subtransaction exit is reached; so we could free it, but were not doing so, leading to an intratransaction leak of 8K or more per subtransaction. Per off-list example from Viatcheslav Kalinin subsequent to bug #3418 (his original bug report omitted a foreign key constraint needed to cause this leak). Back-patch to 8.2; prior versions were not using per-subxact contexts for deferred triggers, so did not have this leak.
* Implement "distributed" checkpoints in which the checkpoint I/O is spreadTom Lane2007-06-28
| | | | | | | | | | | | | over a fairly long period of time, rather than being spat out in a burst. This happens only for background checkpoints carried out by the bgwriter; other cases, such as a shutdown checkpoint, are still done at full speed. Remove the "all buffers" scan in the bgwriter, and associated stats infrastructure, since this seems no longer very useful when the checkpoint itself is properly throttled. Original patch by Itagaki Takahiro, reworked by Heikki Linnakangas, and some minor API editorialization by me.
* Separate parse-analysis for utility commands out of parser/analyze.cTom Lane2007-06-23
| | | | | | | | | | | | | | | | | (which now deals only in optimizable statements), and put that code into a new file parser/parse_utilcmd.c. This helps clarify and enforce the design rule that utility statements shouldn't be processed during the regular parse analysis phase; all interpretation of their meaning should happen after they are given to ProcessUtility to execute. (We need this because we don't retain any locks for a utility statement that's in a plan cache, nor have any way to detect that it's stale.) We are also able to simplify the API for parse_analyze() and related routines, because they will now always return exactly one Query structure. In passing, fix bug #3403 concerning trying to add a serial column to an existing temp table (this is largely Heikki's work, but we needed all that restructuring to make it safe).
* CREATE DOMAIN ... DEFAULT NULL failed because gram.y special-cases DEFAULTTom Lane2007-06-20
| | | | | NULL and DefineDomain didn't. Bug goes all the way back to original coding of domains. Per bug #3396 from Sergey Burladyan.
* Minor code cleanup: calling FreeFile() before ereport(ERROR) is notNeil Conway2007-06-20
| | | | | necessary, since files opened via AllocateFile() are closed automatically as part of error recovery.
* Marginal hacking to improve the speed of COPY OUT. I had found in a bit ofTom Lane2007-06-17
| | | | | | | | | | | | | profiling that CopyAttributeOutText was taking an unreasonable fraction of the backend run time (like 66%!) on the following trivial test case: $ time psql -c "copy (select repeat('xyzzy',50) from generate_series(1,10000000)) to stdout" regression >/dev/null The time is all being spent on scanning the string for characters to be escaped, which most of the time there aren't any of. Some tweaking to take as many tests as possible out of the inner loop reduced the runtime of this example by more than 10%. In a real-world case it wouldn't be as useful a speedup, but it still seems worth adding a few lines here.
* Tweak the API for per-datatype typmodin functions so that they are passedTom Lane2007-06-15
| | | | | | | | | | | | an array of strings rather than an array of integers, and allow any simple constant or identifier to be used in typmods; for example create table foo (f1 widget(42,'23skidoo',point)); Of course the typmodin function has still got to pack this info into a non-negative int32 for storage, but it's still a useful improvement in flexibility, especially considering that you can do nearly anything if you are willing to keep the info in a side table. We can get away with this change since we have not yet released a version providing user-definable typmods. Per discussion.
* Avoid having autovacuum run multiple ANALYZE commands in a single transaction,Alvaro Herrera2007-06-14
| | | | to prevent possible deadlock problems. Per request from Tom Lane.
* Rework temp_tablespaces patch so that temp tablespaces are assigned separatelyTom Lane2007-06-07
| | | | | | | | | for each temp file, rather than once per sort or hashjoin; this allows spreading the data of a large sort or join across multiple tablespaces. (I remain dubious that this will make any difference in practice, but certain people insisted.) Arrange to cache the results of parsing the GUC variable instead of recomputing from scratch on every demand, and push usage of the cache down to the bottommost fd.c level.
* Clarify some error messages about duplicate things.Peter Eisentraut2007-06-03
|
* Create a GUC parameter temp_tablespaces that allows selection of theTom Lane2007-06-03
| | | | | | | | | | tablespace(s) in which to store temp tables and temporary files. This is a list to allow spreading the load across multiple tablespaces (a random list element is chosen each time a temp object is to be created). Temp files are not stored in per-database pgsql_tmp/ directories anymore, but per-tablespace directories. Jaime Casanova and Albert Cervera, with review by Bernd Helmle and Tom Lane.
* Minimal message corrections found by spell checker.Peter Eisentraut2007-06-02
|
* Make CREATE/DROP/RENAME DATABASE wait a little bit to see if other backendsTom Lane2007-06-01
| | | | | | | will exit before failing because of conflicting DB usage. Per discussion, this seems a good idea to help mask the fact that backend exit takes nonzero time. Remove a couple of thereby-obsoleted sleeps in contrib and PL regression test sequences.
* Make some messages more consistentPeter Eisentraut2007-05-31
|
* Make large sequential scans and VACUUMs work in a limited-size "ring" ofTom Lane2007-05-30
| | | | | | | | | | | | | | | | | | | | | | | buffers, rather than blowing out the whole shared-buffer arena. Aside from avoiding cache spoliation, this fixes the problem that VACUUM formerly tended to cause a WAL flush for every page it modified, because we had it hacked to use only a single buffer. Those flushes will now occur only once per ring-ful. The exact ring size, and the threshold for seqscans to switch into the ring usage pattern, remain under debate; but the infrastructure seems done. The key bit of infrastructure is a new optional BufferAccessStrategy object that can be passed to ReadBuffer operations; this replaces the former StrategyHintVacuum API. This patch also changes the buffer usage-count methodology a bit: we now advance usage_count when first pinning a buffer, rather than when last unpinning it. To preserve the behavior that a buffer's lifetime starts to decrease when it's released, the clock sweep code is modified to not decrement usage_count of pinned buffers. Work not done in this commit: teach GiST and GIN indexes to use the vacuum BufferAccessStrategy for vacuum-driven fetches. Original patch by Simon, reworked by Heikki and again by Tom.
* Create hooks to let a loadable plugin monitor (or even replace) the plannerTom Lane2007-05-25
| | | | | | | | | | | | | | | and/or create plans for hypothetical situations; in particular, investigate plans that would be generated using hypothetical indexes. This is a heavily-rewritten version of the hooks proposed by Gurjeet Singh for his Index Advisor project. In this formulation, the index advisor can be entirely a loadable module instead of requiring a significant part to be in the core backend, and plans can be generated for hypothetical indexes without requiring the creation and rolling-back of system catalog entries. The index advisor patch as-submitted is not compatible with these hooks, but it needs significant work anyway due to other 8.2-to-8.3 planner changes. With these hooks in the core backend, development of the advisor can proceed as a pgfoundry project.
* Fix dumb compile error in the last patch.Alvaro Herrera2007-05-19
|
* Have CLUSTER advance the table's relfrozenxid. The new frozen point is theAlvaro Herrera2007-05-18
| | | | | | | | | | | | | | | | FreezeXid introduced in a recent commit, so there isn't any data loss in this approach. Doing it causes ALTER TABLE (or rather, the forms of it that cause a full table rewrite) to be affected as well. In this case, the frozen point is RecentXmin, because after the rewrite all the tuples are relabeled with the rewriting transaction's Xid. TOAST tables are fixed automatically as well, as fallout of the way they were already being handled in the respective code paths. With this patch, there is no longer need to VACUUM tables for Xid wraparound purposes that have been cleaned up via TRUNCATE or CLUSTER.
* Move the tuple freezing point in CLUSTER to a point further back in the past,Alvaro Herrera2007-05-17
| | | | | | | | | | to avoid losing useful Xid information in not-so-old tuples. This makes CLUSTER behave the same as VACUUM as far a tuple-freezing behavior goes (though CLUSTER does not yet advance the table's relfrozenxid). While at it, move the actual freezing operation in rewriteheap.c to a more appropriate place, and document it thoroughly. This part of the patch from Tom Lane.
* Have TRUNCATE advance the affected table's relfrozenxid to RecentXmin, toAlvaro Herrera2007-05-16
| | | | | | | avoid a later needless VACUUM for Xid-wraparound purposes. We can do this since the table is known to be left empty, so no Xid remains on it. Per discussion.
* Get rid of the pg_shdepend entry for a TOAST table; it's unnecessary sinceTom Lane2007-05-14
| | | | | | | there's an indirect dependency on the owner via the parent table. We were already handling indexes that way, but not toast tables for some reason. Saves a little catalog space and cuts down the verbosity of checkSharedDependencies reports.
* Fix the problem that creating a user-defined type named _foo, followed by oneTom Lane2007-05-12
| | | | | | | | | | | named foo, would work but the other ordering would not. If a user-specified type or table name collides with an existing auto-generated array name, just rename the array type out of the way by prepending more underscores. This should not create any backward-compatibility issues, since the cases in which this will happen would have failed outright in prior releases. Also fix an oversight in the arrays-of-composites patch: ALTER TABLE RENAME renamed the table's rowtype but not its array type.
* Fix my oversight in enabling domains-of-domains: ALTER DOMAIN ADD CONSTRAINTTom Lane2007-05-11
| | | | | | | | | | | | needs to check the new constraint against columns of derived domains too. Also, make it error out if the domain to be modified is used within any composite-type columns. Eventually we should support that case, but it seems a bit painful, and not suitable for a back-patch. For the moment just let the user know we can't do it. Backpatch to 8.2, which is the only released version that allows nested domains. Possibly the other part should be back-patched further.
* Support arrays of composite types, including the rowtypes of regular tablesTom Lane2007-05-11
| | | | | | | | | | | | | | | and views (but not system catalogs, nor sequences or toast tables). Get rid of the hardwired convention that a type's array type is named exactly "_type", instead using a new column pg_type.typarray to provide the linkage. (It still will be named "_type", though, except in odd corner cases such as maximum-length type names.) Along the way, make tracking of owner and schema dependencies for types more uniform: a type directly created by the user has these dependencies, while a table rowtype or auto-generated array type does not have them, but depends on its parent object instead. David Fetter, Andrew Dunstan, Tom Lane