path: root/src/backend
Commit message (Author, Age)
* Prevent intermittent hang in recovery from bgwriter interaction. (Simon Riggs, 2011-03-23)
| | | | | | Startup process waited for cleanup lock but when hot_standby = off the pid was not registered, so that the bgwriter would not wake the waiting process as intended.
* Make FKs valid at creation when added as column constraints. (Simon Riggs, 2011-03-22)
| | | | Bug report from Alvaro Herrera
* Improve reporting of run-time-detected indeterminate-collation errors. (Tom Lane, 2011-03-22)
| | | | | | | | pg_newlocale_from_collation does not have enough context to give an error message that's even a little bit useful, so move the responsibility for complaining up to its callers. Also, reword ERRCODE_INDETERMINATE_COLLATION error messages in a less jargony, more message-style-guide-compliant fashion.
* Throw error for indeterminate collation of an ORDER/GROUP/DISTINCT target. (Tom Lane, 2011-03-22)
| | | | | | | | | | | | | | | | | | | | | | This restores a parse error that was thrown (though only in the ORDER BY case) by the original collation patch. I had removed it in my recent revisions because it was thrown at a place where collations now haven't been computed yet; but I thought of another way to handle it. Throwing the error at parse time, rather than leaving it to be done at runtime, is good because a syntax error pointer is helpful for localizing the problem. We can reasonably assume that the comparison function for a collatable datatype will complain if it doesn't have a collation to use. Now the planner might choose to implement GROUP or DISTINCT via hashing, in which case no runtime error would actually occur, but it seems better to throw error consistently rather than let the error depend on what the planner chooses to do. Another possible objection is that the user might specify a nondefault sort operator that doesn't care about collation ... but that's surely an uncommon usage, and it wouldn't hurt him to throw in a COLLATE clause anyway. This change also makes the ORDER BY/GROUP BY/DISTINCT case more consistent with the UNION/INTERSECT/EXCEPT case, which was already coded to throw this error even though the same objections could be raised there.
* Avoid potential deadlock in InitCatCachePhase2(). (Tom Lane, 2011-03-22)
| | | | | | | | | | | | | | | Opening a catcache's index could require reading from that cache's own catalog, which of course would acquire AccessShareLock on the catalog. So the original coding here risks locking index before heap, which could deadlock against another backend trying to get exclusive locks in the normal order. Because InitCatCachePhase2 is only called when a backend has to start up without a relcache init file, the deadlock was seldom seen in the field. (And by the same token, there's no need to worry about any performance disadvantage; so not much point in trying to distinguish exactly which catalogs have the risk.) Bug report, diagnosis, and patch by Nikhil Sontakke. Additional commentary by me. Back-patch to all supported branches.
* Reimplement planner's handling of MIN/MAX aggregate optimization (again). (Tom Lane, 2011-03-22)
| | | | | | | | | | | | | | Instead of playing cute games with pathkeys, just build a direct representation of the intended sub-select, and feed it through query_planner to get a Path for the index access. This is a bit slower than 9.1's previous method, since we'll duplicate most of the overhead of query_planner; but since the whole optimization only applies to rather simple single-table queries, that probably won't be much of a problem in practice. The advantage is that we get to do the right thing when there's a partial index that needs the implicit IS NOT NULL clause to be usable. Also, although this makes planagg.c be a bit more closely tied to the ordering of operations in grouping_planner, we can get rid of some coupling to lower-level parts of the planner. Per complaint from Marti Raudsepp.
* When two base backups are started at the same time with pg_basebackup, (Heikki Linnakangas, 2011-03-21)
| | | | | | | | ensure that they use different checkpoints as the starting point. We use the checkpoint redo location as a unique identifier for the base backup in the end-of-backup record, and in the backup history file name. Bug spotted by Fujii Masao.
* Suppress platform-dependent unused-variable warning. (Tom Lane, 2011-03-20)
| | | | | | The local variable "sock" can be unused depending on compilation flags. But there seems no particular need for it, since the kernel calls can just as easily say port->sock instead.
* Fix up handling of C/POSIX collations. (Tom Lane, 2011-03-20)
| | | | | | | | | | | | | | | | | Install just one instance of the "C" and "POSIX" collations into pg_collation, rather than one per encoding. Make these instances exist and do something useful even in machines without locale_t support: to wit, it's now possible to force comparisons and case-folding functions to use C locale in an otherwise non-C database, whether or not the platform has support for using any additional collations. Fix up severely broken upper/lower/initcap functions, too: the C/POSIX fastpath now does what it is supposed to, and non-default collations are handled correctly in single-byte database encodings. Merge the two separate collation hashtables that were being maintained in pg_locale.c, and be more wary of the possibility that we fail partway through filling a cache entry.
* Revise collation derivation method and expression-tree representation. (Tom Lane, 2011-03-19)
| | | | | | | | | | | | | | | | | | | All expression nodes now have an explicit output-collation field, unless they are known to only return a noncollatable data type (such as boolean or record). Also, nodes that can invoke collation-aware functions store a separate field that is the collation value to pass to the function. This avoids confusion that arises when a function has collatable inputs and noncollatable output type, or vice versa. Also, replace the parser's on-the-fly collation assignment method with a post-pass over the completed expression tree. This allows us to use a more complex (and hopefully more nearly spec-compliant) assignment rule without paying for it in extra storage in every expression node. Fix assorted bugs in the planner's handling of collations by making collation one of the defining properties of an EquivalenceClass and by converting CollateExprs into discardable RelabelType nodes during expression preprocessing.
* Rename ident authentication over local connections to peer (Magnus Hagander, 2011-03-19)
| | | | | | | | | | | | | This removes an overloading of two authentication options where one is very secure (peer) and one is often insecure (ident). Peer is also the name used in libpq from 9.1 to specify the same type of authentication. Also make initdb select peer for local connections when ident is chosen, and ident for TCP connections when peer is chosen. ident keyword in pg_hba.conf is still accepted and maps to peer authentication.
* Fix possible "tuple concurrently updated" error in ALTER TABLE. (Robert Haas, 2011-03-18)
| | | | | | | | | | When adding an inheritance parent to a table, an AccessShareLock on the parent isn't strong enough to prevent trouble, so take ShareUpdateExclusiveLock instead. Since this is a behavior change, albeit a fairly unobtrusive one, and since we have only one report from the field, no back-patch. Report by Jon Nelson, analysis by Alvaro Herrera, fix by me.
* Move synchronous_standbys_defined updates from WAL writer to BG writer. (Robert Haas, 2011-03-18)
| | | | | | | | | This is advantageous because the BG writer is alive until much later in the shutdown sequence than WAL writer; we want to make sure that it's possible to shut off synchronous replication during a smart shutdown, else it might not be possible to complete the shutdown at all. Per very reasonable gripes from Fujii Masao and Simon Riggs.
* Make synchronous replication query cancel/die messages more consistent. (Robert Haas, 2011-03-18)
| | | | | Per a gripe from Thom Brown about my previous commit in this area, commit 9a56dc3389b9470031e9ef8e45c95a680982e01a.
* Remove bogus semicolons in recoveryPausesHere. (Robert Haas, 2011-03-18)
| | | | | Without this, the startup process goes into a tight loop, consuming 100% of one CPU and failing to respond to interrupts.
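To illustrate the kind of failure described above, here is a minimal C sketch of how a stray semicolon after `while (...)` detaches the intended loop body. It is not the code from recoveryPausesHere: `still_paused` and both function names are hypothetical stand-ins, and the condition here eventually turns false so that the example terminates. In a loop whose body was supposed to sleep and check for interrupts, the buggy shape would spin on the condition alone.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a "recovery is paused" check: reports true
 * for a fixed number of polls, then false. */
static int polls_left;

static bool still_paused(void)
{
    return polls_left-- > 0;
}

/* Buggy shape: the semicolon after while() is the entire loop body, so
 * the braced block below runs exactly once, after the loop has ended. */
int body_runs_with_stray_semicolon(int polls)
{
    int body_runs = 0;

    polls_left = polls;
    while (still_paused());
    {
        body_runs++;    /* where a sleep and interrupt check would go */
    }
    return body_runs;
}

/* Fixed shape: the braced block is the loop body, run once per poll. */
int body_runs_when_fixed(int polls)
{
    int body_runs = 0;

    polls_left = polls;
    while (still_paused())
    {
        body_runs++;
    }
    return body_runs;
}
```

With five polls, the buggy variant executes the block once and the fixed variant executes it five times. Compilers may warn about the empty loop body in the first function, which is the point of the example.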
* Remove bogus comment. (Robert Haas, 2011-03-17)
|
* Raise maximum value of several timeout parameters (Peter Eisentraut, 2011-03-17)
| | | | | | | | | The maximum value of deadlock_timeout, max_standby_archive_delay, max_standby_streaming_delay, log_min_duration_statement, and log_autovacuum_min_duration was INT_MAX/1000 milliseconds, which is about 35min, which is too short for some practical uses. Raise the maximum value to INT_MAX; the code that uses the parameters already supports that just fine.
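The "about 35min" figure can be checked with a few lines of C. This is only a worked calculation, assuming a 32-bit `int` (so `INT_MAX` is 2147483647); the helper names are made up for the example and do not come from the PostgreSQL source.

```c
#include <limits.h>

/* Old upper limit for these parameters, in milliseconds. */
long long old_cap_ms(void)
{
    return INT_MAX / 1000;
}

/* New upper limit, in milliseconds. */
long long new_cap_ms(void)
{
    return INT_MAX;
}

/* Convert milliseconds to whole minutes, discarding the remainder. */
long long ms_to_whole_minutes(long long ms)
{
    return ms / (60LL * 1000LL);
}

/* Convert milliseconds to whole days, discarding the remainder. */
long long ms_to_whole_days(long long ms)
{
    return ms / (24LL * 60LL * 60LL * 1000LL);
}
```

Under that assumption the old cap is 2147483 ms, or 35 whole minutes, while the new cap of INT_MAX ms is a little over 24 days.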
* Add pause_at_recovery_target to recovery.conf.sample; improve docs. (Robert Haas, 2011-03-17)
| | | | | Fujii Masao, but with the proposed behavior change reverted, and the rest adjusted accordingly.
* Fix various possible problems with synchronous replication. (Robert Haas, 2011-03-17)
| 1. Don't ignore query cancel interrupts. Instead, if the user asks to cancel the query after we've already committed it, but before it's on the standby, just emit a warning and let the COMMIT finish.
| 2. Don't ignore die interrupts (pg_terminate_backend or fast shutdown). Instead, emit a warning message and close the connection without acknowledging the commit. Other backends will still see the effect of the commit, but there's no getting around that; it's too late to abort at this point, and ignoring die interrupts altogether doesn't seem like a good idea.
| 3. If synchronous_standby_names becomes empty, wake up all backends waiting for synchronous replication to complete. Without this, someone attempting to shut synchronous replication off could easily wedge the entire system instead.
| 4. Avoid depending on the assumption that if a walsender updates MyProc->syncRepState, we'll see the change even if we read it without holding the lock. The window for this appears to be quite narrow (and probably doesn't exist at all on machines with strong memory ordering) but protecting against it is practically free, so do that.
| 5. Remove useless state SYNC_REP_MUST_DISCONNECT, which isn't needed and doesn't actually do anything.
| There's still some further work needed here to make the behavior of fast shutdown plausible, but that looks complex, so I'm leaving it for a separate commit. Review by Fujii Masao.
* Improve handling of unknown-type literals in UNION/INTERSECT/EXCEPT. (Tom Lane, 2011-03-15)
| | | | | | | | | | | | | | | | | | | | This patch causes unknown-type Consts to be coerced to the resolved output type of the set operation at parse time. Formerly such Consts were left alone until late in the planning stage. The disadvantage of that approach is that it disables some optimizations, because the planner sees the set-op leaf query as having different output column types than the overall set-op. We saw an example of that in a recent performance gripe from Claudio Freire. Fixing such a Const requires scribbling on the leaf query in transformSetOperationTree, but that should be all right since if the leaf query's semantics depended on that output column, it would already have resolved the unknown to something else. Most of the bulk of this patch is a simple adjustment of transformSetOperationTree's API so that upper levels can get at the TargetEntry containing a Const to be replaced: it now returns a list of TargetEntries, instead of just the bare expressions.
* Remove 13 keywords that are used only for ROLE options. (Robert Haas, 2011-03-15)
| | | | Review by Tom Lane.
* Simplify list traversal logic in add_path(). (Tom Lane, 2011-03-13)
| | | | | Its mechanism for recovering after deleting the current list cell was a bit klugy. Borrow the technique used in other places.
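The commit message does not say which technique was borrowed, so the following is only a sketch of one common idiom for deleting cells while walking a singly linked list: save the successor before the current cell can be freed, and advance a `prev` pointer only when the cell is kept. The `Node` type and all function names are hypothetical and unrelated to PostgreSQL's List API.

```c
#include <stdlib.h>

typedef struct Node
{
    int          value;
    struct Node *next;
} Node;

/* Prepend a value; aborts on allocation failure to keep the sketch short. */
Node *push_front(Node *head, int value)
{
    Node *n = malloc(sizeof(Node));

    if (n == NULL)
        abort();
    n->value = value;
    n->next = head;
    return n;
}

/* Remove every negative value in a single pass and return the new head. */
Node *delete_negatives(Node *head)
{
    Node *prev = NULL;
    Node *cur;
    Node *next;

    for (cur = head; cur != NULL; cur = next)
    {
        next = cur->next;       /* fetched before cur may be freed */
        if (cur->value < 0)
        {
            if (prev != NULL)
                prev->next = next;
            else
                head = next;    /* deleted the first cell */
            free(cur);
        }
        else
            prev = cur;         /* only surviving cells become prev */
    }
    return head;
}

/* Count the cells in the list. */
int count_nodes(const Node *head)
{
    int n = 0;

    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

Because `next` is captured at the top of each iteration, the loop needs no special recovery step after a deletion.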
* Make all comparisons done for/with statistics use the default collation. (Tom Lane, 2011-03-12)
| | | | | | | | | | | | | | While this will give wrong answers when estimating selectivity for a comparison operator that's using a non-default collation, the estimation error probably won't be large; and anyway the former approach created estimation errors of its own by trying to use a histogram that might have been computed with some other collation. So we'll adopt this simplified approach for now and perhaps improve it sometime in the future. This patch incorporates changes from Andres Freund to make sure that selfuncs.c passes a valid collation OID to any datatype-specific function it calls, in case that function wants collation information. Said OID will now always be DEFAULT_COLLATION_OID, but at least we won't get errors.
* Use "backend process" rather than "backend server", where appropriate. (Bruce Momjian, 2011-03-12)
|
* Use macros for time-based constants, rather than constants. (Bruce Momjian, 2011-03-12)
|
* On further reflection, we'd better do the same in int.c. (Tom Lane, 2011-03-11)
| | | | | We previously heard of the same problem in int24div(), so there's not a good reason to suppose the problem is confined to cases involving int8.
* Put in some more safeguards against executing a division-by-zero. (Tom Lane, 2011-03-11)
| | | | | | | | Add dummy returns before every potential division-by-zero in int8.c, because apparently further "improvements" in gcc's optimizer have enabled it to break functions that weren't broken before. Aurelien Jarno, via Martin Pitt
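As a rough illustration of the "dummy return" pattern mentioned above, the sketch below places an explicit return after the error report so that the division can never be reached on the zero-divisor path, whatever the compiler assumes about the reporting call. Everything here is hypothetical: `report_division_by_zero` merely sets a flag, whereas a real error reporter would typically not return at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* Set by the stand-in error reporter below. */
bool error_raised = false;

/* Hypothetical stand-in for an error-reporting call. */
void report_division_by_zero(void)
{
    error_raised = true;
}

int64_t checked_div(int64_t a, int64_t b)
{
    if (b == 0)
    {
        report_division_by_zero();
        /*
         * Dummy return: even if the reporter were to come back, control
         * leaves here, so the division below is not executed with b == 0.
         */
        return 0;
    }
    return a / b;
}
```

The sketch covers only the zero-divisor case; other hazards such as dividing the minimum 64-bit value by -1 are left out.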
* Split CollateClause into separate raw and analyzed node types. (Tom Lane, 2011-03-11)
| | | | | | | | | | | CollateClause is now used only in raw grammar output, and CollateExpr after parse analysis. This is for clarity and to avoid carrying collation names in post-analysis parse trees: that's both wasteful and possibly misleading, since the collation's name could be changed while the parsetree still exists. Also, clean up assorted infelicities and omissions in processing of the node type.
* Create an explicit concept of collations that work for any encoding. (Tom Lane, 2011-03-11)
| | | | | | | | | | | Use collencoding = -1 to represent such a collation in pg_collation. We need this to make the "default" entry work sanely, and a later patch will fix the C/POSIX entries to be represented this way instead of duplicating them across all encodings. All lookup operations now search first for an entry that's database-encoding-specific, and then for the same name with collencoding = -1. Also some incidental code cleanup in collationcmds.c and pg_collation.c.
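The two-step lookup order described above can be sketched as follows. This is a simplified model over an in-memory array, not the catalog code: the `CollationEntry` struct, the `ANY_ENCODING` macro, and the function name are assumptions made for the example, with -1 playing the role of collencoding = -1.

```c
#include <stddef.h>
#include <string.h>

#define ANY_ENCODING (-1)   /* models collencoding = -1 */

typedef struct
{
    const char *name;
    int         encoding;   /* a specific encoding id, or ANY_ENCODING */
    int         oid;        /* made-up identifier for the example */
} CollationEntry;

/* Look for an encoding-specific entry first, then for the same name with
 * ANY_ENCODING. Returns NULL when neither exists. */
const CollationEntry *lookup_collation(const CollationEntry *entries,
                                       size_t n,
                                       const char *name,
                                       int db_encoding)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (entries[i].encoding == db_encoding &&
            strcmp(entries[i].name, name) == 0)
            return &entries[i];

    for (i = 0; i < n; i++)
        if (entries[i].encoding == ANY_ENCODING &&
            strcmp(entries[i].name, name) == 0)
            return &entries[i];

    return NULL;
}
```

When both an encoding-specific and an any-encoding entry share a name, the encoding-specific one wins, and an entry for some other encoding is never returned.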
* Clarify C comment that O_SYNC/O_FSYNC are really the same setting, as (Bruce Momjian, 2011-03-10)
| | | | opposed to O_DSYNC.
* Revert addition of third argument to format_type(). (Tom Lane, 2011-03-10)
| | | | | | | | | | | | Including collation in the behavior of that function promotes a world view we do not want. Moreover, it was producing the wrong behavior for pg_dump anyway: what we want is to dump a COLLATE clause on attributes whose attcollation is different from the underlying type, and likewise for domains, and the function cannot do that for us. Doing it the hard way in pg_dump is a bit more tedious but produces more correct output. In passing, fix initdb so that the initial entry in pg_collation is properly pinned. It was droppable before :-(
* Make error handling of synchronous_standby_names consistent. (Robert Haas, 2011-03-10)
| | | | | | | It's not a good idea to kill the postmaster just because someone muffs this, and it's not consistent with what we do for other, similar GUCs. Fujii Masao, with a bit more hacking by me
* More synchronous replication typo fixes. (Robert Haas, 2011-03-10)
| | | | Fujii Masao
* More synchronous replication tweaks. (Robert Haas, 2011-03-10)
| | | | | | | | | | | | | | | | | SyncRepRequested() must check not only the value of the synchronous_replication GUC but also whether max_wal_senders > 0. Otherwise, we might end up waiting for sync rep even when there's no possibility of a standby ever managing to connect. There are some existing cross-checks to prevent this, but they're not quite sufficient: the user can start the server with max_wal_senders=0, synchronous_standby_names='', and synchronous_replication=off and then subsequently make synchronous_standby_names not empty using pg_ctl reload, and then SET synchronous_standby=on, leading to an indefinite hang. Along the way, rename the global variable for the synchronous_replication GUC to match the name of the GUC itself, for clarity. Report by Fujii Masao, though I didn't use his patch.
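The combined condition described above amounts to a two-part test. The function below is a hypothetical model that takes both settings as parameters; the actual SyncRepRequested() is not shown in this log and may be written quite differently.

```c
#include <stdbool.h>

/* Hypothetical model: waiting for synchronous replication only makes
 * sense when it is switched on AND at least one WAL sender may exist. */
bool sync_rep_requested(bool synchronous_replication, int max_wal_senders)
{
    return synchronous_replication && max_wal_senders > 0;
}
```

With `max_wal_senders` at 0 the result is false regardless of the other setting, which corresponds to the hang scenario the commit message sets out to avoid.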
* Minor sync rep corrections. (Robert Haas, 2011-03-10)
| | | | Fujii Masao, with a bit of additional wordsmithing by me.
* Emit a LOG message when pausing at the recovery target. (Robert Haas, 2011-03-10)
| | | | Fujii Masao
* Replication README updates. (Robert Haas, 2011-03-10)
| | | | Fujii Masao
* Cleanup copyright years and file names in the header comments of some files. (Itagaki Takahiro, 2011-03-10)
|
* replication/repl_gram.h needs to be cleaned too ... (Tom Lane, 2011-03-10)
|
* Fix some oversights in distprep and maintainer-clean targets. (Tom Lane, 2011-03-10)
| | | | | | | | | At least two recent commits have apparently imagined that a comment in a Makefile stating that something would be included in the distribution tarball was sufficient to make it so. They hadn't bothered to hook into the upper maintainer-clean targets either. Per bug #5923 from Charles Johnson, in which it emerged that the 9.1alpha4 tarballs are short a few files that should be there.
* Mention gcc version in C comment. (Bruce Momjian, 2011-03-09)
|
* Remove collation information from TypeName, where it does not belong. (Tom Lane, 2011-03-09)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The initial collations patch treated a COLLATE spec as part of a TypeName, following what can only be described as brain fade on the part of the SQL committee. It's a lot more reasonable to treat COLLATE as a syntactically separate object, so that it can be added in only the productions where it actually belongs, rather than needing to reject it in a boatload of places where it doesn't belong (something the original patch mostly failed to do). In addition this change lets us meet the spec's requirement to allow COLLATE anywhere in the clauses of a ColumnDef, and it avoids unfriendly behavior for constructs such as "foo::type COLLATE collation". To do this, pull collation information out of TypeName and put it in ColumnDef instead, thus reverting most of the collation-related changes in parse_type.c's API. I made one additional structural change, which was to use a ColumnDef as an intermediate node in AT_AlterColumnType AlterTableCmd nodes. This provides enough room to get rid of the "transform" wart in AlterTableCmd too, since the ColumnDef can carry the USING expression easily enough. Also fix some other minor bugs that have crept in in the same areas, like failure to copy recently-added fields of ColumnDef in copyfuncs.c. While at it, document the formerly secret ability to specify a collation in ALTER TABLE ALTER COLUMN TYPE, ALTER TYPE ADD ATTRIBUTE, and ALTER TYPE ALTER ATTRIBUTE TYPE; and correct some misstatements about what the default collation selection will be when COLLATE is omitted. BTW, the three-parameter form of format_type() should go away too, since it just contributes to the confusion in this area; but I'll do that in a separate patch.
* Adjust the permissions required for COMMENT ON ROLE. (Tom Lane, 2011-03-09)
| | | | | | | | | | | | | | | | | | Formerly, any member of a role could change the role's comment, as of course could superusers; but holders of CREATEROLE privilege could not, unless they were also members. This led to the odd situation that a CREATEROLE holder could create a role but then could not comment on it. It also seems a bit dubious to let an unprivileged user change his own comment, let alone those of group roles he belongs to. So, change the rule to be "you must be superuser to comment on a superuser role, or hold CREATEROLE to comment on non-superuser roles". This is the same as the privilege check for creating/dropping roles, and thus fits much better with the rule for other object types, namely that only the owner of an object can comment on it. In passing, clean up the documentation for COMMENT a little bit. Per complaint from Owen Jacobson and subsequent discussion.
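The quoted rule can be written as a small predicate. This is a sketch of the rule as stated in the message, not the server's actual permission check; the function and parameter names are invented, and it assumes that a superuser may also comment on non-superuser roles.

```c
#include <stdbool.h>

/* Hypothetical predicate for "you must be superuser to comment on a
 * superuser role, or hold CREATEROLE to comment on non-superuser roles". */
bool may_comment_on_role(bool actor_is_superuser,
                         bool actor_has_createrole,
                         bool target_is_superuser)
{
    if (target_is_superuser)
        return actor_is_superuser;

    /* Assumption: superusers are also allowed on ordinary roles. */
    return actor_is_superuser || actor_has_createrole;
}
```

Role membership does not appear among the inputs, reflecting the change that being a member of a role no longer grants the right to change its comment.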
* Add missing keywords to gram.y's unreserved_keywords list. (Tom Lane, 2011-03-08)
| | | | | | We really need an automated check for this ... and did VALIDATE really need to become a keyword at all, rather than picking some other syntax using existing keywords?
* Fix overly strict assertion in SummarizeOldestCommittedSxact(). There's a (Heikki Linnakangas, 2011-03-08)
| | | | | | | | | race condition where SummarizeOldestCommittedSxact() is called even though another backend already cleared out all finished sxact entries. That's OK, RegisterSerializableTransactionInt() can just retry getting a new sxact slot from the available-list when that happens. Reported by YAMAMOTO Takashi, bug #5918.
* Don't throw a warning if vacuum sees PD_ALL_VISIBLE flag set on a page that (Heikki Linnakangas, 2011-03-08)
| | | | | | | | | | | | | | | | | | | | contains newly-inserted tuples that according to our OldestXmin are not yet visible to everyone. The value returned by GetOldestXmin() is conservative, and it can move backwards on repeated calls, so if we see that contradiction between the PD_ALL_VISIBLE flag and status of tuples on the page, we have to assume it's because an earlier vacuum calculated a higher OldestXmin value, and all the tuples really are visible to everyone. We have received several reports of this bug, with the "PD_ALL_VISIBLE flag was incorrectly set in relation ..." warning appearing in logs. We were finally able to hunt it down with David Gould's help to run extra diagnostics in an environment where this happened frequently. Also reword the warning, per Robert Haas' suggestion, to not imply that the PD_ALL_VISIBLE flag is necessarily at fault, as it might also be a symptom of corruption on a tuple header. Backpatch to 8.4, where the PD_ALL_VISIBLE flag was introduced.
* Truncate predicate lock manager's SLRU lazily at checkpoint. That's safer (Heikki Linnakangas, 2011-03-08)
| | | | | | | | than doing it aggressively whenever the tail-XID pointer is advanced, because this way we don't need to do it while holding SerializableXactHashLock. This also fixes bug #5915 spotted by YAMAMOTO Takashi, and removes an obsolete comment spotted by Kevin Grittner.
* If recovery_target_timeline is set to 'latest' and standby mode is enabled, (Heikki Linnakangas, 2011-03-07)
| | | | | | | | | | | | | | | | | periodically rescan the archive for new timelines, while waiting for new WAL segments to arrive. This allows you to set up a standby server that follows the TLI change if another standby server is promoted to master. Before this, you had to restart the standby server to make it notice the new timeline. This patch only scans the archive for TLI changes, it won't follow a TLI change in streaming replication. That is much needed too, but it would be a much bigger patch than I dare to sneak in this late in the release cycle. There was discussion on improving the sanity checking of the WAL segments so that the system would notice more reliably if the new timeline isn't an ancestor of the current one, but that is not included in this patch. Reviewed by Fujii Masao.
* Zero out vacuum_count and related counters in pgstat_recv_tabstat(). (Tom Lane, 2011-03-07)
| | | | | | This fixes an oversight in commit 946045f04d11d246a834b917a2b8bc6e4f884a37 of 2010-08-21, as reported by Itagaki Takahiro. Also a couple of minor cosmetic adjustments.
* Begin error message with lower-case letter. (Heikki Linnakangas, 2011-03-07)
|