aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
* Mark vacuum_defer_cleanup_age as PGC_POSTMASTER.Simon Riggs2013-02-02
| | | | Following bug analysis of #7819 by Tom Lane
* Fix typo in freeze_table_age implementationAlvaro Herrera2013-02-01
| | | | | | | | | | | | | | The original code used freeze_min_age instead of freeze_table_age. The main consequence of this mistake is that lowering freeze_min_age would cause full-table scans to occur much more frequently, which causes serious issues because the number of writes required is much larger. That feature (freeze_min_age) is supposed to affect only how soon tuples are frozen; some pages should still be skipped due to the visibility map. Backpatch to 8.4, where the freeze_table_age feature was introduced. Report and patch from Andres Freund
* Properly zero-pad the day-of-year part of the win32 build numberMagnus Hagander2013-01-31
| | | | | | | | | This ensure the version number increases over time. The first three digits in the version number is still set to the actual PostgreSQL version number, but the last one is intended to be an ever increasing build number, which previosly failed when it changed between 1, 2 and 3 digits long values. Noted by Deepak
* Fix plpgsql's reporting of plan-time errors in possibly-simple expressions.Tom Lane2013-01-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | exec_simple_check_plan and exec_eval_simple_expr attempted to call GetCachedPlan directly. This meant that if an error was thrown during planning, the resulting context traceback would not include the line normally contributed by _SPI_error_callback. This is already inconsistent, but just to be really odd, a re-execution of the very same expression *would* show the additional context line, because we'd already have cached the plan and marked the expression as non-simple. The problem is easy to demonstrate in 9.2 and HEAD because planning of a cached plan doesn't occur at all until GetCachedPlan is done. In earlier versions, it could only be an issue if initial planning had succeeded, then a replan was forced (already somewhat improbable for a simple expression), and the replan attempt failed. Since the issue is mainly cosmetic in older branches anyway, it doesn't seem worth the risk of trying to fix it there. It is worth fixing in 9.2 since the instability of the context printout can affect the results of GET STACKED DIAGNOSTICS, as per a recent discussion on pgsql-novice. To fix, introduce a SPI function that wraps GetCachedPlan while installing the correct callback function. Use this instead of calling GetCachedPlan directly from plpgsql. Also introduce a wrapper function for extracting a SPI plan's CachedPlanSource list. This lets us stop including spi_priv.h in pl_exec.c, which was never a very good idea from a modularity standpoint. In passing, fix a similar inconsistency that could occur in SPI_cursor_open, which was also calling GetCachedPlan without setting up a context callback.
* Fix grammar for subscripting or field selection from a sub-SELECT result.Tom Lane2013-01-30
| | | | | | | | | | | | | | | | | | | Such cases should work, but the grammar failed to accept them because of our ancient precedence hacks to convince bison that extra parentheses around a sub-SELECT in an expression are unambiguous. (Formally, they *are* ambiguous, but we don't especially care whether they're treated as part of the sub-SELECT or part of the expression. Bison cares, though.) Fix by adding a redundant-looking production for this case. This is a fine example of why fixing shift/reduce conflicts via precedence declarations is more dangerous than it looks: you can easily cause the parser to reject cases that should work. This has been wrong since commit 3db4056e22b0c6b2adc92543baf8408d2894fe91 or maybe before, and apparently some people have been working around it by inserting no-op casts. That method introduces a dump/reload hazard, as illustrated in bug #7838 from Jan Mate. Hence, back-patch to all active branches.
* DROP OWNED: don't try to drop tablespaces/databasesAlvaro Herrera2013-01-28
| | | | | | | | | | | | | | | | | My "fix" for bugs #7578 and #6116 on DROP OWNED at fe3b5eb08a1 not only misstated that it applied to REASSIGN OWNED (which it did not affect), but it also failed to fix the problems fully, because I didn't test the case of owned shared objects. Thus I created a new bug, reported by Thomas Kellerer as #7748, which would cause DROP OWNED to fail with a not-for-user-consumption error message. The code would attempt to drop the database, which not only fails to work because the underlying code does not support that, but is a pretty dangerous and undesirable thing to be doing as well. This patch fixes that bug by having DROP OWNED only attempt to process shared objects when grants on them are found, ignoring ownership. Backpatch to 8.3, which is as far as the previous bug was backpatched.
* Made ecpglib use translated messages.Michael Meskes2013-01-27
| | | | Bug reported and fixed by Chen Huajun <chenhj@cn.fujitsu.com>.
* Fix plpython's handling of functions used as triggers on multiple tables.Tom Lane2013-01-25
| | | | | | | | | | | | plpython tried to use a single cache entry for a trigger function, but it needs a separate cache entry for each table the trigger is applied to, because there is table-dependent data in there. This was done correctly before 9.1, but commit 46211da1b84bc3537e799ee1126098e71c2428e8 broke it by simplifying the lookup key from "function OID and triggered table OID" to "function OID and is-trigger boolean". Go back to using both OIDs as the lookup key. Per bug report from Sandro Santilli. Andres Freund
* Make pg_dump exclude unlogged table data on hot standby slavesMagnus Hagander2013-01-25
| | | | Noted by Joe Van Dyk
* Fix SPI documentation for new handling of ExecutorRun's count parameter.Tom Lane2013-01-24
| | | | | | | | | | | | | | Since 9.0, the count parameter has only limited the number of tuples actually returned by the executor. It doesn't affect the behavior of INSERT/UPDATE/DELETE unless RETURNING is specified, because without RETURNING, the ModifyTable plan node doesn't return control to execMain.c for each tuple. And we only check the limit at the top level. While this behavioral change was unintentional at the time, discussion of bug #6572 led us to the conclusion that we prefer the new behavior anyway, and so we should just adjust the docs to match rather than change the code. Accordingly, do that. Back-patch as far as 9.0 so that the docs match the code in each branch.
* Use correct output device for Windows prompts.Andrew Dunstan2013-01-24
| | | | | | | | | | This ensures that mapping of non-ascii prompts to the correct code page occurs. Bug report and original patch from Alexander Law, reviewed and reworked by Noah Misch. Backpatch to all live branches.
* Fix rare missing cancellations in Hot Standby.Simon Riggs2013-01-24
| | | | | | | | | | | | The machinery around XLOG_HEAP2_CLEANUP_INFO failed to correctly pass through the necessary information on latestRemovedXid, avoiding cancellations in some infrequent concurrent update/cleanup scenarios. Backpatchable fix to 9.0 Detailed bug report and fix by Noah Misch, backpatchable version by me.
* Also fix rotation of csvlog on Windows.Heikki Linnakangas2013-01-24
| | | | Backpatch to 9.2, like the previous fix.
* Fix failure to rotate postmaster log file for size reasons on Windows.Tom Lane2013-01-23
| | | | | | | | | | | | When we eliminated "unnecessary" wakeups of the syslogger process, we broke size-based logfile rotation on Windows, because on that platform data transfer is done in a separate thread. While non-Windows platforms would recheck the output file size after every log message, Windows only did so when the control thread woke up for some other reason, which might be quite infrequent. Per bug #7814 from Tsunezumi. Back-patch to 9.2 where the problem was introduced. Jeff Janes
* Fix performance problems with autovacuum truncation in busy workloads.Kevin Grittner2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In situations where there are over 8MB of empty pages at the end of a table, the truncation work for trailing empty pages takes longer than deadlock_timeout, and there is frequent access to the table by processes other than autovacuum, there was a problem with the autovacuum worker process being canceled by the deadlock checking code. The truncation work done by autovacuum up that point was lost, and the attempt tried again by a later autovacuum worker. The attempts could continue indefinitely without making progress, consuming resources and blocking other processes for up to deadlock_timeout each time. This patch has the autovacuum worker checking whether it is blocking any other thread at 20ms intervals. If such a condition develops, the autovacuum worker will persist the work it has done so far, release its lock on the table, and sleep in 50ms intervals for up to 5 seconds, hoping to be able to re-acquire the lock and try again. If it is unable to get the lock in that time, it moves on and a worker will try to continue later from the point this one left off. While this patch doesn't change the rules about when and what to truncate, it does cause the truncation to occur sooner, with less blocking, and with the consumption of fewer resources when there is contention for the table's lock. The only user-visible change other than improved performance is that the table size during truncation may change incrementally instead of just once. Backpatched to 9.0 from initial master commit at b19e4250b45e91c9cbdd18d35ea6391ab5961c8d -- before that the differences are too large to be clearly safe. Jan Wieck
* Fix one-byte buffer overrun in PQprintTuples().Tom Lane2013-01-20
| | | | | | | | | | This bug goes back to the original Postgres95 sources. Its significance to modern PG versions is marginal, since we have not used PQprintTuples() internally in a very long time, and it doesn't seem to have ever been documented either. Still, it *is* exposed to client apps, so somebody out there might possibly be using it. Xi Wang
* Fix error-checking typo in check_TSCurrentConfig().Tom Lane2013-01-20
| | | | | | The code failed to detect an out-of-memory failure. Xi Wang
* Modernize string literal syntax in tutorial example.Tom Lane2013-01-19
| | | | | | | | | | Un-double the backslashes in the LIKE patterns, since standard_conforming_strings is now the default. Just to be sure, include a command to set standard_conforming_strings to ON in the example. Back-patch to 9.1, where standard_conforming_strings became the default. Josh Kupershmidt, reviewed by Jeff Janes
* Make pgxs build executables with the right suffix.Andrew Dunstan2013-01-19
| | | | | | | | | | | | | | | Complaint and patch from Zoltán Böszörményi. When cross-compiling, the native make doesn't know about the Windows .exe suffix, so it only builds with it when explicitly told to do so. The native make will not see the link between the target name and the built executable, and might this do unnecesary work, but that's a bigger problem than this one, if in fact we consider it a problem at all. Back-patch to all live branches.
* Protect against SnapshotNow race conditions in pg_tablespace scans.Tom Lane2013-01-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use of SnapshotNow is known to expose us to race conditions if the tuple(s) being sought could be updated by concurrently-committing transactions. CREATE DATABASE and DROP DATABASE are particularly exposed because they do heavyweight filesystem operations during their scans of pg_tablespace, so that the scans run for a very long time compared to most. Furthermore, the potential consequences of a missed or twice-visited row are nastier than average: * createdb() could fail with a bogus "file already exists" error, or silently fail to copy one or more tablespace's worth of files into the new database. * remove_dbtablespaces() could miss one or more tablespaces, thus failing to free filesystem space for the dropped database. * check_db_file_conflict() could likewise miss a tablespace, leading to an OID conflict that could result in data loss either immediately or in future operations. (This seems of very low probability, though, since a duplicate database OID would be unlikely to start with.) Hence, it seems worth fixing these three places to use MVCC snapshots, even though this will someday be superseded by a generic solution to SnapshotNow race conditions. Back-patch to all active branches. Stephen Frost and Tom Lane
* Unbreak lock conflict detection for Hot Standby.Robert Haas2013-01-18
| | | | | | | | | | This got broken in the original fast-path locking patch, because I failed to account for the fact that Hot Standby startup process might take a strong relation lock on a relation in a database to which it is not bound, and confused MyDatabaseId with the database ID of the relation being locked. Report and diagnosis by Andres Freund. Final form of patch by me.
* On second thought, use an empty string instead of "none" when not connected.Heikki Linnakangas2013-01-15
| | | | | | | | | "none" could mislead to think that you're connected a database with that name. Also, it needs to be translated, which might be hard without some context. So in back-branches, use empty string, so that the message is (currently ""), which is at least unambiguous and doens't require translation. In master, it's no problem to add translatable strings, so use a different fix there.
* Don't pass NULL to fprintf, if not currently connected to a database.Heikki Linnakangas2013-01-15
| | | | | Backpatch all the way to 8.3. Fixes bug #7811, per report and diagnosis by Meng Qingzhong.
* Reject out-of-range dates in to_date().Tom Lane2013-01-14
| | | | | | | | | Dates outside the supported range could be entered, but would not print reasonably, and operations such as conversion to timestamp wouldn't behave sanely either. Since this has the potential to result in undumpable table data, it seems worth back-patching. Hitoshi Harada
* Add new timezone abbrevation "FET".Tom Lane2013-01-14
| | | | | | | This seems to have been invented in 2011 to represent GMT+3, non daylight savings rules, as now used in Europe/Kaliningrad and Europe/Minsk. There are no conflicts so might as well add it to the Default list. Per bug #7804 from Ruslan Izmaylov.
* Extend and improve use of EXTRA_REGRESS_OPTS.Andrew Dunstan2013-01-12
| | | | | | | | | | This is now used by ecpg tests, and not clobbered by pg_upgrade tests. This change won't affect anything that doesn't set this environment variable, but will enable the buildfarm to control exactly what port regression test installs will be running on, and thus to detect possible rogue postmasters more easily. Backpatch to release 9.2 where EXTRA_REGRESS_OPTS was first used.
* Revert ill-considered change of index-size fudge factor.Tom Lane2013-01-11
| | | | | | | | | This partially reverts commit 21a39de5809cd3050a37d2554323cc1d0cbeed9d, restoring the pre-9.2 cost estimates for index usage. That change introduced much too large a bias against larger indexes, as per reports from Jeff Janes and others. The whole thing needs a rewrite, which I've done in HEAD, but the safest thing to do in 9.2 is just to undo this multiplier change.
* Properly install ecpg_compat and pgtypes libraries on msvcMagnus Hagander2013-01-09
| | | | JiangGuiqing
* Fix potential corruption of lock table in CREATE/DROP INDEX CONCURRENTLY.Tom Lane2013-01-08
| | | | | | | | | | | | | | | | If VirtualXactLock() has to wait for a transaction that holds its VXID lock as a fast-path lock, it must first convert the fast-path lock to a regular lock. It failed to take the required "partition" lock on the main shared-memory lock table while doing so. This is the direct cause of the assert failure in GetLockStatusData() recently observed in the buildfarm, but more worryingly it could result in arbitrary corruption of the shared lock table if some other process were concurrently engaged in modifying the same partition of the lock table. Fortunately, VirtualXactLock() is only used by CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY, so the opportunities for failure are fewer than they might have been. In passing, improve some comments and be a bit more consistent about order of operations.
* Invent a "one-shot" variant of CachedPlans for better performance.Tom Lane2013-01-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SPI_execute() and related functions create a CachedPlan, execute it once, and immediately discard it, so that the functionality offered by plancache.c is of no value in this code path. And performance measurements show that the extra data copying and invalidation checking done by plancache.c slows down simple queries by 10% or more compared to 9.1. However, enough of the SPI code is shared with functions that do need plan caching that it seems impractical to bypass plancache.c altogether. Instead, let's invent a variant version of cached plans that preserves 99% of the API but doesn't offer any of the actual functionality, nor the overhead. This puts SPI_execute() performance back on par, or maybe even slightly better, than it was before. This change should resolve recent complaints of performance degradation from Dong Ye, Pavel Stehule, and others. By avoiding data copying, this change also reduces the amount of memory needed to execute many-statement SPI_execute() strings, as for instance in a recent complaint from Tomas Vondra. An additional benefit of this change is that multi-statement SPI_execute() query strings are now processed fully serially, that is we complete execution of earlier statements before running parse analysis and planning on following ones. This eliminates a long-standing POLA violation, in that DDL that affects the behavior of a later statement will now behave as expected. Back-patch to 9.2, since this was a performance regression compared to 9.1. (In 9.2, place the added struct fields so as to avoid changing the offsets of existing fields.) Heikki Linnakangas and Tom Lane
* Tolerate timeline switches while "pg_basebackup -X fetch" is running.Heikki Linnakangas2013-01-03
| | | | | | | | | | | | | | | | | | | | | | If you take a base backup from a standby server with "pg_basebackup -X fetch", and the timeline switches while the backup is being taken, the backup used to fail with an error "requested WAL segment %s has already been removed". This is because the server-side code that sends over the required WAL files would not construct the WAL filename with the correct timeline after a switch. Fix that by using readdir() to scan pg_xlog for all the WAL segments in the range, regardless of timeline. Also, include all timeline history files in the backup, if taken with "-X fetch". That fixes another related bug: If a timeline switch happened just before the backup was initiated in a standby, the WAL segment containing the initial checkpoint record contains WAL from the older timeline too. Recovery will not accept that without a timeline history file that lists the older timeline. Backpatch to 9.2. Versions prior to that were not affected as you could not take a base backup from a standby before 9.2.
* Keep timeline history files restored from archive in pg_xlog.Heikki Linnakangas2012-12-30
| | | | | | | | | | | | | | | | | | | | | | | The cascading standby patch in 9.2 changed the way WAL files are treated when restored from the archive. Before, they were restored under a temporary filename, and not kept in pg_xlog, but after the patch, they were copied under pg_xlog. This is necessary for a cascading standby to find them, but it also means that if the archive goes offline and a standby is restarted, it can recover back to where it was using the files in pg_xlog. It also means that if you take an offline backup from a standby server, it includes all the required WAL files in pg_xlog. However, the same change was not made to timeline history files, so if the WAL segment containing the checkpoint record contains a timeline switch, you will still get an error if you try to restart recovery without the archive, or recover from an offline backup taken from the standby. With this patch, timeline history files restored from archive are copied into pg_xlog like WAL files are, so that pg_xlog contains all the files required to recover. This is a corner-case pre-existing issue in 9.2, but even more important in master where it's possible for a standby to follow a timeline switch through streaming replication. To make that possible, the timeline history files must be present in pg_xlog.
* Fix some minor issues in view pretty-printing.Tom Lane2012-12-24
| | | | | | | | Code review for commit 2f582f76b1945929ff07116cd4639747ce9bb8a1: don't use a static variable for what ought to be a deparse_context field, fix non-multibyte-safe test for spaces, avoid useless and potentially O(N^2) (though admittedly with a very small constant) calculations of wrap positions when we aren't going to wrap.
* Prevent failure when RowExpr or XmlExpr is parse-analyzed twice.Tom Lane2012-12-23
| | | | | | | | | | | | transformExpr() is required to cope with already-transformed expression trees, for various ugly-but-not-quite-worth-cleaning-up reasons. However, some of its newer subroutines hadn't gotten the memo. This accounts for bug #7763 from Norbert Buchmuller: transformRowExpr() was overwriting the previously determined type of a RowExpr during CREATE TABLE LIKE INCLUDING INDEXES. Additional investigation showed that transformXmlExpr had the same kind of problem, but all the other cases seem to be safe. Andres Freund and Tom Lane
* Fix race condition if a file is removed while pg_basebackup is running.Heikki Linnakangas2012-12-21
| | | | | | | | If a relation file was removed when the server-side counterpart of pg_basebackup was just about to open it to send it to the client, you'd get a "could not open file" error. Fix that. Backpatch to 9.1, this goes back to when pg_basebackup was introduced.
* Fix grammatical mistake in error messagePeter Eisentraut2012-12-20
|
* Fix pg_extension_config_dump() to handle update cases more sanely.Tom Lane2012-12-20
| | | | | | | | | | | | | | | | | | | | | | If pg_extension_config_dump() is executed again for a table already listed in the extension's extconfig, the code was blindly making a new array entry. This does not seem useful. Fix it to replace the existing array entry instead, so that it's possible for extension update scripts to alter the filter conditions for configuration tables. In addition, teach ALTER EXTENSION DROP TABLE to check for an extconfig entry for the target table, and remove it if present. This is not a 100% solution because it's allowed for an extension update script to just summarily DROP a member table, and that code path doesn't go through ExecAlterExtensionContentsStmt. We could probably make that case clean things up if we had to, but it would involve sticking a very ugly wart somewhere in the guts of dependency.c. Since on the whole it seems quite unlikely that extension updates would want to remove pre-existing configuration tables, making the case possible with an explicit command seems sufficient. Per bug #7756 from Regina Obe. Back-patch to 9.1 where extensions were introduced.
* Fix recycling of WAL segments after changing recovery target timeline.Heikki Linnakangas2012-12-20
| | | | | | | | | | | | | | | | | | | | | | | | | After the recovery target timeline is changed, we would still recycle and preallocate WAL segments on the old target timeline. Those WAL segments created for the old timeline are a waste of space, although otherwise harmless. The problem is that when installing a recycled WAL segment as a future one, ThisTimeLineID is used to construct the filename. ThisTimeLineID is initialized in the checkpointer process to the recovery target timeline at startup, but it was not updated when the startup process chooses a new target timeline (recovery_target_timeline='latest'). To fix, always update ThisTimeLineID before recycling WAL segments at a restartpoint. This still leaves a small window where we might install WAL segments under wrong timeline ID, if the target timeline is changed just as we're about to start recycling. Also, when we're not on the target timeline yet, but still replaying some older timeline, we'll install WAL segments to the newer timeline anyway and they will still go wasted. We'll just live with the waste in that situation. Commit to 9.2 and 9.1. Older versions didn't change recovery target timeline after startup, and for master, I'll commit a slightly different variant of this.
* Check if we've reached end-of-backup point also if no redo is required.Heikki Linnakangas2012-12-19
| | | | | | | | | | | | | If you restored from a backup taken from a standby, and the last record in the backup is the checkpoint record, ie. there is no redo required except for the checkpoint record, we would fail to notice that we've reached the end-of-backup point, and the database is consistent. The result was an error "WAL ends before end of online backup". To fix, move the have-we-reached-end-of-backup check into CheckRecoveryConsistency(), which is already responsible for similar checks with minRecoveryPoint, and is called in the right places. Backpatch to 9.2, this check and bug did not exist before that.
* Fix failure to ignore leftover temp tables after a server crash.Tom Lane2012-12-17
| | | | | | | | | | | | | | | | | | | During crash recovery, we remove disk files belonging to temporary tables, but the system catalog entries for such tables are intentionally not cleaned up right away. Instead, the first backend that uses a temp schema is expected to clean out any leftover objects therein. This approach requires that we be careful to ignore leftover temp tables (since any actual access attempt would fail), *even if their BackendId matches our session*, if we have not yet established use of the session's corresponding temp schema. That worked fine in the past, but was broken by commit debcec7dc31a992703911a9953e299c8d730c778 which incorrectly removed the rd_islocaltemp relcache flag. Put it back, and undo various changes that substituted tests like "rel->rd_backend == MyBackendId" for use of a state-aware flag. Per trouble report from Heikki Linnakangas. Back-patch to 9.1 where the erroneous change was made. In the back branches, be careful to add rd_islocaltemp in a spot in the struct that was alignment padding before, so as not to break existing add-on code.
* Fix filling of postmaster.pid in bootstrap/standalone mode.Tom Lane2012-12-16
| | | | | | | | | | | | | | | | | | | | | | | We failed to ever fill the sixth line (LISTEN_ADDR), which caused the attempt to fill the seventh line (SHMEM_KEY) to fail, so that the shared memory key never got added to the file in standalone mode. This has been broken since we added more content to our lock files in 9.1. To fix, tweak the logic in CreateLockFile to add an empty LISTEN_ADDR line in standalone mode. This is a tad grotty, but since that function already knows almost everything there is to know about the contents of lock files, it doesn't seem that it's any better to hack it elsewhere. It's not clear how significant this bug really is, since a standalone backend should never have any children and thus it seems not critical to be able to check the nattch count of the shmem segment externally. But I'm going to back-patch the fix anyway. This problem had escaped notice because of an ancient (and in hindsight pretty dubious) decision to suppress LOG-level messages by default in standalone mode; so that the elog(LOG) complaint in AddToDataDirLockFile that should have warned of the problem didn't do anything. Fixing that is material for a separate patch though.
* Properly copy fmgroids.h after clean on Win32Magnus Hagander2012-12-16
| | | | Craig Ringer
* In multi-insert, don't go into infinite loop on a huge tuple and fillfactor.Heikki Linnakangas2012-12-12
| | | | | | | | | | | | | | | | | | If a tuple is larger than page size minus space reserved for fillfactor, heap_multi_insert would never find a page that it fits in and repeatedly ask for a new page from RelationGetBufferForTuple. If a tuple is too large to fit on any page, taking fillfactor into account, RelationGetBufferForTuple will always expand the relation. In a normal insert, heap_insert will accept that and put the tuple on the new page. heap_multi_insert, however, does a fillfactor check of its own, and doesn't accept the newly-extended page RelationGetBufferForTuple returns, even though there is no other choice to make the tuple fit. Fix that by making the logic in heap_multi_insert more like the heap_insert logic. The first tuple is always put on the page RelationGetBufferForTuple gives us, and the fillfactor check is only applied to the subsequent tuples. Report from David Gould, although I didn't use his patch.
* Add defenses against integer overflow in dynahash numbuckets calculations.Tom Lane2012-12-11
| | | | | | | | | | | | | | | The dynahash code requires the number of buckets in a hash table to fit in an int; but since we calculate the desired hash table size dynamically, there are various scenarios where we might calculate too large a value. The resulting overflow can lead to infinite loops, division-by-zero crashes, etc. I (tgl) had previously installed some defenses against that in commit 299d1716525c659f0e02840e31fbe4dea3, but that covered only one call path. Moreover it worked by limiting the request size to work_mem, but in a 64-bit machine it's possible to set work_mem high enough that the problem appears anyway. So let's fix the problem at the root by installing limits in the dynahash.c functions themselves. Trouble report and patch by Jeff Davis.
* Consistency check should compare last record replayed, not last record read.Heikki Linnakangas2012-12-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | EndRecPtr is the last record that we've read, but not necessarily yet replayed. CheckRecoveryConsistency should compare minRecoveryPoint with the last replayed record instead. This caused recovery to think it's reached consistency too early. Now that we do the check in CheckRecoveryConsistency correctly, we have to move the call of that function to after redoing a record. The current place, after reading a record but before replaying it, is wrong. In particular, if there are no more records after the one ending at minRecoveryPoint, we don't enter hot standby until one extra record is generated and read by the standby, and CheckRecoveryConsistency is called. These two bugs conspired to make the code appear to work correctly, except for the small window between reading the last record that reaches minRecoveryPoint, and replaying it. In the passing, rename recoveryLastRecPtr, which is the last record replayed, to lastReplayedEndRecPtr. This makes it slightly less confusing with replayEndRecPtr, which is the last record read that we're about to replay. Original report from Kyotaro HORIGUCHI, further diagnosis by Fujii Masao. Backpatch to 9.0, where Hot Standby subtly changed the test from "minRecoveryPoint < EndRecPtr" to "minRecoveryPoint <= EndRecPtr". The former works because where the test is performed, we have always read one more record than we've replayed.
* Add mode where contrib installcheck runs each module in a separately named ↵Andrew Dunstan2012-12-11
| | | | | | | | | | | | | | | | | | | | | database. Normally each module is tested in a database named contrib_regression, which is dropped and recreated at the beginhning of each pg_regress run. This new mode, enabled by adding USE_MODULE_DB=1 to the make command line, runs most modules in a database with the module name embedded in it. This will make testing pg_upgrade on clusters with the contrib modules a lot easier. Second attempt at this, this time accomodating make versions older than 3.82. Still to be done: adapt to the MSVC build system. Backpatch to 9.0, which is the earliest version it is reasonably possible to test upgrading from.
* Update minimum recovery point on truncation.Heikki Linnakangas2012-12-10
| | | | | | | | | If a file is truncated, we must update minRecoveryPoint. Once a file is truncated, there's no going back; it would not be safe to stop recovery at a point earlier than that anymore. Per report from Kyotaro HORIGUCHI. Backpatch to 8.4. Before that, minRecoveryPoint was not updated during recovery at all.
* Fix assorted bugs in privileges-for-types patch.Tom Lane2012-12-09
| | | | | | | | Commit 729205571e81b4767efc42ad7beb53663e08d1ff added privileges on data types, but there were a number of oversights. The implementation of default privileges for types missed a few places, and pg_dump was utterly innocent of the whole concept. Per bug #7741 from Nathan Alden, and subsequent wider investigation.
* Fix intermittent crash in DROP INDEX CONCURRENTLY.Tom Lane2012-12-05
| | | | | | | | | | | | | | | When deleteOneObject closes and reopens the pg_depend relation, we must see to it that the relcache pointer held by the calling function (typically performMultipleDeletions) is updated. Usually the relcache entry is retained so that the pointer value doesn't change, which is why the problem had escaped notice ... but after a cache flush event there's no guarantee that the same memory will be reassigned. To fix, change the recursive functions' APIs so that we pass around a "Relation *" not just "Relation". Per investigation of occasional buildfarm failures. This is trivial to reproduce with -DCLOBBER_CACHE_ALWAYS, which points up the sad lack of any buildfarm member running that way on a regular basis.
* Ensure recovery pause feature doesn't pause unless users can connect.Tom Lane2012-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | If we're not in hot standby mode, then there's no way for users to connect to reset the recoveryPause flag, so we shouldn't pause. The code was aware of this but the test to see if pausing was safe was seriously inadequate: it wasn't paying attention to reachedConsistency, and besides what it was testing was that we could legally enter hot standby, not that we have done so. Get rid of that in favor of checking LocalHotStandbyActive, which because of the coding in CheckRecoveryConsistency is tantamount to checking that we have told the postmaster to enter hot standby. Also, move the recoveryPausesHere() call that reacts to asynchronous recoveryPause requests so that it's not in the middle of application of a WAL record. I put it next to the recoveryStopsHere() call --- in future those are going to need to interact significantly, so this seems like a good waystation. Also, don't bother trying to read another WAL record if we've already decided not to continue recovery. This was no big deal when the code was written originally, but now that reading a record might entail actions like fetching an archive file, it seems a bit silly to do it like that. Per report from Jeff Janes and subsequent discussion. The pause feature needs quite a lot more work, but this gets rid of some indisputable bugs, and seems safe enough to back-patch.