path: root/src
Commit message (Author, Date)
* Make VACUUM avoid waiting for a cleanup lock, where possible. (Robert Haas, 2011-11-07)
| | | | | | | | | | | In a regular VACUUM, it's OK to skip pages for which a cleanup lock isn't immediately available; the next VACUUM will deal with them. If we're scanning the entire relation to advance relfrozenxid, we might need to wait, but only if there are tuples on the page that actually require freezing. These changes should greatly reduce the incidence of vacuum processes getting "stuck". Simon Riggs and Robert Haas
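The skip-or-wait rule described above can be illustrated with a minimal Python sketch. It is not PostgreSQL code: `vacuum_pass`, `aggressive`, `needs_freeze`, and the use of `threading.Lock` as a stand-in for a buffer cleanup lock are all assumptions made for illustration.

```python
import threading

def vacuum_pass(pages, aggressive=False, needs_freeze=frozenset()):
    """Visit pages, skipping any whose lock is busy unless we must wait.

    pages: list of (page_no, lock) pairs; each lock stands in for a cleanup lock.
    aggressive: True when the whole relation is scanned to advance relfrozenxid.
    needs_freeze: hypothetical set of page numbers holding tuples that need freezing.
    """
    cleaned, skipped = [], []
    for page_no, lock in pages:
        # Try to take the lock without blocking.
        if not lock.acquire(blocking=False):
            if aggressive and page_no in needs_freeze:
                lock.acquire()  # must wait: freezing cannot be deferred
            else:
                skipped.append(page_no)  # a later pass will deal with it
                continue
        try:
            cleaned.append(page_no)  # placeholder for the real page cleanup
        finally:
            lock.release()
    return cleaned, skipped
```

In this sketch a busy page only blocks the pass when both conditions hold (whole-relation scan and tuples that need freezing); every other busy page is simply recorded as skipped.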
* Fix timestamp range subdiff functions, when using float datetimes. (Heikki Linnakangas, 2011-11-07)
* On second thought, we'd better just drop these tests altogether. (Tom Lane, 2011-11-06)
| | | | | | | | Further experimentation reveals that my previous change didn't fix the issue entirely: these tests would still fail at the spring-forward DST transition. There doesn't seem to be any great value in testing this specific issue for both timestamp and timestamptz, so just lose the latter tests.
* Un-break horology regression test. (Tom Lane, 2011-11-06)
| | | | | | Adjust ill-considered timezone-dependent tests added in commit 8a3d33c8e6c681d512f79af4a521ee0c02befcef so that they won't fail on DST transition days. Per all-pink buildfarm.
* Oops, forgot to fix the catversion when I committed the range types patch. (Heikki Linnakangas, 2011-11-06)
| | | | | | It was inadvertently changed to 201111111, which is a wrong date. Change it to current date, and remove the comment that was supposed to remind me to fix it before committing.
* Update regression tests for \d+ modification (Magnus Hagander, 2011-11-05)
| | | | Noted by Tom
* Show statistics target for columns in \d+ on a table (Magnus Hagander, 2011-11-05)
* Make psql \d on a sequence show the table/column owning it (Magnus Hagander, 2011-11-05)
* Don't assume that a tuple's header size is unchanged during toasting. (Tom Lane, 2011-11-04)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This assumption can be wrong when the toaster is passed a raw on-disk tuple, because the tuple might pre-date an ALTER TABLE ADD COLUMN operation that added columns without rewriting the table. In such a case the tuple's natts value is smaller than what we expect from the tuple descriptor, and so its t_hoff value could be smaller too. In fact, the tuple might not have a null bitmap at all, and yet our current opinion of it is that it contains some trailing nulls. In such a situation, toast_insert_or_update did the wrong thing, because to save a few lines of code it would use the old t_hoff value as the offset where heap_fill_tuple should start filling data. This did not leave enough room for the new nulls bitmap, with the result that the first few bytes of data could be overwritten with null flag bits, as in a recent report from Hubert Depesz Lubaczewski. The particular case reported requires ALTER TABLE ADD COLUMN followed by CREATE TABLE AS SELECT * FROM ... or INSERT ... SELECT * FROM ..., and further requires that there be some out-of-line toasted fields in one of the tuples to be copied; else we'll not reach the troublesome code. The problem can only manifest in this form in 8.4 and later, because before commit a77eaa6a95009a3441e0d475d1980259d45da072, CREATE TABLE AS or INSERT/SELECT wouldn't result in raw disk tuples getting passed directly to heap_insert --- there would always have been at least a junkfilter in between, and that would reconstitute the tuple header with an up-to-date t_natts and hence t_hoff. But I'm backpatching the tuptoaster change all the way anyway, because I'm not convinced there are no older code paths that present a similar risk.
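To see why reusing the old t_hoff is unsafe, here is a small sketch of the header-size arithmetic. The constants (a 23-byte fixed header before the null bitmap, and 8-byte maximum alignment) are assumptions about a typical build rather than values stated in this commit, and `header_size` is a hypothetical helper, not a PostgreSQL function.

```python
FIXED_HEADER_BYTES = 23  # assumed size of the tuple header before the null bitmap
MAX_ALIGN = 8            # assumed maximum alignment of the platform

def maxalign(n):
    """Round n up to the next multiple of MAX_ALIGN."""
    return (n + MAX_ALIGN - 1) // MAX_ALIGN * MAX_ALIGN

def header_size(natts, has_nulls):
    """Offset at which user data starts for a tuple with natts attributes."""
    size = FIXED_HEADER_BYTES
    if has_nulls:
        size += (natts + 7) // 8  # one bit per attribute in the null bitmap
    return maxalign(size)

old_hoff = header_size(8, has_nulls=False)   # tuple as stored before ADD COLUMN
new_hoff = header_size(12, has_nulls=True)   # same tuple seen with 4 trailing nulls
```

Under these assumptions the stored tuple's data starts at offset 24, while the rebuilt tuple needs its data to start at offset 32. Filling data at the old offset leaves no room for the new null bitmap, which is the overlap the commit message describes.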
* Add missing space in comment (Magnus Hagander, 2011-11-04)
* Move user functions related to WAL into xlogfuncs.c (Simon Riggs, 2011-11-04)
* Unbreak isolationtester on Win32 (Alvaro Herrera, 2011-11-04)
| | | | | | | I broke it in a previous commit because I neglected to install the necessary incantations to have getopt() work on Windows. Per red blots in buildfarm.
* Improve comments for TSLexeme data structure. (Tom Lane, 2011-11-03)
| | | | Mostly, clean up long-ago pgindent damage.
* Fix inline_set_returning_function() to allow multiple OUT parameters. (Tom Lane, 2011-11-03)
| | | | | | | | inline_set_returning_function failed to distinguish functions returning generic RECORD (which require a column list in the RTE, as well as run-time type checking) from those with multiple OUT parameters (which do not). This prevented inlining from happening. Per complaint from Jay Levitt. Back-patch to 8.4 where this capability was introduced.
* Implement a dry-run mode for isolationtester (Alvaro Herrera, 2011-11-03)
| | | | | | | | This mode prints out the permutations that would be run by the given spec file, in the same format used by the permutation lines in spec files. This helps in building new spec files. Author: Alexander Shulgin, with some tweaks by me
* Do not treat a superuser as a member of every role for HBA purposes. (Andrew Dunstan, 2011-11-03)
| | | | | | This makes it possible to use reject lines with group roles. Andrew Dunstan, reviewed by Robert Haas.
* Properly close replication connection in pg_receivexlog (Magnus Hagander, 2011-11-03)
* Pre-pad WAL files when streaming transaction log (Magnus Hagander, 2011-11-03)
| | | | | | | | | | | | | | Instead of filling files as they appear, pre-pad the WAL files received when streaming xlog the same way that the server does. Data is streamed into a .partial file which is then rename()d into place when it's complete, but it will always be 16MB. This also means that the starting position for pg_receivexlog is now simply right after the last complete segment, and we never need to deal with partial segments there. Patch by me, review by Fujii Masao
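The pre-pad-then-rename pattern can be sketched in a few lines of Python. This is only an illustration of the idea, not the pg_receivexlog implementation: the function names, the block size used for zero-filling, and the segment file name in `demo` are all hypothetical.

```python
import os
import tempfile

SEGMENT_SIZE = 16 * 1024 * 1024  # 16MB, the segment size named in the commit message
ZERO_BLOCK = b"\0" * 8192        # hypothetical chunk size for zero-filling

def open_prepadded(directory, segment_name):
    """Create <segment_name>.partial, zero-filled to the full segment size."""
    partial_path = os.path.join(directory, segment_name + ".partial")
    f = open(partial_path, "w+b")
    for _ in range(SEGMENT_SIZE // len(ZERO_BLOCK)):
        f.write(ZERO_BLOCK)
    f.seek(0)  # streamed data overwrites the padding from the start
    return f, partial_path

def complete_segment(f, partial_path):
    """Flush the file to disk, then rename it to its final name."""
    f.flush()
    os.fsync(f.fileno())
    f.close()
    final_path = partial_path[: -len(".partial")]
    os.rename(partial_path, final_path)
    return final_path

def demo():
    """Stream a few bytes into a pre-padded segment and finish it."""
    with tempfile.TemporaryDirectory() as d:
        f, partial = open_prepadded(d, "000000010000000000000001")
        f.write(b"streamed WAL bytes")
        final = complete_segment(f, partial)
        return os.path.getsize(final), os.path.exists(partial)
```

Because the file is padded before any data arrives, its size is the same whether streaming stops early or runs to the end of the segment. Only the name tells a reader whether the segment is complete.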
* Support range data types. (Heikki Linnakangas, 2011-11-03)
| | | | | | | Selectivity estimation functions are missing for some range type operators, which is a TODO. Jeff Davis
* Fix handling of PlaceHolderVars in nestloop parameter management. (Tom Lane, 2011-11-03)
| | | | | | | | | | | | | If we use a PlaceHolderVar from the outer relation in an inner indexscan, we need to reference the PlaceHolderVar as such as the value to be passed in from the outer relation. The previous code effectively tried to reconstruct the PHV from its component expression, which doesn't work since (a) the Vars therein aren't necessarily bubbled up far enough, and (b) it would be the wrong semantics anyway because of the possibility that the PHV is supposed to have gone to null at some point before the current join. Point (a) led to "variable not found in subplan target list" planner errors, but point (b) would have led to silently wrong answers. Per report from Roger Niederland.
* Avoid scanning nulls at the beginning of a btree index scan. (Tom Lane, 2011-11-02)
| | | | | | | | | | | If we have an inequality key that constrains the other end of the index, it doesn't directly help us in doing the initial positioning ... but it does imply a NOT NULL constraint on the index column. If the index stores nulls at this end, we can use the implied NOT NULL condition for initial positioning, just as if it had been stated explicitly. This avoids wasting time when there are a lot of nulls in the column. This is the reverse of the examples given in bugs #6278 and #6283, which were about failing to stop early when we encounter nulls at the end of the indexscan.
* Fix btree stop-at-nulls logic properly. (Tom Lane, 2011-11-02)
| | | | | | | | | | | | As pointed out by Naoya Anzai, my previous try at this was a few bricks shy of a load, because I had forgotten that the initial-positioning logic might not try to skip over nulls at the end of the index the scan will start from. We ought to fix that, because it represents an unnecessary inefficiency, but first let's get the scan-stop logic back to a safe state. With this patch, we preserve the performance benefit requested in bug #6278 for the case of scanning forward into NULLs (in a NULLS LAST index), but the reverse case of scanning backward across NULLs when there's no suitable initial-positioning qual is still inefficient.
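The forward-scan case mentioned above (scanning into NULLs in a NULLS LAST index) can be pictured with a toy model. The sketch below treats the index as a plain Python list and does no real initial positioning; `scan_greater_than` is a hypothetical name and the model ignores everything else a btree scan does.

```python
def scan_greater_than(entries, bound):
    """Forward scan of a NULLS LAST index for the qual `value > bound`.

    entries: values in index order, non-nulls ascending, then None for NULLs.
    Returns the matching values and the number of entries visited.
    """
    matches = []
    visited = 0
    for value in entries:
        visited += 1
        if value is None:
            # A strict comparison can never be true for NULL, and every
            # remaining entry is NULL too, so the scan can stop here.
            break
        if value > bound:
            matches.append(value)
    return matches, visited

index = [1, 3, 5, 7, 9, None, None, None]
matches, visited = scan_greater_than(index, 4)
```

In this toy run only the first NULL is visited; the remaining NULL entries are never touched, which is the performance benefit the commit preserves for this direction.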
* Update more comments about checkpoints being done by bgwriter (Simon Riggs, 2011-11-02)
* Reduce checkpoints and WAL traffic on low activity database server (Simon Riggs, 2011-11-02)
| | | | | | | | | | | | Previously, we skipped a checkpoint if no WAL had been written since the last checkpoint, though this does not appear in user documentation. As of now, we skip a checkpoint until we have written at least enough WAL to switch to the next WAL file. This greatly reduces the level of activity and number of WAL messages generated by a very low activity server. This is safe because the purpose of a checkpoint is to act as a starting place for a recovery, in case of crash. This patch maintains minimal WAL volume for replay in case of crash, thus maintaining very low crash recovery time.
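One way to read the new rule is as a same-segment test: a checkpoint is skipped while the current WAL position is still in the segment where the previous checkpoint started. The sketch below assumes that reading, a 16MB segment size, and plain integer byte positions; `should_skip_checkpoint` is a hypothetical helper, not the server's code.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed segment size in bytes

def should_skip_checkpoint(last_checkpoint_pos, current_insert_pos):
    """Skip unless WAL has advanced into a later segment file.

    Both arguments are byte positions in the WAL stream (an assumption;
    the real server uses its own pointer type).
    """
    last_segment = last_checkpoint_pos // WAL_SEGMENT_SIZE
    current_segment = current_insert_pos // WAL_SEGMENT_SIZE
    return current_segment == last_segment
```

With this rule an almost idle server, which writes only a trickle of WAL, stays inside one segment for a long time and so performs far fewer checkpoints.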
* Refactor xlog.c to create src/backend/postmaster/startup.c (Simon Riggs, 2011-11-02)
| | | | | Startup process now has its own dedicated file, just like all other special/background processes. Reduces role and size of xlog.c
* Derive oldestActiveXid at correct time for Hot Standby. (Simon Riggs, 2011-11-02)
| | | | | | | | | There was a timing window between when oldestActiveXid was derived and when it should have been derived that only shows itself under heavy load. Move code around to ensure correct timing of derivation. No change to StartupSUBTRANS() code, which is where this failed. Bug report by Chris Redekop
* Start Hot Standby faster when initial snapshot is incomplete. (Simon Riggs, 2011-11-02)
| | | | | | | | | If the initial snapshot had overflowed, then we can start whenever the latest snapshot is empty or not overflowed or, as we did already, when the xmin on the primary is higher than the xmax of our starting snapshot, which proves we have full snapshot data. Bug report by Chris Redekop
* Remove spurious entry from missed catch while patch juggling (Simon Riggs, 2011-11-02)
* Fix timing of Startup CLOG and MultiXact during Hot Standby (Simon Riggs, 2011-11-02)
| | | | Patch by me, bug report by Chris Redekop, analysis by Florian Pflug
* Initialize myProcLocks queues just once, at postmaster startup. (Robert Haas, 2011-11-01)
| | | | | | | In assert-enabled builds, we assert during the shutdown sequence that the queues have been properly emptied, and during process startup that we are inheriting empty queues. In non-assert enabled builds, we just save a few cycles.
* Preserve Var location information during flatten_join_alias_vars. (Tom Lane, 2011-11-01)
| | | | | | | This allows us to give correct syntax error pointers when complaining about ungrouped variables in a join query with aggregates or GROUP BY. It's pretty much irrelevant for the planner's use of the function, though perhaps it might aid debugging sometimes.
* Fix race condition with toast table access from a stale syscache entry. (Tom Lane, 2011-11-01)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a tuple in a syscache contains an out-of-line toasted field, and we try to fetch that field shortly after some other transaction has committed an update or deletion of the tuple, there is a race condition: vacuum could come along and remove the toast tuples before we can fetch them. This leads to transient failures like "missing chunk number 0 for toast value NNNNN in pg_toast_2619", as seen in recent reports from Andrew Hammond and Tim Uckun. The design idea of syscache is that access to stale syscache entries should be prevented by relation-level locks, but that fails for at least two cases where toasted fields are possible: ANALYZE updates pg_statistic rows without locking out sessions that might want to plan queries on the same table, and CREATE OR REPLACE FUNCTION updates pg_proc rows without any meaningful lock at all. The least risky fix seems to be an idea that Heikki suggested when we were dealing with a related problem back in August: forcibly detoast any out-of-line fields before putting a tuple into syscache in the first place. This avoids the problem because at the time we fetch the parent tuple from the catalog, we should be holding an MVCC snapshot that will prevent removal of the toast tuples, even if the parent tuple is outdated immediately after we fetch it. (Note: I'm not convinced that this statement holds true at every instant where we could be fetching a syscache entry at all, but it does appear to hold true at the times where we could fetch an entry that could have a toasted field. We will need to be a bit wary of adding toast tables to low-level catalogs that don't have them already.) An additional benefit is that subsequent uses of the syscache entry should be faster, since they won't have to detoast the field. Back-patch to all supported versions. 
The problem is significantly harder to reproduce in pre-9.0 releases, because of their willingness to flush every entry in a syscache whenever the underlying catalog is vacuumed (cf CatalogCacheFlushRelation); but there is still a window for trouble.
* Clean up whitespace and indentation in parser and scanner files (Peter Eisentraut, 2011-11-01)
| | | | These are not touched by pgindent, so clean them up a bit manually.
* Comment changes to show bgwriter no longer performs checkpoints. (Simon Riggs, 2011-11-01)
* Have checkpointer send stats once each processing loop. (Simon Riggs, 2011-11-01)
| | | | Noted by Fujii Masao
* Add new file for checkpointer.c (Simon Riggs, 2011-11-01)
* Split work of bgwriter between 2 processes: bgwriter and checkpointer. (Simon Riggs, 2011-11-01)
| | | | | | | | | | | | | bgwriter is now a much less important process, responsible for page cleaning duties only. checkpointer is now responsible for checkpoints and so has a key role in shutdown. Later patches will correct doc references to the now old idea that bgwriter performs checkpoints. Has a beneficial effect on performance at high write rates, but this is mainly a refactoring to more easily allow changes for power reduction, by simplifying the previously tortuous code required to allow page cleaning and checkpointing to time-slice in the same process. Patch by me, Review by Dickson Guedes
* Stop btree indexscans upon reaching nulls in either direction. (Tom Lane, 2011-10-31)
| | | | | | | | | | | The existing scan-direction-sensitive tests were overly complex, and failed to stop the scan in cases where it's perfectly legitimate to do so. Per bug #6278 from Maksym Boguk. Back-patch to 8.3, which is as far back as the patch applies easily. Doesn't seem worth sweating over a relatively minor performance issue in 8.2 at this late date. (But note that this was a performance regression from 8.1 and before, so 8.2 is being left as an outlier.)
* Support more locale-specific formatting options in cash_out(). (Tom Lane, 2011-10-30)
| | | | | | | | | | | | | | | The POSIX spec defines locale fields for controlling the ordering of the value, sign, and currency symbol in monetary output, but cash_out only supported a small subset of these options. Fully implement p/n_sign_posn, p/n_cs_precedes, and p/n_sep_by_space per spec. Fix up cash_in so that it will accept all these format variants. Also, make sure that thousands_sep is only inserted to the left of the decimal point, as required by spec. Per bug #6144 from Eduard Kracmar and discussion of bug #6277. This patch includes some ideas from Alexander Lakhin's proposed patch, though it is very different in detail.
* Further improvement of make_greater_string. (Tom Lane, 2011-10-30)
| | | | | | | | | Make sure that it considers all the possibilities that the old code did, instead of trying only one possibility per character position. To keep the runtime in bounds, instead tweak the character incrementers to not try every possible multibyte character code. Remove unnecessary logic to restore the old character value on failure. Additional comment and formatting cleanup.
* Update visibilitymap.c header comments. (Robert Haas, 2011-10-29)
| | | | Recent work on index-only scans left this somewhat out of date.
* Fix assorted bogosities in cash_in() and cash_out(). (Tom Lane, 2011-10-29)
| | | | | | | | | | | | | cash_out failed to handle multiple-byte thousands separators, as per bug #6277 from Alexander Law. In addition, cash_in didn't handle that either, nor could it handle multiple-byte positive_sign. Both routines failed to support multiple-byte mon_decimal_point, which I did not think was worth changing, but at least now they check for the possibility and fall back to using '.' rather than emitting invalid output. Also, make cash_in handle trailing negative signs, which formerly it would reject. Since cash_out generates trailing negative signs whenever the locale tells it to, this last omission represents a fail-to-reload-dumped-data bug. IMO that justifies patching this all the way back.
* Improve make_greater_string() with encoding-specific incrementers. (Robert Haas, 2011-10-29)
| | | | | | | | | This infrastructure doesn't in any way guarantee that the character we produce will sort after the one we incremented; but it does at least make it much more likely that we'll end up with something that is a valid character, which improves our chances. Kyotaro Horiguchi, with various adjustments by me.
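The general increment-or-truncate idea behind a "greater string" can be shown with a byte-wise sketch. It assumes plain byte-order comparison and ignores multibyte encodings and collations, which are the hard part the commits deal with. The function below borrows the name for readability but is a hypothetical simplification, not the PostgreSQL routine.

```python
def make_greater_string(prefix):
    """Return a byte string that sorts after every string starting with prefix.

    Tries to increment the last byte; if that byte is already at its
    maximum, drops it and retries one position to the left.
    Returns None when no such string exists (empty or all-0xFF input).
    """
    buf = bytearray(prefix)
    while buf:
        if buf[-1] < 0xFF:
            buf[-1] += 1
            return bytes(buf)
        buf.pop()  # cannot increment this byte; shorten and try again
    return None

upper = make_greater_string(b"abc")
```

An upper bound like this lets a prefix match such as `abc%` be treated as the range from `abc` up to, but not including, the returned value. The encoding-specific incrementers exist because a blind byte increment can produce an invalid character in a multibyte encoding.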
* Allow hint bits to be set sooner for temporary and unlogged tables. (Robert Haas, 2011-10-28)
| | | | | | | | | | | We need not wait until the commit record is durably on disk, because in the event of a crash the page we're updating with hint bits will be gone anyway. Per off-list report from Heikki Linnakangas, this can significantly degrade the performance of unlogged tables; I was able to show a 2x speedup from this patch on a pgbench run with scale factor 15. In practice, this will mostly help small, heavily updated tables, because on larger tables you're unlikely to run into the same row again before the commit record makes it out to disk.
* Demote some sanity checks in BufferIsValid() to assertions. (Robert Haas, 2011-10-28)
| | | | | Testing reveals that this macro is a hot-spot for index-only-scans. Per discussion with Tom Lane.
* Remove hard-coded "\connect postgres" from pg_dumpall. (Robert Haas, 2011-10-28)
| | | | | This doesn't appear to accomplish anything useful, and does make the restore fail if the postgres database happens to have been dropped.
* De-parallelize ecpg build some more. (Tom Lane, 2011-10-28)
| | | | | | Make sure ecpg/include/ is rebuilt before the other subdirectories, so that ecpg_config.h is up to date. This is not likely to matter during production builds, only development, so no back-patch.
* Update docs to point to the timezone library's new home at IANA. (Tom Lane, 2011-10-27)
| | | | | The recent unpleasantness with copyrights has accelerated a move that was already in planning.
* Fix the number of lwlocks needed by the "fast path" lock patch. (Heikki Linnakangas, 2011-10-27)
| | | | | | | | It needs one lock per backend or auxiliary process - the need for a lock for each aux process was not accounted for in NumLWLocks(). No-one noticed, because the three locks needed for the three aux processes fit into the few extra lwlocks we allocate for 3rd party modules that don't call RequestAddinLWLocks() (NUM_USER_DEFINED_LWLOCKS, 4 by default).
* Avoid recursion while processing ELSIF lists in plpgsql. (Tom Lane, 2011-10-27)
| | | | | | | | | | | | | The original implementation of ELSIF in plpgsql converted the construct into nested simple IF statements. This was prone to stack overflow with long ELSIF lists, in two different ways. First, it's difficult to generate the parsetree without using right-recursion in the bison grammar, and that's prone to parser stack overflow since nothing can be reduced until the whole list has been read. Second, we'd recurse during execution, thus creating an unnecessary risk of execution-time stack overflow. Rewrite so that the ELSIF list is represented as a flat list, scanned via iteration not recursion, and generated through left-recursion in the grammar. Per a gripe from Håvard Kongsgård.
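The execution-time half of this change, scanning a flat list of branches by iteration rather than recursing through nested IFs, can be sketched as follows. This is an illustration in Python rather than plpgsql internals; `exec_if` and the (condition, body) pair representation are assumptions.

```python
def exec_if(branches, else_body=None):
    """Run the body of the first branch whose condition is true.

    branches: flat list of (condition, body) callables, one pair for the
    IF and one for each ELSIF, in source order.
    else_body: optional callable for the ELSE part.
    """
    for condition, body in branches:  # iteration, so depth does not grow
        if condition():
            return body()
    return else_body() if else_body is not None else None

# A very long ELSIF chain: 100000 false branches followed by a true one.
long_chain = [(lambda: False, lambda: "no")] * 100000
long_chain.append((lambda: True, lambda: "hit"))
result = exec_if(long_chain)
```

A nested representation of the same chain would need one level of recursion per ELSIF, which is where the stack-overflow risk came from. The flat loop uses constant stack depth regardless of chain length.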