path: root/src/backend
Commit message (Author, Age)
* Fix ALTER EXTENSION / SET SCHEMA (Alvaro Herrera, 2012-10-31)
  In its original conception, it was leaving some objects in the old schema, but without their proper pg_depend entries; this meant that the old schema could be dropped, causing future pg_dump calls to fail on the affected database. This was originally reported by Jeff Frost as #6704; there have been other complaints elsewhere that can probably be traced to this bug.

  To fix, be more consistent about altering a table's subsidiary objects along with the table itself; this requires some restructuring in how tables are relocated when altering an extension -- hence the new AlterTableNamespaceInternal routine which encapsulates it for both the ALTER TABLE and the ALTER EXTENSION cases.

  There was another bug lurking here, which was unmasked after fixing the previous one: certain objects would be reached twice via the dependency graph, and the second attempt to move them would cause the entire operation to fail. Per discussion, it seems the best fix for this is to do more careful tracking of objects already moved: we now maintain a list of moved objects, to avoid attempting to do it twice for the same object.

  Authors: Alvaro Herrera, Dimitri Fontaine
  Reviewed by Tom Lane
* Prefer actual constants to pseudo-constants in equivalence class machinery. (Tom Lane, 2012-10-26)
  generate_base_implied_equalities_const() should prefer plain Consts over other em_is_const eclass members when choosing the "pivot" value that all the other members will be equated to. This makes it more likely that the generated equalities will be useful in constraint-exclusion proofs. Per report from Rushabh Lathia.
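The pivot preference can be illustrated with a minimal Python sketch. The real code is C inside the planner; the `Member` class, its `is_plain_const` flag, and `choose_pivot` are hypothetical names invented for this example, and only `is_const` is meant to mirror em_is_const.

```python
from dataclasses import dataclass


@dataclass
class Member:
    expr: str
    is_const: bool        # mirrors em_is_const: usable as a pseudo-constant
    is_plain_const: bool  # hypothetical flag: True only for an actual Const


def choose_pivot(members):
    """Pick the member that all other members will be equated to.

    Any pseudo-constant member is acceptable, but an actual constant is
    preferred, so the scan stops as soon as one is found.
    """
    pivot = None
    for m in members:
        if not m.is_const:
            continue
        pivot = m
        if m.is_plain_const:
            break  # a plain constant beats any other pseudo-constant
    return pivot
```

With members `now()` (pseudo-constant), `a` (a variable), and `42` (a plain constant), this picks `42`, so the derived equality for `a` compares it with a literal value.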
* Prevent parser from believing that views have system columns. (Tom Lane, 2012-10-24)
  Views should not have any pg_attribute entries for system columns. However, we forgot to remove such entries when converting a table to a view. This could lead to crashes later on, if someone attempted to reference such a column, as reported by Kohei KaiGai.

  This problem is corrected properly in HEAD (by removing the pg_attribute entries during conversion), but in the back branches we need to defend against existing mis-converted views. This fix costs us an extra syscache lookup per system column reference, which is annoying but probably not really measurable in the big scheme of things.
* Fix hash_search to avoid corruption of the hash table on out-of-memory. (Tom Lane, 2012-10-19)
  An out-of-memory error during expand_table() on a palloc-based hash table would leave a partially-initialized entry in the table. This would not be harmful for transient hash tables, since they'd get thrown away anyway at transaction abort. But for long-lived hash tables, such as the relcache hash, this would effectively corrupt the table, leading to crash or other misbehavior later.

  To fix, rearrange the order of operations so that table enlargement is attempted before we insert a new entry, rather than after adding it to the hash table.

  Problem discovered by Hitoshi Harada, though this is a bit different from his proposed patch.
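The reordering can be sketched in Python as follows. This is a toy model, not the C implementation: `TinyHash`, its `max_buckets` cap (used only to simulate an out-of-memory failure), and the growth policy are all assumptions made for illustration, and duplicate keys are not handled.

```python
class TinyHash:
    """Toy chained hash table that enlarges itself before inserting."""

    def __init__(self, nbuckets=2, max_buckets=None):
        self.buckets = [[] for _ in range(nbuckets)]
        self.count = 0
        self.max_buckets = max_buckets  # hypothetical cap to simulate OOM

    def _expand(self):
        new_size = len(self.buckets) * 2
        if self.max_buckets is not None and new_size > self.max_buckets:
            raise MemoryError("out of memory while enlarging table")
        old = self.buckets
        self.buckets = [[] for _ in range(new_size)]
        for bucket in old:
            for key, value in bucket:
                self.buckets[hash(key) % new_size].append((key, value))

    def insert(self, key, value):
        # Enlarge first: if this raises, nothing about `key` has been
        # stored yet, so the table is left exactly as it was.
        if self.count >= len(self.buckets):
            self._expand()
        self.buckets[hash(key) % len(self.buckets)].append((key, value))
        self.count += 1

    def keys(self):
        return [k for bucket in self.buckets for k, _ in bucket]
```

If enlargement fails, the caller sees the error but the table still contains only fully inserted entries, which is the property a long-lived table needs.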
* Fix ruleutils to print "INSERT INTO foo DEFAULT VALUES" correctly. (Tom Lane, 2012-10-19)
  Per bug #7615 from Marko Tiikkaja. Apparently nobody ever tried this case before ...
* Fix planning of non-strict equivalence clauses above outer joins. (Tom Lane, 2012-10-18)
  If a potential equivalence clause references a variable from the nullable side of an outer join, the planner needs to take care that derived clauses are not pushed to below the outer join; else they may use the wrong value for the variable. (The problem arises only with non-strict clauses, since if an upper clause can be proven strict then the outer join will get simplified to a plain join.) The planner attempted to prevent this type of error by checking that potential equivalence clauses aren't outerjoin-delayed as a whole, but actually we have to check each side separately, since the two sides of the clause will get moved around separately if it's treated as an equivalence. Bugs of this type can be demonstrated as far back as 7.4, even though releases before 8.3 had only a very ad-hoc notion of equivalence clauses.

  In addition, we neglected to account for the possibility that such clauses might have nonempty nullable_relids even when not outerjoin-delayed; so the equivalence-class machinery lacked logic to compute correct nullable_relids values for clauses it constructs. This oversight was harmless before 9.2 because we were only using RestrictInfo.nullable_relids for OR clauses; but as of 9.2 it could result in pushing constructed equivalence clauses to incorrect places. (This accounts for bug #7604 from Bill MacArthur.)

  Fix the first problem by adding a new test check_equivalence_delay() in distribute_qual_to_rels, and fix the second one by adding code in equivclass.c and called functions to set correct nullable_relids for generated clauses. Although I believe the second part of this is not currently necessary before 9.2, I chose to back-patch it anyway, partly to keep the logic similar across branches and partly because it seems possible we might find other reasons why we need valid values of nullable_relids in the older branches.

  Add regression tests illustrating these problems. In 9.0 and up, also add test cases checking that we can push constants through outer joins, since we've broken that optimization before and I nearly broke it again with an overly simplistic patch for this problem.
* Close un-owned SMgrRelations at transaction end. (Tom Lane, 2012-10-17)
  If an SMgrRelation is not "owned" by a relcache entry, don't allow it to live past transaction end. This design allows the same SMgrRelation to be used for blind writes of multiple blocks during a transaction, but ensures that we don't hold onto such an SMgrRelation indefinitely. Because an SMgrRelation typically corresponds to open file descriptors at the fd.c level, leaving it open when there's no corresponding relcache entry can mean that we prevent the kernel from reclaiming deleted disk space. (While CacheInvalidateSmgr messages usually fix that, there are cases where they're not issued, such as DROP DATABASE. We might want to add some more sinval messaging for that, but I'd be inclined to keep this type of logic anyway, since allowing VFDs to accumulate indefinitely for blind-written relations doesn't seem like a good idea.)

  This code replaces a previous attempt towards the same goal that proved to be unreliable. Back-patch to 9.1 where the previous patch was added.
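A rough Python model of the ownership rule is shown below. It is only a sketch: the class `SMgrTable`, the method names, and the `is_open` flag standing in for open file descriptors are hypothetical and do not correspond to the actual smgr functions.

```python
class SMgrRelation:
    def __init__(self, name):
        self.name = name
        self.owner = None    # set when a relcache entry owns this object
        self.is_open = True  # stands in for open file descriptors

    def close(self):
        self.is_open = False


class SMgrTable:
    """Hypothetical registry of open SMgrRelation objects."""

    def __init__(self):
        self.relations = {}

    def open(self, name, owner=None):
        # Reuse an existing object so several blind writes within one
        # transaction share the same SMgrRelation.
        rel = self.relations.get(name)
        if rel is None:
            rel = SMgrRelation(name)
            self.relations[name] = rel
        if owner is not None:
            rel.owner = owner
        return rel

    def at_end_of_transaction(self):
        # Close everything no relcache entry owns; owned objects survive.
        unowned = [n for n, r in self.relations.items() if r.owner is None]
        for name in unowned:
            self.relations.pop(name).close()
```

Reuse within a transaction keeps blind writes cheap, while the end-of-transaction sweep bounds how long an un-owned object can pin a deleted file.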
* Revert "Use "transient" files for blind writes, take 2". (Tom Lane, 2012-10-17)
  This reverts commit fba105b1099f4f5fa7283bb17cba6fed2baa8d0c. That approach had problems with the smgr-level state not tracking what we really want to happen, and with the VFD-level state not tracking the smgr-level state very well either. In consequence, it was still possible to hold kernel file descriptors open for long-gone tables (as in recent report from Tore Halset), and yet there were also cases of FDs being closed undesirably soon. A replacement implementation will follow.
* Split up process latch initialization for more-fail-soft behavior. (Tom Lane, 2012-10-14)
  In the previous coding, new backend processes would attempt to create their self-pipe during the OwnLatch call in InitProcess. However, pipe creation could fail if the kernel is short of resources; and the system does not recover gracefully from a FATAL error right there, since we have armed the dead-man switch for this process and not yet set up the on_shmem_exit callback that would disarm it. The postmaster then forces an unnecessary database-wide crash and restart, as reported by Sean Chittenden.

  There are various ways we could rearrange the code to fix this, but the simplest and sanest seems to be to split out creation of the self-pipe into a new function InitializeLatchSupport, which must be called from a place where failure is allowed. For most processes that gets called in InitProcess or InitAuxiliaryProcess, but processes that don't call either but still use latches need their own calls.

  Back-patch to 9.1, which has only a part of the latch logic that 9.2 and HEAD have, but nonetheless includes this bug.
* Fix cross-type case in partial row matching for hashed subplans. (Tom Lane, 2012-10-11)
  When hashing a subplan like "WHERE (a, b) NOT IN (SELECT x, y FROM ...)", findPartialMatch() attempted to match rows using the hashtable's internal equality operators, which of course are for x and y's datatypes. What we need to use are the potentially cross-type operators for a=x, b=y, etc. Failure to do that leads to wrong answers or even crashes. The scope for problems is limited to cases where we have different types with compatible hash functions (else we'd not be using a hashed subplan), but for example int4 vs int8 can cause the problem.

  Per bug #7597 from Bo Jensen. This has been wrong since the hashed-subplan code was written, so patch all the way back.
* Fix PGXS support for building loadable modules on AIX. (Tom Lane, 2012-10-09)
  Building a shlib on AIX requires use of the mkldexport.sh script, but we failed to install that, preventing its use from non-source-tree contexts. Also, Makefile.aix had the wrong idea about where to find the installed copy of the postgres.imp symbol file used by AIX.

  Per report from John Pierce. Patch all the way back, since this has been broken since the beginning of PGXS.
* Say ANALYZE, not VACUUM, in error message on analyze in hot standby. (Heikki Linnakangas, 2012-10-08)
  Tomonaru Katsumata
* REASSIGN OWNED: consider grants on tablespaces, too (Alvaro Herrera, 2012-10-03)
  Apparently this was considered in the original code (see commit cec3b0a9) but I failed to notice that such entries would always be skipped by the database check at the start of the loop.

  Per bugs #7578 by Nikolay, #6116 by tushar.qa@gmail.com.
* Fix access past end of string in date parsing. (Heikki Linnakangas, 2012-10-02)
  This affects date_in(), and a couple of other functions that use DecodeDate().

  Hitoshi Harada
* Fix tar files emitted by pg_basebackup to be POSIX conformant. (Tom Lane, 2012-09-28)
  Back-patch portions of commit 05b555d12bc2ad0d581f48a12b45174db41dc10d. There doesn't seem to be any reason not to fix pg_basebackup fully, but we can't change pg_dump's "magic" string without breaking older versions of pg_restore. Instead, just patch pg_restore to accept either version of the magic string, in hopes of avoiding compatibility problems when 9.3 comes out. I also fixed pg_dump to write the correct 2-block EOF marker, since that won't create a compatibility problem with pg_restore and it could help with some versions of tar.

  Brian Weaver and Tom Lane
* Translation updates (Peter Eisentraut, 2012-09-19)
* Fix bufmgr so CHECKPOINT_END_OF_RECOVERY behaves as a shutdown checkpoint. (Simon Riggs, 2012-09-16)
  Recovery code documents clearly that a shutdown checkpoint is executed at end of recovery - a shutdown checkpoint WAL record is written but the buffer manager had been altered to treat end of recovery as a normal checkpoint. This bug exacerbates the bufmgr relpersistence bug.

  Bug spotted by Andres Freund, patch by me.
* Properly set relpersistence for fake relcache entries. (Robert Haas, 2012-09-14)
  This can result in buffers failing to be properly flushed at checkpoint time, leading to data loss.

  Report, diagnosis, and patch by Jeff Davis.
* Fix logical errors in tsquery selectivity estimation for prefix queries. (Tom Lane, 2012-09-11)
  I made multiple errors in commit 97532f7c29468010b87e40a04f8daa3eb097f654, stemming mostly from failure to think about the available frequency data as being element frequencies not value frequencies (so that occurrences of different elements are not mutually exclusive). This led to sillinesses such as estimating that "word" would match more rows than "word:*".

  The choice to clamp to a minimum estimate of DEFAULT_TS_MATCH_SEL also seems pretty ill-considered in hindsight, as it would frequently result in an estimate much larger than the available data suggests. We do need some sort of clamp, since a pattern not matching any of the MCELEMs probably still needs a selectivity estimate of more than zero. I chose instead to clamp to at least what a non-MCELEM word would be estimated as, preserving the property that "word:*" doesn't get an estimate less than plain "word", whether or not the word appears in MCELEM.

  Per investigation of a gripe from Bill Martin, though I suspect that his example case actually isn't even reaching the erroneous code. Back-patch to 9.1 where this code was introduced.
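The reasoning about element frequencies can be made concrete with a small Python sketch. This is not the estimator from the commit: the function names, the independence assumption used to combine matching elements, the default of 0.005, and the rule for a non-MCELEM word (half the smallest recorded frequency, capped by the default) are all assumptions chosen to illustrate the stated property.

```python
def non_mcelem_selectivity(mcelem, default_sel=0.005):
    """Assumed estimate for a word absent from the most-common-elements list."""
    if not mcelem:
        return default_sel
    return min(default_sel, min(mcelem.values()) / 2)


def plain_selectivity(word, mcelem, default_sel=0.005):
    """Estimate for a plain "word" query."""
    if word in mcelem:
        return mcelem[word]
    return non_mcelem_selectivity(mcelem, default_sel)


def prefix_selectivity(prefix, mcelem, default_sel=0.005):
    """Estimate for a "prefix:*" query.

    Element frequencies are not mutually exclusive (one row can contain
    several matching elements), so they are combined as the probability
    that at least one matching element is present, not summed.
    """
    miss = 1.0
    for elem, freq in mcelem.items():
        if elem.startswith(prefix):
            miss *= 1.0 - freq
    # Clamp to at least the non-MCELEM estimate, so "word:*" is never
    # estimated below plain "word".
    return max(1.0 - miss, non_mcelem_selectivity(mcelem, default_sel))
```

With frequencies `{"word": 0.2, "wordy": 0.1}`, the prefix estimate is 1 - 0.8 * 0.9 = 0.28, which is above the 0.2 for plain `word` and below the 0.3 that naive summation would give.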
* Fix PARAM_EXEC assignment mechanism to be safe in the presence of WITH. (Tom Lane, 2012-09-07)
  The planner previously assumed that parameter Vars having the same absolute query level, varno, and varattno could safely be assigned the same runtime PARAM_EXEC slot, even though they might be different Vars appearing in different subqueries. This was (probably) safe before the introduction of CTEs, but the lazy-evaluation mechanism used for CTEs means that a CTE can be executed during execution of some other subquery, causing the lifespan of Params at the same syntactic nesting level as the CTE to overlap with use of the same slots inside the CTE. In 9.1 we created additional hazards by using the same parameter-assignment technology for nestloop inner scan parameters, but it was broken before that, as illustrated by the added regression test.

  To fix, restructure the planner's management of PlannerParamItems so that items having different semantic lifespans are kept rigorously separated. This will probably result in complex queries using more runtime PARAM_EXEC slots than before, but the slots are cheap enough that this hardly matters.

  Also, stop generating PlannerParamItems containing Params for subquery outputs: all we really need to do is reserve the PARAM_EXEC slot number, and that now only takes incrementing a counter. The planning code is simpler and probably faster than before, as well as being more correct.

  Per report from Vik Reykja. Back-patch of commit 46c508fbcf98ac334f1e831d21021d731c882fbb into all branches that support WITH.
* Fix inappropriate error messages for Hot Standby misconfiguration errors. (Tom Lane, 2012-09-05)
  Give the correct name of the GUC parameter being complained of. Also, emit a more suitable SQLSTATE (INVALID_PARAMETER_VALUE, not the default INTERNAL_ERROR).

  Gurjeet Singh, errcode adjustment by me
* Make configure probe for mbstowcs_l as well as wcstombs_l. (Tom Lane, 2012-08-31)
  We previously supposed that any given platform would supply both or neither of these functions, so that one configure test would be sufficient. It now appears that at least on AIX this is not the case ... which is likely an AIX bug, but nonetheless we need to cope with it. So use separate tests.

  Per bug #6758; thanks to Andrew Hastie for doing the followup testing needed to confirm what was happening. Backpatch to 9.1, where we began using these functions.
* Back-patch recent fixes for gistchoose and gistRelocateBuildBuffersOnSplit. (Tom Lane, 2012-08-30)
  This back-ports commits c8ba697a4bdb934f0c51424c654e8db6133ea255 and e5db11c5582b469c04a11f217a0f32c827da5dd7, which fix one definite and one speculative bug in gistchoose, and make the code a lot more intelligible as well. In 9.2 only, this also affects the largely-copied-and-pasted logic in gistRelocateBuildBuffersOnSplit.

  The impact of the bugs was that the functions might make poor decisions as to which index tree branch to push a new entry down into, resulting in GiST index bloat and poor performance. The fixes rectify these decisions for future insertions, but a REINDEX would be needed to clean up any existing index bloat.

  Alexander Korotkov, Robert Haas, Tom Lane
* Add missing period to detail message. (Robert Haas, 2012-08-30)
  Per note from Peter Eisentraut.
* Back-patch fixes for some issues in our Windows socket code into 9.1. (Robert Haas, 2012-08-27)
  This is a backport of commit b85427f2276d02756b558c0024949305ea65aca5. Per discussion of bug #4958. Some of these fixes probably need to be back-patched further, but I'm just doing this much for now.
* Fix issues with checks for unsupported transaction states in Hot Standby. (Tom Lane, 2012-08-24)
  The GUC check hooks for transaction_read_only and transaction_isolation tried to check RecoveryInProgress(), so as to disallow setting read/write mode or serializable isolation level (respectively) in hot standby sessions. However, GUC check hooks can be called in many situations where we're not connected to shared memory at all, resulting in a crash in RecoveryInProgress(). Among other cases, this results in EXEC_BACKEND builds crashing during child process start if default_transaction_isolation is serializable, as reported by Heikki Linnakangas. Protect those calls by silently allowing any setting when not inside a transaction; which is okay anyway since these GUCs are always reset at start of transaction.

  Also, add a check to GetSerializableTransactionSnapshot() to complain if we are in hot standby. We need that check despite the one in check_XactIsoLevel() because default_transaction_isolation could be serializable. We don't want to complain any sooner than this in such cases, since that would prevent running transactions at all in such a state; but a transaction can be run, if SET TRANSACTION ISOLATION is done before setting a snapshot. Per report some months ago from Robert Haas.

  Back-patch to 9.1, since these problems were introduced by the SSI patch.

  Kevin Grittner and Tom Lane, with ideas from Heikki Linnakangas
* Fix cascading privilege revoke to notice when privileges are still held. (Tom Lane, 2012-08-23)
  If we revoke a grant option from some role X, but X still holds the option via another grant, we should not recursively revoke the privilege from role(s) Y that X had granted it to. This was supposedly fixed as one aspect of commit 4b2dafcc0b1a579ef5daaa2728223006d1ff98e9, but I must not have tested it, because in fact that code never worked: it forgot to shift the grant-option bits back over when masking the bits being revoked.

  Per bug #6728 from Daniel German. Back-patch to all active branches, since this has been wrong since 8.0.
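The missing shift can be illustrated with a short Python sketch. It assumes a 32-bit mask with privilege bits in the low 16 bits and the matching grant-option bits in the high 16 bits; the constant values and all function names here are hypothetical and chosen only for the example.

```python
PRIV_MASK = 0xFFFF          # assumed: low 16 bits hold the privileges
INSERT = 1 << 0             # illustrative bit assignments
SELECT = 1 << 1


def grant_option_for(privs):
    """Move privilege bits up into the grant-option half of the mask."""
    return (privs & PRIV_MASK) << 16


def option_to_privs(options):
    """Move grant-option bits back down into the privilege half."""
    return (options >> 16) & PRIV_MASK


def privs_to_cascade_buggy(revoke_privs, still_held_options):
    # Masks with bits that sit in the wrong half, so nothing is removed.
    return revoke_privs & ~still_held_options


def privs_to_cascade(revoke_privs, still_held_options):
    # Shift the still-held options back down before masking.
    return revoke_privs & ~option_to_privs(still_held_options)
```

If X loses one grant of SELECT and INSERT options but still holds the SELECT option through another grant, the corrected mask cascades only INSERT, while the unshifted version would cascade both.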
* Fix rescan logic in nodeCtescan. (Tom Lane, 2012-08-15)
  The previous coding essentially assumed that nodes would be rescanned in the same order they were initialized in; or at least that the "leader" of a group of CTEscans would be rescanned before any others were required to execute. Unfortunately, that isn't even a little bit true. It's possible to devise queries in which the leader isn't rescanned until other CTEscans on the same CTE have run to completion, or even in which the leader never gets a rescan call at all.

  The fix makes the leader specially responsible only for initial creation and final destruction of the tuplestore; rescan resets are now a symmetrically shared responsibility. This means that we might reset the tuplestore multiple times when restarting a plan subtree containing multiple CTEscans; but resetting an already-empty tuplestore is cheap enough that that doesn't seem like a problem.

  Per report from Adam Mackler; the new regression test cases are based on his example query. Back-patch to 8.4 where CTE scans were introduced.
* Disallow extensions from owning the schema they are assigned to. (Tom Lane, 2012-08-15)
  This situation creates a dependency loop that confuses pg_dump and probably other things. Moreover, since the mental model is that the extension "contains" schemas it owns, but "is contained in" its extschema (even though neither is strictly true), having both true at once is confusing for people too. So prevent the situation from being set up.

  Reported and patched by Thom Brown. Back-patch to 9.1 where extensions were added.
* Prevent access to external files/URLs via XML entity references. (Tom Lane, 2012-08-14)
  xml_parse() would attempt to fetch external files or URLs as needed to resolve DTD and entity references in an XML value, thus allowing unprivileged database users to attempt to fetch data with the privileges of the database server. While the external data wouldn't get returned directly to the user, portions of it could be exposed in error messages if the data didn't parse as valid XML; and in any case the mere ability to check existence of a file might be useful to an attacker.

  The ideal solution to this would still allow fetching of references that are listed in the host system's XML catalogs, so that documents can be validated according to installed DTDs. However, doing that with the available libxml2 APIs appears complex and error-prone, so we're not going to risk it in a security patch that necessarily hasn't gotten wide review. So this patch merely shuts off all access, causing any external fetch to silently expand to an empty string. A future patch may improve this.

  In HEAD and 9.2, also suppress warnings about undefined entities, which would otherwise occur as a result of not loading referenced DTDs. Previous branches don't show such warnings anyway, due to different error handling arrangements.

  Credit to Noah Misch for first reporting the problem, and for much work towards a solution, though this simplistic approach was not his preference. Also thanks to Daniel Veillard for consultation.

  Security: CVE-2012-3489
* Translation updates (Peter Eisentraut, 2012-08-14)
* Fix dependencies generated during ALTER TABLE ADD CONSTRAINT USING INDEX. (Tom Lane, 2012-08-11)
  This command generated new pg_depend entries linking the index to the constraint and the constraint to the table, which match the entries made when a unique or primary key constraint is built de novo. However, it did not bother to get rid of the entries linking the index directly to the table. We had considered the issue when the ADD CONSTRAINT USING INDEX patch was written, and concluded that we didn't need to get rid of the extra entries. But this is wrong: ALTER COLUMN TYPE wasn't expecting such redundant dependencies to exist, as reported by Hubert Depesz Lubaczewski. On reflection it seems rather likely to break other things as well, since there are many bits of code that crawl pg_depend for one purpose or another, and most of them are pretty naive about what relationships they're expecting to find. Fortunately it's not that hard to get rid of the extra dependency entries, so let's do that.

  Back-patch to 9.1, where ALTER TABLE ADD CONSTRAINT USING INDEX was added.
* Fix upper limit of superuser_reserved_connections, add limit for wal_senders (Magnus Hagander, 2012-08-10)
  Should be limited to the maximum number of connections excluding autovacuum workers, not including. Add similar check for max_wal_senders, which should never be higher than max_connections.
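The two limits amount to a pair of comparisons against max_connections alone, as in the Python sketch below. The function name, the message wording, and the exact strictness of each comparison are assumptions for illustration; only the parameter names come from the commit.

```python
def check_connection_limits(max_connections,
                            autovacuum_max_workers,
                            superuser_reserved_connections,
                            max_wal_senders):
    """Return a list of problems found; an empty list means the settings pass."""
    problems = []
    # The limit is max_connections itself. Autovacuum worker slots
    # (autovacuum_max_workers) are deliberately not added to it.
    if superuser_reserved_connections >= max_connections:
        problems.append(
            "superuser_reserved_connections must be less than max_connections")
    if max_wal_senders > max_connections:
        problems.append(
            "max_wal_senders must not be higher than max_connections")
    return problems
```

For example, with max_connections = 100 and three autovacuum workers, a reserved count of 101 is rejected even though it would fit within 103 total backend slots.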
* fsync backup_label after pg_start_backup() (Simon Riggs, 2012-08-07)
  Dave Kerr, backpatched by Simon Riggs
* Fix bugs with parsing signed hh:mm and hh:mm:ss fields in interval input. (Tom Lane, 2012-08-03)
  DecodeInterval() failed to honor the "range" parameter (the special SQL syntax for indicating which fields appear in the literal string) if the time was signed. This seems inappropriate, so make it work like the not-signed case. The inconsistency was introduced in my commit f867339c0148381eb1d01f93ab5c79f9d10211de, which as noted in its log message was only really focused on making SQL-compliant literals work per spec. Including a sign here is not per spec, but if we're going to allow it then it's reasonable to expect it to work like the not-signed case.

  Also, remove bogus setting of tmask, which caused subsequent processing to think that what had been given was a timezone and not an hh:mm(:ss) field, thus confusing checks for redundant fields. This seems to be an aboriginal mistake in Lockhart's commit 2cf1642461536d0d8f3a1cf124ead0eac04eb760.

  Add regression test cases to illustrate the changed behaviors.

  Back-patch as far as 8.4, where support for spec-compliant interval literals was added.

  Range problem reported and diagnosed by Amit Kapila, tmask problem by me.
* Fix WITH attached to a nested set operation (UNION/INTERSECT/EXCEPT). (Tom Lane, 2012-07-31)
  Parse analysis neglected to cover the case of a WITH clause attached to an intermediate-level set operation; it only handled WITH at the top level or WITH attached to a leaf-level SELECT. Per report from Adam Mackler.

  In HEAD, I rearranged the order of SelectStmt's fields to put withClause with the other fields that can appear on non-leaf SelectStmts. In back branches, leave it alone to avoid a possible ABI break for third-party code.

  Back-patch to 8.4 where WITH support was added.
* Fix syslogger so that log_truncate_on_rotation works in the first rotation. (Tom Lane, 2012-07-31)
  In the original coding of the log rotation stuff, we did not bother to make the truncation logic work for the very first rotation after postmaster start (or after a syslogger crash and restart). It just always appended in that case. It did not seem terribly important at the time, but we've recently had two separate complaints from people who expected it to work unsurprisingly. (Both users tend to restart the postmaster about as often as a log rotation is configured to happen, which is maybe not typical use, but still...) Since the initial log file is opened in the postmaster, fixing this requires passing down some more state to the syslogger child process.

  It's always been like this, so back-patch to all supported branches.
* Only allow autovacuum to be auto-canceled by a directly blocked process. (Tom Lane, 2012-07-26)
  In the original coding of the autovacuum cancel feature, commit acac68b2bcae818bc8803b8cb8cbb17eee8d5e2b, an autovacuum process was considered a target for cancellation if it was found to hard-block any process examined in the deadlock search. This patch tightens the test so that the autovacuum must directly hard-block the current process. This should make the behavior more predictable in general, and in particular it ensures that an autovacuum will not be canceled with less than deadlock_timeout grace period. In the old coding, it was possible for an autovacuum to be canceled almost instantly, given unfortunate timing of two or more other processes' lock attempts.

  This also justifies the logging methodology in the recent commit d7318d43d891bd63e82dcfc27948113ed7b1db80; without this restriction, that patch isn't providing enough information to see the connection of the canceling process to the autovacuum.

  Like that one, patch all the way back.
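The difference between the old and new tests can be sketched over a simple wait-for graph. This Python model is an illustration only: `waits_for` maps each process to the set of processes that directly hard-block it, and the function names are hypothetical rather than taken from the deadlock checker.

```python
def reachable_blockers(waits_for, start):
    """All processes that block `start` directly or through other waiters."""
    seen = set()
    stack = [start]
    while stack:
        proc = stack.pop()
        for blocker in waits_for.get(proc, ()):
            if blocker not in seen:
                seen.add(blocker)
                stack.append(blocker)
    return seen


def may_cancel_old(waits_for, current, autovac):
    # Old rule: the autovacuum blocks any process visited by the search.
    return autovac in reachable_blockers(waits_for, current)


def may_cancel_new(waits_for, current, autovac):
    # New rule: the autovacuum must directly block the current process.
    return autovac in waits_for.get(current, ())
```

If A waits on B and B waits on the autovacuum, the old rule lets A cancel the autovacuum as soon as A's own timeout fires, whereas the new rule leaves the decision to B, which has itself waited the full grace period.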
* Log a better message when canceling autovacuum. (Robert Haas, 2012-07-26)
  The old message was at DEBUG2, so typically it didn't show up in the log at all. As a result, in most cases where autovacuum was canceled, the only information that was logged was the table being vacuumed, with no indication as to what problem caused the cancel. Crank up the level to LOG and add some more details to assist with debugging.

  Back-patch all the way, per discussion on pgsql-hackers.
* Fix longstanding crash-safety bug with newly-created-or-reset sequences. (Tom Lane, 2012-07-25)
  If a crash occurred immediately after the first nextval() call for a serial column, WAL replay would restore the sequence to a state in which it appeared that no nextval() had been done, thus allowing the first sequence value to be returned again by the next nextval() call; as reported in bug #6748 from Xiangming Mei.

  More generally, the problem would occur if an ALTER SEQUENCE was executed on a freshly created or reset sequence. (The manifestation with serial columns was introduced in 8.2 when we added an ALTER SEQUENCE OWNED BY step to serial column creation.) The cause is that sequence creation attempted to save one WAL entry by writing out a WAL record that made it appear that the first nextval() had already happened (viz, with is_called = true), while marking the sequence's in-database state with log_cnt = 1 to show that the first nextval() need not emit a WAL record. However, ALTER SEQUENCE would emit a new WAL entry reflecting the actual in-database state (with is_called = false). Then, nextval would allocate the first sequence value and set is_called = true, but it would trust the log_cnt value and not emit any WAL record. A crash at this point would thus restore the sequence to its post-ALTER state, causing the next nextval() call to return the first sequence value again.

  To fix, get rid of the idea of logging an is_called status different from reality. This means that the first nextval-driven WAL record will happen at the first nextval call not the second, but the marginal cost of that is pretty negligible. In addition, make sure that ALTER SEQUENCE resets log_cnt to zero in any case where it touches sequence parameters that affect future nextval results. This will result in some user-visible changes in the contents of a sequence's log_cnt column, as reflected in the patch's regression test changes; but no application should be depending on that anyway, since it was already true that log_cnt changes rather unpredictably depending on checkpoint timing.

  In addition, make some basically-cosmetic improvements to get rid of sequence.c's undesirable intimacy with page layout details. It was always really trying to WAL-log the contents of the sequence tuple, so we should have it do that directly using a HeapTuple's t_data and t_len, rather than backing into it with some magic assumptions about where the tuple would be on the sequence's page.

  Back-patch to all supported branches.
* Fix whole-row Var evaluation to cope with resjunk columns (again). (Tom Lane, 2012-07-20)
  When a whole-row Var is reading the result of a subquery, we need it to ignore any "resjunk" columns that the subquery might have evaluated for GROUP BY or ORDER BY purposes. We've hacked this area before, in commit 68e40998d058c1f6662800a648ff1e1ce5d99cba, but that fix only covered whole-row Vars of named composite types, not those of RECORD type; and it was mighty klugy anyway, since it just assumed without checking that any extra columns in the result must be resjunk. A proper fix requires getting hold of the subquery's targetlist so we can actually see which columns are resjunk (whereupon we can use a JunkFilter to get rid of them). So bite the bullet and add some infrastructure to make that possible.

  Per report from Andrew Dunstan and additional testing by Merlin Moncure. Back-patch to all supported branches. In 8.3, also back-patch commit 292176a118da6979e5d368a4baf27f26896c99a5, which for some reason I had not done at the time, but it's a prerequisite for this change.
* Improve coding around the fsync request queue. (Tom Lane, 2012-07-17)
  In all branches back to 8.3, this patch fixes a questionable assumption in CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue that there are no uninitialized pad bytes in the request queue structs. This would only cause trouble if (a) there were such pad bytes, which could happen in 8.4 and up if the compiler makes enum ForkNumber narrower than 32 bits, but otherwise would require not-currently-planned changes in the widths of other typedefs; and (b) the kernel has not uniformly initialized the contents of shared memory to zeroes. Still, it seems a tad risky, and we can easily remove any risk by pre-zeroing the request array for ourselves.

  In addition to that, we need to establish a coding rule that struct RelFileNode can't contain any padding bytes, since such structs are copied into the request array verbatim. (There are other places that are assuming this anyway, it turns out.)

  In 9.1 and up, the risk was a bit larger because we were also effectively assuming that struct RelFileNodeBackend contained no pad bytes, and with fields of different types in there, that would be much easier to break. However, there is no good reason to ever transmit fsync or delete requests for temp files to the bgwriter/checkpointer, so we can revert the request structs to plain RelFileNode, getting rid of the padding risk and saving some marginal number of bytes and cycles in fsync queue manipulation while we are at it. The savings might be more than marginal during deletion of a temp relation, because the old code transmitted an entirely useless but nonetheless expensive-to-process ForgetRelationFsync request to the background process, and also had the background process perform the file deletion even though that can safely be done immediately.

  In addition, make some cleanup of nearby comments and small improvements to the code in CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue.
* Prevent corner-case core dump in rfree().Tom Lane2012-07-15
| | | | | | | | | | | | | rfree() failed to cope with the case that pg_regcomp() had initialized the regex_t struct but then failed to allocate any memory for re->re_guts (ie, the first malloc call in pg_regcomp() failed). It would try to touch the guts struct anyway, and thus dump core. This is a sufficiently narrow corner case that it's not surprising it's never been seen in the field; but still a bug is a bug, so patch all active branches. Noted while investigating whether we need to call pg_regfree after a failure return from pg_regcomp. Other than this bug, it turns out we don't, so adjust comments appropriately.
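The failure mode above can be sketched roughly as follows. `DemoRegex`, `DemoGuts`, `demo_regcomp`, and `demo_regfree` are hypothetical stand-ins rather than the real regex library structures; the `fail_alloc` flag only simulates the first allocation failing.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_MAGIC 0xCAFE

typedef struct DemoGuts
{
    int *tree;
} DemoGuts;

typedef struct DemoRegex
{
    int       magic;
    DemoGuts *guts;
} DemoRegex;

/*
 * Hypothetical compile step. The struct is initialized before the first
 * allocation, so a failure there leaves it "valid" but with guts == NULL.
 */
static int
demo_regcomp(DemoRegex *re, int fail_alloc)
{
    assert(re != NULL);
    re->magic = DEMO_MAGIC;
    re->guts = NULL;

    if (fail_alloc)
        return -1;              /* simulate the first malloc failing */

    re->guts = malloc(sizeof(DemoGuts));
    if (re->guts == NULL)
        return -1;

    re->guts->tree = malloc(sizeof(int));
    if (re->guts->tree == NULL)
    {
        free(re->guts);
        re->guts = NULL;
        return -1;
    }
    return 0;
}

/* Free step that tolerates a struct whose guts were never allocated. */
static void
demo_regfree(DemoRegex *re)
{
    if (re == NULL || re->magic != DEMO_MAGIC)
        return;

    re->magic = 0;
    if (re->guts == NULL)
        return;                 /* the guard: nothing further to release */

    free(re->guts->tree);
    free(re->guts);
    re->guts = NULL;
}
```

Without the `guts == NULL` check, the free routine would dereference a null pointer after a failed compile.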
* Fix walsender processes to establish a SIGALRM handler.Tom Lane2012-07-12
| | | | | | | | | | | | | | | | | | | | | Walsenders must have working SIGALRM handling during InitPostgres, but they set the handler to SIG_IGN so that nothing would happen if a timeout was reached. This could result in two failure modes: * If a walsender participated in a deadlock during its authentication transaction, and was the last to wait in the deadly embrace, the deadlock would not get cleared automatically. This would require somebody to be trying to take out AccessExclusiveLock on multiple system catalogs, so it's not very probable. * If a client failed to respond to a walsender's authentication challenge, the intended disconnect after AuthenticationTimeout wouldn't happen, and the walsender would wait indefinitely for the client. For the moment, fix in back branches only, since this is fixed in a different way in the timeout-infrastructure patch that's awaiting application to HEAD. If we choose not to apply that, then we'll need to do this in HEAD as well.
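The difference between ignoring SIGALRM and handling it can be shown with a small POSIX sketch. The function and variable names are hypothetical, and `raise()` stands in for a real timer expiry so that the example is deterministic; it is not the walsender code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Set by the handler; checked by the code that was waiting for a timeout. */
static volatile sig_atomic_t demo_timeout_fired = 0;

static void
demo_alarm_handler(int signo)
{
    (void) signo;
    demo_timeout_fired = 1;
}

/* Install either SIG_IGN (the broken setup) or a real handler for SIGALRM. */
static void
demo_install_sigalrm(int ignore)
{
    struct sigaction sa;
    int              rc;

    sa.sa_handler = ignore ? SIG_IGN : demo_alarm_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;

    rc = sigaction(SIGALRM, &sa, NULL);
    assert(rc == 0);
    (void) rc;
}

/*
 * Simulate a timeout arriving. Returns 1 if the process noticed it and
 * 0 if the signal was silently discarded.
 */
static int
demo_deliver_timeout(int ignore)
{
    demo_timeout_fired = 0;
    demo_install_sigalrm(ignore);
    raise(SIGALRM);
    return demo_timeout_fired;
}
```

With `ignore` set, the simulated timeout is lost, which corresponds to both failure modes listed above: nothing ever runs the deadlock check or enforces the authentication timeout.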
* Back-patch fix for extraction of fixed prefixes from regular expressions.Tom Lane2012-07-10
| | | | | | Back-patch of commits 628cbb50ba80c83917b07a7609ddec12cda172d0 and c6aae3042be5249e672b731ebeb21875b5343010. This has been broken since 7.3, so back-patch to all supported branches.
* Back-patch addition of pg_wchar-to-multibyte conversion functionality.Tom Lane2012-07-10
| | | | | | | | | Back-patch of commits 72dd6291f216440f6bb61a8733729a37c7e3b2d2, f6a05fd973a102f7e66c491d3f854864b8d24844, and 60e9c224a197aa37abb1aa3aefa3aad42da61f7f. This is needed to support fixing the regex prefix extraction bug in back branches.
* Refactor pattern_fixed_prefix() to avoid dealing in incomplete patterns.Tom Lane2012-07-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, pattern_fixed_prefix() was defined to return whatever fixed prefix it could extract from the pattern, plus the "rest" of the pattern. That definition was sensible for LIKE patterns, but not so much for regexes, where reconstituting a valid pattern minus the prefix could be quite tricky (certainly the existing code wasn't doing that correctly). Since the only thing that callers ever did with the "rest" of the pattern was to pass it to like_selectivity() or regex_selectivity(), let's cut out the middle-man and just have pattern_fixed_prefix's subroutines do this directly. Then pattern_fixed_prefix can return a simple selectivity number, and the question of how to cope with partial patterns is removed from its API specification. While at it, adjust the API spec so that callers who don't actually care about the pattern's selectivity (which is a lot of them) can pass NULL for the selectivity pointer to skip doing the work of computing a selectivity estimate. This patch is only an API refactoring that doesn't actually change any processing, other than allowing a little bit of useless work to be skipped. However, it's necessary infrastructure for my upcoming fix to regex prefix extraction, because after that change there won't be any simple way to identify the "rest" of the regex, not even to the low level of fidelity needed by regex_selectivity. We can cope with that if regex_fixed_prefix and regex_selectivity communicate directly, but not if we have to work within the old API. Hence, back-patch to all active branches.
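The shape of the revised API, where the selectivity of the remaining pattern is returned through an optional pointer, might look roughly like this for a LIKE-style pattern. Everything here is a hypothetical simplification: the names, the toy selectivity constants, and the lack of escape handling are assumptions made for illustration, not the planner's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum
{
    DEMO_PREFIX_NONE,           /* pattern starts with a wildcard */
    DEMO_PREFIX_PARTIAL,        /* fixed prefix followed by more pattern */
    DEMO_PREFIX_EXACT           /* no wildcards at all */
} DemoPrefixStatus;

/* Counts how often the (potentially expensive) estimate is computed. */
static int demo_selectivity_calls = 0;

/* Toy estimate for the part of the pattern after the fixed prefix. */
static double
demo_rest_selectivity(const char *rest)
{
    double sel = 1.0;

    demo_selectivity_calls++;
    for (; *rest != '\0'; rest++)
    {
        if (*rest == '_')
            sel *= 0.9;         /* made-up constant: any single character */
        else if (*rest != '%')
            sel *= 0.2;         /* made-up constant: one literal character */
    }
    return sel;
}

/*
 * Extract the fixed prefix of a LIKE-style pattern into "prefix" (truncated
 * to fit). If rest_selec is not NULL, also estimate the selectivity of the
 * remainder; callers that do not need it pass NULL and skip that work.
 */
static DemoPrefixStatus
demo_like_fixed_prefix(const char *patt, char *prefix, size_t prefix_size,
                       double *rest_selec)
{
    size_t pos = 0;
    size_t ncopy;

    assert(patt != NULL && prefix != NULL && prefix_size > 0);

    while (patt[pos] != '\0' && patt[pos] != '%' && patt[pos] != '_')
        pos++;

    ncopy = (pos < prefix_size - 1) ? pos : prefix_size - 1;
    memcpy(prefix, patt, ncopy);
    prefix[ncopy] = '\0';

    if (rest_selec != NULL)
        *rest_selec = demo_rest_selectivity(patt + pos);

    if (patt[pos] == '\0')
        return DEMO_PREFIX_EXACT;
    return (pos > 0) ? DEMO_PREFIX_PARTIAL : DEMO_PREFIX_NONE;
}
```

Because the caller never sees the "rest" of the pattern, the prefix extractor and the selectivity estimator are free to share whatever internal representation they like.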
* Fix planner to pass correct collation to operator selectivity estimators.Tom Lane2012-07-08
| | | | | | | | | | | | | | | | | | | | | | | | | We can do this without creating an API break for estimation functions by passing the collation using the existing fmgr functionality for passing an input collation as a hidden parameter. The need for this was foreseen at the outset, but we didn't get around to making it happen in 9.1 because of the decision to sort all pg_statistic histograms according to the database's default collation. That meant that selectivity estimators generally need to use the default collation too, even if they're estimating for an operator that will do something different. The reason it's suddenly become more interesting is that regexp interpretation also uses a collation (for its LC_TYPE not LC_COLLATE property), and we no longer want to use the wrong collation when examining regexps during planning. It's not that the selectivity estimate is likely to change much from this; rather that we are thinking of caching compiled regexps during planner estimation, and we won't get the intended benefit if we cache them with a different collation than the executor will use. Back-patch to 9.1, both because the regexp change is likely to get back-patched and because we might as well get this right in all collation-supporting branches, in case any third-party code wants to rely on getting the collation. The patch turns out to be minuscule now that I've done it ...
* Always treat a standby returning an invalid flush location as asyncMagnus Hagander2012-07-04
| | | | | | | | This ensures that a standby such as pg_receivexlog will not be selected as sync standby - which would cause the master to block waiting for a location that could never happen. Fujii Masao
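The selection rule described above can be sketched as a simple filter. The types and names below are hypothetical, as are the conventions that an invalid location is represented by 0 and that a priority of 0 marks an asynchronous standby; this is an illustration of the idea rather than the replication code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t DemoLsn;

#define DEMO_INVALID_LSN ((DemoLsn) 0)

typedef struct DemoStandby
{
    int     priority;           /* 0 = asynchronous; lower positive = preferred */
    DemoLsn flush;              /* last reported flush location */
} DemoStandby;

/*
 * Return the index of the standby to treat as synchronous, or -1 if none
 * qualifies. A standby that reports an invalid flush location is skipped,
 * since the master could otherwise wait forever for it to confirm a flush.
 */
static int
demo_choose_sync_standby(const DemoStandby *standbys, size_t n)
{
    int    best = -1;
    size_t i;

    assert(standbys != NULL || n == 0);

    for (i = 0; i < n; i++)
    {
        if (standbys[i].priority <= 0)
            continue;           /* configured as asynchronous */
        if (standbys[i].flush == DEMO_INVALID_LSN)
            continue;           /* never reports flushes: treat as async */
        if (best < 0 || standbys[i].priority < standbys[best].priority)
            best = (int) i;
    }
    return best;
}
```

In this sketch a client that only receives WAL and never reports a flush position is passed over even when it has the best priority.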
* Forgot an #include in the previous patch :-(Alvaro Herrera2012-07-03
|