path: root/src/backend/utils
...
* Remove obsolete comment. (Tom Lane, 2016-07-17)

  Peter Geoghegan

* Fix crash in close_ps() for NaN input coordinates. (Tom Lane, 2016-07-16)

  The Assert() here seems unreasonably optimistic. Andreas Seltenreich
  found that it could fail with NaNs in the input geometries, and it
  seems likely to me that it might fail in corner cases due to
  roundoff error, even for ordinary input values. As a band-aid, make
  the function return SQL NULL instead of crashing.

  Report: <87d1md1xji.fsf@credativ.de>

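  A minimal standalone sketch of the band-aid, assuming a simplified
  Point type (the real close_ps() returns SQL NULL via
  PG_RETURN_NULL(); the names here are illustrative, not the committed
  code):

      #include <math.h>
      #include <stdbool.h>

      typedef struct { double x, y; } Point;

      /* Refuse NaN inputs up front instead of letting an Assert fire
       * later; note that x == x is false for NaN, which is what broke
       * the original Assert's assumptions. */
      static bool
      close_ps_guarded(Point pt, Point a, Point b, Point *result)
      {
          if (isnan(pt.x) || isnan(pt.y) ||
              isnan(a.x) || isnan(a.y) ||
              isnan(b.x) || isnan(b.y))
              return false;       /* caller reports SQL NULL */
          /* ... the normal closest-point projection would follow ... */
          *result = a;            /* placeholder */
          return true;
      }
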
* Avoid invalidating all foreign-join cached plans when user mappings change. (Tom Lane, 2016-07-15)

  We must not push down a foreign join when the foreign tables
  involved should be accessed under different user mappings.
  Previously we tried to enforce that rule literally during planning,
  but that meant that the resulting plans were dependent on the
  current contents of the pg_user_mapping catalog, and we had to blow
  away all cached plans containing any remote join when anything at
  all changed in pg_user_mapping. This could have been improved
  somewhat, but the fact that a syscache inval callback has very
  limited info about what changed made it hard to do better within
  that design.

  Instead, let's change the planner to not consider user mappings per
  se, but to allow a foreign join if both RTEs have the same
  checkAsUser value. If they do, then they necessarily will use the
  same user mapping at runtime, and we don't need to know specifically
  which one that is. Post-plan-time changes in pg_user_mapping no
  longer require any plan invalidation.

  This rule does give up some optimization ability, to wit where two
  foreign table references come from views with different owners or
  one's from a view and one's directly in the query, but nonetheless
  the same user mapping would have applied. We'll sacrifice the first
  case, but to not regress more than we have to in the second case,
  allow a foreign join involving both zero and nonzero checkAsUser
  values if the nonzero one is the same as the prevailing effective
  userID. In that case, mark the plan as only runnable by that userID.

  The plancache code already had a notion of plans being
  userID-specific, in order to support RLS. It was a little confused
  though, in particular lacking clarity of thought as to whether it
  was the rewritten query or just the finished plan that's dependent
  on the userID. Rearrange that code so that it's clearer what depends
  on which, and so that the same logic applies to both RLS-injected
  role dependency and foreign-join-injected role dependency.

  Note that this patch doesn't remove the other issue mentioned in the
  original complaint, which is that while we'll reliably stop using a
  foreign join if it's disallowed in a new context, we might fail to
  start using a foreign join if it's now allowed, but we previously
  created a generic cached plan that didn't use one. It was agreed
  that the chance of winning that way was not high enough to justify
  the much larger number of plan invalidations that would have to
  occur if we tried to cause it to happen.

  In passing, clean up randomly-varying spelling of EXPLAIN commands
  in postgres_fdw.sql, and fix a COSTS ON example that had been
  allowed to leak into the committed tests.

  This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
  previous attempt at ensuring we wouldn't push down foreign joins
  that span permissions contexts.

  Etsuro Fujita and Tom Lane

  Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>

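  A sketch of the pushdown rule described above (a hypothetical
  helper; names and structure are illustrative, not taken from the
  patch):

      #include <stdbool.h>

      typedef unsigned int Oid;
      #define InvalidOid ((Oid) 0)    /* zero checkAsUser: current user */

      /* Allow a foreign join only if both RTEs will resolve to the same
       * user mapping: equal checkAsUser values always do; a zero/nonzero
       * pair is allowed only when the nonzero value matches the current
       * effective user, in which case the plan becomes user-specific. */
      static bool
      foreign_join_allowed(Oid check_user1, Oid check_user2,
                           Oid current_user, Oid *plan_user)
      {
          *plan_user = InvalidOid;        /* plan valid for any user */
          if (check_user1 == check_user2)
              return true;
          if (check_user1 == InvalidOid && check_user2 == current_user)
          {
              *plan_user = current_user;
              return true;
          }
          if (check_user2 == InvalidOid && check_user1 == current_user)
          {
              *plan_user = current_user;
              return true;
          }
          return false;
      }
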
* Fix parsing NOT sequence in tsquery (Teodor Sigaev, 2016-07-15)

  Digging around bug #14245 I found that commit
  6734a1cacd44f5b731933cbc93182b135b167d0c missed that the NOT
  operation is right-associative, unlike all the others. This omission
  is responsible for the tsquery parser failing on a sequence of NOT
  operations.

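  A toy recursive-descent sketch of why right associativity matters
  for chains of NOTs (hypothetical grammar: '!' followed by a
  single-letter value; nothing here is from the committed parser):

      #include <stdio.h>
      #include <stdlib.h>

      typedef struct Node
      {
          char        op;         /* '!' for NOT, 'v' for a value */
          char        value;
          struct Node *child;
      } Node;

      /* NOT is right-associative, so recurse before building the node:
       * "!!a" parses as NOT(NOT(a)) instead of failing on the second '!'. */
      static Node *
      parse_not(const char **s)
      {
          Node *n = calloc(1, sizeof(Node));

          if (**s == '!')
          {
              (*s)++;
              n->op = '!';
              n->child = parse_not(s);    /* right associativity */
          }
          else
          {
              n->op = 'v';
              n->value = *(*s)++;
          }
          return n;
      }

      int
      main(void)
      {
          const char *input = "!!a";
          Node *tree = parse_not(&input);

          printf("top: %c, child: %c\n", tree->op, tree->child->op);
          return 0;
      }
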
* Fix nested NOT operation cleanup in tsquery. (Teodor Sigaev, 2016-07-15)

  During normalization of the tsquery tree, the code tries to simplify
  nested NOT operations, but it missed that the subsequent node could
  be a leaf node (value node).

  Bug #14245: Segfault on weird to_tsquery

  Reported by David Kellum.

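  A sketch of the normalization fix, reusing the toy Node type from
  the sketch above (the real code works on tsquery QueryItem nodes):

      /* Collapse NOT(NOT(x)) to x, but only after checking that the
       * child really is a NOT operator; the bug was dereferencing
       * operator-only fields when the child was a leaf (value) node. */
      static Node *
      simplify_nested_not(Node *n)
      {
          while (n->op == '!' && n->child->op == '!')
              n = n->child->child;    /* !!x => x */
          return n;
      }
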
* Adjust spellings of forms of "cancel" (Peter Eisentraut, 2016-07-14)

* Fix GiST index build for NaN values in geometric types. (Tom Lane, 2016-07-14)

  GiST index build could go into an infinite loop when presented with
  boxes (or points, circles or polygons) containing NaN component
  values. This happened essentially because the code assumed that
  x == x is true for any "double" value x; but it's not true for NaNs.
  The looping behavior was not the only problem though: we also
  attempted to sort the items using simple double comparisons. Since
  NaNs violate the trichotomy law, qsort could (in principle at least)
  get arbitrarily confused and mess up the sorting of ordinary values
  as well as NaNs. And we based splitting choices on box size
  calculations that could produce NaNs, again resulting in undesirable
  behavior.

  To fix, replace all comparisons of doubles in this logic with
  float8_cmp_internal, which is NaN-aware and is careful to sort NaNs
  consistently, higher than any non-NaN. Also rearrange the box size
  calculation to not produce NaNs; instead it should produce an
  infinity for a box with NaN on one side and not-NaN on the other.

  I don't by any means claim that this solves all problems with NaNs
  in geometric values, but it should at least make GiST index
  insertion work reliably with such data. It's likely that the index
  search side of things still needs some work, and probably regular
  geometric operations too. But with this patch we're laying down a
  convention for how such cases ought to behave.

  Per bug #14238 from Guang-Dih Lei. Back-patch to 9.2; the code used
  before commit 7f3bd86843e5aad8 is quite different and doesn't lock
  up on my simple test case, nor on the submitter's dataset.

  Report: <20160708151747.1426.60150@wrigleys.postgresql.org>
  Discussion: <28685.1468246504@sss.pgh.pa.us>

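  The key building block is a NaN-aware comparator; a standalone
  sketch in the spirit of float8_cmp_internal (NaNs compare equal to
  each other and higher than every non-NaN, so qsort sees a total
  order):

      #include <math.h>

      static int
      float8_cmp_nan_aware(double a, double b)
      {
          if (isnan(a))
              return isnan(b) ? 0 : 1;    /* NaN sorts above non-NaN */
          if (isnan(b))
              return -1;
          if (a > b)
              return 1;
          if (a < b)
              return -1;
          return 0;
      }
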
* Properly adjust pointers when tuples are moved during CLUSTER. (Robert Haas, 2016-07-07)

  Otherwise, when we abandon incremental memory accounting and use
  batch allocation for the final merge pass, we might crash. This has
  been broken since 0011c0091e886b874e485a46ff2c94222ffbf550.

  Peter Geoghegan, tested by Noah Misch

* Fix a prototype which is inconsistent with the function definition. (Robert Haas, 2016-07-07)

  Peter Geoghegan

* Clarify resource utilization of parallel query. (Robert Haas, 2016-07-07)

  temp_file_limit is a per-process limit, not a per-session limit
  across all cooperating parallel processes; change wording
  accordingly, per a suggestion from Tom Lane.

  Also, document under max_parallel_workers_per_gather the fact that
  each process involved in a parallel query may use as many resources
  as a separate session. Caveat emptor.

  Per a complaint from Peter Geoghegan.

* Fix typos (Peter Eisentraut, 2016-07-06)

* Fix typo in comment. (Fujii Masao, 2016-07-06)

  Author: Masahiko Sawada

* Be more paranoid in ruleutils.c's get_variable(). (Tom Lane, 2016-07-01)

  We were merely Assert'ing that the Var matched the RTE it's
  supposedly from. But if the user passes incorrect information to
  pg_get_expr(), the RTE might in fact not match; this led either to
  Assert failures or core dumps, as reported by Chris Hanks in bug
  #14220. To fix, just convert the Asserts to test-and-elog. Adjust an
  existing test-and-elog elsewhere in the same function to be
  consistent in wording.

  (If we really felt these were user-facing errors, we might promote
  them to ereport's; but I can't convince myself that they're worth
  translating.)

  Back-patch to 9.3; the problematic code doesn't exist before that,
  and a quick check says that 9.2 doesn't crash on such cases.

  Michael Paquier and Thomas Munro

  Report: <20160629224349.1407.32667@wrigleys.postgresql.org>

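  The general pattern, as a standalone sketch (names and the exact
  message are illustrative; the backend uses elog(ERROR, ...) rather
  than fprintf/exit):

      #include <stdio.h>
      #include <stdlib.h>

      /* A sanity check that used to be Assert-only becomes a runtime
       * test-and-error, because the input is user-controllable via
       * pg_get_expr(). */
      static void
      check_attnum(int attnum, int ncolumns, const char *relname)
      {
          if (attnum <= 0 || attnum > ncolumns)
          {
              fprintf(stderr, "invalid attnum %d for relation \"%s\"\n",
                      attnum, relname);
              exit(1);            /* elog(ERROR, ...) in the backend */
          }
      }
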
* Fix crash bug in RestoreSnapshot. (Robert Haas, 2016-07-01)

  If serialized_snapshot->subxcnt > 0 and serialized_snapshot->xcnt
  == 0, the old coding would do the wrong thing and crash. This can
  happen on standby servers.

  Report by Andreas Seltenreich. Patch by Thomas Munro, reviewed by
  Amit Kapila and tested by Andreas Seltenreich.

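  A standalone sketch of the deserialization hazard (layout and field
  names are simplified, not the real SerializedSnapshotData):

      #include <stdint.h>
      #include <stddef.h>

      typedef uint32_t TransactionId;

      typedef struct
      {
          uint32_t    xcnt;       /* number of top-level xids */
          uint32_t    subxcnt;    /* number of subtransaction xids */
          /* xcnt xids, then subxcnt xids, follow the header */
      } SerializedSnapshot;

      /* Compute subxip from the serialized base address, not from an
       * xip pointer that is left NULL when xcnt == 0 -- the standby
       * case (xcnt == 0, subxcnt > 0) is exactly what crashed. */
      static void
      restore_xid_arrays(const SerializedSnapshot *ser,
                         const TransactionId **xip,
                         const TransactionId **subxip)
      {
          const TransactionId *base =
              (const TransactionId *) ((const char *) ser + sizeof(*ser));

          *xip = (ser->xcnt > 0) ? base : NULL;
          *subxip = base + ser->xcnt;     /* valid even when xcnt == 0 */
      }
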
* Change precedence of phrase operator. (Teodor Sigaev, 2016-06-27)

  The <-> operator now has higher precedence than the & (AND)
  operator. This change was motivated by the unexpected difference
  between the similar queries 'a & b <-> c'::tsquery and 'b <-> c & a'.
  Previously the first query meant (a & b) <-> c and the second one
  (b <-> c) & a; now the phrase operator is evaluated first in both.

  Per suggestion from Tom Lane 32260.1465402409@sss.pgh.pa.us

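  A sketch of the resulting precedence ordering (levels are
  illustrative; higher binds tighter):

      /* 'a & b <-> c' now parses as a & (b <-> c), because the phrase
       * operator binds tighter than AND. */
      enum tsquery_precedence
      {
          PREC_OR = 1,            /* | */
          PREC_AND = 2,           /* & */
          PREC_PHRASE = 3,        /* <->, moved above AND by this commit */
          PREC_NOT = 4            /* ! */
      };
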
* Do not fall back to AND for FTS phrase operator. (Teodor Sigaev, 2016-06-27)

  If there is no positional information for lexemes, the phrase
  operator will no longer fall back to the AND operator. This change
  requires modifying the TS_execute() interface, because in some
  places (in indexes, for example) positional information is
  inaccessible, and in those cases we need to force the fallback to
  AND.

  Per discussion c19fcfec308e6ccd952cdde9e648b505@mail.gmail.com

* Make exact distance match for FTS phrase operator (Teodor Sigaev, 2016-06-27)

  The phrase operator now requires the exact distance between lexemes,
  instead of less-than-or-equal.

  Per discussion c19fcfec308e6ccd952cdde9e648b505@mail.gmail.com

* Rethink node-level representation of partial-aggregation modes. (Tom Lane, 2016-06-26)

  The original coding had three separate booleans representing partial
  aggregation behavior, which was confusing, unreadable, and
  error-prone, not least because the booleans weren't always listed in
  the same order. It was also inadequate for the allegedly-desirable
  future extension to support intermediate partial aggregation,
  because we'd need separate markers for serialization and
  deserialization in such a case.

  Merge these bools into an enum "AggSplit" to provide symbolic names
  for the supported operating modes (and document what those are). By
  assigning the values of the enum constants carefully, we can treat
  AggSplit values as options bitmasks so that tests of what to do
  aren't noticeably more expensive than before.

  While at it, get rid of Aggref.aggoutputtype. That's not needed
  since commit 59a3795c2 got rid of setrefs.c's special-purpose Aggref
  comparison code, and it likewise seemed more confusing than helpful.

  Assorted comment cleanup as well (there's still more that I want to
  do in that line).

  catversion bump for change in Aggref node contents. Should be the
  last one for partial-aggregation changes.

  Discussion: <29309.1466699160@sss.pgh.pa.us>

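  The enum-as-bitmask trick, approximately as it appears in nodes.h
  (treat names and values as a sketch of the committed definition):

      /* Primitive options that compose an AggSplit mode. */
      #define AGGSPLITOP_COMBINE     0x01    /* use combinefn, not transfn */
      #define AGGSPLITOP_SKIPFINAL   0x02    /* skip finalfn, emit partial state */
      #define AGGSPLITOP_SERIALIZE   0x04    /* apply serializefn to output */
      #define AGGSPLITOP_DESERIALIZE 0x08    /* apply deserializefn to input */

      /* Supported modes are named combinations of those bits, so mode
       * tests stay cheap bitmask checks rather than triples of bools. */
      typedef enum AggSplit
      {
          AGGSPLIT_SIMPLE = 0,
          AGGSPLIT_INITIAL_SERIAL = AGGSPLITOP_SKIPFINAL | AGGSPLITOP_SERIALIZE,
          AGGSPLIT_FINAL_DESERIAL = AGGSPLITOP_COMBINE | AGGSPLITOP_DESERIALIZE
      } AggSplit;

      #define DO_AGGSPLIT_COMBINE(as)   (((as) & AGGSPLITOP_COMBINE) != 0)
      #define DO_AGGSPLIT_SKIPFINAL(as) (((as) & AGGSPLITOP_SKIPFINAL) != 0)
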
* Fix handling of multixacts predating pg_upgrade (Alvaro Herrera, 2016-06-24)

  After pg_upgrade, it is possible that some tuples' Xmax have
  multixacts corresponding to the old installation; such multixacts
  cannot have running members anymore. In many code sites we already
  know not to read them and clobber them silently, but at least when
  VACUUM tries to freeze a multixact or determine whether one needs
  freezing, there's an attempt to resolve it to its member
  transactions by calling GetMultiXactIdMembers, and if the multixact
  value is "in the future" with regards to the current valid multixact
  range, an error like this is raised:

      ERROR:  MultiXactId 123 has not been created yet -- apparent wraparound

  and vacuuming fails. Per discussion with Andrew Gierth, it is
  completely bogus to try to resolve multixacts coming from before a
  pg_upgrade, regardless of where they stand with regards to the
  current valid multixact range.

  It's possible to get from under this problem by doing SELECT FOR
  UPDATE of the problem tuples, but if tables are large, this is slow
  and tedious, so a more thorough solution is desirable.

  To fix, we realize that multixacts in xmax created in 9.2 and
  previous have a specific bit pattern that is never used in 9.3 and
  later (we already knew this, per comments and infomask tests
  sprinkled in various places, but we weren't leveraging this
  knowledge appropriately). Whenever the infomask of the tuple matches
  that bit pattern, we just ignore the multixact completely as if Xmax
  wasn't set; or, in the case of tuple freezing, we act as if an
  unwanted value is set and clobber it without decoding. This
  guarantees that no errors will be raised, and that the values will
  be progressively removed until all tables are clean. Most callers of
  GetMultiXactIdMembers are patched to recognize directly that the
  value is a removable "empty" multixact and avoid calling
  GetMultiXactIdMembers altogether.

  To avoid changing the signature of GetMultiXactIdMembers() in back
  branches, we keep the "allow_old" boolean flag but rename it to
  "from_pgupgrade"; if the flag is true, we always return an empty set
  instead of looking up the multixact. (I suppose we could remove the
  argument in the master branch, but I chose not to do so in this
  commit.)

  This was broken all along, but the error-facing message appeared
  first because of commit 8e9a16ab8f7f and was partially fixed in
  a25c2b7c4db3. This fix, backpatched all the way back to 9.3, goes
  approximately in the same direction as a25c2b7c4db3 but should cover
  all cases.

  Bug analysis by Andrew Gierth and Álvaro Herrera.

  A number of public reports match this bug:
  https://www.postgresql.org/message-id/20140330040029.GY4582@tamriel.snowman.net
  https://www.postgresql.org/message-id/538F3D70.6080902@publicrelay.com
  https://www.postgresql.org/message-id/556439CF.7070109@pscs.co.uk
  https://www.postgresql.org/message-id/SG2PR06MB0760098A111C88E31BD4D96FB3540@SG2PR06MB0760.apcprd06.prod.outlook.com
  https://www.postgresql.org/message-id/20160615203829.5798.4594@wrigleys.postgresql.org

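  A sketch of the infomask test, modeled on the macro this work added
  to htup_details.h (flag values as in htup_details.h; treat the exact
  composition as approximate):

      /* infomask bits, as in htup_details.h */
      #define HEAP_XMAX_KEYSHR_LOCK  0x0010
      #define HEAP_XMAX_EXCL_LOCK    0x0040
      #define HEAP_XMAX_LOCK_ONLY    0x0080
      #define HEAP_XMAX_IS_MULTI     0x1000

      /* A multixact xmax written by 9.2 or earlier is always LOCK_ONLY
       * and never carries a 9.3+ lock-strength bit, so this pattern can
       * only predate pg_upgrade and is safe to ignore without decoding. */
      #define HEAP_LOCKED_UPGRADED(infomask) \
      ( \
          ((infomask) & HEAP_XMAX_IS_MULTI) != 0 && \
          ((infomask) & HEAP_XMAX_LOCK_ONLY) != 0 && \
          (((infomask) & (HEAP_XMAX_EXCL_LOCK | HEAP_XMAX_KEYSHR_LOCK)) == 0) \
      )
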
* Fix small memory leak in partial-aggregate deserialization functions. (Tom Lane, 2016-06-23)

  A deserialize function's result is short-lived data during partial
  aggregation, since we're just going to pass it to the combine
  function and then it's of no use anymore. However, the built-in
  deserialize functions allocated their results in the aggregate state
  context, resulting in a query-lifespan memory leak.

  It's probably not possible for this to amount to anything much at
  present, since the number of leaked results would only be the number
  of worker processes. But it might become a problem in future.

  To fix, don't use the same convenience subroutine for setting up
  results that the aggregate transition functions use.

  David Rowley

  Report: <10050.1466637736@sss.pgh.pa.us>

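  The memory-context pattern at issue, sketched as a fragment
  (MemoryContextSwitchTo is the real backend call; make_agg_state and
  the variables are illustrative):

      /* Leaky: the deserialized state is built in the long-lived
       * aggregate context, so it survives until end of query even
       * though it is consumed once by the combine function. */
      old = MemoryContextSwitchTo(aggcontext);
      state = make_agg_state();
      MemoryContextSwitchTo(old);

      /* Fixed: allocate in the current, short-lived per-call context,
       * which is reset soon after the combine function runs. */
      state = make_agg_state();   /* uses CurrentMemoryContext */
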
* Fix type-safety problem with parallel aggregate serial/deserialization. (Tom Lane, 2016-06-22)

  The original specification for this called for the deserialization
  function to have signature "deserialize(serialtype) returns
  transtype", which is a security violation if transtype is INTERNAL
  (which it always would be in practice) and serialtype is not (which
  ditto). The patch blithely overrode the opr_sanity check for that,
  which was sloppy-enough work in itself, but the indisputable reason
  this cannot be allowed to stand is that CREATE FUNCTION will reject
  such a signature and thus it'd be impossible for extensions to
  create parallelizable aggregates.

  The minimum fix to make the signature type-safe is to add a second,
  dummy argument of type INTERNAL. But to lock it down a bit more and
  make misuse of INTERNAL-accepting functions less likely, let's get
  rid of the ability to specify a "serialtype" for an aggregate and
  just say that the only useful serialtype is BYTEA --- which, in
  practice, is the only interesting value anyway, due to the
  usefulness of the send/recv infrastructure for this purpose. That
  means we only have to allow "serialize(internal) returns bytea" and
  "deserialize(bytea, internal) returns internal" as the signatures
  for these support functions.

  In passing fix bogus signature of int4_avg_combine, which I found
  thanks to adding an opr_sanity check on combinefunc signatures.

  catversion bump due to removing pg_aggregate.aggserialtype and
  adjusting signatures of assorted built-in functions.

  David Rowley and Tom Lane

  Discussion: <27247.1466185504@sss.pgh.pa.us>

* Restore foreign-key-aware estimation of join relation sizes. (Tom Lane, 2016-06-18)

  This patch provides a new implementation of the logic added by
  commit 137805f89 and later removed by 77ba61080. It differs from the
  original primarily in expending much less effort per joinrel in
  large queries, which it accomplishes by doing most of the matching
  work once per query not once per joinrel. Hopefully, it's also less
  buggy and better commented.

  The never-documented enable_fkey_estimates GUC remains gone.

  There remains work to be done to make the selectivity estimates
  account for nulls in FK referencing columns; but that was true of
  the original patch as well. We may be able to address this point
  later in beta. In the meantime, any error should be in the direction
  of overestimating rather than underestimating joinrel sizes, which
  seems like the direction we want to err in.

  Tomas Vondra and Tom Lane

  Discussion: <31041.1465069446@sss.pgh.pa.us>

* Fix validation of overly-long IPv6 addresses. (Tom Lane, 2016-06-16)

  The inet/cidr types sometimes failed to reject IPv6 inputs with too
  many colon-separated fields, instead translating them to '::/0'.
  This is the result of a thinko in the original ISC code that seems
  to be as yet unreported elsewhere. Per bug #14198 from Stefan
  Kaltenbrunner.

  Report: <20160616182222.5798.959@wrigleys.postgresql.org>

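  A simplified standalone sketch of the missing bounds check (this
  ignores '::' compression and embedded IPv4 forms, which the real
  parser handles):

      #include <stdbool.h>

      /* An IPv6 address has at most 8 colon-separated 16-bit groups;
       * the thinko was accepting groups without checking the count. */
      static bool
      ipv6_group_count_ok(const char *s)
      {
          int groups = 1;

          for (; *s; s++)
              if (*s == ':')
                  groups++;
          return groups <= 8;
      }
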
* Invent min_parallel_relation_size GUC to replace a hard-wired constant. (Tom Lane, 2016-06-16)

  The main point of doing this is to allow the cutoff to be set very
  small, even zero, to allow parallel-query behavior to be tested on
  relatively small tables such as we typically use in the regression
  tests. But it might be of use to users too.

  The number-of-workers scaling behavior in
  create_plain_partial_paths() is pretty ad-hoc and subject to change,
  so we won't expose anything about that, but the notion of not
  considering parallel query at all for tables below size X seems
  reasonably stable.

  Amit Kapila, per a suggestion from me

  Discussion: <17170.1465830165@sss.pgh.pa.us>

* Finish pgindent run for 9.6: Perl files. (Noah Misch, 2016-06-12)

* Change default of backend_flush_after GUC to 0 (disabled). (Andres Freund, 2016-06-10)

  While backend_flush_after is beneficial in a significant number of
  workloads, both for throughput and for average/worst-case latency,
  there are other workloads in which it can cause significant
  performance regressions in comparison to pre-9.6 releases. The
  regression is most likely when the hot data set is bigger than
  shared buffers, but significantly smaller than the operating
  system's page cache.

  I personally think that the benefit of enabling backend flush
  control is considerably bigger than the potential downsides, but a
  fair argument can be made that not regressing is more important than
  improving performance/latency. As the latter is the consensus,
  change the default to 0. The other settings introduced in 428b1d6b2
  do not have the same potential for regressions, so leave them
  enabled.

  Benchmarks leading up to changing the default were performed by
  Mithun Cy, Ashutosh Sharma and Robert Haas.

  Discussion: CAD__OuhPmc6XH=wYRm_+Q657yQE88DakN4=Ybh2oveFasHkoeA@mail.gmail.com

* Refactor to reduce code duplication for function property checking. (Tom Lane, 2016-06-10)

  As noted by Andres Freund, we'd accumulated quite a few similar
  functions in clauses.c that examine all functions in an expression
  tree to see if they satisfy some boolean test. Reduce the
  duplication by inventing a function check_functions_in_node() that
  applies a simple callback function to each SQL function OID
  appearing in a given expression node. This also fixes some arguable
  oversights; for example, contain_mutable_functions() did not check
  aggregate or window functions for mutability. I doubt that that
  represents a live bug at the moment, because we don't really
  consider mutability for aggregates; but it might someday be one.

  I chose to put check_functions_in_node() in nodeFuncs.c because it
  seemed like other modules might wish to use it in future. That in
  turn forced moving set_opfuncid() et al into nodeFuncs.c, as the
  alternative was for nodeFuncs.c to depend on optimizer/setrefs.c
  which didn't seem very clean.

  In passing, teach contain_leaked_vars_walker() about a few more
  expression node types it can safely look through, and improve the
  rather messy and undercommented code in
  has_parallel_hazard_walker().

  Discussion: <20160527185853.ziol2os2zskahl7v@alap3.anarazel.de>

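  A standalone sketch of the callback pattern (toy expression node;
  the real check_functions_in_node() inspects a single Node and leaves
  the tree recursion to the expression walkers):

      #include <stdbool.h>
      #include <stddef.h>

      typedef unsigned int Oid;

      typedef struct Expr
      {
          Oid         funcid;     /* 0 if not a function call */
          struct Expr *left;
          struct Expr *right;
      } Expr;

      typedef bool (*check_function_callback) (Oid func_id, void *context);

      /* Apply 'checker' to every function OID in the tree; return true
       * as soon as any check succeeds. */
      static bool
      check_functions_in_tree(const Expr *node,
                              check_function_callback checker, void *context)
      {
          if (node == NULL)
              return false;
          if (node->funcid != 0 && checker(node->funcid, context))
              return true;
          return check_functions_in_tree(node->left, checker, context) ||
                 check_functions_in_tree(node->right, checker, context);
      }
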
* Fix interaction between CREATE INDEX and "snapshot too old". (Kevin Grittner, 2016-06-10)

  Since indexes are created without valid LSNs, an index created while
  a snapshot older than old_snapshot_threshold existed could cause
  queries to return incorrect results when those old snapshots were
  used, if any relevant rows had been subject to early pruning before
  the index was built. Prevent usage of a newly created index until
  all such snapshots are released, for relations where this can
  happen.

  Questions about the interaction of "snapshot too old" with index
  creation were initially raised by Andres Freund.

  Reviewed by Robert Haas.

* Improve the situation for parallel query versus temp relations. (Tom Lane, 2016-06-09)

  Transmit the leader's temp-namespace state to workers. This is
  important because without it, the workers do not really have the
  same search path as the leader. For example, there is no good reason
  (and no extant code either) to prevent a worker from executing a
  temp function that the leader created previously; but as things
  stood it would fail to find the temp function, and then either fail
  or execute the wrong function entirely. We still prohibit a worker
  from creating a temp namespace on its own. In effect, a worker can
  only see the session's temp namespace if the leader had created it
  before starting the worker, which seems like the right semantics.

  Also, transmit the leader's BackendId to workers, and arrange for
  workers to use that when determining the physical file path of a
  temp relation belonging to their session. While the original intent
  was to prevent such accesses entirely, there were a number of holes
  in that, notably in places like dbsize.c which assume they can
  safely access temp rels of other sessions anyway. We might as well
  get this right, as a small down payment on someday allowing workers
  to access the leader's temp tables. (With this change, directly
  using "MyBackendId" as a relation or buffer backend ID is
  deprecated; you should use BackendIdForTempRelations() instead. I
  left a couple of such uses alone though, as they're not going to be
  reachable in parallel workers until we do something about
  localbuf.c.)

  Move the thou-shalt-not-access-thy-leader's-temp-tables prohibition
  down into localbuf.c, which is where it actually matters, instead of
  having it in relation_open(). This amounts to recognizing that
  access to temp tables' catalog entries is perfectly safe in a
  worker, it's only the data in local buffers that is problematic.

  Having done all that, we can get rid of the test in
  has_parallel_hazard() that says that use of a temp table's rowtype
  is unsafe in parallel workers. That test was unduly expensive, and
  if we really did need such a prohibition, that was not even close to
  being a bulletproof guard for it. (For example, any user-defined
  function executed in a parallel worker might have attempted such
  access.)

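  The macro at the heart of the path change, approximately as added to
  backendid.h (a sketch, not guaranteed to match the committed text):

      /* Workers resolve temp-relation file paths with the leader's
       * backend ID, so leader and workers agree on where the session's
       * temp rels live on disk. */
      #define BackendIdForTempRelations() \
          (ParallelMasterBackendId == InvalidBackendId ? \
           MyBackendId : ParallelMasterBackendId)
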
* pgindent run for 9.6 (Robert Haas, 2016-06-09)

* Eliminate "parallel degree" terminology.Robert Haas2016-06-09
| | | | | | | | | | | | This terminology provoked widespread complaints. So, instead, rename the GUC max_parallel_degree to max_parallel_workers_per_gather (leaving room for a possible future GUC max_parallel_workers that acts as a system-wide limit), and rename the parallel_degree reloption to parallel_workers. Rename structure members to match. These changes create a dump/restore hazard for users of PostgreSQL 9.6beta1 who have set the reloption (or applied the GUC using ALTER USER or ALTER DATABASE).
* Revert "Use Foreign Key relationships to infer multi-column join selectivity".Tom Lane2016-06-07
| | | | | | | | | | | | | | This commit reverts 137805f89 as well as the associated commits 015e88942, 5306df283, and 68d704edb. We found multiple bugs in this feature, and there was concern about possible planner slowdown (though to be fair, exhibiting a very large slowdown proved difficult). The way forward requires a considerable rewrite, which may or may not be possible to accomplish in time for beta2. In my judgment reviewing the rewrite will be easier to accomplish starting from a clean slate, so let's temporarily revert what's there now. This also leaves us in a safe state if it turns out to be necessary to postpone the rewrite to the next development cycle. Discussion: <20160429102531.GA13701@huehner.biz>
* Message style and wording fixes (Peter Eisentraut, 2016-06-07)

* Minor typos / copy-editing for snapmgr.c (Stephen Frost, 2016-06-07)

  Noticed while reviewing snapshot management.

* Inline the easy cases in MakeExpandedObjectReadOnly(). (Tom Lane, 2016-06-03)

  This attempts to buy back some of whatever performance we lost from
  fixing bug #14174 by inlining the initial checks in
  MakeExpandedObjectReadOnly() into the callers. We can do that in a
  macro without creating multiple-evaluation hazards, so it's pretty
  much free notationally; and the amount of code added to callers
  should be minimal as well. (Testing a value can't take many more
  instructions than passing it to a subroutine.)

  Might as well inline DatumIsReadWriteExpandedObject() while we're at
  it.

  This is an ABI break for callers, so it doesn't seem safe to put
  into 9.5, but I see no reason not to do it in HEAD.

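  The inlined fast path, approximately as a macro (a sketch; only the
  slow path still calls out of line):

      /* Cheap inline tests; only call the function when the datum
       * could actually be a read-write expanded object. */
      #define MakeExpandedObjectReadOnly(d, isnull, typlen) \
          (((isnull) || (typlen) != -1) ? (d) : \
           MakeExpandedObjectReadOnlyInternal(d))
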
* Add new snapshot fields to serialize/deserialize functions. (Kevin Grittner, 2016-06-03)

  The "snapshot too old" condition was not being recognized when using
  a copied snapshot, since the original timestamp and lsn were not
  being passed along. Noticed when testing the combination of
  "snapshot too old" with parallel query execution.

* Fix various common misspellings. (Greg Stark, 2016-06-03)

  Mostly these are just comments, but there are a few in documentation
  and a handful in code and tests.

  Hopefully this doesn't cause too much unnecessary pain for
  backpatching. I relented from some of the most common ones like
  "thru" for that reason. The rest don't seem numerous enough to cause
  problems.

  Thanks to Kevin Lyda's tool https://pypi.python.org/pypi/misspellings

* Be more predictable about reporting "lock timeout" vs "statement timeout". (Tom Lane, 2016-05-27)

  If both timeout indicators are set when we arrive at
  ProcessInterrupts, we've historically just reported "lock timeout".
  However, some buildfarm members have been observed to fail
  isolationtester's timeouts test by reporting "lock timeout" when the
  statement timeout was expected to fire first. The cause seems to be
  that the process is allowed to sleep longer than expected (probably
  due to heavy machine load) so that the lock timeout happens before
  we reach the point of reporting the error, and then this arbitrary
  tiebreak rule does the wrong thing. We can improve matters by
  comparing the scheduled timeout times to decide which error to
  report.

  I had originally proposed greatly reducing the 1-second window
  between the two timeouts in the test cases. On reflection that is a
  bad idea, at least for the case where the lock timeout is expected
  to fire first, because that would assume that it takes negligible
  time to get from statement start to the beginning of the lock wait.
  Thus, this patch doesn't completely remove the risk of test failures
  on slow machines. Empirically, however, the case this handles is the
  one we are seeing in the buildfarm. The explanation may be that the
  other case requires the scheduler to take the CPU away from a busy
  process, whereas the case fixed here only requires the scheduler to
  not give the CPU back right away to a process that has been woken
  from a multi-second sleep (and, perhaps, has been swapped out
  meanwhile).

  Back-patch to 9.3 where the isolationtester timeouts test was added.

  Discussion: <8693.1464314819@sss.pgh.pa.us>

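  A sketch of the tiebreak (get_timeout_finish_time() is the accessor
  this work added to timeout.c; the pending flags shown here are
  illustrative, not the backend's actual variables):

      /* Both timeouts have triggered by the time we get here: report
       * the one that was scheduled to fire first, instead of assuming
       * the lock timeout won the race. */
      if (lock_timeout_pending && statement_timeout_pending)
      {
          if (get_timeout_finish_time(STATEMENT_TIMEOUT) <
              get_timeout_finish_time(LOCK_TIMEOUT))
              lock_timeout_pending = false;   /* statement fired first */
          else
              statement_timeout_pending = false;
      }
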
* Fix assorted missing infrastructure for ON CONFLICT. (Tom Lane, 2016-05-11)

  subquery_planner() failed to apply expression preprocessing to the
  arbiterElems and arbiterWhere fields of an OnConflictExpr. No doubt
  the theory was that this wasn't necessary because we don't actually
  try to execute those expressions; but that's wrong, because it
  results in failure to match to index expressions or index predicates
  that are changed at all by preprocessing. Per bug #14132 from
  Reynold Smith.

  Also add pullup_replace_vars processing for onConflictWhere. Perhaps
  it's impossible to have a subquery reference there, but I'm not
  exactly convinced; and even if true today it's a failure waiting to
  happen.

  Also add some comments to other places where one or another field of
  OnConflictExpr is intentionally ignored, with explanation as to why
  it's okay to do so.

  Also, catalog/dependency.c failed to record any dependency on the
  named constraint in ON CONFLICT ON CONSTRAINT, allowing such a
  constraint to be dropped while rules exist that depend on it, and
  allowing pg_dump to dump such a rule before the constraint it refers
  to. The normal execution path managed to error out reasonably for a
  dangling constraint reference, but ruleutils.c dumped core; so in
  addition to fixing the omission, add a protective check in
  ruleutils.c, since we can't retroactively add a dependency in
  existing databases.

  Back-patch to 9.5 where this code was introduced.

  Report: <20160510190350.2608.48667@wrigleys.postgresql.org>

* Mitigate "snapshot too old" performance regression on NUMAKevin Grittner2016-05-06
| | | | | | | | | | | | Limit maintenance of time to xid mapping to once per minute. At least in the tested case this brings performance within 5% of when the feature is off, compared to several times slower without this patch. While there, fix comments and whitespace. Ants Aasma, with cosmetic adjustments suggested by Andres Freund Reviewed by Kevin Grittner and Andres Freund
* Limit maximum parallel degree to 1024. (Robert Haas, 2016-05-06)

  This new limit affects both the max_parallel_degree GUC and the
  parallel_degree reloption. There may some day be a use case for
  using more than 1024 CPUs for a single query, but that's surely not
  the case right now. Not only do very few people have that many CPUs,
  but the code hasn't been tested at that kind of scale and is very
  unlikely to perform well, or even work at all, without a lot more
  work. The issue addressed by commit
  06bd458cb812623c3f1fdd55216c4c08b06a8447 is probably just one
  problem of many.

  The idea of a more reasonable limit here was suggested by Tom Lane;
  the value of 1024 was suggested by Amit Kapila.

* Fix possible read past end of string in to_timestamp(). (Tom Lane, 2016-05-06)

  to_timestamp() handles the TH/th format codes by advancing over two
  input characters, whatever those are. It failed to notice whether
  there were two characters available to be skipped, making it
  possible to advance the pointer past the end of the input string and
  keep on parsing. A similar risk existed in the handling of "Y,YYY"
  format: it would advance over three characters after the "," whether
  or not three characters were available.

  In principle this might be exploitable to disclose contents of
  server memory. But the security team concluded that it would be very
  hard to use that way, because the parsing loop would stop upon
  hitting any zero byte, and TH/th format codes can't be consecutive
  --- they have to follow some other format code, which would have to
  match whatever data is there. So it seems impractical to examine
  memory very much beyond the end of the input string via this bug;
  and the input string will always be in local memory not in disk
  buffers, making it unlikely that anything very interesting is close
  to it in a predictable way. So this doesn't quite rise to the level
  of needing a CVE.

  Thanks to Wolf Roediger for reporting this bug.

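  A standalone sketch of the bounded advance (the real code must also
  cope with multibyte characters):

      /* Advance over up to n characters, but never past the
       * terminating NUL; the old coding skipped two bytes for TH/th
       * unconditionally. */
      static const char *
      skip_chars_bounded(const char *s, int n)
      {
          while (n-- > 0 && *s != '\0')
              s++;
          return s;
      }
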
* Fix hash index vs "snapshot too old" problems (Kevin Grittner, 2016-05-06)

  Hash indexes are not WAL-logged, and so do not maintain the LSN of
  index pages. Since the "snapshot too old" feature counts on
  detecting error conditions using the LSN of a table and all indexes
  on it, this makes it impossible to safely do early vacuuming on any
  table with a hash index, so add this to the tests for whether the
  xid used to vacuum a table can be adjusted based on
  old_snapshot_threshold.

  While at it, add a paragraph to the docs for old_snapshot_threshold
  which specifically mentions this and other aspects of the feature
  which may otherwise surprise users.

  Problem reported and patch reviewed by Amit Kapila

* Rename tsvector delete() to ts_delete(), and filter() to ts_filter(). (Tom Lane, 2016-05-05)

  The similarity of the original names to SQL keywords seems like a
  bad idea. Rename them before we're stuck with 'em forever.

  In passing, minor code and docs cleanup.

  Discussion: <4875.1462210058@sss.pgh.pa.us>

* Fix corner-case loss of precision in numeric pow() calculation (Dean Rasheed, 2016-05-05)

  Commit 7d9a4737c268f61fb8800957631f12d3f13be218 greatly improved the
  accuracy of the numeric transcendental functions, however it failed
  to consider the case where the result from pow() is close to the
  overflow threshold, for example 0.12 ^ -2345.6. For such inputs,
  where the result has more than 2000 digits before the decimal point,
  the decimal result weight estimate was being clamped to 2000,
  leading to a loss of precision in the final calculation.

  Fix this by replacing the clamping code with an overflow test that
  aborts the calculation early if the final result is sure to
  overflow, based on the overflow limit in exp_var(). This provides
  the same protection against integer overflow in the subsequent
  result scale computation as the original clamping code, but it also
  ensures that precision is never lost and saves compute cycles in
  cases that are sure to overflow.

  The new early overflow test works with the initial low-precision
  result (expected to be accurate to around 8 significant digits) and
  includes a small fuzz factor to ensure that it doesn't kick in for
  values that would not overflow exp_var(), so the overall overflow
  threshold of pow() is unchanged and consistent for all inputs with
  non-integer exponents.

  Author: Dean Rasheed
  Reviewed-by: Tom Lane
  Discussion: http://www.postgresql.org/message-id/CAEZATCUj3U-cQj0jjoia=qgs0SjE3auroxh8swvNKvZWUqegrg@mail.gmail.com
  See-also: http://www.postgresql.org/message-id/CAEZATCV7w+8iB=07dJ8Q0zihXQT1semcQuTeK+4_rogC_zq5Hw@mail.gmail.com

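  A standalone sketch of the early overflow test (the limit handling
  and fuzz factor are illustrative, not the values used in numeric.c):

      #include <math.h>
      #include <stdbool.h>

      /* Estimate ln|result| = y * ln|x| from the cheap low-precision
       * pass, and bail out early when the final result is certain to
       * overflow, instead of clamping the result weight and silently
       * losing precision. */
      static bool
      pow_will_overflow(double base, double exponent, double ln_max_result)
      {
          double ln_result = exponent * log(fabs(base));

          /* small fuzz: abort only when clearly beyond the limit, so
           * borderline cases still go through the full calculation */
          return ln_result > ln_max_result * 1.01;
      }
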
* Fix crash of filter(tsvector) (Teodor Sigaev, 2016-05-04)

  The variable storing a lexeme's position had the wrong type: char,
  which is obviously not enough to store the 2^14 possible positions.

  Stas Kelvich

* Fix more things to be parallel-safe. (Robert Haas, 2016-05-03)

  Conversion functions were previously marked as parallel-unsafe,
  since that is the default, but in fact they are safe. Parallel-safe
  functions defined in pg_proc.h and redefined in system_views.sql
  were ending up as parallel-unsafe because the redeclarations were
  not marked PARALLEL SAFE. While editing system_views.sql, mark
  ts_debug() parallel safe also.

  Andreas Karlsson

* Tweak a few more things in preparation for upcoming pgindent run. (Robert Haas, 2016-05-03)

  These changes adjust code and comments in minor ways to prevent
  pgindent from mangling them. Among other things, I tried to avoid
  situations where pgindent would emit "a +b" instead of "a + b", and
  I tried to avoid having it break up inline comments across multiple
  lines.

* Note that max_worker_processes requires restart. (Robert Haas, 2016-05-03)

  Since this is a minor issue, no back-patch.

  Julien Rouhaud

* Add a few entries to the tail of time mapping, to see old values. (Kevin Grittner, 2016-04-29)

  Without a few entries beyond old_snapshot_threshold, the lookup
  would often fail, resulting in the more aggressive pruning or vacuum
  being skipped often enough to matter. This was very clearly shown by
  a python test script posted by Ants Aasma, and was likely a factor
  in an earlier but somewhat less clear-cut test case posted by Jeff
  Janes.

  This patch makes no change to the logic, per se -- it just makes the
  array of mapping entries big enough to make lookup misses based on
  timing much less likely. An occasional miss is still possible if a
  thread stalls for more than 10 minutes, but that does not create any
  problem with correctness of behavior. Besides, if things are so busy
  that a thread is stalling for more than 10 minutes, it is probably
  OK to skip the more aggressive cleanup at that particular point in
  time.