aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils
Commit message (Collapse)AuthorAge
* Avoid testing tuple visibility without buffer lock in RI_FKey_check().Tom Lane2016-10-23
| | | | | | | | | | | | | | | | Despite the argumentation I wrote in commit 7a2fe85b0, it's unsafe to do this, because in corner cases it's possible for HeapTupleSatisfiesSelf to try to set hint bits on the target tuple; and at least since 8.2 we have required the buffer content lock to be held while setting hint bits. The added regression test exercises one such corner case. Unpatched, it causes an assertion failure in assert-enabled builds, or otherwise would cause a hint bit change in a buffer we don't hold lock on, which given the right race condition could result in checksum failures or other data consistency problems. The odds of a problem in the field are probably pretty small, but nonetheless back-patch to all supported branches. Report: <19391.1477244876@sss.pgh.pa.us>
* Suppress "Factory" zone in pg_timezone_names view for tzdata >= 2016g.Tom Lane2016-10-19
| | | | | | IANA got rid of the really silly "abbreviation" and replaced it with one that's only moderately silly. But it's still pointless, so keep on not showing it.
* Fix cidin() to handle values above 2^31 platform-independently.Tom Lane2016-10-18
| | | | | | | | | | | | | | | | | | | | | | CommandId is declared as uint32, and values up to 4G are indeed legal. cidout() handles them properly by treating the value as unsigned int. But cidin() was just using atoi(), which has platform-dependent behavior for values outside the range of signed int, as reported by Bart Lengkeek in bug #14379. Use strtoul() instead, as xidin() does. In passing, make some purely cosmetic changes to make xidin/xidout look more like cidin/cidout; the former didn't have a monopoly on best practice IMO. Neither xidin nor cidin make any attempt to throw error for invalid input. I didn't change that here, and am not sure it's worth worrying about since neither is really a user-facing type. The point is just to ensure that indubitably-valid inputs work as expected. It's been like this for a long time, so back-patch to all supported branches. Report: <20161018152550.1413.6439@wrigleys.postgresql.org>
* Fix assorted integer-overflow hazards in varbit.c.Tom Lane2016-10-14
| | | | | | | | | | | | | | | | | bitshiftright() and bitshiftleft() would recursively call each other infinitely if the user passed INT_MIN for the shift amount, due to integer overflow in negating the shift amount. To fix, clamp to -VARBITMAXLEN. That doesn't change the results since any shift distance larger than the input bit string's length produces an all-zeroes result. Also fix some places that seemed inadequately paranoid about input typmods exceeding VARBITMAXLEN. While a typmod accepted by anybit_typmodin() will certainly be much less than that, at least some of these spots are reachable with user-chosen integer values. Andreas Seltenreich and Tom Lane Discussion: <87d1j2zqtz.fsf@credativ.de>
* Don't require dynamic timezone abbreviations to match underlying time zone.Tom Lane2016-09-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, we threw an error if a dynamic timezone abbreviation did not match any abbreviation recorded in the referenced IANA time zone entry. That seemed like a good consistency check at the time, but it turns out that a number of the abbreviations in the IANA database are things that Olson and crew made up out of whole cloth. Their current policy is to remove such names in favor of using simple numeric offsets. Perhaps unsurprisingly, a lot of these made-up abbreviations have varied in meaning over time, which meant that our commit b2cbced9e and later changes made them into dynamic abbreviations. So with newer IANA database versions that don't mention these abbreviations at all, we fail, as reported in bug #14307 from Neil Anderson. It's worse than just a few unused-in-the-wild abbreviations not working, because the pg_timezone_abbrevs view stops working altogether (since its underlying function tries to compute the whole view result in one call). We considered deleting these abbreviations from our abbreviations list, but the problem with that is that we can't stay ahead of possible future IANA changes. Instead, let's leave the abbreviations list alone, and treat any "orphaned" dynamic abbreviation as just meaning the referenced time zone. It will behave a bit differently than it used to, in that you can't any longer override the zone's standard vs. daylight rule by using the "wrong" abbreviation of a pair, but that's better than failing entirely. (Also, this solution can be interpreted as adding a small new feature, which is that any abbreviation a user wants can be defined as referencing a time zone name.) Back-patch to all supported branches, since this problem affects all of them when using tzdata 2016f or newer. Report: <20160902031551.15674.67337@wrigleys.postgresql.org> Discussion: <6189.1472820913@sss.pgh.pa.us>
* Remove bogus dependencies on NUMERIC_MAX_PRECISION.Tom Lane2016-08-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NUMERIC_MAX_PRECISION is a purely arbitrary constraint on the precision and scale you can write in a numeric typmod. It might once have had something to do with the allowed range of a typmod-less numeric value, but at least since 9.1 we've allowed, and documented that we allowed, any value that would physically fit in the numeric storage format; which is something over 100000 decimal digits, not 1000. Hence, get rid of numeric_in()'s use of NUMERIC_MAX_PRECISION as a limit on the allowed range of the exponent in scientific-format input. That was especially silly in view of the fact that you can enter larger numbers as long as you don't use 'e' to do it. Just constrain the value enough to avoid localized overflow, and let make_result be the final arbiter of what is too large. Likewise adjust ecpg's equivalent of this code. Also get rid of numeric_recv()'s use of NUMERIC_MAX_PRECISION to limit the number of base-NBASE digits it would accept. That created a dump/restore hazard for binary COPY without doing anything useful; the wire-format limit on number of digits (65535) is about as tight as we would want. In HEAD, also get rid of pg_size_bytes()'s unnecessary intimacy with what the numeric range limit is. That code doesn't exist in the back branches. Per gripe from Aravind Kumar. Back-patch to all supported branches, since they all contain the documentation claim about allowed range of NUMERIC (cf commit cabf5d84b). Discussion: <2895.1471195721@sss.pgh.pa.us>
* Fix several one-byte buffer over-reads in to_numberPeter Eisentraut2016-08-08
| | | | | | | | | | | | | | | | | | | | | | | | | Several places in NUM_numpart_from_char(), which is called from the SQL function to_number(text, text), could accidentally read one byte past the end of the input buffer (which comes from the input text datum and is not null-terminated). 1. One leading space character would be skipped, but there was no check that the input was at least one byte long. This does not happen in practice, but for defensiveness, add a check anyway. 2. Commit 4a3a1e2cf apparently accidentally doubled that code that skips one space character (so that two spaces might be skipped), but there was no overflow check before skipping the second byte. Fix by removing that duplicate code. 3. A logic error would allow a one-byte over-read when looking for a trailing sign (S) placeholder. In each case, the extra byte cannot be read out directly, but looking at it might cause a crash. The third item was discovered by Piotr Stefaniak, the first two were found and analyzed by Tom Lane and Peter Eisentraut.
* Fix misestimation of n_distinct for a nearly-unique column with many nulls.Tom Lane2016-08-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If ANALYZE found no repeated non-null entries in its sample, it set the column's stadistinct value to -1.0, intending to indicate that the entries are all distinct. But what this value actually means is that the number of distinct values is 100% of the table's rowcount, and thus it was overestimating the number of distinct values by however many nulls there are. This could lead to very poor selectivity estimates, as for example in a recent report from Andreas Joseph Krogh. We should discount the stadistinct value by whatever we've estimated the nulls fraction to be. (That is what will happen if we choose to use a negative stadistinct for a column that does have repeated entries, so this code path was just inconsistent.) In addition to fixing the stadistinct entries stored by several different ANALYZE code paths, adjust the logic where get_variable_numdistinct() forces an "all distinct" estimate on the basis of finding a relevant unique index. Unique indexes don't reject nulls, so there's no reason to assume that the null fraction doesn't apply. Back-patch to all supported branches. Back-patching is a bit of a judgment call, but this problem seems to affect only a few users (else we'd have identified it long ago), and it's bad enough when it does happen that destabilizing plan choices in a worse direction seems unlikely. Patch by me, with documentation wording suggested by Dean Rasheed Report: <VisenaEmail.26.df42f82acae38a58.156463942b8@tc7-visena> Discussion: <16143.1470350371@sss.pgh.pa.us>
* Fix assorted fallout from IS [NOT] NULL patch.Tom Lane2016-07-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commits 4452000f3 et al established semantics for NullTest.argisrow that are a bit different from its initial conception: rather than being merely a cache of whether we've determined the input to have composite type, the flag now has the further meaning that we should apply field-by-field testing as per the standard's definition of IS [NOT] NULL. If argisrow is false and yet the input has composite type, the construct instead has the semantics of IS [NOT] DISTINCT FROM NULL. Update the comments in primnodes.h to clarify this, and fix ruleutils.c and deparse.c to print such cases correctly. In the case of ruleutils.c, this merely results in cosmetic changes in EXPLAIN output, since the case can't currently arise in stored rules. However, it represents a live bug for deparse.c, which would formerly have sent a remote query that had semantics different from the local behavior. (From the user's standpoint, this means that testing a remote nested-composite column for null-ness could have had unexpected recursive behavior much like that fixed in 4452000f3.) In a related but somewhat independent fix, make plancat.c set argisrow to false in all NullTest expressions constructed to represent "attnotnull" constructs. Since attnotnull is actually enforced as a simple null-value check, this is a more accurate representation of the semantics; we were previously overpromising what it meant for composite columns, which might possibly lead to incorrect planner optimizations. (It seems that what the SQL spec expects a NOT NULL constraint to mean is an IS NOT NULL test, so arguably we are violating the spec and should fix attnotnull to do the other thing. If we ever do, this part should get reverted.) Back-patch, same as the previous commit. Discussion: <10682.1469566308@sss.pgh.pa.us>
* Fix crash in close_ps() for NaN input coordinates.Tom Lane2016-07-16
| | | | | | | | | | The Assert() here seems unreasonably optimistic. Andreas Seltenreich found that it could fail with NaNs in the input geometries, and it seems likely to me that it might fail in corner cases due to roundoff error, even for ordinary input values. As a band-aid, make the function return SQL NULL instead of crashing. Report: <87d1md1xji.fsf@credativ.de>
* Fix validation of overly-long IPv6 addresses.Tom Lane2016-06-16
| | | | | | | | | The inet/cidr types sometimes failed to reject IPv6 inputs with too many colon-separated fields, instead translating them to '::/0'. This is the result of a thinko in the original ISC code that seems to be as yet unreported elsewhere. Per bug #14198 from Stefan Kaltenbrunner. Report: <20160616182222.5798.959@wrigleys.postgresql.org>
* Fix possible read past end of string in to_timestamp().Tom Lane2016-05-06
| | | | | | | | | | | | | | | | | | | | | | | to_timestamp() handles the TH/th format codes by advancing over two input characters, whatever those are. It failed to notice whether there were two characters available to be skipped, making it possible to advance the pointer past the end of the input string and keep on parsing. A similar risk existed in the handling of "Y,YYY" format: it would advance over three characters after the "," whether or not three characters were available. In principle this might be exploitable to disclose contents of server memory. But the security team concluded that it would be very hard to use that way, because the parsing loop would stop upon hitting any zero byte, and TH/th format codes can't be consecutive --- they have to follow some other format code, which would have to match whatever data is there. So it seems impractical to examine memory very much beyond the end of the input string via this bug; and the input string will always be in local memory not in disk buffers, making it unlikely that anything very interesting is close to it in a predictable way. So this doesn't quite rise to the level of needing a CVE. Thanks to Wolf Roediger for reporting this bug.
* Rename strtoi() to strtoint().Tom Lane2016-04-23
| | | | | | | | | | | | NetBSD has seen fit to invent a libc function named strtoi(), which conflicts with the long-established static functions of the same name in datetime.c and ecpg's interval.c. While muttering darkly about intrusions on application namespace, we'll rename our functions to avoid the conflict. Back-patch to all supported branches, since this would affect attempts to build any of them on recent NetBSD. Thomas Munro
* Fix ruleutils.c's dumping of ScalarArrayOpExpr containing an EXPR_SUBLINK.Tom Lane2016-04-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | When we shoehorned "x op ANY (array)" into the SQL syntax, we created a fundamental ambiguity as to the proper treatment of a sub-SELECT on the righthand side: perhaps what's meant is to compare x against each row of the sub-SELECT's result, or perhaps the sub-SELECT is meant as a scalar sub-SELECT that delivers a single array value whose members should be compared against x. The grammar resolves it as the former case whenever the RHS is a select_with_parens, making the latter case hard to reach --- but you can get at it, with tricks such as attaching a no-op cast to the sub-SELECT. Parse analysis would throw away the no-op cast, leaving a parsetree with an EXPR_SUBLINK SubLink directly under a ScalarArrayOpExpr. ruleutils.c was not clued in on this fine point, and would naively emit "x op ANY ((SELECT ...))", which would be parsed as the first alternative, typically leading to errors like "operator does not exist: text = text[]" during dump/reload of a view or rule containing such a construct. To fix, emit a no-op cast when dumping such a parsetree. This might well be exactly what the user wrote to get the construct accepted in the first place; and even if she got there with some other dodge, it is a valid representation of the parsetree. Per report from Karl Czajkowski. He mentioned only a case involving RLS policies, but actually the problem is very old, so back-patch to all supported branches. Report: <20160421001832.GB7976@moraine.isi.edu>
* Remove dependency on psed for MSVC builds.Andrew Dunstan2016-03-19
| | | | | | | | | | | | Modern Perl has removed psed from its core distribution, so it might not be readily available on some build platforms. We therefore replace its use with a Perl script generated by s2p, which is equivalent to the sed script. The latter is retained for non-MSVC builds to avoid creating a new hard dependency on Perl for non-Windows tarball builds. Backpatch to all live branches. Michael Paquier and me.
* Avoid multiple free_struct_lconv() calls on same data.Tom Lane2016-02-28
| | | | | | | | | | | A failure partway through PGLC_localeconv() led to a situation where the next call would call free_struct_lconv() a second time, leading to free() on already-freed strings, typically leading to a core dump. Add a flag to remember whether we need to do that. Per report from Thom Brown. His example case only provokes the failure as far back as 9.4, but nonetheless this code is obviously broken, so back-patch to all supported branches.
* Force certain "pljava" custom GUCs to be PGC_SUSET.Noah Misch2016-02-05
| | | | | | | Future PL/Java versions will close CVE-2016-0766 by making these GUCs PGC_SUSET. This PostgreSQL change independently mitigates that PL/Java vulnerability, helping sites that update PostgreSQL more frequently than PL/Java. Back-patch to 9.1 (all supported versions).
* Properly install dynloader.h on MSVC buildsBruce Momjian2016-01-19
| | | | | | | | | | | This will enable PL/Java to be cleanly compiled, as dynloader.h is a requirement. Report by Chapman Flack Patch by Michael Paquier Backpatch through 9.1
* Teach flatten_reloptions() to quote option values safely.Tom Lane2016-01-01
| | | | | | | | | | | | | | | | | | | | | | | | | | flatten_reloptions() supposed that it didn't really need to do anything beyond inserting commas between reloption array elements. However, in principle the value of a reloption could be nearly anything, since the grammar allows a quoted string there. Any restrictions on it would come from validity checking appropriate to the particular option, if any. A reloption value that isn't a simple identifier or number could thus lead to dump/reload failures due to syntax errors in CREATE statements issued by pg_dump. We've gotten away with not worrying about this so far with the core-supported reloptions, but extensions might allow reloption values that cause trouble, as in bug #13840 from Kouhei Sutou. To fix, split the reloption array elements explicitly, and then convert any value that doesn't look like a safe identifier to a string literal. (The details of the quoting rule could be debated, but this way is safe and requires little code.) While we're at it, also quote reloption names if they're not safe identifiers; that may not be a likely problem in the field, but we might as well try to be bulletproof here. It's been like this for a long time, so back-patch to all supported branches. Kouhei Sutou, adjusted some by me
* Add some more defenses against silly estimates to gincostestimate().Tom Lane2016-01-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A report from Andy Colson showed that gincostestimate() was not being nearly paranoid enough about whether to believe the statistics it finds in the index metapage. The problem is that the metapage stats (other than the pending-pages count) are only updated by VACUUM, and in the worst case could still reflect the index's original empty state even when it has grown to many entries. We attempted to deal with that by scaling up the stats to match the current index size, but if nEntries is zero then scaling it up still gives zero. Moreover, the proportion of pages that are entry pages vs. data pages vs. pending pages is unlikely to be estimated very well by scaling if the index is now orders of magnitude larger than before. We can improve matters by expanding the use of the rule-of-thumb estimates I introduced in commit 7fb008c5ee59b040: if the index has grown by more than a cutoff amount (here set at 4X growth) since VACUUM, then use the rule-of-thumb numbers instead of scaling. This might not be exactly right but it seems much less likely to produce insane estimates. I also improved both the scaling estimate and the rule-of-thumb estimate to account for numPendingPages, since it's reasonable to expect that that is accurate in any case, and certainly pages that are in the pending list are not either entry or data pages. As a somewhat separate issue, adjust the estimation equations that are concerned with extra fetches for partial-match searches. These equations suppose that a fraction partialEntries / numEntries of the entry and data pages will be visited as a consequence of a partial-match search. Now, it's physically impossible for that fraction to exceed one, but our estimate of partialEntries is mostly bunk, and our estimate of numEntries isn't exactly gospel either, so we could arrive at a silly value. In the example presented by Andy we were coming out with a value of 100, leading to insane cost estimates. Clamp the fraction to one to avoid that. Like the previous patch, back-patch to all supported branches; this problem can be demonstrated in one form or another in all of them.
* Add missing CHECK_FOR_INTERRUPTS in lseg_inside_polyAlvaro Herrera2015-12-14
| | | | | | | | | Apparently, there are bugs in this code that cause it to loop endlessly. That bug still needs more research, but in the meantime it's clear that the loop is missing a check for interrupts so that it can be cancelled timely. Backpatch to 9.1 -- this has been missing since 49475aab8d0d.
* Make gincostestimate() cope with hypothetical GIN indexes.Tom Lane2015-12-01
| | | | | | | | | | | | | | | | | | | | | | | We tried to fetch statistics data from the index metapage, which does not work if the index isn't actually present. If the index is hypothetical, instead extrapolate some plausible internal statistics based on the index page count provided by the index-advisor plugin. There was already some code in gincostestimate() to invent internal stats in this way, but since it was only meant as a stopgap for pre-9.1 GIN indexes that hadn't been vacuumed since upgrading, it was pretty crude. If we want it to support index advisors, we should try a little harder. A small amount of testing says that it's better to estimate the entry pages as 90% of the index, not 100%. Also, estimating the number of entries (keys) as equal to the heap tuple count could be wildly wrong in either direction. Instead, let's estimate 100 entries per entry page. Perhaps someday somebody will want the index advisor to be able to provide these numbers more directly, but for the moment this should serve. Problem report and initial patch by Julien Rouhaud; modified by me to invent less-bogus internal statistics. Back-patch to all supported branches, since we've supported index advisors since 9.0.
* Fix failure to consider failure cases in GetComboCommandId().Tom Lane2015-11-26
| | | | | | | | | | | | Failure to initially palloc the comboCids array, or to realloc it bigger when needed, left combocid's data structures in an inconsistent state that would cause trouble if the top transaction continues to execute. Noted while examining a user complaint about the amount of memory used for this. (There's not much we can do about that, but it does point up that repalloc failure has a non-negligible chance of occurring here.) In HEAD/9.5, also avoid possible invocation of memcpy() with a null pointer in SerializeComboCIDState; cf commit 13bba0227.
* Fix handling of inherited check constraints in ALTER COLUMN TYPE (again).Tom Lane2015-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous way of reconstructing check constraints was to do a separate "ALTER TABLE ONLY tab ADD CONSTRAINT" for each table in an inheritance hierarchy. However, that way has no hope of reconstructing the check constraints' own inheritance properties correctly, as pointed out in bug #13779 from Jan Dirk Zijlstra. What we should do instead is to do a regular "ALTER TABLE", allowing recursion, at the topmost table that has a particular constraint, and then suppress the work queue entries for inherited instances of the constraint. Annoyingly, we'd tried to fix this behavior before, in commit 5ed6546cf, but we failed to notice that it wasn't reconstructing the pg_constraint field values correctly. As long as I'm touching pg_get_constraintdef_worker anyway, tweak it to always schema-qualify the target table name; this seems like useful backup to the protections installed by commit 5f173040. In HEAD/9.5, get rid of get_constraint_relation_oids, which is now unused. (I could alternatively have modified it to also return conislocal, but that seemed like a pretty single-purpose API, so let's not pretend it has some other use.) It's unused in the back branches as well, but I left it in place just in case some third-party code has decided to use it. In HEAD/9.5, also rename pg_get_constraintdef_string to pg_get_constraintdef_command, as the previous name did nothing to explain what that entry point did differently from others (and its comment was equally useless). Again, that change doesn't seem like material for back-patching. I did a bit of re-pgindenting in tablecmds.c in HEAD/9.5, as well. Otherwise, back-patch to all supported branches.
* Fix possible internal overflow in numeric division.Tom Lane2015-11-17
| | | | | | | | | | | div_var_fast() postpones propagating carries in the same way as mul_var(), so it has the same corner-case overflow risk we fixed in 246693e5ae8a36f0, namely that the size of the carries has to be accounted for when setting the threshold for executing a carry propagation step. We've not devised a test case illustrating the brokenness, but the required fix seems clear enough. Like the previous fix, back-patch to all active branches. Dean Rasheed
* Fix ruleutils.c's dumping of whole-row Vars in ROW() and VALUES() contexts.Tom Lane2015-11-15
| | | | | | | | | | | | | | | Normally ruleutils prints a whole-row Var as "foo.*". We already knew that that doesn't work at top level of a SELECT list, because the parser would treat the "*" as a directive to expand the reference into separate columns, not a whole-row Var. However, Joshua Yanovski points out in bug #13776 that the same thing happens at top level of a ROW() construct; and some nosing around in the parser shows that the same is true in VALUES(). Hence, apply the same workaround already devised for the SELECT-list case, namely to add a forced cast to the appropriate rowtype in these cases. (The alternative of just printing "foo" was rejected because it is difficult to avoid ambiguity against plain columns named "foo".) Back-patch to all supported branches.
* Perform an immediate shutdown if the postmaster.pid file is removed.Tom Lane2015-10-06
| | | | | | | | | | | | | | | | | | | | | | | | The postmaster now checks every minute or so (worst case, at most two minutes) that postmaster.pid is still there and still contains its own PID. If not, it performs an immediate shutdown, as though it had received SIGQUIT. The original goal behind this change was to ensure that failed buildfarm runs would get fully cleaned up, even if the test scripts had left a postmaster running, which is not an infrequent occurrence. When the buildfarm script removes a test postmaster's $PGDATA directory, its next check on postmaster.pid will fail and cause it to exit. Previously, manual intervention was often needed to get rid of such orphaned postmasters, since they'd block new test postmasters from obtaining the expected socket address. However, by checking postmaster.pid and not something else, we can provide additional robustness: manual removal of postmaster.pid is a frequent DBA mistake, and now we can at least limit the damage that will ensue if a new postmaster is started while the old one is still alive. Back-patch to all supported branches, since we won't get the desired improvement in buildfarm reliability otherwise.
* Prevent stack overflow in query-type functions.Noah Misch2015-10-05
| | | | | | The tsquery, ltxtquery and query_int data types have a common ancestor. Having acquired check_stack_depth() calls independently, each was missing at least one call. Back-patch to 9.0 (all supported versions).
* Prevent stack overflow in container-type functions.Noah Misch2015-10-05
| | | | | | | | | | | | | | A range type can name another range type as its subtype, and a record type can bear a column of another record type. Consequently, functions like range_cmp() and record_recv() are recursive. Functions at risk include operator family members and referents of pg_type regproc columns. Treat as recursive any such function that looks up and calls the same-purpose function for a record column type or the range subtype. Back-patch to 9.0 (all supported versions). An array type's element type is never itself an array type, so array functions are unaffected. Recursion depth proportional to array dimensionality, found in array_dim_to_jsonb(), is fine thanks to MAXDIM.
* Add recursion depth protection to LIKE matching.Tom Lane2015-10-02
| | | | | Since MatchText() recurses, it could in principle be driven to stack overflow, although quite a long pattern would be needed.
* Lower *_freeze_max_age minimum values.Andres Freund2015-09-24
| | | | | | | | | | | | | | | | | | The old minimum values are rather large, making it time consuming to test related behaviour. Additionally the current limits, especially for multixacts, can be problematic in space-constrained systems. 10000000 multixacts can contain a lot of members. Since there's no good reason for the current limits, lower them a good bit. Setting them to 0 would be a bad idea, triggering endless vacuums, so still retain a limit. While at it fix autovacuum_multixact_freeze_max_age to refer to multixact.c instead of varsup.c. Reviewed-By: Robert Haas Discussion: CA+TgmoYmQPHcrc3GSs7vwvrbTkbcGD9Gik=OztbDGGrovkkEzQ@mail.gmail.com Backpatch: 9.0 (in parts)
* Fix possible internal overflow in numeric multiplication.Tom Lane2015-09-21
| | | | | | | | | | | | | mul_var() postpones propagating carries until it risks overflow in its internal digit array. However, the logic failed to account for the possibility of overflow in the carry propagation step, allowing wrong results to be generated in corner cases. We must slightly reduce the when-to-propagate-carries threshold to avoid that. Discovered and fixed by Dean Rasheed, with small adjustments by me. This has been wrong since commit d72f6c75038d8d37e64a29a04b911f728044d83b, so back-patch to all supported branches.
* Add gin_fuzzy_search_limit to postgresql.conf.sample.Fujii Masao2015-09-09
| | | | | | | This was forgotten in 8a3631f (commit that originally added the parameter) and 0ca9907 (commit that added the documentation later that year). Back-patch to all supported versions.
* Move DTK_ISODOW DTK_DOW and DTK_DOY to be type UNITS rather thanGreg Stark2015-09-06
| | | | | | | | | | | | | | | | RESERV. RESERV is meant for tokens like "now" and having them in that category throws errors like these when used as an input date: stark=# SELECT 'doy'::timestamptz; ERROR: unexpected dtype 33 while parsing timestamptz "doy" LINE 1: SELECT 'doy'::timestamptz; ^ stark=# SELECT 'dow'::timestamptz; ERROR: unexpected dtype 32 while parsing timestamptz "dow" LINE 1: SELECT 'dow'::timestamptz; ^ Found by LLVM's Libfuzzer
* Fix subtransaction cleanup after an outer-subtransaction portal fails.Tom Lane2015-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Formerly, we treated only portals created in the current subtransaction as having failed during subtransaction abort. However, if the error occurred while running a portal created in an outer subtransaction (ie, a cursor declared before the last savepoint), that has to be considered broken too. To allow reliable detection of which ones those are, add a bookkeeping field to struct Portal that tracks the innermost subtransaction in which each portal has actually been executed. (Without this, we'd end up failing portals containing functions that had called the subtransaction, thereby breaking plpgsql exception blocks completely.) In addition, when we fail an outer-subtransaction Portal, transfer its resources into the subtransaction's resource owner, so that they're released early in cleanup of the subxact. This fixes a problem reported by Jim Nasby in which a function executed in an outer-subtransaction cursor could cause an Assert failure or crash by referencing a relation created within the inner subtransaction. The proximate cause of the Assert failure is that AtEOSubXact_RelationCache assumed it could blow away a relcache entry without first checking that the entry had zero refcount. That was a bad idea on its own terms, so add such a check there, and to the similar coding in AtEOXact_RelationCache. This provides an independent safety measure in case there are still ways to provoke the situation despite the Portal-level changes. This has been broken since subtransactions were invented, so back-patch to all supported branches. Tom Lane and Michael Paquier
* Add a small cache of locks owned by a resource owner in ResourceOwner.Tom Lane2015-08-27
| | | | | | | | | | | | Back-patch 9.3-era commit eeb6f37d89fc60c6449ca12ef9e91491069369cb, to improve the older branches' ability to cope with pg_dump dumping a large number of tables. I back-patched into 9.2 and 9.1, but not 9.0 as it would have required a significant amount of refactoring, thus negating the argument that this is by-now-well-tested code. Jeff Janes, reviewed by Amit Kapila and Heikki Linnakangas.
* Allow record_in() and record_recv() to work for transient record types.Tom Lane2015-08-21
| | | | | | | | | | | | | | | | | | If we have the typmod that identifies a registered record type, there's no reason that record_in() should refuse to perform input conversion for it. Now, in direct SQL usage, record_in() will always be passed typmod = -1 with type OID RECORDOID, because no typmodin exists for type RECORD, so the case can't arise. However, some InputFunctionCall users such as PLs may be able to supply the right typmod, so we should allow this to support them. Note: the previous coding and comment here predate commit 59c016aa9f490b53. There has been no case since 8.1 in which the passed type OID wouldn't be valid; and if it weren't, this error message wouldn't be apropos anyway. Better to let lookup_rowtype_tupdesc complain about it. Back-patch to 9.1, as this is necessary for my upcoming plpython fix. I'm committing it separately just to make it a bit more visible in the commit history.
* Don't use 'bool' as a struct member name in help_config.c.Andres Freund2015-08-15
| | | | | | | | | | | | Doing so doesn't work if bool is a macro rather than a typedef. Although c.h spends some effort to support configurations where bool is a preexisting macro, help_config.c has existed this way since 2003 (b700a6), and there have not been any reports of problems. Backpatch anyway since this is as riskless as it gets. Discussion: 20150812084351.GD8470@awork2.anarazel.de Backpatch: 9.0-master
* Fix bogus "out of memory" reports in tuplestore.c.Tom Lane2015-08-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tuplesort/tuplestore memory management logic assumed that the chunk allocation overhead for its memtuples array could not increase when increasing the array size. This is and always was true for tuplesort, but we (I, I think) blindly copied that logic into tuplestore.c without noticing that the assumption failed to hold for the much smaller array elements used by tuplestore. Given rather small work_mem, this could result in an improper complaint about "unexpected out-of-memory situation", as reported by Brent DeSpain in bug #13530. The easiest way to fix this is just to increase tuplestore's initial array size so that the assumption holds. Rather than relying on magic constants, though, let's export a #define from aset.c that represents the safe allocation threshold, and make tuplestore's calculation depend on that. Do the same in tuplesort.c to keep the logic looking parallel, even though tuplesort.c isn't actually at risk at present. This will keep us from breaking it if we ever muck with the allocation parameters in aset.c. Back-patch to all supported versions. The error message doesn't occur pre-9.3, not so much because the problem can't happen as because the pre-9.3 tuplestore code neglected to check for it. (The chance of trouble is a great deal larger as of 9.3, though, due to changes in the array-size-increasing strategy.) However, allowing LACKMEM() to become true unexpectedly could still result in less-than-desirable behavior, so let's patch it all the way back.
* Cap wal_buffers to avoid a server crash when it's set very large.Robert Haas2015-08-04
| | | | | | | | | | It must be possible to multiply wal_buffers by XLOG_BLCKSZ without overflowing int, or calculations in StartupXLOG will go badly wrong and crash the server. Avoid that by imposing a maximum value on wal_buffers. This will be just under 2GB, assuming the usual value for XLOG_BLCKSZ. Josh Berkus, per an analysis by Andrew Gierth.
* Avoid some zero-divide hazards in the planner.Tom Lane2015-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | Although I think on all modern machines floating division by zero results in Infinity not SIGFPE, we still don't want infinities running around in the planner's costing estimates; too much risk of that leading to insane behavior. grouping_planner() failed to consider the possibility that final_rel might be known dummy and hence have zero rowcount. (I wonder if it would be better to set a rows estimate of 1 for dummy relations? But at least in the back branches, changing this convention seems like a bad idea, so I'll leave that for another day.) Make certain that get_variable_numdistinct() produces a nonzero result. The case that can be shown to be broken is with stadistinct < 0.0 and small ntuples; we did not prevent the result from rounding to zero. For good luck I applied clamp_row_est() to all the nonconstant return values. In ExecChooseHashTableSize(), Assert that we compute positive nbuckets and nbatch. I know of no reason to think this isn't the case, but it seems like a good safety check. Per reports from Piotr Stefaniak. Back-patch to all active branches.
* Disable ssl renegotiation by default.Andres Freund2015-07-28
| | | | | | | | | | | | | | | | | | | | | | While postgres' use of SSL renegotiation is a good idea in theory, it turned out to not work well in practice. The specification and openssl's implementation of it have lead to several security issues. Postgres' use of renegotiation also had its share of bugs. Additionally OpenSSL has a bunch of bugs around renegotiation, reported and open for years, that regularly lead to connections breaking with obscure error messages. We tried increasingly complex workarounds to get around these bugs, but we didn't find anything complete. Since these connection breakages often lead to hard to debug problems, e.g. spuriously failing base backups and significant latency spikes when synchronous replication is used, we have decided to change the default setting for ssl renegotiation to 0 (disabled) in the released backbranches and remove it entirely in 9.5 and master.. Author: Michael Paquier, with changes by me Discussion: 20150624144148.GQ4797@alap3.anarazel.de Backpatch: 9.0-9.4; 9.5 and master get a different patch
* Fix the logic for putting relations into the relcache init file.Tom Lane2015-06-25
| | | | | | | | | | | | | | | | | | | | | | | | | | Commit f3b5565dd4e59576be4c772da364704863e6a835 was a couple of bricks shy of a load; specifically, it missed putting pg_trigger_tgrelid_tgname_index into the relcache init file, because that index is not used by any syscache. However, we have historically nailed that index into cache for performance reasons. The upshot was that load_relcache_init_file always decided that the init file was busted and silently ignored it, resulting in a significant hit to backend startup speed. To fix, reinstantiate RelationIdIsInInitFile() as a wrapper around RelationSupportsSysCache(), which can know about additional relations that should be in the init file despite being unknown to syscache.c. Also install some guards against future mistakes of this type: make write_relcache_init_file Assert that all nailed relations get written to the init file, and make load_relcache_init_file emit a WARNING if it takes the "wrong number of nailed relations" exit path. Now that we remove the init files during postmaster startup, that case should never occur in the field, even if we are starting a minor-version update that added or removed rels from the nailed set. So the warning shouldn't ever be seen by end users, but it will show up in the regression tests if somebody breaks this logic. Back-patch to all supported branches, like the previous commit.
* Use a safer method for determining whether relcache init file is stale.Tom Lane2015-06-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we invalidate the relcache entry for a system catalog or index, we must also delete the relcache "init file" if the init file contains a copy of that rel's entry. The old way of doing this relied on a specially maintained list of the OIDs of relations present in the init file: we made the list either when reading the file in, or when writing the file out. The problem is that when writing the file out, we included only rels present in our local relcache, which might have already suffered some deletions due to relcache inval events. In such cases we correctly decided not to overwrite the real init file with incomplete data --- but we still used the incomplete initFileRelationIds list for the rest of the current session. This could result in wrong decisions about whether the session's own actions require deletion of the init file, potentially allowing an init file created by some other concurrent session to be left around even though it's been made stale. Since we don't support changing the schema of a system catalog at runtime, the only likely scenario in which this would cause a problem in the field involves a "vacuum full" on a catalog concurrently with other activity, and even then it's far from easy to provoke. Remarkably, this has been broken since 2002 (in commit 786340441706ac1957a031f11ad1c2e5b6e18314), but we had never seen a reproducible test case until recently. If it did happen in the field, the symptoms would probably involve unexpected "cache lookup failed" errors to begin with, then "could not open file" failures after the next checkpoint, as all accesses to the affected catalog stopped working. Recovery would require manually removing the stale "pg_internal.init" file. To fix, get rid of the initFileRelationIds list, and instead consult syscache.c's list of relations used in catalog caches to decide whether a relation is included in the init file. This should be a tad more efficient anyway, since we're replacing linear search of a list with ~100 entries with a binary search. It's a bit ugly that the init file contents are now so directly tied to the catalog caches, but in practice that won't make much difference. Back-patch to all supported branches.
* Fix incorrect order of database-locking operations in InitPostgres().Tom Lane2015-06-05
| | | | | | | | | | | | | | | | | | | | | | | | We should set MyProc->databaseId after acquiring the per-database lock, not beforehand. The old way risked deadlock against processes trying to copy or delete the target database, since they would first acquire the lock and then wait for processes with matching databaseId to exit; that left a window wherein an incoming process could set its databaseId and then block on the lock, while the other process had the lock and waited in vain for the incoming process to exit. CountOtherDBBackends() would time out and fail after 5 seconds, so this just resulted in an unexpected failure not a permanent lockup, but it's still annoying when it happens. A real-world example of a use-case is that short-duration connections to a template database should not cause CREATE DATABASE to fail. Doing it in the other order should be fine since the contract has always been that processes searching the ProcArray for a database ID must hold the relevant per-database lock while searching. Thus, this actually removes the former race condition that required an assumption that storing to MyProc->databaseId is atomic. It's been like this for a long time, so back-patch to all active branches.
* Check return values of sensitive system library calls.Noah Misch2015-05-18
| | | | | | | | | | | | | | | | | | | | | | PostgreSQL already checked the vast majority of these, missing this handful that nearly cannot fail. If putenv() failed with ENOMEM in pg_GSS_recvauth(), authentication would proceed with the wrong keytab file. If strftime() returned zero in cache_locale_time(), using the unspecified buffer contents could lead to information exposure or a crash. Back-patch to 9.0 (all supported versions). Other unchecked calls to these functions, especially those in frontend code, pose negligible security concern. This patch does not address them. Nonetheless, it is always better to check return values whose specification provides for indicating an error. In passing, fix an off-by-one error in strftime_win32()'s invocation of WideCharToMultiByte(). Upon retrieving a value of exactly MAX_L10N_DATA bytes, strftime_win32() would overrun the caller's buffer by one byte. MAX_L10N_DATA is chosen to exceed the length of every possible value, so the vulnerable scenario probably does not arise. Security: CVE-2015-3166
* Remove spurious semicolons.Heikki Linnakangas2015-03-31
| | | | Petr Jelinek
* Fix dumping of views that are just VALUES(...) but have column aliases.Tom Lane2015-02-25
| | | | | | | | | | | | | | | The "simple" path for printing VALUES clauses doesn't work if we need to attach nondefault column aliases, because there's noplace to do that in the minimal VALUES() syntax. So modify get_simple_values_rte() to detect nondefault aliases and treat that as a non-simple case. This further exposes that the "non-simple" path never actually worked; it didn't produce valid syntax. Fix that too. Per bug #12789 from Curtis McEnroe, and analysis by Andrew Gierth. Back-patch to all supported branches. Before 9.3, this also requires back-patching the part of commit 092d7ded29f36b0539046b23b81b9f0bf2d637f1 that created get_simple_values_rte() to begin with; inserting the extra test into the old factorization of that logic would've been too messy.
* Be more careful to not lose sync in the FE/BE protocol.Heikki Linnakangas2015-02-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If any error occurred while we were in the middle of reading a protocol message from the client, we could lose sync, and incorrectly try to interpret a part of another message as a new protocol message. That will usually lead to an "invalid frontend message" error that terminates the connection. However, this is a security issue because an attacker might be able to deliberately cause an error, inject a Query message in what's supposed to be just user data, and have the server execute it. We were quite careful to not have CHECK_FOR_INTERRUPTS() calls or other operations that could ereport(ERROR) in the middle of processing a message, but a query cancel interrupt or statement timeout could nevertheless cause it to happen. Also, the V2 fastpath and COPY handling were not so careful. It's very difficult to recover in the V2 COPY protocol, so we will just terminate the connection on error. In practice, that's what happened previously anyway, as we lost protocol sync. To fix, add a new variable in pqcomm.c, PqCommReadingMsg, that is set whenever we're in the middle of reading a message. When it's set, we cannot safely ERROR out and continue running, because we might've read only part of a message. PqCommReadingMsg acts somewhat similarly to critical sections in that if an error occurs while it's set, the error handler will force the connection to be terminated, as if the error was FATAL. It's not implemented by promoting ERROR to FATAL in elog.c, like ERROR is promoted to PANIC in critical sections, because we want to be able to use PG_TRY/CATCH to recover and regain protocol sync. pq_getmessage() takes advantage of that to prevent an OOM error from terminating the connection. To prevent unnecessary connection terminations, add a holdoff mechanism similar to HOLD/RESUME_INTERRUPTS() that can be used hold off query cancel interrupts, but still allow die interrupts. The rules on which interrupts are processed when are now a bit more complicated, so refactor ProcessInterrupts() and the calls to it in signal handlers so that the signal handlers always call it if ImmediateInterruptOK is set, and ProcessInterrupts() can decide to not do anything if the other conditions are not met. Reported by Emil Lenngren. Patch reviewed by Noah Misch and Andres Freund. Backpatch to all supported versions. Security: CVE-2015-0244
* to_char(): prevent writing beyond the allocated bufferBruce Momjian2015-02-02
| | | | | | | | | | Previously very long localized month and weekday strings could overflow the allocated buffers, causing a server crash. Reported and patch reviewed by Noah Misch. Backpatch to all supported versions. Security: CVE-2015-0241