path: root/src/backend
* Remove bogus code to apply PathTargets to partial paths. (Robert Haas, 2016-06-03)
  The partial paths that get modified may already have been used as part of a GatherPath which appears in the path list, so modifying them is not a good idea at this stage - especially because this code has no check that the PathTarget is in fact parallel-safe.

  When partial aggregation is being performed, this is actually harmless because we'll end up replacing the pathtargets here with the correct ones within create_grouping_paths(). But if we've got a query tree containing only scan/join operations then this can result in incorrectly pushing down parallel-restricted target list entries. If those are, for example, references to subqueries, that can crash the server; but it's wrong in any event.

  Amit Kapila

* Add new snapshot fields to serialize/deserialize functions. (Kevin Grittner, 2016-06-03)
  The "snapshot too old" condition was not being recognized when using a copied snapshot, since the original timestamp and lsn were not being passed along. Noticed when testing the combination of "snapshot too old" with parallel query execution.

* Fix comment to be more accurate. (Robert Haas, 2016-06-03)
  Now that we skip vacuuming all-frozen pages, this comment needs updating.

  Masahiko Sawada

* Fix various common misspellings. (Greg Stark, 2016-06-03)
  Mostly these are just comments but there are a few in documentation and a handful in code and tests. Hopefully this doesn't cause too much unnecessary pain for backpatching. I relented from some of the most common like "thru" for that reason. The rest don't seem numerous enough to cause problems.

  Thanks to Kevin Lyda's tool https://pypi.python.org/pypi/misspellings

* Cosmetic improvements to freeze map code. (Robert Haas, 2016-06-03)
  Per post-commit review comments from Andres Freund, improve variable names, comments, and in one place, slightly improve the code structure.

  Masahiko Sawada

* Be conservative about alignment requirements of struct epoll_event. (Greg Stark, 2016-06-02)
  Use MAXALIGN size/alignment to guarantee that later uses of memory are aligned correctly. E.g. epoll_event might need 8-byte alignment on some platforms, but earlier allocations like WaitEventSet and WaitEvent might not be sized to guarantee that when purely using sizeof().

  Found by myself while testing on a Sun Ultra 5 (Sparc IIi) with some editorializing by Andres Freund.

  In passing, fix a couple of typos in the area.

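  The sizing rule is easy to demonstrate outside the server. The sketch below is a standalone illustration with made-up struct layouts, not the actual latch.c code: when several structs are carved out of one allocation, rounding each sub-allocation up to the maximum alignment keeps a later member with stricter alignment (like epoll_event) on a safe boundary.

      /* Simplified, hypothetical sketch of the MAXALIGN sizing rule; struct
       * layouts are illustrative only. */
      #include <stdio.h>
      #include <stdlib.h>
      #include <stdint.h>

      #define MAXIMUM_ALIGNOF 8
      #define MAXALIGN(LEN) (((uintptr_t) (LEN) + (MAXIMUM_ALIGNOF - 1)) & ~((uintptr_t) (MAXIMUM_ALIGNOF - 1)))

      typedef struct { int fd; int events; int pos; } WaitEventish;      /* 12 bytes, not a multiple of 8 */
      typedef struct { uint32_t events; uint64_t data; } EpollEventish;  /* wants 8-byte alignment */

      int
      main(void)
      {
          /* Without MAXALIGN the second struct would start at offset 12,
           * which is misaligned for its 8-byte member. */
          size_t  off = MAXALIGN(sizeof(WaitEventish));
          char   *block = malloc(off + MAXALIGN(sizeof(EpollEventish)));
          EpollEventish *ev = (EpollEventish *) (block + off);

          ev->data = 42;    /* safe store thanks to the rounded-up offset */
          printf("epoll-like struct placed at offset %zu (multiple of 8)\n", off);
          free(block);
          return 0;
      }
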
* Fix btree mark/restore bug. (Kevin Grittner, 2016-06-02)
  Commit 2ed5b87f96d473962ec5230fd820abfeaccb2069 introduced a bug in mark/restore, in an attempt to optimize repeated restores to the same page. This caused an assertion failure during a merge join which fed directly from an index scan, although the impact would not be limited to that case. Revert the bad chunk of code from that commit.

  While investigating this bug it was discovered that a particular "paranoia" set of the mark position field would not prevent bad behavior; it would just make it harder to diagnose. Change that into an assertion, which will draw attention to any future problem in that area more directly.

  Backpatch to 9.5, where the bug was introduced.

  Bug #14169 reported by Shinta Koyanagi. Preliminary analysis by Tom Lane identified which commit caused the bug.

* Avoid useless closely-spaced writes of statistics files. (Tom Lane, 2016-05-31)
  The original intent in the stats collector was that we should not write out stats data oftener than every PGSTAT_STAT_INTERVAL msec. Backends will not make requests at all if they see the existing data is newer than that, and the stats collector is supposed to disregard requests having a cutoff_time older than its most recently written data, so that close-together requests don't result in multiple writes.

  But the latter part of that got broken in commit 187492b6c2e8cafc, so that if two backends concurrently decide the existing stats are too old, the collector would write the data twice. (In principle the collector's logic would still merge requests as long as the second one arrives before we've actually written data ... but since the message collection loop would write data immediately after processing a single inquiry message, that never happened in practice, and in any case the window in which it might work would be much shorter than PGSTAT_STAT_INTERVAL.)

  To fix, improve pgstat_recv_inquiry so that it checks whether the cutoff time is too old, and doesn't add a request to the queue if so. This means that we do not need DBWriteRequest.request_time, because the decision is taken before making a queue entry. And that means that we don't really need the DBWriteRequest data structure at all; an OID list of database OIDs will serve and allow removal of some rather verbose and crufty code.

  In passing, improve the comments in this area, which have been rather neglected. Also change backend_read_statsfile so that it's not silently relying on MyDatabaseId to have some particular value in the autovacuum launcher process. It accidentally worked as desired because MyDatabaseId is zero in that process; but that does not seem like a dependency we want, especially with no documentation about it.

  Although this patch is mine, it turns out I'd rediscovered a known bug, for which Tomas Vondra had already submitted a patch that's functionally equivalent to the non-cosmetic aspects of this patch. Thanks to Tomas for reviewing this version.

  Back-patch to 9.3 where the bug was introduced.

  Prior-Discussion: <1718942738eb65c8407fcd864883f4c8@fuzzy.cz>
  Patch: <4625.1464202586@sss.pgh.pa.us>

* Mirror struct Aggref field order in _copyAggref(). (Noah Misch, 2016-05-31)
  This is cosmetic, and no supported release has the affected fields.

* Fix PageAddItem BRIN bug. (Alvaro Herrera, 2016-05-30)
  BRIN was relying on the ability to remove a tuple from an index page, then putting another tuple in the same line pointer. But PageAddItem refuses to add a tuple beyond the first free item past the last used item, and in particular, it rejects an attempt to add an item to an empty page anywhere other than the first line pointer. PageAddItem issues a WARNING and indicates to the caller that it failed, which in turn causes the BRIN calling code to issue a PANIC, so the whole sequence looks like this:

    WARNING: specified item offset is too large
    PANIC: failed to add BRIN tuple

  To fix, create a new function PageAddItemExtended which is like PageAddItem except that the two boolean arguments become a flags bitmap; the "overwrite" and "is_heap" boolean flags in PageAddItem become PAI_OVERWRITE and PAI_IS_HEAP flags in the new function, and a new flag PAI_ALLOW_FAR_OFFSET enables the behavior required by BRIN. PageAddItem() retains its original signature, for compatibility with third-party modules (other callers in core code are not modified, either).

  Also, in the belt-and-suspenders spirit, I added a new sanity check in brinGetTupleForHeapBlock to raise an error if a TID found in the revmap is not marked as live by the page header. This causes it to react with "ERROR: corrupted BRIN index" to the bug at hand, rather than a hard crash.

  Backpatch to 9.5.

  Bug reported by Andreas Seltenreich as detected by his handy sqlsmith fuzzer.

  Discussion: https://www.postgresql.org/message-id/87mvni77jh.fsf@elite.ansel.ydns.eu

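  A minimal sketch of the boolean-to-bitmask refactoring described above. The flag names follow the commit message; the function bodies and arguments are placeholders, not the real bufpage.c code:

      #include <stdbool.h>
      #include <stdio.h>

      #define PAI_OVERWRITE        (1 << 0)
      #define PAI_IS_HEAP          (1 << 1)
      #define PAI_ALLOW_FAR_OFFSET (1 << 2)   /* new behavior needed by BRIN */

      /* Extended entry point: one flags word instead of several booleans. */
      static int
      page_add_item_extended(int flags)
      {
          if (flags & PAI_ALLOW_FAR_OFFSET)
              printf("allowing insertion beyond the first free line pointer\n");
          else
              printf("enforcing the usual line-pointer placement rule\n");
          return 0;
      }

      /* Old-style entry point keeps its signature for third-party callers. */
      static int
      page_add_item(bool overwrite, bool is_heap)
      {
          return page_add_item_extended((overwrite ? PAI_OVERWRITE : 0) |
                                        (is_heap ? PAI_IS_HEAP : 0));
      }

      int
      main(void)
      {
          page_add_item(true, false);                               /* legacy caller */
          page_add_item_extended(PAI_OVERWRITE | PAI_ALLOW_FAR_OFFSET); /* BRIN-style caller */
          return 0;
      }
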
* Fix DROP ACCESS METHOD IF EXISTS. (Tom Lane, 2016-05-27)
  The IF EXISTS option was documented, and implemented in the grammar, but it didn't actually work for lack of support in does_not_exist_skipping(). Per bug #14160.

  Report and patch by Kouhei Sutou

  Report: <20160527070433.19424.81712@wrigleys.postgresql.org>

* Be more predictable about reporting "lock timeout" vs "statement timeout". (Tom Lane, 2016-05-27)
  If both timeout indicators are set when we arrive at ProcessInterrupts, we've historically just reported "lock timeout". However, some buildfarm members have been observed to fail isolationtester's timeouts test by reporting "lock timeout" when the statement timeout was expected to fire first. The cause seems to be that the process is allowed to sleep longer than expected (probably due to heavy machine load) so that the lock timeout happens before we reach the point of reporting the error, and then this arbitrary tiebreak rule does the wrong thing. We can improve matters by comparing the scheduled timeout times to decide which error to report.

  I had originally proposed greatly reducing the 1-second window between the two timeouts in the test cases. On reflection that is a bad idea, at least for the case where the lock timeout is expected to fire first, because that would assume that it takes negligible time to get from statement start to the beginning of the lock wait. Thus, this patch doesn't completely remove the risk of test failures on slow machines. Empirically, however, the case this handles is the one we are seeing in the buildfarm. The explanation may be that the other case requires the scheduler to take the CPU away from a busy process, whereas the case fixed here only requires the scheduler to not give the CPU back right away to a process that has been woken from a multi-second sleep (and, perhaps, has been swapped out meanwhile).

  Back-patch to 9.3 where the isolationtester timeouts test was added.

  Discussion: <8693.1464314819@sss.pgh.pa.us>

* Disable physical tlist if any Var would need multiple sortgroupref labels. (Tom Lane, 2016-05-26)
  As part of upper planner pathification (commit 3fc6e2d7f5b652b4) I redid createplan.c's approach to the physical-tlist optimization, in which scan nodes are allowed to return exactly the underlying table's columns so as to save doing a projection step at runtime. The logic was intentionally more aggressive than before about applying the optimization, which is generally a good thing, but Andres Freund found a case in which it got too aggressive. Namely, if any column is referenced more than once in the parent plan node's sorting or grouping column list, we can't optimize because then that column would need to have more than one ressortgroupref label, and we only have space for one.

  Add logic to detect this situation in use_physical_tlist(), and also add some error checking in apply_pathtarget_labeling_to_tlist(), which this example proves was being overly cavalier about whether what it was doing made any sense.

  The added test case exposes the problem only because we do not eliminate duplicate grouping keys. That might be something to fix someday, but it doesn't seem like appropriate post-beta work.

  Report: <20160526021235.w4nq7k3gnheg7vit@alap3.anarazel.de>

* Remove option to write USING before opclass name in CREATE INDEX. (Tom Lane, 2016-05-25)
  Dating back to commit f10b63923, our grammar has allowed "USING" to optionally appear before an opclass name in CREATE INDEX (and, lately, some related places such as ON CONFLICT specifications). Nikolay Shaplov noticed that this syntax existed but wasn't documented, and proposed documenting it. But what seems like a better idea is to remove the production, thereby making the code match the docs not vice versa. This isn't our usual modus operandi for such cases, but there are a couple of good reasons to proceed this way:

    * So far as I can find, this syntax has never been documented anywhere. It isn't relied on by any of our own code or test cases, and there seems little reason to suppose that it's been used in the wild either.

    * Documenting it would mean that there would be two separate uses of USING in the CREATE INDEX syntax, the other being "USING access_method". That can lead to nothing but confusion.

  So, let's just remove it. On the off chance that somebody somewhere is using it, this isn't something to back-patch, but we can fix it in HEAD.

  Discussion: <1593237.l7oKHRpxSe@nataraj-amd64>

* Ensure that backends see up-to-date statistics for shared catalogs. (Tom Lane, 2016-05-25)
  Ever since we split the statistics collector's reports into per-database files (commit 187492b6c2e8cafc), backends have been seeing stale statistics for shared catalogs. This is because the inquiry message only prompts the collector to write the per-database file for the requesting backend's own database. Stats for shared catalogs are in a separate file for "DB 0", which didn't get updated.

  In normal operation this was partially masked by the fact that the autovacuum launcher would send an inquiry message at least once per autovacuum_naptime that asked for "DB 0"; so the shared-catalog stats would never be more than a minute out of date. However the problem becomes very obvious with autovacuum disabled, as reported by Peter Eisentraut.

  To fix, redefine the semantics of inquiry messages so that both the specified DB and DB 0 will be dumped. (This might seem a bit inefficient, but we have no good way to know whether a backend's transaction will look at shared-catalog stats, so we have to read both groups of stats whenever we request stats. Sending two inquiry messages would definitely not be better.)

  Back-patch to 9.3 where the bug was introduced.

  Report: <56AD41AC.1030509@gmx.net>

* Fetch XIDs atomically during vac_truncate_clog(). (Tom Lane, 2016-05-24)
  Because vac_update_datfrozenxid() updates datfrozenxid and datminmxid in-place, it's unsafe to assume that successive reads of those values will give consistent results. Fetch each one just once to ensure sane behavior in the minimum calculation.

  Noted while reviewing Alexander Korotkov's patch in the same area.

  Discussion: <8564.1464116473@sss.pgh.pa.us>

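  The pattern behind the fix can be illustrated standalone: read each possibly-changing value exactly once into a local variable and do every later test against that copy. A hypothetical sketch, not the vacuum.c code:

      #include <stdio.h>
      #include <stdint.h>

      /* Pretend these are datfrozenxid values another process updates in place. */
      static volatile uint32_t datfrozenxid[3] = {1200, 1100, 1300};

      int
      main(void)
      {
          uint32_t minxid = UINT32_MAX;

          for (int i = 0; i < 3; i++)
          {
              uint32_t xid = datfrozenxid[i];   /* fetch exactly once */

              /* Both the sanity test and the min computation use the local
               * copy, so a concurrent in-place update cannot make them see
               * two different values. */
              if (xid != 0 && xid < minxid)
                  minxid = xid;
          }
          printf("minimum datfrozenxid: %u\n", (unsigned) minxid);
          return 0;
      }
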
* Avoid consuming an XID during vac_truncate_clog(). (Tom Lane, 2016-05-24)
  vac_truncate_clog() uses its own transaction ID as the comparison point in a sanity check that no database's datfrozenxid has already wrapped around "into the future". That was probably fine when written, but in a lazy vacuum we won't have assigned an XID, so calling GetCurrentTransactionId() causes an XID to be assigned when otherwise one would not be. Most of the time that's not a big problem ... but if we are hard up against the wraparound limit, consuming XIDs during antiwraparound vacuums is a very bad thing.

  Instead, use ReadNewTransactionId(), which not only avoids this problem but is in itself a better comparison point to test whether wraparound has already occurred.

  Report and patch by Alexander Korotkov. Back-patch to all versions.

  Report: <CAPpHfdspOkmiQsxh-UZw2chM6dRMwXAJGEmmbmqYR=yvM7-s6A@mail.gmail.com>

* Fix range check for effective_io_concurrency. (Alvaro Herrera, 2016-05-24)
  Commit 1aba62ec moved the range check of that option from guc.c into bufmgr.c, but introduced a bug by changing a >= 0.0 to > 0.0, which made the value 0 no longer accepted. Put it back.

  Reported by Jeff Janes, diagnosed by Tom Lane

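  The shape of the bug is easy to show in isolation; a hypothetical check hook, not the actual bufmgr.c code, where a boundary test written with > instead of >= silently rejects the documented minimum:

      #include <stdbool.h>
      #include <stdio.h>

      /* effective_io_concurrency = 0 is a valid setting (it disables prefetching). */
      static bool
      check_io_concurrency_ok(double newval)
      {
          return newval >= 0.0;   /* the bug was using "> 0.0", which rejected 0 */
      }

      int
      main(void)
      {
          printf("0 accepted: %s\n", check_io_concurrency_ok(0.0) ? "yes" : "no");
          return 0;
      }
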
* Fix BTREE_BUILD_STATS build. (Tom Lane, 2016-05-23)
  Commit 65c5fcd353a859da9e61bfb2b92a99f12937de3b broke this by removing a header include directive that is conditionally required. Add that back to nbtree.c, with annotation to keep pgrminclude from re-breaking it.

  Peter Geoghegan

  Report: <CAM3SWZTNjHFYW_UG8bu0BnogqQ2HfsTgkzXLueuUhfTcYbu5HA@mail.gmail.com>

* Support IndexElem in raw_expression_tree_walker(). (Tom Lane, 2016-05-23)
  Needed for cases in which INSERT ... ON CONFLICT appears inside a recursive CTE item. Per bug #14153 from Thomas Alton.

  Patch by Peter Geoghegan, slightly adjusted by me

  Report: <20160521232802.22598.13537@wrigleys.postgresql.org>

* Add support for more extensive testing of raw_expression_tree_walker(). (Tom Lane, 2016-05-23)
  If RAW_EXPRESSION_COVERAGE_TEST is defined, do a no-op tree walk over every basic DML statement submitted to parse analysis. If we'd had this in place earlier, bug #14153 would have been caught by buildfarm testing. The difficulty is that raw_expression_tree_walker() is only used in limited cases involving CTEs (particularly recursive ones), so it's very easy for an oversight in it to not be noticed during testing of a seemingly-unrelated feature.

  The type of error we can expect to catch with this is complete omission of a node type from raw_expression_tree_walker(), and perhaps also recursion into a field that doesn't contain a node tree, though that would be an unlikely mistake. It won't catch failure to add new fields that need to be recursed into, unfortunately.

  I'll go enable this on one or two of my own buildfarm animals once bug #14153 is dealt with.

  Discussion: <27861.1464040417@sss.pgh.pa.us>

* Fix latent crash in do_text_output_multiline(). (Tom Lane, 2016-05-23)
  do_text_output_multiline() would fail (typically with a null pointer dereference crash) if its input string did not end with a newline. Such cases do not arise in our current sources; but it certainly could happen in future, or in extension code's usage of the function, so we should fix it. To fix, replace "eol += len" with "eol = text + len".

  While at it, make two cosmetic improvements: mark the input string const, and rename the argument from "text" to "txt" to dodge pgindent strangeness (since "text" is a typedef name).

  Even though this problem is only latent at present, it seems like a good idea to back-patch the fix, since it's a very simple/safe patch and it's not out of the realm of possibility that we might in future back-patch something that expects sane behavior from do_text_output_multiline().

  Per report from Hao Lee.

  Report: <CAGoxFiFPAGyPAJLcFxTB5cGhTW2yOVBDYeqDugYwV4dEd1L_Ag@mail.gmail.com>

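  The idea of the fix, sketched with a hypothetical standalone function rather than the real executor code: when no newline is found, point the end-of-line marker at the terminating NUL instead of advancing a stale pointer, so the final unterminated line is still emitted safely.

      #include <stdio.h>
      #include <string.h>

      static void
      output_multiline(const char *txt)
      {
          while (*txt)
          {
              const char *eol = strchr(txt, '\n');
              size_t      len = eol ? (size_t) (eol - txt) : strlen(txt);

              if (!eol)
                  eol = txt + len;        /* no trailing newline: stop at the NUL */
              printf("line: %.*s\n", (int) len, txt);
              txt = (*eol == '\n') ? eol + 1 : eol;
          }
      }

      int
      main(void)
      {
          output_multiline("first\nsecond");   /* note: no final newline */
          return 0;
      }
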
* Allocate all page images at once in generic wal interface. (Teodor Sigaev, 2016-05-17)
  That reduces the number of allocations.

  Per gripe from Michael Paquier and a suggestion from Tom Lane.

* Correctly align page's images in generic wal API. (Teodor Sigaev, 2016-05-17)
  The page image should be MAXALIGN'ed because existing code could directly align pointers in the page instead of aligning the offset from the beginning of the page.

  Found during play with indexes as an extension, by Alexander Korotkov and me.

* sql_features: Fix typos. (Peter Eisentraut, 2016-05-13)
  This makes the feature names match the SQL standard.

  From: Alexander Law <exclusion@gmail.com>

* Fix bogus comments. (Alvaro Herrera, 2016-05-12)
  Some comments mentioned XLogReplayBuffer, but there's no such function: that was an interim name for a function that got renamed to XLogReadBufferForRedo, before commit 2c03216d831160 was pushed.

* Fix obsolete comment. (Alvaro Herrera, 2016-05-12)

* Fix infer_arbiter_indexes() to not barf on system columns. (Tom Lane, 2016-05-11)
  While it could be argued that rejecting system column mentions in the ON CONFLICT list is an unsupported feature, falling over altogether just because the table has a unique index on OID is indubitably a bug.

  As far as I can tell, fixing infer_arbiter_indexes() is sufficient to make ON CONFLICT (oid) actually work, though making a regression test for that case is problematic because of the impossibility of setting the OID counter to a known value.

  Minor cosmetic cleanups along with the bug fix.

* Fix assorted missing infrastructure for ON CONFLICT. (Tom Lane, 2016-05-11)
  subquery_planner() failed to apply expression preprocessing to the arbiterElems and arbiterWhere fields of an OnConflictExpr. No doubt the theory was that this wasn't necessary because we don't actually try to execute those expressions; but that's wrong, because it results in failure to match to index expressions or index predicates that are changed at all by preprocessing. Per bug #14132 from Reynold Smith.

  Also add pullup_replace_vars processing for onConflictWhere. Perhaps it's impossible to have a subquery reference there, but I'm not exactly convinced; and even if true today it's a failure waiting to happen.

  Also add some comments to other places where one or another field of OnConflictExpr is intentionally ignored, with explanation as to why it's okay to do so.

  Also, catalog/dependency.c failed to record any dependency on the named constraint in ON CONFLICT ON CONSTRAINT, allowing such a constraint to be dropped while rules exist that depend on it, and allowing pg_dump to dump such a rule before the constraint it refers to. The normal execution path managed to error out reasonably for a dangling constraint reference, but ruleutils.c dumped core; so in addition to fixing the omission, add a protective check in ruleutils.c, since we can't retroactively add a dependency in existing databases.

  Back-patch to 9.5 where this code was introduced.

  Report: <20160510190350.2608.48667@wrigleys.postgresql.org>

* Fix autovacuum for shared relations. (Alvaro Herrera, 2016-05-10)
  The table-skipping logic in autovacuum would fail to consider that multiple workers could be processing the same shared catalog in different databases. This normally wouldn't be a problem: firstly because autovacuum workers not for wraparound would simply ignore tables in which they cannot acquire lock, and secondly because most of the time these tables are small enough that even if multiple for-wraparound workers are stuck in the same catalog, they would be over pretty quickly. But in cases where the catalogs are severely bloated it could become a problem.

  Backpatch all the way back, because the problem has been there since the beginning.

  Reported by Ondřej Světlík

  Discussion: https://www.postgresql.org/message-id/572B63B1.3030603%40flexibee.eu
              https://www.postgresql.org/message-id/572A1072.5080308%40flexibee.eu

* Translation updates. (Peter Eisentraut, 2016-05-09)
  Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
  Source-Git-Hash: 17bf3e8564abf600274789fcc90e72532d5e7c05

* Fix poorly-worded log message. (Tom Lane, 2016-05-08)
  Euler Taveira

* Fix pg_upgrade to not fail when new-cluster TOAST rules differ from old. (Tom Lane, 2016-05-06)
  This patch essentially reverts commit 4c6780fd17aa43ed, in favor of a much simpler solution for the case where the new cluster would choose to create a TOAST table but the old cluster doesn't have one: just don't create a TOAST table.

  The existing code failed in at least two different ways if the situation arose: (1) ALTER TABLE RESET didn't grab an exclusive lock, so that the lock sanity check in create_toast_table failed; (2) pg_upgrade did not provide a pg_type OID for the new toast table, so that the crosscheck in TypeCreate failed. While both these problems were introduced by later patches, they show that the hack being used to cause TOAST table creation is overwhelmingly fragile (and untested). I also note that before the TypeCreate crosscheck was added, the code would have resulted in assigning an indeterminate pg_type OID to the toast table, possibly causing a later OID conflict in that catalog; so that it didn't really work even when committed.

  If we simply don't create a TOAST table, there will only be a problem if the code tries to store a tuple that's wider than a page, and field compression isn't sufficient to get it under a page. Given that the TOAST creation threshold is intended to be about a quarter of a page, it's very hard to believe that cross-version differences in the do-we-need-a-toast-table heuristic could result in an observable problem. So let's just follow the old version's conclusion about whether a TOAST table is needed. (If we ever do change needs_toast_table() so much that this conclusion doesn't apply, we can devise a solution at that time, and hopefully do it in a less klugy way than 4c6780fd17aa43ed did.)

  Back-patch to 9.3, like the previous patch.

  Discussion: <8110.1462291671@sss.pgh.pa.us>

* Mitigate "snapshot too old" performance regression on NUMAKevin Grittner2016-05-06
| | | | | | | | | | | | Limit maintenance of time to xid mapping to once per minute. At least in the tested case this brings performance within 5% of when the feature is off, compared to several times slower without this patch. While there, fix comments and whitespace. Ants Aasma, with cosmetic adjustments suggested by Andres Freund Reviewed by Kevin Grittner and Andres Freund
* Minimal fix for crash bug in quals_match_foreign_key. (Robert Haas, 2016-05-06)
  Discussion is still underway as to whether to revert the entire patch that added this function, but that discussion may not conclude before beta1. So, in the meantime, let's do at least this much.

  David Rowley

* Limit maximum parallel degree to 1024. (Robert Haas, 2016-05-06)
  This new limit affects both the max_parallel_degree GUC and the parallel_degree reloption. There may some day be a use case for using more than 1024 CPUs for a single query, but that's surely not the case right now. Not only do very few people have that many CPUs, but the code hasn't been tested at that kind of scale and is very unlikely to perform well, or even work at all, without a lot more work. The issue addressed by commit 06bd458cb812623c3f1fdd55216c4c08b06a8447 is probably just one problem of many.

  The idea of a more reasonable limit here was suggested by Tom Lane; the value of 1024 was suggested by Amit Kapila.

* Use mul_size when multiplying by the number of parallel workers. (Robert Haas, 2016-05-06)
  That way, if the result overflows size_t, you'll get an error instead of undefined behavior, which seems like a plus. This also has the effect of casting the number of workers from int to Size, which is better because it's easier to overflow int than size_t.

  Dilip Kumar reported this issue and provided a patch upon which this patch is based, but his version did use mul_size.

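  mul_size() is a PostgreSQL shared-memory sizing helper; the overflow check it performs can be sketched standalone roughly like this (the real function reports the error through ereport() rather than exiting):

      #include <stdio.h>
      #include <stdlib.h>

      /* Simplified stand-in for mul_size(): multiply two size_t values and
       * fail loudly on overflow instead of silently wrapping around. */
      static size_t
      mul_size(size_t s1, size_t s2)
      {
          size_t result = s1 * s2;

          if (s1 != 0 && result / s1 != s2)
          {
              fprintf(stderr, "requested shared memory size overflows size_t\n");
              exit(1);
          }
          return result;
      }

      int
      main(void)
      {
          int    nworkers = 4;            /* the int count gets widened to size_t here */
          size_t per_worker_queue = 65536;

          printf("total: %zu bytes\n", mul_size((size_t) nworkers, per_worker_queue));
          return 0;
      }
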
* Remove various special checks around default roles. (Stephen Frost, 2016-05-06)
  Default roles really should be like regular roles, for the most part. This removes a number of checks that were trying to make default roles extra special by not allowing them to be used as regular roles.

  We still prevent users from creating roles in the "pg_" namespace or from altering roles which exist in that namespace via ALTER ROLE, as we can't preserve such changes, but otherwise the roles are very much like regular roles.

  Based on discussion with Robert and Tom.

* Fix possible read past end of string in to_timestamp(). (Tom Lane, 2016-05-06)
  to_timestamp() handles the TH/th format codes by advancing over two input characters, whatever those are. It failed to notice whether there were two characters available to be skipped, making it possible to advance the pointer past the end of the input string and keep on parsing. A similar risk existed in the handling of "Y,YYY" format: it would advance over three characters after the "," whether or not three characters were available.

  In principle this might be exploitable to disclose contents of server memory. But the security team concluded that it would be very hard to use that way, because the parsing loop would stop upon hitting any zero byte, and TH/th format codes can't be consecutive --- they have to follow some other format code, which would have to match whatever data is there. So it seems impractical to examine memory very much beyond the end of the input string via this bug; and the input string will always be in local memory not in disk buffers, making it unlikely that anything very interesting is close to it in a predictable way. So this doesn't quite rise to the level of needing a CVE.

  Thanks to Wolf Roediger for reporting this bug.

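  A minimal sketch of the kind of bounds check involved (a hypothetical helper, not the actual formatting.c code): only skip the two characters a TH/th suffix occupies if they are really present.

      #include <stdio.h>
      #include <string.h>

      static const char *
      skip_th_suffix(const char *s)
      {
          size_t remaining = strlen(s);

          if (remaining >= 2)
              return s + 2;          /* safe: both characters exist */
          return s + remaining;      /* truncated input: stop at the NUL instead */
      }

      int
      main(void)
      {
          printf("left after skip: \"%s\"\n", skip_th_suffix("th 2016"));
          printf("left after skip: \"%s\"\n", skip_th_suffix("t"));   /* truncated input */
          return 0;
      }
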
* Fix hash index vs "snapshot too old" problems. (Kevin Grittner, 2016-05-06)
  Hash indexes are not WAL-logged, and so do not maintain the LSN of index pages. Since the "snapshot too old" feature counts on detecting error conditions using the LSN of a table and all indexes on it, this makes it impossible to safely do early vacuuming on any table with a hash index, so add this to the tests for whether the xid used to vacuum a table can be adjusted based on old_snapshot_threshold.

  While at it, add a paragraph to the docs for old_snapshot_threshold which specifically mentions this and other aspects of the feature which may otherwise surprise users.

  Problem reported and patch reviewed by Amit Kapila

* Rename tsvector delete() to ts_delete(), and filter() to ts_filter(). (Tom Lane, 2016-05-05)
  The similarity of the original names to SQL keywords seems like a bad idea. Rename them before we're stuck with 'em forever.

  In passing, minor code and docs cleanup.

  Discussion: <4875.1462210058@sss.pgh.pa.us>

* Fix corner-case loss of precision in numeric pow() calculation. (Dean Rasheed, 2016-05-05)
  Commit 7d9a4737c268f61fb8800957631f12d3f13be218 greatly improved the accuracy of the numeric transcendental functions, however it failed to consider the case where the result from pow() is close to the overflow threshold, for example 0.12 ^ -2345.6. For such inputs, where the result has more than 2000 digits before the decimal point, the decimal result weight estimate was being clamped to 2000, leading to a loss of precision in the final calculation.

  Fix this by replacing the clamping code with an overflow test that aborts the calculation early if the final result is sure to overflow, based on the overflow limit in exp_var(). This provides the same protection against integer overflow in the subsequent result scale computation as the original clamping code, but it also ensures that precision is never lost and saves compute cycles in cases that are sure to overflow.

  The new early overflow test works with the initial low-precision result (expected to be accurate to around 8 significant digits) and includes a small fuzz factor to ensure that it doesn't kick in for values that would not overflow exp_var(), so the overall overflow threshold of pow() is unchanged and consistent for all inputs with non-integer exponents.

  Author: Dean Rasheed
  Reviewed-by: Tom Lane
  Discussion: http://www.postgresql.org/message-id/CAEZATCUj3U-cQj0jjoia=qgs0SjE3auroxh8swvNKvZWUqegrg@mail.gmail.com
  See-also: http://www.postgresql.org/message-id/CAEZATCV7w+8iB=07dJ8Q0zihXQT1semcQuTeK+4_rogC_zq5Hw@mail.gmail.com

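  The result-weight estimate mentioned above can be checked with a quick back-of-envelope computation; for 0.12 ^ -2345.6 it comes out around 2160 digits before the decimal point, which is why the old 2000-digit clamp lost precision:

      /* Rough digit-count estimate: digits before the decimal point in
       * base^exponent is about floor(exponent * log10(base)) + 1. */
      #include <math.h>
      #include <stdio.h>

      int
      main(void)
      {
          double base = 0.12;
          double exponent = -2345.6;
          double digits_before_point = floor(exponent * log10(base)) + 1;

          printf("approx. digits before the decimal point: %.0f\n", digits_before_point);
          return 0;
      }
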
* Revert timeline following in replication slots. (Alvaro Herrera, 2016-05-04)
  This reverts commits f07d18b6e94d, 82c83b337202, 3a3b309041b0, and 24c5f1a103ce.

  This feature has shown enough immaturity that it was deemed better to rip it out before rushing some more fixes at the last minute. There are discussions on larger changes in this area for the next release.

* Fix crash of filter(tsvector). (Teodor Sigaev, 2016-05-04)
  The variable storing a lexeme's position had the wrong type: char, which is obviously not enough to store 2^14 possible positions.

  Stas Kelvich

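  A tiny standalone demonstration of why char is the wrong type here (illustrative only; the fixed code uses a wider type):

      #include <stdio.h>
      #include <stdint.h>

      int
      main(void)
      {
          int      pos = 5000;              /* a perfectly legal lexeme position */
          char     as_char = (char) pos;    /* the buggy declaration: value gets mangled */
          uint16_t as_u16 = (uint16_t) pos; /* a type wide enough for 0..16383 */

          printf("stored in char:   %d\n", (int) as_char);
          printf("stored in uint16: %u\n", (unsigned) as_u16);
          return 0;
      }
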
* Fix transient mdsync() errors of truncated relations due to 72a98a6395. (Andres Freund, 2016-05-04)
  Unfortunately the segment size checks from 72a98a6395 had the negative side-effect of breaking a corner case in mdsync(): When processing a fsync request for a truncated away segment mdsync() could fail with "could not fsync file" (if previous segment < RELSEG_SIZE) because _mdfd_getseg() now wouldn't return the relevant segment anymore.

  The cleanest fix seems to be to allow the caller of _mdfd_getseg() to specify whether checks for RELSEG_SIZE are performed. To allow doing so, change the ExtensionBehavior enum into a bitmask. Besides allowing for the addition of EXTENSION_DONT_CHECK_SIZE, this makes for a nicer implementation of EXTENSION_REALLY_RETURN_NULL.

  Besides mdsync() the only callsite that should change behaviour due to this is mdprefetch() which now doesn't create segments anymore, even in recovery. Given the uses of mdprefetch() that seems better.

  Reported-By: Thom Brown
  Discussion: CAA-aLv72QazLvPdKZYpVn4a_Eh+i4_cxuB03k+iCuZM_xjc+6Q@mail.gmail.com

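  A sketch of the enum-to-bitmask change described above, with flag names taken from the commit message and an illustrative function body rather than the real md.c code: once the values are distinct bits, an orthogonal option can be OR'd onto any base behavior.

      #include <stdbool.h>
      #include <stdio.h>

      typedef int ExtensionBehavior;                 /* formerly a plain enum */

      #define EXTENSION_FAIL               (1 << 0)
      #define EXTENSION_RETURN_NULL        (1 << 1)
      #define EXTENSION_REALLY_RETURN_NULL (1 << 2)
      #define EXTENSION_CREATE             (1 << 3)
      #define EXTENSION_DONT_CHECK_SIZE    (1 << 4)  /* the new, combinable flag */

      static void
      getseg(ExtensionBehavior behavior)
      {
          bool check_size = (behavior & EXTENSION_DONT_CHECK_SIZE) == 0;

          printf("RELSEG_SIZE check %s\n", check_size ? "performed" : "skipped");
      }

      int
      main(void)
      {
          getseg(EXTENSION_RETURN_NULL);                             /* old-style call */
          getseg(EXTENSION_RETURN_NULL | EXTENSION_DONT_CHECK_SIZE); /* mdsync()-style call */
          return 0;
      }
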
* Fix more things to be parallel-safe. (Robert Haas, 2016-05-03)
  Conversion functions were previously marked as parallel-unsafe, since that is the default, but in fact they are safe. Parallel-safe functions defined in pg_proc.h and redefined in system_views.sql were ending up as parallel-unsafe because the redeclarations were not marked PARALLEL SAFE. While editing system_views.sql, mark ts_debug() parallel safe also.

  Andreas Karlsson

* Tweak a few more things in preparation for upcoming pgindent run. (Robert Haas, 2016-05-03)
  These adjustments change code and comments in minor ways to prevent pgindent from mangling them. Among other things, I tried to avoid situations where pgindent would emit "a +b" instead of "a + b", and I tried to avoid having it break up inline comments across multiple lines.

* Note that max_worker_processes requires restart. (Robert Haas, 2016-05-03)
  Since this is a minor issue, no back-patch.

  Julien Rouhaud

* Fix code comments regarding logical decoding. (Alvaro Herrera, 2016-05-02)
  Back in 3b02ea4f0780 I added some comments in various places to explain how logical decoding and other things worked. Not all of the changes were welcome, because they were misleading or wrong. This changes them a little bit to make them more accurate. Some other comments are also changed to be more accurate. Also, fix a bunch of typos.

  Author: Álvaro Herrera, Craig Ringer

  Andres Freund reviewed some parts of this.

* Fix parallel safety markings for pg_start_backup. (Robert Haas, 2016-05-02)
  Commit 7117685461af50f50c03f43e6a622284c8d54694 made pg_start_backup parallel-restricted rather than parallel-safe, because it now relies on backend-private state that won't be synchronized with the parallel worker. However, it didn't update pg_proc.h.

  Separately, Andreas Karlsson observed that system_views.sql neglected to reiterate the parallel-safety markings when redefining various functions, including this one; so add a PARALLEL RESTRICTED declaration there to match the new value in pg_proc.h.
