aboutsummaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAge
* Add a few comments about ANALYZE's strategy for collecting MCVs.Tom Lane2016-04-04
| | | | | | | Alex Shulgin complained that the underlying strategy wasn't all that apparent, particularly not the fact that we intentionally have two code paths depending on whether we think the column has a limited set of possible values or not. Try to make it clearer.
* Partially revert commit 3d3bf62f30200500637b24fdb7b992a99f9704c3.Tom Lane2016-04-04
| | | | | | | | | | | | | | On reflection, the pre-existing logic in ANALYZE is specifically meant to compare the frequency of a candidate MCV against the estimated frequency of a random distinct value across the whole table. The change to compare it against the average frequency of values actually seen in the sample doesn't seem very principled, and if anything it would make us less likely not more likely to consider a value an MCV. So revert that, but keep the aspect of considering only nonnull values, which definitely is correct. In passing, rename the local variables in these stanzas to "ndistinct_table", to avoid confusion with the "ndistinct" that appears at an outer scope in compute_scalar_stats.
* Silence compiler warningAlvaro Herrera2016-04-04
| | | | Reported by Peter Eisentraut to occur on 32bit systems
* Add a \gexec command to psql for evaluation of computed queries.Tom Lane2016-04-04
| | | | | | | | | | | | | | | | | | | | | | | | | \gexec executes the just-entered query, like \g, but instead of printing the results it takes each field as a SQL command to send to the server. Computing a series of queries to be executed is a fairly common thing, but up to now you always had to resort to kluges like writing the queries to a file and then inputting the file. Now it can be done with no intermediate step. The implementation is fairly straightforward except for its interaction with FETCH_COUNT. ExecQueryUsingCursor isn't capable of being called recursively, and even if it were, its need to create a transaction block interferes unpleasantly with the desired behavior of \gexec after a failure of a generated query (i.e., that it can continue). Therefore, disable use of ExecQueryUsingCursor when doing the master \gexec query. We can still apply it to individual generated queries, however, and there might be some value in doing so. While testing this feature's interaction with single-step mode, I (tgl) was led to conclude that SendQuery needs to recognize SIGINT (cancel_pressed) as a negative response to the single-step prompt. Perhaps that's a back-patchable bug fix, but for now I just included it here. Corey Huinker, reviewed by Jim Nasby, Daniel Vérité, and myself
* Introduce a LOG_SERVER_ONLY ereport level, which is never sent to client.Tom Lane2016-04-04
| | | | | | | | | | | | | | | | This elevel is useful for logging audit messages and similar information that should not be passed to the client. It's equivalent to LOG in terms of decisions about logging priority in the postmaster log, but messages with this elevel will never be sent to the client. In the current implementation, it's just an alias for the longstanding COMMERROR elevel (or more accurately, we've made COMMERROR an alias for this). At some point it might be interesting to allow a LOG_ONLY flag to be attached to any elevel, but that would be considerably more complicated, and it's not clear there's enough use-cases to justify the extra work. For now, let's just take the easy 90% solution. David Steele, reviewed by Fabien Coelho, Petr Jelínek, and myself
* Fix latent portability issue in pgwin32_dispatch_queued_signals().Tom Lane2016-04-04
| | | | | | | | | | | | | | | | The first iteration of the signal-checking loop would compute sigmask(0) which expands to 1<<(-1) which is undefined behavior according to the C standard. The lack of field reports of trouble suggest that it evaluates to 0 on all existing Windows compilers, but that's hardly something to rely on. Since signal 0 isn't a queueable signal anyway, we can just make the loop iterate from 1 instead, and save a few cycles as well as avoiding the undefined behavior. In passing, avoid evaluating the volatile expression UNBLOCKED_SIGNAL_QUEUE twice in a row; there's no reason to waste cycles like that. Noted by Aleksander Alekseev, though this isn't his proposed fix. Back-patch to all supported branches.
* Fix typoTeodor Sigaev2016-04-04
| | | | Michael Paquier
* fix typoTeodor Sigaev2016-04-04
| | | | Andreas Ulbrich
* Improve estimate of distinct values in estimate_num_groups().Dean Rasheed2016-04-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When adjusting the estimate for the number of distinct values from a rel in a grouped query to take into account the selectivity of the rel's restrictions, use a formula that is less likely to produce under-estimates. The old formula simply multiplied the number of distinct values in the rel by the restriction selectivity, which would be correct if the restrictions were fully correlated with the grouping expressions, but can produce significant under-estimates in cases where they are not well correlated. The new formula is based on the random selection probability, and so assumes that the restrictions are not correlated with the grouping expressions. This is guaranteed to produce larger estimates, and of course risks over-estimating in cases where the restrictions are correlated, but that has less severe consequences than under-estimating, which might lead to a HashAgg that consumes an excessive amount of memory. This could possibly be improved upon in the future by identifying correlated restrictions and using a hybrid of the old and new formulae. Author: Tomas Vondra, with some hacking be me Reviewed-by: Mark Dilger, Alexander Korotkov, Dean Rasheed and Tom Lane Discussion: http://www.postgresql.org/message-id/flat/56CD0381.5060502@2ndquadrant.com
* Avoid archiving XLOG_RUNNING_XACTS on idle serverSimon Riggs2016-04-04
| | | | | | | | | | If archive_timeout > 0 we should avoid logging XLOG_RUNNING_XACTS if idle. Bug 13685 reported by Laurence Rowe, investigated in detail by Michael Paquier, though this is not his proposed fix. 20151016203031.3019.72930@wrigleys.postgresql.org Simple non-invasive patch to allow later backpatch to 9.4 and 9.5
* Clean up dubious code in contrib/seg.Tom Lane2016-04-03
| | | | | | | | | | | | | | | | | | | | | The restore() function assumed that the result of sprintf() with %e format would necessarily contain an 'e', which is false: what if the supplied number is an infinity or NaN? If that did happen, we'd get a null-pointer-dereference core dump. The case appears impossible currently, because seg_in() does not accept such values, and there are no seg-creating functions that would create one. But it seems unwise to rely on it never happening in future. Quite aside from that, the code was pretty ugly: it relied on modifying a static format string when it could use a "*" precision argument, and it used strtok() entirely gratuitously, and it stripped off trailing spaces by hand instead of just not asking for them to begin with. Coverity noticed the potential null pointer dereference (though I wonder why it didn't complain years ago, since this code is ancient). Since this is just code cleanup and forestalling a hypothetical future bug, there seems no need for back-patching.
* Fix contrib/bloom to not fail under CLOBBER_CACHE_ALWAYS.Tom Lane2016-04-03
| | | | | | The code was supposing that rd_amcache wouldn't disappear from under it during a scan; which is wrong. Copy the data out of the relcache rather than trying to reference it there.
* Clean up some stuff in new contrib/bloom module.Tom Lane2016-04-03
| | | | | | | | | | | | | | | | | | | | | | | | | | Coverity complained about implicit sign-extension in the BloomPageGetFreeSpace macro, probably because sizeOfBloomTuple isn't wide enough for size calculations. No overflow is really possible as long as maxoff and sizeOfBloomTuple are small enough to represent a realistic situation, but it seems like a good idea to declare sizeOfBloomTuple as Size not int32. Add missing check on BloomPageAddItem() result, again from Coverity. Avoid core dump due to not allocating so->sign array when scan->numberOfKeys is zero. Also thanks to Coverity. Use FLEXIBLE_ARRAY_MEMBER rather than declaring an array as size 1 when it isn't necessarily. Very minor beautification of related code. Unfortunately, none of the Coverity-detected mistakes look like they could account for the remaining buildfarm unhappiness with this module. It's barely possible that the FLEXIBLE_ARRAY_MEMBER mistake does account for that, if it's enabling bogus compiler optimizations; but I'm not terribly optimistic. We probably still have bugs to find here.
* Avoid pin scan for replay of XLOG_BTREE_VACUUM in all casesSimon Riggs2016-04-03
| | | | | | | | | | | | | | | | Replay of XLOG_BTREE_VACUUM during Hot Standby was previously thought to require complex interlocking that matched the requirements on the master. This required an O(N) operation that became a significant problem with large indexes, causing replication delays of seconds or in some cases minutes while the XLOG_BTREE_VACUUM was replayed. This commit skips the pin scan that was previously required, by observing in detail when and how it is safe to do so, with full documentation. The pin scan is skipped only in replay; the VACUUM code path on master is not touched here and WAL is identical. The current commit applies in all cases, effectively replacing commit 687f2cd7a0150647794efe432ae0397cb41b60ff.
* Add psql \errverbose command to see last server error at full verbosity.Tom Lane2016-04-03
| | | | | | | | | | | | | | | | | | Often, upon getting an unexpected error in psql, one's first wish is that the verbosity setting had been higher; for example, to be able to see the schema-name field or the server code location info. Up to now the only way has been to adjust the VERBOSITY variable and repeat the failing query. That's a pain, and it doesn't work if the error isn't reproducible. This commit adds a psql feature that redisplays the most recent server error at full verbosity, without needing to make any variable changes or re-execute the failed command. We just need to hang onto the latest error PGresult in case the user executes \errverbose, and then apply libpq's new PQresultVerboseErrorMessage() function to it. This will consume some trivial amount of psql memory, but otherwise the cost when the feature isn't used should be negligible. Alex Shulgin, reviewed by Daniel Vérité, some improvements by me
* Add libpq support for recreating an error message with different verbosity.Tom Lane2016-04-03
| | | | | | | | | | | | | | | | | | | | | Often, upon getting an unexpected error in psql, one's first wish is that the verbosity setting had been higher; for example, to be able to see the schema-name field or the server code location info. Up to now the only way has been to adjust the VERBOSITY variable and repeat the failing query. That's a pain, and it doesn't work if the error isn't reproducible. This commit adds support in libpq for regenerating the error message for an existing error PGresult at any desired verbosity level. This is almost just a matter of refactoring the existing code into a subroutine, but there is one bit of possibly-needed information that was not getting put into PGresults: the text of the last query sent to the server. We must add that string to the contents of an error PGresult. But we only need to save it if it might be used, which with the existing error-formatting code only happens if there is a PG_DIAG_STATEMENT_POSITION error field, which is probably pretty rare for errors in production situations. So really the overhead when the feature isn't used should be negligible. Alex Shulgin, reviewed by Daniel Vérité, some improvements by me
* Add missing "static".Tom Lane2016-04-02
| | | | Per buildfarm member pademelon.
* Make all the declarations of WaitEventSetWaitBlock be marked "inline".Tom Lane2016-04-02
| | | | | The inconsistency here triggered compiler warnings on some buildfarm members, and it's surely pretty pointless.
* Suppress compiler warning.Tom Lane2016-04-02
| | | | | | Some buildfarm members are showing "comparison is always false due to limited range of data type" complaints on this test, so #ifdef it out on machines with 32-bit int.
* Fix condition in e9e441c9fac6cbc0510cded6abb9d0e6b646ecafTeodor Sigaev2016-04-02
| | | | Comment is right, but if - not.
* Fix typo in pg_regress.cStephen Frost2016-04-02
| | | | | | s/afer/after Pointed out by Andreas 'ads' Scherbaum
* Prevent mark as deleted and as 'has free space' page in bloom moduleTeodor Sigaev2016-04-02
| | | | | Vacuum might put page into list of pages with some free space and mark as deleted at the same time.
* Fixes in bloom contrib moduleTeodor Sigaev2016-04-02
| | | | | | | | Looking at result of buildfarm member jaguarundi it seems to me that BloomOptions isn't inited sometime, but I don't see yet how it's possible. Nevertheless, check of signature length's is missed, so, add a limit of it. Also add missed GenericXLogAbort() in case of already deleted page in vacuum + minor code refactoring.
* Refer to a TOKEN_USER payload as a "token user," not as a "user token".Noah Misch2016-04-01
| | | | | This corrects messages for can't-happen errors. The corresponding "user token" appears in the HANDLE argument of GetTokenInformation().
* Copyedit comments and documentation.Noah Misch2016-04-01
|
* test_slot_timelines: Fix alternate expected outputAlvaro Herrera2016-04-01
|
* Omit null rows when setting the threshold for what's a most-common value.Tom Lane2016-04-01
| | | | | | | | | | | | | | | | | | | As with the previous patch, large numbers of null rows could skew this calculation unfavorably, causing us to discard values that have a legitimate claim to be MCVs, since our definition of MCV is that it's most common among the non-null population of the column. Hence, make the numerator of avgcount be the number of non-null sample values not the number of sample rows; likewise for maxmincount in the compute_scalar_stats variant. Also, make the denominator be the number of distinct values actually observed in the sample, rather than reversing it back out of the computed stadistinct. This avoids depending on the accuracy of the Haas-Stokes approximation, and really it's what we want anyway; the threshold should depend only on what we see in the sample, not on what we extrapolate about the contents of the whole column. Alex Shulgin, reviewed by Tomas Vondra and myself
* pgbench: Remove unused parameterAlvaro Herrera2016-04-01
| | | | | | | For some reason this parameter was introduced as unused in 3da0dfb4b146, and has never been used for anything. Remove it. Author: Fabien Coelho
* Omit null rows when applying the Haas-Stokes estimator for ndistinct.Tom Lane2016-04-01
| | | | | | | | | | | | | | | | | | | | Previously, we included null rows in the values of n and N that went into the formula, which amounts to considering null as a value in its own right; but the d and f1 values do not include nulls. This is inconsistent, and it contributes to significant underestimation of ndistinct when the column is mostly nulls. In any case stadistinct is defined as the number of distinct non-null values, so we should exclude nulls when doing this computation. This is an aboriginal bug in our application of the Haas-Stokes formula, but we'll refrain from back-patching for fear of destabilizing plan choices in released branches. While at it, make the code a bit more readable by omitting unnecessary casts and intermediate variables. Observation and original patch by Tomas Vondra, adjusted to fix both uses of the formula by Alex Shulgin, cosmetic improvements by me
* Fix logical_decoding_timelines test crashesAlvaro Herrera2016-04-01
| | | | | | | | | | | | | | | | | | | In the test_slot_timelines test module, we were abusing passing NULL values which was received as zeroes in x86, but this breaks in ARM (buildfarm member hamster) by crashing instead. Fix the breakage by marking these functions as STRICT; the InvalidXid value that was previously implicit in NULL values (on x86 at least) can now be passed as 0. Failing to follow the fmgr protocol to check for NULLs beforehand was causing ARM to fail, as evidenced by segmentation faults in buildfarm member hamster. In order to use the new functionality in the test script, use COALESCE in the right spot to avoid forwarding NULL values. This was diagnosed from the hamster crash by Craig Ringer, who also proposed a different patch (checking for NULL values explicitely in the C function code, and keeping the non-strictness in the C functions). I decided to go with this approach instead.
* Fixes in bloom contrib module missed during reviewTeodor Sigaev2016-04-01
| | | | | - macroses llike (var & FLAG) are changed to ((var & FLAG) != 0) - do not copy uninitialized part of notFullPage array to page
* Type names should not be quotedAlvaro Herrera2016-04-01
| | | | | | | | | Our actual convention, contrary to what I said in 59a2111b23f, is not to quote type names, as evidenced by unquoted use of format_type_be() result value in error messages. Remove quotes from recently tweaked messages accordingly. Per note from Tom Lane
* Get rid of minus zero in box regression test.Tom Lane2016-04-01
| | | | | | | | | | Commit acdf2a8b added a test case involving minus zero as a box endpoint. This is not very portable, as evidenced by the several older buildfarm members that are failing on the test because they print minus zero as just "0". If there were any significant reason to test this behavior, we could consider carrying a separate expected-file; but it doesn't look to me like there's adequate justification to accept such a maintenance burden. Just change the test to use plain zero, instead.
* Fix oversight in getParamDescriptions(), and improve comments.Tom Lane2016-04-01
| | | | | | | | | | | | | | | | | | | | | | | When getParamDescriptions was changed to handle out-of-memory better by cribbing error recovery logic from getRowDescriptions/getAnotherTuple, somebody omitted to copy the stanza about checking for excess data in the message. But you need to do that, since continue'ing out of the switch in pqParseInput3 means no such check gets applied there anymore. Noted while looking at Michael Paquier's patch that made yet another copy of this advance_and_error logic. (This whole business desperately needs refactoring, because I sure don't want to see a dozen copies of this code, but that's where we seem to be headed. What's more, the "suspend parsing on EOF return" convention is a holdover from protocol 2 and shouldn't exist at all in protocol 3, because we don't process partial messages anymore. But for now, just fix the obvious bug.) Also, fix some wrong/missing comments about what the API spec is for these three functions. This doesn't seem worthy of back-patching, even though it's a bug; the case shouldn't ever arise in the field.
* Fix English in bloom module documentationTeodor Sigaev2016-04-01
| | | | Author: Erik Rijkers
* Bloom index contrib moduleTeodor Sigaev2016-04-01
| | | | | | | | | | | Module provides new access method. It is actually a simple Bloom filter implemented as pgsql's index. It could give some benefits on search with large number of columns. Module is a single way to test generic WAL interface committed earlier. Author: Teodor Sigaev, Alexander Korotkov Reviewers: Aleksander Alekseev, Michael Paquier, Jim Nasby
* Fix typo in generic wal docsTeodor Sigaev2016-04-01
| | | | Markus Nullmeier
* Add Generic WAL interfaceTeodor Sigaev2016-04-01
| | | | | | | | | | | | | | | This interface is designed to give an access to WAL for extensions which could implement new access method, for example. Previously it was impossible because restoring from custom WAL would need to access system catalog to find a redo custom function. This patch suggests generic way to describe changes on page with standart layout. Bump XLOG_PAGE_MAGIC because of new record type. Author: Alexander Korotkov with a help of Petr Jelinek, Markus Nullmeier and minor editorization by my Reviewers: Petr Jelinek, Alvaro Herrera, Teodor Sigaev, Jim Nasby, Michael Paquier
* Another zic portability fix.Tom Lane2016-03-31
| | | | | | | | | | | | | I should have remembered that we can't use INT64_MODIFIER with sscanf(): configure chooses that to work with snprintf(), but it might be for our src/port/snprintf.c implementation and so not compatible with the platform's sscanf(). This appears to be the explanation for buildfarm member frogmouth's continuing unhappiness with the tzcode update. Fortunately, in all of the places where zic is attempting to read into an int64 variable, it's reading a year which certainly will fit just fine into an int. So make it read into an int with %d, and then cast or copy as necessary.
* Fix recovery_min_apply_delay testAlvaro Herrera2016-03-31
| | | | | | | | | | | | Previously this test was relying too much on WAL replay to occur in the exact configured interval, which was unreliable on slow or overly busy servers. Use a custom loop instead of poll_query_until, which is hopefully more reliable. Per continued failures on buildfarm member hamster (which is probably the only one running this test suite) Author: Michaël Paquier
* Support using index-only scans with partial indexes in more cases.Tom Lane2016-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, the planner would reject an index-only scan if any restriction clause for its table used a column not available from the index, even if that restriction clause would later be dropped from the plan entirely because it's implied by the index's predicate. This is a fairly common situation for partial indexes because predicates using columns not included in the index are often the most useful kind of predicate, and we have to duplicate (or at least imply) the predicate in the WHERE clause in order to get the index to be considered at all. So index-only scans were essentially unavailable with such partial indexes. To fix, we have to do detection of implied-by-predicate clauses much earlier in the planner. This patch puts it in check_index_predicates (nee check_partial_indexes), meaning it gets done for every partial index, whereas we previously only considered this issue at createplan time, so that the work was only done for an index actually selected for use. That could result in a noticeable planning slowdown for queries against tables with many partial indexes. However, testing suggested that there isn't really a significant cost, especially not with reasonable numbers of partial indexes. We do get a small additional benefit, which is that cost_index is more accurate since it correctly discounts the evaluation cost of clauses that will be removed. We can also avoid considering such clauses as potential indexquals, which saves useless matching cycles in the case where the predicate columns aren't in the index, and prevents generating bogus plans that double-count the clause's selectivity when the columns are in the index. Tomas Vondra and Kyotaro Horiguchi, reviewed by Kevin Grittner and Konstantin Knizhnik, and whacked around a little by me
* Fix broken variable declarationAlvaro Herrera2016-03-30
| | | | Author: Konstantin Knizhnik
* Blind attempt at fixing Win32 issue on 24c5f1a103cAlvaro Herrera2016-03-30
| | | | | | | As best as I can tell, MyReplicationSlot needs to be PGDLLIMPORT in order for the new test_slot_timelines test module to compile. Per buildfarm
* Use proper format specifier %X/%X for LSN.Fujii Masao2016-03-31
|
* I forgot the alternate expected file in previous commitAlvaro Herrera2016-03-30
| | | | | | | Without this, the test_slot_timelines modules fails "make installcheck" because the required feature is not enabled in a stock server. Per buildfarm
* Enable logical slots to follow timeline switchesAlvaro Herrera2016-03-30
| | | | | | | | | | | | | | | | | | When decoding from a logical slot, it's necessary for xlog reading to be able to read xlog from historical (i.e. not current) timelines; otherwise, decoding fails after failover, because the archives are in the historical timeline. This is required to make "failover logical slots" possible; it currently has no other use, although theoretically it could be used by an extension that creates a slot on a standby and continues to replay from the slot when the standby is promoted. This commit includes a module in src/test/modules with functions to manipulate the slots (which is not otherwise possible in SQL code) in order to enable testing, and a new test in src/test/recovery to ensure that the behavior is as expected. Author: Craig Ringer Reviewed-By: Oleksii Kliukin, Andres Freund, Petr Jelínek
* XLogReader general code cleanupAlvaro Herrera2016-03-30
| | | | | | | | | Some minor tweaks and comment additions, for cleanliness sake and to avoid having the upcoming timeline-following patch be polluted with unrelated cleanup. Extracted from a larger patch by Craig Ringer, reviewed by Andres Freund, with some additions by myself.
* Improve portability of I/O behavior for the geometric types.Tom Lane2016-03-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Formerly, the geometric I/O routines such as box_in and point_out relied directly on strtod() and sprintf() for conversion of the float8 component values of their data types. However, the behavior of those functions is pretty platform-dependent, especially for edge-case values such as infinities and NaNs. This was exposed by commit acdf2a8b372aec1d, which added test cases involving boxes with infinity endpoints, and immediately failed on Windows and AIX buildfarm members. We solved these problems years ago in the main float8in and float8out functions, so let's fix it by making the geometric types use that code instead of depending directly on the platform-supplied functions. To do this, refactor the float8in code so that it can be used to parse just part of a string, and as a convenience make the guts of float8out usable without going through DirectFunctionCall. While at it, get rid of geo_ops.c's fairly shaky assumptions about the maximum output string length for a double, by having it build results in StringInfo buffers instead of fixed-length strings. In passing, convert all the "invalid input syntax for type foo" messages in this area of the code into "invalid input syntax for type %s" to reduce the number of distinct translatable strings, per recent discussion. We would have needed a fair number of the latter anyway for code-sharing reasons, so we might as well just go whole hog. Note: this patch is by no means intended to guarantee that the geometric types uniformly behave sanely for infinity or NaN component values. But any bugs we have in that line were there all along, they were just harder to reach in a platform-independent way.
* Suppress uninitialized-variable warnings.Tom Lane2016-03-30
| | | | | | My compiler doesn't like the lack of initialization of "flag", and I think it's right: if there were zero keys we'd have an undefined result. The AND of zero items is TRUE, so initialize to TRUE.
* Bump catalog version, forget in acdf2a8b372aec1da09370fca77ff7dccac7646dTeodor Sigaev2016-03-30
|