path: root/src/backend
Commit message (author, date)
...
* Make max_parallel_degree PGC_USERSET. (Robert Haas, 2016-03-21)
  It was intended to be this way all along, just like other planner GUCs such as work_mem. But I goofed.
* Support parallel aggregation. (Robert Haas, 2016-03-21)
  Parallel workers can now partially aggregate the data and pass the transition values back to the leader, which can combine the partial results to produce the final answer.
  David Rowley, based on earlier work by Haribabu Kommi. Reviewed by Álvaro Herrera, Tomas Vondra, Amit Kapila, James Sewell, and me.
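
The partial/combine split described in this commit can be sketched as follows. This is only an illustration for an avg()-style aggregate, not PostgreSQL's executor code: the function names (`partial_avg`, `combine_avg`, `final_avg`) are hypothetical, and the "workers" are plain function calls over slices of a list.

```python
from typing import Iterable, List, Optional, Tuple

# Transition value for an avg()-style aggregate: (running sum, row count).
State = Tuple[float, int]

def partial_avg(rows: Iterable[float]) -> State:
    """Worker side: fold this worker's rows into one transition value."""
    total, count = 0.0, 0
    for value in rows:
        total += value
        count += 1
    return total, count

def combine_avg(states: Iterable[State]) -> State:
    """Leader side: merge the transition values returned by the workers."""
    total, count = 0.0, 0
    for part_total, part_count in states:
        total += part_total
        count += part_count
    return total, count

def final_avg(state: State) -> Optional[float]:
    """Leader side: turn the combined transition value into the final answer."""
    total, count = state
    return None if count == 0 else total / count

# Hypothetical split of ten rows across three "workers".
data: List[float] = [float(n) for n in range(1, 11)]
chunks = [data[0:4], data[4:7], data[7:10]]
result = final_avg(combine_avg(partial_avg(chunk) for chunk in chunks))
```

The point of the design is that only the small transition values cross from worker to leader, not the rows themselves.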
* Properly declare FeBeWaitSet. (Andres Freund, 2016-03-21)
  Surprising that this worked on a number of systems. Reported by buildfarm member longfin.
* Introduce WaitEventSet API. (Andres Freund, 2016-03-21)
  Commit ac1d794 ("Make idle backends exit if the postmaster dies.") introduced a regression on, at least, large Linux systems. Constantly adding the same postmaster_alive_fds to the OS's internal data structures for implementing poll/select can cause significant contention, leading to a performance regression of nearly 3x in one example.
  This can be avoided by using e.g. Linux's epoll, which avoids having to add/remove file descriptors to the wait data structures at a high rate. Unfortunately the current latch interface makes it hard to allocate any persistent per-backend resources.
  Replace, with a backward compatibility layer, WaitLatchOrSocket with a new WaitEventSet API. Users can allocate such a set across multiple calls, and add more than one file descriptor to wait on. The latter has been added because there are upcoming postgres features where that will be helpful.
  In addition to the previously existing poll(2), select(2), and WaitForMultipleObjects() implementations, also provide an epoll_wait(2) based implementation to address the aforementioned performance problem. Epoll is only available on Linux, but that is the most likely OS for machines large enough (four sockets) to reproduce the problem.
  To actually address the aforementioned regression, create and use a long-lived WaitEventSet for FE/BE communication. There are additional places that would benefit from a long-lived set, but that's a task for another day.
  Thanks to Amit Kapila, who helped make the Windows code I blindly wrote actually work.
  Reported-By: Dmitry Vasilyev
  Discussion: CAB-SwXZh44_2ybvS5Z67p_CDz=XFn4hNAD=CnMEF+QqkXwFrGg@mail.gmail.com 20160114143931.GG10941@awork2.anarazel.de
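
The idea of a long-lived wait set (register a descriptor once, then wait on it many times) can be sketched with Python's standard `selectors` module. This is an analogy only, not the WaitEventSet C API; the function name `count_ready_events` and the socket pair standing in for a client connection are assumptions made for the example.

```python
import selectors
import socket

def count_ready_events(rounds: int) -> int:
    """Register one socket once, then wait on it `rounds` times."""
    reader, writer = socket.socketpair()
    # DefaultSelector picks the best mechanism available (epoll on Linux).
    wait_set = selectors.DefaultSelector()
    # Registration happens a single time, outside the wait loop; this is
    # the cost the commit avoids paying on every wait.
    wait_set.register(reader, selectors.EVENT_READ)
    seen = 0
    try:
        for _ in range(rounds):
            writer.send(b"x")
            # Each wait reuses the already-registered descriptor.
            for key, _events in wait_set.select(timeout=1.0):
                key.fileobj.recv(1)
                seen += 1
    finally:
        wait_set.unregister(reader)
        wait_set.close()
        reader.close()
        writer.close()
    return seen

ready = count_ready_events(5)
```

With a poll/select-style interface the descriptor set is handed to the kernel again on every call, which is the repeated work the commit message identifies as the source of contention.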
* Combine win32 and unix latch implementations. (Andres Freund, 2016-03-21)
  Previously latches for Windows and Unix had been implemented in different files. A later patch introduces an expanded wait infrastructure; keeping the implementations separate would introduce too much duplication.
  This basically just moves the functions, without too much change. The reason to keep this separate is that it allows blame to continue working a little less badly, and to make review a tiny bit easier.
  Discussion: 20160114143931.GG10941@awork2.anarazel.de
* Remove dependency on psed for MSVC builds. (Andrew Dunstan, 2016-03-19)
  Modern Perl has removed psed from its core distribution, so it might not be readily available on some build platforms. We therefore replace its use with a Perl script generated by s2p, which is equivalent to the sed script. The latter is retained for non-MSVC builds to avoid creating a new hard dependency on Perl for non-Windows tarball builds.
  Backpatch to all live branches.
  Michael Paquier and me.
* Sync backend/parser/scan.l with bin/psql/psqlscan.l. (Tom Lane, 2016-03-19)
  Make some minor formatting adjustments to make it easier to diff these files and see that they indeed implement the same flex rules (at least to the extent that we want them to be the same).
  (Someday it'd be nice to make ecpg's pgc.l more easily diff'able too, but today is not that day.)
  Also run relevant parts of these files and psqlscanslash.l through pgindent.
  No actual behavioral changes here, just obsessive neatnik-ism.
* Build backend/parser/scan.l and interfaces/ecpg/preproc/pgc.l standalone. (Tom Lane, 2016-03-19)
  Now that we know about the %top{} trick, we can revert to building flex lexers as separate .o files. This is worth doing for a couple of reasons besides sheer cleanliness. We can narrow the scope of the -Wno-error flag that's forced on scan.c. Also, since these grammar and lexer files are so large, splitting them into separate build targets should have some advantages in build speed, particularly in parallel or ccache'd builds.
  We have quite a few other .l files that could be changed likewise, but the above arguments don't apply to them, so the benefit of fixing them seems pretty minimal. Leave the rest for some other day.
* Allow SSL server key file to have group read access if owned by root (Peter Eisentraut, 2016-03-19)
  We used to require the server key file to have permissions 0600 or less for best security. But some systems (such as Debian) have certificate and key files managed by the operating system that can be shared with other services. In those cases, the "postgres" user is made a member of a special group that has access to those files, and the server key file has permissions 0640. To accommodate that kind of setup, also allow the key file to have permissions 0640 but only if owned by root.
  From: Christoph Berg <myon@debian.org>
  Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
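
The permission rule in this commit can be sketched as a small predicate. This is a minimal illustration, not the server's actual check: the function name `key_file_permissions_ok` is hypothetical, and rejecting files owned by anyone other than the server user or root is an assumption the commit message does not spell out.

```python
import stat

def key_file_permissions_ok(mode: int, owner_uid: int, server_uid: int) -> bool:
    """Return True if a key file's mode is acceptable for its owner."""
    perms = stat.S_IMODE(mode)
    if owner_uid == server_uid:
        # Owned by the server user: 0600 or less, so no group/other bits.
        return (perms & (stat.S_IRWXG | stat.S_IRWXO)) == 0
    if owner_uid == 0:
        # Owned by root: 0640 or less, so group may read but not write or
        # execute, and "other" gets nothing.
        return (perms & (stat.S_IWGRP | stat.S_IXGRP | stat.S_IRWXO)) == 0
    # Assumption: any other owner is rejected.
    return False

# Hypothetical uid 1000 stands in for the "postgres" user.
debian_style = key_file_permissions_ok(0o640, owner_uid=0, server_uid=1000)
```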
* Fix stupid omission in c4901a1e. (Andres Freund, 2016-03-18)
  Reported-By: Jeff Janes
  Discussion: CAMkU=1zGxREwoyaCrp_CHadEB+dPgpVyKBysCJ+6xP9gCOvAuw@mail.gmail.com
* Fix missed update in _readForeignScan(). (Tom Lane, 2016-03-19)
  Blatant fail in 0bf3ae88af330496517722e391e7c975e6bad219. Caught by buildfarm member mandrill.
* Merge wal_level "archive" and "hot_standby" into new name "replica" (Peter Eisentraut, 2016-03-18)
  The distinction between "archive" and "hot_standby" existed only because at the time "hot_standby" was added, there was some uncertainty about stability. This is now a long time ago. We would like to move forward with simplifying the replication configuration, but this distinction is in the way, because a primary server cannot tell (without asking a standby or predicting the future) which one of these would be the appropriate level.
  Pick a new name for the combined setting to make it clearer that it covers all (non-logical) backup and replication uses. The old values are still accepted but are converted internally.
  Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
  Reviewed-by: David Steele <david@pgmasters.net>
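
The "accepted but converted internally" behaviour amounts to a small alias table. The sketch below is illustrative only; the function name `normalize_wal_level` is hypothetical and the set of current values (minimal, replica, logical) is taken from the setting as it exists after this commit.

```python
# Old spellings that are still accepted, mapped to the merged name.
LEGACY_WAL_LEVELS = {"archive": "replica", "hot_standby": "replica"}
CURRENT_WAL_LEVELS = {"minimal", "replica", "logical"}

def normalize_wal_level(value: str) -> str:
    """Map a configured wal_level to the value used internally."""
    value = LEGACY_WAL_LEVELS.get(value, value)
    if value not in CURRENT_WAL_LEVELS:
        raise ValueError(f"invalid wal_level: {value!r}")
    return value

converted = [normalize_wal_level(v) for v in ("archive", "hot_standby", "logical")]
```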
* Use INT64_FORMAT instead of %ld for int64. (Robert Haas, 2016-03-18)
  Commit 0011c0091e886b874e485a46ff2c94222ffbf550 introduced this mistake.
  Patch by me. Reported by Andres Freund, who also reviewed the patch.
* Only clear latch self-pipe/event if there is a pending notification. (Andres Freund, 2016-03-18)
  This avoids a good number of, individually quite fast, system calls in scenarios with many quick queries. Besides the aesthetic benefit of seeing fewer superfluous system calls with strace, it also improves performance by ~2% measured by pgbench -M prepared -c 96 -j 8 -S (scale 100).
  Without having benchmarked it, this patch also adjusts the Windows code, as that makes it easier to unify the Unix/Windows codepaths in a later patch. There's little reason to diverge in behaviour between the platforms.
  Discussion: CA+TgmoYc1Zm+Szoc_Qbzi92z2c1vRHZmjhfPn5uC=w8bXv6Avg@mail.gmail.com
  Reviewed-By: Robert Haas
* Make it easier to choose the used waiting primitive in unix_latch.c. (Andres Freund, 2016-03-18)
  This allows for easier testing of the different primitives, in preparation for adding a new primitive.
  Discussion: 20160114143931.GG10941@awork2.anarazel.de
  Reviewed-By: Robert Haas
* Error out if waiting on socket readiness without a specified socket. (Andres Freund, 2016-03-18)
  Previously we just ignored such an attempt, but that seems to serve no purpose but making things harder to debug.
  Discussion: 20160114143931.GG10941@awork2.anarazel.de 20151230173734.hx7jj2fnwyljfqek@alap3.anarazel.de
  Reviewed-By: Robert Haas
* Directly modify foreign tables. (Robert Haas, 2016-03-18)
  postgres_fdw can now send an UPDATE or DELETE statement directly to the foreign server in simple cases, rather than sending a SELECT FOR UPDATE statement and then updating or deleting rows one-by-one.
  Etsuro Fujita, reviewed by Rushabh Lathia, Shigeru Hanada, Kyotaro Horiguchi, Albe Laurenz, Thom Brown, and me.
* Introduce parse_ident() (Teodor Sigaev, 2016-03-18)
  SQL-layer function to split a qualified identifier into an array of its parts.
  Author: Pavel Stehule with minor editorization by me and Jim Nasby
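
What "splitting a qualified identifier into parts" involves can be sketched in a few lines. This is a simplified illustration and not the function's real implementation: the name `split_qualified_identifier` is hypothetical, unquoted parts are merely downcased without validating their characters, and whitespace handling is omitted.

```python
from typing import List

def split_qualified_identifier(text: str) -> List[str]:
    """Split a dotted SQL identifier, honouring double-quoted parts."""
    parts: List[str] = []
    i, n = 0, len(text)
    while True:
        if i < n and text[i] == '"':
            # Quoted part: keep case, and treat "" as an escaped quote.
            i += 1
            buf: List[str] = []
            while True:
                if i >= n:
                    raise ValueError("unterminated quoted identifier")
                if text[i] == '"':
                    if i + 1 < n and text[i + 1] == '"':
                        buf.append('"')
                        i += 2
                        continue
                    i += 1
                    break
                buf.append(text[i])
                i += 1
            part = "".join(buf)
        else:
            # Unquoted part: runs up to the next dot and is folded to lower case.
            start = i
            while i < n and text[i] != ".":
                i += 1
            part = text[start:i].lower()
            if '"' in part:
                raise ValueError("unexpected quote inside unquoted identifier")
        if not part:
            raise ValueError("empty identifier part")
        parts.append(part)
        if i >= n:
            return parts
        if text[i] != ".":
            raise ValueError("expected '.' between identifier parts")
        i += 1

example = split_qualified_identifier('"SomeSchema".someTable')
```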
* Push scan/join target list beneath Gather when possible. (Robert Haas, 2016-03-18)
  This means that, for example, "SELECT expensive_func(a) FROM bigtab WHERE something" can compute expensive_func(a) in the workers rather than the leader if it happens to be parallel-safe, which figures to be a big win in some practical cases.
  Currently, we can only do this if the entire target list is parallel-safe. If we worked harder, we might be able to evaluate parallel-safe targets in the worker and any parallel-restricted targets in the leader, but that would be more complicated, and there aren't that many parallel-restricted functions that people are likely to use in queries anyway. I think. So just do the simple thing for the moment.
  Robert Haas, Amit Kapila, and Tom Lane
* Various minor corrections of and improvements to comments. (Robert Haas, 2016-03-18)
  Aleksander Alekseev
* Remove useless double calls of make_parsestate(). (Tom Lane, 2016-03-17)
  Aleksander Alekseev
* Update tuplesort.c comments for memory management improvements. (Robert Haas, 2016-03-17)
  I'm committing these changes separately so that it's clear what is Peter's original work versus what I changed. This is a followup to commit 0011c0091e886b874e485a46ff2c94222ffbf550, and these changes are all by me.
* Improve memory management for external sorts. (Robert Haas, 2016-03-17)
  Introduce a new memory context which stores tuple data, and reset it at the end of each merge pass; this helps avoid memory fragmentation and, consequently, overallocation. Also, for the final merge pass, eliminate memory context chunk header overhead entirely by allocating all of the memory used for buffering tuples during the merge in a single chunk. Since this modestly increases the number of tuples we can store, grow the memtuples array a bit so that we're less likely to run short of slots there.
  Peter Geoghegan. Review and testing of patches in this series by Jeff Janes, Greg Stark, Mithun Cy, and me.
* Fix assorted breakage in to_char()'s OF format option. (Tom Lane, 2016-03-17)
  In HEAD, fix incorrect field width for hours part of OF when tm_gmtoff is negative. This was introduced by commit 2d87eedc1d4468d3 as a result of falsely applying a pattern that's correct when + signs are omitted, which is not the case for OF.
  In 9.4, fix missing abs() call that allowed a sign to be attached to the minutes part of OF. This was fixed in 9.5 by 9b43d73b3f9bef27, but for inscrutable reasons not back-patched.
  In all three versions, ensure that the sign of tm_gmtoff is correctly reported even when the GMT offset is less than 1 hour.
  Add regression tests, which evidently we desperately need here.
  Thomas Munro and Tom Lane, per report from David Fetter
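
The three fixes (sign taken from the whole offset, abs() applied before splitting, fixed-width hours) can be illustrated with a short sketch. It is not the to_char() source: the name `format_of` is hypothetical, and the exact output layout (always-signed two-digit hours, minutes shown only when nonzero) is an assumption made for the example.

```python
def format_of(gmtoff_seconds: int) -> str:
    """Format a GMT offset in seconds as a signed HH[:MM] string."""
    # Take the sign from the whole offset, so that offsets of less than
    # one hour (where the hours part is zero) still report it.
    sign = "+" if gmtoff_seconds >= 0 else "-"
    # Work on the magnitude so no sign leaks into hours or minutes.
    magnitude = abs(gmtoff_seconds)
    hours, remainder = divmod(magnitude, 3600)
    minutes = remainder // 60
    text = f"{sign}{hours:02d}"
    if minutes:
        text += f":{minutes:02d}"
    return text

half_hour_west = format_of(-1800)
```

Deriving the sign from the hours part alone is exactly what goes wrong for an offset such as -00:30, since the hours value there is zero.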
* Improve support of Hunspell (Teodor Sigaev, 2016-03-17)
  - Allow non-ASCII characters to be used as affix flags. Non-numeric affix flags are now stored as strings instead of as the numeric value of a character.
  - Allow 0 to be used as an affix flag in numerically encoded affixes.
  That adds support for the Arabic, Hungarian, Turkish and Brazilian Portuguese languages.
  Author: Artur Zakirov with heavy editorization by me
* Fix typos. (Robert Haas, 2016-03-17)
  Jim Nasby
* Add syslog_split_messages parameter (Peter Eisentraut, 2016-03-16)
  Reviewed-by: Andreas Karlsson <andreas@proxel.se>
* Add syslog_sequence_numbers parameter (Peter Eisentraut, 2016-03-16)
  Reviewed-by: Andreas Karlsson <andreas@proxel.se>
* Fix j2day() to behave sanely for negative Julian dates. (Tom Lane, 2016-03-16)
  Somebody had apparently once figured that casting to unsigned int would produce the right output for negative inputs, but that would only be true if 2^32 were a multiple of 7, which of course it ain't. We need to use a signed division and then correct the sign of the remainder.
  AFAICT, the only case where this would arise currently is when doing ISO-week calculations for dates in 4714BC, where we'd compute a negative Julian date representing 4714-01-04BC and then do some arithmetic with it. Since we don't even really document support for such dates, this is not of much consequence. But we may as well get it right.
  Per report from Vitaly Burovoy.
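
The arithmetic behind this fix can be checked with a small emulation. The sketch below mimics C integer semantics in Python; it is reconstructed from the commit description rather than copied from the source, and the helper names (`c_remainder`, `j2day_unsigned_cast`, `j2day_signed`) are hypothetical.

```python
def c_remainder(a: int, b: int) -> int:
    """Emulate C's % operator, whose quotient truncates toward zero."""
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient
    return a - b * quotient

def j2day_unsigned_cast(date: int) -> int:
    """Assumed old approach: reinterpret as a 32-bit unsigned int, then % 7."""
    day = (date + 1) % 2**32  # wraps negative values, as an unsigned cast would
    return day % 7

def j2day_signed(date: int) -> int:
    """Approach described in the commit: signed remainder, then fix the sign."""
    day = c_remainder(date + 1, 7)
    if day < 0:
        day += 7
    return day

# 2^32 leaves remainder 4 when divided by 7, so the wraparound shifts the result.
wrap_shift = 2**32 % 7
old_result, new_result = j2day_unsigned_cast(-2), j2day_signed(-2)
```

For non-negative Julian dates the two versions agree, which fits the commit's remark that the bug only surfaces for dates in 4714 BC.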
* Be more careful about out-of-range dates and timestamps. (Tom Lane, 2016-03-16)
  Tighten the semantics of boundary-case timestamptz so that we allow timestamps >= '4714-11-24 00:00+00 BC' and < 'ENDYEAR-01-01 00:00+00 AD' exactly, no more and no less, but it is allowed to enter timestamps within that range using non-GMT timezone offsets (which could make the nominal date 4714-11-23 BC or ENDYEAR-01-01 AD). This eliminates dump/reload failure conditions for timestamps near the endpoints. To do this, separate checking of the inputs for date2j() from the final range check, and allow the Julian date code to handle a range slightly wider than the nominal range of the datatypes.
  Also add a bunch of checks to detect out-of-range dates and timestamps that formerly could be returned by operations such as date-plus-integer. All C-level functions that return date, timestamp, or timestamptz should now be proof against returning a value that doesn't pass IS_VALID_DATE() or IS_VALID_TIMESTAMP().
  Vitaly Burovoy, reviewed by Anastasia Lubennikova, and substantially whacked around by me
* Another comment update. (Robert Haas, 2016-03-16)
  I thought this was in my last commit, but I goofed.
* Fix problems in commit c16dc1aca5e01e6acaadfcf38f5fc964a381dc62. (Robert Haas, 2016-03-16)
  Vinayak Pokale provided a patch for a copy-and-paste error in a comment. I noticed that I'd used the word "automatically" nearby where I meant to talk about things being "atomic". Rahila Syed spotted a misplaced counter update. Fix all that stuff.
* Add idle_in_transaction_session_timeout. (Robert Haas, 2016-03-16)
  Vik Fearing, reviewed by Stéphane Schildknecht and me, and revised slightly by me.
* UCS_to_EUC_JIS_2004.pl: Turn off "test" mode by default (Peter Eisentraut, 2016-03-16)
  It produces debugging output files that are of no further use, so we don't need that by default.
* Make spacing and punctuation consistent (Peter Eisentraut, 2016-03-16)
* Fix typos. (Robert Haas, 2016-03-15)
  Oskari Saarenmaa
* Avoid incorrectly indicating exclusion constraint wait (Stephen Frost, 2016-03-15)
  INSERT ... ON CONFLICT's precheck may have to wait on the outcome of another insertion, which may or may not itself be a speculative insertion. This wait is not necessarily associated with an exclusion constraint, but was always reported that way in log messages if the wait happened to involve a tuple that had no speculative token.
  Initially discovered through use of ON CONFLICT DO NOTHING, where spurious references to exclusion constraints in log messages were more likely.
  Patch by Peter Geoghegan. Reviewed by Julien Rouhaud.
  Back-patch to 9.5 where INSERT ... ON CONFLICT was added.
* Fix typos in comments (Alvaro Herrera, 2016-03-15)
* Add simple VACUUM progress reporting. (Robert Haas, 2016-03-15)
  There's a lot more that could be done here yet - in particular, this reports only very coarse-grained information about the index vacuuming phase - but even as it stands, the new pg_stat_progress_vacuum can tell you quite a bit about what a long-running vacuum is actually doing.
  Amit Langote and Robert Haas, based on earlier work by Vinayak Pokale and Rahila Syed.
* Add a GetForeignUpperPaths callback function for FDWs. (Tom Lane, 2016-03-14)
  This is basically like the just-added create_upper_paths_hook, but control is funneled only to the FDW responsible for all the baserels of the current query; so providing such a callback is much less likely to add useless overhead than using the hook function is.
  The documentation is a bit sketchy. We'll likely want to improve it, and/or adjust the call conventions, when we get some experience with actually using this callback. Hopefully somebody will find time to experiment with it before 9.6 feature freeze.
* Fix EXPLAIN ANALYZE SELECT INTO not to choose a parallel plan. (Robert Haas, 2016-03-14)
  We don't support any parallel write operations at present, so choosing a parallel plan causes us to error out. Also, add a new regression test that uses EXPLAIN ANALYZE SELECT INTO; if we'd had this previously, force_parallel_mode testing would have caught this issue.
  Mithun Cy and Robert Haas
* Provide a planner hook at a suitable place for creating upper-rel Paths. (Tom Lane, 2016-03-14)
  In the initial revision of the upper-planner pathification work, the only available way for an FDW or custom-scan provider to inject Paths representing post-scan-join processing was to insert them during scan-level GetForeignPaths or similar processing. While that's not impossible, it'd require quite a lot of duplicative processing to look forward and see if the extension would be capable of implementing the whole query.
  To improve matters for custom-scan providers, provide a hook function at the point where the core code is about to start filling in upperrel Paths. At this point Paths are available for the whole scan/join tree, which should reduce the amount of redundant effort considerably. (An alternative design that was suggested was to provide a separate hook for each post-scan-join processing step, but that seems messy and not clearly more useful.)
  Following our time-honored tradition, there's no documentation for this hook outside the source code.
  As-is, this hook is only meant for custom scan providers, which we can't assume very much about. A follow-on patch will implement an FDW callback to let FDWs do the same thing in a somewhat more structured fashion.
* Allow callers of create_foreignscan_path to specify nondefault PathTarget. (Tom Lane, 2016-03-14)
  Although the default choice of rel->reltarget should typically be sufficient for scan or join paths, it's not at all sufficient for the purposes PathTargets were invented for; in particular not for upper-relation Paths. So break API compatibility by adding a PathTarget argument to create_foreignscan_path(). To ease updating of existing code, accept a NULL value of the argument as selecting rel->reltarget.
* Rethink representation of PathTargets. (Tom Lane, 2016-03-14)
  In commit 19a541143a09c067 I did not make PathTarget a subtype of Node, and embedded a RelOptInfo's reltarget directly into it rather than having a separately-allocated Node. In hindsight that was misguided micro-optimization, enabled by the fact that at that point we didn't have any Paths with custom PathTargets. Now that PathTarget processing has been fleshed out some more, it's easier to see that it's better to have PathTarget as an independent Node type, even if it does cost us one more palloc to create a RelOptInfo. So change it while we still can.
  This commit just changes the representation, without doing anything more interesting than that.
* Update more comments for 96198d94cb7adc664bda341842dc8db671d8be72. (Robert Haas, 2016-03-14)
  Etsuro Fujita, reviewed (though not completely endorsed) by Ashutosh Bapat, and slightly expanded by me.
* Use repalloc_huge() to enlarge a SPITupleTable's tuple pointer array. (Tom Lane, 2016-03-14)
  Commit 23a27b039d94ba35 widened the rows-stored counters to uint64, but that's academic unless we allow the tuple pointer array to exceed 1GB.
  (It might be a good idea to provide some other limit on how much storage a SPITupleTable can eat. On the other hand, there are plenty of other ways to drive a backend into swap hell.)
  Dagfinn Ilmari Mannsåker
* Improve check for overly-long extensible node name. (Robert Haas, 2016-03-14)
  The old code is bad for two reasons. First, it has an off-by-one error. Second, it won't help if you aren't running with assertions enabled. Per discussion, we want a check here in that case too.
  Author: KaiGai Kohei, adjusted by me.
  Reviewed-by: Petr Jelinek
  Discussion: 56E0D547.1030101@2ndquadrant.com
* Fix memory leak in repeated GIN index searches. (Tom Lane, 2016-03-13)
  Commit d88976cfa1302e8d removed this code from ginFreeScanKeys():
  -	if (entry->list)
  -		pfree(entry->list);
  evidently in the belief that that ItemPointer array is allocated in the keyCtx and so would be reclaimed by the following MemoryContextReset. Unfortunately, it isn't and it won't. It'd likely be a good idea for that to become so, but as a simple and back-patchable fix in the meantime, restore this code to ginFreeScanKeys().
  Also, add a similar pfree to where startScanEntry() is about to zero out entry->list. I am not sure if there are any code paths where this change prevents a leak today, but it seems like cheap future-proofing.
  In passing, make the initial allocation of so->entries[] use palloc not palloc0. The code doesn't depend on unused entries being zero; if it did, the array-enlargement code in ginFillScanEntry() would be wrong. So using palloc0 initially can only serve to confuse readers about what the invariant is.
  Per report from Felipe de Jesús Molina Bravo, via Jaime Casanova in <CAJGNTeMR1ndMU2Thpr8GPDUfiHTV7idELJRFusA5UXUGY1y-eA@mail.gmail.com>
* Fix whitespace and remove obsolete gitattributes entry (Peter Eisentraut, 2016-03-13)
* Report memory context stats upon out-of-memory in repalloc[_huge]. (Tom Lane, 2016-03-13)
  This longstanding functionality evidently got lost in commit 3d6d1b585524aab6. Noted while studying an OOM report from Jaime Casanova. Backpatch to 9.5 where the bug was introduced.