path: root/src/backend
* Fix another bug in merging of inherited CHECK constraints.  (Tom Lane, 2016-10-13)

  It's not good for an inherited child constraint to be marked connoinherit; that would result in the constraint not propagating to grandchild tables, if any are created later. The code mostly prevented this from happening but there was one case that was missed.

  This is somewhat related to commit e55a946a8, which also tightened checks on constraint merging. Hence, back-patch to 9.2 like that one. This isn't so much because there's a concrete feature-related reason to stop there, as to avoid having more distinct behaviors than we have to in this area.

  Amit Langote

  Discussion: <b28ee774-7009-313d-dd55-5bdd81242c41@lab.ntt.co.jp>
* Try to find out the actual hugepage size when making a MAP_HUGETLB request.  (Tom Lane, 2016-10-13)

  Even if Linux's mmap() is okay with a partial-hugepage request, munmap() is not, as reported by Chris Richards. Therefore it behooves us to try a bit harder to find out the actual hugepage size, instead of assuming that we can skate by with a guess.

  For the moment, just look into /proc/meminfo to find out the default hugepage size, and use that. Later, on kernels that support requests for nondefault sizes, we might try to consider other alternatives. But that smells more like a new feature than a bug fix, especially if we want to provide any way for the DBA to control it, so leave it for another day.

  I set this up to allow easy addition of platform-specific code for non-Linux platforms, if needed; but right now there are no reports suggesting that we need to work harder on other platforms.

  Back-patch to 9.4 where hugepage support was introduced.

  Discussion: <31056.1476303954@sss.pgh.pa.us>
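  For illustration, a minimal standalone sketch of the /proc/meminfo lookup described above (hypothetical code and naming, not the committed patch):

      /* Parse the default hugepage size out of /proc/meminfo; returns bytes,
       * or 0 if it cannot be determined (caller falls back to a guess). */
      #include <stdio.h>

      static long
      default_hugepage_size(void)
      {
          FILE   *fp = fopen("/proc/meminfo", "r");
          char    line[128];
          long    kb = 0;

          if (fp == NULL)
              return 0;
          while (fgets(line, sizeof(line), fp))
          {
              /* looking for e.g. "Hugepagesize:       2048 kB" */
              if (sscanf(line, "Hugepagesize: %ld kB", &kb) == 1)
                  break;
          }
          fclose(fp);
          return kb * 1024;
      }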
* Clean up handling of anonymous mmap'd shared-memory segment.  (Tom Lane, 2016-10-13)

  Fix detaching of the mmap'd segment to have its own on_shmem_exit callback, rather than piggybacking on the one for detaching from the SysV segment. That was confusing, and given the distance between the two attach calls, it was trouble waiting to happen.

  Make the detaching calls idempotent by clearing AnonymousShmem to show we've already unmapped. I spent quite a bit of time yesterday trying to find a path that would allow the munmap()'s to be done twice, and while I did not succeed, it seems silly that there's even a question.

  Make the #ifdef logic less confusing by separating "do we want to use anonymous shmem" from EXEC_BACKEND. Even though there's no current scenario where those conditions are different, it is not helpful for different places in the same file to be testing EXEC_BACKEND for what are fundamentally different reasons.

  Don't do on_exit_reset() in StartBackgroundWorker(). At best that's useless (InitPostmasterChild would have done it already) and at worst it could zap some callback that's unrelated to shared memory.

  Improve comments, and simplify the huge_pages enablement logic slightly.

  Back-patch to 9.4 where hugepage support was introduced. Arguably this should go into 9.3 as well, but the code looks significantly different there, and I doubt it's worth the trouble of adapting the patch given I can't show a live bug.
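  A sketch of the idempotent-detach pattern described above (hypothetical names and error handling, not the actual sysv_shmem.c code):

      static void *AnonymousShmem = NULL;
      static Size  AnonymousShmemSize = 0;

      /* on_shmem_exit callback; safe to run more than once */
      static void
      AnonymousShmemDetach(int status, Datum arg)
      {
          if (AnonymousShmem != NULL)
          {
              if (munmap(AnonymousShmem, AnonymousShmemSize) < 0)
                  elog(LOG, "munmap(%p, %zu) failed: %m",
                       AnonymousShmem, (size_t) AnonymousShmemSize);
              AnonymousShmem = NULL;    /* a second call becomes a no-op */
          }
      }

      /* registered once, right after the successful mmap():
       *     on_shmem_exit(AnonymousShmemDetach, (Datum) 0);
       */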
* Fix broken jsonb_set() logic for replacing array elements.  (Tom Lane, 2016-10-13)

  Commit 0b62fd036 did a fairly sloppy job of refactoring setPath() to support jsonb_insert() along with jsonb_set(). In its defense, though, there was no regression test case exercising the case of replacing an existing element in a jsonb array.

  Per bug #14366 from Peng Sun. Back-patch to 9.6 where bug was introduced.

  Report: <20161012065349.1412.47858@wrigleys.postgresql.org>
* Remove unnecessary int2vector-specific hash function and equality operator.  (Tom Lane, 2016-10-12)

  These functions were originally added in commit d8cedf67a to support use of int2vector columns as catcache lookup keys. However, there are no catcaches that use such columns. (Indeed I now think it must always have been dead code: a catcache with such a key column would need an underlying unique index on the column, but we've never had an int2vector btree opclass.)

  Getting rid of the int2vector-specific operator and function does not lose any functionality, because operations on int2vectors will now fall back to the generic anyarray support. This avoids a wart that a btree index on an int2vector column (made using anyarray_ops) would fail to match equality searches, because int2vectoreq wasn't a member of the opclass. We don't really care much about that, since int2vector is not meant as a type for users to use, but it's silly to have extra code and less functionality.

  If we ever do want a catcache to be indexed by an int2vector column, we'd need to put back full btree and hash opclasses for int2vector, comparable to the support for oidvector. (The anyarray code can't be used at such a low level, because it needs to do catcache lookups.) But we'll deal with that if/when the need arises.

  Also worth noting is that removal of the hash int2vector_ops opclass will break any user-created hash indexes on int2vector columns. While hash anyarray_ops would serve the same purpose, it would probably not compute the same hash values and thus wouldn't be on-disk-compatible. Given that int2vector isn't a user-facing type and we're planning other incompatible changes in hash indexes for v10 anyway, this doesn't seem like something to worry about, but it's probably worth mentioning here.

  Amit Langote

  Discussion: <d9bb74f8-b194-7307-9ebd-90645d377e45@lab.ntt.co.jp>
* Simplify the code for logical tape read buffers.  (Heikki Linnakangas, 2016-10-12)

  Pass the buffer size as argument to LogicalTapeRewindForRead, rather than setting it earlier with the separate LogicalTapeAssignReadBufferSize call. This way, the buffer size is set closer to where it's actually used, which makes the code easier to understand.

  This makes the calculation for how much memory to use for the buffers less precise. We now use the same amount of memory for every tape, rounded down to the nearest BLCKSZ boundary, instead of using one more block for some tapes to get the total up to the exact amount of memory available. That should be OK; merging isn't too sensitive to the exact amount of memory used.

  Reviewed by Peter Geoghegan

  Discussion: <0f607c4b-df23-353e-bf56-c0389d28495f@iki.fi>
* Drop server support for FE/BE protocol version 1.0.  (Tom Lane, 2016-10-11)

  While this isn't a lot of code, it's been essentially untestable for a very long time, because libpq doesn't support anything older than protocol 2.0, and has not since release 6.3. There's no reason to believe any other client-side code still uses that protocol, either.

  Discussion: <2661.1475849167@sss.pgh.pa.us>
* Remove "sco" and "unixware" ports.Tom Lane2016-10-11
| | | | | | | | | | | SCO OpenServer and SCO UnixWare are more or less dead platforms. We have never had a buildfarm member testing the "sco" port, and the last "unixware" member was last heard from in 2012, so it's fair to doubt that the code even compiles anymore on either one. Remove both ports. We can always undo this if someone shows up with an interest in maintaining and testing these platforms. Discussion: <17177.1476136994@sss.pgh.pa.us>
* Remove some unnecessary #includes.  (Heikki Linnakangas, 2016-10-10)

  Amit Langote
* Add a noreturn attribute to help static analyzers.  (Peter Eisentraut, 2016-10-09)
* Fix incorrect handling of polymorphic aggregates used as window functions.  (Tom Lane, 2016-10-09)

  The transfunction was told that its first argument and result were of the window function output type, not the aggregate state type. This'd only matter if the transfunction consults get_fn_expr_argtype, which typically only polymorphic functions would do.

  Although we have several regression tests around polymorphic aggs, none of them detected this mistake --- in fact, they still didn't fail when I injected the same mistake into nodeAgg.c. So add some more tests covering both plain agg and window-function-agg cases.

  Per report from Sebastian Luque. Back-patch to 9.6 where the error was introduced (by sloppy refactoring in commit 804163bc2, looks like).

  Report: <87int2qkat.fsf@gmail.com>
* Fix two bugs in merging of inherited CHECK constraints.  (Tom Lane, 2016-10-08)

  Historically, we've allowed users to add a CHECK constraint to a child table and then add an identical CHECK constraint to the parent. This results in "merging" the two constraints so that the pre-existing child constraint ends up with both conislocal = true and coninhcount > 0. However, if you tried to do it in the other order, you got a duplicate constraint error. This is problematic for pg_dump, which needs to issue separated ADD CONSTRAINT commands in some cases, but has no good way to ensure that the constraints will be added in the required order. And it's more than a bit arbitrary, too. The goal of complaining about duplicated ADD CONSTRAINT commands can be served if we reject the case of adding a constraint when the existing one already has conislocal = true; but if it has conislocal = false, let's just make the ADD CONSTRAINT set conislocal = true. In this way, either order of adding the constraints has the same end result.

  Another problem was that the code allowed creation of a parent constraint marked convalidated that is merged with a child constraint that is !convalidated. In this case, an inheritance scan of the parent table could emit some rows violating the constraint condition, which would be an unexpected result given the marking of the parent constraint as validated. Hence, forbid merging of constraints in this case. (Note: valid child and not-valid parent seems fine, so continue to allow that.)

  Per report from Benedikt Grundmann. Back-patch to 9.2 where we introduced possibly-not-valid check constraints. The second bug obviously doesn't apply before that, and I think the first doesn't either, because pg_dump only gets into this situation when dealing with not-valid constraints.

  Report: <CADbMkNPT-Jz5PRSQ4RbUASYAjocV_KHUWapR%2Bg8fNvhUAyRpxA%40mail.gmail.com>
  Discussion: <22108.1475874586@sss.pgh.pa.us>
* libpqwalreceiver needs to link with libintl when using --enable-nls.  (Tom Lane, 2016-10-07)

  The need for this was previously obscured even on picky platforms by the hack we used to support direct cross-module references in the transforms contrib modules. Now that that hack is gone, the undefined symbol is exposed, as reported by Robert Haas.

  Back-patch to 9.5 where we started to use -Wl,-undefined,dynamic_lookup. I'm a bit surprised that the older branches don't seem to contain any gettext references in this module, but since they don't fail at build time, they must not. (We might be able to get away with leaving this alone in 9.5/9.6, but I think it's cleaner if the reference gets resolved at link time.)

  Report: <CA+TgmoaHJKU5kcWZcYduATYVT7Mnx+8jUnycaYYL7OtCwCigug@mail.gmail.com>
* Fix fallback implementation of pg_atomic_write_u32().  (Andres Freund, 2016-10-07)

  I somehow had assumed that in the spinlock (in turn possibly using semaphores) based fallback atomics implementation 32 bit writes could be done without a lock. As far as the write goes that's correct, since postgres supports only platforms with single-copy atomicity for aligned 32bit writes. But writing without holding the spinlock breaks read-modify-write operations like pg_atomic_compare_exchange_u32(), since they'll potentially "miss" a concurrent write, which can't happen in actual hardware implementations.

  In 9.6+ when using the fallback atomics implementation this could lead to buffer header locks not being properly marked as released, and potentially some related state corruption. I don't see a related danger in 9.5 (earliest release with the API), because pg_atomic_write_u32() wasn't used in a concurrent manner there.

  The state variable of local buffers was, before this change, manipulated using pg_atomic_write_u32(), to avoid unnecessary synchronization overhead. As that'd not be the case anymore, introduce and use pg_atomic_unlocked_write_u32(), which does not correctly interact with RMW operations.

  This bug only caused issues when postgres is compiled on platforms without atomics support (i.e. no common new platform), or when compiled with --disable-atomics, which explains why this wasn't noticed in testing.

  Reported-By: Tom Lane
  Discussion: <14947.1475690465@sss.pgh.pa.us>
  Backpatch: 9.5-, where the atomic operations API was introduced.
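  For illustration, a hedged sketch of the locked-vs-unlocked distinction described above (the spinlock field name and call details are assumptions; the real fallback code lives in src/backend/port/atomics.c and differs in detail):

      /* Locked variant: must take the same per-variable spinlock that the
       * fallback compare-exchange takes, or a concurrent CAS can "miss" it. */
      void
      pg_atomic_write_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 val)
      {
          SpinLockAcquire((slock_t *) &ptr->sema);
          ptr->value = val;
          SpinLockRelease((slock_t *) &ptr->sema);
      }

      /* Unlocked variant: only safe where no concurrent read-modify-write
       * operations can occur, e.g. the local-buffer state mentioned above. */
      void
      pg_atomic_unlocked_write_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 val)
      {
          ptr->value = val;
      }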
* Remove bogus mapping from UTF-8 to SJIS conversion table.  (Heikki Linnakangas, 2016-10-07)

  0xc19c is not a valid UTF-8 byte sequence. It doesn't do any harm, AFAICS, but it's surely not intentional. No backpatching though, just to be sure.

  In passing, also add a file header comment to the file, like the UCS_to_SJIS.pl script would produce. (The file was originally created with UCS_to_SJIS.pl, but has been modified by hand since then. That's questionable, but I'll leave fixing that for later.)

  Kyotaro Horiguchi

  Discussion: <20160907.155050.233844095.horiguchi.kyotaro@lab.ntt.co.jp>
* Fix excessive memory consumption in the new sort pre-reading code.  (Heikki Linnakangas, 2016-10-06)

  LogicalTapeRewind() should not allocate a large read buffer if the tape is completely empty. The calling code relies on that, for its calculation of how much memory to allocate for the read buffers. This led to massive overallocation of memory if maxTapes was high but only a few tapes were actually used.

  Reported by Tomas Vondra

  Discussion: <7303da46-daf7-9c68-3cc1-9f83235cf37e@2ndquadrant.com>
* Re-alphabetize #include directives.  (Robert Haas, 2016-10-05)

  Thomas Munro
* Rename WAIT_* constants to PG_WAIT_*.  (Robert Haas, 2016-10-05)

  Windows apparently has a constant named WAIT_TIMEOUT, and some of these other names are pretty generic, too. Insert "PG_" at the front of each name in order to disambiguate.

  Michael Paquier
* Fix another Windows compile break.  (Robert Haas, 2016-10-04)

  Commit 6f3bd98ebfc008cbd676da777bb0b2376c4c4bfa is still making the buildfarm unhappy. This time it's mastodon that is complaining.
* Fix Windows compile break in 6f3bd98ebfc008cbd676da777bb0b2376c4c4bfa.  (Robert Haas, 2016-10-04)
* Fix another outdated comment.  (Heikki Linnakangas, 2016-10-04)

  Preloading is done by logtape.c now.
* Extend framework from commit 53be0b1ad to report latch waits.  (Robert Haas, 2016-10-04)

  WaitLatch, WaitLatchOrSocket, and WaitEventSetWait now take an additional wait_event_info parameter; legal values are defined in pgstat.h. This makes it possible to uniquely identify every point in the core code where we are waiting for a latch; extensions can pass WAIT_EXTENSION.

  Because latches were the major wait primitive not previously covered by this patch, it is now possible to see information in pg_stat_activity on a large number of important wait events not previously addressed, such as ClientRead, ClientWrite, and SyncRep.

  Unfortunately, many of the wait events added by this patch will fail to appear in pg_stat_activity because they're only used in background processes which don't currently appear in pg_stat_activity. We should fix this either by creating a separate view for such information, or else by deciding to include them in pg_stat_activity after all.

  Michael Paquier and Robert Haas, reviewed by Alexander Korotkov and Thomas Munro.
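  As a usage sketch of the extended call (hypothetical extension code; WaitLatch, the WL_* flags, and WAIT_EXTENSION come from the core headers as described above, everything else is illustrative):

      #include "postgres.h"
      #include "miscadmin.h"
      #include "pgstat.h"
      #include "storage/latch.h"

      static void
      wait_for_work(void)
      {
          int rc;

          rc = WaitLatch(MyLatch,
                         WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
                         10000L,            /* timeout in milliseconds */
                         WAIT_EXTENSION);   /* new wait_event_info argument */
          if (rc & WL_LATCH_SET)
              ResetLatch(MyLatch);
      }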
* Update comment.  (Heikki Linnakangas, 2016-10-04)

  mergepreread()/mergeprereadone() don't exist anymore; the function that does roughly the same is now called mergereadnext().
* Correct logical decoding restore behaviour for subtransactions.  (Andres Freund, 2016-10-03)

  Before initializing iteration over a subtransaction's changes, the last few changes were not spilled to disk. That's correct if the transaction didn't spill to disk, but otherwise... This bug can lead to missed or misordered subtransaction contents when they were spilled to disk.

  Move spilling of the remaining in-memory changes to ReorderBufferIterTXNInit(), where it can easily be applied to the top transaction and, if present, subtransactions.

  Since this code had too many bugs already, noticeably increase test coverage.

  Fixes: #14319
  Reported-By: Huan Ruan
  Discussion: <20160909012610.20024.58169@wrigleys.postgresql.org>
  Backport: 9.4-, where logical decoding was added
* Show a sensible value in pg_settings.unit for GUC_UNIT_XSEGS variables.  (Tom Lane, 2016-10-03)

  Commit 88e982302 invented GUC_UNIT_XSEGS for min_wal_size and max_wal_size, but neglected to make it display sensibly in pg_settings.unit (by adding a case to the switch in GetConfigOptionByNum). Fix that, and adjust said switch to throw a run-time error the next time somebody forgets.

  In passing, avoid using a static buffer for the output string --- the rest of this function pstrdup's from a local buffer, and I see no very good reason why the units code should do it differently and less safely.

  Per report from Otar Shavadze. Back-patch to 9.5 where the new unit type was added.

  Report: <CAG-jOyA=iNFhN+yB4vfvqh688B7Tr5SArbYcFUAjZi=0Exp-Lg@mail.gmail.com>
* Fix RLS with COPY (col1, col2) FROM tab  (Stephen Frost, 2016-10-03)

  Attempting to COPY a subset of columns from a table with RLS enabled would fail due to an invalid query being constructed (using a single ColumnRef with the list of fields to extract in 'fields', but that's for the different levels of an indirection for a single column, not for specifying multiple columns).

  Correct by building a ColumnRef and then ResTarget for each column being requested and then adding those to the targetList for the select query. Include regression tests to hopefully catch if this is broken again in the future.

  Patch-By: Adam Brightwell
  Reviewed-By: Michael Paquier
* Change the way pre-reading in external sort's merge phase works.  (Heikki Linnakangas, 2016-10-03)

  Don't pre-read tuples into SortTuple slots during merge. Instead, use the memory for larger read buffers in logtape.c. We're doing the same number of READTUP() calls either way, but managing the pre-read SortTuple slots is much more complicated. Also, the on-tape representation is more compact than SortTuples, so we can fit more pre-read tuples into the same amount of memory this way. And we have better cache-locality, when we use just a small number of SortTuple slots.

  Now that we only hold one tuple from each tape in the SortTuple slots, we can greatly simplify the "batch memory" management. We now maintain a small set of fixed-sized slots, to hold the tuples, and fall back to palloc() for larger tuples. We use this method during all merge phases, not just the final merge, and also when randomAccess is requested, and also in the TSS_SORTEDONTAPE case. In other words, it's used whenever we do an external sort.

  Reviewed by Peter Geoghegan and Claudio Freire.

  Discussion: <CAM3SWZTpaORV=yQGVCG8Q4axcZ3MvF-05xe39ZvORdU9JcD6hQ@mail.gmail.com>
* Add ALTER EXTENSION ADD/DROP ACCESS METHOD, and use it in pg_upgrade.  (Tom Lane, 2016-10-02)

  Without this, an extension containing an access method is not properly dumped/restored during pg_upgrade --- the AM ends up not being a member of the extension after upgrading.

  Another oversight in commit 473b93287, reported by Andrew Dunstan.

  Report: <f7ac29f3-515c-2a44-21c5-ec925053265f@dunslane.net>
* Do ClosePostmasterPorts() earlier in SubPostmasterMain().  (Tom Lane, 2016-10-01)

  In standard Unix builds, postmaster child processes do ClosePostmasterPorts immediately after InitPostmasterChild, that is almost immediately after being spawned. This is important because we don't want children holding open the postmaster's end of the postmaster death watch pipe.

  However, in EXEC_BACKEND builds, SubPostmasterMain was postponing this responsibility significantly, in order to make it slightly more convenient to pass the right flag value to ClosePostmasterPorts. This is bad, particularly seeing that process_shared_preload_libraries() might invoke nearly-arbitrary code. Rearrange so that we do it as soon as we've fetched the socket FDs via read_backend_variables().

  Also move the comment explaining about randomize_va_space to before the call of PGSharedMemoryReAttach, which is where it's relevant. The old placement was appropriate when the reattach happened inside CreateSharedMemoryAndSemaphores, but that was a long time ago.

  Back-patch to 9.3; the patch doesn't apply cleanly before that, and it doesn't seem worth a lot of effort given that we've had no actual field complaints traceable to this.

  Discussion: <4157.1475178360@sss.pgh.pa.us>
* Exclude additional directories in pg_basebackup  (Peter Eisentraut, 2016-09-28)

  The list of files and directories that pg_basebackup excludes from the backup was somewhat incomplete and unorganized. Change that with having the exclusion driven from tables. Clean up some code around it. Also document the exclusions in more detail so that users of pg_start_backup can make use of it as well.

  The contents of these directories are now excluded from the backup: pg_dynshmem, pg_notify, pg_serial, pg_snapshots, pg_subtrans

  Also fix a bug that a pg_replslot or pg_stat_tmp being a symlink would cause a corrupt tar header to be created. Now such symlinks are included in the backup as empty directories.

  Bug found by Ashutosh Sharma <ashu.coek88@gmail.com>.

  From: David Steele <david@pgmasters.net>
  Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Silence compiler warnings  (Alvaro Herrera, 2016-09-28)

  Reported by Peter Eisentraut. Coding suggested by Tom Lane.
* Rationalize format-picture caching logic in formatting.c.  (Tom Lane, 2016-09-28)

  Add a validity flag to DCHCacheEntry and NUMCacheEntry entries, and do not set it true until after we've parsed the supplied format string. This allows dealing with possible errors while parsing the format without the baroque hack that was there before (which only covered errors within NUMDesc_prepare, anyway). We can get rid of the PG_TRY in NUMDesc_prepare, as well as last_NUMCacheEntry and NUM_cache_remove. (Essentially, this reverts commit ff783fbae in favor of a less fragile solution; the problems with that approach are well illustrated by later hacking such as 55f927a46.)

  In passing, define the size of these caches as DCH_CACHE_ENTRIES not DCH_CACHE_FIELDS + 1 (whoever thought that was a good definition?) and likewise for the NUM cache. Also const-ify format string parameters where convenient, and merge duplicated cache lookup logic.

  This is primarily driven by a proposed patch from Artur Zakirov, which introduced some ereport's into format string parsing for the datetime case. He proposed preventing the creation of invalid cache entries by parsing the format string first into a local-variable array, and then copying that to a cache entry. That seemed a bit ugly to me, and anyway randomly different from the way the identical problem had been solved for the numeric case. Let's make the two sets of code more similar not less so.

  I'm not sure whether we'll adopt the new error conditions Artur proposes, but this patch seems like good code cleanup and future-proofing in any case. The existing code is critically (and undocumented-ly) dependent on no elog being thrown out of several nontrivial functions, which is trouble waiting to happen, though it doesn't seem to be actively broken today.

  Discussion: <b2a39359-3282-b402-f4a3-057aae500ee7@postgrespro.ru>
* Make to_timestamp() and to_date() range-check fields of their input.  (Tom Lane, 2016-09-28)

  Historically, something like to_date('2009-06-40','YYYY-MM-DD') would return '2009-07-10' because there was no prohibition on out-of-range month or day numbers. This has been widely panned, and it also turns out that Oracle throws an error in such cases. Since these functions are nominally Oracle-compatibility features, let's change that.

  There's no particular restriction on year (modulo the fact that the scanner may not believe that more than 4 digits are year digits, a matter to be addressed separately if at all). But we now check month, day, hour, minute, second, and fractional-second fields, as well as day-of-year and second-of-day fields if those are used.

  Currently, no checks are made on ISO-8601-style week numbers or day numbers; it's not very clear what the appropriate rules would be there, and they're probably so little used that it's not worth sweating over.

  Artur Zakirov, reviewed by Amul Sul, further adjustments by me

  Discussion: <1873520224.1784572.1465833145330.JavaMail.yahoo@mail.yahoo.com>
  See-Also: <57786490.9010201@wars-nicht.de>
* Remove dead line of code  (Peter Eisentraut, 2016-09-28)
* Fix CRC check handling in get_controlfile  (Peter Eisentraut, 2016-09-28)

  The previous patch broke this by returning NULL for a failed CRC check, which pg_controldata would then try to read. Fix by returning the result of the CRC check in a separate argument.

  Michael Paquier and myself
* Fix dangling pointer problem in ReorderBufferSerializeChange.  (Robert Haas, 2016-09-28)

  Commit 3fe3511d05127cc024b221040db2eeb352e7d716 introduced a new case into this function, but neglected to ensure that the "ondisk" pointer got updated after a possible reallocation as the code does in other cases.

  Stas Kelvich, per diagnosis by Konstantin Knizhnik.
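  The hazard being fixed follows the pattern sketched below (an illustrative fragment; the names mirror reorderbuffer.c but this is not the actual patch):

      char   *ondisk = rb->outbuf;              /* points into a resizable buffer */

      ReorderBufferSerializeReserve(rb, sz);    /* may repalloc rb->outbuf */
      ondisk = rb->outbuf;                      /* so re-fetch the pointer here, or
                                                 * later writes go through a pointer
                                                 * into freed memory */
      memcpy(ondisk + sizeof(ReorderBufferDiskChange), data, sz);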
* Turn password_encryption GUC into an enum.  (Heikki Linnakangas, 2016-09-28)

  This makes the parameter easier to extend, to support other password-based authentication protocols than MD5. (SCRAM is being worked on.)

  The GUC still accepts on/off as aliases for "md5" and "plain", although we may want to remove those once we actually add support for another password hash type.

  Michael Paquier, reviewed by David Steele, with some further edits by me.

  Discussion: <CAB7nPqSMXU35g=W9X74HVeQp0uvgJxvYOuA4A-A3M+0wfEBv-w@mail.gmail.com>
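  A sketch of what an enum GUC with boolean-style aliases looks like (hypothetical table; the real one is in guc.c and the enum value names may differ):

      static const struct config_enum_entry password_encryption_options[] = {
          {"plain", PASSWORD_TYPE_PLAINTEXT, false},
          {"md5", PASSWORD_TYPE_MD5, false},
          {"off", PASSWORD_TYPE_PLAINTEXT, true},   /* hidden alias */
          {"on", PASSWORD_TYPE_MD5, true},          /* hidden alias */
          {NULL, 0, false}
      };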
* Disallow pushing volatile quals past set-returning functions.  (Tom Lane, 2016-09-27)

  Pushing an upper-level restriction clause into an unflattened subquery-in-FROM is okay when the subquery contains no SRFs in its targetlist, or when it does but the SRFs are unreferenced by the clause *and the clause is not volatile*. Otherwise, we're changing the number of times the clause is evaluated, which is bad for volatile quals, and possibly changing the result, since a volatile qual might succeed for some SRF output rows and not others despite not referencing any of the changing columns. (Indeed, if the clause is something like "random() > 0.5", the user is probably expecting exactly that behavior.)

  We had most of these restrictions down, but not the one about the upper clause not being volatile. Fix that, and add a regression test to illustrate the expected behavior.

  Although this is definitely a bug, it doesn't seem like back-patch material, since possibly some users don't realize that the broken behavior is broken and are relying on what happens now. Also, while the added test is quite cheap in the wake of commit a4c35ea1c, it would be much more expensive (or else messier) in older branches.

  Per report from Tom van Tilburg.

  Discussion: <CAP3PPDiucxYCNev52=YPVkrQAPVF1C5PFWnrQPT7iMzO1fiKFQ@mail.gmail.com>
* Include <sys/select.h> where needed  (Alvaro Herrera, 2016-09-27)

  <sys/select.h> is required by POSIX.1-2001 to get the prototype of select(2), but nearly no systems enforce that because older standards let you get away with including some other headers. Recent OpenBSD hacking has removed that frail touch of friendliness, however, which broke some compiles; fix all the way back to 9.1 by adding the required standard header. Only vacuumdb.c was reported to fail, but it seems easier to fix the whole lot in one fell swoop.

  Per bug #14334 by Sean Farrell.
* Replace the built-in GIN array opclasses with a single polymorphic opclass.  (Tom Lane, 2016-09-26)

  We had thirty different GIN array opclasses sharing the same operators and support functions. That still didn't cover all the built-in types, nor did it cover arrays of extension-added types. What we want is a single polymorphic opclass for "anyarray". There were two missing features needed to make this possible:

  1. We have to be able to declare the index storage type as ANYELEMENT when the opclass is declared to index ANYARRAY. This just takes a few more lines in index_create(). Although this currently seems of use only for GIN, there's no reason to make index_create() restrict it to that.

  2. We have to be able to identify the proper GIN compare function for the index storage type. This patch proceeds by making the compare function optional in GIN opclass definitions, and specifying that the default btree comparison function for the index storage type will be looked up when the opclass omits it. Again, that seems pretty generically useful.

  Since the comparison function lookup is done in initGinState(), making use of the second feature adds an additional cache lookup to GIN index access setup. It seems unlikely that that would be very noticeable given the other costs involved, but maybe at some point we should consider making GinState data persist longer than it now does --- we could keep it in the index relcache entry, perhaps.

  Rather fortuitously, we don't seem to need to do anything to get this change to play nice with dump/reload or pg_upgrade scenarios: the new opclass definition is automatically selected to replace existing index definitions, and the on-disk data remains compatible. Also, if a user has created a custom opclass definition for a non-builtin type, this doesn't break that, since CREATE INDEX will prefer an exact match to opcintype over a match to ANYARRAY. However, if there's anyone out there with handwritten DDL that explicitly specifies _bool_ops or one of the other replaced opclass names, they'll need to adjust that.

  Tom Lane, reviewed by Enrique Meneses

  Discussion: <14436.1470940379@sss.pgh.pa.us>
* Refer to OS X as "macOS", except for the port name which is still "darwin".  (Tom Lane, 2016-09-25)

  We weren't terribly consistent about whether to call Apple's OS "OS X" or "Mac OS X", and the former is probably confusing to people who aren't Apple users. Now that Apple has rebranded it "macOS", follow their lead to establish a consistent naming pattern. Also, avoid the use of the ancient project name "Darwin", except as the port code name which does not seem desirable to change. (In short, this patch touches documentation and comments, but no actual code.)

  I didn't touch contrib/start-scripts/osx/, either. I suspect those are obsolete and due for a rewrite, anyway.

  I dithered about whether to apply this edit to old release notes, but those were responsible for quite a lot of the inconsistencies, so I ended up changing them too. Anyway, Apple's being ahistorical about this, so why shouldn't we be?
* Remove useless code.  (Tom Lane, 2016-09-23)

  Apparent copy-and-pasteo in standby_desc_invalidations() had two entries for msg->id == SHAREDINVALRELMAP_ID.

  Aleksander Alekseev

  Discussion: <20160923090814.GB1238@e733>
* Don't trust CreateFileMapping() to clear the error code on success.  (Tom Lane, 2016-09-23)

  We must test GetLastError() even when CreateFileMapping() returns a non-null handle. If that value were left over from some previous system call, we might be fooled into thinking the segment already existed.

  Experimentation on Windows 7 suggests that CreateFileMapping() clears the error code on success, but it is not documented to do so, so let's not rely on that happening in all Windows releases.

  Amit Kapila

  Discussion: <20811.1474390987@sss.pgh.pa.us>
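  The defensive pattern described above, sketched as a hypothetical fragment (variable names such as size_high, size_low, and segment_name are illustrative, not the committed code):

      HANDLE  hmap;

      SetLastError(0);                  /* don't trust stale error state */
      hmap = CreateFileMapping(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                               size_high, size_low, segment_name);
      if (hmap == NULL)
          elog(ERROR, "CreateFileMapping failed: error code %lu", GetLastError());
      if (GetLastError() == ERROR_ALREADY_EXISTS)
      {
          /* another process created the segment first; treat as a collision */
          CloseHandle(hmap);
          return false;
      }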
* Avoid using PostmasterRandom() for DSM control segment ID.  (Tom Lane, 2016-09-23)

  Commits 470d886c3 et al intended to fix the problem that the postmaster selected the same "random" DSM control segment ID on every start. But using PostmasterRandom() for that destroys the intended property that the delay between random_start_time and random_stop_time will be unpredictable. (Said delay is probably already more predictable than we could wish, but that doesn't mean that reducing it by a couple orders of magnitude is OK.)

  Revert the previous patch and add a comment warning against misuse of PostmasterRandom. Fix the original problem by calling srandom() early in PostmasterMain, using a low-security seed that will later be overwritten by PostmasterRandom.

  Discussion: <20789.1474390434@sss.pgh.pa.us>
* C comment: fix function header comment  (Bruce Momjian, 2016-09-22)

  Fix for transformOnConflictClause().

  Author: Tomonari Katsumata
* Remove nearly-unused SizeOfIptrData macro.  (Tom Lane, 2016-09-22)

  Past refactorings have removed all but one reference to SizeOfIptrData (and that one place was in a pretty noncritical spot). Since nobody's complained, it seems probable that there are no supported compilers that don't think sizeof(ItemPointerData) is 6. If there are, we're wasting MAXALIGN per heap tuple anyway, so it's rather silly to worry about whether we can shave space in places like WAL records.

  Pavan Deolasee

  Discussion: <CABOikdOOawDda4hwLOT6zdA6MFfPLu3Z2YBZkX0JdayNS6JOeQ@mail.gmail.com>
* Be sure to rewind the tuplestore read pointer in non-leader CTEScan nodes.  (Tom Lane, 2016-09-22)

  ExecInitCteScan supposed that it didn't have to do anything to the extra tuplestore read pointer it gets from tuplestore_alloc_read_pointer. However, it needs this read pointer to be positioned at the start of the tuplestore, while tuplestore_alloc_read_pointer is actually defined as cloning the current position of read pointer 0. In normal situations that accidentally works because we initialize the whole plan tree at once, before anything gets read. But it fails in an EvalPlanQual recheck, as illustrated in bug #14328 from Dima Pavlov.

  To fix, just forcibly rewind the pointer after tuplestore_alloc_read_pointer. The cost of doing so is negligible unless the tuplestore is already in TSS_READFILE state, which wouldn't happen in normal cases. We could consider altering tuplestore's API to make that case cheaper, but that would make for a more invasive back-patch and it doesn't seem worth it.

  This has been broken probably for as long as we've had CTEs, so back-patch to all supported branches.

  Discussion: <32468.1474548308@sss.pgh.pa.us>
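  A sketch of the rewind step described above (hypothetical fragment in the spirit of nodeCtescan.c; the actual committed code may differ):

      node->readptr = tuplestore_alloc_read_pointer(tuplestorestate, node->eflags);

      /* The new pointer clones read pointer 0's *current* position, which may
       * already have advanced (e.g. in an EvalPlanQual recheck), so force it
       * back to the start of the tuplestore. */
      tuplestore_select_read_pointer(tuplestorestate, node->readptr);
      tuplestore_rescan(tuplestorestate);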
* Delay updating control file to "in production"  (Peter Eisentraut, 2016-09-21)

  Move the updating of the control file to "in production" status until the point where WAL writes are allowed. Before, there could be a significant gap between the control file update and write transactions actually being allowed. This makes it more reliable to use the control status to verify the end of a promotion.

  From: Michael Paquier <michael.paquier@gmail.com>
* pg_ctl: Detect current standby state from pg_control  (Peter Eisentraut, 2016-09-21)

  pg_ctl used to determine whether a server was in standby mode by looking for a recovery.conf file. With this change, it instead looks into pg_control, which is potentially more accurate. There are also occasional discussions about removing recovery.conf, so this removes one dependency.

  Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Use PostmasterRandom(), not random(), for DSM control segment ID.  (Robert Haas, 2016-09-20)

  Otherwise, every startup gets the same "random" value, which is definitely not what was intended.