aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
...
* Move pg_crc.c to src/common, and remove pg_crc_tables.hHeikki Linnakangas2015-02-09
| | | | | | | | | | | | | | | | | To get CRC functionality in a client program, you now need to link with libpgcommon instead of libpgport. The CRC code has nothing to do with portability, so libpgcommon is a better home. (libpgcommon didn't exist when pg_crc.c was originally moved to src/port.) Remove the possibility to get CRC functionality by just #including pg_crc_tables.h. I'm not aware of any extensions that actually did that and couldn't simply link with libpgcommon. This also moves the pg_crc.h header file from src/include/utils to src/include/common, which will require changes to any external programs that currently does #include "utils/pg_crc.h". That seems acceptable, as include/common is clearly the right home for it now, and the change needed to any such programs is trivial.
* Move pg_lzcompress.c to src/common.Fujii Masao2015-02-09
| | | | | | | | | | | | | | | | | | | | | | The meta data of PGLZ symbolized by PGLZ_Header is removed, to make the compression and decompression code independent on the backend-only varlena facility. PGLZ_Header is being used to store some meta data related to the data being compressed like the raw length of the uncompressed record or some varlena-related data, making it unpluggable once PGLZ is stored in src/common as it contains some backend-only code paths with the management of varlena structures. The APIs of PGLZ are reworked at the same time to do only compression and decompression of buffers without the meta-data layer, simplifying its use for a more general usage. On-disk format is preserved as well, so there is no incompatibility with previous major versions of PostgreSQL for TOAST entries. Exposing compression and decompression APIs of pglz makes possible its use by extensions and contrib modules. Especially this commit is required for upcoming WAL compression feature so that the WAL reader facility can decompress the WAL data by using pglz_decompress. Michael Paquier, reviewed by me.
* Check DCH_MAX_ITEM_SIZ limits with <=, not <.Noah Misch2015-02-06
| | | | | | | | | We reserve space for the full amount, not one less. The affected checks deal with localized month and day names. Today's DCH_MAX_ITEM_SIZ value would suffice for a 60-byte day name, while the longest known is the 49-byte mn_CN.utf-8 word for "Saturday." Thus, the upshot of this change is merely to avoid misdirecting future readers of the code; users are not expected to see errors either way.
* Assert(PqCommReadingMsg) in pq_peekbyte().Noah Misch2015-02-06
| | | | | Interrupting pq_recvbuf() can break protocol sync, so its callers all deserve this assertion. The one pq_peekbyte() caller suffices already.
* Report WAL flush, not insert, position in replication IDENTIFY_SYSTEMHeikki Linnakangas2015-02-06
| | | | | | | | | | | When beginning streaming replication, the client usually issues the IDENTIFY_SYSTEM command, which used to return the current WAL insert position. That's not suitable for the intended purpose of that field, however. pg_receivexlog uses it to start replication from the reported point, but if it hasn't been flushed to disk yet, it will fail. Change IDENTIFY_SYSTEM to report the flush position instead. Backpatch to 9.1 and above. 9.0 doesn't report any WAL position.
* This routine was calling ecpg_alloc to allocate to memory but did notMichael Meskes2015-02-05
| | | | | | | actually check the returned pointer allocated, potentially NULL which could be the result of a malloc call. Issue noted by Coverity, fixed by Michael Paquier <michael@otacoo.com>
* Use a separate memory context for GIN scan keys.Heikki Linnakangas2015-02-04
| | | | | | | | | | It was getting tedious to track and release all the different things that form a scan key. We were leaking at least the queryCategories array, and possibly more, on a rescan. That was visible if a GIN index was used in a nested loop join. This also protects from leaks in extractQuery method. No backpatching, given the lack of complaints from the field. Maybe later, after this has received more field testing.
* Fix reference-after-free when waiting for another xact due to constraint.Heikki Linnakangas2015-02-04
| | | | | | | | | | | | | | | | | | | If an insertion or update had to wait for another transaction to finish, because there was another insertion with conflicting key in progress, we would pass a just-free'd item pointer to XactLockTableWait(). All calls to XactLockTableWait() and MultiXactIdWait() had similar issues. Some passed a pointer to a buffer in the buffer cache, after already releasing the lock. The call in EvalPlanQualFetch had already released the pin too. All but the call in execUtils.c would merely lead to reporting a bogus ctid, however (or an assertion failure, if enabled). All the callers that passed HeapTuple->t_data->t_ctid were slightly bogus anyway: if the tuple was updated (again) in the same transaction, its ctid field would point to the next tuple in the chain, not the tuple itself. Backpatch to 9.4, where the 'ctid' argument to XactLockTableWait was added (in commit f88d4cfc)
* Fix memory leaks on OOM in ecpg.Heikki Linnakangas2015-02-04
| | | | | | These are fairly obscure cases, but let's keep Coverity happy. Michael Paquier with some further fixes by me.
* Add missing float.h include to snprintf.c.Andres Freund2015-02-04
| | | | | | | | | On windows _isnan() (which isnan() is redirected to in port/win32.h) is declared in float.h, not math.h. Per buildfarm animal currawong. Backpatch to all supported branches.
* Add dummy PQsslAttributes function for non-SSL builds.Heikki Linnakangas2015-02-04
| | | | | | | All the other new SSL information functions had dummy versions in be-secure.c, but I missed PQsslAttributes(). Oops. Surprisingly, the linker did not complain about the missing function on most platforms represented in the buildfarm, even though it is exported, except for a few Windows systems.
* Remove ill-conceived Assertion in ProcessClientWriteInterrupt().Andres Freund2015-02-03
| | | | | | | | | | | | It's perfectly fine to have blocked interrupts when ProcessClientWriteInterrupt() is called. In fact it's commonly the case when emitting error reports. And we deal with that correctly. Even if that'd not be the case, it'd be a bad location for such a assertion. Because ProcessClientWriteInterrupt() is only called when the socket is blocked it's hard to hit. Per Heikki and buildfarm animals nightjar and dunlin.
* Remove remnants of ImmediateInterruptOK handling.Andres Freund2015-02-03
| | | | | | | Now that nothing sets ImmediateInterruptOK to true anymore, we can remove all the supporting code. Reviewed-By: Heikki Linnakangas
* Remove the option to service interrupts during PGSemaphoreLock().Andres Freund2015-02-03
| | | | | | | | | The remaining caller (lwlocks) doesn't need that facility, and we plan to remove ImmedidateInterruptOK entirely. That means that interrupts can't be serviced race-free and portably anyway, so there's little reason for keeping the feature. Reviewed-By: Heikki Linnakangas
* Move deadlock and other interrupt handling in proc.c out of signal handlers.Andres Freund2015-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Deadlock checking was performed inside signal handlers up to now. While it's a remarkable feat to have made this work reliably, it's quite complex to understand why that is the case. Partially it worked due to the assumption that semaphores are signal safe - which is not actually documented to be the case for sysv semaphores. The reason we had to rely on performing this work inside signal handlers is that semaphores aren't guaranteed to be interruptable by signals on all platforms. But now that latches provide a somewhat similar API, which actually has the guarantee of being interruptible, we can avoid doing so. Signalling between ProcSleep, ProcWakeup, ProcWaitForSignal and ProcSendSignal is now done using latches. This increases the likelihood of spurious wakeups. As spurious wakeup already were possible and aren't likely to be frequent enough to be an actual problem, this seems acceptable. This change would allow for further simplification of the deadlock checking, now that it doesn't have to run in a signal handler. But even if I were motivated to do so right now, it would still be better to do that separately. Such a cleanup shouldn't have to be reviewed a the same time as the more fundamental changes in this commit. There is one possible usability regression due to this commit. Namely it is more likely than before that log_lock_waits messages are output more than once. Reviewed-By: Heikki Linnakangas
* Don't allow immediate interrupts during authentication anymore.Andres Freund2015-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | We used to handle authentication_timeout by setting ImmediateInterruptOK to true during large parts of the authentication phase of a new connection. While that happens to work acceptably in practice, it's not particularly nice and has ugly corner cases. Previous commits converted the FE/BE communication to use latches and implemented support for interrupt handling during both send/recv. Building on top of that work we can get rid of ImmediateInterruptOK during authentication, by immediately treating timeouts during authentication as a reason to die. As die interrupts are handled immediately during client communication that provides a sensibly quick reaction time to authentication timeout. Additionally add a few CHECK_FOR_INTERRUPTS() to some more complex authentication methods. More could be added, but this already should provides a reasonable coverage. While it this overall increases the maximum time till a timeout is reacted to, it greatly reduces complexity and increases reliability. That seems like a overall win. If the increase proves to be noticeable we can deal with those cases by moving to nonblocking network code and add interrupt checking there. Reviewed-By: Heikki Linnakangas
* Remove unused "m" field in LSEG.Tom Lane2015-02-03
| | | | | | | | | | This field has been unreferenced since 1998, and does not appear in lseg values stored on disk (since sizeof(lseg) is only 32 bytes according to pg_type). There was apparently some idea of maintaining it just in values appearing in memory, but the bookkeeping required to make that work would surely far outweigh the cost of recalculating the line's slope when needed. Remove it to (a) simplify matters and (b) suppress some uninitialized-field whining from Coverity.
* Process 'die' interrupts while reading/writing from the client socket.Andres Freund2015-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up to now it was impossible to terminate a backend that was trying to send/recv data to/from the client when the socket's buffer was already full/empty. While the send/recv calls itself might have gotten interrupted by signals on some platforms, we just immediately retried. That could lead to situations where a backend couldn't be terminated , after a client died without the connection being closed, because it was blocked in send/recv. The problem was far more likely to be hit when sending data than when reading. That's because while reading a command from the client, and during authentication, we processed interrupts immediately . That primarily left COPY FROM STDIN as being problematic for recv. Change things so that that we process 'die' events immediately when the appropriate signal arrives. We can't sensibly react to query cancels at that point, because we might loose sync with the client as we could be in the middle of writing a message. We don't interrupt writes if the write buffer isn't full, as indicated by write() returning EWOULDBLOCK, as that would lead to fewer error messages reaching clients. Per discussion with Kyotaro HORIGUCHI and Heikki Linnakangas Discussion: 20140927191243.GD5423@alap3.anarazel.de
* Introduce and use infrastructure for interrupt processing during client reads.Andres Freund2015-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up to now large swathes of backend code ran inside signal handlers while reading commands from the client, to allow for speedy reaction to asynchronous events. Most prominently shared invalidation and NOTIFY handling. That means that complex code like the starting/stopping of transactions is run in signal handlers... The required code was fragile and verbose, and is likely to contain bugs. That approach also severely limited what could be done while communicating with the client. As the read might be from within openssl it wasn't safely possible to trigger an error, e.g. to cancel a backend in idle-in-transaction state. We did that in some cases, namely fatal errors, nonetheless. Now that FE/BE communication in the backend employs non-blocking sockets and latches to block, we can quite simply interrupt reads from signal handlers by setting the latch. That allows us to signal an interrupted read, which is supposed to be retried after returning from within the ssl library. As signal handlers now only need to set the latch to guarantee timely interrupt processing, remove a fair amount of complicated & fragile code from async.c and sinval.c. We could now actually start to process some kinds of interrupts, like sinval ones, more often that before, but that seems better done separately. This work will hopefully allow to handle cases like being blocked by sending data, interrupting idle transactions and similar to be implemented without too much effort. In addition to allowing getting rid of ImmediateInterruptOK, that is. Author: Andres Freund Reviewed-By: Heikki Linnakangas
* Use a nonblocking socket for FE/BE communication and block using latches.Andres Freund2015-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows to introduce more elaborate handling of interrupts while reading from a socket. Currently some interrupt handlers have to do significant work from inside signal handlers, and it's very hard to correctly write code to do so. Generic signal handler limitations, combined with the fact that we can't safely jump out of a signal handler while reading from the client have prohibited implementation of features like timeouts for idle-in-transaction. Additionally we use the latch code to wait in a couple places where we previously only had waiting code on windows as other platforms just busy looped. This can increase the number of systemcalls happening during FE/BE communication. Benchmarks so far indicate that the impact isn't very high, and there's room for optimization in the latch code. The chance of cleaning up the usage of latches gives us, seem to outweigh the risk of small performance regressions. This commit theoretically can't used without the next patch in the series, as WaitLatchOrSocket is not defined to be fully signal safe. As we already do that in some cases though, it seems better to keep the commits separate, so they're easier to understand. Author: Andres Freund Reviewed-By: Heikki Linnakangas
* Fix breakage in GEODEBUG debug code.Tom Lane2015-02-03
| | | | | | | | | | | | | | | LINE doesn't have an "m" field (anymore anyway). Also fix unportable assumption that %x can print the result of pointer subtraction. In passing, improve single_decode() in minor ways: * Remove unnecessary leading-whitespace skip (strtod does that already). * Make GEODEBUG message more intelligible. * Remove entirely-useless test to see if strtod returned a silly pointer. * Don't bother computing trailing-whitespace skip unless caller wants an ending pointer. This has been broken since 261c7d4b653bc3e44c31fd456d94f292caa50d8f. Although it's only debug code, might as well fix the 9.4 branch too.
* Add API functions to libpq to interrogate SSL related stuff.Heikki Linnakangas2015-02-03
| | | | | | | | | | | This makes it possible to query for things like the SSL version and cipher used, without depending on OpenSSL functions or macros. That is a good thing if we ever get another SSL implementation. PQgetssl() still works, but it should be considered as deprecated as it only works with OpenSSL. In particular, PQgetSslInUse() should be used to check if a connection uses SSL, because as soon as we have another implementation, PQgetssl() will return NULL even if SSL is in use.
* Refactor page compactifying code.Heikki Linnakangas2015-02-03
| | | | | | | | The logic to compact away removed tuples from page was duplicated with small differences in PageRepairFragmentation, PageIndexMultiDelete, and PageIndexDeleteNoCompact. Put it into a common function. Reviewed by Peter Geoghegan.
* Fix typo in comment.Heikki Linnakangas2015-02-03
| | | | Amit Langote
* Add new function BackgroundWorkerInitializeConnectionByOid.Robert Haas2015-02-02
| | | | | | Sometimes it's useful for a background worker to be able to initialize its database connection by OID rather than by name, so provide a way to do that.
* Be more careful to not lose sync in the FE/BE protocol.Heikki Linnakangas2015-02-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If any error occurred while we were in the middle of reading a protocol message from the client, we could lose sync, and incorrectly try to interpret a part of another message as a new protocol message. That will usually lead to an "invalid frontend message" error that terminates the connection. However, this is a security issue because an attacker might be able to deliberately cause an error, inject a Query message in what's supposed to be just user data, and have the server execute it. We were quite careful to not have CHECK_FOR_INTERRUPTS() calls or other operations that could ereport(ERROR) in the middle of processing a message, but a query cancel interrupt or statement timeout could nevertheless cause it to happen. Also, the V2 fastpath and COPY handling were not so careful. It's very difficult to recover in the V2 COPY protocol, so we will just terminate the connection on error. In practice, that's what happened previously anyway, as we lost protocol sync. To fix, add a new variable in pqcomm.c, PqCommReadingMsg, that is set whenever we're in the middle of reading a message. When it's set, we cannot safely ERROR out and continue running, because we might've read only part of a message. PqCommReadingMsg acts somewhat similarly to critical sections in that if an error occurs while it's set, the error handler will force the connection to be terminated, as if the error was FATAL. It's not implemented by promoting ERROR to FATAL in elog.c, like ERROR is promoted to PANIC in critical sections, because we want to be able to use PG_TRY/CATCH to recover and regain protocol sync. pq_getmessage() takes advantage of that to prevent an OOM error from terminating the connection. To prevent unnecessary connection terminations, add a holdoff mechanism similar to HOLD/RESUME_INTERRUPTS() that can be used hold off query cancel interrupts, but still allow die interrupts. The rules on which interrupts are processed when are now a bit more complicated, so refactor ProcessInterrupts() and the calls to it in signal handlers so that the signal handlers always call it if ImmediateInterruptOK is set, and ProcessInterrupts() can decide to not do anything if the other conditions are not met. Reported by Emil Lenngren. Patch reviewed by Noah Misch and Andres Freund. Backpatch to all supported versions. Security: CVE-2015-0244
* port/snprintf(): fix overflow and do paddingBruce Momjian2015-02-02
| | | | | | | | | | | | | Prevent port/snprintf() from overflowing its local fixed-size buffer and pad to the desired number of digits with zeros, even if the precision is beyond the ability of the native sprintf(). port/snprintf() is only used on systems that lack a native snprintf(). Reported by Bruce Momjian. Patch by Tom Lane. Backpatch to all supported versions. Security: CVE-2015-0242
* to_char(): prevent writing beyond the allocated bufferBruce Momjian2015-02-02
| | | | | | | | | | Previously very long localized month and weekday strings could overflow the allocated buffers, causing a server crash. Reported and patch reviewed by Noah Misch. Backpatch to all supported versions. Security: CVE-2015-0241
* to_char(): prevent accesses beyond the allocated bufferBruce Momjian2015-02-02
| | | | | | | | | | Previously very long field masks for floats could access memory beyond the existing buffer allocated to hold the result. Reported by Andres Freund and Peter Geoghegan. Backpatch to all supported versions. Security: CVE-2015-0241
* Translation updatesPeter Eisentraut2015-02-01
| | | | | Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 19c72ea8d856d7b1d4f5d759a766c8206bf9ce53
* Fix documentation of psql's ECHO all mode.Tom Lane2015-01-31
| | | | | | | | | | | | | "ECHO all" is ignored for interactive input, and has been for a very long time, though possibly not for as long as the documentation has claimed the opposite. Fix that, and also note that empty lines aren't echoed, which while dubious is another longstanding behavior (it's embedded in our regression test files for one thing). Per bug #12721 from Hans Ginzel. In HEAD, also improve the code comments in this area, and suppress an unnecessary fflush(stdout) when we're not echoing. That would likely be safe to back-patch, but I'll not risk it mere hours before a release wrap.
* Update time zone data files to tzdata release 2015a.Tom Lane2015-01-30
| | | | | DST law changes in Chile and Mexico (state of Quintana Roo). Historical changes for Iceland.
* Fix jsonb Unicode escape processing, and in consequence disallow \u0000.Tom Lane2015-01-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've been trying to support \u0000 in JSON values since commit 78ed8e03c67d7333, and have introduced increasingly worse hacks to try to make it work, such as commit 0ad1a816320a2b53. However, it fundamentally can't work in the way envisioned, because the stored representation looks the same as for \\u0000 which is not the same thing at all. It's also entirely bogus to output \u0000 when de-escaped output is called for. The right way to do this would be to store an actual 0x00 byte, and then throw error only if asked to produce de-escaped textual output. However, getting to that point seems likely to take considerable work and may well never be practical in the 9.4.x series. To preserve our options for better behavior while getting rid of the nasty side-effects of 0ad1a816320a2b53, revert that commit in toto and instead throw error if \u0000 is used in a context where it needs to be de-escaped. (These are the same contexts where non-ASCII Unicode escapes throw error if the database encoding isn't UTF8, so this behavior is by no means without precedent.) In passing, make both the \u0000 case and the non-ASCII Unicode case report ERRCODE_UNTRANSLATABLE_CHARACTER / "unsupported Unicode escape sequence" rather than claiming there's something wrong with the input syntax. Back-patch to 9.4, where we have to do something because 0ad1a816320a2b53 broke things for many cases having nothing to do with \u0000. 9.3 also has bogus behavior, but only for that specific escape value, so given the lack of field complaints it seems better to leave 9.3 alone.
* Provide a way to supress the "out of memory" error when allocating.Robert Haas2015-01-30
| | | | | | | | Using the new interface MemoryContextAllocExtended, callers can specify MCXT_ALLOC_NO_OOM if they are prepared to handle a NULL return value. Michael Paquier, reviewed and somewhat revised by me.
* Fix assorted oversights in range selectivity estimation.Tom Lane2015-01-30
| | | | | | | | | | | | | | | | | | | calc_rangesel() failed outright when comparing range variables to empty constant ranges with < or >=, as a result of missing cases in a switch. It also produced a bogus estimate for > comparison to an empty range. On top of that, the >= and > cases were mislabeled throughout. For nonempty constant ranges, they managed to produce the right answers anyway as a result of counterbalancing typos. Also, default_range_selectivity() omitted cases for elem <@ range, range &< range, and range &> range, so that rather dubious defaults were applied for these operators. In passing, rearrange the code in rangesel() so that the elem <@ range case is handled in a less opaque fashion. Report and patch by Emre Hasegeli, some additional work by me
* Fix query-duration memory leak with GIN rescans.Heikki Linnakangas2015-01-30
| | | | | | | | | | | | | | | | | | | | | The requiredEntries / additionalEntries arrays were not freed in freeScanKeys() like other per-key stuff. It's not obvious, but startScanKey() was only ever called after the keys have been initialized with ginNewScanKey(). That's why it doesn't need to worry about freeing existing arrays. The ginIsNewKey() test in gingetbitmap was never true, because ginrescan free's the existing keys, and it's not OK to call gingetbitmap twice in a row without calling ginrescan in between. To make that clear, remove the unnecessary ginIsNewKey(). And just to be extra sure that nothing funny happens if there is an existing key after all, call freeScanKeys() to free it if it exists. This makes the code more straightforward. (I'm seeing other similar leaks in testing a query that rescans an GIN index scan, but that's a different issue. This just fixes the obvious leak with those two arrays.) Backpatch to 9.4, where GIN fast scan was added.
* Allow pg_dump to use jobs and serializable transactions together.Kevin Grittner2015-01-30
| | | | | | | | | | | | | | | | | | | | | Since 9.3, when the --jobs option was introduced, using it together with the --serializable-deferrable option generated multiple errors. We can get correct behavior by allowing the connection which acquires the snapshot to use SERIALIZABLE, READ ONLY, DEFERRABLE and pass that to the workers running the other connections using REPEATABLE READ, READ ONLY. This is a bit of a kluge since the SERIALIZABLE behavior is achieved by running some of the participating connections at a different isolation level, but it is a simple and safe change, suitable for back-patching. This will be followed by a proposal for a more invasive fix with some slight behavioral changes on just the master branch, based on suggestions from Andres Freund, but the kluge will be applied to master until something is agreed along those lines. Back-patched to 9.3, where the --jobs option was added. Based on report from Alexander Korotkov
* Fix BuildIndexValueDescription for expressionsStephen Frost2015-01-29
| | | | | | | | | | | | | | | In 804b6b6db4dcfc590a468e7be390738f9f7755fb we modified BuildIndexValueDescription to pay attention to which columns are visible to the user, but unfortunatley that commit neglected to consider indexes which are built on expressions. Handle error-reporting of violations of constraint indexes based on expressions by not returning any detail when the user does not have table-level SELECT rights. Backpatch to 9.0, as the prior commit was. Pointed out by Tom.
* Properly terminate the array returned by GetLockConflicts().Andres Freund2015-01-29
| | | | | | | | | | | | | | | | | | | | GetLockConflicts() has for a long time not properly terminated the returned array. During normal processing the returned array is zero initialized which, while not pretty, is sufficient to be recognized as a invalid virtual transaction id. But the HotStandby case is more than aesthetically broken: The allocated (and reused) array is neither zeroed upon allocation, nor reinitialized, nor terminated. Not having a terminating element means that the end of the array will not be recognized and that recovery conflict handling will thus read ahead into adjacent memory. Only terminating when hitting memory content that looks like a invalid virtual transaction id. Luckily this seems so far not have caused significant problems, besides making recovery conflict more expensive. Discussion: 20150127142713.GD29457@awork2.anarazel.de Backpatch into all supported branches.
* Align buffer descriptors to cache line boundaries.Andres Freund2015-01-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Benchmarks has shown that aligning the buffer descriptor array to cache lines is important for scalability; especially on bigger, multi-socket, machines. Currently the array sometimes already happens to be aligned by happenstance, depending how large previous shared memory allocations were. That can lead to wildly varying performance results after minor configuration changes. In addition to aligning the start of descriptor array, also force the size of individual descriptors to be of a common cache line size (64 bytes). That happens to already be the case on 64bit platforms, but this way we can change the struct BufferDesc more easily. As the alignment primarily matters in highly concurrent workloads which probably all are 64bit these days, and the space wastage of element alignment would be a bit more noticeable on 32bit systems, we don't force the stride to be cacheline sized on 32bit platforms for now. If somebody does actual performance testing, we can reevaluate that decision by changing the definition of BUFFERDESC_PADDED_SIZE. Discussion: 20140202151319.GD32123@awork2.anarazel.de Per discussion with Bruce Momjan, Tom Lane, Robert Haas, and Peter Geoghegan.
* Fix #ifdefed'ed out code to compile again.Andres Freund2015-01-29
|
* Fix bug where GIN scan keys were not initialized with gin_fuzzy_search_limit.Heikki Linnakangas2015-01-29
| | | | | | | | | | | | | | | | | | | | | | | | When gin_fuzzy_search_limit was used, we could jump out of startScan() without calling startScanKey(). That was harmless in 9.3 and below, because startScanKey()() didn't do anything interesting, but in 9.4 it initializes information needed for skipping entries (aka GIN fast scans), and you readily get a segfault if it's not done. Nevertheless, it was clearly wrong all along, so backpatch all the way to 9.1 where the early return was introduced. (AFAICS startScanKey() did nothing useful in 9.3 and below, because the fields it initialized were already initialized in ginFillScanKey(), but I don't dare to change that in a minor release. ginFillScanKey() is always called in gingetbitmap() even though there's a check there to see if the scan keys have already been initialized, because they never are; ginrescan() free's them.) In the passing, remove unnecessary if-check from the second inner loop in startScan(). We already check in the first loop that the condition is true for all entries. Reported by Olaf Gawenda, bug #12694, Backpatch to 9.1 and above, although AFAICS it causes a live bug only in 9.4.
* Move out-of-memory error checks from aset.c to mcxt.cRobert Haas2015-01-29
| | | | | | | | This potentially allows us to add mcxt.c interfaces that do something other than throw an error when memory cannot be allocated. We'll handle adding those interfaces in a separate commit. Michael Paquier, with minor changes by me
* Add usebypassrls to pg_user and pg_shadowStephen Frost2015-01-28
| | | | | | | | | | | | | | | The row level security patches didn't add the 'usebypassrls' columns to the pg_user and pg_shadow views on the belief that they were deprecated, but we havn't actually said they are and therefore we should include it. This patch corrects that, adds missing documentation for rolbypassrls into the system catalog page for pg_authid, along with the entries for pg_user and pg_shadow, and cleans up a few other uses of 'row-level' cases to be 'row level' in the docs. Pointed out by Amit Kapila. Catalog version bump due to system view changes.
* Clean up range-table building in copy.cStephen Frost2015-01-28
| | | | | | | | | | | | | | | | | | | Commit 804b6b6db4dcfc590a468e7be390738f9f7755fb added the build of a range table in copy.c to initialize the EState es_range_table since it can be needed in error paths. Unfortunately, that commit didn't appreciate that some code paths might end up not initializing the rte which is used to build the range table. Fix that and clean up a couple others things along the way- build it only once and don't explicitly set it on the !is_from path as it doesn't make any sense there (cstate is palloc0'd, so this isn't an issue from an initializing standpoint either). The prior commit went back to 9.0, but this only goes back to 9.1 as prior to that the range table build happens immediately after building the RTE and therefore doesn't suffer from this issue. Pointed out by Robert.
* Fix column-privilege leak in error-message pathsStephen Frost2015-01-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | While building error messages to return to the user, BuildIndexValueDescription, ExecBuildSlotValueDescription and ri_ReportViolation would happily include the entire key or entire row in the result returned to the user, even if the user didn't have access to view all of the columns being included. Instead, include only those columns which the user is providing or which the user has select rights on. If the user does not have any rights to view the table or any of the columns involved then no detail is provided and a NULL value is returned from BuildIndexValueDescription and ExecBuildSlotValueDescription. Note that, for key cases, the user must have access to all of the columns for the key to be shown; a partial key will not be returned. Further, in master only, do not return any data for cases where row security is enabled on the relation and row security should be applied for the user. This required a bit of refactoring and moving of things around related to RLS- note the addition of utils/misc/rls.c. Back-patch all the way, as column-level privileges are now in all supported versions. This has been assigned CVE-2014-8161, but since the issue and the patch have already been publicized on pgsql-hackers, there's no point in trying to hide this commit.
* Fix typo in comment.Heikki Linnakangas2015-01-28
|
* Remove dead NULL-pointer checks in GiST code.Heikki Linnakangas2015-01-28
| | | | | | | | | | | | | | | | | | | | gist_poly_compress() and gist_circle_compress() checked for a NULL-pointer key argument, but that was dead code; the gist code never passes a NULL-pointer to the "compress" method. This commit also removes a documentation note added in commit a0a3883, about doing NULL-pointer checks in the "compress" method. It was added based on the fact that some implementations were doing NULL-pointer checks, but those checks were unnecessary in the first place. The NULL-pointer check in gbt_var_same() function was also unnecessary. The arguments to the "same" method come from the "compress", "union", or "picksplit" methods, but none of them return a NULL pointer. None of this is to be confused with SQL NULL values. Those are dealt with by the gist machinery, and are never passed to the GiST opclass methods. Michael Paquier
* Fix NUMERIC field access macros to treat NaNs consistently.Tom Lane2015-01-27
| | | | | | | | | | | | | | | | Commit 145343534c153d1e6c3cff1fa1855787684d9a38 arranged to store numeric NaN values as short-header numerics, but the field access macros did not get the memo: they thought only "SHORT" numerics have short headers. Most of the time this makes no difference because we don't access the weight or dscale of a NaN; but numeric_send does that. As pointed out by Andrew Gierth, this led to fetching uninitialized bytes. AFAICS this could not have any worse consequences than that; in particular, an unaligned stored numeric would have been detoasted by PG_GETARG_NUMERIC, so that there's no risk of a fetch off the end of memory. Still, the code is wrong on its own terms, and it's not hard to foresee future changes that might expose us to real risks. So back-patch to all affected branches.
* Add a note to PG_TRY's documentation about volatile safety.Tom Lane2015-01-26
| | | | We had better memorialize what the actual requirements are for this.