path: root/src/backend
Commit message, Author, Age
...
* Avoid palloc in critical section in GiST WAL-logging. (Heikki Linnakangas, 2014-04-03)
  Memory allocation can fail if you run out of memory, and inside a critical section that will lead to a PANIC. Use conservatively-sized arrays on the stack instead.
  There was previously no explicit limit on the number of pages a GiST split can produce; it was only limited by the number of LWLocks that can be held simultaneously (100 at the moment). This patch adds an explicit limit of 75 pages. That should be plenty; a typical split shouldn't produce more than 2-3 page halves.
  The bug has been there forever, but only backpatch down to 9.1. The code was changed significantly in 9.1, and it doesn't seem worth the risk or trouble to adapt this for 9.0 and 8.4.
* Fix assorted issues in client host name lookup. (Tom Lane, 2014-04-02)
  The code for matching clients to pg_hba.conf lines that specify host names (instead of IP address ranges) failed to complain if reverse DNS lookup failed; instead it silently didn't match, so that you might end up getting a surprising "no pg_hba.conf entry for ..." error, as seen in bug #9518 from Mike Blackwell. Since we don't want to make this a fatal error in situations where pg_hba.conf contains a mixture of host names and IP addresses (clients matching one of the numeric entries should not have to have rDNS data), remember the lookup failure and mention it as DETAIL if we get to "no pg_hba.conf entry". Apply the same approach to forward-DNS lookup failures, too, rather than treating them as immediate hard errors.
  Along the way, fix a couple of bugs that prevented us from detecting an rDNS lookup error reliably, and make sure that we make only one rDNS lookup attempt; formerly, if the lookup attempt failed, the code would try again for each host name entry in pg_hba.conf. Since more or less the whole point of this design is to ensure there's only one lookup attempt not one per entry, the latter point represents a performance bug that seems sufficient justification for back-patching.
  Also, adjust src/port/getaddrinfo.c so that it plays as well as it can with this code. Which is not all that well, since it does not have actual support for rDNS lookup, but at least it should return the expected (and required by spec) error codes so that the main code correctly perceives the lack of functionality as a lookup failure. It's unlikely that PG is still being used in production on any machines that require our getaddrinfo.c, so I'm not excited about working harder than this.
  To keep the code in the various branches similar, this includes back-patching commits c424d0d1052cb4053c8712ac44123f9b9a9aa3f2 and 1997f34db4687e671690ed054c8f30bb501b1168 into 9.2 and earlier.
  Back-patch to 9.1 where the facility for hostnames in pg_hba.conf was introduced.
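The "one lookup attempt, report the failure only as DETAIL" behavior described in this commit can be sketched roughly as follows. This is a simplified illustration, not the server's code: `ClientPort`, `match_hba`, the `resolver` callable and the entry tuples are all hypothetical names, and the forward-DNS confirmation step is omitted.

```python
import ipaddress


class ClientPort:
    """Per-connection state (hypothetical stand-in, not PostgreSQL's Port struct)."""

    def __init__(self, addr, resolver):
        self.addr = ipaddress.ip_address(addr)
        self.resolver = resolver      # callable: address string -> host name, raises OSError
        self.hostname = None
        self.lookup_done = False      # set after the first attempt, success or not
        self.lookup_error = None      # remembered so it can be reported later
        self.lookup_attempts = 0

    def reverse_name(self):
        # Perform the reverse lookup at most once per connection.
        if not self.lookup_done:
            self.lookup_done = True
            self.lookup_attempts += 1
            try:
                self.hostname = self.resolver(str(self.addr))
            except OSError as exc:
                self.lookup_error = str(exc)
        return self.hostname


def match_hba(port, entries):
    """entries is a list of ("cidr", network) or ("host", name) tuples."""
    for kind, value in entries:
        if kind == "cidr":
            # Numeric entries never need rDNS data.
            if port.addr in ipaddress.ip_network(value):
                return ("match", value)
        elif kind == "host":
            if port.reverse_name() == value:
                return ("match", value)
    # A lookup failure is not fatal by itself; it only becomes a detail
    # attached to the final "no entry" outcome.
    detail = None
    if port.lookup_error is not None:
        detail = f"could not resolve client address: {port.lookup_error}"
    return ("no pg_hba.conf entry", detail)
```

With two host-name entries and a failing resolver, the sketch makes a single lookup attempt and carries the error text along with the "no entry" result; a client that matches a numeric entry never triggers a lookup at all.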
* De-anonymize the union in JsonbValue. (Tom Lane, 2014-04-02)
  Needed for strict C89 compliance.
* Fix bugs in manipulation of PgBackendStatus.st_clienthostname. (Tom Lane, 2014-04-01)
  Initialization of this field was not being done according to the st_changecount protocol (it has to be done within the changecount increment range, not outside). And the test to see if the value should be reported as null was wrong. Noted while perusing uses of Port.remote_hostname.
  This was wrong from the introduction of this code (commit 4a25bc145), so back-patch to 9.1.
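As a rough, single-threaded illustration of what "within the changecount increment range" means: the writer bumps a counter before and after touching the fields, and a reader accepts a copy only if the counter was even and unchanged around the copy. The class and function names below are hypothetical, and the real protocol operates on shared memory rather than Python objects.

```python
class BackendStatus:
    """Toy status entry; field names are illustrative only."""

    def __init__(self):
        self.changecount = 0
        self.clienthostname = None
        self.activity = ""


def update_status(entry, **fields):
    entry.changecount += 1            # odd value: update in progress
    for name, value in fields.items():
        setattr(entry, name, value)   # every field write happens inside the range
    entry.changecount += 1            # even value: entry is stable again


def read_status(entry, max_tries=1000):
    for _ in range(max_tries):
        before = entry.changecount
        snapshot = (entry.clienthostname, entry.activity)
        after = entry.changecount
        # Accept the copy only if no update started or finished meanwhile.
        if before == after and before % 2 == 0:
            return snapshot
    raise RuntimeError("entry kept changing")
```

A field that is written outside the two increments can be observed half-initialized by a reader that sees a stable, even counter, which is the class of bug the commit describes.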
* Fix bug in the new GIN incomplete-split code. (Heikki Linnakangas, 2014-04-01)
  Inserting a downlink to an internal page clears the incomplete-split flag of the child's left sibling, so the left sibling's LSN also needs to be updated and it needs to be marked dirty. The codepath for an insertion got this right, but the case where the internal node is split because of inserting the new downlink missed that.
* Remove dead check for backup block, replace with Assert. (Heikki Linnakangas, 2014-04-01)
  We don't use backup blocks with GIN vacuum records anymore; the page is always recreated from scratch.
* Fix bug in the new B-tree incomplete-split code. (Heikki Linnakangas, 2014-04-01)
  Inserting a downlink to an internal page clears the incomplete-split flag of the child's left sibling, so the left sibling's LSN also needs to be updated.
* Mark FastPathStrongRelationLocks volatile. (Robert Haas, 2014-03-31)
  Otherwise, the compiler might decide to move modifications to data within this structure outside the enclosing SpinLockAcquire / SpinLockRelease pair, leading to shared memory corruption. This may or may not explain a recent lmgr-related buildfarm failure on prairiedog, but it needs to be fixed either way.
* Count buffers dirtied due to hints in pgBufferUsage.shared_blks_dirtied. (Robert Haas, 2014-03-31)
  Previously, such buffers weren't counted, with the possible result that EXPLAIN (BUFFERS) and pg_stat_statements would understate the true number of blocks dirtied by an SQL statement.
  Back-patch to 9.2, where this counter was introduced.
  Amit Kapila
* Fix thinko in logical decoding code. (Robert Haas, 2014-03-31)
  Andres Freund
* Rewrite the way GIN posting lists are packed on a page, to reduce WAL volume. (Heikki Linnakangas, 2014-03-31)
  Inserting (in retail) into the new 9.4 format GIN posting tree created much larger WAL records than in 9.3. The previous strategy to WAL logging was basically to log the whole page on each change, with the exception of completely unmodified segments up to the first modified one. That was not too bad when appending to the end of the page, as only the last segment had to be WAL-logged, but per Fujii Masao's testing, even that produced 2x the WAL volume that 9.3 did.
  The new strategy is to keep track of changes to the posting lists in a more fine-grained fashion, and also make the "repacking" code smarter to avoid decoding and re-encoding segments unnecessarily.
* Rename GinLogicValue to GinTernaryValue. (Heikki Linnakangas, 2014-03-31)
  It's more descriptive. Also, get rid of the enum, and use #defines instead, per Greg Stark's suggestion.
* Adjust getpwuid() fix commit to display errno string on failure (Bruce Momjian, 2014-03-28)
  This adjusts patch 613c6d26bd42dd8c2dd0664315be9551475b8864.
* Fix EquivalenceClass processing for nested append relations. (Tom Lane, 2014-03-28)
  The original coding of EquivalenceClasses didn't foresee that appendrel child relations might themselves be appendrels; but this is possible for example when a UNION ALL subquery scans a table with inheritance children. The oversight led to failure to optimize ordering-related issues very well for the grandchild tables. After some false starts involving explicitly flattening the appendrel representation, we found that this could be fixed easily by removing a few implicit assumptions about appendrel parent rels not being children themselves.
  Kyotaro Horiguchi and Tom Lane, reviewed by Noah Misch
* Un-break peer authentication. (Tom Lane, 2014-03-28)
  Commit 613c6d26bd42dd8c2dd0664315be9551475b8864 sloppily replaced a lookup of the UID obtained from getpeereid() with a lookup of the server's own user name, thus totally destroying peer authentication. Revert. Per report from Christoph Berg.
  In passing, make sure get_user_name() zeroes *errstr on success on Windows as well as non-Windows. I don't think any callers actually depend on this ATM, but we should be consistent across platforms.
* Silence compiler warnings in new jsonb code. (Heikki Linnakangas, 2014-03-27)
  Amit Kapila.
* Fix uninitialized variables in json's populate_record_worker(). (Andrew Dunstan, 2014-03-26)
  Peter Geoghegan.
* Pass more than the first XLogRecData entry to rm_desc, with WAL_DEBUG. (Heikki Linnakangas, 2014-03-26)
  If you compile with WAL_DEBUG and enable it with wal_debug=on, we used to only pass the first XLogRecData entry to the rm_desc routine. I think the original assumption was that the first XLogRecData entry contains all the necessary information for the rm_desc routine, but that's a pretty shaky assumption. At least standby_redo didn't get the memo.
  To fix, piece together all the data in a temporary buffer, and pass that to the rm_desc routine.
  It's been like this forever, but the patch didn't apply cleanly to back-branches. Probably wouldn't be hard to fix the conflicts, but it's not worth the trouble.
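The "piece together all the data in a temporary buffer" step can be pictured as walking a linked chain of chunks and concatenating them before handing the result to a describe routine. The sketch below is only an analogy: `RecData` and `assemble` are hypothetical names standing in for the chained record-data entries, not the actual C structures.

```python
class RecData:
    """One chunk in a singly linked chain (hypothetical stand-in)."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def assemble(rdata):
    # Concatenate every chunk in the chain, not just the first one.
    buf = bytearray()
    node = rdata
    while node is not None:
        buf += node.data
        node = node.next
    return bytes(buf)
```

Passing only `rdata.data` to a consumer corresponds to the old behavior; passing `assemble(rdata)` corresponds to the fix.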
* Cleanup around json_to_record/json_to_recordset (Andrew Dunstan, 2014-03-26)
  Set function parameter names and defaults. Add jsonb versions (which the code already provided for so the actual new code is trivial). Add jsonb regression tests and docs.
  Bump catalog version (which I apparently forgot to do when jsonb was committed).
* Fix 'recheck' flag in tsquery's GIN tri-consistent function. (Heikki Linnakangas, 2014-03-26)
  It needs to be initialized, like in the boolean gin_tsquery_consistent version.
  Peter Geoghegan.
* Tidy up the populate/to_record{set} code for json a bit. (Andrew Dunstan, 2014-03-25)
  In the process fix a small bug.
* Don't forget to flush XLOG_PARAMETER_CHANGE record. (Fujii Masao, 2014-03-26)
  Backpatch to 9.0 where XLOG_PARAMETER_CHANGE record was introduced.
* Remove wchar.c Asserts that were stricter than the main code (Bruce Momjian, 2014-03-24)
  Assert errors were thrown for functions being passed invalid encodings, while the main code handled it just fine. Also document that libpq's PQclientEncoding() returns -1 for an encoding lookup failure.
  Per report from Peter Geoghegan
* Fix ts_rank_cd() to ignore stripped lexemes (Bruce Momjian, 2014-03-24)
  Previously, stripped lexemes got a default location and could be considered if mixed with non-stripped lexemes.
  BACKWARD INCOMPATIBILITY CHANGE
* Change ginMergeItemPointers to return a palloc'd array. (Heikki Linnakangas, 2014-03-24)
  That seems nicer than making it the caller's responsibility to pass a suitable-sized array. All the callers were just palloc'ing an array anyway.
* Remove dead code and add comments. (Heikki Linnakangas, 2014-03-24)
  'cbuffer' variable was left over from an earlier version of the patch to rewrite the incomplete split handling.
* Fix "the the" typos. (Heikki Linnakangas, 2014-03-24)
  Erik Rijkers
* Introduce jsonb, a structured format for storing json. (Andrew Dunstan, 2014-03-23)
  The new format accepts exactly the same data as the json type. However, it is stored in a format that does not require reparsing the original text in order to process it, making it much more suitable for indexing and other operations. Insignificant whitespace is discarded, and the order of object keys is not preserved. Neither are duplicate object keys kept - the later value for a given key is the only one stored.
  The new type has all the functions and operators that the json type has, with the exception of the json generation functions (to_json, json_agg etc.) and with identical semantics. In addition, there are operator classes for hash and btree indexing, and two classes for GIN indexing, that have no equivalent in the json type.
  This feature grew out of previous work by Oleg Bartunov and Teodor Sigaev, which was intended to provide similar facilities to a nested hstore type, but which in the end proved to have some significant compatibility issues.
  Authors: Oleg Bartunov, Teodor Sigaev, Peter Geoghegan and Andrew Dunstan.
  Review: Andres Freund
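The observable semantics listed above (whitespace dropped, key order not preserved, later duplicate key wins) can be mimicked with Python's standard json module. This is only an analogy for the behavior, not a description of the jsonb storage format; `normalize` is a hypothetical helper, and sorting keys is an arbitrary choice made here to get a deterministic order.

```python
import json


def normalize(text):
    # json.loads keeps only the later value when an object repeats a key.
    value = json.loads(text)
    # Re-serialize without insignificant whitespace and with an order that
    # does not depend on how the input was written.
    return json.dumps(value, sort_keys=True, separators=(",", ":"))
```

For example, `normalize('{ "b": 1,  "a": 2, "b": 3 }')` yields `{"a":2,"b":3}`: the first value for "b" and the original key order are both gone.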
* Offer triggers on foreign tables. (Noah Misch, 2014-03-23)
  This covers all the SQL-standard trigger types supported for regular tables; it does not cover constraint triggers. The approach for acquiring the old row mirrors that for view INSTEAD OF triggers. For AFTER ROW triggers, we spool the foreign tuples to a tuplestore.
  This changes the FDW API contract; when deciding which columns to populate in the slot returned from data modification callbacks, writable FDWs will need to check for AFTER ROW triggers in addition to checking for a RETURNING clause.
  In support of the feature addition, refactor the TriggerFlags bits and the assembly of old tuples in ModifyTable.
  Ronan Dunklau, reviewed by KaiGai Kohei; some additional hacking by me.
* Improve comments about AfterTriggerBeginQuery() query level usage. (Noah Misch, 2014-03-23)
* Address ccvalid/ccnoinherit in TupleDesc support functions. (Noah Misch, 2014-03-23)
  equalTupleDescs() neglected both of these ConstrCheck fields, and CreateTupleDescCopyConstr() neglected ccnoinherit. At this time, the only known behavior defect resulting from these omissions is constraint exclusion disregarding a CHECK constraint validated by an ALTER TABLE VALIDATE CONSTRAINT statement issued earlier in the same transaction. Back-patch to 9.2, where these fields were introduced.
* Fix build with LWLOCK_STATS or dtrace. (Heikki Linnakangas, 2014-03-21)
  Also fix the name of the dtrace probe for LWLockAcquireOrWait(). The function was renamed from LWLockWaitUntilFree to LWLockAcquireOrWait, but the dtrace probe was neglected.
  Pointed out by Andres Freund and the buildfarm.
* Remove MinGW readdir/errno bug workaround fixed on 2003-10-10 (Bruce Momjian, 2014-03-21)
* Properly check for readdir/closedir() failures (Bruce Momjian, 2014-03-21)
  Clear errno before calling readdir() and handle old MinGW errno bug while adding full test coverage for readdir/closedir failures.
  Backpatch through 8.4.
* Replace the XLogInsert slots with regular LWLocks. (Heikki Linnakangas, 2014-03-21)
  The special feature the XLogInsert slots had over regular LWLocks is the insertingAt value that was updated atomically with releasing backends waiting on it. Add new functions to the LWLock API to do that, and replace the slots with LWLocks. This reduces the amount of duplicated code. (There's still some duplication, but at least it's all in lwlock.c now.)
  Reviewed by Andres Freund.
* Again fix initialization of auto-tuned effective_cache_size. (Tom Lane, 2014-03-20)
  The previous method was overly complex and underly correct; in particular, by assigning the default value with PGC_S_OVERRIDE, it prevented later attempts to change the setting in postgresql.conf, as noted by Jeff Janes. We should just assign the default value with source PGC_S_DYNAMIC_DEFAULT, which will have the desired priority relative to the boot_val as well as user-set values.
  There is still a gap in this method: if there's an explicit assignment of effective_cache_size = -1 in the postgresql.conf file, and that assignment appears before shared_buffers is assigned, the code will substitute 4 times the bootstrap default for shared_buffers, and that value will then persist (since it will have source PGC_S_FILE). I don't see any very nice way to avoid that though, and it's not a case to be expected in practice. The existing comments in guc-file.l look forward to a redesign of the DYNAMIC_DEFAULT mechanism; if that ever happens, we should consider this case as one of the things we'd like to improve.
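Why the choice of source matters can be shown with a toy model of source priorities: an assignment is accepted only if its source ranks at least as high as the source of the current value. The sketch assumes a reduced set of sources with made-up numeric ranks (only their relative order is meant to reflect the commit's description), and `Setting` is a hypothetical class, not the server's GUC machinery.

```python
from enum import IntEnum


class Source(IntEnum):
    # Reduced, illustrative subset; numeric values are arbitrary ranks.
    DEFAULT = 0           # compiled-in boot value
    DYNAMIC_DEFAULT = 1   # default computed at startup
    FILE = 2              # postgresql.conf
    OVERRIDE = 3          # forced value that later assignments cannot replace


class Setting:
    def __init__(self, boot_val):
        self.value = boot_val
        self.source = Source.DEFAULT

    def assign(self, value, source):
        # A lower-priority source cannot replace a higher-priority value.
        if source < self.source:
            return False
        self.value, self.source = value, source
        return True
```

In this model, a computed default installed with `Source.OVERRIDE` makes a later `Source.FILE` assignment a no-op, while the same default installed with `Source.DYNAMIC_DEFAULT` still outranks the boot value but yields to the configuration file.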
* Setup error context callback for transaction lock waits (Alvaro Herrera, 2014-03-19)
  With this in place, a session blocking behind another one because of tuple locks will get a context line mentioning the relation name, tuple TID, and operation being done on tuple. For example:
      LOG: process 11367 still waiting for ShareLock on transaction 717 after 1000.108 ms
      DETAIL: Process holding the lock: 11366. Wait queue: 11367.
      CONTEXT: while updating tuple (0,2) in relation "foo"
      STATEMENT: UPDATE foo SET value = 3;
  Most usefully, the new line is displayed by log entries due to log_lock_waits, although of course it will be printed by any other log message as well.
  Author: Christian Kruse, some tweaks by Álvaro Herrera
  Reviewed-by: Amit Kapila, Andres Freund, Tom Lane, Robert Haas
* Fix memory leak during regular expression execution. (Tom Lane, 2014-03-19)
  For a regex containing backrefs, pg_regexec() might fail to free all the sub-DFAs that were created during execution, resulting in a permanent (session lifespan) memory leak. Problem was introduced by me in commit 587359479acbbdc95c8e37da40707e37097423f5. Per report from Sandro Santilli; diagnosis by Greg Stark.
* Remove rm_safe_restartpoint machinery. (Heikki Linnakangas, 2014-03-18)
  It is no longer used; none of the resource managers have multi-record actions that would make it unsafe to perform a restartpoint.
  Also don't allow rm_cleanup to write WAL records; that is also no longer required. Move the call to rm_cleanup routines to make it more symmetric with rm_startup.
* Make the handling of interrupted B-tree page splits more robust. (Heikki Linnakangas, 2014-03-18)
  Splitting a page consists of two separate steps: splitting the child page, and inserting the downlink for the new right page to the parent. Previously, we handled the case that you crash in between those steps with a cleanup routine after the WAL recovery had finished, which finished the incomplete split. However, that doesn't help if the page split is interrupted but the database doesn't crash, so that you don't perform WAL recovery. That could happen for example if you run out of disk space.
  Remove the end-of-recovery cleanup step. Instead, when a page is split, the left page is marked with a new INCOMPLETE_SPLIT flag, and when the downlink is inserted to the parent, the flag is cleared again. If an insertion sees a page with the flag set, it knows that the split was interrupted for some reason, and inserts the missing downlink before proceeding.
  I used the same approach to fix GIN and GiST split algorithms earlier. This was the last WAL cleanup routine, so we could get rid of that whole machinery now, but I'll leave that for a separate patch.
  Reviewed by Peter Geoghegan.
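The two-step split and the flag-based recovery described here can be modeled with a small in-memory toy. Everything below (`Page`, `Parent`, and the three functions) is a hypothetical illustration of the idea, with no locking, WAL, or on-disk layout.

```python
class Page:
    def __init__(self, keys):
        self.keys = list(keys)
        self.incomplete_split = False   # stands in for the INCOMPLETE_SPLIT flag
        self.right = None               # right sibling link


class Parent:
    def __init__(self, children):
        self.downlinks = list(children)


def split_child(page):
    # Step 1: split the child and mark the left half.
    mid = len(page.keys) // 2
    right = Page(page.keys[mid:])
    page.keys = page.keys[:mid]
    right.right = page.right
    page.right = right
    page.incomplete_split = True
    return right


def insert_downlink(parent, left, right):
    # Step 2: add the downlink for the new right page and clear the flag.
    idx = parent.downlinks.index(left)
    parent.downlinks.insert(idx + 1, right)
    left.incomplete_split = False


def finish_split_if_needed(parent, page):
    # Called by a later insertion that encounters the page.
    if page.incomplete_split:
        insert_downlink(parent, page, page.right)
        return True
    return False
```

If execution stops between `split_child` and `insert_downlink`, the flag stays set, and the next insertion that reaches the page completes the split through `finish_split_if_needed` without any recovery pass.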
* Rewrite comment for shm_mq_receive_bytes. (Robert Haas, 2014-03-18)
  The comment and the code diverged at some point before the initial commit of this feature, and I failed to notice.
  Noted by Tom Lane.
* Fix relcache reference leak in refresh_by_match_merge(). (Tom Lane, 2014-03-18)
  One path through the loop over indexes forgot to do index_close(). Rather than adding a fourth call, restructure slightly so that there's only one.
  In passing, get rid of an unnecessary syscache lookup: the pg_index struct for the index is already available from its relcache entry.
  Per report from YAMAMOTO Takashi, though this is a bit different from his suggested patch. This is new code in HEAD, so no need for back-patch.
* Improve shm_mq portability around MAXIMUM_ALIGNOF and sizeof(Size). (Robert Haas, 2014-03-18)
  Revise the original decision to expose a uint64-based interface and use Size everywhere possible. Avoid assuming that MAXIMUM_ALIGNOF is 8, or making any assumption about the relationship between that value and sizeof(Size). If MAXIMUM_ALIGNOF is bigger, we'll now insert padding after the length word; if it's smaller, we are now prepared to read and write the length word in chunks.
  Per discussion with Tom Lane.
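The "padding after the length word" case follows from ordinary round-up-to-alignment arithmetic. The sketch below shows only that arithmetic, assuming power-of-two alignments; `max_align` and `length_word_padding` are hypothetical helper names, and the sketch does not attempt to model the chunked read/write case.

```python
def max_align(length, alignof):
    # Round length up to the next multiple of alignof (a power of two).
    return (length + alignof - 1) & ~(alignof - 1)


def length_word_padding(sizeof_size, alignof):
    # Bytes of padding needed after a length word of sizeof_size bytes
    # so that the payload that follows starts on an aligned boundary.
    return max_align(sizeof_size, alignof) - sizeof_size
```

With a 4-byte length word and 8-byte maximum alignment, 4 bytes of padding are needed; with an 8-byte length word and 8-byte alignment, none are.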
* Make it easy to detach completely from shared memory. (Robert Haas, 2014-03-18)
  The new function dsm_detach_all() can be used either by postmaster children that don't wish to take any risk of accidentally corrupting shared memory; or by forked children of regular backends with the same need. This patch also updates the postmaster children that already do PGSharedMemoryDetach() to do dsm_detach_all() as well.
  Per discussion with Tom Lane.
* During index build, check and elog (not just Assert) for broken HOT chain. (Tom Lane, 2014-03-17)
  The recently-fixed bug in WAL replay could result in not finding a parent tuple for a heap-only tuple. The existing code would either Assert or generate an invalid index entry, neither of which is desirable. Throw a regular error instead.
* Fix thinko: have trueTriConsistentFn return GIN_TRUE. (Heikki Linnakangas, 2014-03-17)
  While we're at it, also improve comments in ginlogic.c.
* Fix typos in comments. (Fujii Masao, 2014-03-17)
  Thom Brown
* Fix bug in clean shutdown of walsender that pg_receivexlog is connecting to. (Fujii Masao, 2014-03-17)
  On clean shutdown, walsender waits for all WAL to be replicated to a standby, and exits. It determined whether that replication had completed by checking whether its sent location was equal to the standby's flush location. Unfortunately, this condition never becomes true when the connected standby is something like pg_receivexlog, which always returns an invalid flush location, so walsender waits forever. This commit changes walsender so that it checks the standby's write location instead if the flush location is invalid.
  Back-patch to 9.1 where enough infrastructure for this exists.
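The changed shutdown condition amounts to choosing which reported location to compare against the sent location. A minimal sketch, assuming locations are plain integers and that 0 represents an invalid location; `standby_caught_up` and `INVALID_LSN` are hypothetical names used only for this illustration:

```python
INVALID_LSN = 0  # assumed marker for "no valid location reported"


def standby_caught_up(sent, write, flush):
    # Fall back to the write location when the standby never reports
    # a valid flush location.
    replicated = write if flush == INVALID_LSN else flush
    return sent == replicated
```

Under the old rule (always comparing against `flush`), a client that always reports an invalid flush location could never satisfy the condition, which is the endless wait the commit describes.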
* Make punctuation consistent (Peter Eisentraut, 2014-03-16)
* Fix whitespace (Peter Eisentraut, 2014-03-16)