aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
...
* Reconsider when to wait for WAL flushes/syncrep during commit.Andres Freund2015-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up to now RecordTransactionCommit() waited for WAL to be flushed (if synchronous_commit != off) and to be synchronously replicated (if enabled), even if a transaction did not have a xid assigned. The primary reason for that is that sequence's nextval() did not assign a xid, but are worthwhile to wait for on commit. This can be problematic because sometimes read only transactions do write WAL, e.g. HOT page prune records. That then could lead to read only transactions having to wait during commit. Not something people expect in a read only transaction. This lead to such strange symptoms as backends being seemingly stuck during connection establishment when all synchronous replicas are down. Especially annoying when said stuck connection is the standby trying to reconnect to allow syncrep again... This behavior also is involved in a rather complicated <= 9.4 bug where the transaction started by catchup interrupt processing waited for syncrep using latches, but didn't get the wakeup because it was already running inside the same overloaded signal handler. Fix the issue here doesn't properly solve that issue, merely papers over the problems. In 9.5 catchup interrupts aren't processed out of signal handlers anymore. To fix all this, make nextval() acquire a top level xid, and only wait for transaction commit if a transaction both acquired a xid and emitted WAL records. If only a xid has been assigned we don't uselessly want to wait just because of writes to temporary/unlogged tables; if only WAL has been written we don't want to wait just because of HOT prunes. The xid assignment in nextval() is unlikely to cause overhead in real-world workloads. For one it only happens SEQ_LOG_VALS/32 values anyway, for another only usage of nextval() without using the result in an insert or similar is affected. Discussion: 20150223165359.GF30784@awork2.anarazel.de, 369698E947874884A77849D8FE3680C2@maumau, 5CF4ABBA67674088B3941894E22A0D25@maumau Per complaint from maumau and Thom Brown Backpatch all the way back; 9.0 doesn't have syncrep, but it seems better to be consistent behavior across all maintained branches.
* Add hasRowSecurity to copyfuncs/outfuncsStephen Frost2015-02-25
| | | | | | | | | | | The RLS patch added a hasRowSecurity field to PlannerGlobal and PlannedStmt but didn't update nodes/copyfuncs.c and nodes/outfuncs.c to reflect those additional fields. Correct that by adding entries to the appropriate functions for those fields. Pointed out by Robert.
* Add locking clause for SB views for update/deleteStephen Frost2015-02-25
| | | | | | | | | | | | | | | | In expand_security_qual(), we were handling locking correctly when a PlanRowMark existed, but not when we were working with the target relation (which doesn't have any PlanRowMarks, but the subquery created for the security barrier quals still needs to lock the rows under it). Noted by Etsuro Fujita when working with the Postgres FDW, which wasn't properly issuing a SELECT ... FOR UPDATE to the remote side under a DELETE. Back-patch to 9.4 where updatable security barrier views were introduced. Per discussion with Etsuro and Dean Rasheed.
* Fix over-optimistic caching in fetch_array_arg_replace_nulls().Tom Lane2015-02-25
| | | | | | | When I rewrote this in commit 56a79a869bedc4bf6c35853642694cc0b0594dd2, I forgot that it's possible for the input array type to change from one call to the next (this can happen when applying the function to pg_statistic columns, for instance). Fix that.
* Fix dumping of views that are just VALUES(...) but have column aliases.Tom Lane2015-02-25
| | | | | | | | | | | | | | | The "simple" path for printing VALUES clauses doesn't work if we need to attach nondefault column aliases, because there's noplace to do that in the minimal VALUES() syntax. So modify get_simple_values_rte() to detect nondefault aliases and treat that as a non-simple case. This further exposes that the "non-simple" path never actually worked; it didn't produce valid syntax. Fix that too. Per bug #12789 from Curtis McEnroe, and analysis by Andrew Gierth. Back-patch to all supported branches. Before 9.3, this also requires back-patching the part of commit 092d7ded29f36b0539046b23b81b9f0bf2d637f1 that created get_simple_values_rte() to begin with; inserting the extra test into the old factorization of that logic would've been too messy.
* Improve parser's one-extra-token lookahead mechanism.Tom Lane2015-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are a couple of places in our grammar that fail to be strict LALR(1), by requiring more than a single token of lookahead to decide what to do. Up to now we've dealt with that by using a filter between the lexer and parser that merges adjacent tokens into one in the places where two tokens of lookahead are necessary. But that creates a number of user-visible anomalies, for instance that you can't name a CTE "ordinality" because "WITH ordinality AS ..." triggers folding of WITH and ORDINALITY into one token. I realized that there's a better way. In this patch, we still do the lookahead basically as before, but we never merge the second token into the first; we replace just the first token by a special lookahead symbol when one of the lookahead pairs is seen. This requires a couple extra productions in the grammar, but it involves fewer special tokens, so that the grammar tables come out a bit smaller than before. The filter logic is no slower than before, perhaps a bit faster. I also fixed the filter logic so that when backing up after a lookahead, the current token's terminator is correctly restored; this eliminates some weird behavior in error message issuance, as is shown by the one change in existing regression test outputs. I believe that this patch entirely eliminates odd behaviors caused by lookahead for WITH. It doesn't really improve the situation for NULLS followed by FIRST/LAST unfortunately: those sequences still act like a reserved word, even though there are cases where they should be seen as two ordinary identifiers, eg "SELECT nulls first FROM ...". I experimented with additional grammar hacks but couldn't find any simple solution for that. Still, this is better than before, and it seems much more likely that we *could* somehow solve the NULLS case on the basis of this filter behavior than the previous one.
* Error when creating names too long for tar formatPeter Eisentraut2015-02-24
| | | | | | | | | | | | | | | | | | | | | | | The tar format (at least the version we are using), does not support file names or symlink targets longer than 99 bytes. Until now, the tar creation code would silently truncate any names that are too long. (Its original application was pg_dump, where this never happens.) This creates problems when running base backups over the replication protocol. The most important problem is when a tablespace path is longer than 99 bytes, which will result in a truncated tablespace path being backed up. Less importantly, the basebackup protocol also promises to back up any other files it happens to find in the data directory, which would also lead to file name truncation if someone put a file with a long name in there. Now both of these cases result in an error during the backup. Add tests that fail when a too-long file name or symlink is attempted to be backed up. Reviewed-by: Robert Hass <robertmhaas@gmail.com>
* Fix typo in README.Heikki Linnakangas2015-02-24
| | | | Kyotaro Horiguchi
* Fix stupid merge errors in previous commitAlvaro Herrera2015-02-23
| | | | Brown paper bag installed permanently.
* Further tweaking of raw grammar output to distinguish different inputs.Tom Lane2015-02-23
| | | | | | | | | | | | | | | | | | | | Use a different A_Expr_Kind for LIKE/ILIKE/SIMILAR TO constructs, so that they can be distinguished from direct invocation of the underlying operators. Also, postpone selection of the operator name when transforming "x IN (select)" to "x = ANY (select)", so that those syntaxes can be told apart at parse analysis time. I had originally thought I'd also have to do something special for the syntaxes IS NOT DISTINCT FROM, IS NOT DOCUMENT, and x NOT IN (SELECT...), which the grammar translates as though they were NOT (construct). On reflection though, we can distinguish those cases reliably by noting whether the parse location shown for the NOT is the same as for its child node. This only requires tweaking the parse locations for NOT IN, which I've done here. These changes should have no effect outside the parser; they're just in support of being able to give accurate warnings for planned operator precedence changes.
* Support more commands in event triggersAlvaro Herrera2015-02-23
| | | | | | | | COMMENT, SECURITY LABEL, and GRANT/REVOKE now also fire ddl_command_start and ddl_command_end event triggers, when they operate on database-local objects. Reviewed-By: Michael Paquier, Andres Freund, Stephen Frost
* Replace checkpoint_segments with min_wal_size and max_wal_size.Heikki Linnakangas2015-02-23
| | | | | | | | | | | | | | | | | | | | | | | | Instead of having a single knob (checkpoint_segments) that both triggers checkpoints, and determines how many checkpoints to recycle, they are now separate concerns. There is still an internal variable called CheckpointSegments, which triggers checkpoints. But it no longer determines how many segments to recycle at a checkpoint. That is now auto-tuned by keeping a moving average of the distance between checkpoints (in bytes), and trying to keep that many segments in reserve. The advantage of this is that you can set max_wal_size very high, but the system won't actually consume that much space if there isn't any need for it. The min_wal_size sets a floor for that; you can effectively disable the auto-tuning behavior by setting min_wal_size equal to max_wal_size. The max_wal_size setting is now the actual target size of WAL at which a new checkpoint is triggered, instead of the distance between checkpoints. Previously, you could calculate the actual WAL usage with the formula "(2 + checkpoint_completion_target) * checkpoint_segments + 1". With this patch, you set the desired WAL usage with max_wal_size, and the system calculates the appropriate CheckpointSegments with the reverse of that formula. That's a lot more intuitive for administrators to set. Reviewed by Amit Kapila and Venkata Balaji N.
* Refactor unit conversions code in guc.c.Heikki Linnakangas2015-02-23
| | | | | | | Replace the if-switch-case constructs with two conversion tables, containing all the supported conversions between human-readable unit strings and the base units used in GUC variables. This makes the code easier to read, and makes adding new units simpler.
* Guard against spurious signals in LockBufferForCleanup.Andres Freund2015-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When LockBufferForCleanup() has to wait for getting a cleanup lock on a buffer it does so by setting a flag in the buffer header and then wait for other backends to signal it using ProcWaitForSignal(). Unfortunately LockBufferForCleanup() missed that ProcWaitForSignal() can return for other reasons than the signal it is hoping for. If such a spurious signal arrives the wait flags on the buffer header will still be set. That then triggers "ERROR: multiple backends attempting to wait for pincount 1". The fix is simple, unset the flag if still set when retrying. That implies an additional spinlock acquisition/release, but that's unlikely to matter given the cost of waiting for a cleanup lock. Alternatively it'd have been possible to move responsibility for maintaining the relevant flag to the waiter all together, but that might have had negative consequences due to possible floods of signals. Besides being more invasive. This looks to be a very longstanding bug. The relevant code in LockBufferForCleanup() hasn't changed materially since its introduction and ProcWaitForSignal() was documented to return for unrelated reasons since 8.2. The master only patch series removing ImmediateInterruptOK made it much easier to hit though, as ProcSendSignal/ProcWaitForSignal now uses a latch shared with other tasks. Per discussion with Kevin Grittner, Tom Lane and me. Backpatch to all supported branches. Discussion: 11553.1423805224@sss.pgh.pa.us
* Add GUC to control the time to wait before retrieving WAL after failed attempt.Fujii Masao2015-02-23
| | | | | | | | | | | | | | Previously when the standby server failed to retrieve WAL files from any sources (i.e., streaming replication, local pg_xlog directory or WAL archive), it always waited for five seconds (hard-coded) before the next attempt. For example, this is problematic in warm-standby because restore_command can fail every five seconds even while new WAL file is expected to be unavailable for a long time and flood the log files with its error messages. This commit adds new parameter, wal_retrieve_retry_interval, to control that wait time. Alexey Vasiliev and Michael Paquier, reviewed by Andres Freund and me.
* Add parse location fields to NullTest and BooleanTest structs.Tom Lane2015-02-22
| | | | | | | | We did not need a location tag on NullTest or BooleanTest before, because no error messages referred directly to their locations. That's planned to change though, so add these fields in a separate housekeeping commit. Catversion bump because stored rules may change.
* Get rid of multiple applications of transformExpr() to the same tree.Tom Lane2015-02-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | transformExpr() has for many years had provisions to do nothing when applied to an already-transformed expression tree. However, this was always ugly and of dubious reliability, so we'd be much better off without it. The primary historical reason for it was that gram.y sometimes returned multiple links to the same subexpression, which is no longer true as of my BETWEEN fixes. We'd also grown some lazy hacks in CREATE TABLE LIKE (failing to distinguish between raw and already-transformed index specifications) and one or two other places. This patch removes the need for and support for re-transforming already transformed expressions. The index case is dealt with by adding a flag to struct IndexStmt to indicate that it's already been transformed; which has some benefit anyway in that tablecmds.c can now Assert that transformation has happened rather than just assuming. The other main reason was some rather sloppy code for array type coercion, which can be fixed (and its performance improved too) by refactoring. I did leave transformJoinUsingClause() still constructing expressions containing untransformed operator nodes being applied to Vars, so that transformExpr() still has to allow Var inputs. But that's a much narrower, and safer, special case than before, since Vars will never appear in a raw parse tree, and they don't have any substructure to worry about. In passing fix some oversights in the patch that added CREATE INDEX IF NOT EXISTS (missing processing of IndexStmt.if_not_exists). These appear relatively harmless, but still sloppy coding practice.
* Represent BETWEEN as a special node type in raw parse trees.Tom Lane2015-02-22
| | | | | | | | | | | | | | | | | | | | | Previously, gram.y itself converted BETWEEN into AND (or AND/OR) nests of expression comparisons. This was always as bogus as could be, but fixing it hasn't risen to the top of the to-do list. The present patch invents an A_Expr representation for BETWEEN expressions, and does the expansion to comparison trees in parse_expr.c which is at least a slightly saner place to be doing semantic conversions. There should be no change in the post- parse-analysis results. This does nothing for the semantic issues with BETWEEN (dubious connection to btree-opclass semantics, and multiple evaluation of possibly volatile subexpressions) ... but it's a necessary preliminary step before we could fix any of that. The main immediate benefit is that preserving BETWEEN as an identifiable raw-parse-tree construct will enable better error messages. While at it, fix the code so that multiply-referenced subexpressions are physically duplicated before being passed through transformExpr(). This gets rid of one of the principal reasons why transformExpr() has historically had to allow already-processed input.
* Rename variable in AllocSetContextCreate to be consistent.Jeff Davis2015-02-21
| | | | | | Everywhere else in the file, "context" is of type MemoryContext and "set" is of type AllocSet. AllocSetContextCreate uses a variable of type AllocSet, so rename it from "context" to "set".
* In array_agg(), don't create a new context for every group.Jeff Davis2015-02-21
| | | | | | | | | | | | | | | Previously, each new array created a new memory context that started out at 8kB. This is incredibly wasteful when there are lots of small groups of just a few elements each. Change initArrayResult() and friends to accept a "subcontext" argument to indicate whether the caller wants the ArrayBuildState allocated in a new subcontext or not. If not, it can no longer be released separately from the rest of the memory context. Fixes bug report by Frank van Vugt on 2013-10-19. Tomas Vondra. Reviewed by Ali Akbar, Tom Lane, and me.
* Allow forcing nullness of columns during bootstrap.Andres Freund2015-02-21
| | | | | | | | | | | | | | | Bootstrap determines whether a column is null based on simple builtin rules. Those work surprisingly well, but nonetheless a few existing columns aren't set correctly. Additionally there is at least one patch sent to hackers where forcing the nullness of a column would be helpful. The boostrap format has gained FORCE [NOT] NULL for this, which will be emitted by genbki.pl when BKI_FORCE_(NOT_)?NULL is specified for a column in a catalog header. This patch doesn't change the marking of any existing columns. Discussion: 20150215170014.GE15326@awork2.anarazel.de
* Use FLEXIBLE_ARRAY_MEMBER in a number of other places.Tom Lane2015-02-21
| | | | I think we're about done with this...
* Use FLEXIBLE_ARRAY_MEMBER for HeapTupleHeaderData.t_bits[].Tom Lane2015-02-21
| | | | | | | This requires changing quite a few places that were depending on sizeof(HeapTupleHeaderData), but it seems for the best. Michael Paquier, some adjustments by me
* Don't require users of src/port/gettimeofday.c to initialize it.Robert Haas2015-02-21
| | | | | | | | | | Commit 8001fe67a3d66c95861ce1f7075ef03953670d13 introduced this requirement, but per discussion, we want to avoid requirements of this type to make things easier on the calling code. An especially important consideration is that this may be used in frontend code, not just the backend. Asif Naeem, reviewed by Michael Paquier
* Some more FLEXIBLE_ARRAY_MEMBER fixes.Tom Lane2015-02-21
|
* Fix statically allocated struct with FLEXIBLE_ARRAY_MEMBER member.Tom Lane2015-02-20
| | | | | | | | | clang complains about this, not unreasonably, so define another struct that's explicitly for a WordEntryPos with exactly one element. While at it, get rid of pretty dubious use of a static variable for more than one purpose --- if it were being treated as const maybe I'd be okay with this, but it isn't.
* Use FLEXIBLE_ARRAY_MEMBER in some more places.Tom Lane2015-02-20
| | | | | | Fix a batch of structs that are only visible within individual .c files. Michael Paquier
* Use FLEXIBLE_ARRAY_MEMBER in struct RecordIOData.Tom Lane2015-02-20
| | | | | | | I (tgl) fixed this last night in rowtypes.c, but I missed that the code had been copied into a couple of other places. Michael Paquier
* Use FLEXIBLE_ARRAY_MEMBER in struct varlena.Tom Lane2015-02-20
| | | | | | | This forces some minor coding adjustments in tuptoaster.c and inv_api.c, but the new coding there is cleaner anyway. Michael Paquier
* Have TRUNCATE update pgstat tuple countersAlvaro Herrera2015-02-20
| | | | | | | | | | | | | | | | | | This works by keeping a per-subtransaction record of the ins/upd/del counters before the truncate, and then resetting them; this record is useful to return to the previous state in case the truncate is rolled back, either in a subtransaction or whole transaction. The state is propagated upwards as subtransactions commit. When the per-table data is sent to the stats collector, a flag indicates to reset the live/dead counters to zero as well. Catalog version bumped due to the change in pgstat format. Author: Alexander Shulgin Discussion: 1007.1207238291@sss.pgh.pa.us Discussion: 548F7D38.2000401@BlueTreble.com Reviewed-by: Álvaro Herrera, Jim Nasby
* Some more FLEXIBLE_ARRAY_MEMBER hacking.Tom Lane2015-02-20
|
* Remove unused variable.Tom Lane2015-02-20
| | | | Per buildfarm.
* Use FLEXIBLE_ARRAY_MEMBER in a bunch more places.Tom Lane2015-02-20
| | | | | | | | | | | | | | | | Replace some bogus "x[1]" declarations with "x[FLEXIBLE_ARRAY_MEMBER]". Aside from being more self-documenting, this should help prevent bogus warnings from static code analyzers and perhaps compiler misoptimizations. This patch is just a down payment on eliminating the whole problem, but it gets rid of a lot of easy-to-fix cases. Note that the main problem with doing this is that one must no longer rely on computing sizeof(the containing struct), since the result would be compiler-dependent. Instead use offsetof(struct, lastfield). Autoconf also warns against spelling that offsetof(struct, lastfield[0]). Michael Paquier, review and additional fixes by me.
* Add pg_stat_get_snapshot_timestamp() to show statistics snapshot timestamp.Tom Lane2015-02-19
| | | | | | | | | Per discussion, this could be useful for purposes such as programmatically detecting a nonresponding stats collector. We already have the timestamp anyway, it's just a matter of providing a SQL-accessible function to fetch it. Matt Kelly, reviewed by Jim Nasby
* Split array_push into separate array_append and array_prepend functions.Tom Lane2015-02-18
| | | | | | | | | | | | | | | There wasn't any good reason for a single C function to implement both these SQL functions: it saved very little code overall, and it required significant pushups to re-determine at runtime which case applied. Redoing it as two functions ends up with just slightly more lines of code, but it's simpler to understand, and faster too because we need not repeat syscache lookups on every call. An important side benefit is that this eliminates the only case in which different aliases of the same C function had both anyarray and anyelement arguments at the same position, which would almost always be a mistake. The opr_sanity regression test will now notice such mistakes since there's no longer a valid case where it happens.
* Fix opclass/opfamily identity stringsAlvaro Herrera2015-02-18
| | | | | | | | | | | | | | | | | The original representation uses "opcname for amname", which is good enough; but if we replace "for" with "using", we can apply the returned identity directly in a DROP command, as in DROP OPERATOR CLASS opcname USING amname This slightly simplifies code using object identities to programatically execute commands on these kinds of objects. Note backwards-incompatible change: The previous representation dates back to 9.3 when object identities were introduced by commit f8348ea3, but we don't want to change the behavior on released branches unnecessarily and so this is not backpatched.
* Fix object identities for pg_conversion objectsAlvaro Herrera2015-02-18
| | | | | | | We were neglecting to schema-qualify them. Backpatch to 9.3, where object identities were introduced as a concept by commit f8348ea32ec8.
* Fix EXPLAIN output for cases where parent table is excluded by constraints.Tom Lane2015-02-17
| | | | | | | | | | | | | | The previous coding in EXPLAIN always labeled a ModifyTable node with the name of the target table affected by its first child plan. When originally written, this was necessarily the parent table of the inheritance tree, so everything was unconfusing. But when we added NO INHERIT constraints, it became possible for the parent table to be deleted from the plan by constraint exclusion while still leaving child tables present. This led to the ModifyTable plan node being labeled with the first surviving child, which was deemed confusing. Fix it by retaining the parent table's RT index in a new field in ModifyTable. Etsuro Fujita, reviewed by Ashutosh Bapat and myself
* Fix a bug in pairing heap removal code.Heikki Linnakangas2015-02-17
| | | | | | | | | | | | | | | | After removal, the next_sibling pointer of a node was sometimes incorrectly left to point to another node in the heap, which meant that a node was sometimes linked twice into the heap. Surprisingly that didn't cause any crashes in my testing, but it was clearly wrong and could easily segfault in other scenarios. Also always keep the prev_or_parent pointer as NULL on the root node. That was not a correctness issue AFAICS, but let's be tidy. Add a debugging function, to dump the contents of a pairing heap as a string. It's #ifdef'd out, as it's not used for anything in any normal code, but it was highly useful in debugging this. Let's keep it handy for further reference.
* Fix knn-GiST queue comparison function to return heap tuples first.Heikki Linnakangas2015-02-17
| | | | | | | | | The part of the comparison function that was supposed to keep heap tuples ahead of index items was backwards. It would not lead to incorrect results, but it is more efficient to return heap tuples first, before scanning more index pages, when both have the same distance. Alexander Korotkov
* Remove code to match IPv4 pg_hba.conf entries to IPv4-in-IPv6 addresses.Tom Lane2015-02-17
| | | | | | | | | | | | | | | | | | | | | | | | | In investigating yesterday's crash report from Hugo Osvaldo Barrera, I only looked back as far as commit f3aec2c7f51904e7 where the breakage occurred (which is why I thought the IPv4-in-IPv6 business was undocumented). But actually the logic dates back to commit 3c9bb8886df7d56a and was simply broken by erroneous refactoring in the later commit. A bit of archives excavation shows that we added the whole business in response to a report that some 2003-era Linux kernels would report IPv4 connections as having IPv4-in-IPv6 addresses. The fact that we've had no complaints since 9.0 seems to be sufficient confirmation that no modern kernels do that, so let's just rip it all out rather than trying to fix it. Do this in the back branches too, thus essentially deciding that our effective behavior since 9.0 is correct. If there are any platforms on which the kernel reports IPv4-in-IPv6 addresses as such, yesterday's fix would have made for a subtle and potentially security-sensitive change in the effective meaning of IPv4 pg_hba.conf entries, which does not seem like a good thing to do in minor releases. So let's let the post-9.0 behavior stand, and change the documentation to match it. In passing, I failed to resist the temptation to wordsmith the description of pg_hba.conf IPv4 and IPv6 address entries a bit. A lot of this text hasn't been touched since we were IPv4-only.
* Fix misuse of memcpy() in check_ip().Tom Lane2015-02-16
| | | | | | | | | | | | | | | | | | | | | The previous coding copied garbage into a local variable, pretty much ensuring that the intended test of an IPv6 connection address against a promoted IPv4 address from pg_hba.conf would never match. The lack of field complaints likely indicates that nobody realized this was supposed to work, which is unsurprising considering that no user-facing docs suggest it should work. In principle this could have led to a SIGSEGV due to reading off the end of memory, but since the source address would have pointed to somewhere in the function's stack frame, that's quite unlikely. What led to discovery of the bug is Hugo Osvaldo Barrera's report of a crash after an OS upgrade, which is probably because he is now running a system in which memcpy raises abort() upon detecting overlapping source and destination areas. (You'd have to additionally suppose some things about the stack frame layout to arrive at this conclusion, but it seems plausible.) This has been broken since the code was added, in commit f3aec2c7f51904e7, so back-patch to all supported branches.
* Restore the SSL_set_session_id_context() call to OpenSSL renegotiation.Heikki Linnakangas2015-02-16
| | | | | | | | | | This reverts the removal of the call in commit (272923a0). It turns out it wasn't superfluous after all: without it, renegotiation fails if a client certificate was used. The rest of the changes in that commit are still OK and not reverted. Per investigation of bug #12769 by Arne Scheffer, although this doesn't fix the reported bug yet.
* Rationalize the APIs of array element/slice access functions.Tom Lane2015-02-16
| | | | | | | | | | | | | | | | | | | | | | The four functions array_ref, array_set, array_get_slice, array_set_slice have traditionally declared their array inputs and results as being of type "ArrayType *". This is a lie, and has been since Berkeley days, because they actually also support "fixed-length array" types such as "name" and "point"; not to mention that the inputs could be toasted. These values should be declared Datum instead to avoid confusion. The current coding already risks possible misoptimization by compilers, and it'll get worse when "expanded" array representations become a valid alternative. However, there's a fair amount of code using array_ref and array_set with arrays that *are* known to be ArrayType structures, and there might be more such places in third-party code. Rather than cluttering those call sites with PointerGetDatum/DatumGetArrayTypeP cruft, what I did was to rename the existing functions to array_get_element/array_set_element, fix their signatures, then reincarnate array_ref/array_set as backwards compatibility wrappers. array_get_slice/array_set_slice have no such constituency in the core code, and probably not in third-party code either, so I just changed their APIs.
* Fix null-pointer-deref crash while doing COPY IN with check constraints.Tom Lane2015-02-15
| | | | | | | | | | | | | | | | | | | | In commit bf7ca15875988a88e97302e012d7c4808bef3ea9 I introduced an assumption that an RTE referenced by a whole-row Var must have a valid eref field. This is false for RTEs constructed by DoCopy, and there are other places taking similar shortcuts. Perhaps we should make all those places go through addRangeTableEntryForRelation or its siblings instead of having ad-hoc logic, but the most reliable fix seems to be to make the new code in ExecEvalWholeRowVar cope if there's no eref. We can reasonably assume that there's no need to insert column aliases if no aliases were provided. Add a regression test case covering this, and also verifying that a sane column name is in fact available in this situation. Although the known case only crashes in 9.4 and HEAD, it seems prudent to back-patch the code change to 9.2, since all the ingredients for a similar failure exist in the variant patch applied to 9.3 and 9.2. Per report from Jean-Pierre Pelletier.
* Simplify waiting logic in reading from / writing to client.Heikki Linnakangas2015-02-13
| | | | | | | | | The client socket is always in non-blocking mode, and if we actually want blocking behaviour, we emulate it by sleeping and retrying. But we have retry loops at different layers for reads and writes, which was confusing. To simplify, remove all the sleeping and retrying code from the lower levels, from be_tls_read and secure_raw_read and secure_raw_write, and put all the logic in secure_read() and secure_write().
* Simplify the way OpenSSL renegotiation is initiated in server.Heikki Linnakangas2015-02-13
| | | | | | | | | At least in all modern versions of OpenSSL, it is enough to call SSL_renegotiate() once, and then forget about it. Subsequent SSL_write() and SSL_read() calls will finish the handshake. The SSL_set_session_id_context() call is unnecessary too. We only have one SSL context, and the SSL session was created with that to begin with.
* Fix missing PQclear() in libpqrcv_endstreaming().Tom Lane2015-02-11
| | | | | | | This omission leaked one PGresult per WAL streaming cycle, which possibly would never be enough to notice in the real world, but it's still a leak. Per Coverity. Back-patch to 9.3 where the error was introduced.
* Fix minor memory leak in ident_inet().Tom Lane2015-02-11
| | | | | | | | | | | | | We'd leak the ident_serv data structure if the second pg_getaddrinfo_all (the one for the local address) failed. This is not of great consequence because a failure return here just leads directly to backend exit(), but if this function is going to try to clean up after itself at all, it should not have such holes in the logic. Try to fix it in a future-proof way by having all the failure exits go through the same cleanup path, rather than "optimizing" some of them. Per Coverity. Back-patch to 9.2, which is as far back as this patch applies cleanly.
* Fix GEQO to not assume its join order heuristic always works.Tom Lane2015-02-10
| | | | | | | | | | | | | | | | | | | | | | Back in commit 400e2c934457bef4bc3cc9a3e49b6289bd761bc0 I rewrote GEQO's gimme_tree function to improve its heuristic for modifying the given tour into a legal join order. In what can only be called a fit of hubris, I supposed that this new heuristic would *always* find a legal join order, and ripped out the old logic that allowed gimme_tree to sometimes fail. The folly of this is exposed by bug #12760, in which the "greedy" clumping behavior of merge_clump() can lead it into a dead end which could only be recovered from by un-clumping. We have no code for that and wouldn't know exactly what to do with it if we did. Rather than try to improve the heuristic rules still further, let's just recognize that it *is* a heuristic and probably must always have failure cases. So, put back the code removed in the previous commit to allow for failure (but comment it a bit better this time). It's possible that this code was actually fully correct at the time and has only been broken by the introduction of LATERAL. But having seen this example I no longer have much faith in that proposition, so back-patch to all supported branches.