postgresql - postgresql mirror

	Commit message	Author	Age
*	Make TRUNCATE ... RESTART IDENTITY restart sequences transactionally.	Tom Lane	2010-11-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the previous coding, we simply issued ALTER SEQUENCE RESTART commands, which do not roll back on error. This meant that an error between truncating and committing left the sequences out of sync with the table contents, with potentially bad consequences as were noted in a Warning on the TRUNCATE man page. To fix, create a new storage file (relfilenode) for a sequence that is to be reset due to RESTART IDENTITY. If the transaction aborts, we'll automatically revert to the old storage file. This acts just like a rewriting ALTER TABLE operation. A penalty is that we have to take exclusive lock on the sequence, but since we've already got exclusive lock on its owning table, that seems unlikely to be much of a problem. The interaction of this with usual nontransactional behaviors of sequence operations is a bit weird, but it's hard to see what would be completely consistent. Our choice is to discard cached-but-unissued sequence values both when the RESTART is executed, and at rollback if any; but to not touch the currval() state either time. In passing, move the sequence reset operations to happen before not after any AFTER TRUNCATE triggers are fired. The previous ordering was not logically sensible, but was forced by the need to minimize inconsistency if the triggers caused an error. Transactional rollback is a much better solution to that. Patch by Steve Singer, rather heavily adjusted by me.
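The storage-swap idea in this commit can be illustrated with a toy model. This is only a sketch: the `Sequence` and `Transaction` classes below are hypothetical and are not PostgreSQL code. A restart gives the sequence a fresh storage object, and an abort puts the old one back.

```python
class Sequence:
    """Toy sequence whose state lives in a swappable storage object."""

    def __init__(self, start=1):
        self.start = start
        self.storage = {"last_value": start, "is_called": False}

    def nextval(self):
        if self.storage["is_called"]:
            self.storage["last_value"] += 1
        self.storage["is_called"] = True
        return self.storage["last_value"]


class Transaction:
    """Remembers the old storage of each restarted sequence for rollback."""

    def __init__(self):
        self.saved = []

    def restart_identity(self, seq):
        # Keep the old storage and point the sequence at fresh storage.
        self.saved.append((seq, seq.storage))
        seq.storage = {"last_value": seq.start, "is_called": False}

    def abort(self):
        # Revert every restarted sequence to its old storage.
        for seq, old_storage in reversed(self.saved):
            seq.storage = old_storage
        self.saved.clear()

    def commit(self):
        # The old storage is simply forgotten.
        self.saved.clear()


seq = Sequence()
for _ in range(3):
    seq.nextval()             # hands out 1, 2, 3

txn = Transaction()
txn.restart_identity(seq)
restarted = seq.nextval()     # numbering starts over inside the transaction
txn.abort()
after_abort = seq.nextval()   # continues from the pre-restart state
```

The model ignores locking, cached values and currval(), all of which the commit message discusses separately.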
*	Additional fixes for parallel make	Peter Eisentraut	2010-11-17
\| \| \| \| \| \| \| \| \|	Add some additional dependencies to constrain the build order to prevent parallel make from failing. In the case of src/Makefile, this is likely to be too complicated to be worth maintaining, so just add .NOTPARALLEL to get the old for-loop-like behavior. More fine-tuning might be necessary for some platforms or configurations.
*	Require VALUE keyword when extending an enum type.	Andrew Dunstan	2010-11-16
\| \| \| \|	Based on a patch from Alvaro Herrera.
*	Send paramHandle to subprocesses as 64-bit on Win64	Magnus Hagander	2010-11-16
\| \| \| \| \| \| \| \| \| \| \|	The handle to the shared memory segment containing startup parameters was sent as 32-bit even on 64-bit systems. Since HANDLEs appear to be allocated sequentially this shouldn't be a problem until we reach 2^32 open handles in the postmaster, but a 64-bit value should be sent across as 64-bit, and not zero out the top 32 bits. Noted by Tom Lane.
*	The GiST scan algorithm uses LSNs to detect concurrent page splits, but	Heikki Linnakangas	2010-11-16
\| \| \| \| \| \| \| \| \| \| \| \| \|	temporary indexes are not WAL-logged. We used a constant LSN for temporary indexes, on the assumption that we don't need to worry about concurrent page splits in temporary indexes because they're only visible to the current session. But that assumption is wrong: it's possible to insert rows and split pages in the same session while a scan is in progress, for example by opening a cursor and fetching some rows, then INSERTing new rows before fetching some more. Fix by generating fake increasing LSNs, used in place of real LSNs in temporary GiST indexes.
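Why a constant LSN hides a split can be shown with a deliberately simplified model. The function names and the comparison rule below are assumptions made for illustration, not the actual GiST code: the scan remembers the LSN it saw on the parent page and treats a child as split if the child carries a newer stamp.

```python
import itertools

CONSTANT_LSN = 1  # stand-in for the single value formerly used for temp indexes


def make_lsn_source(constant):
    """Return a callable that yields the LSN for the next page modification."""
    if constant:
        return lambda: CONSTANT_LSN
    counter = itertools.count(2)   # fake, strictly increasing LSNs
    return lambda: next(counter)


def scan_detects_split(next_lsn):
    parent_lsn = next_lsn()    # stamp left on the parent by its last change
    remembered = parent_lsn    # the scan reads the parent and remembers its LSN
    child_stamp = next_lsn()   # the same session then splits the child page
    return child_stamp > remembered


missed = scan_detects_split(make_lsn_source(constant=True))
caught = scan_detects_split(make_lsn_source(constant=False))
```

With the constant source both stamps are equal, so the split goes unnoticed. With the increasing source the later stamp is larger and the scan can react.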
*	Fix aboriginal mistake in plpython's set-returning-function support.	Tom Lane	2010-11-15
\| \| \| \| \| \| \| \| \| \|	We must stay in the function's SPI context until done calling the iterator that returns the set result. Otherwise, any attempt to invoke SPI features in the python code called by the iterator will malfunction. Diagnosis and patch by Jan Urbanski, per bug report from Jean-Baptiste Quenot. Back-patch to 8.2; there was no support for SRFs in previous versions of plpython.
*	Add new buffers_backend_fsync field to pg_stat_bgwriter.	Robert Haas	2010-11-15
\| \| \| \| \| \| \| \| \| \| \| \|	This new field counts the number of times that a backend which writes a buffer out to the OS must also fsync() it. This happens when the bgwriter fsync request queue is full, and is generally detrimental to performance, so it's good to know when it's happening. Along the way, log a new message at level DEBUG1 whenever we fail to hand off an fsync, so that the problem can also be seen in examination of log files (if the logging level is cranked up high enough). Greg Smith, with minor tweaks by me.
*	Remove outdated comments from the regression test files.	Robert Haas	2010-11-15
\| \| \| \| \| \| \|	Since 2004, int2 and int4 operators do detect overflow; this was fixed by commit 4171bb869f234281a13bb862d3b1e577bf336242. Extracted from a larger patch by Andres Freund.
*	Fix copy-and-pasteo a little more completely.	Robert Haas	2010-11-15
\| \| \| \|	copydir.c is no longer in src/port
*	Fix copy-and-pasteo.	Alvaro Herrera	2010-11-15
*	Avoid spurious Hot Standby conflicts from btree delete records.	Simon Riggs	2010-11-15
\| \| \| \| \| \| \|	Similar conflicts were already avoided for related record types. Massive over-caution resulted in a usability bug. Clear theoretical basis for doing this is now confirmed by me. Request to remove from Heikki (twice), over-caution by me.
*	Adjust comments about what's needed to avoid make 3.80 bug.	Tom Lane	2010-11-15
\| \| \| \|	... based on further tracing through that code.
*	Correct poor grammar in comment.	Robert Haas	2010-11-14
*	Cleanup various comparisons with the constant "true".	Robert Haas	2010-11-14
\| \| \| \|	Itagaki Takahiro, with slight modifications.
*	Fix canAcceptConnections() bugs introduced by replication-related patches.	Tom Lane	2010-11-14
\| \| \| \| \| \| \| \| \| \|	We must not return any "okay to proceed" result code without having checked for too many children, else we might fail later on when trying to add the new child to one of the per-child state arrays. It's not clear whether this oversight explains Stefan Kaltenbrunner's recent report, but it could certainly produce a similar symptom. Back-patch to 8.4; the logic was not broken before that.
*	Work around make 3.80 bug with long expansions of $(eval).	Tom Lane	2010-11-14
\| \| \| \| \| \| \| \| \| \| \|	3.80 breaks if the expansion of $(eval) is long enough to require expansion of its internal variable_buffer. For the purposes of $(recurse) that means it'll work so long as no single evaluation of _create_recursive_target produces more than 195 bytes. We can manage that by looping over subdirectories outside the call instead of complicating the generated rule. This coding is simpler and more readable anyway. Or at least, this works for me. We'll see if the buildfarm likes it.
*	Add missing outfuncs.c support for struct InhRelation.	Tom Lane	2010-11-13
\| \| \| \| \| \|	This is needed to support debug_print_parse, per report from Jon Nelson. Cursory testing via the regression tests suggests we aren't missing anything else.
*	Attempt to fix MSVC builds broken by parallel make changes.	Andrew Dunstan	2010-11-12
*	Move copydir() prototype into its own header file.	Robert Haas	2010-11-12
\| \| \| \| \| \| \|	Having this in src/include/port.h makes no sense, now that copydir.c lives in src/backend/storage rather than src/port. Along the way, remove an obsolete comment from contrib/pg_upgrade that makes reference to the old location.
*	Fix old oversight in const-simplification of COALESCE() expressions.	Tom Lane	2010-11-12
\|	Once we have found a non-null constant argument, there is no need to examine additional arguments of the COALESCE. The previous coding got it right only if the constant was in the first argument position; otherwise it tried to simplify following arguments too, leading to unexpected behavior like this:
\|	
\|		regression=# select coalesce(f1, 42, 1/0) from int4_tbl;
\|		ERROR:  division by zero
\|	
\|	It's a minor corner case, but a bug is a bug, so back-patch all the way.
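The intended behavior can be sketched as follows. The node classes and the `fold` helper are hypothetical stand-ins for the planner's expression nodes and constant folding. The point is that arguments are folded lazily and the loop stops at the first non-null constant, so `1/0` is never evaluated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: object  # None stands for SQL NULL


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Div:
    num: int
    den: int


def fold(expr):
    """Constant-fold one argument; a Div with den == 0 raises, like 1/0."""
    if isinstance(expr, Div):
        return Const(expr.num // expr.den)
    return expr


def simplify_coalesce(args):
    kept = []
    for arg in args:
        arg = fold(arg)
        if isinstance(arg, Const):
            if arg.value is None:
                continue      # NULL constants contribute nothing
            kept.append(arg)
            break             # later arguments can never be reached
        kept.append(arg)
    return kept


simplified = simplify_coalesce([Var("f1"), Const(42), Div(1, 0)])
```

Here `simplified` keeps only `f1` and `42`, and no division error is raised.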
*	Improved parallel make support	Peter Eisentraut	2010-11-12
\| \| \| \| \| \| \| \|	Replace for loops in makefiles with proper dependencies. Parallel make can now span across directories. Also, make -k and make -q work properly. GNU make 3.80 or newer is now required.
*	Add missing support for removing foreign data wrapper / server privileges	Heikki Linnakangas	2010-11-12
\| \| \| \| \| \| \| \|	belonging to a user at DROP OWNED BY. Foreign data wrappers and servers don't do anything useful yet, which is why no-one has noticed, but since we have them, it seems prudent to fix this. Per report from Chetan Suttraway. Backpatch to 9.0; 8.4 has the same problem, but this patch didn't apply there, so I'm not going to bother.
*	Fix bug introduced by the recent patch to check that the checkpoint redo	Heikki Linnakangas	2010-11-11
\| \| \| \| \| \| \|	location read from backup label file can be found: wasShutdown was set incorrectly when a backup label file was found. Jeff Davis, with a little tweaking by me.
*	Fix line_construct_pm() for the case of "infinite" (DBL_MAX) slope.	Tom Lane	2010-11-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This code was just plain wrong: what you got was not a line through the given point but a line almost indistinguishable from the Y-axis, although not truly vertical. The only caller that tries to use this function with m == DBL_MAX is dist_ps_internal for the case where the lseg is horizontal; it would end up producing the distance from the given point to the place where the lseg's line crosses the Y-axis. That function is used by other operators too, so there are several operators that could compute wrong distances from a line segment to something else. Per bug #5745 from jindiax. Back-patch to all supported branches.
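For illustration, constructing a line through a point with a given slope, including the vertical case, could look like the sketch below. It assumes the line is stored as coefficients (A, B, C) of Ax + By + C = 0 and that `DBL_MAX` marks an infinite slope. The function names are made up for this example.

```python
import math
import sys

DBL_MAX = sys.float_info.max


def line_through_point(px, py, m):
    """Return (A, B, C) with A*x + B*y + C = 0 passing through (px, py)."""
    if m == DBL_MAX:
        # Vertical line x = px, written as -x + px = 0.
        return (-1.0, 0.0, px)
    # Non-vertical line y = m*x + (py - m*px).
    return (m, -1.0, py - m * px)


def distance_to_line(x, y, line):
    """Perpendicular distance from (x, y) to the line."""
    a, b, c = line
    return abs(a * x + b * y + c) / math.hypot(a, b)


vertical = line_through_point(3.0, 4.0, DBL_MAX)
gap = distance_to_line(5.0, 0.0, vertical)
```

A truly vertical line through x = 3 lies at distance 2 from the point (5, 0), which is the kind of result the buggy version could not produce.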
*	Add monitoring function pg_last_xact_replay_timestamp.	Robert Haas	2010-11-09
\| \| \| \|	Fujii Masao, with a little wordsmithing by me.
*	Don't use __declspec (dllimport) for PGDLLEXPORT to reduce warnings	Itagaki Takahiro	2010-11-10
\| \| \| \| \|	by gcc version 4 on mingw and cygwin. We don't use dllexport here because dllexport and dllwrap don't work well together.
*	Repair memory leakage while ANALYZE-ing complex index expressions.	Tom Lane	2010-11-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The general design of memory management in Postgres is that intermediate results computed by an expression are not freed until the end of the tuple cycle. For expression indexes, ANALYZE has to re-evaluate each expression for each of its sample rows, and it wasn't bothering to free intermediate results until the end of processing of that index. This could lead to very substantial leakage if the intermediate results were large, as in a recent example from Jakub Ouhrabka. Fix by doing ResetExprContext for each sample row. This necessitates adding a datumCopy step to ensure that the final expression value isn't recycled too. Some quick testing suggests that this change adds at worst about 10% to the time needed to analyze a table with an expression index; which is annoying, but seems a tolerable price to pay to avoid unexpected out-of-memory problems. Back-patch to all supported branches.
*	In rewriteheap.c (used by VACUUM FULL and CLUSTER), calculate the tuple	Heikki Linnakangas	2010-11-09
\| \| \| \| \| \| \| \| \| \|	length stored in the line pointer the same way it's calculated in the normal heap_insert() codepath. As noted by Jeff Davis, the length stored by raw_heap_insert() included padding but the one stored by the normal codepath did not. While the mismatch seems to be harmless, inconsistency isn't good, and the normal codepath has received a lot more testing over the years. Backpatch to 8.3 where the heap rewrite code was introduced.
*	Fix error handling in temp-file deletion with log_temp_files active.	Tom Lane	2010-11-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The original coding in FileClose() reset the file-is-temp flag before unlinking the file, so that if control came back through due to an error, it wouldn't try to unlink the file twice. This was correct when written, but when the log_temp_files feature was added, the logging action was put in between those two steps. An error occurring during the logging action --- such as a query cancel --- would result in the unlink not getting done at all, as in recent report from Michael Glaesemann. To fix this, make sure that we do both the stat and the unlink before doing anything that could conceivably CHECK_FOR_INTERRUPTS. There is a judgment call here, which is which log message to emit first: if you can see only one, which should it be? I chose to log unlink failure at the risk of losing the log_temp_files log message --- after all, if the unlink does fail, the temp file is still there for you to see. Back-patch to all versions that have log_temp_files. The code was OK before that.
*	Fix permanent memory leak in autovacuum launcher	Alvaro Herrera	2010-11-08
\| \| \| \| \| \| \| \| \| \| \| \|	get_database_list was uselessly allocating its output data, along with some created along the way, in a permanent memory context. This didn't matter when autovacuum was a single, short-lived process, but now that the launcher is permanent, it shows up as a permanent leak. To fix, make get_database_list allocate its output data in the caller's context, which is in charge of freeing it when appropriate; the memory leaked by heap_beginscan et al is now allocated in a throwaway transaction context.
*	Use appendrel planning logic for top-level UNION ALL structures.	Tom Lane	2010-11-08
\| \| \| \| \| \| \| \| \| \| \| \| \|	Formerly, we could convert a UNION ALL structure inside a subquery-in-FROM into an appendrel, as a side effect of pulling up the subquery into its parent; but top-level UNION ALL always caused use of plan_set_operations(). That didn't matter too much because you got an Append-based plan either way. However, now that the appendrel code can do things with MergeAppend, it's worthwhile to hack up the top-level case so it also uses appendrels. This is a bit of a stopgap; but going much further than this will require a major rewrite of the planner's set-operations support, which I'm not prepared to undertake now. For the moment let's grab the low-hanging fruit.
*	Prevent invoking I/O conversion casts via functional/attribute notation.	Tom Lane	2010-11-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PG 8.4 added a built-in feature for casting pretty much any data type to string types (text, varchar, etc). We allowed this to work in any of the historically-allowed syntaxes: CAST(x AS text), x::text, text(x), or x.text. However, multiple complaints have shown that it's too easy to invoke such casts unintentionally in the latter two styles, particularly field selection. To cure the problem with the narrowest possible change of behavior, disallow use of I/O conversion casts from composite types to string types via functional/attribute syntax. The new functionality is still available via cast syntax. In passing, document the equivalence of functional and attribute syntax in a more visible place.
*	Implement an "S" option for psql's \dn command.	Tom Lane	2010-11-06
\| \| \| \| \| \| \|	\dn without "S" now hides all pg_XXX schemas as well as information_schema. Thus, in a bare database you'll only see "public". ("public" is considered a user schema, not a system schema, mainly because it's droppable.) Per discussion back in late September.
*	Add support for detecting register-stack overrun on IA64.	Tom Lane	2010-11-06
\| \| \| \| \| \| \| \| \|	Per recent investigation, the register stack can grow faster than the regular stack depending on compiler and choice of options. To avoid crashes we must check both stacks in check_stack_depth(). Since this is poorly-tested code, committing only to HEAD for the moment ... but we might want to consider back-patching later.
*	Make get_stack_depth_rlimit() handle RLIM_INFINITY more sanely.	Tom Lane	2010-11-06
\| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than considering this result as meaning "unknown", report LONG_MAX. This won't change what superusers can set max_stack_depth to, but it will cause InitializeGUCOptions() to set the built-in default to 2MB not 100kB. The latter seems like a fairly unreasonable interpretation of "infinity". Per my investigation of odd buildfarm results as well as an old complaint from Heikki. Since this should persuade all the buildfarm animals to use a reasonable stack depth setting during "make check", revert previous patch that dumbed down a recursive regression test to only 5 levels.
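The mapping described here can be sketched with Python's Unix-only `resource` module. The helper name and the 64-bit `LONG_MAX` value are assumptions for the example. The real logic lives in C inside the server.

```python
import resource  # Unix only

LONG_MAX = 2**63 - 1  # assumption: a platform with 64-bit long


def stack_depth_rlimit(raw_limit):
    """Map an 'unlimited' stack rlimit to LONG_MAX instead of 'unknown'."""
    if raw_limit == resource.RLIM_INFINITY:
        return LONG_MAX
    return raw_limit


# Usage: read the soft stack limit of the current process and normalize it.
soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_STACK)
effective_limit = stack_depth_rlimit(soft_limit)
```

Treating "unlimited" as a very large number lets later code apply its ordinary upper bound rather than fall back to the most conservative default.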
*	Include the current value of max_stack_depth in stack depth complaints.	Tom Lane	2010-11-04
\| \| \| \| \| \| \|	I'm mainly interested in finding out what it is on buildfarm machines, but including the active value in the message seems like good practice in any case. Add the info to the HINT, not the ERROR string, so as not to change the regression tests' expected output.
*	Use appendStringInfoString() where appropriate in elog.c.	Tom Lane	2010-11-04
\| \| \| \| \| \| \| \| \|	The nominally equivalent call appendStringInfo(buf, "%s", str) can be significantly slower when str is large. In particular, the former usage in EVALUATE_MESSAGE led to O(N^2) behavior when collecting a large number of context lines, as I found out while testing recursive functions. The other changes are just neatnik-ism and seem unlikely to save anything meaningful, but a cycle shaved is a cycle earned.
*	Reimplement planner's handling of MIN/MAX aggregate optimization.	Tom Lane	2010-11-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Per my recent proposal, get rid of all the direct inspection of indexes and manual generation of paths in planagg.c. Instead, set up EquivalenceClasses for the aggregate argument expressions, and let the regular path generation logic deal with creating paths that can satisfy those sort orders. This makes planagg.c a bit more visible to the rest of the planner than it was originally, but the approach is basically a lot cleaner than before. A major advantage of doing it this way is that we get MIN/MAX optimization on inheritance trees (using MergeAppend of indexscans) practically for free, whereas in the old way we'd have had to add a whole lot more duplicative logic. One small disadvantage of this approach is that MIN/MAX aggregates can no longer exploit partial indexes having an "x IS NOT NULL" predicate, unless that restriction or something that implies it is specified in the query. The previous implementation was able to use the added "x IS NOT NULL" condition as an extra predicate proof condition, but in this version we rely entirely on indexes that are considered usable by the main planning process. That seems a fair tradeoff for the simplicity and functionality gained.
*	Reduce recursion depth in recently-added regression test.	Tom Lane	2010-11-03
\| \| \| \| \| \| \| \| \| \|	Some buildfarm members fail the test with the original depth of 10 levels, apparently because they are running at the minimum max_stack_depth setting of 100kB and using ~ 10k per recursion level. While it might be interesting to try to figure out why they're eating so much stack, it isn't likely that any fix for that would be back-patchable. So just change the test to recurse only 5 levels. The extra levels don't prove anything correctness-wise anyway.
*	Use only one hash entry for all instances of a pltcl trigger function.	Tom Lane	2010-11-03
\| \| \| \| \| \| \| \| \|	Like plperl and unlike plpgsql, there isn't any cached state that could depend on exactly which relation the trigger is being fired for. So we can use just one hash entry for all relations, which might save a little something. Alex Hunsaker
*	Fix adjust_semi_join to be more cautious about clauseless joins.	Tom Lane	2010-11-02
\| \| \| \| \| \| \|	It was reporting that these were fully indexed (hence cheap), when of course they're the exact opposite of that. I'm not certain if the case would arise in practice, since a clauseless semijoin is hard to produce in SQL, but if it did happen we'd make some dumb decisions.
*	Ensure an index that uses a whole-row Var still depends on its table.	Tom Lane	2010-11-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We failed to record any dependency on the underlying table for an index declared like "create index i on t (foo(t.*))". This would create trouble if the table were dropped without previously dropping the index. To fix, simplify some overly-cute code in index_create(), accepting the possibility that sometimes the whole-table dependency will be redundant. Also document this hazard in dependency.c. Per report from Kevin Grittner. In passing, prevent a core dump in pg_get_indexdef() if the index's table can't be found. I came across this while experimenting with Kevin's example. Not sure it's a real issue when the catalogs aren't corrupt, but might as well be cautious. Back-patch to all supported versions.
*	Some cleanup in ecpg code:	Michael Meskes	2010-11-02
\| \| \| \| \| \|	Use bool as type for booleans instead of int. Do not implicitly cast size_t to int. Make the compiler stop complaining about unused variables by adding an empty statement.
*	Bootstrap WAL to begin at segment logid=0 logseg=1 (000000010000000000000001)	Heikki Linnakangas	2010-11-02
\| \| \| \| \| \| \| \| \| \| \| \| \|	rather than 0/0, so that we can safely use 0/0 as an invalid value. This is a more future-proof fix for the corner-case bug in streaming replication that was fixed yesterday. We had a similar corner-case bug with log/seg 0/0 back in February as well. Avoiding 0/0 as a valid value should prevent bugs like that in the future. Per Tom Lane's idea. Back-patch to 9.0. Since this only affects bootstrapping, it makes no difference to existing installations. We don't need to worry about the bug in existing installations, because if you've managed to get past the initial base backup already, you won't hit the bug in the future either.
*	Avoid using a local FunctionCallInfoData struct in ExecMakeFunctionResult	Tom Lane	2010-11-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	and related routines. We already had a redundant FunctionCallInfoData struct in FuncExprState, but were using that copy only in set-returning-function cases, to avoid keeping function evaluation state in the expression tree for the benefit of plpgsql's "simple expression" logic. But of course that didn't work anyway. Given the recent fixes in plpgsql there is no need to have two separate behaviors here. Getting rid of the local FunctionCallInfoData structs should make things a little faster (because we don't need to do InitFunctionCallInfoData each time), and it also makes for a noticeable reduction in stack space consumption during recursive calls.
*	Fix corner-case bug in tracking of latest removed WAL segment during	Heikki Linnakangas	2010-11-01
\| \| \| \| \| \| \| \| \|	streaming replication. We used log/seg 0/0 to indicate that no WAL segments have been removed since startup, but 0/0 is a valid value for the very first WAL segment after initdb. To make that unambiguous, store (latest removed WAL segment + 1) in the global variable. Per report from Matt Chesler, also reproduced by Greg Smith.
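The "+ 1" trick can be shown in a few lines. This sketch flattens log/seg into a single integer and uses a hypothetical `RemovedSegmentTracker` class. Only the encoding idea comes from the commit message.

```python
class RemovedSegmentTracker:
    """Stores (latest removed segment + 1) so that 0 can mean 'none yet'."""

    def __init__(self):
        self._stored = 0  # nothing removed since startup

    def record_removed(self, segno):
        self._stored = segno + 1

    def latest_removed(self):
        if self._stored == 0:
            return None
        return self._stored - 1


tracker = RemovedSegmentTracker()
before = tracker.latest_removed()   # no segment removed yet
tracker.record_removed(0)           # the very first segment is removed
after = tracker.latest_removed()
```

Segment 0 and "nothing removed" are now distinguishable, which they were not when 0 served both purposes.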
*	Revert removal of trigger flag from plperl function hash key. (tag: REL9_1_ALPHA2)	Tom Lane	2010-10-31
\| \| \| \| \| \| \| \|	As noted by Jan Urbanski, this flag is in fact needed to ensure that the function's input/result conversion functions are set up as expected. Add a regression test to discourage anyone from making same mistake in future.
*	Provide hashing support for arrays.	Tom Lane	2010-10-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The core of this patch is hash_array() and associated typcache infrastructure, which works just about exactly like the existing support for array comparison. In addition I did some work to ensure that the planner won't think that an array type is hashable unless its element type is hashable, and similarly for sorting. This includes adding a datatype parameter to op_hashjoinable and op_mergejoinable, and adding an explicit "hashable" flag to SortGroupClause. The lack of a cross-check on the element type was a pre-existing bug in mergejoin support --- but it didn't matter so much before, because if you couldn't sort the element type there wasn't any good alternative to failing anyhow. Now that we have the alternative of hashing the array type, there are cases where we can avoid a failure by being picky at the planner stage, so it's time to be picky. The issue of exactly how to combine the per-element hash values to produce an array hash is still open for discussion, but the rest of this is pretty solid, so I'll commit it as-is.
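Since the message leaves the combining rule open, the following is only one possible scheme, chosen for illustration: rotate the running hash left by one bit and XOR in each element's hash. The function name mirrors the commit, but the body is an assumption and not the committed code.

```python
MASK32 = 0xFFFFFFFF


def hash_array(elements, hash_element=hash):
    """Combine per-element hashes into one 32-bit, order-sensitive hash."""
    result = 0
    for elem in elements:
        # Rotate left by one bit so that element order affects the result.
        result = ((result << 1) | (result >> 31)) & MASK32
        result ^= hash_element(elem) & MASK32
    return result


forward = hash_array([1, 2])
backward = hash_array([2, 1])
```

Any scheme of this kind needs the element type itself to be hashable, which is the cross-check the commit adds at the planner level.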
*	Fix comparisons of pointers with zero to compare with NULL instead.	Tom Lane	2010-10-29
\| \| \| \| \| \| \|	Per C standard, these are semantically the same thing; but saying NULL when you mean NULL is good for readability. Marti Raudsepp, per results of INRIA's Coccinelle.
*	Oops, missed one fix for EquivalenceClass rearrangement.	Tom Lane	2010-10-29
\| \| \| \| \| \| \| \| \| \|	Now that we're expecting a mergeclause's left_ec/right_ec to persist from the initial assignments, we can't just blithely zero these out when transforming such a clause in adjust_appendrel_attrs. But really it should be okay to keep the parent's values, since a child table's derived Var ought to be equivalent to the parent Var for all EquivalenceClass purposes. (Indeed, I'm wondering whether we couldn't find a way to dispense with add_child_rel_equivalences altogether. But this is wrong in any case.)