postgresql - postgresql mirror

	Commit message	Author	Age
*	Introduce the concept of relation forks. An smgr relation can now consist	Heikki Linnakangas	2008-08-11
	of multiple forks, and each fork can be created and grown separately. The bulk of this patch is about changing the smgr API to include an extra ForkNumber argument in every smgr function. Also, smgrscheduleunlink and smgrdounlink no longer implicitly call smgrclose, because other forks might still exist after unlinking one. The callers of those functions have been modified to call smgrclose instead. This patch in itself doesn't have any user-visible effect, but provides the infrastructure needed for upcoming patches. The additional forks envisioned are a rewritten FSM implementation that doesn't rely on a fixed-size shared memory block, and a visibility map to allow skipping portions of a table in VACUUM that have no dead tuples.
*	Fix corner-case bug introduced with HOT: if REINDEX TABLE pg_class (or a	Tom Lane	2008-08-10
	REINDEX DATABASE including same) is done before a session has done any other update on pg_class, the pg_class relcache entry was left with an incorrect setting of rd_indexattr, because the indexed-attributes set would be first demanded at a time when we'd forced a partial list of indexes into the pg_class entry, and it would remain cached after that. This could result in incorrect decisions about HOT-update safety later in the same session. In practice, since only pg_class_relname_nsp_index would be missed out, only ALTER TABLE RENAME and ALTER TABLE SET SCHEMA could trigger a problem. Per report and test case from Ondrej Jirman.
*	Install checks in executor startup to ensure that the tuples produced by an	Tom Lane	2008-08-08
	INSERT or UPDATE will match the target table's current rowtype. In pre-8.3 releases inconsistency can arise with stale cached plans, as reported by Merlin Moncure. (We patched the equivalent hazard on the SELECT side in Feb 2007; I'm not sure why we thought there was no risk on the insertion side.) In 8.3 and HEAD this problem should be impossible due to plan cache invalidation management, but it seems prudent to make the check anyway. Back-patch as far as 8.0. 7.x versions lack ALTER COLUMN TYPE, so there seems no way to abuse a stale plan comparably.
*	Improve INTERSECT/EXCEPT hashing by realizing that we don't need to make any	Tom Lane	2008-08-07
	hashtable entries for tuples that are found only in the second input: they can never contribute to the output. Furthermore, this implies that the planner should endeavor to put first the smaller (in number of groups) input relation for an INTERSECT. Implement that, and upgrade prepunion's estimation of the number of rows returned by setops so that there's some amount of sanity in the estimate of which one is smaller.
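The idea in this commit can be illustrated with a minimal Python sketch. It is not the executor's C code: `hashed_setop` and its parameters are hypothetical names, rows are assumed to be hashable, and only the duplicate-eliminating forms (no ALL) are modelled.

```python
def hashed_setop(first, second, op="intersect"):
    """Duplicate-eliminating INTERSECT or EXCEPT with a single hash table.

    Only rows from `first` create table entries. Rows from `second` can
    only flag entries that already exist, because a row seen only in the
    second input can never appear in the result of either operation.
    """
    if op not in ("intersect", "except"):
        raise ValueError("op must be 'intersect' or 'except'")

    # Maps each distinct row of the first input to "was it seen in second?".
    seen_in_second = {}
    for row in first:
        seen_in_second.setdefault(row, False)

    for row in second:
        if row in seen_in_second:      # probe only, never insert
            seen_in_second[row] = True

    want_hit = (op == "intersect")
    return [row for row, hit in seen_in_second.items() if hit == want_hit]
```

Since the table holds one entry per distinct row of `first`, passing the input with fewer groups as `first` keeps it small. That swap is only valid for INTERSECT, which is symmetric; EXCEPT is not. The list order returned here is an artifact of Python dictionaries and should not be read as a guarantee of sorted output.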
*	Support hashing for duplicate-elimination in INTERSECT and EXCEPT queries.	Tom Lane	2008-08-07
	This completes my project of improving usage of hashing for duplicate elimination (aggregate functions with DISTINCT remain undone, but that's for some other day). As with the previous patches, this means we can INTERSECT/EXCEPT on datatypes that can hash but not sort, and it means that INTERSECT/EXCEPT without ORDER BY are no longer certain to produce sorted output.
*	Teach the system how to use hashing for UNION. (INTERSECT/EXCEPT will follow,	Tom Lane	2008-08-07
	but seem like a separate patch since most of the remaining work is on the executor side.) I took the opportunity to push selection of the grouping operators for set operations into the parser where it belongs. Otherwise this is just a small exercise in making prepunion.c consider both alternatives. As with the recent DISTINCT patch, this means we can UNION on datatypes that can hash but not sort, and it means that UNION without ORDER BY is no longer certain to produce sorted output.
*	Do not allow Unique nodes to be scanned backwards. The code claimed that it	Tom Lane	2008-08-05
	would work, but in fact it didn't return the same rows when moving backwards as when moving forwards. This would have no visible effect in a DISTINCT query (at least assuming the column datatypes use a strong definition of equality), but it gave entirely wrong answers for DISTINCT ON queries.
*	Department of second thoughts: fix newly-added code in planner.c to make real	Tom Lane	2008-08-05
	sure that DISTINCT ON does what it's supposed to, ie, sort by the full ORDER BY list before unique-ifying. The error seems masked in simple cases by the fact that query_planner won't return query pathkeys that only partially match the requested sort order, but I wouldn't want to bet that it couldn't be exposed in some way or other.
*	In ReadOrZeroBuffer (and related entry points), don't bother to call	Tom Lane	2008-08-05
	PageHeaderIsValid when we zero the buffer instead of reading the page in. The actual performance improvement is probably marginal since this function isn't very heavily used, but a cycle saved is a cycle earned. Zdenek Kotala
*	Move pgstat.tmp into a temporary directory under $PGDATA named pg_stat_tmp.	Magnus Hagander	2008-08-05
	This allows the use of a ramdrive (either through mount or symlink) for the temporary file that's written every half second, which should reduce I/O. On server shutdown/startup, the file is written to the old location in the global directory, to preserve data across restarts. Bump catversion since the $PGDATA directory layout changed.
*	Improve SELECT DISTINCT to consider hash aggregation, as well as sort/uniq,	Tom Lane	2008-08-05
	as methods for implementing the DISTINCT step. This eliminates the former performance gap between DISTINCT and GROUP BY, and also makes it possible to do SELECT DISTINCT on datatypes that only support hashing not sorting. SELECT DISTINCT ON is still always implemented by sorting; it would take executor changes to support hashing that, and it's not clear it's worth the trouble. This is a release-note-worthy incompatibility from previous PG versions, since SELECT DISTINCT can no longer be counted on to deliver sorted output without explicitly saying ORDER BY. (Anyone who can't cope with that can consider turning off enable_hashagg.) Several regression test queries needed to have ORDER BY added to preserve stable output order. I fixed the ones that manifested here, but there might be some other cases that show up on other platforms.
*	Improve CREATE/DROP/RENAME DATABASE so that when failing because the source	Tom Lane	2008-08-04
	or target database is being accessed by other users, it tells you whether the "other users" are live sessions or uncommitted prepared transactions. (Indeed, it tells you exactly how many of each, but that's mostly just because it was easy to do so.) This should help forestall the gotcha of not realizing that a prepared transaction is what's blocking the command. Per discussion.
*	Make GROUP BY work properly for datatypes that only support hashing and not	Tom Lane	2008-08-03
	sorting. The infrastructure for this was all in place already; it's only necessary to fix the planner to not assume that sorting is always an available option.
*	Tighten up the sanity checks in TypeCreate(): pass-by-value types must have	Tom Lane	2008-08-03
	a size that is one of the supported values, not just anything <= sizeof(Datum). Cross-check the alignment specification against size as well.
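As a rough illustration of the kind of check described, here is a Python sketch. It does not reproduce TypeCreate(): the function name `check_type_spec`, the size-to-alignment table, and the default Datum width are all assumptions made for the example.

```python
# Assumed mapping from a pass-by-value size in bytes to the alignment
# code expected for it ('c' char, 's' short, 'i' int, 'd' double).
EXPECTED_ALIGN = {1: "c", 2: "s", 4: "i", 8: "d"}


def check_type_spec(size, align, by_value, sizeof_datum=8):
    """Return None if the spec looks sane, otherwise an error message."""
    if not by_value:
        return None  # this sketch only models the pass-by-value checks

    # The size must be one of the supported widths that fit in a Datum,
    # not merely any value <= sizeof(Datum).
    supported = [s for s in EXPECTED_ALIGN if s <= sizeof_datum]
    if size not in supported:
        return "internal size %d is invalid for a pass-by-value type" % size

    # Cross-check the declared alignment against the size.
    if align != EXPECTED_ALIGN[size]:
        return "alignment %r is invalid for a pass-by-value type of size %d" % (
            align, size)
    return None
```

Under these assumptions a 3-byte pass-by-value type is rejected even though 3 <= sizeof(Datum), which is the case the older, looser test let through.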
*	Rearrange the querytree representation of ORDER BY/GROUP BY/DISTINCT items	Tom Lane	2008-08-02
	as per my recent proposal: 1. Fold SortClause and GroupClause into a single node type SortGroupClause. We were already relying on them to be struct-equivalent, so using two node tags wasn't accomplishing much except to get in the way of comparing items with equal(). 2. Add an "eqop" field to SortGroupClause to carry the associated equality operator. This is cheap for the parser to get at the same time it's looking up the sort operator, and storing it eliminates the need for repeated not-so-cheap lookups during planning. In future this will also let us represent GROUP/DISTINCT operations on datatypes that have hash opclasses but no btree opclasses (ie, they have equality but no natural sort order). The previous representation simply didn't work for that, since its only indicator of comparison semantics was a sort operator. 3. Add a hasDistinctOn boolean to struct Query to explicitly record whether the distinctClause came from DISTINCT or DISTINCT ON. This allows removing some complicated and not 100% bulletproof code that attempted to figure that out from the distinctClause alone. This patch doesn't in itself create any new capability, but it's necessary infrastructure for future attempts to use hash-based grouping for DISTINCT and UNION/INTERSECT/EXCEPT.
*	Add a few more DTrace probes to the backend.	Alvaro Herrera	2008-08-01
	Robert Lor
*	Rearrange the code in auth.c so that all functions for a single authentication	Magnus Hagander	2008-08-01
	method are grouped together in a reasonably similar way, keeping the "global shared functions" together in their own section as well. Makes it a lot easier to find your way around the code.
*	Move ident authentication code into auth.c along with the other authentication	Magnus Hagander	2008-08-01
	routines, leaving hba.c to deal only with processing the HBA specific files.
*	Fix parser so that we don't modify the user-written ORDER BY list in order	Tom Lane	2008-07-31
	to represent DISTINCT or DISTINCT ON. This gets rid of a longstanding annoyance that a view or rule using SELECT DISTINCT will be dumped out with an overspecified ORDER BY list, and is one small step along the way to decoupling DISTINCT and ORDER BY enough so that hash-based implementation of DISTINCT will be possible. In passing, improve transformDistinctClause so that it doesn't reject duplicate DISTINCT ON items, as was reported by Steve Midgley a couple weeks ago.
*	Require superuser privilege to create base types (but not composites, enums,	Tom Lane	2008-07-31
	or domains). This was already effectively required because you had to own the I/O functions, and the I/O functions pretty much have to be written in C since we don't let PL functions take or return cstring. But given the possible security consequences of a malicious type definition, it seems prudent to enforce superuser requirement directly. Per recent discussion.
*	Allow I/O conversion casts to be applied to or from any type that is a member	Tom Lane	2008-07-30
	of the STRING type category, thereby opening up the mechanism for user-defined types. This is mainly for the benefit of citext, though; there aren't likely to be a lot of types that are all general-purpose character strings. Per discussion with David Wheeler.
*	Flip the default typispreferred setting from true to false. This affects	Tom Lane	2008-07-30
	only type categories in which the previous coding made every type preferred; so there is no change in effective behavior, because the function resolution rules only do something different when faced with a choice between preferred and non-preferred types in the same category. It just seems safer and less surprising to have CREATE TYPE default to non-preferred status ...
*	Replace the hard-wired type knowledge in TypeCategory() and IsPreferredType()	Tom Lane	2008-07-30
	with system catalog lookups, as was foreseen to be necessary almost since their creation. Instead put the information into two new pg_type columns, typcategory and typispreferred. Add support for setting these when creating a user-defined base type. The category column is just a "char" (i.e. a poor man's enum), allowing a crude form of user extensibility of the category list: just use an otherwise-unused character. This seems sufficient for foreseen uses, but we could upgrade to having an actual category catalog someday, if there proves to be a huge demand for custom type categories. In this patch I have attempted to hew exactly to the behavior of the previous hardwired logic, except for introducing new type categories for arrays, composites, and enums. In particular the default preferred state for user-defined types remains TRUE. That seems worth revisiting, but it should be done as a separate patch from introducing the infrastructure. Likewise, any adjustment of the standard set of categories should be done separately.
*	As noted by Andrew Gierth, there's really no need any more to force a junk	Tom Lane	2008-07-26
	filter to be used when INSERT or SELECT INTO has a plan that returns raw disk tuples. The virtual-tuple-slot optimizations that were put in place awhile ago mean that ExecInsert has to do ExecMaterializeSlot, and that already copies the tuple if it's raw (and does so more efficiently than a junk filter, too). So get rid of that logic. This in turn means that we can throw away ExecMayReturnRawTuples, which wasn't used for any other purpose, and was always a kluge anyway. In passing, move a couple of SELECT-INTO-specific fields out of EState and into the private state of the SELECT INTO DestReceiver, as was foreseen in an old comment there. Also make intorel_receive use ExecMaterializeSlot not ExecCopySlotTuple, for consistency with ExecInsert and to possibly save a tuple copy step in some cases.
*	Fix parsing of LDAP URLs so it doesn't reject spaces in the "suffix" part.	Tom Lane	2008-07-24
	Per report from César Miguel Oliveira Alves.
*	Remove some redundant tests and improve comments in next_token().	Tom Lane	2008-07-24
	Cosmetic, but it might make this a bit less confusing to the next reader.
*	Ratchet up patch to improve autovacuum wraparound messages.	Alvaro Herrera	2008-07-23
	Simon Riggs
*	Use guc.c's parse_int() instead of pg_atoi() to parse fillfactor in	Tom Lane	2008-07-23
	default_reloptions(). The previous coding was really a bug because pg_atoi() will always throw elog on bad input data, whereas default_reloptions is not supposed to complain about bad input unless its validate parameter is true. Right now you could only expose the problem by hand-modifying pg_class.reloptions into an invalid state, so it doesn't seem worth back-patching; but we should get it right in HEAD because there might be other situations in future. Noted while studying GIN fast-update patch.
*	Publish more openly the fact that autovacuum is working for wraparound	Alvaro Herrera	2008-07-21
	protection. Simon Riggs
*	Add comment about the two different query strings that ExecuteQuery()	Tom Lane	2008-07-21
	has to deal with.
*	Code review for array_fill patch: fix inadequate check for array size overflow	Tom Lane	2008-07-21
	and bogus documentation (dimension arrays are int[] not anyarray). Also the errhint() messages seem to be really errdetail(), since there is nothing heuristic about them. Some other trivial cosmetic improvements.
*	Avoid substituting NAMEDATALEN, FLOAT4PASSBYVAL, and FLOAT8PASSBYVAL into	Tom Lane	2008-07-19
	the postgres.bki file during build, because we want that file to be entirely platform- and configuration-independent; else it can't safely be put into /usr/share on multiarch machines. We can do the substitution during initdb, instead. FLOAT4PASSBYVAL and FLOAT8PASSBYVAL are new breakage as of 8.4, while the NAMEDATALEN hazard has been there all along but I guess no one tripped over it. Noticed while trying to build "universal" OS X binaries.
*	Adjust things so that the query_string of a cached plan and the sourceText of	Tom Lane	2008-07-18
	a portal are never NULL, but reliably provide the source text of the query. It turns out that there was only one place that was really taking a short-cut, which was the 'EXECUTE' utility statement. That doesn't seem like a sufficiently critical performance hotspot to justify not offering a guarantee of validity of the portal source text. Fix it to copy the source text over from the cached plan. Add Asserts in the places that set up cached plans and portals to reject null source strings, and simplify a bunch of places that formerly needed to guard against nulls. There may be a few places that cons up statements for execution without having any source text at all; I found one such in ConvertTriggerToFK(). It seems sufficient to inject a phony source string in such a case, for instance ProcessUtility((Node *) atstmt, "(generated ALTER TABLE ADD FOREIGN KEY command)", NULL, false, None_Receiver, NULL); We should take a second look at the usage of debug_query_string, particularly the recently added current_query() SQL function. ITAGAKI Takahiro and Tom Lane
*	Provide a function hook to let plug-ins get control around ExecutorRun.	Tom Lane	2008-07-18
	ITAGAKI Takahiro
*	Fix a race condition that I introduced into sinvaladt.c during the recent	Tom Lane	2008-07-18
	rewrite. When called from SIInsertDataEntries, SICleanupQueue releases the write lock if it has to issue a kill() to signal some laggard backend. That still seems like a good idea --- but it's possible that by the time we get the lock back, there are no longer enough free message slots to satisfy SIInsertDataEntries' requirement. Must recheck, and repeat the whole SICleanupQueue process if not. Noted while reading code.
*	Implement SQL-spec RETURNS TABLE syntax for functions.	Tom Lane	2008-07-18
	(Unlike the original submission, this patch treats TABLE output parameters as being entirely equivalent to OUT parameters -- tgl) Pavel Stehule
*	Avoid crashing when a table is deleted while we're in the process of checking	Alvaro Herrera	2008-07-17
	it. Per report from Tom Lane based on buildfarm evidence.
*	Add dump support for SortBy nodes. Needed this while debugging a reported	Tom Lane	2008-07-17
	problem with DISTINCT, so might as well commit it.
*	Fix previous patch so that it actually works --- consider TRUNCATE foo,	Tom Lane	2008-07-16
	public.foo
*	Add a "provariadic" column to pg_proc to eliminate the remarkably expensive	Tom Lane	2008-07-16
	need to deconstruct proargmodes for each pg_proc entry inspected by FuncnameGetCandidates(). Fixes function lookup performance regression caused by yesterday's variadic-functions patch. In passing, make pg_proc.probin be NULL, rather than a dummy value '-', in cases where it is not actually used for the particular type of function. This should buy back some of the space cost of the extra column.
*	Allow TRUNCATE foo, foo to succeed, per report from Nikhils.	Bruce Momjian	2008-07-16
*	Support "variadic" functions, which can accept a variable number of arguments	Tom Lane	2008-07-16
	so long as all the trailing arguments are of the same (non-array) type. The function receives them as a single array argument (which is why they have to all be the same type). It might be useful to extend this facility to aggregates, but this patch doesn't do that. This patch imposes a noticeable slowdown on function lookup --- a follow-on patch will fix that by adding a redundant column to pg_proc. Pavel Stehule
*	Add array_fill() to create arrays initialized with a value.	Bruce Momjian	2008-07-16
	Pavel Stehule
*	Create a type-specific typanalyze routine for tsvector, which collects stats	Tom Lane	2008-07-14
	on the most common individual lexemes in place of the mostly-useless default behavior of counting duplicate tsvectors. Future work: create selectivity estimation functions that actually do something with these stats. (Some other things we ought to look at doing: using the Lossy Counting algorithm in compute_minimal_stats, and using the element-counting idea for stats on regular arrays.) Jan Urbanski
*	Change the PageGetContents() macro to guarantee its result is maxalign'd,	Tom Lane	2008-07-13
	thereby forestalling any problems with alignment of the data structure placed there. Since SizeOfPageHeaderData is maxalign'd anyway in 8.3 and HEAD, this does not actually change anything right now, but it is foreseeable that the header size will change again someday. I had to fix a couple of places that were assuming that the content offset is just SizeOfPageHeaderData rather than MAXALIGN(SizeOfPageHeaderData). Per discussion of Zdenek's page-macros patch.
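The arithmetic behind a maxalign'd content offset can be sketched in a few lines of Python. This is only an illustration of rounding up to an alignment boundary: the alignment of 8 and the header sizes used below are assumed values, and the function names are hypothetical.

```python
MAXIMUM_ALIGNOF = 8  # assumed; the real value depends on the platform


def maxalign(length, alignment=MAXIMUM_ALIGNOF):
    """Round length up to the next multiple of alignment (a power of two)."""
    return (length + alignment - 1) & ~(alignment - 1)


def page_contents_offset(header_size):
    """Offset of the page contents: the header size, rounded up."""
    return maxalign(header_size)
```

With an already aligned header, say 24 bytes, `page_contents_offset(24)` is 24 and nothing changes. If the header were 20 bytes, the contents would start at 24 instead of 20, which is why code that assumed the raw header size as the offset had to be fixed.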
*	Clean up the use of some page-header-access macros: principally, use	Tom Lane	2008-07-13
	SizeOfPageHeaderData instead of sizeof(PageHeaderData) in places where that makes the code clearer, and avoid casting between Page and PageHeader where possible. Zdenek Kotala, with some additional cleanup by Heikki Linnakangas. I did not apply the parts of the proposed patch that would have resulted in slightly changing the on-disk format of hash indexes; it seems to me that's not a win as long as there's any chance of having in-place upgrade for 8.4.
*	More replacements of binary compatible to binary coercible.	Peter Eisentraut	2008-07-12
*	Const-ify the arguments of str_tolower() and friends to suppress compile	Tom Lane	2008-07-12
	warnings. Clean up various unneeded cruft that was left behind after creating those routines. Introduce some convenience functions str_tolower_z etc to eliminate tedious and error-prone double arguments in formatting.c. (Currently there seems no need to export the latter, but maybe reconsider this later.)
*	Multi-column GIN indexes. Teodor Sigaev	Tom Lane	2008-07-11
*	Allow binary-coercible types for cast function arguments and return types.	Peter Eisentraut	2008-07-11
	Document return type of cast functions. Also change documentation to prefer the term "binary coercible" in its present sense instead of the previous term "binary compatible".