path: root/src/backend
...
* Fix two race conditions between the pending unlink mechanism that was put in place to
  prevent reusing relation OIDs before the next checkpoint, and DROP DATABASE.
  (Heikki Linnakangas, 2008-04-18)
  First, if a database was dropped, bgwriter would still try to unlink the files that the
  rmtree() call by the DROP DATABASE command has already deleted, or is just about to
  delete. Second, if a database is dropped and another database is created with the same
  OID, bgwriter would in the worst case delete a relation in the new database that
  happened to get the same OID as a dropped relation in the old database. To fix these
  race conditions:
  - make rmtree() ignore ENOENT errors; this fixes the first race condition
  - make ForgetDatabaseFsyncRequests() forget unlink requests as well
  - force a checkpoint in dropdb() on all platforms
  Since ForgetDatabaseFsyncRequests() is asynchronous, the second change isn't enough on
  its own to fix the problem of dropping and creating a database with the same OID, but
  forcing a checkpoint on DROP DATABASE makes it sufficient. Per Tom Lane's bug report
  and proposal. Backpatch to 8.3.
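  The ENOENT handling above can be sketched in isolation as follows; this is a generic
  illustration of "treat already-gone as success", not the actual src/port/dirmod.c code
  (the function name is invented):

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /*
     * If someone else (e.g. DROP DATABASE's own rmtree) already removed the
     * file, treat that as success rather than reporting an error.
     */
    static int
    unlink_if_exists(const char *path)
    {
        if (unlink(path) != 0 && errno != ENOENT)
        {
            fprintf(stderr, "could not remove \"%s\": %s\n", path, strerror(errno));
            return -1;
        }
        return 0;               /* the file is gone either way */
    }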
* Cause EXPLAIN's VERBOSE option to print the target list (output column list) of each
  plan node, instead of its former behavior of dumping the internal representation of
  the plan tree. (Tom Lane, 2008-04-18)
  The latter display is still available for those who really want it (see
  debug_print_plan), but uses for it are certainly few and far between. Per discussion.
  This patch also removes the explain_pretty_print GUC, which is obsoleted by the change.
* Clean up a few places where Datums were being treated as pointers (and vice versa)
  without going through DatumGetPointer. (Alvaro Herrera, 2008-04-17)
  Gavin Sherry, with Feng Tian.
* Fix a couple of oversights associated with the "physical tlist" optimization.
  (Tom Lane, 2008-04-17)
  We had several code paths where a physical tlist could be used for the input to a Sort
  node, which is a dumb idea because any unneeded table columns will increase the volume
  of data the sort has to push around. (Unfortunately the easy-looking fix of calling
  disuse_physical_tlist during make_sort_xxx doesn't work because in most cases we're
  already committed to the current input tlist --- it's been marked with sort column
  numbers, or we've built grouping column numbers using it, etc. The tlist has to be
  selected properly at the calling level before we start constructing sort-col
  information. This is easy enough to do, we were just failing to take the point into
  consideration.)
  Back-patch to 8.3. I believe the problem probably exists clear back to 7.4 when the
  physical tlist optimization was added, but I'm afraid to back-patch further than 8.3
  without a great deal more study than I want to put into it. The code in this area has
  drifted a lot over time. The real-world importance of these code paths is uncertain
  anyway --- I think in many cases we'd probably prefer hash-based methods.
* Re-enable pg_terminate_backend() using SIGTERM. SIGTERM testing still needed.
  (Bruce Momjian, 2008-04-17)
* Add some code to EXPLAIN to show the targetlist (ie, output columns) of each plan
  node. (Tom Lane, 2008-04-17)
  For the moment this is debug support only and is not enabled unless
  EXPLAIN_PRINT_TLISTS is defined at build time. Later I'll see about the idea of
  letting EXPLAIN VERBOSE do it.
* Repair two places where SIGTERM exit could leave shared memory state corrupted.
  (Tom Lane, 2008-04-16)
  (Neither is very important if SIGTERM is used to shut down the whole database cluster
  together, but there's a problem if someone tries to SIGTERM individual backends.) To
  do this, introduce new infrastructure macros
  PG_ENSURE_ERROR_CLEANUP/PG_END_ENSURE_ERROR_CLEANUP that take care of transiently
  pushing an on_shmem_exit cleanup hook. Also use this method for createdb cleanup ---
  that wasn't a shared-memory-corruption problem, but SIGTERM abort of createdb could
  leave orphaned files lying around. Backpatch as far as 8.2. The shmem corruption cases
  don't exist in 8.1, and the createdb usage doesn't seem important enough to risk
  backpatching further.
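  For reference, the new macro pair brackets a code section with a temporary
  on_shmem_exit callback. A minimal usage sketch, with an invented callback and
  argument (on_shmem_exit callbacks take an int code and a Datum argument):

    #include "postgres.h"
    #include "storage/ipc.h"

    /* invented cleanup callback; signature matches on_shmem_exit callbacks */
    static void
    my_cleanup_callback(int code, Datum arg)
    {
        /* release shared-memory state or remove files we were working on */
    }

    static void
    do_guarded_work(void *state)
    {
        /* ensure the callback runs if we error out or exit on SIGTERM */
        PG_ENSURE_ERROR_CLEANUP(my_cleanup_callback, PointerGetDatum(state));
        {
            /* ... work that touches shared memory or creates files ... */
        }
        PG_END_ENSURE_ERROR_CLEANUP(my_cleanup_callback, PointerGetDatum(state));
    }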
* Fix MinGW warnings re formats and unused variables. per ITAGAKI Takahiro
  (Andrew Dunstan, 2008-04-16)
* Fix LOAD_CRIT_INDEX() macro to take out AccessShareLock on the system index it is
  trying to build a relcache entry for. (Tom Lane, 2008-04-16)
  This is an oversight in my 8.2 patch that tried to ensure we always took a lock on a
  relation before trying to build its relcache entry. The implication is that if someone
  committed a reindex of a critical system index at about the same time that some other
  backend were starting up without a valid pg_internal.init file, the second one might
  PANIC due to not seeing any valid version of the index's pg_class row. Improbable
  case, but definitely not impossible.
* Revert addition of pg_terminate_backend() because of race conditions.
  (Bruce Momjian, 2008-04-15)
* Add pg_terminate_backend() to allow terminating only a single session.
  (Bruce Momjian, 2008-04-15)
* Push index operator lossiness determination down to GIST/GIN opclass "consistent"
  functions, and remove pg_amop.opreqcheck, as per recent discussion.
  (Tom Lane, 2008-04-14)
  The main immediate benefit of this is that we no longer need 8.3's ugly hack of
  requiring @@@ rather than @@ to test weight-using tsquery searches on GIN indexes. In
  future it should be possible to optimize some other queries better than is done now,
  by detecting at runtime whether the index match is exact or not.
  Tom Lane, after an idea of Heikki's, and with some help from Teodor.
* Since createplan.c no longer cares whether index operators are lossy, it has no
  particular need to do get_op_opfamily_properties() while building an indexscan plan.
  Postpone that lookup until executor start. (Tom Lane, 2008-04-13)
  This simplifies createplan.c a lot more than it complicates nodeIndexscan.c, and makes
  things more uniform since we already had to do it that way for RowCompare expressions.
  Should be a bit faster too, at least for plans that aren't re-used many times, since
  we avoid palloc'ing and perhaps copying the intermediate list data structure.
* Phase 2 of project to make index operator lossiness be determined at runtime instead
  of plan time. (Tom Lane, 2008-04-13)
  Extend the amgettuple API so that the index AM returns a boolean indicating whether
  the indexquals need to be rechecked, and make that rechecking happen in
  nodeIndexscan.c (currently the only place where it's expected to be needed; other
  callers of index_getnext are just erroring out for now). For the moment, GIN and GIST
  have stub logic that just always sets the recheck flag to TRUE --- I'm hoping to get
  Teodor to handle pushing that control down to the opclass consistent() functions. The
  planner no longer pays any attention to amopreqcheck, and that catalog column will go
  away in due course.
* Clean up a few places where Datums were being treated as pointers without going
  through DatumGetPointer or some other "official" conversion macro.
  (Tom Lane, 2008-04-12)
  Not actually a bug, since a pointer-sized Datum is the only supported case at the
  moment, but good cleanup for the future.
  Gavin Sherry
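  In use, the conversion macros look like this; a minimal sketch with an invented
  struct, showing the discouraged direct cast and the sanctioned macro route:

    #include "postgres.h"

    typedef struct MyStruct
    {
        int     value;
    } MyStruct;

    static Datum
    example(Datum d)
    {
        /* Discouraged: relies on Datum and pointer sharing a representation */
        /* MyStruct *p = (MyStruct *) d; */

        /* Preferred: go through the official conversion macro */
        MyStruct   *p = (MyStruct *) DatumGetPointer(d);

        p->value = 42;

        /* and the official macro for the opposite direction */
        return PointerGetDatum(p);
    }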
* Create new routines systable_beginscan_ordered, systable_getnext_ordered, and
  systable_endscan_ordered that have API similar to systable_beginscan etc (in
  particular, the passed-in scankeys have heap not index attnums), but guarantee ordered
  output, unlike the existing functions. (Tom Lane, 2008-04-12)
  For the moment these are just very thin wrappers around
  index_beginscan/index_getnext/etc. Someday they might need to get smarter; but for now
  this is just a code refactoring exercise to reduce the number of direct callers of
  index_getnext, in preparation for changing that function's API. In passing, remove
  index_getnext_indexitem, which has been dead code for quite some time, and will have
  even less use than that in the presence of run-time-lossy indexes.
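  A minimal usage sketch of the ordered variants (the relation, attribute number, and
  key value are placeholders, and the exact header locations are era-dependent); note
  that, as the commit says, the scan key carries the heap attribute number:

    #include "postgres.h"
    #include "access/genam.h"
    #include "access/sdir.h"
    #include "access/skey.h"
    #include "utils/fmgroids.h"
    #include "utils/rel.h"
    #include "utils/tqual.h"

    static void
    scan_in_order(Relation heapRel, Relation indexRel, Oid key_value)
    {
        ScanKeyData skey;
        SysScanDesc scan;
        HeapTuple   tuple;

        ScanKeyInit(&skey,
                    1,          /* heap attno of the key column (placeholder) */
                    BTEqualStrategyNumber, F_OIDEQ,
                    ObjectIdGetDatum(key_value));

        scan = systable_beginscan_ordered(heapRel, indexRel,
                                          SnapshotNow, 1, &skey);

        while ((tuple = systable_getnext_ordered(scan, ForwardScanDirection)) != NULL)
        {
            /* process tuple; rows arrive in index order */
        }

        systable_endscan_ordered(scan);
    }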
* Add some debug support code to try to catch future mistakes in the area of input
  functions that include garbage bytes in their results. (Tom Lane, 2008-04-11)
  Provide a compile-time option RANDOMIZE_ALLOCATED_MEMORY to make palloc fill returned
  blocks with variable contents. This option also makes the parser perform conversions
  of literal constants twice and compare the results, emitting a WARNING if they don't
  match. (This is the code I used to catch the input function bugs fixed in the previous
  commit.) For the moment, I've set it to be activated automatically by
  --enable-cassert.
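  The memory-randomization idea can be illustrated in isolation like this; a standalone
  sketch with an invented allocator name, not the actual palloc/mcxt code:

    #include <stdlib.h>
    #include <string.h>

    /*
     * Fill a freshly allocated block with a byte pattern that changes from
     * call to call, so code that relies on uninitialized bytes having any
     * particular value (e.g. zero) misbehaves visibly instead of silently.
     */
    static void *
    alloc_randomized(size_t size)
    {
        static unsigned char next_fill = 0x5A;  /* varies across calls */
        void       *ptr = malloc(size);

        if (ptr != NULL)
            memset(ptr, next_fill++, size);
        return ptr;
    }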
* Fix several datatype input functions that were allowing unused bytes in their results
  to contain uninitialized, unpredictable values. (Tom Lane, 2008-04-11)
  While this was okay as far as the datatypes themselves were concerned, it's a problem
  for the parser because occurrences of the "same" literal might not be recognized as
  equal by datumIsEqual (and hence not by equal()). It seems sufficient to fix this in
  the input functions since the only critical use of equal() is in the parser's
  comparisons of ORDER BY and DISTINCT expressions. Per a trouble report from Marc
  Cousin. Patch all the way back. Interestingly, array_in did not have the bug before
  8.2, which may explain why the issue went unnoticed for so long.
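  The standard fix for this class of bug is to zero the entire result before filling in
  its fields, so padding and unused bytes are deterministic and bitwise comparisons such
  as datumIsEqual see equal values as equal. A generic sketch with an invented type
  (backend code would use palloc0 rather than calloc):

    #include <stdlib.h>

    /* invented fixed-size datum with internal padding and unused space */
    typedef struct ExampleDatum
    {
        short       kind;       /* 2 bytes, followed by 2 padding bytes */
        int         value;
        char        unused[8];  /* not always filled in */
    } ExampleDatum;

    static ExampleDatum *
    example_in(int value)
    {
        /* calloc zeroes padding and unused bytes, so two results built from
         * the same input are bitwise identical */
        ExampleDatum *result = calloc(1, sizeof(ExampleDatum));

        result->kind = 1;
        result->value = value;
        return result;
    }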
* Replace "amgetmulti" AM functions with "amgetbitmap", in which the wholeTom Lane2008-04-10
| | | | | | | | | | | | | | | | | | indexscan always occurs in one call, and the results are returned in a TIDBitmap instead of a limited-size array of TIDs. This should improve speed a little by reducing AM entry/exit overhead, and it is necessary infrastructure if we are ever to support bitmap indexes. In an only slightly related change, add support for TIDBitmaps to preserve (somewhat lossily) the knowledge that particular TIDs reported by an index need to have their quals rechecked when the heap is visited. This facility is not really used yet; we'll need to extend the forced-recheck feature to plain indexscans before it's useful, and that hasn't been coded yet. The intent is to use it to clean up 8.3's horrid @@@ kluge for text search with weighted queries. There might be other uses in future, but that one alone is sufficient reason. Heikki Linnakangas, with some adjustments by me.
* Small wording improvements for source code READMEs. (Bruce Momjian, 2008-04-09)
* Revert README cleanups. (Bruce Momjian, 2008-04-09)
* Revert sentence removal from nickname in FAQ. (Bruce Momjian, 2008-04-09)
* Fix tsvector_update_trigger() to be domain-friendly: it needs to allow all the
  columns it works with to be domains over the expected type, not just exactly the
  expected type. (Tom Lane, 2008-04-08)
  In passing, fix ts_stat() the same way. Per report from Markus Wollny.
* Implement a few changes to how shared libraries and dynamically loadable modules are
  built. (Peter Eisentraut, 2008-04-07)
  Foremost, it creates a solid distinction between these two types of targets based on
  what had already been implemented and duplicated in ad hoc ways before. Specifically:
  - Dynamically loadable modules no longer get a soname. The numbers previously set in
    the makefiles were dummy numbers anyway, and the presence of a soname upset a few
    packaging tools, so it is nicer not to have one.
  - The cumbersome detour taken on installation (build a libfoo.so.0.0.0 and then
    override the rule to install foo.so instead) is removed.
  - Lots of duplicated code simplified.
* Improve hash_any() to use word-wide fetches when hashing suitably aligned data.
  (Tom Lane, 2008-04-06)
  This makes for a significant speedup at the cost that the results now vary between
  little-endian and big-endian machines; which forces us to add explicit ORDER BYs in a
  couple of regression tests to preserve machine-independent comparison results. Also,
  force initdb by bumping catversion, since the contents of hash indexes will change
  (at least on big-endian machines).
  Kenneth Marshall and Tom Lane, based on work from Bob Jenkins. This commit does not
  adopt Bob's new faster mix() algorithm, however, since we still need to convince
  ourselves that that doesn't degrade the quality of the hashing.
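  The heart of the change is consuming the input a 32-bit word at a time when it happens
  to be word-aligned. A simplified standalone sketch of that idea (not the hashfunc.c
  mix() code); reading raw words this way is also why the results become
  endian-dependent:

    #include <stddef.h>
    #include <stdint.h>

    static uint32_t
    toy_hash(const unsigned char *k, size_t len)
    {
        uint32_t    acc = 0;

        if (((uintptr_t) k & 3) == 0)           /* word-aligned: fast path */
        {
            const uint32_t *kw = (const uint32_t *) k;

            while (len >= 4)
            {
                acc = acc * 31 + *kw++;         /* raw word: endian-dependent */
                len -= 4;
            }
            k = (const unsigned char *) kw;
        }
        while (len-- > 0)                       /* leftover or unaligned bytes */
            acc = acc * 31 + *k++;

        return acc;
    }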
* Defend against JOINs having more than 32K columns altogether. (Tom Lane, 2008-04-05)
  We cannot currently support this because we must be able to build Vars referencing
  join columns, and varattno is only 16 bits wide. Perhaps this should be improved in
  future, but considering that it never came up before, I'm not sure the problem is
  worth much effort. Per bug #4070 from Marcello Ceschia. The problem seems largely
  academic in 8.0 and 7.4, because they have (different) O(N^2) performance issues with
  such wide joins, but back-patch all the way anyway.
* Have pg_stop_backup() wait for all archive files to be sent, rather than returning
  right away. (Bruce Momjian, 2008-04-05)
  This guarantees that when pg_stop_backup() returns, you have a valid backup.
  Simon Riggs
* Re-implement division for numeric values using the traditional "schoolbook"
  algorithm. (Tom Lane, 2008-04-04)
  This is a good deal slower than our old roundoff-error-prone code for long inputs, so
  we keep the old code for use in the transcendental functions, where everything is
  approximate anyway. Also create a user-accessible function div(numeric, numeric) to
  provide access to the exact result of trunc(x/y) --- since the regular numeric /
  operator will round off its result, simply computing that expression in SQL doesn't
  reliably give the desired answer. This fixes bug #3387 and various related corner
  cases, and improves the usefulness of PG for high-precision integer arithmetic.
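  As a reminder of what "schoolbook" division means, the sketch below long-divides a
  base-10 digit array by a small integer, producing exact quotient digits and a
  remainder with no rounding. It is a standalone illustration of the technique, not the
  numeric.c implementation (which divides two arbitrary-precision numerics):

    #include <stdio.h>

    /* Exact, digit-by-digit long division; returns the remainder. */
    static int
    schoolbook_div(const int *dividend, int ndigits, int divisor, int *quotient)
    {
        int         remainder = 0;

        for (int i = 0; i < ndigits; i++)
        {
            int         cur = remainder * 10 + dividend[i];

            quotient[i] = cur / divisor;
            remainder = cur % divisor;
        }
        return remainder;
    }

    int
    main(void)
    {
        int         dividend[] = {1, 2, 3, 4, 5};   /* 12345 */
        int         quotient[5];
        int         rem = schoolbook_div(dividend, 5, 7, quotient);

        for (int i = 0; i < 5; i++)
            printf("%d", quotient[i]);
        printf(" remainder %d\n", rem);             /* prints 01763 remainder 4 */
        return 0;
    }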
* Remove no-longer-used function assign_backslash_quote(). (Tom Lane, 2008-04-04)
* Implement current_query(), which shows the currently executing query.
  (Bruce Momjian, 2008-04-04)
  At the same time, remove dblink/dblink_current_query() as it is no longer necessary.
  *BACKWARD COMPATIBILITY ISSUE* for dblink.
  Tomas Doran
* Oops, change should go in scan.l to survive a clean checkout and not just a make
  clean... (Magnus Hagander, 2008-04-04)
* Convert backslash_quote guc to use enum. (Magnus Hagander, 2008-04-04)
* Turn xmlbinary and xmloption GUC variables into enums. (Magnus Hagander, 2008-04-04)
* Remove heap_release_fetch, which is no longer used anywhere; this simplifies
  heap_fetch a little. (Tom Lane, 2008-04-03)
* Teach ANALYZE to distinguish dead and in-doubt tuples, which it formerly classed all
  as "dead"; also get it to count DEAD item pointers as dead rows, instead of ignoring
  them as before. (Tom Lane, 2008-04-03)
  Also improve matters so that tuples previously inserted or deleted by our own
  transaction are handled nicely: the stats collector's live-tuple and dead-tuple counts
  will end up correct after our transaction ends, regardless of whether we end in commit
  or abort. While there's more work that could be done to improve the counting of
  in-doubt tuples in both VACUUM and ANALYZE, this commit is enough to alleviate some
  known bad behaviors in 8.3; and the other stuff that's been discussed seems like
  research projects anyway.
  Pavan Deolasee and Tom Lane
* Oops, add proper #ifdef for systems without support for syslog. Per buildfarm member
  mastodon. (Magnus Hagander, 2008-04-03)
* Convert syslog_facility guc to enum type. (Magnus Hagander, 2008-04-03)
* Revert my bad decision of about a year ago to make PortalDefineQuery responsible for
  copying the query string into the new Portal. (Tom Lane, 2008-04-02)
  Such copying is unnecessary in the common code path through exec_simple_query, and in
  this case it can be enormously expensive because the string might contain a large
  number of individual commands; we were copying the entire, long string for each
  command, resulting in O(N^2) behavior for N commands. (This is the cause of bug
  #4079.) A second problem with it is that PortalDefineQuery really can't risk error,
  because if it elog's before having set up the Portal, we will leak the plancache
  refcount that the caller is trying to hand off to the portal. So go back to the design
  in which the caller is responsible for making sure everything is copied into the
  portal if necessary.
* Convert three more guc settings to enum type: default_transaction_isolation,
  session_replication_role and regex_flavor. (Magnus Hagander, 2008-04-02)
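  For reference, an enum GUC is driven by a table mapping accepted string spellings to
  integer codes. A minimal sketch of that pattern with invented option names (field
  layout per the config_enum_entry declaration in utils/guc.h; the earliest enum-GUC
  commits may differ slightly):

    #include "postgres.h"
    #include "utils/guc.h"          /* struct config_enum_entry */

    /* hypothetical value table: each accepted string maps to an int code;
     * a NULL name terminates the list */
    static const struct config_enum_entry example_mode_options[] = {
        {"off", 0, false},
        {"on", 1, false},
        {"auto", 2, false},
        {NULL, 0, false}
    };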
* Add SPI-level support for executing SQL commands with one-time-use plans, that is
  commands that have out-of-line parameters but the plan is prepared assuming that the
  parameter values are constants. (Tom Lane, 2008-04-01)
  This is needed for the plpgsql EXECUTE USING patch, but will probably have use
  elsewhere. This commit includes the SPI functions and documentation, but no callers
  nor regression tests. The upcoming EXECUTE USING patch will provide regression-test
  coverage. I thought committing this separately made sense since it's logically a
  distinct feature.
* Fix an oversight I made in a cleanup patch over a year ago: eval_const_expressions
  needs to be passed the PlannerInfo ("root") structure, because in some cases we want
  it to substitute values for Param nodes. (Tom Lane, 2008-04-01)
  (So "constant" is not so constant as all that ...) This mistake partially disabled
  optimization of unnamed extended-Query statements in 8.3: in particular the
  LIKE-to-indexscan optimization would never be applied if the LIKE pattern was passed
  as a parameter, and constraint exclusion depending on a parameter value didn't work
  either.
* Apply my original fix for Taiki Yamaguchi's bug report about DISTINCT MAX().
  (Tom Lane, 2008-03-31)
  Add some regression tests for plausible failures in this area.
* Fix my brain fade in TRUNCATE triggers patch: can't release relcache refcounts while
  EState still contains pointers to those relations. (Tom Lane, 2008-03-31)
  Exposed by the CLOBBER_CACHE_ALWAYS tests that buildfarm member jaguar is running
  (I knew those cycles would pay off...)
* Use error message wordings for permissions checks on .pgpass and SSL private key
  files that are similar to the one for the postmaster's data directory permissions
  check. (Tom Lane, 2008-03-31)
  (I chose to standardize on that one since it's the most heavily used and presumably
  best-wordsmithed by now.) Also eliminate explicit tests on file ownership in these
  places, since the ensuing read attempt must fail anyway if it's wrong, and there seems
  no value in issuing the same error message for distinct problems. (But I left in the
  explicit ownership test in postmaster.c, since it had its own error message anyway.)
  Also be more specific in the documentation's descriptions of these checks. Per a gripe
  from Kevin Hunter.
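  The permissions check in question boils down to refusing a secrets file whose mode
  grants group or world access. A standalone sketch of that test (the function name and
  message wording here are illustrative, not the standardized wording the commit refers
  to):

    #include <stdio.h>
    #include <sys/stat.h>

    static int
    secrets_file_mode_ok(const char *path)
    {
        struct stat st;

        if (stat(path, &st) != 0)
            return 0;           /* can't stat it; caller reports the error */

        if (st.st_mode & (S_IRWXG | S_IRWXO))
        {
            fprintf(stderr,
                    "file \"%s\" has group or world access; permissions should be u=rw (0600) or less\n",
                    path);
            return 0;
        }
        return 1;
    }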
* Fix a number of places that were making file-type tests infelicitously.
  (Tom Lane, 2008-03-31)
  The places that did, eg,
      (statbuf.st_mode & S_IFMT) == S_IFDIR
  were correct, but there is no good reason not to use S_ISDIR() instead, especially
  when that's what the other 90% of our code does. The places that did, eg,
      (statbuf.st_mode & S_IFDIR)
  were flat out *wrong* and would fail in various platform-specific ways, eg a symlink
  could be mistaken for a regular file on most Unixen. The actual impact of this is
  probably small, since the problem cases seem to always involve symlinks or sockets,
  which are unlikely to be found in the directories that PG code might be scanning. But
  it's clearly trouble waiting to happen, so patch all the way back anyway. (There seem
  to be no occurrences of the mistake in 7.4.)
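  To make the three variants concrete, a small standalone comparison of the wrong
  bitwise test, the correct-but-verbose masked test, and the preferred S_ISDIR() macro:

    #include <stdio.h>
    #include <sys/stat.h>

    int
    main(int argc, char **argv)
    {
        struct stat statbuf;

        if (argc < 2 || stat(argv[1], &statbuf) != 0)
            return 1;

        /* WRONG: the file-type field is not a set of independent bits; e.g.
         * S_IFSOCK includes the S_IFDIR bit, so a socket passes this test. */
        if (statbuf.st_mode & S_IFDIR)
            printf("broken test says: maybe a directory\n");

        /* Correct but verbose: mask out the file-type field, then compare. */
        if ((statbuf.st_mode & S_IFMT) == S_IFDIR)
            printf("masked test says: directory\n");

        /* Preferred: the standard macro does exactly the masked comparison. */
        if (S_ISDIR(statbuf.st_mode))
            printf("S_ISDIR says: directory\n");

        return 0;
    }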
* Revert my erroneous fix for Taiki Yamaguchi's DISTINCT MAX() bug.
  (Tom Lane, 2008-03-29)
  Whatever we do about that, this isn't the path to the solution.
* Department of second thoughts: the rule that ORDER BY and DISTINCT are useless for an
  ungrouped-aggregate query holds regardless of whether optimize_minmax_aggregates
  succeeds. (Tom Lane, 2008-03-28)
  So we might as well apply the optimization in any case. I'll leave 8.3 as it was,
  since this version is a tad more invasive than my earlier patch.
* Support statement-level ON TRUNCATE triggers. Simon Riggs (Tom Lane, 2008-03-28)
* When we have successfully optimized a MIN or MAX aggregate into an indexscan, the
  query result must be exactly one row (since we don't do this when there's any GROUP
  BY). (Tom Lane, 2008-03-27)
  Therefore any ORDER BY or DISTINCT attached to the query is useless and can be
  dropped. Aside from saving useless cycles, this protects us against problems with
  matching the hacked-up tlist entries to sort clauses, as seen in a bug report from
  Taiki Yamaguchi. We might need to work harder if we ever try to optimize grouped
  queries with this approach, but this solution will do for now.
* Remove ipcclean utility command --- it didn't work on all Unixes or on Windows.
  Users should use their operating system tools instead. (Bruce Momjian, 2008-03-27)