| Commit message (Collapse) | Author | Age |
... | |
|
|
|
|
| |
Original patch by Hans-Juergen Schoenig, revisions by Karel Zak
and Tom Lane.
|
|
|
|
|
| |
This patch also includes preliminary update of pg_dumpall for roles.
Petr Jelinek, with review by Bruce Momjian and Tom Lane.
|
| |
|
|
|
|
|
|
|
|
| |
track shared relations in a separate hashtable, so that operations done
from different databases are counted correctly. Add proper support for
anti-XID-wraparound vacuuming, even in databases that are never connected
to and so have no stats entries. Miscellaneous other bug fixes.
Alvaro Herrera, some additional fixes by Tom Lane.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also, write multiple WAL buffers out in one write() operation.
ITAGAKI Takahiro
---------------------------------------------------------------------------
> If we disable writeback-cache and use open_sync, the per-page writing
> behavior in WAL module will show up as bad result. O_DIRECT is similar
> to O_DSYNC (at least on linux), so that the benefit of it will disappear
> behind the slow disk revolution.
>
> In the current source, WAL is written as:
> for (i = 0; i < N; i++) { write(&buffers[i], BLCKSZ); }
> Is this intentional? Can we rewrite it as follows?
> write(&buffers[0], N * BLCKSZ);
>
> In order to achieve it, I wrote a 'gather-write' patch (xlog.gw.diff).
> Aside from this, I'll also send the fixed direct io patch (xlog.dio.diff).
> These two patches are independent, so they can be applied either or both.
>
>
> I tested them on my machine and the results as follows. It shows that
> direct-io and gather-write is the best choice when writeback-cache is off.
> Are these two patches worth trying if they are used together?
>
>
> | writeback | fsync= | fdata | open_ | fsync_ | open_
> patch | cache | false | sync | sync | direct | direct
> ------------+-----------+--------+-------+-------+--------+---------
> direct io | off | 124.2 | 105.7 | 48.3 | 48.3 | 48.2
> direct io | on | 129.1 | 112.3 | 114.1 | 142.9 | 144.5
> gather-write| off | 124.3 | 108.7 | 105.4 | (N/A) | (N/A)
> both | off | 131.5 | 115.5 | 114.4 | 145.4 | 145.2
>
> - 20runs * pgbench -s 100 -c 50 -t 200
> - with tuning (wal_buffers=64, commit_delay=500, checkpoint_segments=8)
> - using 2 ATA disks:
> - hda(reiserfs) includes system and wal.
> - hdc(jfs) includes database files. writeback-cache is always on.
>
> ---
> ITAGAKI Takahiro
|
|
|
|
| |
I'm still working on the has_role function and information_schema changes.
|
| |
|
| |
|
| |
|
|
|
|
|
| |
a minimum requirement is that it not completely break the system
meanwhile. Put the test in the right place.
|
|
|
|
|
|
| |
possible compression.
Mark Kirkwood
|
| |
|
|
|
|
|
|
|
|
| |
chdir into PGDATA and subsequently use relative paths instead of absolute
paths to access all files under PGDATA. This seems to give a small
performance improvement, and it should make the system more robust
against naive DBAs doing things like moving a database directory that
has a live postmaster in it. Per recent discussion.
|
|
|
|
|
|
|
| |
basic regression tests for GiST to the standard regression tests.
I took the opportunity to add an rtree-equivalent gist opclass for
circles; the contrib version only covered boxes and polygons, but
indexing circles is very handy for distance searches.
|
| |
|
|
|
|
|
|
| |
- add forgotten check of lsn for insert completion
- remove level of pages: hard to check in recovery
- some cleanups
|
|
|
|
|
|
|
|
| |
the difference between checkpoints forced due to WAL segment consumption
and checkpoints forced for other reasons (such as CREATE DATABASE). Avoid
generating 'checkpoints are occurring too frequently' messages when the
checkpoint wasn't caused by WAL segment consumption. Per gripe from
Chris K-L.
|
|
|
|
|
|
|
|
| |
current time: provide a GetCurrentTimestamp() function that returns
current time in the form of a TimestampTz, instead of separate time_t
and microseconds fields. This is what all the callers really want
anyway, and it eliminates low-level dependencies on AbsoluteTime,
which is a deprecated datatype that will have to disappear eventually.
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
and pg_auth_members. There are still many loose ends to finish in this
patch (no documentation, no regression tests, no pg_dump support for
instance). But I'm going to commit it now anyway so that Alvaro can
make some progress on shared dependencies. The catalog changes should
be pretty much done.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- full concurrency for insert/update/select/vacuum:
- select and vacuum never locks more than one page simultaneously
- select (gettuple) hasn't any lock across it's calls
- insert never locks more than two page simultaneously:
- during search of leaf to insert it locks only one page
simultaneously
- while walk upward to the root it locked only parent (may be
non-direct parent) and child. One of them X-lock, another may
be S- or X-lock
- 'vacuum full' locks index
- improve gistgetmulti
- simplify XLOG records
Fix bug in index_beginscan_internal: LockRelation may clean
rd_aminfo structure, so move GET_REL_PROCEDURE after LockRelation
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to the existing X-direction tests. An rtree class now includes 4 actual
2-D tests, 4 1-D X-direction tests, and 4 1-D Y-direction tests.
This involved adding four new Y-direction test operators for each of
box and polygon; I followed the PostGIS project's lead as to the names
of these operators.
NON BACKWARDS COMPATIBLE CHANGE: the poly_overleft (&<) and poly_overright
(&>) operators now have semantics comparable to box_overleft and box_overright.
This is necessary to make r-tree indexes work correctly on polygons.
Also, I changed circle_left and circle_right to agree with box_left and
box_right --- formerly they allowed the boundaries to touch. This isn't
actually essential given the lack of any r-tree opclass for circles, but
it seems best to sync all the definitions while we are at it.
|
|
|
|
|
|
|
|
|
|
| |
polygon operators (<<, &<, >>, &>). Per ideas originally put forward
by andrew@supernews and later rediscovered by moi. This patch just
fixes the existing opclasses, and does not add any new behavior as I
proposed earlier; that can be sorted out later. In principle this
could be back-patched, since it changes only search behavior and not
system catalog entries nor rtree index contents. I'm not currently
planning to do that, though, since I think it could use more testing.
|
|
|
|
|
| |
of columns of a query result so that it can "see through" cursors and
prepared statements. Per gripe a couple months back from John DeSoi.
|
|
|
|
|
|
| |
(a/k/a SELECT INTO). Instead, flush and fsync the whole relation before
committing. We do still need the WAL log when PITR is active, however.
Simon Riggs and Tom Lane.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
2. improve vacuum for gist
- use FSM
- full vacuum:
- reforms parent tuple if it's needed
( tuples was deleted on child page or parent tuple remains invalid
after crash recovery )
- truncate index file if possible
3. fixes bugs and mistakes
|
|
|
|
|
|
|
| |
scankeys arrays that it needs can never have more than INDEX_MAX_KEYS
entries, so it's reasonable to just allocate them as fixed-size local
arrays, and save the cost of palloc/pfree. Not a huge savings, but
a cycle saved is a cycle earned ...
|
| |
|
|
|
|
|
|
|
| |
includes error checking and an appropriate ereport(ERROR) message.
This gets rid of rather tedious and error-prone manipulation of errno,
as well as a Windows-specific bug workaround, at more than a dozen
call sites. After an idea in a recent patch by Heikki Linnakangas.
|
|
|
|
|
|
| |
given reasonably short lifespans for prepared transactions, this should
mean that only a small minority of state files ever need to be fsynced
at all. Per discussion with Heikki Linnakangas.
|
|
|
|
|
|
|
|
|
|
| |
old suggestion by Oliver Jowett. Also, add a transaction column to the
pg_locks view to show the xid of each transaction holding or awaiting
locks; this allows prepared transactions to be properly associated with
the locks they own. There was already a column named 'transaction',
and I chose to rename it to 'transactionid' --- since this column is
new in the current devel cycle there should be no backwards compatibility
issue to worry about.
|
|
|
|
| |
releasing locks, so COMMIT PREPARED should too.
|
|
|
|
| |
hacking by Alvaro Herrera and Tom Lane.
|
|
|
|
| |
prevents a large number of *.backup files from existing in pg_xlog/
|
|
|
|
|
|
|
| |
recovery after crash (power loss etc) it may say that it can't restore
index and index should be reindexed.
Some refactoring code.
|
|
|
|
|
|
|
|
|
| |
nonconsecutive columns of a multicolumn index, as per discussion around
mid-May (pghackers thread "Best way to scan on-disk bitmaps"). This
turns out to require only minimal changes in btree, and so far as I can
see none at all in GiST. btcostestimate did need some work, but its
original assumption that index selectivity == heap selectivity was
quite bogus even before this.
|
| |
|
| |
|
|
|
|
| |
discussion with Qingqing Zhou.
|
|
|
|
|
|
| |
transaction IDs, rather than like subtrans; in particular, the information
now survives a database restart. Per previous discussion, this is
essential for PITR log shipping and for 2PC.
|
|
|
|
|
|
|
|
| |
up have the standard layout with unused space between pd_lower and pd_upper.
When this is set, XLogInsert will omit the unused space without bothering
to scan it to see if it's zero. That saves time in XLogInsert, and also
allows reversion of my earlier patch to make PageRepairFragmentation et al
explicitly re-zero freed space. Per suggestion by Heikki Linnakangas.
|
|
|
|
|
|
| |
That code is never going to be used in the foreseeable future, and
where it's more than a stub it's making the redo routines harder to
read.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of a separate CRC on each backup block, include backup blocks
in their parent WAL record's CRC; this is important to ensure that the
backup block really goes with the WAL record, ie there was not a page
tear right at the start of the backup block. Implement a simple form
of compression of backup blocks: drop any run of zeroes starting at
pd_lower, so as not to store the unused 'hole' that commonly exists in
PG heap and index pages. Tweak PageRepairFragmentation and related
routines to ensure they keep the unused space zeroed, so that the above
compression method remains effective. All per recent discussions.
|
|
|
|
|
| |
WAL record; this is necessary to be sure we recognize stale WAL records
when a WAL page was only partially written during a system crash.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
spotted by Qingqing Zhou. The HASH_ENTER action now automatically
fails with elog(ERROR) on out-of-memory --- which incidentally lets
us eliminate duplicate error checks in quite a bunch of places. If
you really need the old return-NULL-on-out-of-memory behavior, you
can ask for HASH_ENTER_NULL. But there is now an Assert in that path
checking that you aren't hoping to get that behavior in a palloc-based
hash table.
Along the way, remove the old HASH_FIND_SAVE/HASH_REMOVE_SAVED actions,
which were not being used anywhere anymore, and were surely too ugly
and unsafe to want to see revived again.
|
|
|
|
|
|
|
|
|
|
|
|
| |
routines in the index's relcache entry, instead of doing a fresh fmgr_info
on every index access. We were already doing this for the index's opclass
support functions; not sure why we didn't think to do it for the AM
functions too. This supersedes the former method of caching (only)
amgettuple in indexscan scan descriptors; it's an improvement because the
function lookup can be amortized across multiple statements instead of
being repeated for each statement. Even though lookup for builtin
functions is pretty cheap, this seems to drop a percent or two off some
simple benchmarks.
|
|
|
|
| |
them, the executation behavior could be unexpected.
|