| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Presently, the dynamic shared memory (DSM) registry only provides
GetNamedDSMSegment(), which allocates a fixed-size segment. To use
the DSM registry for more sophisticated things like dynamic shared
memory areas (DSAs) or a hash table backed by a DSA (dshash), users
need to create a DSM segment that stores various handles and LWLock
tranche IDs and to write fairly complicated initialization code.
Furthermore, there is likely little variation in this
initialization code between libraries.
This commit introduces functions that simplify allocating a DSA or
dshash within the DSM registry. These functions are very similar
to GetNamedDSMSegment(). Notable differences include the lack of
an initialization callback parameter and the prohibition of calling
the functions more than once for a given entry in each backend
(which should be trivially avoidable in most circumstances). While
at it, this commit bumps the maximum DSM registry entry name length
from 63 bytes to 127 bytes.
Also note that even though one could presumably detach/destroy the
DSAs and dshashes created in the registry, such use-cases are not
yet well-supported, if for no other reason than the associated DSM
registry entries cannot be removed. Adding such support is left as
a future exercise.
The test_dsm_registry test module contains tests for the new
functions and also serves as a complete usage example.
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Florents Tselai <florents.tselai@gmail.com>
Reviewed-by: Rahila Syed <rahilasyed90@gmail.com>
Discussion: https://postgr.es/m/aEC8HGy2tRQjZg_8%40nathan
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consistently use "Size" (or size_t, or in some places int64 or double)
as the type for variables holding memory allocation sizes. In most
places variables' data types were fine already, but we had an ancient
habit of computing bytes from kilobytes-units GUCs with code like
"work_mem * 1024L". That risks overflow on Win64 where they did not
make "long" as wide as "size_t". We worked around that by restricting
such GUCs' ranges, so you couldn't set work_mem et al higher than 2GB
on Win64. This patch removes that restriction, after replacing such
calculations with "work_mem * (Size) 1024" or variants of that.
It should be noted that this patch was constructed by searching
outwards from the GUCs that have MAX_KILOBYTES as upper limit.
So I can't positively guarantee there are no other places doing
memory-size arithmetic in int or long variables. I do however feel
pretty confident that increasing MAX_KILOBYTES on Win64 is safe now.
Also, nothing in our code should be dealing in multiple-gigabyte
allocations without authorization from a relevant GUC, so it seems
pretty likely that this search caught everything that could be at
risk of overflow.
Author: Vladlen Popolitov <v.popolitov@postgrespro.ru>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1a01f0-66ec2d80-3b-68487680@27595217
|
|
|
|
| |
Backpatch-through: 13
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Redefine our exact width types with standard C99 types and macros,
including int64_t, INT64_MAX, INT64_C(), PRId64 etc. We were already
using <stdint.h> types in a few places.
One complication is that Windows' <inttypes.h> uses format strings like
"%I64d", "%I32", "%I" for PRI*64, PRI*32, PTR*PTR, instead of mapping to
other standardized format strings like "%lld" etc as seen on other known
systems. Teach our snprintf.c to understand them.
This removes a lot of configure clutter, and should also allow 64-bit
numbers and other standard types to be used in localized messages
without casting.
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/ME3P282MB3166F9D1F71F787929C0C7E7B6312%40ME3P282MB3166.AUSP282.PROD.OUTLOOK.COM
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the DSA segment size always started with 1MB and grew up
to DSA_MAX_SEGMENT_SIZE. It was inconvenient in certain scenarios,
such as when the caller desired a soft constraint on the total DSA
segment size, limiting it to less than 1MB.
This commit introduces the capability to specify the initial and
maximum DSA segment sizes when creating a DSA area, providing more
flexibility and control over memory usage.
Reviewed-by: John Naylor, Tomas Vondra
Discussion: https://postgr.es/m/CAD21AoAYGGC1ePjVX0H%2Bpp9rH%3D9vuPK19nNOiu12NprdV5TVJA%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements a radix tree data structure based on the design in
"The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases"
by Viktor Leis, Alfons Kemper, and ThomasNeumann, 2013. The main
technique that makes it adaptive is using several different node types,
each with a different capacity of elements, and a different algorithm
for accessing them. The nodes start small and grow/shrink as needed.
The main advantage over hash tables is efficient sorted iteration and
better memory locality when successive keys are lexicographically
close together. The implementation currently assumes 64-bit integer
keys, and traversing the tree is in general slower than a linear
probing hash table, so this is not a general-purpose associative array.
The paper describes two other techniques not implemented here,
namely "path compression" and "lazy expansion". These can further
reduce memory usage and speed up traversal, but the former would add
significant complexity and the latter requires storing the full key
with the value. We do trivially compress the path when leading bytes
of the key are zeros, however.
For value storage, we use "combined pointer/value slots", as
recommended in the paper. Values of size equal or smaller than the the
platform's pointer type are stored in the array of child pointers in
the last level node, while larger values are each stored in a separate
allocation. This is for now fixed at compile time, but it would be
fairly trivial to allow determining at runtime how variable-length
values are stored.
One innovation in our implementation compared to the ART paper is
decoupling the notion of node "size class" from "kind". The size
classes within a given node kind have the same underlying type, but
a variable capacity for children, so we can introduce additional node
sizes with little additional code.
To enable different use cases to specialize for different value types
and for shared/local memory, we use macro-templatized code generation
in the same manner as simplehash.h and sort_template.h.
Future commits will use this infrastructure for storing TIDs.
Patch by Masahiko Sawada and John Naylor, but a substantial amount of
credit is due to Andres Freund, whose proof-of-concept was a valuable
source of coding idioms and awareness of performance pitfalls, and
who reviewed earlier versions.
Discussion: https://postgr.es/m/CAD21AoAfOZvmfR0j8VmZorZjL7RhTiQdVttNuC4W-Shdc2a-AA%40mail.gmail.com
|
|
|
|
|
|
|
|
| |
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz
Backpatch-through: 12
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Invent separate macros for "invalid" values of these types, so that
we needn't embed knowledge of their representations into calling code.
These are all zeroes anyway ATM, so this is not fixing any live bug,
but it makes the code cleaner and more future-proof.
I (tgl) also chose to move DSM_HANDLE_INVALID into dsm_impl.h,
since it seems like it should live beside the typedef for dsm_handle.
Hou Zhijie, Nathan Bossart, Kyotaro Horiguchi, Tom Lane
Discussion: https://postgr.es/m/OS0PR01MB5716860B1454C34E5B179B6694C99@OS0PR01MB5716.jpnprd01.prod.outlook.com
|
|
|
|
| |
Backpatch-through: 11
|
|
|
|
| |
Backpatch-through: 10
|
|
|
|
| |
Backpatch-through: 9.5
|
|
|
|
| |
Backpatch-through: update all files in master, backpatch legal files through 9.4
|
|
|
|
|
|
|
|
| |
These seem to be leftovers from old patches, perhaps.
Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoDuAYsRb3Q9aobkFZ6DZMWxsyg4HOmgkwgeWNfSkTwGxw@mail.gmail.com
|
|
|
|
|
|
|
|
|
| |
Switch to 2.1 version of pg_bsd_indent. This formats
multiline function declarations "correctly", that is with
additional lines of parameter declarations indented to match
where the first line's left parenthesis is.
Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
|
|
|
|
| |
Backpatch-through: certain files through 9.4
|
|
|
|
|
|
|
|
| |
Takeshi Ideriha complained that there is a mixture of Size and size_t
in dsa.c and corresponding header. Let's use size_t. Back-patch to 10
where dsa.c landed, to make future back-patching easy.
Discussion: https://postgr.es/m/4E72940DA2BF16479384A86D54D0988A6F19ABD9%40G01JPEXMBKW04
|
|
|
|
| |
Backpatch-through: certain files through 9.3
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.
Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code. The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there. BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs. So the
net result is that in about half the cases, such comments are placed
one tab stop left of before. This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.
Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
|
|
|
|
|
|
|
| |
Thomas Munro, per project policy articuled by Andres Freund and
Tom Lane.
Discussion: http://postgr.es/m/CAEepm=2zCoeq3QxVwhS5DFeUh=yU6z81pbWMgfOB8OzyiBwxzw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A new function dsa_allocate_extended now takes flags which indicate
that huge allocations should be permitted, that out-of-memory
conditions should not throw an error, and/or that the returned memory
should be zero-filled, just like MemoryContextAllocateExtended.
Commit 9acb85597f1223ac26a5b19a9345849c43d0ff54, which added
dsa_allocate0, was broken because it failed to account for the
possibility that dsa_allocate() might return InvalidDsaPointer.
This fixes that problem along the way.
Thomas Munro, with some comment changes by me.
Discussion: http://postgr.es/m/CA+Tgmobt7CcF_uQP2UQwWmu4K9qCHehMJP9_9m1urwP8hbOeHQ@mail.gmail.com
|
|
|
|
|
|
|
| |
This does the same thing as dsa_allocate, except that the memory
is guaranteed to be zero-filled on return.
Dilip Kumar, adjusted by me.
|
|
|
|
|
|
|
|
|
|
| |
With the old code, a backend that read pg_stat_activity without ever
having executed a parallel query might see a backend in the midst of
executing one waiting on a DSA LWLock, resulting in a crash. The
solution is for backends to register the tranche at startup time, not
the first time a parallel query is executed.
Report by Andreas Seltenreich. Patch by me, reviewed by Thomas Munro.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
On 32-bit systems, don't try to use 64-bit DSA pointers, because the
computation of DSA_MAX_SEGMENT_SIZE overflows Size.
Cast 1 to Size before shifting it, so that the compiler doesn't
produce a result of the wrong width.
In passing, change one use of size_t to Size.
|
|
|
|
|
|
|
|
|
| |
Commit 13df76a537cca3b8884911d8fdf7c89a457a8dd3 was overconfident
about how portable %016lx is. Some compilers complain because they
need %016llx, while platforms where DSA pointers are only 32 bits
get unhappy about using a 64-bit format for a 32-bit quantity.
Thomas Munro, per an off-list suggestion from me.
|
| |
|
|
Programmers discovered decades ago that it was useful to have a simple
interface for allocating and freeing memory, which is why malloc() and
free() were invented. Unfortunately, those handy tools don't work
with dynamic shared memory segments because those are specific to
PostgreSQL and are not necessarily mapped at the same address in every
cooperating process. So invent our own allocator instead. This makes
it possible for processes cooperating as part of parallel query
execution to allocate and free chunks of memory without having to
reserve them prior to the start of execution. It could also be used
for longer lived objects; for example, we could consider storing data
for pg_stat_statements or the stats collector in shared memory using
these interfaces, rather than writing them to files. Basically,
anything that needs shared memory but can't predict in advance how
much it's going to need might find this useful.
Thomas Munro and Robert Haas. The original code (of mine) on which
Thomas based his work was actually designed to be a new backend-local
memory allocator for PostgreSQL, but that hasn't gone anywhere - or
not yet, anyway. Thomas took that work and performed major
refactoring and extensive modifications to make it work with dynamic
shared memory, including the addition of appropriate locking.
Discussion: CA+TgmobkeWptGwiNa+SGFWsTLzTzD-CeLz0KcE-y6LFgoUus4A@mail.gmail.com
Discussion: CAEepm=1z5WLuNoJ80PaCvz6EtG9dN0j-KuHcHtU6QEfcPP5-qA@mail.gmail.com
|