path: root/src/backend/storage/buffer/buf_init.c
author: Thomas Munro <tmunro@postgresql.org> 2023-04-08 10:38:09 +1200
committer: Thomas Munro <tmunro@postgresql.org> 2023-04-08 16:34:50 +1200
commit: faeedbcefd40bfdf314e048c425b6d9208896d90
tree: d6bc53f2196b37e0ce2a408ab44a734382e485d5 /src/backend/storage/buffer/buf_init.c
parent: d73c285af5c29a0b486643b77350bc23fbb6114c
Introduce PG_IO_ALIGN_SIZE and align all I/O buffers.

In order to have the option to use O_DIRECT/FILE_FLAG_NO_BUFFERING in a later commit, we need the addresses of user space buffers to be well aligned. The exact requirements vary by OS and file system (typically sectors and/or memory pages). The address alignment size is set to 4096, which is enough for currently known systems: it matches modern sectors and common memory page size. There is no standard governing O_DIRECT's requirements so we might eventually have to reconsider this with more information from the field or future systems.

Aligning I/O buffers on memory pages is also known to improve regular buffered I/O performance.

Three classes of I/O buffers for regular data pages are adjusted:
(1) Heap buffers are now allocated with the new palloc_aligned() or MemoryContextAllocAligned() functions introduced by commit 439f6175.
(2) Stack buffers now use a new struct PGIOAlignedBlock to respect PG_IO_ALIGN_SIZE, if possible with this compiler.
(3) The buffer pool is also aligned in shared memory.

WAL buffers were already aligned on XLOG_BLCKSZ. It's possible for XLOG_BLCKSZ to be configured smaller than PG_IO_ALIGN_SIZE and thus for O_DIRECT WAL writes to fail to be well aligned, but that's a pre-existing condition and will be addressed by a later commit.

BufFiles are not yet addressed (there's no current plan to use O_DIRECT for those, but they could potentially get some incidental speedup even in plain buffered I/O operations through better alignment).

If we can't align stack objects suitably using the compiler extensions we know about, we disable the use of O_DIRECT by setting PG_O_DIRECT to 0. This avoids the need to consider systems that have O_DIRECT but can't align stack objects the way we want; such systems could in theory be supported with more work but we don't currently know of any such machines, so it's easier to pretend there is no O_DIRECT support instead. That's an existing and tested class of system.

Add assertions that all buffers passed into smgrread(), smgrwrite() and smgrextend() are correctly aligned, unless PG_O_DIRECT is 0 (= stack alignment tricks may be unavailable) or the block size has been set too small to allow arrays of buffers to be all aligned.

Author: Thomas Munro <thomas.munro@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/CA+hUKGK1X532hYqJ_MzFWt0n1zt8trz980D79WbjwnT-yYLZpg@mail.gmail.com

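The stack-buffer case (2) and the alignment assertions described in the message can be illustrated with a minimal sketch. This is not the PostgreSQL source: IO_ALIGN_SIZE, BLOCK_SIZE, AlignedBlock, is_io_aligned() and assert_io_aligned() are hypothetical stand-ins for PG_IO_ALIGN_SIZE, BLCKSZ, PGIOAlignedBlock and the smgr assertions, and it assumes a C11 compiler that honors alignas() with a 4096-byte alignment on stack objects (the commit itself relies on compiler-specific extensions and falls back when none is available).

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for PG_IO_ALIGN_SIZE and BLCKSZ. */
#define IO_ALIGN_SIZE 4096
#define BLOCK_SIZE 8192

/*
 * Hypothetical analogue of PGIOAlignedBlock: a block-sized buffer whose
 * address is a multiple of IO_ALIGN_SIZE wherever it is declared,
 * provided the compiler supports this extended alignment.
 */
typedef struct AlignedBlock
{
	alignas(IO_ALIGN_SIZE) char data[BLOCK_SIZE];
} AlignedBlock;

/* Report whether ptr sits on an IO_ALIGN_SIZE boundary. */
static bool
is_io_aligned(const void *ptr)
{
	return ((uintptr_t) ptr % IO_ALIGN_SIZE) == 0;
}

/*
 * Hypothetical check an I/O entry point could run on its buffer argument
 * before issuing a read or write that may use direct I/O.
 */
static void
assert_io_aligned(const void *buffer)
{
	assert(is_io_aligned(buffer));
}
```

A caller would declare `AlignedBlock blk;` on the stack and pass `blk.data` to the I/O routine; a plain `char buf[BLOCK_SIZE]` carries no such alignment guarantee and could trip the assertion.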
Diffstat (limited to 'src/backend/storage/buffer/buf_init.c')
-rw-r--r-- src/backend/storage/buffer/buf_init.c 10
1 files changed, 7 insertions, 3 deletions
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index 20946c47cb4..0057443f0c6 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -78,9 +78,12 @@ InitBufferPool(void)
 						NBuffers * sizeof(BufferDescPadded),
 						&foundDescs);
 
+	/* Align buffer pool on IO page size boundary. */
 	BufferBlocks = (char *)
-		ShmemInitStruct("Buffer Blocks",
-						NBuffers * (Size) BLCKSZ, &foundBufs);
+		TYPEALIGN(PG_IO_ALIGN_SIZE,
+				  ShmemInitStruct("Buffer Blocks",
+								  NBuffers * (Size) BLCKSZ + PG_IO_ALIGN_SIZE,
+								  &foundBufs));
 
 	/* Align condition variables to cacheline boundary. */
 	BufferIOCVArray = (ConditionVariableMinimallyPadded *)
@@ -163,7 +166,8 @@ BufferShmemSize(void)
 	/* to allow aligning buffer descriptors */
 	size = add_size(size, PG_CACHE_LINE_SIZE);
 
-	/* size of data pages */
+	/* size of data pages, plus alignment padding */
+	size = add_size(size, PG_IO_ALIGN_SIZE);
 	size = add_size(size, mul_size(NBuffers, BLCKSZ));
 
 	/* size of stuff controlled by freelist.c */
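
The two hunks above work as a pair: the size calculation reserves one extra alignment unit, and the initialization rounds the returned address up to the next boundary. A minimal standalone sketch of that over-allocate-and-round-up pattern follows. It uses malloc() in place of shared memory, and align_up(), padded_pool_size(), alloc_aligned_pool(), IO_ALIGN_SIZE and BLOCK_SIZE are hypothetical names chosen for illustration, not PostgreSQL APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for PG_IO_ALIGN_SIZE and BLCKSZ. */
#define IO_ALIGN_SIZE 4096
#define BLOCK_SIZE 8192

/* Round ptr up to the next multiple of align, which must be a power of 2. */
static char *
align_up(void *ptr, uintptr_t align)
{
	uintptr_t	addr = (uintptr_t) ptr;

	assert(align != 0 && (align & (align - 1)) == 0);
	return (char *) ((addr + (align - 1)) & ~(align - 1));
}

/*
 * Bytes to request so that nblocks blocks still fit after the start
 * address has been rounded up by at most IO_ALIGN_SIZE - 1 bytes.
 */
static size_t
padded_pool_size(size_t nblocks)
{
	return nblocks * (size_t) BLOCK_SIZE + IO_ALIGN_SIZE;
}

/*
 * Allocate a padded region and return the aligned start of the pool.
 * The unaligned pointer is stored in *raw_out; that is the pointer the
 * caller must eventually pass to free().
 */
static char *
alloc_aligned_pool(size_t nblocks, void **raw_out)
{
	void	   *raw = malloc(padded_pool_size(nblocks));

	if (raw == NULL)
		return NULL;
	*raw_out = raw;
	return align_up(raw, IO_ALIGN_SIZE);
}
```

Because the start address moves forward by fewer than IO_ALIGN_SIZE bytes, the single extra alignment unit is always enough padding, which is why the size function and the initialization function have to change together.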