author     Thomas Munro <tmunro@postgresql.org>  2025-02-27 14:15:15 +1300
committer  Thomas Munro <tmunro@postgresql.org>  2025-02-27 20:49:48 +1300
commit     55918f798bc2d1846eea0d805fbec91d4e1816e0 (patch)
tree       3fcb4c7c20fcaaaced7feb3cfa34fabe2b6da26d
parent     48e4ae9a0707b22cf874a4e8e531a07077318424 (diff)
Remove arbitrary cap on read_stream.c buffer queue.
Previously the internal queue of buffers was capped at max_ios * 4, though not less than io_combine_limit, at allocation time. That was done in the first version based on conservative theories about resource usage and heuristics, pending later work. The configured I/O depth could not always be reached with dense random streams generated by ANALYZE, VACUUM and the proposed Bitmap Heap Scan patch, and also sequential streams with the proposed AIO subsystem, to name some examples.

The new formula is (max_ios + 1) * io_combine_limit: enough buffers for the full configured I/O concurrency level using the full configured I/O combine size, plus the buffers from one finished but not yet consumed full-sized I/O. Significantly more memory would be needed for high GUC values if the client code requests a large per-buffer data size, but that is discouraged (existing and proposed stream users try to keep it under a few words, if not zero).

With this new formula, an intermediate variable could have overflowed under maximum GUC values, so its data type is adjusted to cope.

Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
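For a concrete sense of the change, here is a minimal standalone sketch of the two sizing formulas. The max_ios and io_combine_limit values below are illustrative settings chosen for this example, not defaults or limits taken from the commit.

/*
 * Illustrative comparison of the old and new buffer queue caps.
 * The values of max_ios and io_combine_limit are example settings,
 * not anything prescribed by the commit.
 */
#include <stdio.h>

#define Max(a, b) ((a) > (b) ? (a) : (b))

int
main(void)
{
	int		max_ios = 16;			/* example configured I/O depth */
	int		io_combine_limit = 16;	/* example combine size, in blocks */

	int		old_cap = Max(max_ios * 4, io_combine_limit);
	int		new_cap = (max_ios + 1) * io_combine_limit;

	/* old cap: 64 buffers; new cap: 272 buffers (17 full-sized I/Os) */
	printf("old cap = %d, new cap = %d\n", old_cap, new_cap);
	return 0;
}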
-rw-r--r--  src/backend/storage/aio/read_stream.c | 18
1 file changed, 11 insertions, 7 deletions
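Read together with the first hunk below, the new sizing code follows a simple pattern: compute the cap in ordinary int arithmetic, then clamp it so that it, plus the spare entry and the overflow space, still fits the stream's int16 bookkeeping. The following sketch shows that pattern in isolation; Min and PG_INT16_MAX are simplified stand-ins for the PostgreSQL macros of the same names, and the function is hypothetical rather than part of read_stream.c.

#include <stdint.h>

#define Min(a, b) ((a) < (b) ? (a) : (b))
#define PG_INT16_MAX INT16_MAX

static int16_t
choose_max_pinned_buffers(int max_ios, int io_combine_limit)
{
	/* Compute in int, where (max_ios + 1) * io_combine_limit cannot overflow. */
	int		max_pinned_buffers = (max_ios + 1) * io_combine_limit;

	/* Clamp, leaving room for the spare entry and the overflow space. */
	max_pinned_buffers = Min(max_pinned_buffers,
							 PG_INT16_MAX - io_combine_limit - 1);

	/* Only now is the value safe to narrow to the int16 used internally. */
	return (int16_t) max_pinned_buffers;
}

The second hunk applies the same idea to the bookkeeping in read_stream_next_buffer: the temporary distance is widened to int32 so the arithmetic cannot wrap before it is clamped back down.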
diff --git a/src/backend/storage/aio/read_stream.c b/src/backend/storage/aio/read_stream.c
index 99e44ed99fe..04bdb5e6d4b 100644
--- a/src/backend/storage/aio/read_stream.c
+++ b/src/backend/storage/aio/read_stream.c
@@ -447,13 +447,17 @@ read_stream_begin_impl(int flags,
/*
* Choose the maximum number of buffers we're prepared to pin. We try to
- * pin fewer if we can, though. We clamp it to at least io_combine_limit
- * so that we can have a chance to build up a full io_combine_limit sized
- * read, even when max_ios is zero. Be careful not to allow int16 to
- * overflow (even though that's not possible with the current GUC range
- * limits), allowing also for the spare entry and the overflow space.
+ * pin fewer if we can, though. We add one so that we can make progress
+ * even if max_ios is set to 0 (see also further down). For max_ios > 0,
+ * this also allows an extra full I/O's worth of buffers: after an I/O
+ * finishes we don't want to have to wait for its buffers to be consumed
+ * before starting a new one.
+ *
+ * Be careful not to allow int16 to overflow (even though that's not
+ * possible with the current GUC range limits), allowing also for the
+ * spare entry and the overflow space.
*/
- max_pinned_buffers = Max(max_ios * 4, io_combine_limit);
+ max_pinned_buffers = (max_ios + 1) * io_combine_limit;
max_pinned_buffers = Min(max_pinned_buffers,
PG_INT16_MAX - io_combine_limit - 1);
@@ -725,7 +729,7 @@ read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
stream->ios[stream->oldest_io_index].buffer_index == oldest_buffer_index)
{
int16 io_index = stream->oldest_io_index;
- int16 distance;
+ int32 distance; /* wider temporary value, clamped below */
/* Sanity check that we still agree on the buffers. */
Assert(stream->ios[io_index].op.buffers ==