aboutsummaryrefslogtreecommitdiff
path: root/src/port/pg_numa.c
Commit message (Collapse)AuthorAge
* Fix indentation in pg_numa codeTomas Vondra2025-07-01
| | | | | | | Broken by commits 7fe2f67c7c9f, 81f287dc923f and bf1119d74a79. Backpatch to 18, same as the offending commits. Backpatch-through: 18
* Add CHECK_FOR_INTERRUPTS into pg_numa_query_pagesTomas Vondra2025-07-01
| | | | | | | | | | | | | | | Querying the NUMA status can be quite time consuming, especially with large shared buffers. 8cc139bec34a called numa_move_pages() once, for all buffers, and we had to wait for the syscall to complete. But with the chunking, introduced by 7fe2f67c7c to work around a kernel bug, we can do CHECK_FOR_INTERRUPTS() after each chunk, allowing users to abort the execution. Reviewed-by: Christoph Berg <myon@debian.org> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aEtDozLmtZddARdB@msg.df7cb.de Backpatch-through: 18
* Limit the size of numa_move_pages requestsTomas Vondra2025-07-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a kernel bug in do_pages_stat(), affecting systems combining 64-bit kernel and 32-bit user space. The function splits the request into chunks of 16 pointers, but forgets the pointers are 32-bit when advancing to the next chunk. Some of the pointers get skipped, and memory after the array is interpreted as pointers. The result is that the produced status of memory pages is mostly bogus. Systems combining 64-bit and 32-bit environments like this might seem rare, but that's not the case - all 32-bit Debian packages are built in a 32-bit chroot on a system with a 64-bit kernel. This is a long-standing kernel bug (since 2010), affecting pretty much all kernels, so it'll take time until all systems get a fixed kernel. Luckily, we can work around the issue by chunking the requests the same way do_pages_stat() does, at least on affected systems. We don't know what kernel a 32-bit build will run on, so all 32-bit builds use chunks of 16 elements (the largest chunk before hitting the issue). 64-bit builds are not affected by this issue, and so could work without the chunking. But chunking has other advantages, so we apply chunking even for 64-bit builds, with chunks of 1024 elements. Reported-by: Christoph Berg <myon@debian.org> Author: Christoph Berg <myon@debian.org> Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aEtDozLmtZddARdB@msg.df7cb.de Context: https://marc.info/?l=linux-mm&m=175077821909222&w=2 Backpatch-through: 18
* Cleanup of pg_numa.cTomas Vondra2025-04-09
| | | | | | | | | | | | | | | | | | | | | | | | | This moves/renames some of the functions defined in pg_numa.c: * pg_numa_get_pagesize() is renamed to pg_get_shmem_pagesize(), and moved to src/backend/storage/ipc/shmem.c. The new name better reflects that the page size is not related to NUMA, and it's specifically about the page size used for the main shared memory segment. * move pg_numa_available() to src/backend/storage/ipc/shmem.c, i.e. into the backend (which more appropriate for functions callable from SQL). While at it, improve the comment to explain what page size it returns. * remove unnecessary includes from src/port/pg_numa.c, adding unnecessary dependencies (src/port should be suitable for frontent). These were either leftovers or unnecessary thanks to the other changes in this commit. This eliminates unnecessary dependencies on backend symbols, which we don't want in src/port. Reported-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> https://postgr.es/m/CALdSSPi5fj0a7UG7Fmw2cUD1uWuckU_e8dJ+6x-bJEokcSXzqA@mail.gmail.com
* Add support for basic NUMA awarenessTomas Vondra2025-04-07
Add basic NUMA awareness routines, using a minimal src/port/pg_numa.c portability wrapper and an optional build dependency, enabled by --with-libnuma configure option. For now this is Linux-only, other platforms may be supported later. A built-in SQL function pg_numa_available() allows checking NUMA support, i.e. that the server was built/linked with the NUMA library. The main function introduced is pg_numa_query_pages(), which allows determining the NUMA node for individual memory pages. Internally the function uses move_pages(2) syscall, as it allows batching, and is more efficient than get_mempolicy(2). Author: Jakub Wartak <jakub.wartak@enterprisedb.com> Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com