aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils/mmgr/slab.c
Commit message (Collapse)AuthorAge
* Make the order of the header file includes consistentPeter Eisentraut2024-03-13
| | | | | | | | Similar to commit 7e735035f20. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAMbWs4-WhpCFMbXCjtJ%2BFzmjfPrp7Hw1pk4p%2BZpU95Kh3ofZ1A%40mail.gmail.com
* Optimize GenerationAlloc() and SlabAlloc()David Rowley2024-03-04
| | | | | | | | | | | | | | | | | | | | | | | | In a similar effort to 413c18401, separate out the hot and cold paths in GenerationAlloc() and SlabAlloc() to avoid having to setup the stack frame for the hot path. This additionally adjusts how we use the GenerationContext's freeblock. Freeblock, when set, is now always empty and we only switch to using it when the current allocation request finds the current block does not have enough space and the freeblock is large enough to accomodate the allocation. This commit also adjusts GenerationFree() so that if we pfree the final allocation in the current generation block, we now mark that block as empty and keep it as the current block. Previously we free'd that block and set the current block to NULL. Doing that meant we needed a special case in GenerationAlloc to check if GenerationContext.block was NULL. So this both reduces free/malloc calls and reduces the work done in GenerationAlloc(). In passing, improve some comments in aset.c Discussion: https://postgr.es/m/CAApHDvpHVSJqqb4B4OZLixr=CotKq-eKkbwZqvZVo_biYvUvQA@mail.gmail.com
* Adjust memory allocation functions to allow sibling callsDavid Rowley2024-02-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Many modern compilers are able to optimize function calls to functions where the parameters of the called function match a leading subset of the calling function's parameters. If there are no instructions in the calling function after the function is called, then the compiler is free to avoid any stack frame setup and implement the function call as a "jmp" rather than a "call". This is called sibling call optimization. Here we adjust the memory allocation functions in mcxt.c to allow this optimization. This requires moving some responsibility into the memory context implementations themselves. It's now the responsibility of the MemoryContext to check for malloc failures. This is good as it both allows the sibling call optimization, but also because most small and medium allocations won't call malloc and just allocate memory to an existing block. That can't fail, so checking for NULLs in that case isn't required. Also, traditionally it's been the responsibility of palloc and the other allocation functions in mcxt.c to check for invalid allocation size requests. Here we also move the responsibility of checking that into the MemoryContext. This isn't to allow the sibling call optimization, but more because most of our allocators handle large allocations separately and we can just add the size check when doing large allocations. We no longer check this for non-large allocations at all. To make checking the allocation request sizes and ERROR handling easier, add some helper functions to mcxt.c for the allocators to use. Author: Andres Freund Reviewed-by: David Rowley Discussion: https://postgr.es/m/20210719195950.gavgs6ujzmjfaiig@alap3.anarazel.de
* Update copyright for 2024Bruce Momjian2024-01-03
| | | | | | | | Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12
* Shrink memory contexts struct sizesDavid Rowley2023-07-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | Here we reduce the block size fields in AllocSetContext, GenerationContext and SlabContext from Size down to uint32. Ever since c6e0fe1f2, blocks for non-dedicated palloc chunks can no longer be larger than 1GB, so there's no need to store the various block size fields as 64-bit values. 32 bits are enough to store 2^30. Here we also further reduce the memory context struct sizes by getting rid of the 'keeper' field which stores a pointer to the context's keeper block. All the context types which have this field always allocate the keeper block in the same allocation as the memory context itself, so the keeper block always comes right at the end of the context struct. Add some macros to calculate that address rather than storing it in the context. Overall, in AllocSetContext and GenerationContext, this saves 20 bytes on 64-bit builds which for ALLOCSET_SMALL_SIZES can sometimes mean the difference between having to allocate a 2nd block and storing all the required allocations on the keeper block alone. Such contexts are used in relcache to store cache entries for indexes, of which there can be a large number in a single backend. Author: Melih Mutlu Reviewed-by: David Rowley Discussion: https://postgr.es/m/CAGPVpCSOW3uJ1QJmsMR9_oE3X7fG_z4q0AoU4R_w+2RzvroPFg@mail.gmail.com
* Adjust Valgrind macro usage to protect chunk headersDavid Rowley2023-04-15
| | | | | | | | | | | | | | Prior to this commit we only ever protected MemoryChunk's requested_size field with Valgrind NOACCESS. This means that if the hdrmask field is ever accessed accidentally then we're not going to get any warnings from Valgrind about it. Valgrind would have warned us about the problem fixed in 92957ed98 had we already been doing this. Per suggestion from Tom Lane Reviewed-by: Richard Guo Discussion: https://postgr.es/m/1650235.1672694719@sss.pgh.pa.us Discussion: https://postgr.es/m/CAApHDvr=FZNGbj252Z6M9BSFKoq6BMxgkQ2yEAGUYoo7RquqZg@mail.gmail.com
* Update copyright for 2023Bruce Momjian2023-01-02
| | | | Backpatch-through: 11
* Fix newly introduced bug in slab.cDavid Rowley2022-12-22
| | | | | | | | | | | | | | | | | | | | d21ded75f changed the way slab.c works but introduced a bug that meant we could end up with the slab's curBlocklistIndex pointing to the wrong list. The condition which was checking for this was failing to account for two things: 1. The curBlocklistIndex could be 0 as we've currently got no non-full blocks to put chunks on. In this case, the dlist_is_empty() check cannot be performed as there can be any number of completely full blocks at that index. 2. The curBlocklistIndex may be greater than the index we just moved the block onto. Since we need to ensure we fill up fuller blocks first, we must reset curBlocklistIndex when changing any blocklist element that's less than the curBlocklistIndex too. Reported-by: Takamichi Osumi Discussion: https://postgr.es/m/TYCPR01MB8373329C6329768D7E093D68EDEB9@TYCPR01MB8373.jpnprd01.prod.outlook.com
* Improve the performance of the slab memory allocatorDavid Rowley2022-12-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Slab has traditionally been fairly slow when compared with the AllocSet or Generation memory allocators. Part of this slowness came from having to write out an entire block when we allocate a new block in order to populate the free list indexes within the block's memory. Additional slowness came from having to move a block onto another dlist each time we palloc or pfree a chunk from it. Here we optimize both of those cases and do a little bit extra to improve the performance of the slab allocator. Here, instead of writing out the free list indexes when allocating a new block, we introduce the concept of "unused" chunks. When a block is first allocated all chunks are unused. These chunks only make it onto the free list when they are pfree'd. When allocating new chunks on an existing block, we have the choice of consuming a chunk from the free list or an unused chunk. When both exist, we opt to use one from the free list, as these have been used already and the memory of them is more likely to be cached by the CPU. Here we also reduce the number of block lists from there being one for every possible value of free chunks on a block to just having a small fixed number of block lists. We keep the 0th block list for completely full blocks and anything else stores blocks for some range of free chunks with fuller blocks appearing on lower block list array elements. This reduces how often we must move a block to another list when we allocate or free chunks, but still allows us to prefer to put new chunks on fuller blocks and perhaps allow blocks with fewer chunks to be free'd later once all their remaining chunks have been pfree'd. Additionally, we now store a list of "emptyblocks", which are blocks that no longer contain any allocated chunks. We now keep up to 10 of these around to avoid having to thrash malloc/free when allocation patterns continually cause blocks to become free of any allocated chunks only to allocate more chunks again. Now only once we have 10 of these, we free the block. This does raise the high water mark for the total memory that a slab context can consume. It does not seem entirely unreasonable that we might one day want to make this a property of SlabContext rather than a compile-time constant. Let's wait and see if there is any evidence to support that this is required before doing it. Author: Andres Freund, David Rowley Tested-by: Tomas Vondra, John Naylor Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
* Static assertions cleanupPeter Eisentraut2022-12-15
| | | | | | | | | | | | | | | | | | | | | Because we added StaticAssertStmt() first before StaticAssertDecl(), some uses as well as the instructions in c.h are now a bit backwards from the "native" way static assertions are meant to be used in C. This updates the guidance and moves some static assertions to better places. Specifically, since the addition of StaticAssertDecl(), we can put static assertions at the file level. This moves a number of static assertions out of function bodies, where they might have been stuck out of necessity, to perhaps better places at the file level or in header files. Also, when the static assertion appears in a position where a declaration is allowed, then using StaticAssertDecl() is more native than StaticAssertStmt(). Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/941a04e7-dd6f-c0e4-8cdf-a33b3338cbda%40enterprisedb.com
* Remove AssertArg and AssertStatePeter Eisentraut2022-10-28
| | | | | | | | | These don't offer anything over plain Assert, and their usage had already been declared obsolescent. Author: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/20221009210148.GA900071@nathanxps13
* Harden memory context allocators against bogus chunk pointers.Tom Lane2022-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before commit c6e0fe1f2, functions such as AllocSetFree could pretty safely presume that they were given a valid chunk pointer for their own type of context, because the indirect call through a memory context object and method struct would be very unlikely to work otherwise. But now, if pfree() is mistakenly invoked on a pointer to garbage, we have three chances in eight of ending up at one of these functions. That means we need to take extra measures to verify that we are looking at what we're supposed to be looking at, especially in debug builds. Hence, add code to verify that the chunk's back-link to a block header leads to a memory context object that satisfies the right sort of IsA() check. This is still a bit weaker than what we did before, but for the moment assume that an IsA() check is sufficient. As a compromise between speed and safety, implement these checks as Asserts when dealing with small chunks but plain test-and-elogs when dealing with large (external) chunks. The latter case should not be too performance-critical, but the former case probably is. In slab.c, all chunks are small; but nonetheless use a plain test in SlabRealloc, because that is certainly not performance-critical, indeed we should be suspicious that it's being called in error. In aset.c, additionally add some assertions that the "value" field of the chunk header is within the small range allowed for freelist indexes. Without that, we might find ourselves trying to wipe most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling on a "freelist header" that's far away from the context object. Eventually, field experience might show us that it's smarter for these tests to be active always, but for now we'll try to get away with just having them as assertions. While at it, also be more uniform about asserting that context objects passed as parameters are of the type we expect. Some places missed that altogether, and slab.c was for no very good reason doing it differently from the other allocators. Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
* Make more effort to put a sentinel at the end of allocated memoryDavid Rowley2022-09-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Traditionally, in MEMORY_CONTEXT_CHECKING builds, we only ever marked a sentinel byte just beyond the requested size if there happened to be enough space on the chunk to do so. For Slab and Generation context types, we only rounded the size of the chunk up to the next maxalign boundary, so it was often not that likely that those would ever have space for the sentinel given that the majority of allocation requests are going to be for sizes which are maxaligned. For AllocSet, it was a little different as smaller allocations are rounded up to the next power-of-2 value rather than the next maxalign boundary, so we're a bit more likely to have space for the sentinel byte, especially when we get away from tiny sized allocations such as 8 or 16 bytes. Here we make more of an effort to allow space so that there is enough room for the sentinel byte in more cases. This makes it more likely that we'll detect when buggy code accidentally writes beyond the end of any of its memory allocations. Each of the 3 MemoryContext types has been changed as follows: The Slab allocator will now always set a sentinel byte. Both the current usages of this MemoryContext type happen to use chunk sizes which were on the maxalign boundary, so these never used sentinel bytes previously. For the Generation allocator, we now always ensure there's enough space in the allocation for a sentinel byte. For AllocSet, this commit makes an adjustment for allocation sizes which are greater than allocChunkLimit. We now ensure there is always space for a sentinel byte. We don't alter the sentinel behavior for request sizes <= allocChunkLimit. Making way for the sentinel byte for power-of-2 request sizes would require doubling up to the next power of 2. Some analysis done on the request sizes made during installcheck shows that a fairly large portion of allocation requests are for power-of-2 sizes. The amount of additional memory for the sentinel there seems prohibitive, so we do nothing for those here. Author: David Rowley Discussion: https://postgr.es/m/3478405.1661824539@sss.pgh.pa.us
* Use MAXALIGN() in calculations using sizeof(SlabBlock)David Rowley2022-08-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | c6e0fe1f2 added a new pointer field to SlabBlock to make it 4 bytes larger on 32-bit machines. Prior to that commit, the size of that struct was a multiple of 8, which meant that MAXALIGN(sizeof(SlabBlock)) was the same as sizeof(SlabBlock), however, after c6e0fe1f2, due to the addition of the new pointer field to store a pointer to the owning context, that was no longer true on builds with sizeof(void *) == 4. This problem was highlighted by an Assert failure which was checking that the pointer given to pfree() was MAXALIGNED. Various 32-bit ARM buildfarm animals were failing. These have MAXIMUM_ALIGNOF of 8. The only 32-bit testing I'd managed to do on c6e0fe1f2 had been on x86, which has a MAXIMUM_ALIGNOF of 4, therefore did not exhibit this issue. Here we define Slab_BLOCKHDRSZ and copy what is being done in aset.c and generation.c for doing calculations based on the size of the context's block type. This means that SlabAlloc() will now always return a MAXALIGNed pointer. This also fixes an incorrect sentinel_ok() check in SlabCheck() which was incorrectly checking the wrong sentinel byte. This must have previously not caused any issues due to the fullChunkSize never being large enough to store the sentinel byte. Diagnosed-by: Tomas Vondra, Tom Lane Author: Tomas Vondra, David Rowley Discussion: https://postgr.es/m/CAA4eK1%2B1JyW5TiL%3DyV-3Uq1CrfnTyn0Xrk5uArt31Z%3D8rgPhXQ%40mail.gmail.com
* Improve performance of and reduce overheads of memory managementDavid Rowley2022-08-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Whenever we palloc a chunk of memory, traditionally, we prefix the returned pointer with a pointer to the memory context to which the chunk belongs. This is required so that we're able to easily determine the owning context when performing operations such as pfree() and repalloc(). For the AllocSet context, prior to this commit we additionally prefixed the pointer to the owning context with the size of the chunk. This made the header 16 bytes in size. This 16-byte overhead was required for all AllocSet allocations regardless of the allocation size. For the generation context, the problem was worse; in addition to the pointer to the owning context and chunk size, we also stored a pointer to the owning block so that we could track the number of freed chunks on a block. The slab allocator had a 16-byte chunk header. The changes being made here reduce the chunk header size down to just 8 bytes for all 3 of our memory context types. For small to medium sized allocations, this significantly increases the number of chunks that we can fit on a given block which results in much more efficient use of memory. Additionally, this commit completely changes the rule that pointers to palloc'd memory must be directly prefixed by a pointer to the owning memory context and instead, we now insist that they're directly prefixed by an 8-byte value where the least significant 3-bits are set to a value to indicate which type of memory context the pointer belongs to. Using those 3 bits as an index (known as MemoryContextMethodID) to a new array which stores the methods for each memory context type, we're now able to pass the pointer given to functions such as pfree() and repalloc() to the function specific to that context implementation to allow them to devise their own methods of finding the memory context which owns the given allocated chunk of memory. The reason we're able to reduce the chunk header down to just 8 bytes is because of the way we make use of the remaining 61 bits of the required 8-byte chunk header. Here we also implement a general-purpose MemoryChunk struct which makes use of those 61 remaining bits to allow the storage of a 30-bit value which the MemoryContext is free to use as it pleases, and also the number of bytes which must be subtracted from the chunk to get a reference to the block that the chunk is stored on (also 30 bits). The 1 additional remaining bit is to denote if the chunk is an "external" chunk or not. External here means that the chunk header does not store the 30-bit value or the block offset. The MemoryContext can use these external chunks at any time, but must use them if any of the two 30-bit fields are not large enough for the value(s) that need to be stored in them. When the chunk is marked as external, it is up to the MemoryContext to devise its own means to determine the block offset. Using 3-bits for the MemoryContextMethodID does mean we're limiting ourselves to only having a maximum of 8 different memory context types. We could reduce the bit space for the 30-bit value a little to make way for more than 3 bits, but it seems like it might be better to do that only if we ever need more than 8 context types. This would only be a problem if some future memory context type which does not use MemoryChunk really couldn't give up any of the 61 remaining bits in the chunk header. With this MemoryChunk, each of our 3 memory context types can quickly obtain a reference to the block any given chunk is located on. AllocSet is able to find the context to which the chunk is owned, by first obtaining a reference to the block by subtracting the block offset as is stored in the 'hdrmask' field and then referencing the block's 'aset' field. The Generation context uses the same method, but GenerationBlock did not have a field pointing back to the owning context, so one is added by this commit. In aset.c and generation.c, all allocations larger than allocChunkLimit are stored on dedicated blocks. When there's just a single chunk on a block like this, it's easy to find the block from the chunk, we just subtract the size of the block header from the chunk pointer. The size of these chunks is also known as we store the endptr on the block, so we can just subtract the pointer to the allocated memory from that. Because we can easily find the owning block and the size of the chunk for these dedicated blocks, we just always use external chunks for allocation sizes larger than allocChunkLimit. For generation.c, this sidesteps the problem of non-external MemoryChunks being unable to represent chunk sizes >= 1GB. This is less of a problem for aset.c as we store the free list index in the MemoryChunk's spare 30-bit field (the value of which will never be close to using all 30-bits). We can easily reverse engineer the chunk size from this when needed. Storing this saves AllocSetFree() from having to make a call to AllocSetFreeIndex() to determine which free list to put the newly freed chunk on. For the slab allocator, this commit adds a new restriction that slab chunks cannot be >= 1GB in size. If there happened to be any users of slab.c which used chunk sizes this large, they really should be using AllocSet instead. Here we also add a restriction that normal non-dedicated blocks cannot be 1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB during the creation of an AllocSet or Generation context. Allocations can still be larger than 1GB, it's just these will always be on dedicated blocks (which do not have the 1GB restriction). Author: Andres Freund, David Rowley Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
* Update copyright for 2022Bruce Momjian2022-01-07
| | | | Backpatch-through: 10
* Fix incorrect format placeholdersPeter Eisentraut2021-12-22
|
* Add function to log the memory contexts of specified backend process.Fujii Masao2021-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3e98c0bafb added pg_backend_memory_contexts view to display the memory contexts of the backend process. However its target process is limited to the backend that is accessing to the view. So this is not so convenient when investigating the local memory bloat of other backend process. To improve this situation, this commit adds pg_log_backend_memory_contexts() function that requests to log the memory contexts of the specified backend process. This information can be also collected by calling MemoryContextStats(TopMemoryContext) via a debugger. But this technique cannot be used in some environments because no debugger is available there. So, pg_log_backend_memory_contexts() allows us to see the memory contexts of specified backend more easily. Only superusers are allowed to request to log the memory contexts because allowing any users to issue this request at an unbounded rate would cause lots of log messages and which can lead to denial of service. On receipt of the request, at the next CHECK_FOR_INTERRUPTS(), the target backend logs its memory contexts at LOG_SERVER_ONLY level, so that these memory contexts will appear in the server log but not be sent to the client. It logs one message per memory context. Because if it buffers all memory contexts into StringInfo to log them as one message, which may require the buffer to be enlarged very much and lead to OOM error since there can be a large number of memory contexts in a backend. When a backend process is consuming huge memory, logging all its memory contexts might overrun available disk space. To prevent this, now this patch limits the number of child contexts to log per parent to 100. As with MemoryContextStats(), it supposes that practical cases where the log gets long will typically be huge numbers of siblings under the same parent context; while the additional debugging value from seeing details about individual siblings beyond 100 will not be large. There was another proposed patch to add the function to return the memory contexts of specified backend as the result sets, instead of logging them, in the discussion. However that patch is not included in this commit because it had several issues to address. Thanks to Tatsuhito Kasahara, Andres Freund, Tom Lane, Tomas Vondra, Michael Paquier, Kyotaro Horiguchi and Zhihong Yu for the discussion. Bump catalog version. Author: Atsushi Torikoshi Reviewed-by: Kyotaro Horiguchi, Zhihong Yu, Fujii Masao Discussion: https://postgr.es/m/0271f440ac77f2a4180e0e56ebd944d1@oss.nttdata.com
* Update copyright for 2021Bruce Momjian2021-01-02
| | | | Backpatch-through: 9.5
* Initial pgindent and pgperltidy run for v13.Tom Lane2020-05-14
| | | | | | | | | | | Includes some manual cleanup of places that pgindent messed up, most of which weren't per project style anyway. Notably, it seems some people didn't absorb the style rules of commit c9d297751, because there were a bunch of new occurrences of function calls with a newline just after the left paren, all with faulty expectations about how the rest of the call would get indented.
* Remove useless (and broken) logging logic in memory context functions.Tom Lane2020-04-23
| | | | | | | | | | | Nobody really uses this stuff, especially not since we created valgrind-based infrastructure that does the same thing better. It is thus unsurprising that the generation.c and slab.c versions were actually broken. Rather than fix 'em, let's just remove 'em. Alexander Lakhin Discussion: https://postgr.es/m/8936216c-3492-3f6e-634b-d638fddc5f91@gmail.com
* Revert "Specialize MemoryContextMemAllocated()."Jeff Davis2020-03-19
| | | | This reverts commit e00912e11a9ec2d29274ed8a6465e81385906dc2.
* Specialize MemoryContextMemAllocated().Jeff Davis2020-03-18
| | | | | | | | | | | An AllocSet doubles the size of allocated blocks (up to maxBlockSize), which means that the current block can represent half of the total allocated space for the memory context. But the free space in the current block may never have been touched, so don't count the untouched memory as allocated for the purposes of MemoryContextMemAllocated(). Discussion: https://postgr.es/m/ec63d70b668818255486a83ffadc3aec492c1f57.camel@j-davis.com
* Allocate freechunks bitmap as part of SlabContextTomas Vondra2020-01-17
| | | | | | | | | | | | | | | | | | | | The bitmap used by SlabCheck to cross-check free chunks in a block used to be allocated for each SlabCheck call, and was never freed. The memory leak could be fixed by simply adding a pfree call, but it's actually a bad idea to do any allocations in SlabCheck at all as it assumes the state of the memory management as a whole is sane. So instead we allocate the bitmap as part of SlabContext, which means we don't need to do any allocations in SlabCheck and the bitmap goes away together with the SlabContext. Backpatch to 10, where the Slab context was introduced. Author: Tomas Vondra Reported-by: Andres Freund Reviewed-by: Tom Lane Backpatch-through: 10 Discussion: https://www.postgresql.org/message-id/20200116044119.g45f7pmgz4jmodxj%40alap3.anarazel.de
* Update copyrights for 2020Bruce Momjian2020-01-01
| | | | Backpatch-through: update all files in master, backpatch legal files through 9.4
* Make the order of the header file includes consistent in backend modules.Amit Kapila2019-11-12
| | | | | | | | | | | Similar to commits 7e735035f2 and dddf4cdc33, this commit makes the order of header file inclusion consistent for backend modules. In the passing, removed a couple of duplicate inclusions. Author: Vignesh C Reviewed-by: Kuntal Ghosh and Amit Kapila Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
* Add transparent block-level memory accountingTomas Vondra2019-10-01
| | | | | | | | | | | | | | | | | | | | Adds accounting of memory allocated in a memory context. Compared to various ad hoc solutions, the main advantage is that the accounting is transparent and does not require direct control over allocations (this matters for use cases where the allocations happen in user code, like for example aggregate states allocated in a transition functions). To reduce overhead, the accounting happens at the block level (not for individual chunks) and only the context immediately owning the block is updated. When inquiring about amount of memory allocated in a context, we have to recursively walk all children contexts. This "lazy" accounting works well for cases with relatively small number of contexts in the relevant subtree and/or with infrequent inquiries. Author: Jeff Davis Reivewed-by: Tomas Vondra, Melanie Plageman, Soumyadeep Chakraborty Discussion: https://www.postgresql.org/message-id/flat/027a129b8525601c6a680d27ce3a7172dab61aab.camel@j-davis.com
* Fix inconsistencies and typos in the tree, take 10Michael Paquier2019-08-13
| | | | | | | | | This addresses some issues with unnecessary code comments, fixes various typos in docs and comments, and removes some orphaned structures and definitions. Author: Alexander Lakhin Discussion: https://postgr.es/m/9aabc775-5494-b372-8bcb-4dfc0bd37c68@gmail.com
* Fix inconsistencies and typos in the tree, take 9Michael Paquier2019-08-05
| | | | | | | | This addresses more issues with code comments, variable names and unreferenced variables. Author: Alexander Lakhin Discussion: https://postgr.es/m/7ab243e0-116d-3e44-d120-76b3df7abefd@gmail.com
* Phase 2 pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | | Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Allow memory contexts to have both fixed and variable ident strings.Tom Lane2018-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Originally, we treated memory context names as potentially variable in all cases, and therefore always copied them into the context header. Commit 9fa6f00b1 rethought this a little bit and invented a distinction between fixed and variable names, skipping the copy step for the former. But we can make things both simpler and more useful by instead allowing there to be two parts to a context's identification, a fixed "name" and an optional, variable "ident". The name supplied in the context create call is now required to be a compile-time-constant string in all cases, as it is never copied but just pointed to. The "ident" string, if wanted, is supplied later. This is needed because typically we want the ident to be stored inside the context so that it's cleaned up automatically on context deletion; that means it has to be copied into the context before we can set the pointer. The cost of this approach is basically just an additional pointer field in struct MemoryContextData, which isn't much overhead, and is bought back entirely in the AllocSet case by not needing a headerSize field anymore, since we no longer have to cope with variable header length. In addition, we can simplify the internal interfaces for memory context creation still further, saving a few cycles there. And it's no longer true that a custom identifier disqualifies a context from participating in aset.c's freelist scheme, so possibly there's some win on that end. All the places that were using non-compile-time-constant context names are adjusted to put the variable info into the "ident" instead. This allows more effective identification of those contexts in many cases; for example, subsidary contexts of relcache entries are now identified by both type (e.g. "index info") and relname, where before you got only one or the other. Contexts associated with PL function cache entries are now identified more fully and uniformly, too. I also arranged for plancache contexts to use the query source string as their identifier. This is basically free for CachedPlanSources, as they contained a copy of that string already. We pay an extra pstrdup to do it for CachedPlans. That could perhaps be avoided, but it would make things more fragile (since the CachedPlanSource is sometimes destroyed first). I suspect future improvements in error reporting will require CachedPlans to have a copy of that string anyway, so it's not clear that it's worth moving mountains to avoid it now. This also changes the APIs for context statistics routines so that the context-specific routines no longer assume that output goes straight to stderr, nor do they know all details of the output format. This is useful immediately to reduce code duplication, and it also allows for external code to do something with stats output that's different from printing to stderr. The reason for pushing this now rather than waiting for v12 is that it rethinks some of the API changes made by commit 9fa6f00b1. Seems better for extension authors to endure just one round of API changes not two. Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
* Update copyright for 2018Bruce Momjian2018-01-02
| | | | Backpatch-through: certain files through 9.3
* Rethink MemoryContext creation to improve performance.Tom Lane2017-12-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes a number of interrelated changes to reduce the overhead involved in creating/deleting memory contexts. The key ideas are: * Include the AllocSetContext header of an aset.c context in its first malloc request, rather than allocating it separately in TopMemoryContext. This means that we now always create an initial or "keeper" block in an aset, even if it never receives any allocation requests. * Create freelists in which we can save and recycle recently-destroyed asets (this idea is due to Robert Haas). * In the common case where the name of a context is a constant string, just store a pointer to it in the context header, rather than copying the string. The first change eliminates a palloc/pfree cycle per context, and also avoids bloat in TopMemoryContext, at the price that creating a context now involves a malloc/free cycle even if the context never receives any allocations. That would be a loser for some common usage patterns, but recycling short-lived contexts via the freelist eliminates that pain. Avoiding copying constant strings not only saves strlen() and strcpy() overhead, but is an essential part of the freelist optimization because it makes the context header size constant. Currently we make no attempt to use the freelist for contexts with non-constant names. (Perhaps someday we'll need to think harder about that, but in current usage, most contexts with custom names are long-lived anyway.) The freelist management in this initial commit is pretty simplistic, and we might want to refine it later --- but in common workloads that will never matter because the freelists will never get full anyway. To create a context with a non-constant name, one is now required to call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME option. AllocSetContextCreate becomes a wrapper macro, and it includes a test that will complain about non-string-literal context name parameters on gcc and similar compilers. An unfortunate side effect of making AllocSetContextCreate a macro is that one is now *required* to use the size parameter abstraction macros (ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of writing out individual size parameters no longer works unless you switch to AllocSetContextCreateExtended. Internally to the memory-context-related modules, the context creation APIs are simplified, removing the rather baroque original design whereby a context-type module called mcxt.c which then called back into the context-type module. That saved a bit of code duplication, but not much, and it prevented context-type modules from exercising control over the allocation of context headers. In passing, I converted the test-and-elog validation of aset size parameters into Asserts to save a few more cycles. The original thought was that callers might compute size parameters on the fly, but in practice nobody does that, so it's useless to expend cycles on checking those numbers in production builds. Also, mark the memory context method-pointer structs "const", just for cleanliness. Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
* Mostly-cosmetic improvements in memory chunk header alignment coding.Tom Lane2017-11-24
| | | | | | | | | | | | | | | Add commentary about what we're doing and why. Apply the method used for padding in GenerationChunk to AllocChunkData, replacing the rather ad-hoc solution used in commit 7e3aa03b4. Reorder fields in GenerationChunk so that the padding calculation will work even if sizeof(size_t) is different from sizeof(void *) --- likely that will never happen, but we don't need the assumption if we do it like this. Improve static assertions about alignment. In passing, fix a couple of oversights in the "large chunk" path in GenerationAlloc(). Discussion: https://postgr.es/m/E1eHa4J-0006hI-Q8@gemulon.postgresql.org
* Phase 2 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Initial pgindent run with pg_bsd_indent version 2.0.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are: * Nicer formatting of function-pointer declarations. * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof. * No longer wants to add a space in "struct structname *varname", as well as some similar cases for const- or volatile-qualified pointers. * Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely. * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code. * Fixes some odd decisions for comments following case labels. * Fixes some cases where comments following code were indented to less than the expected column 33. On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself. There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Make slab allocator work on platforms with MAXIMUM_ALIGNOF < sizeof(int).Heikki Linnakangas2017-05-18
| | | | | | Notably, m68k only needs 2-byte alignment. Per report from Christoph Berg. Discussion: https://www.postgresql.org/message-id/20170517193957.fwntkgi6epuso5l2@msg.df7cb.de
* Fix two valgrind issues in slab allocator.Andres Freund2017-04-04
| | | | | | | | | | | | During allocation VALGRIND_MAKE_MEM_DEFINED was called with a pointer as size. That kind of works, but makes valgrind exceedingly slow for workloads involving the slab allocator. Secondly there was an access to memory marked as unreachable within SlabCheck(). Fix that too. Author: Tomas Vondra Discussion: https://postgr.es/m/a6543b6d-6015-99b1-63ef-3ed55a76a730@2ndquadrant.com
* Suppress compiler warning in slab.c.Tom Lane2017-03-08
| | | | | | | | | Compilers that don't realize that elog(ERROR) doesn't return complained that SlabRealloc() failed to return a value. While at it, fix the rather muddled header comment for the function. Per buildfarm.
* Reduce size of common allocation header.Andres Freund2017-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | The new slab allocator needs different per-allocation information than the classical aset.c. The definition in 58b25e981 wasn't sufficiently careful on 32 platforms with 8 byte alignment, leading to buildfarm failures. That's not entirely easy to fix by just adjusting the definition. As slab.c doesn't actually need the size part(s) of the common header, all chunks are equally sized after all, it seems better to instead reduce the header to the part needed by all allocators, namely which context an allocation belongs to. That has the advantage of reducing the overhead of slab allocations, and also allows for more flexibility in future allocators. To avoid spreading the logic about accessing a chunk's context around, centralize it in GetMemoryChunkContext(), which allows to delete a good number of lines. A followup commit will revise the mmgr/README portion about StandardChunkHeader, and more. Author: Andres Freund Discussion: https://postgr.es/m/20170228074420.aazv4iw6k562mnxg@alap3.anarazel.de
* Add "Slab" MemoryContext implementation for efficient equal-sized allocations.Andres Freund2017-02-27
The default general purpose aset.c style memory context is not a great choice for allocations that are all going to be evenly sized, especially when those objects aren't small, and have varying lifetimes. There tends to be a lot of fragmentation, larger allocations always directly go to libc rather than have their cost amortized over several pallocs. These problems lead to the introduction of ad-hoc slab allocators in reorderbuffer.c. But it turns out that the simplistic implementation leads to problems when a lot of objects are allocated and freed, as aset.c is still the underlying implementation. Especially freeing can easily run into O(n^2) behavior in aset.c. While the O(n^2) behavior in aset.c can, and probably will, be addressed, custom allocators for this behavior are more efficient both in space and time. This allocator is for evenly sized allocations, and supports both cheap allocations and freeing, without fragmenting significantly. It does so by allocating evenly sized blocks via malloc(), and carves them into chunks that can be used for allocations. In order to release blocks to the OS as early as possible, chunks are allocated from the fullest block that still has free objects, increasing the likelihood of a block being entirely unused. A subsequent commit uses this in reorderbuffer.c, but a further allocator is needed to resolve the performance problems triggering this work. There likely are further potentialy uses of this allocator besides reorderbuffer.c. There's potential further optimizations of the new slab.c, in particular the array of freelists could be replaced by a more intelligent structure - but for now this looks more than good enough. Author: Tomas Vondra, editorialized by Andres Freund Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com