path: root/src/backend/lib
Commit message (Author, Date)
* Use pg_bitutils for HyperLogLog. (Jeff Davis, 2020-07-30)
  Using pg_leftmost_one_pos32() yields substantial performance benefits. Backpatching to version 13 because HLL is used for HashAgg improvements in 9878b643, which was also backpatched to 13.
  Reviewed-by: Peter Geoghegan
  Discussion: https://postgr.es/m/CAH2-WzkGvDKVDo+0YvfvZ+1CE=iCi88DCOGFF3i1hTGGaxcKPw@mail.gmail.com
  Backpatch-through: 13
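  A minimal sketch of the kind of computation this speeds up (hll_rho32 below is a hypothetical helper, not the committed hyperloglog.c code), assuming the pg_bitutils.h API:

      #include "postgres.h"
      #include "port/pg_bitutils.h"

      /*
       * HyperLogLog tracks, per register, the rank of the leftmost 1-bit of a
       * hashed value ("rho").  pg_leftmost_one_pos32() gives the 0-based
       * position of the most significant set bit, so for a nonzero word the
       * rank is simply 32 minus that position.
       */
      static inline uint8
      hll_rho32(uint32 x)
      {
          Assert(x != 0);
          return (uint8) (32 - pg_leftmost_one_pos32(x));
      }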
* Move src/backend/utils/hash/hashfn.c to src/common (Robert Haas, 2020-02-27)
  This also involves renaming src/include/utils/hashutils.h, which becomes src/include/common/hashfn.h. Perhaps an argument can be made for keeping the hashutils.h name, but it seemed more consistent to make it match the name of the file, and also more descriptive of what is actually going on here.
  Patch by me, reviewed by Suraj Kharage and Mark Dilger. Off-list advice on how not to break the Windows build from Davinder Singh and Amit Kapila.
  Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
* Put all the prototypes for hashfn.c into the same header file. (Robert Haas, 2020-02-24)
  Previously, some of the prototypes for functions in hashfn.c were in utils/hashutils.h and others were in utils/hsearch.h, but that is confusing and has no particular benefit.
  Patch by me, reviewed by Suraj Kharage and Mark Dilger.
  Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
* Update copyrights for 2020 (Bruce Momjian, 2020-01-01)
  Backpatch-through: update all files in master, backpatch legal files through 9.4
* Make StringInfo available to frontend code. (Andres Freund, 2019-11-05)
  There are plenty of places in frontend code that could benefit from a string buffer implementation. Some because it yields simpler and faster code, and some others because of the desire to share code between backend and frontend.
  While there is a string buffer implementation available to frontend code, libpq's PQExpBuffer, it is clunkier than stringinfo, it introduces a libpq dependency, doesn't allow for sharing between frontend and backend code, and has a higher API/ABI stability requirement due to being exposed via libpq.
  Therefore it seems best to simply make StringInfo usable by frontend code. There's not much to do for that, except for rewriting two subsequent elog/ereport calls into other types of error reporting, and deciding on a maximum string length.
  For the maximum string size I decided to privately define MaxAllocSize to the same value as used in the backend. It seems likely that we'll want to reconsider this for both backend and frontend code in the not too far away future.
  For now I've left stringinfo.h in lib/, rather than common/, to reduce the likelihood of unnecessary breakage. We could alternatively decide to provide a redirecting stringinfo.h in lib/, or just not provide compatibility.
  Author: Andres Freund
  Reviewed-By: Kyotaro Horiguchi, Daniel Gustafsson
  Discussion: https://postgr.es/m/20190920051857.2fhnvhvx4qdddviz@alap3.anarazel.de
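  A small illustration of what this enables (a hypothetical frontend helper, not code from the commit):

      #include "postgres_fe.h"
      #include "lib/stringinfo.h"

      /* Build a comma-separated list; caller frees the result with pfree(). */
      static char *
      join_args(int argc, char **argv)
      {
          StringInfoData buf;
          int         i;

          initStringInfo(&buf);
          for (i = 0; i < argc; i++)
              appendStringInfo(&buf, "%s%s", i > 0 ? ", " : "", argv[i]);
          return buf.data;
      }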
* Split all OBJS style lines in makefiles into one-line-per-entry style. (Andres Freund, 2019-11-05)
  When maintaining or merging patches, one of the most common sources of conflicts is the list of objects in makefiles. Especially when the split across lines has been changed on both sides, which is somewhat common due to attempting to stay below 80 columns, those conflicts are unnecessarily laborious to resolve.
  By splitting, and alphabetically sorting, OBJS style lines into one object per line, conflicts should be less frequent, and easier to resolve when they still occur.
  Author: Andres Freund
  Discussion: https://postgr.es/m/20191029200901.vww4idgcxv74cwes@alap3.anarazel.de
* Fix inconsistencies in the code (Michael Paquier, 2019-07-08)
  This addresses a couple of issues in the code:
  - Typos and inconsistencies in comments and function declarations.
  - Removal of unreferenced function declarations.
  - Removal of unnecessary compile flags.
  - A cleanup error in regressplans.sh.
  Author: Alexander Lakhin
  Discussion: https://postgr.es/m/0c991fdf-2670-1997-c027-772a420c4604@gmail.com
* Fix more typos and inconsistencies in the tree (Michael Paquier, 2019-06-17)
  Author: Alexander Lakhin
  Discussion: https://postgr.es/m/0a5419ea-1452-a4e6-72ff-545b1a5a8076@gmail.com
* Fix assorted inconsistencies. (Amit Kapila, 2019-06-08)
  There were a number of issues in the recent commits, including typos, mismatches between code and comments, and leftover function declarations. Fix them.
  Reported-by: Alexander Lakhin
  Author: Alexander Lakhin, Amit Kapila and Amit Langote
  Reviewed-by: Amit Kapila
  Discussion: https://postgr.es/m/ef0c0232-0c1d-3a35-63d4-0ebd06e31387@gmail.com
* Update copyright year. (Thomas Munro, 2019-05-24)
  Reviewed-by: Michael Paquier
  Discussion: https://postgr.es/m/CA%2BhUKGJFWXmtYo6Frd77RR8YXCHz7hJ2mRy5aHV%3D7fJOqDnBHA%40mail.gmail.com
* Phase 2 pgindent run for v12. (Tom Lane, 2019-05-22)
  Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is.
  Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Fix example in comment. (Heikki Linnakangas, 2019-04-09)
  Author: Adrien Nayrat
* Further code review for new integerset code. (Tom Lane, 2019-03-25)
  Mostly cosmetic adjustments, but I added a more reliable method of detecting whether an iteration is in progress.
* Clean up the Simple-8b encoder code. (Heikki Linnakangas, 2019-03-25)
  Coverity complained that simple8b_encode() might read beyond the end of the 'diffs' array, in the loop to encode the integers. That was a false positive, because we never get into the loop in modes 0 or 1, and the array is large enough for all the other modes. But I admit it's very subtle, so it's not surprising that Coverity didn't see it, and it's not very obvious to humans either. Refactor it, so that the second loop re-computes the differences, instead of carrying them over from the first loop in the 'diffs' array. This way, the 'diffs' array is not needed anymore. It makes no measurable difference in performance, and seems more straightforward this way.
  Also, improve the comments in simple8b_encode(): fix the comment about its return value that was flat-out wrong, and explain the condition when it returns EMPTY_CODEWORD better.
  In passing, move the 'selector' from the codeword's low bits to the high bits. It doesn't matter much, but looking at the original paper, and googling around for other Simple-8b implementations, that's how it's usually done.
  Per Coverity, and Tom Lane's report off-list.
* Fix yet more portability bugs in integerset and its tests. (Heikki Linnakangas, 2019-03-22)
  There were more large constants that needed UINT64CONST. And one variable was declared as "int", when it needed to be uint64. These bugs were only visible on 32-bit systems; clearly I should've tested on one, given that this code does a lot of work with 64-bit integers.
  Also, in the "huge distances" test, the code created some values with random distances between them, but the test logic didn't take into account the possibility that the random distance was exactly 1. That never actually happens with the seed we're using, but let's be tidy.
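  For reference, the portability idiom in question, with an illustrative value rather than one from the commit:

      #include "postgres.h"

      /*
       * On a 32-bit platform a bare "1 << 40" shifts a 32-bit int, which is
       * undefined; spelling the constant with UINT64CONST keeps the whole
       * expression in 64-bit arithmetic on every platform.
       */
      static const uint64 forty_bit = UINT64CONST(1) << 40;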
* Add IntegerSet, to hold large sets of 64-bit ints efficiently. (Heikki Linnakangas, 2019-03-22)
  The set is implemented as a B-tree, with a compact representation at leaf items, using the Simple-8b algorithm, so that clusters of nearby values use less memory.
  The IntegerSet isn't used for anything yet, aside from the test code, but we have two patches in the works that would benefit from this: a patch to allow GiST vacuum to delete empty pages, and a patch to reduce heap VACUUM's memory usage, by storing the list of dead TIDs more efficiently and lifting the 1 GB limit on its size.
  This includes a unit test module, in src/test/modules/test_integerset. It can be used to verify correctness, as a regression test, but if you run it manually, it can also print memory usage and execution time of some of the tests.
  Author: Heikki Linnakangas, Andrey Borodin
  Reviewed-by: Julien Rouhaud
  Discussion: https://www.postgresql.org/message-id/b5e82599-1966-5783-733c-1a947ddb729f@iki.fi
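  A rough usage sketch, assuming the lib/integerset.h interface (the caller below is hypothetical; members must be added in ascending order):

      #include "postgres.h"
      #include "lib/integerset.h"

      static void
      remember_some_values(void)
      {
          IntegerSet *set = intset_create();

          /* values must be added in ascending order */
          intset_add_member(set, 10);
          intset_add_member(set, 11);
          intset_add_member(set, 500000);

          if (intset_is_member(set, 11))
              elog(DEBUG1, "set holds " UINT64_FORMAT " values in " UINT64_FORMAT " bytes",
                   intset_num_entries(set), intset_memory_usage(set));
      }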
* Move hash_any prototype from access/hash.h to utils/hashutils.h (Alvaro Herrera, 2019-03-11)
  ... as well as its implementation from backend/access/hash/hashfunc.c to backend/utils/hash/hashfn.c.
  access/hash is the place for the hash index AM, not really appropriate for generic facilities, which is what hash_any is; having things the old way meant that anything using hash_any had to include the AM's include file, pointlessly polluting its namespace with unrelated, unnecessary cruft.
  Also move the HTEqual strategy number to access/stratnum.h from access/hash.h.
  To avoid breaking third-party extension code, add an #include "utils/hashutils.h" to access/hash.h. (An easily removed line by committers who enjoy their asbestos suits to protect them from angry extension authors.)
  Discussion: https://postgr.es/m/201901251935.ser5e4h6djt2@alvherre.pgsql
* Make use of compiler builtins and/or assembly for CLZ, CTZ, POPCNT. (Tom Lane, 2019-02-15)
  Test for the compiler builtins __builtin_clz, __builtin_ctz, and __builtin_popcount, and make use of these in preference to handwritten C code if they're available. Create src/port infrastructure for "leftmost one", "rightmost one", and "popcount" so as to centralize these decisions.
  On x86_64, __builtin_popcount generally won't make use of the POPCNT opcode because that's not universally supported yet. Provide code that checks CPUID and then calls POPCNT via asm() if available. This requires indirecting through a function pointer, which is an annoying amount of overhead for a one-instruction operation, but it's probably not worth working harder than this for our current use-cases.
  I'm not sure we've found all the existing places that could profit from this new infrastructure; but we at least touched all the ones that used copied-and-pasted versions of the bitmapset.c code, and got rid of multiple copies of the associated constant arrays.
  While at it, replace c-compiler.m4's one-per-builtin-function macros with a single one that can handle all the cases we need to worry about so far. Also, because I'm paranoid, make those checks into AC_LINK checks rather than just AC_COMPILE; the former coding failed to verify that libgcc has support for the builtin, in cases where it's not inline code.
  David Rowley, Thomas Munro, Alvaro Herrera, Tom Lane
  Discussion: https://postgr.es/m/CAKJS1f9WTAGG1tPeJnD18hiQW5gAk59fQ6WK-vfdAKEHyRg2RA@mail.gmail.com
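  As a rough illustration of the centralized API (function names from port/pg_bitutils.h; the caller is hypothetical):

      #include "postgres.h"
      #include "port/pg_bitutils.h"

      /* Count the set bits across an array of 64-bit words. */
      static int
      count_set_bits(const uint64 *words, int nwords)
      {
          int         count = 0;
          int         i;

          for (i = 0; i < nwords; i++)
              count += pg_popcount64(words[i]);
          return count;
      }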
* Revert attempts to use POPCNT etc instructions (Alvaro Herrera, 2019-02-15)
  This reverts commits fc6c72747ae6, 109de05cbb03, d0b4663c23b7 and 711bab1e4d19.
  Somebody will have to try harder before submitting this patch again. I've spent entirely too much time on it already, and the #ifdef maze yet to be written in order for it to build at all got on my nerves. The amount of work needed to get a platform-specific performance improvement that's barely above the noise level is not worth it.
* Add basic support for using the POPCNT and SSE4.2s LZCNT opcodes (Alvaro Herrera, 2019-02-13)
  These opcodes have been around in the AMD world since 2007, and 2008 in the case of Intel. They're supported in GCC and Clang via some __builtin macros. The opcodes may be unavailable during runtime, in which case we fall back on a C-based implementation of the code. In order to get the POPCNT instruction we must pass the -mpopcnt option to the compiler. We do this only for the pg_bitutils.c file.
  David Rowley (with fragments taken from a patch by Thomas Munro)
  Discussion: https://postgr.es/m/CAKJS1f9WTAGG1tPeJnD18hiQW5gAk59fQ6WK-vfdAKEHyRg2RA@mail.gmail.com
* Update copyright for 2019 (Bruce Momjian, 2019-01-02)
  Backpatch-through: certain files through 9.4
* Rename rbtree.c functions to use "rbt" prefix not "rb" prefix. (Tom Lane, 2018-11-06)
  The "rb" prefix is used by Ruby, so that our existing code results in name collisions that break plruby. We discussed ways to prevent that by adjusting dynamic linker options, but it seems that at best we'd move the pain to other cases. Renaming to avoid the collision is the only portable fix anyway. Fortunately, our rbtree code is not (yet?) widely used --- in core, there's only a single usage in GIN --- so it seems likely that we can get away with a rename.
  I chose to do this basically as s/rb/rbt/g, except for places where there already was a "t" after "rb". The patch could have been made smaller by only touching linker-visible symbols, but it would have resulted in oddly inconsistent-looking code. Better to make it look like "rbt" was the plan all along.
  Back-patch to v10. The rbtree.c code exists back to 9.5, but rb_iterate() which is the actual immediate source of pain was added in v10, so it seems like changing the names before that would have more risk than benefit.
  Per report from Pavel Raiskup.
  Discussion: https://postgr.es/m/4738198.8KVIIDhgEB@nb.usersys.redhat.com
* Remove incorrect comment in dshash.c. (Thomas Munro, 2018-10-29)
  Back-patch to 11.
  Author: Antonin Houska
  Discussion: https://postgr.es/m/8726.1540553521%40localhost
* Implement %m in src/port/snprintf.c, and teach elog.c to rely on that. (Tom Lane, 2018-09-26)
  I started out with the idea that we needed to detect use of %m format specs in contexts other than elog/ereport calls, because we couldn't rely on that working in *printf calls. But a better answer is to fix things so that it does work. Now that we're using snprintf.c all the time, we can implement %m in that and we've fixed the problem.
  This requires also adjusting our various printf-wrapping functions so that they ensure "errno" is preserved when they call snprintf.c.
  Remove elog.c's handmade implementation of %m, and let it rely on snprintf to support the feature. That should provide some performance gain, though I've not attempted to measure it.
  There are a lot of places where we could now simplify 'printf("%s", strerror(errno))' into 'printf("%m")', but I'm not in any big hurry to make that happen.
  Patch by me, reviewed by Michael Paquier
  Discussion: https://postgr.es/m/2975.1526862605@sss.pgh.pa.us
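  A sketch of the simplification this allows in code built with PostgreSQL's snprintf.c replacements (the helper is hypothetical; errno is assumed to have been set by a failed system call):

      #include "postgres_fe.h"

      #include <errno.h>
      #include <string.h>

      static void
      report_open_failure(const char *path)
      {
          /* The long-standing spelling: */
          fprintf(stderr, "could not open file \"%s\": %s\n",
                  path, strerror(errno));

          /* Now equivalent, since src/port/snprintf.c implements %m: */
          fprintf(stderr, "could not open file \"%s\": %m\n", path);
      }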
* doc: Update redirecting links (Peter Eisentraut, 2018-07-16)
  Update links that resulted in redirects. Most are changes from http to https, but there are also some other minor edits. (There are still some redirects where the target URL looks less elegant than the one we currently have. I have left those as is.)
* Add missing files to src/backend/lib/README. (Heikki Linnakangas, 2018-05-22)
  The README lists all the files available in the directory, along with short descriptions of each, but a few newly added ones were missing. While we're at it, reorder the list into alphabetical order.
  Author: Takeshi Ideriha
  Discussion: https://www.postgresql.org/message-id/4E72940DA2BF16479384A86D54D0988A56793487@G01JPEXMBKW04
* Post-feature-freeze pgindent run. (Tom Lane, 2018-04-26)
  Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us
* Fix non-portable use of round(). (Andres Freund, 2018-03-31)
  round() is from C99. Use rint() instead. There are behavioral differences between round() and rint(), but they should not matter to the Bloom filter optimal_k() function. We already assume POSIX behavior for rint(), so there is no question of rint() not using "rounds towards nearest" as its rounding mode.
  Cleanup from commit 51bc271790eb234a1ba4d14d3e6530f70de92ab5. Per buildfarm member thrips.
  Author: Peter Geoghegan
  Discussion: https://postgr.es/m/CAH2-Wzn76eCGUonARy-wrVtMHsf+4cvbK_oJAWTLfORTU5ki0w@mail.gmail.com
* Add Bloom filter implementation. (Andres Freund, 2018-03-31)
  A Bloom filter is a space-efficient, probabilistic data structure that can be used to test set membership. Callers will sometimes incur false positives, but never false negatives. The rate of false positives is a function of the total number of elements and the amount of memory available for the Bloom filter.
  Two classic applications of Bloom filters are cache filtering, and data synchronization testing. Any user of Bloom filters must accept the possibility of false positives as a cost worth paying for the benefit in space efficiency.
  This commit adds a test harness extension module, test_bloomfilter. It can be used to get a sense of how the Bloom filter implementation performs under varying conditions.
  This is infrastructure for the upcoming "heapallindexed" amcheck patch, which verifies the consistency of a heap relation against one of its indexes.
  Author: Peter Geoghegan
  Reviewed-By: Andrey Borodin, Michael Paquier, Thomas Munro, Andres Freund
  Discussion: https://postgr.es/m/CAH2-Wzm5VmG7cu1N-H=nnS57wZThoSDQU+F5dewx3o84M+jY=g@mail.gmail.com
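  A rough sketch of how a caller might use the lib/bloomfilter.h interface described above (the sizing, seed, and element here are made up; the exact parameter list is assumed from the committed header):

      #include "postgres.h"
      #include "lib/bloomfilter.h"

      static void
      filter_example(void)
      {
          /* room for ~10 million elements, using at most 4096 kB of memory */
          bloom_filter *filter = bloom_create(10 * 1000 * 1000, 4096, 0);
          const char *key = "some key";

          bloom_add_element(filter, (unsigned char *) key, strlen(key));

          /* false negatives are impossible; false positives are expected */
          if (!bloom_lacks_element(filter, (unsigned char *) key, strlen(key)))
              elog(DEBUG1, "key is probably present");

          bloom_free(filter);
      }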
* Minor clean-up in dshash.{c,h}. (Andres Freund, 2018-03-01)
  For consistency with other code that deals in numbers of buckets, the macro BUCKETS_PER_PARTITION should produce a value of type size_t. Also, fix a mention of an obsolete proposed name for dshash.c that appeared in a comment.
  Author: Thomas Munro, based on an observation from Amit Kapila
  Discussion: https://postgr.es/m/CAA4eK1%2BBOp5aaW3aHEkg5Bptf8Ga_BkBnmA-%3DXcAXShs0yCiYQ%40mail.gmail.com
* Remove some inappropriate #includes. (Tom Lane, 2018-02-16)
  Other header files should never #include postgres.h (nor postgres_fe.h, nor c.h), per project policy. Also, there's no need for any backend .c file to explicitly include elog.h or palloc.h, because postgres.h pulls those in already.
  Extracted from a larger patch by Kyotaro Horiguchi. The rest of the removals he suggests require more study, but these are no-brainers.
  Discussion: https://postgr.es/m/20180215.200447.209320006.horiguchi.kyotaro@lab.ntt.co.jp
* Update copyright for 2018 (Bruce Momjian, 2018-01-02)
  Backpatch-through: certain files through 9.3
* Rethink MemoryContext creation to improve performance. (Tom Lane, 2017-12-13)
  This patch makes a number of interrelated changes to reduce the overhead involved in creating/deleting memory contexts. The key ideas are:
  - Include the AllocSetContext header of an aset.c context in its first malloc request, rather than allocating it separately in TopMemoryContext. This means that we now always create an initial or "keeper" block in an aset, even if it never receives any allocation requests.
  - Create freelists in which we can save and recycle recently-destroyed asets (this idea is due to Robert Haas).
  - In the common case where the name of a context is a constant string, just store a pointer to it in the context header, rather than copying the string.
  The first change eliminates a palloc/pfree cycle per context, and also avoids bloat in TopMemoryContext, at the price that creating a context now involves a malloc/free cycle even if the context never receives any allocations. That would be a loser for some common usage patterns, but recycling short-lived contexts via the freelist eliminates that pain.
  Avoiding copying constant strings not only saves strlen() and strcpy() overhead, but is an essential part of the freelist optimization because it makes the context header size constant. Currently we make no attempt to use the freelist for contexts with non-constant names. (Perhaps someday we'll need to think harder about that, but in current usage, most contexts with custom names are long-lived anyway.)
  The freelist management in this initial commit is pretty simplistic, and we might want to refine it later --- but in common workloads that will never matter because the freelists will never get full anyway.
  To create a context with a non-constant name, one is now required to call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME option. AllocSetContextCreate becomes a wrapper macro, and it includes a test that will complain about non-string-literal context name parameters on gcc and similar compilers.
  An unfortunate side effect of making AllocSetContextCreate a macro is that one is now *required* to use the size parameter abstraction macros (ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of writing out individual size parameters no longer works unless you switch to AllocSetContextCreateExtended.
  Internally to the memory-context-related modules, the context creation APIs are simplified, removing the rather baroque original design whereby a context-type module called mcxt.c which then called back into the context-type module. That saved a bit of code duplication, but not much, and it prevented context-type modules from exercising control over the allocation of context headers.
  In passing, I converted the test-and-elog validation of aset size parameters into Asserts to save a few more cycles. The original thought was that callers might compute size parameters on the fly, but in practice nobody does that, so it's useless to expend cycles on checking those numbers in production builds. Also, mark the memory context method-pointer structs "const", just for cleanliness.
  Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
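  For illustration, the common constant-name pattern that stays on the fast path (the context name and the work done inside it are hypothetical):

      #include "postgres.h"
      #include "utils/memutils.h"

      static void
      do_some_work(void)
      {
          MemoryContext cxt = AllocSetContextCreate(CurrentMemoryContext,
                                                    "do_some_work context",
                                                    ALLOCSET_DEFAULT_SIZES);
          MemoryContext oldcxt = MemoryContextSwitchTo(cxt);

          /* ... palloc() freely here; everything is released below ... */

          MemoryContextSwitchTo(oldcxt);
          MemoryContextDelete(cxt);
      }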
* Allow to avoid NUL-byte management for stringinfos and use in format.c. (Andres Freund, 2017-10-11)
  In a lot of the places having appendBinaryStringInfo() maintain a trailing NUL byte wasn't actually meaningful, e.g. when appending an integer which can contain 0 in one of its bytes.
  Removing this yields some small speedup, but more importantly will be more consistent when providing faster variants of pq_sendint etc.
  Author: Andres Freund
  Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de
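  A sketch of the "no terminator" variant this introduces, roughly the pattern that faster pq_sendint-style helpers can build on (the helper below is hypothetical):

      #include "postgres.h"
      #include "lib/stringinfo.h"
      #include "port/pg_bswap.h"

      /* Append a 32-bit integer in network byte order, skipping the NUL upkeep. */
      static void
      append_uint32(StringInfo buf, uint32 value)
      {
          uint32      n32 = pg_hton32(value);

          appendBinaryStringInfoNT(buf, (const char *) &n32, sizeof(n32));
      }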
* Fix uninitialized variable in dshash.c. (Andres Freund, 2017-09-18)
  A bugfix for commit 8c0d7bafad36434cb08ac2c78e69ae72c194ca20. The code would have crashed if hashtable->size_log2 ever had the same value as hashtable->control->size_log2 by coincidence.
  Per Valgrind.
  Author: Thomas Munro
  Reported-By: Tomas Vondra
  Discussion: https://postgr.es/m/e72fb33c-4f31-f276-e972-263d9b59554d%402ndquadrant.com
* Remove pre-order and post-order traversal logic for red-black trees. (Tom Lane, 2017-09-10)
  This code isn't used, and there's no clear reason why anybody would ever want to use it. These traversal mechanisms don't yield a visitation order that is semantically meaningful for any external purpose, nor are they any faster or simpler than the left-to-right or right-to-left traversals. (In fact, some rough testing suggests they are slower :-(.) Moreover, these mechanisms are impossible to test in any arm's-length fashion; doing so requires knowledge of the red-black tree's internal implementation. Hence, let's just jettison them.
  Discussion: https://postgr.es/m/17735.1505003111@sss.pgh.pa.us
* Suppress compiler warnings in dshash.c. (Tom Lane, 2017-09-03)
  Some compilers complain, not unreasonably, about left-shifting an int32 "1" and then assigning the result to an int64. In practice I sure hope that this data structure never gets large enough that an overflow would actually occur; but let's cast the constant to the right type to avoid the hazard.
  In passing, fix a typo in dshash.h.
  Amit Kapila, adjusted as per comment from Thomas Munro.
  Discussion: https://postgr.es/m/CAA4eK1+5vfVMYtjK_NX8O3-42yM3o80qdqWnQzGquPrbq6mb+A@mail.gmail.com
* Consolidate the function pointer types used by dshash.c. (Andres Freund, 2017-08-24)
  Commit 8c0d7bafad36434cb08ac2c78e69ae72c194ca20 introduced dshash with hash and compare functions like DynaHash's, and also variants that take a user data pointer instead of size. Simplify the interface by merging them into a single pair of function pointer types that take both size and a user data pointer.
  Since it is anticipated that memcmp and tag_hash behavior will be a common requirement, provide wrapper functions dshash_memcmp and dshash_memhash that conform to the new function types.
  Author: Thomas Munro
  Reviewed-By: Andres Freund
  Discussion: https://postgr.es/m/20170823054644.efuzftxjpfi6wwqs%40alap3.anarazel.de
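  A rough sketch of how the wrappers slot into dshash_parameters (the entry type, field names, and tranche handling here are assumptions; a real caller would coordinate the tranche id and DSA area with its shared-memory setup):

      #include "postgres.h"
      #include "lib/dshash.h"
      #include "storage/lwlock.h"
      #include "utils/dsa.h"

      typedef struct counter_entry
      {
          uint32      key;            /* hash key must come first */
          int64       count;
      } counter_entry;

      /* memcmp/tag-hash style behavior via the new wrapper functions */
      static dshash_parameters counter_params = {
          .key_size = sizeof(uint32),
          .entry_size = sizeof(counter_entry),
          .compare_function = dshash_memcmp,
          .hash_function = dshash_memhash,
          /* .tranche_id is filled in at runtime below */
      };

      static dshash_table *
      create_counter_table(dsa_area *area)
      {
          counter_params.tranche_id = LWLockNewTrancheId();
          return dshash_create(area, &counter_params, NULL);
      }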
* Fix unlikely shared memory leak after failure in dshash_create(). (Andres Freund, 2017-08-24)
  Tidy-up for commit 8c0d7bafad36434cb08ac2c78e69ae72c194ca20, based on a complaint from Andres Freund.
  Author: Thomas Munro
  Reviewed-By: Andres Freund
  Discussion: https://postgr.es/m/20170823054644.efuzftxjpfi6wwqs%40alap3.anarazel.de
* Hash tables backed by DSA shared memory. (Andres Freund, 2017-08-22)
  Add general purpose chaining hash tables for DSA memory. Unlike DynaHash in shared memory mode, these hash tables can grow as required, and cope with being mapped into different addresses in different backends.
  There is a wide range of potential users for such a hash table, though it's very likely the interface will need to evolve as we come to understand the needs of different kinds of users. E.g. support for iterators and incremental resizing is planned for later commits, and the details of the callback signatures are likely to change.
  Author: Thomas Munro
  Reviewed-By: John Gorman, Andres Freund, Dilip Kumar, Robert Haas
  Discussion: https://postgr.es/m/CAEepm=3d8o8XdVwYT6O=bHKsKAM2pu2D6sV1S_=4d+jStVCE7w@mail.gmail.com
  Discussion: https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
* Phase 3 of pgindent updates. (Tom Lane, 2017-06-21)
  Don't move parenthesized lines to the left, even if that means they flow past the right margin.
  By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit.
  This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren.
  This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects.
  Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
  Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Phase 2 of pgindent updates. (Tom Lane, 2017-06-21)
  Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule.
  Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after.
  Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent.
  This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects.
  Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
  Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Initial pgindent run with pg_bsd_indent version 2.0. (Tom Lane, 2017-06-21)
  The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are:
  - Nicer formatting of function-pointer declarations.
  - No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof.
  - No longer wants to add a space in "struct structname *varname", as well as some similar cases for const- or volatile-qualified pointers.
  - Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely.
  - Fixes bug where comments following declarations were sometimes placed with no space separating them from the code.
  - Fixes some odd decisions for comments following case labels.
  - Fixes some cases where comments following code were indented to less than the expected column 33.
  On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself.
  There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical.
  Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
  Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Post-PG 10 beta1 pgindent run (Bruce Momjian, 2017-05-17)
  perltidy run not included.
* Revert "Permit dump/reload of not-too-large >1GB tuples"Alvaro Herrera2017-05-10
| | | | | | | | | | | | | | | | | | | This reverts commits fa2fa9955280 and 42f50cb8fa98. While the functionality that was intended to be provided by these commits is desired, the patch didn't actually solve as many of the problematic situations as we hoped, and it created a bunch of its own problems. Since we're going to require more extensive changes soon for other reasons and users have been working around these problems for a long time already, there is no point in spending effort in fixing this halfway measure. Per complaint from Tom Lane. Discussion: https://postgr.es/m/21407.1484606922@sss.pgh.pa.us (Commit fa2fa9955280 had already been reverted in branches 9.5 as f858524ee4f and 9.6 as e9e44a0953, so this touches master only. Commit 42f50cb8fa98 was not present in the older branches.)
* Support hashed aggregation with grouping sets. (Andrew Gierth, 2017-03-27)
  This extends the Aggregate node with two new features: HashAggregate can now run multiple hashtables concurrently, and a new strategy MixedAggregate populates hashtables while doing sorted grouping.
  The planner will now attempt to save as many sorts as possible when planning grouping sets queries, while not exceeding work_mem for the estimated combined sizes of all hashtables used. No SQL-level changes are required. There should be no user-visible impact other than the new EXPLAIN output and possible changes to result ordering when ORDER BY was not used (which affected a few regression tests). The enable_hashagg option is respected.
  Author: Andrew Gierth
  Reviewers: Mark Dilger, Andres Freund
  Discussion: https://postgr.es/m/87vatszyhj.fsf@news-spur.riddles.org.uk
* Fix overflow check in StringInfo; add missing casts (Alvaro Herrera, 2017-01-10)
  A few thinkos I introduced in fa2fa9955280. Also, amend a similarly broken comment.
  Report by Daniel Vérité.
  Authors: Daniel Vérité, Álvaro Herrera
  Discussion: https://postgr.es/m/1706e85e-60d2-494e-8a64-9af1e1b2186e@manitou-mail.org
* Update copyright via script for 2017 (Bruce Momjian, 2017-01-03)
* Permit dump/reload of not-too-large >1GB tuples (Alvaro Herrera, 2016-12-02)
  Our documentation states that our maximum field size is 1 GB, and that our maximum row size is 1.6 TB. However, while this might be attainable in theory with enough contortions, it is not workable in practice; for starters, pg_dump fails to dump tables containing rows larger than 1 GB, even if individual columns are well below the limit; and even if one does manage to manufacture a dump file containing a row that large, the server refuses to load it anyway.
  This commit enables dumping and reloading of such tuples, provided two conditions are met:
  1. no single column is larger than 1 GB (in output size -- for bytea this includes the formatting overhead)
  2. the whole row is not larger than 2 GB
  There are three related changes to enable this:
  a. StringInfo's API now has two additional functions that allow creating a string that grows beyond the typical 1GB limit (a "long" string). ABI compatibility is maintained. We still limit these strings to 2 GB, though, for reasons explained below.
  b. COPY now uses long StringInfos, so that pg_dump doesn't choke trying to emit rows longer than 1GB.
  c. heap_form_tuple now uses the MCXT_ALLOW_HUGE flag in its allocation for the input tuple, which means that large tuples are accepted on input. Note that at this point we do not apply any further limit to the input tuple size.
  The main reason to limit to 2 GB is that the FE/BE protocol uses 32 bit length words to describe each row; and because the documentation is ambiguous on its signedness and libpq does consider it signed, we cannot use the highest-order bit. Additionally, the StringInfo API uses "int" (which is 4 bytes wide in most platforms) in many places, so we'd need to change that API too in order to improve, which has lots of fallout.
  Backpatch to 9.5, which is the oldest that has MemoryContextAllocExtended, a necessary piece of infrastructure. We could apply to 9.4 with very minimal additional effort, but any further than that would require backpatching "huge" allocations too.
  This is the largest set of changes we could find that can be back-patched without breaking compatibility with existing systems. Fixing a bigger set of problems (for example, dumping tuples bigger than 2GB, or dumping fields bigger than 1GB) would require changing the FE/BE protocol and/or changing the StringInfo API in an ABI-incompatible way, neither of which would be back-patchable.
  Authors: Daniel Vérité, Álvaro Herrera
  Reviewed by: Tomas Vondra
  Discussion: https://postgr.es/m/20160229183023.GA286012@alvherre.pgsql
* Clarify the new Red-Black post-order traversal code a bit. (Heikki Linnakangas, 2016-09-04)
  Coverity complained about the for(;;) loop, because it never actually iterated. It was used just to be able to use "break" to exit it early. I agree with Coverity, that's a bit confusing, so refactor the code to use if-else instead.
  While we're at it, use a local variable to hold the "current" node. That's shorter and clearer than referring to "iter->last_visited" all the time.