aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils/adt/jsonb_gin.c
Commit message (Collapse)AuthorAge
* Update copyright for 2025Bruce Momjian2025-01-01
| | | | Backpatch-through: 13
* Remove unused #include's from backend .c filesPeter Eisentraut2024-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | as determined by include-what-you-use (IWYU) While IWYU also suggests to *add* a bunch of #include's (which is its main purpose), this patch does not do that. In some cases, a more specific #include replaces another less specific one. Some manual adjustments of the automatic result: - IWYU currently doesn't know about includes that provide global variable declarations (like -Wmissing-variable-declarations), so those includes are being kept manually. - All includes for port(ability) headers are being kept for now, to play it safe. - No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the patch from exploding in size. Note that this patch touches just *.c files, so nothing declared in header files changes in hidden ways. As a small example, in src/backend/access/transam/rmgr.c, some IWYU pragma annotations are added to handle a special case there. Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org
* Update copyright for 2024Bruce Momjian2024-01-03
| | | | | | | | Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12
* Add trailing commas to enum definitionsPeter Eisentraut2023-10-26
| | | | | | | | | | | | | | | | | | | | Since C99, there can be a trailing comma after the last value in an enum definition. A lot of new code has been introducing this style on the fly. Some new patches are now taking an inconsistent approach to this. Some add the last comma on the fly if they add a new last value, some are trying to preserve the existing style in each place, some are even dropping the last comma if there was one. We could nudge this all in a consistent direction if we just add the trailing commas everywhere once. I omitted a few places where there was a fixed "last" value that will always stay last. I also skipped the header files of libpq and ecpg, in case people want to use those with older compilers. There were also a small number of cases where the enum type wasn't used anywhere (but the enum values were), which ended up confusing pgindent a bit, so I left those alone. Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org
* Update copyright for 2023Bruce Momjian2023-01-02
| | | | Backpatch-through: 11
* Fix jsonb subscripting to cope with toasted subscript values.Tom Lane2022-12-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | jsonb_get_element() was incautious enough to use VARDATA() and VARSIZE() directly on an arbitrary text Datum. That of course fails if the Datum is short-header, compressed, or out-of-line. The typical result would be failing to match any element of a jsonb object, though matching the wrong one seems possible as well. setPathObject() was slightly brighter, in that it used VARDATA_ANY and VARSIZE_ANY_EXHDR, but that only kept it out of trouble for short-header Datums. push_path() had the same issue. This could result in faulty subscripted insertions, though keys long enough to cause a problem are likely rare in the wild. Having seen these, I looked around for unsafe usages in the rest of the adt/json* files. There are a couple of places where it's not immediately obvious that the Datum can't be compressed or out-of-line, so I added pg_detoast_datum_packed() to cope if it is. Also, remove some other usages of VARDATA/VARSIZE on Datums we just extracted from a text array. Those aren't actively broken, but they will become so if we ever start allowing short-header array elements, which does not seem like a terribly unreasonable thing to do. In any case they are not great coding examples, and they could also do with comments pointing out that we're assuming we don't need pg_detoast_datum_packed. Per report from exe-dealer@yandex.ru. Patch by me, but thanks to David Johnston for initial investigation. Back-patch to v14 where jsonb subscripting was introduced. Discussion: https://postgr.es/m/205321670615953@mail.yandex.ru
* Avoid using list_length() to test for empty list.Tom Lane2022-08-17
| | | | | | | | | | | | | | | | | | | | | | | | The standard way to check for list emptiness is to compare the List pointer to NIL; our list code goes out of its way to ensure that that is the only representation of an empty list. (An acceptable alternative is a plain boolean test for non-null pointer, but explicit mention of NIL is usually preferable.) Various places didn't get that memo and expressed the condition with list_length(), which might not be so bad except that there were such a variety of ways to check it exactly: equal to zero, less than or equal to zero, less than one, yadda yadda. In the name of code readability, let's standardize all those spellings as "list == NIL" or "list != NIL". (There's probably some microscopic efficiency gain too, though few of these look to be at all performance-critical.) A very small number of cases were left as-is because they seemed more consistent with other adjacent list_length tests that way. Peter Smith, with bikeshedding from a number of us Discussion: https://postgr.es/m/CAHut+PtQYe+ENX5KrONMfugf0q6NHg4hR5dAhqEXEc2eefFeig@mail.gmail.com
* Add construct_array_builtin, deconstruct_array_builtinPeter Eisentraut2022-07-01
| | | | | | | | | | | | | | | There were many calls to construct_array() and deconstruct_array() for built-in types, for example, when dealing with system catalog columns. These all hardcoded the type attributes necessary to pass to these functions. To simplify this a bit, add construct_array_builtin(), deconstruct_array_builtin() as wrappers that centralize this hardcoded knowledge. This simplifies many call sites and reduces the amount of hardcoded stuff that is spread around. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/2914356f-9e5f-8c59-2995-5997fc48bcba%40enterprisedb.com
* Update copyright for 2022Bruce Momjian2022-01-07
| | | | Backpatch-through: 10
* Update copyright for 2021Bruce Momjian2021-01-02
| | | | Backpatch-through: 9.5
* Introduce macros for typalign and typstorage constants.Tom Lane2020-03-04
| | | | | | | | | | | | | | | | | | | | | Our usual practice for "poor man's enum" catalog columns is to define macros for the possible values and use those, not literal constants, in C code. But for some reason lost in the mists of time, this was never done for typalign/attalign or typstorage/attstorage. It's never too late to make it better though, so let's do that. The reason I got interested in this right now is the need to duplicate some uses of the TYPSTORAGE constants in an upcoming ALTER TYPE patch. But in general, this sort of change aids greppability and readability, so it's a good idea even without any specific motivation. I may have missed a few places that could be converted, and it's even more likely that pending patches will re-introduce some hard-coded references. But that's not fatal --- there's no expectation that we'd actually change any of these values. We can clean up stragglers over time. Discussion: https://postgr.es/m/16457.1583189537@sss.pgh.pa.us
* Move src/backend/utils/hash/hashfn.c to src/commonRobert Haas2020-02-27
| | | | | | | | | | | | | | This also involves renaming src/include/utils/hashutils.h, which becomes src/include/common/hashfn.h. Perhaps an argument can be made for keeping the hashutils.h name, but it seemed more consistent to make it match the name of the file, and also more descriptive of what is actually going on here. Patch by me, reviewed by Suraj Kharage and Mark Dilger. Off-list advice on how not to break the Windows build from Davinder Singh and Amit Kapila. Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
* Update copyrights for 2020Bruce Momjian2020-01-01
| | | | Backpatch-through: update all files in master, backpatch legal files through 9.4
* Phase 2 pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | | Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* GIN support for @@ and @? jsonpath operatorsAlexander Korotkov2019-04-01
| | | | | | | | | | | | | | | | This commit makes existing GIN operator classes jsonb_ops and json_path_ops support "jsonb @@ jsonpath" and "jsonb @? jsonpath" operators. Basic idea is to extract statements of following form out of jsonpath. key1.key2. ... .keyN = const The rest of jsonpath is rechecked from heap. Catversion is bumped. Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com Author: Nikita Glukhov, Alexander Korotkov Reviewed-by: Jonathan Katz, Pavel Stehule
* Move hash_any prototype from access/hash.h to utils/hashutils.hAlvaro Herrera2019-03-11
| | | | | | | | | | | | | | | | | | | | | ... as well as its implementation from backend/access/hash/hashfunc.c to backend/utils/hash/hashfn.c. access/hash is the place for the hash index AM, not really appropriate for generic facilities, which is what hash_any is; having things the old way meant that anything using hash_any had to include the AM's include file, pointlessly polluting its namespace with unrelated, unnecessary cruft. Also move the HTEqual strategy number to access/stratnum.h from access/hash.h. To avoid breaking third-party extension code, add an #include "utils/hashutils.h" to access/hash.h. (An easily removed line by committers who enjoy their asbestos suits to protect them from angry extension authors.) Discussion: https://postgr.es/m/201901251935.ser5e4h6djt2@alvherre.pgsql
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Update copyright for 2018Bruce Momjian2018-01-02
| | | | Backpatch-through: certain files through 9.3
* Make DatumGetFoo/PG_GETARG_FOO/PG_RETURN_FOO macro names more consistent.Tom Lane2017-09-18
| | | | | | | | | | | | | | | | | | | | | By project convention, these names should include "P" when dealing with a pointer type; that is, if the result of a GETARG macro is of type FOO *, it should be called PG_GETARG_FOO_P not just PG_GETARG_FOO. Some newer types such as JSONB and ranges had not followed the convention, and a number of contrib modules hadn't gotten that memo either. Rename the offending macros to improve consistency. In passing, fix a few places that thought PG_DETOAST_DATUM() returns a Datum; it does not, it returns "struct varlena *". Applying DatumGetPointer to that happens not to cause any bad effects today, but it's formally wrong. Also, adjust an ltree macro that was designed without any thought for what pgindent would do with it. This is all cosmetic and shouldn't have any impact on generated code. Mark Dilger, some further tweaks by me Discussion: https://postgr.es/m/EA5676F4-766F-4F38-8348-ECC7DB427C6A@gmail.com
* Assume deconstruct_array() outputs are untoasted.Noah Misch2017-03-12
| | | | | | In functions that issue a deconstruct_array() call, consistently use plain VARSIZE()/VARDATA() on the array elements. Prior practice was divided between those and VARSIZE_ANY_EXHDR()/VARDATA_ANY().
* Move some things from builtins.h to new header filesPeter Eisentraut2017-01-20
| | | | This avoids that builtins.h has to include additional header files.
* Update copyright via script for 2017Bruce Momjian2017-01-03
|
* Update copyright for 2016Bruce Momjian2016-01-02
| | | | Backpatch certain files through 9.1
* Fix erroneous hash calculations in gin_extract_jsonb_path().Tom Lane2015-11-05
| | | | | | | | | | | | | | | | | | | | The jsonb_path_ops code calculated hash values inconsistently in some cases involving nested arrays and objects. This would result in queries possibly not finding entries that they should find, when using a jsonb_path_ops GIN index for the search. The problem cases involve JSONB values that contain both scalars and sub-objects at the same nesting level, for example an array containing both scalars and sub-arrays. To fix, reset the current stack->hash after processing each value or sub-object, not before; and don't try to be cute about the outermost level's initial hash. Correcting this means that existing jsonb_path_ops indexes may now be inconsistent with the new hash calculation code. The symptom is the same --- searches not finding entries they should find --- but the specific rows affected are likely to be different. Users will need to REINDEX jsonb_path_ops indexes to make sure that all searches work as expected. Per bug #13756 from Daniel Cheng. Back-patch to 9.4 where the faulty logic was introduced.
* Use JsonbIteratorToken consistently in automatic variable declarations.Noah Misch2015-10-11
| | | | | | Many functions stored JsonbIteratorToken values in variables of other integer types. Also, standardize order relative to other declarations. Expect compilers to generate the same code before and after this change.
* Move strategy numbers to include/access/stratnum.hAlvaro Herrera2015-05-15
| | | | | | | | | | | | | | | | | | | | For upcoming BRIN opclasses, it's convenient to have strategy numbers defined in a single place. Since there's nothing appropriate, create it. The StrategyNumber typedef now lives there, as well as existing strategy numbers for B-trees (from skey.h) and R-tree-and-friends (from gist.h). skey.h is forced to include stratnum.h because of the StrategyNumber typedef, but gist.h is not; extensions that currently rely on gist.h for rtree strategy numbers might need to add a new A few .c files can stop including skey.h and/or gist.h, which is a nice side benefit. Per discussion: https://www.postgresql.org/message-id/20150514232132.GZ2523@alvh.no-ip.org Authored by Emre Hasegeli and Álvaro. (It's not clear to me why bootscanner.l has any #include lines at all.)
* Update copyright for 2015Bruce Momjian2015-01-06
| | | | Backpatch certain files through 9.0
* Rename jsonb_hash_ops to jsonb_path_ops.Tom Lane2014-05-11
| | | | | | | | | | | | | There's no longer much pressure to switch the default GIN opclass for jsonb, but there was still some unhappiness with the name "jsonb_hash_ops", since hashing is no longer a distinguishing property of that opclass, and anyway it seems like a relatively minor detail. At the suggestion of Heikki Linnakangas, we'll use "jsonb_path_ops" instead; that captures the important characteristic that each index entry depends on the entire path from the document root to the indexed value. Also add a user-facing explanation of the implementation properties of these two opclasses.
* Improve key representation for GIN jsonb_ops, and fix existence-search bug.Tom Lane2014-05-09
| | | | | | | | | | | | | | | | | | | | | | Change the key representation so that values that would exceed 127 bytes are hashed into short strings, and so that the original JSON datatype of each value is recorded in the index. The hashing rule eliminates the major objection to having this opclass be the default for jsonb, namely that it could fail for plausible input data (due to GIN's restrictions on maximum key length). Preserving datatype information doesn't really buy us much right now, but it requires no extra space compared to the previous way, and it might be useful later. Also, change the consistency-checking functions to request recheck for exists (jsonb ? text) and related operators. The original analysis that this is an exactly checkable query was incorrect, since the index does not preserve information about whether a key appears at top level in the indexed JSON object. Add a test case demonstrating the problem. Make some other, mostly cosmetic improvements to the code in jsonb_gin.c as well. catversion bump due to on-disk data format change in jsonb_ops indexes.
* Clean up jsonb code.Heikki Linnakangas2014-05-07
| | | | | | | | | | | | | | | | | | | | | | | The main target of this cleanup is the convertJsonb() function, but I also touched a lot of other things that I spotted into in the process. The new convertToJsonb() function uses an output buffer that's resized on demand, so the code to estimate of the size of JsonbValue is removed. The on-disk format was not changed, even though I refactored the structs used to handle it. The term "superheader" is replaced with "container". The jsonb_exists_any and jsonb_exists_all functions no longer sort the input array. That was a premature optimization, the idea being that if there are duplicates in the input array, you only need to check them once. Also, sorting the array saves some effort in the binary search used to find a key within an object. But there were drawbacks too: the sorting and deduplicating obviously isn't free, and in the typical case there are no duplicates to remove, and the gain in the binary search was minimal. Remove all that, which makes the code simpler too. This includes a bug-fix; the total length of the elements in a jsonb array or object mustn't exceed 2^28. That is now checked.
* Fix some more confusion between uint32 and Datum.Tom Lane2014-05-06
|
* pgindent run for 9.4Bruce Momjian2014-05-06
| | | | | This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
* De-anonymize the union in JsonbValue.Tom Lane2014-04-02
| | | | Needed for strict C89 compliance.
* Rename GinLogicValue to GinTernaryValue.Heikki Linnakangas2014-03-31
| | | | | It's more descriptive. Also, get rid of the enum, and use #defines instead, per Greg Stark's suggestion.
* Introduce jsonb, a structured format for storing json.Andrew Dunstan2014-03-23
The new format accepts exactly the same data as the json type. However, it is stored in a format that does not require reparsing the orgiginal text in order to process it, making it much more suitable for indexing and other operations. Insignificant whitespace is discarded, and the order of object keys is not preserved. Neither are duplicate object keys kept - the later value for a given key is the only one stored. The new type has all the functions and operators that the json type has, with the exception of the json generation functions (to_json, json_agg etc.) and with identical semantics. In addition, there are operator classes for hash and btree indexing, and two classes for GIN indexing, that have no equivalent in the json type. This feature grew out of previous work by Oleg Bartunov and Teodor Sigaev, which was intended to provide similar facilities to a nested hstore type, but which in the end proved to have some significant compatibility issues. Authors: Oleg Bartunov, Teodor Sigaev, Peter Geoghegan and Andrew Dunstan. Review: Andres Freund