path: root/src/backend/utils/adt/jsonpath_scan.l
Commit message (Author, Age)
* flex code modernization: Replace YY_EXTRA_TYPE define with flex option (Peter Eisentraut, 2025-01-06)
  Replace #define YY_EXTRA_TYPE with %option extra-type. The latter is the way recommended by the flex manual (available since flex 2.5.34).
  Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
  Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
* Update copyright for 2025 (Bruce Momjian, 2025-01-01)
  Backpatch-through: 13
* Partial pgindent of .l and .y files (Peter Eisentraut, 2024-12-25)
  Trying to clean up the code a bit while we're working on these files for the reentrant scanner/pure parser patches. This cleanup only touches the code sections after the second '%%' in each file, via a manually-supervised and locally hacked up pgindent.
* jsonpath scanner: reentrant scanner (Peter Eisentraut, 2024-12-24)
  Use the flex %option reentrant to make the generated scanner reentrant and thread-safe. Note: The parser was already pure.
  Simplify flex scan buffer management: Instead of constructing the buffer from pieces and then using yy_scan_buffer(), we can just use yy_scan_string(), which does the same thing internally. (Actually, we use yy_scan_bytes() here because we already have the length.)
  Use flex yyextra to handle context information, instead of global variables. This complements the other changes to make the scanner reentrant.
  Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
  Reviewed-by: Andreas Karlsson <andreas@proxel.se>
  Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
* Small indenting fixes in jsonpath_scan.l (Peter Eisentraut, 2024-11-29)
  Some lines were indented by an inconsistent number of spaces. While we're here, also fix some code that used the newline after left parenthesis style, which is obsolete.
* Remove useless casts to (void *) (Peter Eisentraut, 2024-11-28)
  Many of them just seem to have been copied around for no real reason. Their presence causes (small) risks of hiding actual type mismatches or silently discarding qualifiers.
  Discussion: https://www.postgresql.org/message-id/flat/461ea37c-8b58-43b4-9736-52884e862820@eisentraut.org
* Implement various jsonpath methods (Andrew Dunstan, 2024-01-25)
  This commit implements the jsonpath .bigint(), .boolean(), .date(), .decimal([precision [, scale]]), .integer(), .number(), .string(), .time(), .time_tz(), .timestamp(), and .timestamp_tz() methods.
  .bigint() converts the given JSON string or a numeric value to the bigint type representation.
  .boolean() converts the given JSON string, numeric, or boolean value to the boolean type representation. In the numeric case, only integers are allowed. We use the parse_bool() backend function to convert a string to a bool.
  .decimal([precision [, scale]]) converts the given JSON string or a numeric value to the numeric type representation. If precision and scale are provided for .decimal(), then it is converted to the equivalent numeric typmod and applied to the numeric number.
  .integer() and .number() convert the given JSON string or a numeric value to the int4 and numeric type representation.
  .string() uses the datatype's output function to convert numeric and various date/time types to the string representation.
  The JSON string representing a valid date/time is converted to the specific date or time type representation using jsonpath .date(), .time(), .time_tz(), .timestamp(), .timestamp_tz() methods. The changes use the infrastructure of the .datetime() method and perform the datatype conversion as appropriate. Unlike the .datetime() method, none of these methods accept a format template and use ISO DateTime format instead. However, except for .date(), the date/time related methods take an optional precision to adjust the fractional seconds.
  Jeevan Chalke, reviewed by Peter Eisentraut and Andrew Dunstan.
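The typmod step described for .decimal([precision [, scale]]) can be pictured with a small Python sketch. This is not the PostgreSQL implementation: the function name apply_decimal is hypothetical, the scale is assumed to default to 0 when only a precision is given (as with SQL numeric(precision)), and ties are assumed to round away from zero.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def apply_decimal(value: str, precision: Optional[int] = None, scale: int = 0) -> Decimal:
    """Convert numeric text and, if a precision is given, apply a numeric-style typmod.

    Hypothetical helper for illustration only.
    """
    number = Decimal(value)
    if precision is None:
        # No typmod requested: keep the value exactly as parsed.
        return number
    # Round to `scale` fractional digits, ties away from zero (assumed).
    rounded = number.quantize(Decimal(1).scaleb(-scale), rounding=ROUND_HALF_UP)
    # At most (precision - scale) digits may remain before the decimal point.
    integer_digits = max(rounded.adjusted() + 1, 0) if rounded != 0 else 0
    if integer_digits > precision - scale:
        raise ValueError(f"{value} does not fit precision {precision}, scale {scale}")
    return rounded


if __name__ == "__main__":
    print(apply_decimal("123.456", 5, 2))
    print(apply_decimal("12.5", 3))
```

Calling apply_decimal("1234.5", 5, 2) raises ValueError in this sketch, because four integer digits do not fit in the three that precision 5 and scale 2 leave available.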
* Update copyright for 2024 (Bruce Momjian, 2024-01-03)
  Reported-by: Michael Paquier
  Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz
  Backpatch-through: 12
* Message wording improvements (Peter Eisentraut, 2023-07-10)
* SQL JSON path enhanced numeric literals (Peter Eisentraut, 2023-03-05)
  Add support for non-decimal integer literals and underscores in numeric literals to SQL JSON path language. This follows the rules of ECMAScript, as referred to by the SQL standard. Internally, all the numeric literal parsing of jsonpath goes through numeric_in, which already supports all this, so this patch is just a bit of lexer work and some tests and documentation.
  Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
  Discussion: https://www.postgresql.org/message-id/flat/b11b25bb-6ec1-d42f-cedd-311eae59e1fb@enterprisedb.com
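As a rough illustration of the lexical rules this entry refers to, the following Python sketch accepts ECMAScript-style integer literals: hexadecimal, octal, and binary prefixes, with single underscores allowed only between digits. The pattern and the function names are assumptions made for this example, not the definitions used in jsonpath_scan.l.

```python
import re

# Hypothetical pattern: ECMAScript-style integer literals with digit separators.
_INTEGER = re.compile(
    r"""
    (?: 0[xX][0-9A-Fa-f](?:_?[0-9A-Fa-f])*   # hexadecimal
      | 0[oO][0-7](?:_?[0-7])*               # octal
      | 0[bB][01](?:_?[01])*                 # binary
      | 0                                    # a lone zero
      | [1-9](?:_?[0-9])*                    # decimal without a leading zero
    )
    """,
    re.VERBOSE,
)


def is_integer_literal(text: str) -> bool:
    """Return True if the whole text is one well-formed integer literal."""
    return _INTEGER.fullmatch(text) is not None


def parse_integer_literal(text: str) -> int:
    """Validate the literal, then drop the underscores and convert it."""
    if not is_integer_literal(text):
        raise ValueError(f"invalid integer literal: {text!r}")
    # Base 0 lets int() pick the radix from the 0x / 0o / 0b prefix.
    return int(text.replace("_", ""), 0)


if __name__ == "__main__":
    for literal in ("0x1F", "0o17", "0b101", "1_000_000"):
        print(literal, "->", parse_integer_literal(literal))
```

In this sketch, a doubled, leading, or trailing underscore, or one placed directly after the radix prefix, makes the literal invalid.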
* Update copyright for 2023 (Bruce Momjian, 2023-01-02)
  Backpatch-through: 11
* Convert jsonpath's input function to report errors softly (Andrew Dunstan, 2022-12-24)
  Reviewed by Tom Lane
  Discussion: https://postgr.es/m/a8dc5700-c341-3ba8-0507-cc09881e6200@dunslane.net
* Harmonize more lexer function parameter names. (Peter Geoghegan, 2022-09-22)
  Make sure that function declarations use names that exactly match the corresponding names from function definitions for several "lexer adjacent" backend functions. These were missed by commit aab06442.
  Author: Peter Geoghegan <pg@bowt.ie>
  Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com
* Build all Flex files standalone (John Naylor, 2022-09-04)
  The proposed Meson build system will need a way to ignore certain generated files in order to coexist with the autoconf build system, and C files generated by Flex which are #include'd into .y files make this more difficult. In similar vein to 72b1e3a21, arrange for all Flex C files to compile to their own .o targets.
  Reviewed by Andres Freund
  Discussion: https://www.postgresql.org/message-id/20220810171935.7k5zgnjwqzalzmtm%40awork3.anarazel.de
  Discussion: https://www.postgresql.org/message-id/CAFBsxsF8Gc2StS3haXofshHCzqNMRXiSxvQEYGwnFsTmsdwNeg@mail.gmail.com
* Indent C code in flex and bison files (Peter Eisentraut, 2022-05-13)
  In the style of pgindent, done semi-manually.
  Discussion: https://www.postgresql.org/message-id/flat/7d062ecc-7444-23ec-a159-acd8adf9b586%40enterprisedb.com
* Make JSON path numeric literals more correct (Peter Eisentraut, 2022-03-28)
  Per ECMAScript standard (ECMA-262, referenced by SQL standard), the syntax forms .1 and 1. should be allowed for decimal numeric literals, but the existing implementation rejected them. Also, by the same standard, reject trailing junk after numeric literals.
  Note that the ECMAScript standard for numeric literals is in respects like these slightly different from the JSON standard, which might be the original cause for this discrepancy.
  A change is that this kind of syntax is now rejected: 1.type(). This needs to be written as (1).type(). This is correct; normal JavaScript also does not accept this syntax.
  We also need to fix up the jsonpath output function for this case. We put parentheses around numeric items if they are followed by another path item.
  Reviewed-by: Nikita Glukhov <n.gluhov@postgrespro.ru>
  Discussion: https://www.postgresql.org/message-id/flat/50a828cc-0a00-7791-7883-2ed06dfb2dbb@enterprisedb.com
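The effect of these rules on inputs such as 1.type() can be shown with a short Python sketch. It assumes the ECMAScript rule that a numeric literal must not be immediately followed by an identifier-start character or a digit. The function name scan_number and the choice to return None on rejection are hypothetical; the real lexer reports a syntax error instead.

```python
import re
from typing import Optional, Tuple

# Decimal literal in the ECMAScript style: "1", "1.", "1.5", ".5", each with
# an optional exponent. A bare "." is not a number.
_DECIMAL = re.compile(
    r"(?:(?:0|[1-9][0-9]*)(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?"
)
# Characters that count as "trailing junk" right after a literal (assumed set).
_JUNK = re.compile(r"[A-Za-z_$0-9]")


def scan_number(text: str, pos: int = 0) -> Optional[Tuple[str, int]]:
    """Return (literal, end offset), or None where a lexer would raise an error."""
    match = _DECIMAL.match(text, pos)
    if match is None:
        return None
    end = match.end()
    if end < len(text) and _JUNK.match(text, end):
        # For "1.type()" the literal is "1." and the junk starts at "t".
        return None
    return match.group(), end


if __name__ == "__main__":
    for source in (".1", "1.", "1.5e3 + 2", "1.type()", "01"):
        print(repr(source), "->", scan_number(source))
```

Scanning "(1).type()" from offset 1 succeeds in this sketch because the literal 1 is followed by a closing parenthesis rather than by an identifier character.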
* Update copyright for 2022 (Bruce Momjian, 2022-01-07)
  Backpatch-through: 10
* Update copyright for 2021 (Bruce Momjian, 2021-01-02)
  Backpatch-through: 9.5
* Add lcov exclusion markers to jsonpath scanner (Peter Eisentraut, 2020-05-26)
  This was done for all scanners in 421167362242ce1fb46d6d720798787e7cd65aad but not added to the new one.
* Allow Unicode escapes in any server encoding, not only UTF-8. (Tom Lane, 2020-03-06)
  SQL includes provisions for numeric Unicode escapes in string literals and identifiers. Previously we only accepted those if they represented ASCII characters or the server encoding was UTF-8, making the conversion to internal form trivial. This patch adjusts things so that we'll call the appropriate encoding conversion function in less-trivial cases, allowing the escape sequence to be accepted so long as it corresponds to some character available in the server encoding.
  This also applies to processing of Unicode escapes in JSONB. However, the old restriction still applies to client-side JSON processing, since that hasn't got access to the server's encoding conversion infrastructure.
  This patch includes some lexer infrastructure that simplifies throwing errors with error cursors pointing into the middle of a string (or other complex token). For the moment I only used it for errors relating to Unicode escapes, but we might later expand the usage to some other cases.
  Patch by me, reviewed by John Naylor.
  Discussion: https://postgr.es/m/2393.1578958316@sss.pgh.pa.us
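The acceptance rule described here (an escape is fine as long as the character exists in the server encoding) can be sketched in Python. Python codec names stand in for server encodings, and both function names are hypothetical. The sketch only handles \uXXXX escapes and ignores escaped backslashes, so it is an illustration rather than a faithful lexer.

```python
import re

_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def decode_unicode_escapes(text: str, server_encoding: str) -> bytes:
    """Expand \\uXXXX escapes, then convert the result to the server encoding.

    Raises UnicodeError if an escape is an unpaired surrogate or names a
    character the target encoding cannot represent.
    """
    expanded = _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
    # A UTF-16 round trip joins surrogate pairs into one code point and
    # fails on unpaired surrogates.
    combined = expanded.encode("utf-16", "surrogatepass").decode("utf-16")
    return combined.encode(server_encoding)


def is_representable(text: str, server_encoding: str) -> bool:
    """Return True if every escape maps to a character in the encoding."""
    try:
        decode_unicode_escapes(text, server_encoding)
    except UnicodeError:
        return False
    return True


if __name__ == "__main__":
    print(decode_unicode_escapes(r"caf\u00e9", "latin-1"))
    print(is_representable(r"\u0416", "latin-1"))
```

Here \u00e9 is accepted for a Latin-1 target because that encoding contains the character, while \u0416 (a Cyrillic letter) is rejected for Latin-1 but accepted for UTF-8.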
* Update copyrights for 2020 (Bruce Momjian, 2020-01-01)
  Backpatch-through: update all files in master, backpatch legal files through 9.4
* Implement jsonpath .datetime() method (Alexander Korotkov, 2019-09-25)
  This commit implements the jsonpath .datetime() method as specified in the SQL/JSON standard. There are no-argument and single-argument versions of this method. The no-argument version selects the first of the ISO datetime formats matching the input string. The single-argument version accepts a template string as its argument.
  In addition to the .datetime() method itself, this commit also implements comparison of the resulting date and time values. There is some difficulty because the existing jsonb_path_*() functions are immutable, while comparison of timezoned and non-timezoned types involves the current timezone. First, the current timezone can be changed within a session. Moreover, timezones themselves are not immutable and can be updated. This is why we let the existing immutable functions throw errors on such non-immutable comparisons. At the same time, this commit provides jsonb_path_*_tz() functions, which are stable and support operations involving timezones. As new functions are added to the system catalog, catversion is bumped.
  Support of the .datetime() method was the only blocker preventing T832 from being marked as supported. sql_features.txt is updated correspondingly.
  Extracted from original patch by Nikita Glukhov, Teodor Sigaev, Oleg Bartunov. Heavily revised by me. Comments were adjusted by Liudmila Mantrova.
  Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
  Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
  Author: Alexander Korotkov, Nikita Glukhov, Teodor Sigaev, Oleg Bartunov, Liudmila Mantrova
  Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
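Two ideas from this entry lend themselves to a small Python sketch: trying ISO formats in order until one matches, and refusing comparisons that would need the session timezone unless the caller opts in. The format list, the function names, and the use_tz flag are assumptions for illustration and do not reproduce the PostgreSQL format list or error messages.

```python
from datetime import datetime, timezone, tzinfo
from typing import Tuple

# Illustrative candidates, tried in order; not the list PostgreSQL uses.
_ISO_FORMATS = [
    ("date", "%Y-%m-%d"),
    ("timetz", "%H:%M:%S%z"),
    ("time", "%H:%M:%S"),
    ("timestamptz", "%Y-%m-%d %H:%M:%S%z"),
    ("timestamp", "%Y-%m-%d %H:%M:%S"),
]


def parse_datetime_no_arg(text: str) -> Tuple[str, datetime]:
    """Return the first (type name, value) whose format matches the input."""
    for type_name, fmt in _ISO_FORMATS:
        try:
            return type_name, datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"datetime format is not recognized: {text!r}")


def compare(a: datetime, b: datetime, use_tz: bool = False,
            session_tz: tzinfo = timezone.utc) -> int:
    """Three-way comparison; mixing zoned and unzoned values needs use_tz=True."""
    a_zoned, b_zoned = a.tzinfo is not None, b.tzinfo is not None
    if a_zoned != b_zoned:
        if not use_tz:
            # The result would depend on the session timezone, so a function
            # declared immutable must refuse it.
            raise ValueError("comparison requires the session time zone")
        if not a_zoned:
            a = a.replace(tzinfo=session_tz)
        if not b_zoned:
            b = b.replace(tzinfo=session_tz)
    return (a > b) - (a < b)


if __name__ == "__main__":
    print(parse_datetime_no_arg("2019-09-25 12:34:56+0300")[0])
```

The use_tz=True path plays the role of the jsonb_path_*_tz() variants in this sketch: the answer may change with session_tz, which is why it cannot be offered by an immutable function.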
* Fix some minor spec-compliance issues in jsonpath lexer. (Tom Lane, 2019-09-20)
  Although the SQL/JSON tech report makes reference to ECMAScript which allows both single- and double-quoted strings, all the rest of the report speaks only of double-quoted string literals in jsonpaths. That's more compatible with JSON itself; moreover single-quoted strings are hard to use inside a jsonpath that is itself a single-quoted SQL literal. So guess that the intent is to allow only double-quoted literals, and remove lexer support for single-quoted literals. It'll be less painful to add this again later if we're wrong, than to remove a shipped feature.
  Also, adjust the lexer so that unrecognized backslash sequences are treated as just meaning the escaped character, not as errors. This change has much better support in the standards, as JSON, JavaScript and ECMAScript all make it plain that that's what's supposed to happen.
  Back-patch to v12.
  Discussion: https://postgr.es/m/CAPpHfdvDci4iqNF9fhRkTqhe-5_8HmzeLt56drH%2B_Rv2rNRqfg@mail.gmail.com
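The backslash rule in the second paragraph can be sketched in Python as follows. The table of recognized escapes and the function name unescape are assumptions for this example, and \u and \x escapes are left out for brevity.

```python
# Escapes with a special meaning in this sketch (an assumed, partial set).
_KNOWN_ESCAPES = {"b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}


def unescape(body: str) -> str:
    """Decode the body of a double-quoted string literal.

    An unrecognized sequence such as \\q simply yields the escaped
    character ("q") instead of raising an error.
    """
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            # Known escapes map to control characters; anything else,
            # including a quote, slash, or backslash, stands for itself.
            out.append(_KNOWN_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            # A lone trailing backslash is kept as-is in this sketch.
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(unescape(r"a\qb"))
```

Under the earlier behaviour this entry describes, a sequence like \q would have been reported as an error rather than decoded.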
* Fix typos. (Amit Kapila, 2019-05-26)
  Reported-by: Alexander Lakhin
  Author: Alexander Lakhin
  Reviewed-by: Amit Kapila and Tom Lane
  Discussion: https://postgr.es/m/7208de98-add8-8537-91c0-f8b089e2928c@gmail.com
* More message style fixes (Alvaro Herrera, 2019-05-16)
  Discussion: https://postgr.es/m/20190515183005.GA26486@alvherre.pgsql
* Improve error reporting in jsonpath (Alexander Korotkov, 2019-05-08)
  This commit contains multiple improvements to error reporting in jsonpath, including but not limited to getting rid of the following things:
   * definition of error messages in macros,
   * errdetail() when valuable information could fit into errmsg(),
   * the word "singleton", which is not properly explained anywhere,
   * line breaks in error messages.
  Reported-by: Tom Lane
  Discussion: https://postgr.es/m/14890.1555523005%40sss.pgh.pa.us
  Author: Alexander Korotkov
  Reviewed-by: Tom Lane
* Minor jsonpath fixes. (Tom Lane, 2019-04-17)
  Restore missed "make clean" rule, fix misspelling.
  John Naylor
  Discussion: https://postgr.es/m/CACPNZCt5B8jDCCGQiFoSuqmg-za_NCy4QDioBTLaNRih9+-bXg@mail.gmail.com
* Restrict some cases in parsing numerics in jsonpath (Alexander Korotkov, 2019-04-01)
  Jsonpath now accepts integers with leading zeroes and floats starting with a dot. However, the SQL standard requires following the JSON specification, which allows neither of these cases. Our json[b] datatypes also restrict that. So, restrict it in jsonpath as well.
  Author: Nikita Glukhov
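For comparison with the JSON specification this entry refers to, here is a Python sketch of the JSON number grammar without the sign, on the assumption that a leading minus is treated as a unary operator rather than as part of the literal. The function name is hypothetical. Note that the 2022-03-28 entry in this log later moved jsonpath to the ECMAScript rules, under which .5 and 1. are accepted again.

```python
import re

# Unsigned JSON number: no leading zeroes, and digits are required on both
# sides of the decimal point.
_JSON_NUMBER = re.compile(r"(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")


def is_json_number(text: str) -> bool:
    """Return True if the whole text is a valid unsigned JSON number."""
    return _JSON_NUMBER.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ("0.5", "10", "1e3", "01", ".5", "1."):
        print(candidate, "->", is_json_number(candidate))
```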
* Get rid of backtracking in jsonpath_scan.l (Alexander Korotkov, 2019-03-25)
  Non-backtracking flex parsers work faster than backtracking ones. So, this commit gets rid of backtracking in jsonpath_scan.l. That required explicit handling of some cases as well as manual backtracking for some cases. More regression tests for numerics are added.
  Discussion: https://mail.google.com/mail/u/0?ik=a20b091faa&view=om&permmsgid=msg-f%3A1628425344167939063
  Author: John Naylor, Nikita Glukhov, Alexander Korotkov
* Cosmetic changes for jsonpath_gram.y and jsonpath_scan.l (Alexander Korotkov, 2019-03-25)
  This commit includes formatting improvements, renamings and comments. Also, it makes jsonpath_scan.l more uniform with our other lexers. Firstly, state names are renamed to shorter alternatives. Secondly, the <INITIAL> prefix is removed from the rules. The corresponding rules are moved to the tail, so they still apply only in the initial state.
  Author: Alexander Korotkov
  Reviewed-by: John Naylor
* Get rid of jsonpath_gram.h and jsonpath_scanner.h (Alexander Korotkov, 2019-03-20)
  Jsonpath grammar and scanner are both quite small. It isn't worth the complexity to compile them separately. This commit makes grammar and scanner be compiled at once. Therefore, jsonpath_gram.h and jsonpath_scanner.h are no longer needed. This commit also does some reorganization of code in jsonpath_gram.y.
  Discussion: https://postgr.es/m/d47b2023-3ecb-5f04-d253-d557547cf74f%402ndQuadrant.com
* Rename typedef in jsonpath_gram.y from "string" to "JsonPathString" (Alexander Korotkov, 2019-03-19)
  Reason is the same as in 75c57058b0.
* Rename typedef in jsonpath_scan.l from "keyword" to "JsonPathKeyword" (Alexander Korotkov, 2019-03-19)
  Typedef names should be unique and should not collide with variable names across all the sources. That makes both pgindent and debuggers happy.
  Discussion: https://postgr.es/m/23865.1552936099%40sss.pgh.pa.us
* Fix whitespace (Peter Eisentraut, 2019-03-19)
* Apply const qualifier to keywords of jsonpath_scan.l (Alexander Korotkov, 2019-03-17)
  Discussion: https://postgr.es/m/CAEeOP_a-Pfy%3DU9-f%3DgQ0AsB8FrxrC8xCTVq%2BeO71-2VoWP5cag%40mail.gmail.com
  Author: Mark G
* Partial implementation of SQL/JSON path language (Alexander Korotkov, 2019-03-16)
  The SQL 2016 standard, among other things, contains a set of SQL/JSON features for JSON processing inside a relational database. The core of SQL/JSON is the JSON path language, which allows accessing parts of JSON documents and making computations over them. This commit implements partial support for the JSON path language as a separate datatype called "jsonpath". The implementation is partial because it lacks datetime support and suppression of numeric errors. Missing features will be added later by separate commits.
  Support of SQL/JSON features requires implementation of separate nodes, and it will be considered in subsequent patches. This commit includes the following set of plain functions, which allow executing jsonpath over jsonb values:
   * jsonb_path_exists(jsonb, jsonpath[, jsonb, bool]),
   * jsonb_path_match(jsonb, jsonpath[, jsonb, bool]),
   * jsonb_path_query(jsonb, jsonpath[, jsonb, bool]),
   * jsonb_path_query_array(jsonb, jsonpath[, jsonb, bool]),
   * jsonb_path_query_first(jsonb, jsonpath[, jsonb, bool]).
  This commit also implements "jsonb @? jsonpath" and "jsonb @@ jsonpath", which are wrappers over jsonpath_exists(jsonb, jsonpath) and jsonpath_predicate(jsonb, jsonpath) respectively. These operators will have index support (implemented in subsequent patches).
  Catversion bumped, to add new functions and operators.
  Code was written by Nikita Glukhov and Teodor Sigaev, revised by me. Documentation was written by Oleg Bartunov and Liudmila Mantrova. The work was inspired by Oleg Bartunov.
  Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
  Author: Nikita Glukhov, Teodor Sigaev, Alexander Korotkov, Oleg Bartunov, Liudmila Mantrova
  Reviewed-by: Tomas Vondra, Andrew Dunstan, Pavel Stehule, Alexander Korotkov