aboutsummaryrefslogtreecommitdiff
path: root/src/backend/parser/scan.l
Commit message (Collapse)AuthorAge
* Change the way UESCAPE is lexed, to reduce the size of the flex tables.Heikki Linnakangas2013-03-14
| | | | | | | The error rule used to avoid backtracking with the U&'...' UESCAPE 'x' syntax bloated the flex tables, so refactor that. This patch makes the error rule shorter, by introducing a new exclusive flex state that's entered after parsing U&'...'. This shrinks the postgres binary by about 220kB.
* Improve handling of ereport(ERROR) and elog(ERROR).Tom Lane2013-01-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 71450d7fd6c7cf7b3e38ac56e363bff6a681973c, we added code to inform suitably-intelligent compilers that ereport() doesn't return if the elevel is ERROR or higher. This patch extends that to elog(), and also fixes a double-evaluation hazard that the previous commit created in ereport(), as well as reducing the emitted code size. The elog() improvement requires the compiler to support __VA_ARGS__, which should be available in just about anything nowadays since it's required by C99. But our minimum language baseline is still C89, so add a configure test for that. The previous commit assumed that ereport's elevel could be evaluated twice, which isn't terribly safe --- there are already counterexamples in xlog.c. On compilers that have __builtin_constant_p, we can use that to protect the second test, since there's no possible optimization gain if the compiler doesn't know the value of elevel. Otherwise, use a local variable inside the macros to prevent double evaluation. The local-variable solution is inferior because (a) it leads to useless code being emitted when elevel isn't constant, and (b) it increases the optimization level needed for the compiler to recognize that subsequent code is unreachable. But it seems better than not teaching non-gcc compilers about unreachability at all. Lastly, if the compiler has __builtin_unreachable(), we can use that instead of abort(), resulting in a noticeable code savings since no function call is actually emitted. However, it seems wise to do this only in non-assert builds. In an assert build, continue to use abort(), so that the behavior will be predictable and debuggable if the "impossible" happens. These changes involve making the ereport and elog macros emit do-while statement blocks not just expressions, which forces small changes in a few call sites. Andres Freund, Tom Lane, Heikki Linnakangas
* Update copyrights for 2013Bruce Momjian2013-01-01
| | | | | Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
* Update copyright notices for year 2012.Bruce Momjian2012-01-01
|
* Add makefile rules to check for backtracking in backend and psql lexers.Tom Lane2011-08-25
| | | | | Per discussion, we should enforce the policy of "no backtracking" in these performance-sensitive scanners.
* Stamp copyrights for year 2011.Bruce Momjian2011-01-01
|
* Remove useless whitespace at end of linesPeter Eisentraut2010-11-23
|
* Remove cvs keywords from all files.Magnus Hagander2010-09-20
|
* Change the default value of standard_conforming_strings to on.Robert Haas2010-07-20
| | | | | This change should be publicized to driver maintainers at once and release-noted as an incompatibility with previous releases.
* Change the notation for calling functions with named parameters fromTom Lane2010-05-30
| | | | | | | | | | | | | "val AS name" to "name := val", as per recent discussion. This patch catches everything in the original named-parameters patch, but I'm not certain that no other dependencies snuck in later (grepping the source tree for all uses of AS soon proved unworkable). In passing I note that we've dropped the ball at least once on keeping ecpg's lexer (as opposed to parser) in sync with the backend. It would be a good idea to go through all of pgc.l and see if it's in sync now. I didn't attempt that at the moment.
* Fix unportable use of isxdigit() with char (rather than unsigned char)Tom Lane2010-01-16
| | | | | argument, per warnings from buildfarm member pika. Also clean up code formatting a trifle.
* Update copyright for the year 2010.Bruce Momjian2010-01-02
|
* Remove plpgsql's separate lexer (finally!), in favor of using the core lexerTom Lane2009-11-12
| | | | | | | directly. This was a lot of trouble, but should be worth it in terms of not having to keep the plpgsql lexer in step with core anymore. In addition the handling of keywords is significantly better-structured, allowing us to de-reserve a number of words that plpgsql formerly treated as reserved.
* Re-refactor the core scanner's API, in order to get out from under the problemTom Lane2009-11-09
| | | | | | | | | | | | | | | | | | of different parsers having different YYSTYPE unions that they want to use with it. I defined a new union core_YYSTYPE that is just the (very short) list of semantic values returned by the core scanner. I had originally worried that this would require an extra interface layer, but actually we can have parser.c's base_yylex (formerly filtered_base_yylex) take care of that at no extra cost. Names associated with the core scanner are now "core_yy_foo", with "base_yy_foo" being used in the core Bison parser and the parser.c interface layer. This solves the last serious stumbling block to eliminating plpgsql's separate lexer. One restriction that will still be present is that plpgsql and the core will have to agree on the token numbers assigned to tokens that can be returned by the core lexer. Since Bison doesn't seem willing to accept external assignments of those numbers, we'll have to live with decreeing that core and plpgsql grammars declare these tokens first and in the same order.
* Sync psql's scanner with recent changes in backend scanner's flex rules.Tom Lane2009-09-27
| | | | Marko Kreen, Tom Lane
* Prevent isolated second surrogate in U& syntaxPeter Eisentraut2009-09-25
|
* Remove backup states from Unicode escapes patchPeter Eisentraut2009-09-25
|
* Unicode escapes in E'...' stringsPeter Eisentraut2009-09-22
| | | | Author: Marko Kreen <markokr@gmail.com>
* Surrogate pair support for U& string and identifier syntaxPeter Eisentraut2009-09-21
| | | | | This is mainly to make the functionality consistent with the proposed \u escape syntax.
* Tweak the core scanner so that it can be used by plpgsql too.Tom Lane2009-07-14
| | | | | | | | | | | | | | Changes: Pass in the keyword lookup array instead of having it be hardwired. (This incidentally allows elimination of some duplicate coding in ecpg.) Re-order the token declarations in gram.y so that non-keyword tokens have numbers that won't change when keywords are added or removed. Add ".." and ":=" to the set of tokens recognized by scan.l. (Since these combinations are nowhere legal in core SQL, this does not change anything except the precise wording of the error you get when you write this.)
* Although the flex documentation avers that yyalloc and yyrealloc takeTom Lane2009-07-13
| | | | | | | size_t arguments, the emitted scanner actually prototypes them with type yy_size_t, which is sometimes not the same thing depending on flex version and platform. Easiest fix seems to be to use yy_size_t. Per buildfarm results.
* Convert the core lexer and parser into fully reentrant code, by making useTom Lane2009-07-13
| | | | | | | | | | | of features added to flex and bison since this code was originally written. This change doesn't in itself offer any new capability, but it's needed infrastructure for planned improvements in plpgsql. Another feature now available in flex is the ability to make it use palloc instead of malloc, so do that to avoid possible memory leaks. (We should at some point change the other lexers likewise, but this commit doesn't touch them.)
* Move some declarations in the raw-parser header files to create a clearerTom Lane2009-07-12
| | | | | | | | | | distinction between the external API (parser.h) and declarations that only need to be visible within the raw parser code (gramparse.h, which now is only included by parser.c, gram.y, scan.l, and keywords.c). This is in preparation for the upcoming change to a reentrant lexer, which will require referencing YYSTYPE in the declarations of base_yylex and filtered_base_yylex, hence gram.h will have to be included by gramparse.h. We don't want any more files than absolutely necessary to depend on gram.h, so some cleanup is called for.
* Make new complaint about unsafe Unicode literals include an error location.Tom Lane2009-05-05
| | | | Every other ereport in scan.l has one, this should too.
* Disable the use of Unicode escapes in string constants (U&'') whenPeter Eisentraut2009-05-05
| | | | standard_conforming_strings is not on, for security reasons.
* Fix de-escaping checks so that we will reject \000 as well as other invalidlyTom Lane2009-04-19
| | | | encoded sequences. Per discussion of a couple of days ago.
* Fix broken {xufailed} production that made HEAD fail onTom Lane2009-04-14
| | | | | | | | select u&42 from table-with-a-u-column; Also fix missing SET_YYLLOC() in the {dolqfailed} production that I suppose this was based on. The latter is a pre-existing bug, but the only effect is to misplace the error cursor by one token, so probably not worth backpatching.
* Clarify to the translator that yyerror() deals with the translation ofPeter Eisentraut2009-03-04
| | | | | "syntax error", not the literal string. I was previously confused on this matter, but I have now verified that everything is translated properly.
* Update copyright for 2009.Bruce Momjian2009-01-01
|
* Unicode escapes in strings and identifiersPeter Eisentraut2008-10-29
|
* Add a bunch of new error location reports to parse-analysis error messages.Tom Lane2008-09-01
| | | | | There are still some weak spots around JOIN USING and relation alias lists, but most errors reported within backend/parser/ now have locations.
* Remove all traces that suggest that a non-Bison yacc might be supported, andPeter Eisentraut2008-08-29
| | | | | change build system to use only Bison. Simplify build rules, make file names uniform. Don't build the token table header file where it is not needed.
* Add "%option noinput" to the scanners to avoid compiler warnings. GCC 4.3Peter Eisentraut2008-05-09
| | | | began to realize that the input() function isn't used and printed warnings.
* Oops, change should go in scan.l to survive a clean checkout and not justMagnus Hagander2008-04-04
| | | | a make clean...
* Update copyrights in source tree to 2008.Bruce Momjian2008-01-01
|
* Perform post-escaping encoding validity checks on SQL literals and COPY inputAndrew Dunstan2007-09-12
| | | | so that invalidly encoded data cannot enter the database by these means.
* Increase the initial size of StringInfo buffers to 1024 bytes (from 256);Tom Lane2007-08-12
| | | | | | | | | likewise increase the initial size of the scanner's literal buffer to 1024 (from 128). Instrumentation of the regression tests suggests that this saves a useful amount of repalloc() traffic --- the number of calls occurring during one set of tests drops from about 6900 to about 3900. The old sizes were chosen in the late 90's with an eye to machines much smaller than are common today.
* Update CVS HEAD for 2007 copyright. Back branches are typically notBruce Momjian2007-01-05
| | | | back-stamped for this.
* Fix bugs in plpgsql and ecpg caused by assuming that isspace() would onlyTom Lane2006-09-22
| | | | | | | | | return true for exactly the characters treated as whitespace by their flex scanners. Per report from Victor Snezhko and subsequent investigation. Also fix a passel of unsafe usages of <ctype.h> functions, that is, ye olde char-vs-unsigned-char issue. I won't miss <ctype.h> when we are finally able to stop using it.
* Revert FETCH/MOVE int64 patch. Was using incorrect checks forBruce Momjian2006-09-03
| | | | fetch/move in scan.l.
* Change FETCH/MOVE to use int8.Bruce Momjian2006-09-02
| | | | Dhanaraj M
* Add a new GUC parameter backslash_quote, which determines whether the SQLTom Lane2006-05-21
| | | | | | | | | | | | | | | | | parser will allow "\'" to be used to represent a literal quote mark. The "\'" representation has been deprecated for some time in favor of the SQL-standard representation "''" (two single quote marks), but it has been used often enough that just disallowing it immediately won't do. Hence backslash_quote allows the settings "on", "off", and "safe_encoding", the last meaning to allow "\'" only if client_encoding is a valid server encoding. That is now the default, and the reason is that in encodings such as SJIS that allow 0x5c (ASCII backslash) to be the last byte of a multibyte character, accepting "\'" allows SQL-injection attacks as per CVE-2006-2314 (further details will be published after release). The "on" setting is available for backward compatibility, but it must not be used with clients that are exposed to untrusted input. Thanks to Akio Ishida and Yasuo Ohgaki for identifying this security issue.
* Code review for standard_conforming_strings patch. Fix it so it does notTom Lane2006-05-11
| | | | | | | throw warnings for 100%-SQL-standard constructs, clean up some minor infelicities, try to un-break ecpg to the best of my ability. (It's not clear how ecpg is going to find out the setting of standard_conforming_strings, though.) I think pg_dump still needs work, too.
* Improve parser so that we can show an error cursor position for errorsTom Lane2006-03-14
| | | | | | | | | | | during parse analysis, not only errors detected in the flex/bison stages. This is per my earlier proposal. This commit includes all the basic infrastructure, but locations are only tracked and reported for errors involving column references, function calls, and operators. More could be done later but this seems like a good set to start with. I've also moved the ReportSyntaxErrorPosition logic out of psql and into libpq, which should make it available to more people --- even within psql this is an improvement because warnings weren't handled by ReportSyntaxErrorPosition.
* Remove the stub support we had for UNION JOIN; per discussion, this isTom Lane2006-03-07
| | | | | | not likely ever to be implemented seeing it's been removed from SQL2003. This allows getting rid of the 'filter' version of yylex() that we had in parser.c, which should save at least a few microseconds in parsing.
* Enable standard_conforming_strings to be turned on.Bruce Momjian2006-03-06
| | | | Kevin Grittner
* Update copyright for 2006. Update scripts.Bruce Momjian2006-03-05
|
* Mark unescape_single_char() "static": as far as I can see this functionNeil Conway2006-02-18
| | | | is only used by scan.l/scan.c
* Reject operator names >= NAMEDATALEN characters. These will not workTom Lane2005-08-16
| | | | | anyway, and in assert-enabled builds you are likely to get an assertion failure. Backpatch as far as 7.3; 7.2 seems not to have the problem.
* Code review for escape-strings patch. Sync psql and plpgsql lexersTom Lane2005-06-26
| | | | | | | with main, avoid using a SQL-defined SQLSTATE for what is most definitely not a SQL-compatible error condition, fix documentation omissions, adhere to message style guidelines, don't use two GUC_REPORT variables when one is sufficient. Nothing done about pg_dump issues.