SQL Syntax
A description of the general syntax of SQL.
SQL manipulates sets of data. The language is
composed of various key words. Arithmetic
and procedural expressions are allowed. We will cover these topics
in this chapter; subsequent chapters will include details on data
types, functions, and operators.
Key Words
SQL92 defines key words
for the language
which have specific meaning. Some key words are
reserved, which indicates that they are
restricted to appear in only certain contexts. Other key words are
not restricted, which indicates that in certain
contexts they
have a specific meaning but are not otherwise constrained.
Postgres implements an extended subset of the
SQL92 and SQL3 languages. Some language
elements are not as restricted in this implementation as is
called for in the language standards, in part due
to the extensibility features of Postgres.
Information on SQL92 and SQL3 key words
is derived from .
Reserved Key Words
SQL92 and SQL3 have
reserved key words which are not allowed
as identifiers and not allowed in any usage other than as fundamental
tokens in SQL statements.
Postgres has additional key words
which have similar restrictions. In particular, these key words
are not allowed as column or table names, though in some cases
they are allowed to be column labels (i.e. in AS clauses).
Any string can be specified as an identifier if surrounded by
double quotes (like this!
). Some care is required since
such an identifier will be case sensitive
and will retain embedded whitespace and most other special characters.
The following are Postgres
reserved words which are neither SQL92
nor SQL3 reserved words. These are allowed
to be present as column labels, but not as identifiers:
ABORT ANALYZE
BINARY
CLUSTER CONSTRAINT COPY
DO
EXPLAIN EXTEND
LISTEN LOAD LOCK
MOVE
NEW NONE NOTIFY
OFFSET
RESET
SETOF SHOW
UNLISTEN UNTIL
VACUUM VERBOSE
The following are Postgres
reserved words which are also SQL92
or SQL3 reserved words, and which
are allowed to be present as column labels, but not as identifiers:
ALL ANY ASC BETWEEN BIT BOTH
CASE CAST CHAR CHARACTER CHECK COALESCE COLLATE COLUMN
CONSTRAINT CROSS CURRENT CURRENT_DATE CURRENT_TIME
CURRENT_TIMESTAMP CURRENT_USER
DEC DECIMAL DEFAULT DESC DISTINCT
ELSE END EXCEPT EXISTS EXTRACT
FALSE FLOAT FOR FOREIGN FROM FULL
GLOBAL GROUP
HAVING
IN INNER INTERSECT INTO IS
JOIN
LEADING LEFT LIKE LOCAL
NATURAL NCHAR NOT NULL NULLIF NUMERIC
ON OR ORDER OUTER OVERLAPS
POSITION PRECISION PRIMARY PUBLIC
REFERENCES RIGHT
SELECT SESSION_USER SOME SUBSTRING
TABLE THEN TO TRANSACTION TRIM TRUE
UNION UNIQUE USER
VARCHAR
WHEN WHERE
The following are Postgres
reserved words which are also SQL92
or SQL3 reserved words:
ADD ALTER AND AS
BEGIN BY
CASCADE CLOSE COMMIT CREATE CURSOR
DECLARE DEFAULT DELETE DESC DISTINCT DROP
EXECUTE EXISTS EXTRACT
FETCH FLOAT FOR FROM FULL
GRANT
HAVING
IN INNER INSERT INTERVAL INTO IS
JOIN
LEADING LEFT LIKE LOCAL
NAMES NATIONAL NATURAL NCHAR NO NOT NULL
ON OR OUTER
PARTIAL PRIMARY PRIVILEGES PROCEDURE PUBLIC
REFERENCES REVOKE RIGHT ROLLBACK
SELECT SET SUBSTRING
TO TRAILING TRIM
UNION UNIQUE UPDATE USING
VALUES VARCHAR VARYING VIEW
WHERE WITH WORK
The following are SQL92 reserved key words which
are not Postgres reserved key words, but which
if used as function names are always translated into the function
CHAR_LENGTH:
CHARACTER_LENGTH
The following are SQL92 or SQL3
reserved key words which
are not Postgres reserved key words, but
if used as type names are always translated into an alternate, native type:
BOOLEAN DOUBLE FLOAT INT INTEGER INTERVAL REAL SMALLINT
The following are not keywords of any kind, but when used in the
context of a type name are translated into a native
Postgres type, and when used in the
context of a function name are translated into a native function:
DATETIME TIMESPAN
(translated to TIMESTAMP and INTERVAL,
respectively). This feature is intended to help with
transitioning to v7.0, and will be removed in the next full
release (likely v7.1).
The following are either SQL92
or SQL3 reserved key words
which are not key words in Postgres.
These have no proscribed usage in Postgres
at the time of writing (v7.0) but may become reserved key words in the
future:
Some of these key words represent functions in SQL92.
These functions are defined in Postgres,
but the parser does not consider the names to be key words and they are allowed
in other contexts.
ALLOCATE ARE ASSERTION AT AUTHORIZATION AVG
BIT_LENGTH
CASCADED CATALOG CHAR_LENGTH CHARACTER_LENGTH COLLATION
CONNECT CONNECTION CONTINUE CONVERT CORRESPONDING COUNT
CURRENT_SESSION
DATE DEALLOCATE DEC DESCRIBE DESCRIPTOR
DIAGNOSTICS DISCONNECT DOMAIN
ESCAPE EXCEPT EXCEPTION EXEC EXTERNAL
FIRST FOUND
GET GO GOTO
IDENTITY INDICATOR INPUT INTERSECT
LAST LOWER
MAX MIN MODULE
OCTET_LENGTH OPEN OUTPUT OVERLAPS
PREPARE PRESERVE
ROWS
SCHEMA SECTION SESSION SIZE SOME
SQL SQLCODE SQLERROR SQLSTATE SUM SYSTEM_USER
TEMPORARY TRANSLATE TRANSLATION
UNKNOWN UPPER USAGE
VALUE
WHENEVER WRITE
Non-reserved Keywords
SQL92 and SQL3 have
non-reserved keywords which have
a prescribed meaning in the language but which are also allowed
as identifiers.
Postgres has additional keywords
which allow similar unrestricted usage.
In particular, these keywords
are allowed as column or table names.
The following are Postgres
non-reserved key words which are neither SQL92
nor SQL3 non-reserved key words:
ACCESS AFTER AGGREGATE
BACKWARD BEFORE
CACHE COMMENT CREATEDB CREATEUSER CYCLE
DATABASE DELIMITERS
EACH ENCODING EXCLUSIVE
FORCE FORWARD FUNCTION
HANDLER
INCREMENT INDEX INHERITS INSENSITIVE INSTEAD ISNULL
LANCOMPILER LOCATION
MAXVALUE MINVALUE MODE
NOCREATEDB NOCREATEUSER NOTHING NOTIFY NOTNULL
OIDS OPERATOR
PASSWORD PROCEDURAL
RECIPE REINDEX RENAME RETURNS ROW RULE
SEQUENCE SERIAL SHARE START STATEMENT STDIN STDOUT
TEMP TRUSTED
UNLISTEN UNTIL
VALID VERSION
The following are Postgres
non-reserved key words which are SQL92
or SQL3 reserved key words:
ABSOLUTE ACTION
CONSTRAINTS
DAY DEFERRABLE DEFERRED
HOUR
IMMEDIATE INITIALLY INSENSITIVE ISOLATION
KEY
LANGUAGE LEVEL
MATCH MINUTE MONTH
NEXT
OF ONLY OPTION
PENDANT PRIOR PRIVILEGES
READ RELATIVE RESTRICT
SCROLL SECOND
TIME TIMESTAMP TIMEZONE_HOUR TIMEZONE_MINUTE TRIGGER
YEAR
ZONE
The following are Postgres
non-reserved key words which are also either SQL92
or SQL3 non-reserved key words:
COMMITTED SERIALIZABLE TYPE
The following are either SQL92
or SQL3 non-reserved key words which are not
key words of any kind in Postgres:
ADA
C CATALOG_NAME CHARACTER_SET_CATALOG CHARACTER_SET_NAME
CHARACTER_SET_SCHEMA CLASS_ORIGIN COBOL COLLATION_CATALOG
COLLATION_NAME COLLATION_SCHEMA COLUMN_NAME
COMMAND_FUNCTION CONDITION_NUMBER
CONNECTION_NAME CONSTRAINT_CATALOG CONSTRAINT_NAME
CONSTRAINT_SCHEMA CURSOR_NAME
DATA DATE_TIME_INTERVAL_CODE DATE_TIME_INTERVAL_PRECISION
DYNAMIC_FUNCTION
FORTRAN
LENGTH
MESSAGE_LENGTH MESSAGE_OCTET_LENGTH MORE MUMPS
NAME NULLABLE NUMBER
PAD PASCAL PLI
REPEATABLE RETURNED_LENGTH RETURNED_OCTET_LENGTH
RETURNED_SQLSTATE ROW_COUNT
SCALE SCHEMA_NAME SERVER_NAME SPACE SUBCLASS_ORIGIN
TABLE_NAME
UNCOMMITTED UNNAMED
Comments
A comment
is an arbitrary sequence of characters beginning with double dashes
and extending to the end of the line, e.g.:
-- This is a standard SQL comment
We also support C-style block comments, e.g.:
/* multi
line
comment
*/
A comment beginning with "/*" extends to the first occurrence of "*/".
Names
Names in SQL must begin with a letter
(a-z) or underscore
(_).
Subsequent characters in a name can be letters, digits
(0-9),
or underscores. The system uses no more than NAMEDATALEN-1 characters
of a name; longer names can be written in queries, but they will be
truncated.
By default, NAMEDATALEN is 32 so the maximum name length is 31 (but
at the time the system is built, NAMEDATALEN can be changed in
src/include/postgres_ext.h).
Names containing other characters may be formed by surrounding them
with double quotes ("). For example, table or column
names may contain
otherwise disallowed characters such as spaces, ampersands, etc. if
quoted. Quoting a name also makes it case-sensitive,
whereas unquoted names are always folded to lower case. For example,
the names FOO, foo
and "foo" are
considered the same by Postgres, but
"Foo" is a different name.
Double quotes can also be used to protect a name that would otherwise
be taken to be an SQL keyword. For example, IN
is a keyword but "IN" is a name.
Constants
There are three implicitly typed constants
for use in Postgres: strings, integers,
and floating point numbers. Constants can
also be specified with explicit types, which can enable more
accurate representation and more efficient handling by the
backend. The implicit constants are described below; explicit
constants are discussed afterwards.
String Constants
Strings
in SQL are arbitrary sequences of ASCII characters bounded by single
quotes ("'", e.g. 'This is a string').
SQL92 allows single quotes to be embedded in strings by typing two
adjacent single quotes (e.g. 'Dianne''s horse').
In Postgres single quotes may alternatively
be escaped with a backslash ("\", e.g.
'Dianne\'s horse'). To include a
backslash in a string constant, type two backslashes.
Non-printing characters may also be embedded within strings by
prepending them with a backslash
(e.g. '\tab').
Integer Constants
Integer constants
in SQL are collection of ASCII digits with no decimal point. Legal
values range from -2147483648 to +2147483647. This will vary
depending on the operating system and host machine.
Note that larger integers can be specified for int8
by using SQL92 string notation or
Postgres type notation:
int8 '4000000000' -- string style
'4000000000'::int8 -- Postgres (historical) style
Floating Point Constants
Floating point constants
consist of an integer part, a decimal point, and a fraction part or
scientific notation of the following format:
{dig}.{dig} [e [+-] {dig}]
where dig is one or more digits.
You must include at least one dig after the
period and after the [+-] if you use those options. An exponent with
a missing mantissa has a mantissa of 1 inserted. There may be no
extra characters embedded in the string.
Floating point constaints are of type
float8. float4 can be specified
explicitly by using SQL92 string notation or
Postgres type notation:
float4 '1.23' -- string style
'1.23'::float4 -- Postgres (historical) style
Constants of Postgres User-Defined Types
A constant of an
arbitrary
type can be entered using any one of the following notations:
type 'string'
'string'::type
CAST 'string' AS type
The value inside the string is passed to the input
conversion routine for the type called
type. The result is a
constant of the indicated type. The explicit typecast may be omitted
if there is no ambiguity as to the type the constant must be, in which
case it is automatically coerced.
Array constants
Array constants
are arrays of any Postgres type, including other arrays, string
constants, etc. The general format of an array constant is the
following:
{val1delimval2delim}
where delim
is the delimiter for the type stored in the pg_type class.
(For built-in types, this is the comma character (","). An
example of an array constant is
{{1,2,3},{4,5,6},{7,8,9}}
This constant is a two-dimensional, 3 by 3 array consisting of three
sub-arrays of integers.
Individual array elements can and should be placed between quotation
marks whenever possible to avoid ambiguity problems with respect to
leading white space.
Fields and Columns
Fields
A field
is either an attribute of a given class or one of the following:
oid
stands for the unique identifier of an instance which is added by
Postgres to all instances automatically. Oids are not reused and are 32
bit quantities.
xmin
The identity of the inserting transaction.
xmax
The identity of the deleting transaction.
cmin
The command identifier within the transaction.
cmax
The identity of the deleting command.
For further information on these fields consult
.
Times are represented internally as instances of the
abstime
data type. Transaction and command identifiers are 32 bit quantities.
Transactions are assigned sequentially starting at 512.
Columns
A column is a construct of the form:
instance{.composite_field}.field `['number`]'
instance
identifies a particular class and can be thought of as standing for
the instances of that class. An instance variable is either a class
name, a surrogate for a class defined by means of a FROM clause,
or the keyword NEW or CURRENT.
NEW and CURRENT can only appear in the action portion of a rule, while
other instance variables can be used in any SQL statement.
composite_field
is a field of of one of the Postgres composite types,
while successive composite fields address attributes in the
class(s) to which the composite field evaluates. Lastly,
field
is a normal (base type) field in the class(s) last addressed. If
field
is of type array,
then the optional number
designator indicates a specific element in the array. If no number is
indicated, then all array elements are returned.
Operators
Any built-in system, or user-defined operator may be used in SQL.
For the list of built-in and system operators consult
.
For a list of user-defined operators consult your system administrator
or run a query on the pg_operator class.
Parentheses may be used for arbitrary grouping of operators in expressions.
Expressions
SQL92 allows expressions
to transform data in tables. Expressions may contain operators
(see
for more details) and functions
( has
more information).
An expression is one of the following:
( a_expr )
constant
attribute
a_expr binary_operator a_expr
a_expr right_unary_operator
left_unary_operator a_expr
parameter
functional expression
aggregate expression
We have already discussed constants and attributes. The three kinds of
operator expressions indicate respectively binary (infix), right-unary
(suffix) and left-unary (prefix) operators. The following sections
discuss the remaining options.
Parameters
A parameter
is used to indicate a parameter in a SQL function. Typically this
is used in SQL function definition statement. The form of a
parameter is:
$number
For example, consider the definition of a function,
dept, as
CREATE FUNCTION dept (name)
RETURNS dept
AS 'select * from
dept where name=$1'
LANGUAGE 'sql';
Functional Expressions
A functional expression
is the name of a legal SQL function, followed by its argument list
enclosed in parentheses:
function (a_expr [, a_expr ... ] )
For example, the following computes the square root of an employee
salary:
sqrt(emp.salary)
Aggregate Expressions
An aggregate expression represents the application
of an aggregate function across the rows selected by a query.
An aggregate function reduces multiple inputs to a single output value,
such as the sum or average of the inputs.
The syntax of an aggregate expression is one of the following:
aggregate_name (expression)
aggregate_name (ALL expression)
aggregate_name (DISTINCT expression)
aggregate_name ( * )
where aggregate_name is a previously defined
aggregate, and expression is any expression
that doesn't itself contain an aggregate expression.
The first form of aggregate expression invokes the aggregate across all
input rows for which the given expression yields a non-null value.
The second form is the same as the first, since ALL is the default.
The third form invokes the aggregate for all distinct non-null values
of the expression found in the input rows. The last form invokes the
aggregate once for each input row regardless of null or non-null values;
since no particular input value is specified, it is generally only useful
for the count() aggregate.
For example, count(*) yields the total number of input rows;
count(f1) yields the number of input rows in which f1 is non-null;
count(distinct f1) yields the number of distinct non-null values of f1.
Target List
A target list
is a parenthesized, comma-separated list of one or more elements, each
of which must be of the form:
a_expr [ AS result_attname ]
where result_attname
is the name of the attribute to be created (or an
already existing attribute name in the case of update statements.) If
result_attname
is not present, then
a_expr
must contain only one attribute name which is assumed to be the name
of the result field. In Postgres
default naming is only used if
a_expr
is an attribute.
Qualification
A qualification
consists of any number of clauses connected by the logical operators:
NOT
AND
OR
A clause is an a_expr
that evaluates to a boolean over a set of instances.
From List
The from list
is a comma-separated list of from expressions.
Each "from expression" is of the form:
[ class_reference ] instance_variable
{, [ class_ref ] instance_variable... }
where class_reference
is of the form
class_name [ * ]
The "from expression"
defines one or more instance variables to range over the class
indicated in class_reference.
One can also request
the instance variable to range over all classes that are beneath the
indicated class in the inheritance hierarchy by postpending the
designator asterisk ("*").