Binary String Functions and Operatorsbinary datafunctions
This section describes functions and operators for examining and
manipulating binary strings, that is values of type bytea.
Many of these are equivalent, in purpose and syntax, to the
text-string functions described in the previous section.
SQL defines some string functions that use
key words, rather than commas, to separate
arguments. Details are in
.
PostgreSQL also provides versions of these functions
that use the regular function invocation syntax
(see ).
SQL Binary String Functions and Operators
Function/Operator
Description
Example(s)
binary stringconcatenationbytea||byteabytea
Concatenates the two binary strings.
'\x123456'::bytea || '\x789a00bcde'::bytea\x123456789a00bcdebit_lengthbit_length ( bytea )
integer
Returns number of bits in the binary string (8
times the octet_length).
bit_length('\x123456'::bytea)24btrimbtrim ( bytesbytea,
bytesremovedbytea )
bytea
Removes the longest string containing only bytes appearing in
bytesremoved from the start and end of
bytes.
btrim('\x1234567890'::bytea, '\x9012'::bytea)\x345678ltrimltrim ( bytesbytea,
bytesremovedbytea )
bytea
Removes the longest string containing only bytes appearing in
bytesremoved from the start of
bytes.
ltrim('\x1234567890'::bytea, '\x9012'::bytea)\x34567890octet_lengthoctet_length ( bytea )
integer
Returns number of bytes in the binary string.
octet_length('\x123456'::bytea)3overlayoverlay ( bytesbyteaPLACINGnewsubstringbyteaFROMstartintegerFORcountinteger )
bytea
Replaces the substring of bytes that starts at
the start'th byte and extends
for count bytes
with newsubstring.
If count is omitted, it defaults to the length
of newsubstring.
overlay('\x1234567890'::bytea placing '\002\003'::bytea from 2 for 3)\x12020390positionposition ( substringbyteaINbytesbytea )
integer
Returns first starting index of the specified
substring within
bytes, or zero if it's not present.
position('\x5678'::bytea in '\x1234567890'::bytea)3rtrimrtrim ( bytesbytea,
bytesremovedbytea )
bytea
Removes the longest string containing only bytes appearing in
bytesremoved from the end of
bytes.
rtrim('\x1234567890'::bytea, '\x9012'::bytea)\x12345678substringsubstring ( bytesbyteaFROMstartintegerFORcountinteger )
bytea
Extracts the substring of bytes starting at
the start'th byte if that is specified,
and stopping after count bytes if that is
specified. Provide at least one of start
and count.
substring('\x1234567890'::bytea from 3 for 2)\x5678trimtrim ( LEADING | TRAILING | BOTHbytesremovedbyteaFROMbytesbytea )
bytea
Removes the longest string containing only bytes appearing in
bytesremoved from the start,
end, or both ends (BOTH is the default)
of bytes.
trim('\x9012'::bytea from '\x1234567890'::bytea)\x345678trim ( LEADING | TRAILING | BOTHFROMbytesbytea,
bytesremovedbytea )
bytea
This is a non-standard syntax for trim().
trim(both from '\x1234567890'::bytea, '\x9012'::bytea)\x345678
Additional binary string manipulation functions are available and
are listed in . Some
of them are used internally to implement the
SQL-standard string functions listed in .
Other Binary String Functions
Function
Description
Example(s)
bit_countpopcountbit_countbit_count ( bytesbytea )
bigint
Returns the number of bits set in the binary string (also known as
popcount).
bit_count('\x1234567890'::bytea)15crc32crc32 ( bytea )
bigint
Computes the CRC-32 value of the binary string.
crc32('abc'::bytea)891568578crc32ccrc32c ( bytea )
bigint
Computes the CRC-32C value of the binary string.
crc32c('abc'::bytea)910901175get_bitget_bit ( bytesbytea,
nbigint )
integer
Extracts n'th bit
from binary string.
get_bit('\x1234567890'::bytea, 30)1get_byteget_byte ( bytesbytea,
ninteger )
integer
Extracts n'th byte
from binary string.
get_byte('\x1234567890'::bytea, 4)144lengthbinary stringlengthlengthof a binary stringbinary strings, lengthlength ( bytea )
integer
Returns the number of bytes in the binary string.
length('\x1234567890'::bytea)5length ( bytesbytea,
encodingname )
integer
Returns the number of characters in the binary string, assuming
that it is text in the given encoding.
length('jose'::bytea, 'UTF8')4md5md5 ( bytea )
text
Computes the MD5 hash of
the binary string, with the result written in hexadecimal.
md5('Th\000omas'::bytea)8ab2d3c9689aaf18&zwsp;b4958c334c82d8b1reversereverse ( bytea )
bytea
Reverses the order of the bytes in the binary string.
reverse('\xabcd'::bytea)\xcdabset_bitset_bit ( bytesbytea,
nbigint,
newvalueinteger )
bytea
Sets n'th bit in
binary string to newvalue.
set_bit('\x1234567890'::bytea, 30, 0)\x1234563890set_byteset_byte ( bytesbytea,
ninteger,
newvalueinteger )
bytea
Sets n'th byte in
binary string to newvalue.
set_byte('\x1234567890'::bytea, 4, 64)\x1234567840sha224sha224 ( bytea )
bytea
Computes the SHA-224 hash
of the binary string.
sha224('abc'::bytea)\x23097d223405d8228642a477bda2&zwsp;55b32aadbce4bda0b3f7e36c9da7sha256sha256 ( bytea )
bytea
Computes the SHA-256 hash
of the binary string.
sha256('abc'::bytea)\xba7816bf8f01cfea414140de5dae2223&zwsp;b00361a396177a9cb410ff61f20015adsha384sha384 ( bytea )
bytea
Computes the SHA-384 hash
of the binary string.
sha384('abc'::bytea)\xcb00753f45a35e8bb5a03d699ac65007&zwsp;272c32ab0eded1631a8b605a43ff5bed&zwsp;8086072ba1e7cc2358baeca134c825a7sha512sha512 ( bytea )
bytea
Computes the SHA-512 hash
of the binary string.
sha512('abc'::bytea)\xddaf35a193617abacc417349ae204131&zwsp;12e6fa4e89a97ea20a9eeee64b55d39a&zwsp;2192992a274fc1a836ba3c23a3feebbd&zwsp;454d4423643ce80e2a9ac94fa54ca49fsubstrsubstr ( bytesbytea, startinteger, countinteger )
bytea
Extracts the substring of bytes starting at
the start'th byte,
and extending for count bytes if that is
specified. (Same
as substring(bytes
from start
for count).)
substr('\x1234567890'::bytea, 3, 2)\x5678
Functions get_byte and set_byte
number the first byte of a binary string as byte 0.
Functions get_bit and set_bit
number bits from the right within each byte; for example bit 0 is the least
significant bit of the first byte, and bit 15 is the most significant bit
of the second byte.
For historical reasons, the function md5
returns a hex-encoded value of type text whereas the SHA-2
functions return type bytea. Use the functions
encode
and decode to
convert between the two. For example write encode(sha256('abc'),
'hex') to get a hex-encoded text representation,
or decode(md5('abc'), 'hex') to get
a bytea value.
character stringconverting to binary stringbinary stringconverting to character string
Functions for converting strings between different character sets
(encodings), and for representing arbitrary binary data in textual
form, are shown in
. For these
functions, an argument or result of type text is expressed
in the database's default encoding, while arguments or results of
type bytea are in an encoding named by another argument.
Text/Binary String Conversion Functions
Function
Description
Example(s)
convertconvert ( bytesbytea,
src_encodingname,
dest_encodingname )
bytea
Converts a binary string representing text in
encoding src_encoding
to a binary string in encoding dest_encoding
(see for
available conversions).
convert('text_in_utf8', 'UTF8', 'LATIN1')\x746578745f696e5f75746638convert_fromconvert_from ( bytesbytea,
src_encodingname )
text
Converts a binary string representing text in
encoding src_encoding
to text in the database encoding
(see for
available conversions).
convert_from('text_in_utf8', 'UTF8')text_in_utf8convert_toconvert_to ( stringtext,
dest_encodingname )
bytea
Converts a text string (in the database encoding) to a
binary string encoded in encoding dest_encoding
(see for
available conversions).
convert_to('some_text', 'UTF8')\x736f6d655f74657874encodeencode ( bytesbytea,
formattext )
text
Encodes binary data into a textual representation; supported
format values are:
base64,
escape,
hex.
encode('123\000\001', 'base64')MTIzAAE=decodedecode ( stringtext,
formattext )
bytea
Decodes binary data from a textual representation; supported
format values are the same as
for encode.
decode('MTIzAAE=', 'base64')\x3132330001
The encode and decode
functions support the following textual formats:
base64
base64 format
The base64 format is that
of RFC
2045 Section 6.8. As per the RFC, encoded lines are
broken at 76 characters. However instead of the MIME CRLF
end-of-line marker, only a newline is used for end-of-line.
The decode function ignores carriage-return,
newline, space, and tab characters. Otherwise, an error is
raised when decode is supplied invalid
base64 data — including when trailing padding is incorrect.
escape
escape format
The escape format converts zero bytes and
bytes with the high bit set into octal escape sequences
(\nnn), and it doubles
backslashes. Other byte values are represented literally.
The decode function will raise an error if a
backslash is not followed by either a second backslash or three
octal digits; it accepts other byte values unchanged.
hex
hex format
The hex format represents each 4 bits of
data as one hexadecimal digit, 0
through f, writing the higher-order digit of
each byte first. The encode function outputs
the a-f hex digits in lower
case. Because the smallest unit of data is 8 bits, there are
always an even number of characters returned
by encode.
The decode function
accepts the a-f characters in
either upper or lower case. An error is raised
when decode is given invalid hex data
— including when given an odd number of characters.
In addition, it is possible to cast integral values to and from type
bytea. Casting an integer to bytea produces
2, 4, or 8 bytes, depending on the width of the integer type. The result
is the two's complement representation of the integer, with the most
significant byte first. Some examples:
1234::smallint::bytea \x04d2
cast(1234 as bytea) \x000004d2
cast(-1234 as bytea) \xfffffb2e
'\x8000'::bytea::smallint -32768
'\x8000'::bytea::integer 32768
Casting a bytea to an integer will raise an error if the
length of the bytea exceeds the width of the integer type.
See also the aggregate function string_agg in
and the large object functions
in .