Vladimir Homutov [Wed, 28 Jul 2021 14:23:18 +0000 (17:23 +0300)]
QUIC: handle EAGAIN properly on UDP sockets.
Previously, the error was ignored leading to unnecessary retransmits.
Now, unsent frames are returned into output queue, state is reset, and
timer is started for the next send attempt.
Roman Arutyunyan [Thu, 29 Jul 2021 07:03:36 +0000 (10:03 +0300)]
HTTP/3: require mandatory uni streams before additional ones.
As per quic-http-34:
Endpoints SHOULD create the HTTP control stream as well as the
unidirectional streams required by mandatory extensions (such as the
QPACK encoder and decoder streams) first, and then create additional
streams as allowed by their peer.
Previously, client could create and destroy additional uni streams unlimited
number of times before creating mandatory streams.
QUIC: avoid processing 1-RTT with incomplete handshake in OpenSSL.
OpenSSL is known to provide read keys for an encryption level before the
level is active in TLS, following the old BoringSSL API. In BoringSSL,
it was then fixed to defer releasing read keys until QUIC may use them.
Vladimir Homutov [Tue, 20 Jul 2021 09:37:12 +0000 (12:37 +0300)]
QUIC: the "quic_gso" directive.
The directive enables usage of UDP segmentation offloading by quic.
By default, gso is disabled since it is not always operational when
detected (depends on interface configuration).
Vladimir Homutov [Thu, 15 Jul 2021 11:22:00 +0000 (14:22 +0300)]
QUIC: added support for segmentation offloading.
To improve output performance, UDP segmentation offloading is used
if available. If there is a significant amount of data in an output
queue and path is verified, QUIC packets are not sent one-by-one,
but instead are collected in a buffer, which is then passed to kernel
in a single sendmsg call, using UDP GSO. Such method greatly decreases
number of system calls and thus system load.
Ruslan Ermilov [Mon, 5 Jul 2021 10:26:49 +0000 (13:26 +0300)]
Win32: use only preallocated memory in send/recv chain functions.
The ngx_wsasend_chain() and ngx_wsarecv_chain() functions were
modified to use only preallocated memory, and the number of
preallocated wsabufs was increased to 64.
Sometimes, QUIC packets need to be of certain (or minimal) size. This is
achieved by adding PADDING frames. It is possible, that adding padding will
affect header size, thus forcing us to recalculate padding size once more.
Ruslan Ermilov [Mon, 5 Jul 2021 10:09:23 +0000 (13:09 +0300)]
Use only preallocated memory in ngx_readv_chain() (ticket #1408).
In d1bde5c3c5d2, the number of preallocated iovec's for ngx_readv_chain()
was increased. Still, in some setups, the function might allocate memory
for iovec's from a connection pool, which is only freed when closing the
connection.
The ngx_readv_chain() function was modified to use only preallocated
memory, similarly to the ngx_writev_chain() change in 8e903522c17a.
Renamed header -> field per quic-qpack naming convention, in particular:
- Header Field -> Field Line
- Header Block -> (Encoded) Field Section
- Without Name Reference -> With Literal Name
- Header Acknowledgement -> Section Acknowledgment
Maxim Dounin [Mon, 28 Jun 2021 15:01:24 +0000 (18:01 +0300)]
Disabled control characters in the Host header.
Control characters (0x00-0x1f, 0x7f) and space are not expected to appear
in the Host header. Requests with such characters in the Host header are
now unconditionally rejected.
Maxim Dounin [Mon, 28 Jun 2021 15:01:20 +0000 (18:01 +0300)]
Improved logging of invalid headers.
In 71edd9192f24 logging of invalid headers which were rejected with the
NGX_HTTP_PARSE_INVALID_HEADER error was restricted to just the "client
sent invalid header line" message, without any attempts to log the header
itself.
This patch returns logging of the header up to the invalid character and
the character itself. The r->header_end pointer is now properly set
in all cases to make logging possible.
The same logging is also introduced when parsing headers from upstream
servers.
Maxim Dounin [Mon, 28 Jun 2021 15:01:18 +0000 (18:01 +0300)]
Disabled control characters and space in header names.
Control characters (0x00-0x1f, 0x7f), space, and colon were never allowed in
header names. The only somewhat valid use is header continuation which nginx
never supported and which is explicitly obsolete by RFC 7230.
Previously, such headers were considered invalid and were ignored by default
(as per ignore_invalid_headers directive). With this change, such headers
are unconditionally rejected.
It is expected to make nginx more resilient to various attacks, in particular,
with ignore_invalid_headers switched off (which is inherently unsecure, though
nevertheless sometimes used in the wild).
Maxim Dounin [Mon, 28 Jun 2021 15:01:15 +0000 (18:01 +0300)]
Disabled control characters in URIs.
Control characters (0x00-0x1f, 0x7f) were never allowed in URIs, and must
be percent-encoded by clients. Further, these are not believed to appear
in practice. On the other hand, passing such characters might make various
attacks possible or easier, despite the fact that currently allowed control
characters are not significant for HTTP request parsing.
Maxim Dounin [Mon, 28 Jun 2021 15:01:13 +0000 (18:01 +0300)]
Disabled spaces in URIs (ticket #196).
From now on, requests with spaces in URIs are immediately rejected rather
than allowed. Spaces were allowed in 31e9677b15a1 (0.8.41) to handle bad
clients. It is believed that now this behaviour causes more harm than
good.
And "%" can appear as a part of escaping itself. The following
characters are not allowed and need to be escaped: %00-%1F, %7F-%FF,
" ", """, "<", ">", "\", "^", "`", "{", "|", "}".
Not escaping ">" is known to cause problems at least with MS Exchange (see
http://nginx.org/pipermail/nginx-ru/2010-January/031261.html) and in
Tomcat (ticket #2191).
The patch adds escaping of the following chars in all URI parts: """, "<",
">", "\", "^", "`", "{", "|", "}". Note that comments are mostly preserved
to outline important characters being escaped.
Maxim Dounin [Mon, 28 Jun 2021 15:01:06 +0000 (18:01 +0300)]
Disabled requests with both Content-Length and Transfer-Encoding.
HTTP clients are not allowed to generate such requests since Transfer-Encoding
introduction in RFC 2068, and they are not expected to appear in practice
except in attempts to perform a request smuggling attack. While handling of
such requests is strictly defined, the most secure approach seems to reject
them.
Maxim Dounin [Mon, 28 Jun 2021 15:01:04 +0000 (18:01 +0300)]
Added CONNECT method rejection.
No valid CONNECT requests are expected to appear within nginx, since it
is not a forward proxy. Further, request line parsing will reject
proper CONNECT requests anyway, since we don't allow authority-form of
request-target. On the other hand, RFC 7230 specifies separate message
length rules for CONNECT which we don't support, so make sure to always
reject CONNECTs to avoid potential abuse.
Maxim Dounin [Mon, 28 Jun 2021 15:01:00 +0000 (18:01 +0300)]
Moved TRACE method rejection to a better place.
Previously, TRACE requests were rejected before parsing Transfer-Encoding.
This is not important since keepalive is not enabled at this point anyway,
though rejecting such requests after properly parsing other headers is
less likely to cause issues in case of further code changes.
After 2096b21fcd10, a single RST_STREAM(NO_ERROR) may not result in an error.
This change removes several unnecessary ctx->type checks for such a case.
Roman Arutyunyan [Fri, 11 Jun 2021 09:11:08 +0000 (12:11 +0300)]
HTTP/3: generate more H3_FRAME_UNEXPECTED.
As per quic-http-34, these are the cases when this error should be generated:
If an endpoint receives a second SETTINGS frame
on the control stream, the endpoint MUST respond with a connection
error of type H3_FRAME_UNEXPECTED
SETTINGS frames MUST NOT be sent on any stream other than the control
stream. If an endpoint receives a SETTINGS frame on a different
stream, the endpoint MUST respond with a connection error of type
H3_FRAME_UNEXPECTED.
A client MUST NOT send a PUSH_PROMISE frame. A server MUST treat the
receipt of a PUSH_PROMISE frame as a connection error of type
H3_FRAME_UNEXPECTED; see Section 8.
The MAX_PUSH_ID frame is always sent on the control stream. Receipt
of a MAX_PUSH_ID frame on any other stream MUST be treated as a
connection error of type H3_FRAME_UNEXPECTED.
Receipt of an invalid sequence of frames MUST be treated as a
connection error of type H3_FRAME_UNEXPECTED; see Section 8. In
particular, a DATA frame before any HEADERS frame, or a HEADERS or
DATA frame after the trailing HEADERS frame, is considered invalid.
A CANCEL_PUSH frame is sent on the control stream. Receiving a
CANCEL_PUSH frame on a stream other than the control stream MUST be
treated as a connection error of type H3_FRAME_UNEXPECTED.
The GOAWAY frame is always sent on the control stream.
Roman Arutyunyan [Fri, 11 Jun 2021 07:56:51 +0000 (10:56 +0300)]
HTTP/3: reordered H3_MISSING_SETTINGS and H3_FRAME_UNEXPECTED.
The quic-http-34 is ambiguous as to what error should be generated for the
first frame in control stream:
Each side MUST initiate a single control stream at the beginning of
the connection and send its SETTINGS frame as the first frame on this
stream. If the first frame of the control stream is any other frame
type, this MUST be treated as a connection error of type
H3_MISSING_SETTINGS.
If a DATA frame is received on a control stream, the recipient MUST
respond with a connection error of type H3_FRAME_UNEXPECTED.
If a HEADERS frame is received on a control stream, the recipient MUST
respond with a connection error of type H3_FRAME_UNEXPECTED.
Previously, H3_FRAME_UNEXPECTED had priority, but now H3_MISSING_SETTINGS has.
The arguments in the spec sound more compelling for H3_MISSING_SETTINGS.
- Function ngx_quic_control_flow() is introduced. This functions does
both MAX_DATA and MAX_STREAM_DATA flow controls. The function is called
from STREAM and RESET_STREAM frame handlers. Previously, flow control
was only accounted for STREAM. Also, MAX_DATA flow control was not accounted
at all.
- Function ngx_quic_update_flow() is introduced. This function advances flow
control windows and sends MAX_DATA/MAX_STREAM_DATA. The function is called
from RESET_STREAM frame handler, stream cleanup handler and stream recv()
handler.
Maxim Dounin [Tue, 1 Jun 2021 14:37:51 +0000 (17:37 +0300)]
Fixed SSL logging with lingering close.
Recent fixes to SSL shutdown with lingering close (554c6ae25ffc, 1.19.5)
broke logging of SSL variables. To make sure logging of SSL variables
works properly, avoid freeing c->ssl when doing an SSL shutdown before
lingering close.
Reported by Reinis Rozitis
(http://mailman.nginx.org/pipermail/nginx/2021-May/060670.html).
Maxim Dounin [Tue, 1 Jun 2021 14:37:49 +0000 (17:37 +0300)]
SSL: ngx_ssl_shutdown() rework.
Instead of calling SSL_free() with each return point, introduced a single
place where cleanup happens. As a positive side effect, this fixes two
potential memory leaks on ngx_handle_read_event() and ngx_handle_write_event()
errors where there were no SSL_free() calls (though unlikely practical,
as errors there are only expected to happen due to bugs or kernel issues).
HTTP/3: fixed parsing encoder insertions with empty header value.
When starting processing a new encoder instruction, the header state is not
memzero'ed because generally it's burdensome. If the header value is empty,
this resulted in inserting a stale value left from the previous instruction.
Maxim Dounin [Mon, 31 May 2021 13:36:51 +0000 (16:36 +0300)]
Core: disabled SO_REUSEADDR on UDP sockets while testing config.
On Linux, SO_REUSEADDR allows completely duplicate UDP sockets, so using
SO_REUSEADDR when testing configuration results in packets being dropped
if there is an existing traffic on the sockets being tested (ticket #2187).
While dropped packets are expected with UDP, it is better to avoid this
when possible.
With this change, SO_REUSEADDR is no longer set on datagram sockets when
testing configuration.
Maxim Dounin [Mon, 31 May 2021 13:36:37 +0000 (16:36 +0300)]
Core: disabled cloning sockets when testing config (ticket #2188).
Since we anyway do not set SO_REUSEPORT when testing configuration
(see ecb5cd305b06), trying to open additional sockets does not make much
sense, as all these additional sockets are expected to result in EADDRINUSE
errors from bind(). On the other hand, there are reports that trying
to open these sockets takes significant time under load: total configuration
testing time greater than 15s was observed in ticket #2188, compared to less
than 1s without load.
With this change, no additional sockets are opened during testing
configuration.
Roman Arutyunyan [Tue, 25 May 2021 13:41:59 +0000 (16:41 +0300)]
QUIC: make sure stream data size is lower than final size.
As per quic-transport 34, FINAL_SIZE_ERROR is generated if an endpoint received
a STREAM frame or a RESET_STREAM frame containing a final size that was lower
than the size of stream data that was already received.
Maxim Dounin [Tue, 25 May 2021 12:17:50 +0000 (15:17 +0300)]
Resolver: explicit check for compression pointers in question.
Since nginx always uses exactly one entry in the question section of
a DNS query, and never uses compression pointers in this entry, parsing
of a DNS response in ngx_resolver_process_response() does not expect
compression pointers to appear in the question section of the DNS
response. Indeed, compression pointers in the first name of a DNS response
hardly make sense, do not seem to be allowed by RFC 1035 (which says
"a pointer to a prior occurance of the same name", note "prior"), and
were never observed in practice.
Added an explicit check to ngx_resolver_process_response()'s parsing
of the question section to properly report an error if compression pointers
nevertheless appear in the question section.
Maxim Dounin [Tue, 25 May 2021 12:17:45 +0000 (15:17 +0300)]
Resolver: simplified ngx_resolver_copy().
Instead of checking on each label if we need to place a dot or not,
now it always adds a dot after a label, and reduces the resulting
length afterwards.
Maxim Dounin [Tue, 25 May 2021 12:17:41 +0000 (15:17 +0300)]
Resolver: fixed label types handling in ngx_resolver_copy().
Previously, anything with any of the two high bits set were interpreted
as compression pointers. This is incorrect, as RFC 1035 clearly states
that "The 10 and 01 combinations are reserved for future use". Further,
the 01 combination is actually allocated for EDNS extended label type
(see RFC 2671 and RFC 6891), not really used though.
Fix is to reject unrecognized label types rather than misinterpreting
them as compression pointers.
Maxim Dounin [Tue, 25 May 2021 12:17:38 +0000 (15:17 +0300)]
Resolver: fixed off-by-one read in ngx_resolver_copy().
It is believed to be harmless, and in the worst case it uses some
uninitialized memory as a part of the compression pointer length,
eventually leading to the "name is out of DNS response" error.
Roman Arutyunyan [Tue, 25 May 2021 10:55:12 +0000 (13:55 +0300)]
QUIC: refactored CRYPTO and STREAM buffer ordering.
Generic function ngx_quic_order_bufs() is introduced. This function creates
and maintains a chain of buffers with holes. Holes are marked with b->sync
flag. Several buffers and holes in this chain may share the same underlying
memory buffer.
When processing STREAM frames with this function, frame data is copied only
once to the right place in the stream input chain. Previously data could
be copied twice. First when buffering an out-of-order frame data, and then
when filling stream buffer from ordered frame queue. Now there's only one
data chain for both tasks.
Maxim Dounin [Mon, 24 May 2021 15:23:42 +0000 (18:23 +0300)]
Fixed log action when using SSL certificates with variables.
When variables are used in ssl_certificate or ssl_certificate_key, a request
is created in the certificate callback to evaluate the variables, and then
freed. Freeing it, however, updates c->log->action to "closing request",
resulting in confusing error messages like "client timed out ... while
closing request" when a client times out during the SSL handshake.
Fix is to restore c->log->action after calling ngx_http_free_request().
Sergey Kandaurov [Sat, 22 May 2021 15:40:45 +0000 (18:40 +0300)]
QUIC: unroll and inline ngx_quic_varint_len()/ngx_quic_build_int().
According to profiling, those two are among most frequently called,
so inlining is generally useful, and unrolling should help with it.
Further, this fixes undefined behaviour seen with invalid values.
Maxim Dounin [Wed, 19 May 2021 00:13:31 +0000 (03:13 +0300)]
Mail: max_errors directive.
Similarly to smtpd_hard_error_limit in Postfix and smtp_max_unknown_commands
in Exim, specifies the number of errors after which the connection is closed.
Maxim Dounin [Wed, 19 May 2021 00:13:28 +0000 (03:13 +0300)]
Mail: IMAP pipelining support.
The change is mostly the same as the SMTP one (04e43d03e153 and 3f5d0af4e40a),
and ensures that nginx is able to properly handle or reject multiple IMAP
commands. The s->cmd field is not really used and set for consistency.
Non-synchronizing literals handling in invalid/unknown commands is limited,
so when a non-synchronizing literal is detected at the end of a discarded
line, the connection is closed.
Maxim Dounin [Wed, 19 May 2021 00:13:26 +0000 (03:13 +0300)]
Mail: stricter checking of IMAP tags.
Only "A-Za-z0-9-._" characters now allowed (which is stricter than what
RFC 3501 requires, but expected to be enough for all known clients),
and tags shouldn't be longer than 32 characters.
Maxim Dounin [Wed, 19 May 2021 00:13:23 +0000 (03:13 +0300)]
Mail: fixed backslash handling in IMAP literals.
Previously, s->backslash was set if any of the arguments was a quoted
string with a backslash character. After successful command parsing
this resulted in all arguments being filtered to remove backslashes.
This is, however, incorrect, as backslashes should not be removed from
IMAP literals. For example:
S: * OK IMAP4 ready
C: a01 login {9}
S: + OK
C: user\name "pass\"word"
S: * BAD internal server error
resulted in "Auth-User: username" instead of "Auth-User: user\name"
as it should.
Fix is to apply backslash filtering on per-argument basis during parsing.
Maxim Dounin [Wed, 19 May 2021 00:13:22 +0000 (03:13 +0300)]
Mail: removed dead s->arg_start handling.
As discussed in the previous change, s->arg_start handling in the "done"
labels of ngx_mail_pop3_parse_command(), ngx_mail_imap_parse_command(),
and ngx_mail_smtp_parse_command() is wrong: s->arg_start cannot be
set there, as it is handled and cleared on all code paths where the
"done" labels are reached. The relevant code is dead and now removed.
Maxim Dounin [Wed, 19 May 2021 00:13:20 +0000 (03:13 +0300)]
Mail: fixed s->arg_start clearing on invalid IMAP commands.
Previously, s->arg_start was left intact after invalid IMAP commands,
and this might result in an argument incorrectly added to the following
command. Similarly, s->backslash was left intact as well, leading
to unneeded backslash removal.
For example (LFs from the client are explicitly shown as "<LF>"):
S: * OK IMAP4 ready
C: a01 login "\<LF>
S: a01 BAD invalid command
C: a0000000000\2 authenticate <LF>
S: a00000000002 aBAD invalid command
The backslash followed by LF generates invalid command with s->arg_start
and s->backslash set, the following command incorrectly treats anything
from the old s->arg_start to the space after the command as an argument,
and removes the backslash from the tag. If there is no space, s->arg_end
will be NULL.
Both things seem to be harmless though. In particular:
- This can be used to provide an incorrect argument to a command without
arguments. The only command which seems to look at the single argument
is AUTHENTICATE, and it checks the argument length before trying to
access it.
- Backslash removal uses the "end" pointer, and stops due to "src < end"
condition instead of scanning all the process memory if s->arg_end is
NULL (and arg[0].len is huge).
- There should be no backslashes in unquoted strings.
An obvious fix is to clear s->arg_start and s->backslash on invalid commands,
similarly to how it is done in POP3 parsing (added in 810:e3aa8f305d21) and
SMTP parsing.
This, however, makes it clear that s->arg_start handling in the "done"
label is wrong: s->arg_start cannot be legitimately set there, as it
is expected to be cleared in all possible cases when the "done" label is
reached. The relevant code is dead and will be removed by the following
change.
Maxim Dounin [Wed, 19 May 2021 00:13:18 +0000 (03:13 +0300)]
Mail: POP3 pipelining support.
The change is mostly the same as the SMTP one (04e43d03e153 and 3f5d0af4e40a),
and ensures that nginx is able to properly handle or reject multiple POP3
commands, as required by the PIPELINING capability (RFC 2449). The s->cmd
field is not really used and set for consistency.
Maxim Dounin [Wed, 19 May 2021 00:13:17 +0000 (03:13 +0300)]
Mail: optimized discarding invalid SMTP commands.
There is no need to scan buffer from s->buffer->pos, as we already scanned
the buffer till "p" and wasn't able to find an LF.
There is no real need for this change in SMTP, since it is at most a
microoptimization of a non-common code path. Similar code in IMAP, however,
will have to start scanning from "p" to be correct, since there can be
newlines in IMAP literals.
Maxim Dounin [Wed, 19 May 2021 00:13:15 +0000 (03:13 +0300)]
Mail: fixed handling of invalid SMTP commands split between reads.
Previously, if an invalid SMTP command was split between reads, nginx failed
to wait for LF before returning an error, and interpreted the rest of the
command received later as a separate command.
The sw_invalid state in ngx_mail_smtp_parse_command(), introduced in 04e43d03e153, did not work, since ngx_mail_smtp_auth_state() clears
s->state when returning an error due to NGX_MAIL_PARSE_INVALID_COMMAND.
And not clearing s->state will introduce another problem: the rest
of the command would trigger duplicate error when rest of the command is
received.
Fix is to return NGX_AGAIN from ngx_mail_smtp_parse_command() until full
command is received.
Maxim Dounin [Wed, 19 May 2021 00:13:12 +0000 (03:13 +0300)]
Mail: fixed SMTP pipelining to send the response immediately.
Previously, if there were some pipelined SMTP data in the buffer when
a proxied connection with the backend was established, nginx called
ngx_mail_proxy_handler() to send these data, and not tried to send the
response to the last command. In most cases, this response was later sent
along with the response to the pipelined command, but if for some reason
client decides to wait for the response before finishing the next command
this might result in a connection hang.
Fix is to always call ngx_mail_proxy_handler() to send the response, and
additionally post an event to send the pipelined data if needed.
When using server push, a segfault occured because
ngx_http_v3_create_push_request() accessed ngx_http_v3_session_t object the old
way. Prior to 9ec3e71f8a61, HTTP/3 session was stored directly in c->data.
Now it's referenced by the v3_session field of ngx_http_connection_t.
Maxim Dounin [Wed, 5 May 2021 23:22:03 +0000 (02:22 +0300)]
Changed complex value slots to use NGX_CONF_UNSET_PTR.
With this change, it is now possible to use ngx_conf_merge_ptr_value()
to merge complex values. This change follows much earlier changes in
ngx_conf_merge_ptr_value() and ngx_conf_set_str_array_slot()
in 1452:cd586e963db0 (0.6.10) and 1701:40d004d95d88 (0.6.22), and the
change in ngx_conf_set_keyval_slot() (7728:485dba3e2a01, 1.19.4).
To preserve compatibility with existing 3rd party modules, both NULL
and NGX_CONF_UNSET_PTR are accepted for now.
Client IDs cannot be reused on different paths. This change allows to reuse
client id previosly seen on the same path (but with different dcid) in case
when no unused client IDs are available.