Maxim Dounin [Sun, 29 Aug 2021 19:22:02 +0000 (22:22 +0300)]
Request body: reading body buffering in filters.
If a filter wants to buffer the request body during reading (for
example, to check an external scanner), it can now do so. To make
it possible, the code now checks rb->last_saved (introduced in the
previous change) along with rb->rest == 0.
Since in HTTP/2 this requires flow control to avoid overflowing the
request body buffer, so filters which need buffering have to set
the rb->filter_need_buffering flag on the first filter call. (Note
that each filter is expected to call the next filter, so all filters
will be able set the flag if needed.)
Maxim Dounin [Sun, 29 Aug 2021 19:21:03 +0000 (22:21 +0300)]
Request body: introduced rb->last_saved flag.
It indicates that the last buffer was received by the save filter,
and can be used to check this at higher levels. To be used in the
following changes.
Maxim Dounin [Sun, 29 Aug 2021 19:20:54 +0000 (22:20 +0300)]
Request body: added alert to catch duplicate body saving.
If due to an error ngx_http_request_body_save_filter() is called
more than once with rb->rest == 0, this used to result in a segmentation
fault. Added an alert to catch such errors, just in case.
Maxim Dounin [Sun, 29 Aug 2021 19:20:44 +0000 (22:20 +0300)]
HTTP/2: improved handling of preread unbuffered requests.
Previously, fully preread unbuffered requests larger than client body
buffer size were saved to disk, despite the fact that "unbuffered" is
expected to imply no disk buffering.
Maxim Dounin [Sun, 29 Aug 2021 19:20:38 +0000 (22:20 +0300)]
HTTP/2: improved handling of END_STREAM in a separate DATA frame.
The save body filter saves the request body to disk once the buffer is full.
Yet in HTTP/2 this might happen even if there is no need to save anything
to disk, notably when content length is known and the END_STREAM flag is
sent in a separate empty DATA frame. Workaround is to provide additional
byte in the buffer, so saving the request body won't be triggered.
This fixes unexpected request body disk buffering in HTTP/2 observed after
the previous change when content length is known and the END_STREAM flag
is sent in a separate empty DATA frame.
Maxim Dounin [Sun, 29 Aug 2021 19:20:36 +0000 (22:20 +0300)]
HTTP/2: reworked body reading to better match HTTP/1.x code.
In particular, now the code always uses a buffer limited by
client_body_buffer_size. At the cost of an additional copy it
ensures that small DATA frames are not directly mapped to small
write() syscalls, but rather buffered in memory before writing.
Further, requests without Content-Length are no longer forced
to use temporary files.
Maxim Dounin [Fri, 20 Aug 2021 00:53:56 +0000 (03:53 +0300)]
Upstream: fixed timeouts with gRPC, SSL and select (ticket #2229).
With SSL it is possible that an established connection is ready for
reading after the handshake. Further, events might be already disabled
in case of level-triggered event methods. If this happens and
ngx_http_upstream_send_request() blocks waiting for some data from
the upstream, such as flow control in case of gRPC, the connection
will time out due to no read events on the upstream connection.
Fix is to explicitly check the c->read->ready flag if sending request
blocks and post a read event if it is set.
Note that while it is possible to modify ngx_ssl_handshake() to keep
read events active, this won't completely resolve the issue, since
there can be data already received during the SSL handshake
(see 573bd30e46b4).
Alexey Radkov [Thu, 19 Aug 2021 17:51:27 +0000 (20:51 +0300)]
Core: removed unnecessary restriction in hash initialization.
Hash initialization ignores elements with key.data set to NULL.
Nevertheless, the initial hash bucket size check didn't skip them,
resulting in unnecessary restrictions on, for example, variables with
long names and with the NGX_HTTP_VARIABLE_NOHASH flag.
Fix is to update the initial hash bucket size check to skip elements
with key.data set to NULL, similarly to how it is done in other parts
of the code.
Maxim Dounin [Thu, 21 Oct 2021 15:44:07 +0000 (18:44 +0300)]
SSL: SSL_sendfile() support with kernel TLS.
Requires OpenSSL 3.0 compiled with "enable-ktls" option. Further, KTLS
needs to be enabled in kernel, and in OpenSSL, either via OpenSSL
configuration file or with "ssl_conf_command Options KTLS;" in nginx
configuration.
On FreeBSD, kernel TLS is available starting with FreeBSD 13.0, and
can be enabled with "sysctl kern.ipc.tls.enable=1" and "kldload ktls_ocf"
to load a software backend, see man ktls(4) for details.
On Linux, kernel TLS is available starting with kernel 4.13 (at least 5.2
is recommended), and needs kernel compiled with CONFIG_TLS=y (with
CONFIG_TLS=m, which is used at least on Ubuntu 21.04 by default,
the tls module needs to be loaded with "modprobe tls").
Maxim Dounin [Thu, 21 Oct 2021 15:38:38 +0000 (18:38 +0300)]
Removed CLOCK_MONOTONIC_COARSE support.
While clock_gettime(CLOCK_MONOTONIC_COARSE) is faster than
clock_gettime(CLOCK_MONOTONIC), the latter is fast enough on Linux for
practical usage, and the difference is negligible compared to other costs
at each event loop iteration. On the other hand, CLOCK_MONOTONIC_COARSE
causes various issues with typical CONFIG_HZ=250, notably very inaccurate
limit_rate handling in some edge cases (ticket #1678) and negative difference
between $request_time and $upstream_response_time (ticket #1965).
Vladimir Homutov [Wed, 20 Oct 2021 06:50:02 +0000 (09:50 +0300)]
HTTP: connections with wrong ALPN protocols are now rejected.
This is a recommended behavior by RFC 7301 and is useful
for mitigation of protocol confusion attacks [1].
To avoid possible negative effects, list of supported protocols
was extended to include all possible HTTP protocol ALPN IDs
registered by IANA [2], i.e. "http/1.0" and "http/0.9".
Maxim Dounin [Mon, 18 Oct 2021 13:46:59 +0000 (16:46 +0300)]
Upstream: fixed logging level of upstream invalid header errors.
In b87b7092cedb (nginx 1.21.1), logging level of "upstream sent invalid
header" errors was accidentally changed to "info". This change restores
the "error" level, which is a proper logging level for upstream-side
errors.
Awdhesh Mathpal [Fri, 8 Oct 2021 02:23:11 +0000 (19:23 -0700)]
Proxy: disabled keepalive on extra data in non-buffered mode.
The u->keepalive flag is initialized early if the response has no body
(or an empty body), and needs to be reset if there are any extra data,
similarly to how it is done in ngx_http_proxy_copy_filter(). Missed
in 83c4622053b0.
Vladimir Homutov [Wed, 22 Sep 2021 07:20:00 +0000 (10:20 +0300)]
Stream: added half-close support.
The "proxy_half_close" directive enables handling of TCP half close. If
enabled, connection to proxied server is kept open until both read ends get
EOF. Write end shutdown is properly transmitted via proxy.
Roman Arutyunyan [Fri, 10 Sep 2021 09:59:22 +0000 (12:59 +0300)]
Request body: do not create temp file if there's nothing to write.
Do this only when the entire request body is empty and
r->request_body_in_file_only is set.
The issue manifested itself with missing warning "a client request body is
buffered to a temporary file" when the entire rb->buf is full and all buffers
are delayed by a filter.
Rob Mueller [Fri, 13 Aug 2021 07:57:47 +0000 (03:57 -0400)]
Mail: Auth-SSL-Protocol and Auth-SSL-Cipher headers (ticket #2134).
This adds new Auth-SSL-Protocol and Auth-SSL-Cipher headers to
the mail proxy auth protocol when SSL is enabled.
This can be useful for detecting users using older clients that
negotiate old ciphers when you want to upgrade to newer
TLS versions of remove suppport for old and insecure ciphers.
You can use your auth backend to notify these users before the
upgrade that they either need to upgrade their client software
or contact your support team to work out an upgrade path.
Maxim Dounin [Mon, 16 Aug 2021 19:40:31 +0000 (22:40 +0300)]
SSL: ciphers now set before loading certificates (ticket #2035).
To load old/weak server or client certificates it might be needed to adjust
the security level, as introduced in OpenSSL 1.1.0. This change ensures that
ciphers are set before loading the certificates, so security level changes
via the cipher string apply to certificate loading.
Sergey Kandaurov [Tue, 10 Aug 2021 20:43:17 +0000 (23:43 +0300)]
SSL: removed export ciphers support.
Export ciphers are forbidden to negotiate in TLS 1.1 and later protocol modes.
They are disabled since OpenSSL 1.0.2g by default unless explicitly configured
with "enable-weak-ssl-ciphers", and completely removed in OpenSSL 1.1.0.
Sergey Kandaurov [Tue, 10 Aug 2021 20:43:17 +0000 (23:43 +0300)]
SSL: use of the SSL_OP_IGNORE_UNEXPECTED_EOF option.
A new behaviour was introduced in OpenSSL 1.1.1e, when a peer does not send
close_notify before closing the connection. Previously, it was to return
SSL_ERROR_SYSCALL with errno 0, known since at least OpenSSL 0.9.7, and is
handled gracefully in nginx. Now it returns SSL_ERROR_SSL with a distinct
reason SSL_R_UNEXPECTED_EOF_WHILE_READING ("unexpected eof while reading").
This leads to critical errors seen in nginx within various routines such as
SSL_do_handshake(), SSL_read(), SSL_shutdown(). The behaviour was restored
in OpenSSL 1.1.1f, but presents in OpenSSL 3.0 by default.
Use of the SSL_OP_IGNORE_UNEXPECTED_EOF option added in OpenSSL 3.0 allows
to set a compatible behaviour to return SSL_ERROR_ZERO_RETURN:
https://git.openssl.org/?p=openssl.git;a=commitdiff;h=09b90e0
See for additional details: https://github.com/openssl/openssl/issues/11381
Sergey Kandaurov [Tue, 10 Aug 2021 20:43:16 +0000 (23:43 +0300)]
SSL: silenced warnings when building with OpenSSL 3.0.
The OPENSSL_SUPPRESS_DEPRECATED macro is used to suppress deprecation warnings.
This covers Session Tickets keys, SSL Engine, DH low level API for DHE ciphers.
Unlike OPENSSL_API_COMPAT, it works well with OpenSSL built with no-deprecated.
In particular, it doesn't unhide various macros in OpenSSL includes, which are
meant to be hidden under OPENSSL_NO_DEPRECATED.
Sergey Kandaurov [Tue, 10 Aug 2021 20:43:16 +0000 (23:43 +0300)]
SSL: using SSL_CTX_set0_tmp_dh_pkey() with OpenSSL 3.0 in dhparam.
Using PEM_read_bio_DHparams() and SSL_CTX_set_tmp_dh() is deprecated
as part of deprecating the low level DH functions in favor of EVP_PKEY:
https://git.openssl.org/?p=openssl.git;a=commitdiff;h=163f6dc
Sergey Kandaurov [Tue, 10 Aug 2021 20:42:59 +0000 (23:42 +0300)]
SSL: RSA data type is deprecated in OpenSSL 3.0.
The only consumer is a callback function for SSL_CTX_set_tmp_rsa_callback()
deprecated in OpenSSL 1.1.0. Now the function is conditionally compiled too.
Disabled HTTP/1.0 requests with Transfer-Encoding.
The latest HTTP/1.1 draft describes Transfer-Encoding in HTTP/1.0 as having
potentially faulty message framing as that could have been forwarded without
handling of the chunked encoding, and forbids processing subsequest requests
over that connection: https://github.com/httpwg/http-core/issues/879.
While handling of such requests is permitted, the most secure approach seems
to reject them.
Maxim Dounin [Tue, 3 Aug 2021 17:50:30 +0000 (20:50 +0300)]
SSL: set events ready flags after handshake.
The c->read->ready and c->write->ready flags might be reset during
the handshake, and not set again if the handshake was finished on
the other event. At the same time, some data might be read from
the socket during the handshake, so missing c->read->ready flag might
result in a connection hang, for example, when waiting for an SMTP
greeting (which was already received during the handshake).
Previously, when cleaning up a QUIC stream in shutdown mode,
ngx_quic_shutdown_quic() was called, which could close the QUIC connection
right away. This could be a problem if the connection was referenced up the
stack. For example, this could happen in ngx_quic_init_streams(),
ngx_quic_close_streams(), ngx_quic_create_client_stream() etc.
With a typical HTTP/3 client the issue is unlikely because of HTTP/3 uni
streams which need a posted event to close. In this case QUIC connection
cannot be closed right away.
Now QUIC connection read event is posted and it will shut down the connection
asynchronously.
Roman Arutyunyan [Thu, 29 Jul 2021 13:01:37 +0000 (16:01 +0300)]
HTTP/3: close connection on keepalive_requests * 2.
After receiving GOAWAY, client is not supposed to create new streams. However,
until client reads this frame, we allow it to create new streams, which are
gracefully rejected. To prevent client from abusing this algorithm, a new
limit is introduced. Upon reaching keepalive_requests * 2, server now closes
the entire QUIC connection claiming excessive load.
Roman Arutyunyan [Thu, 29 Jul 2021 09:49:16 +0000 (12:49 +0300)]
QUIC: limit in-flight bytes by congestion window.
Previously, in-flight byte counter and congestion window were properly
maintained, but the limit was not properly implemented.
Now a new datagram is sent only if in-flight byte counter is less than window.
The limit is datagram-based, which means that a single datagram may lead to
exceeding the limit, but the next one will not be sent.
Vladimir Homutov [Wed, 28 Jul 2021 14:23:18 +0000 (17:23 +0300)]
QUIC: handle EAGAIN properly on UDP sockets.
Previously, the error was ignored leading to unnecessary retransmits.
Now, unsent frames are returned into output queue, state is reset, and
timer is started for the next send attempt.
Roman Arutyunyan [Thu, 29 Jul 2021 07:03:36 +0000 (10:03 +0300)]
HTTP/3: require mandatory uni streams before additional ones.
As per quic-http-34:
Endpoints SHOULD create the HTTP control stream as well as the
unidirectional streams required by mandatory extensions (such as the
QPACK encoder and decoder streams) first, and then create additional
streams as allowed by their peer.
Previously, client could create and destroy additional uni streams unlimited
number of times before creating mandatory streams.
QUIC: avoid processing 1-RTT with incomplete handshake in OpenSSL.
OpenSSL is known to provide read keys for an encryption level before the
level is active in TLS, following the old BoringSSL API. In BoringSSL,
it was then fixed to defer releasing read keys until QUIC may use them.
Vladimir Homutov [Tue, 20 Jul 2021 09:37:12 +0000 (12:37 +0300)]
QUIC: the "quic_gso" directive.
The directive enables usage of UDP segmentation offloading by quic.
By default, gso is disabled since it is not always operational when
detected (depends on interface configuration).
Vladimir Homutov [Thu, 15 Jul 2021 11:22:00 +0000 (14:22 +0300)]
QUIC: added support for segmentation offloading.
To improve output performance, UDP segmentation offloading is used
if available. If there is a significant amount of data in an output
queue and path is verified, QUIC packets are not sent one-by-one,
but instead are collected in a buffer, which is then passed to kernel
in a single sendmsg call, using UDP GSO. Such method greatly decreases
number of system calls and thus system load.
Ruslan Ermilov [Mon, 5 Jul 2021 10:26:49 +0000 (13:26 +0300)]
Win32: use only preallocated memory in send/recv chain functions.
The ngx_wsasend_chain() and ngx_wsarecv_chain() functions were
modified to use only preallocated memory, and the number of
preallocated wsabufs was increased to 64.
Sometimes, QUIC packets need to be of certain (or minimal) size. This is
achieved by adding PADDING frames. It is possible, that adding padding will
affect header size, thus forcing us to recalculate padding size once more.
Ruslan Ermilov [Mon, 5 Jul 2021 10:09:23 +0000 (13:09 +0300)]
Use only preallocated memory in ngx_readv_chain() (ticket #1408).
In d1bde5c3c5d2, the number of preallocated iovec's for ngx_readv_chain()
was increased. Still, in some setups, the function might allocate memory
for iovec's from a connection pool, which is only freed when closing the
connection.
The ngx_readv_chain() function was modified to use only preallocated
memory, similarly to the ngx_writev_chain() change in 8e903522c17a.
Renamed header -> field per quic-qpack naming convention, in particular:
- Header Field -> Field Line
- Header Block -> (Encoded) Field Section
- Without Name Reference -> With Literal Name
- Header Acknowledgement -> Section Acknowledgment
Maxim Dounin [Mon, 28 Jun 2021 15:01:24 +0000 (18:01 +0300)]
Disabled control characters in the Host header.
Control characters (0x00-0x1f, 0x7f) and space are not expected to appear
in the Host header. Requests with such characters in the Host header are
now unconditionally rejected.
Maxim Dounin [Mon, 28 Jun 2021 15:01:20 +0000 (18:01 +0300)]
Improved logging of invalid headers.
In 71edd9192f24 logging of invalid headers which were rejected with the
NGX_HTTP_PARSE_INVALID_HEADER error was restricted to just the "client
sent invalid header line" message, without any attempts to log the header
itself.
This patch returns logging of the header up to the invalid character and
the character itself. The r->header_end pointer is now properly set
in all cases to make logging possible.
The same logging is also introduced when parsing headers from upstream
servers.
Maxim Dounin [Mon, 28 Jun 2021 15:01:18 +0000 (18:01 +0300)]
Disabled control characters and space in header names.
Control characters (0x00-0x1f, 0x7f), space, and colon were never allowed in
header names. The only somewhat valid use is header continuation which nginx
never supported and which is explicitly obsolete by RFC 7230.
Previously, such headers were considered invalid and were ignored by default
(as per ignore_invalid_headers directive). With this change, such headers
are unconditionally rejected.
It is expected to make nginx more resilient to various attacks, in particular,
with ignore_invalid_headers switched off (which is inherently unsecure, though
nevertheless sometimes used in the wild).
Maxim Dounin [Mon, 28 Jun 2021 15:01:15 +0000 (18:01 +0300)]
Disabled control characters in URIs.
Control characters (0x00-0x1f, 0x7f) were never allowed in URIs, and must
be percent-encoded by clients. Further, these are not believed to appear
in practice. On the other hand, passing such characters might make various
attacks possible or easier, despite the fact that currently allowed control
characters are not significant for HTTP request parsing.
Maxim Dounin [Mon, 28 Jun 2021 15:01:13 +0000 (18:01 +0300)]
Disabled spaces in URIs (ticket #196).
From now on, requests with spaces in URIs are immediately rejected rather
than allowed. Spaces were allowed in 31e9677b15a1 (0.8.41) to handle bad
clients. It is believed that now this behaviour causes more harm than
good.
And "%" can appear as a part of escaping itself. The following
characters are not allowed and need to be escaped: %00-%1F, %7F-%FF,
" ", """, "<", ">", "\", "^", "`", "{", "|", "}".
Not escaping ">" is known to cause problems at least with MS Exchange (see
http://nginx.org/pipermail/nginx-ru/2010-January/031261.html) and in
Tomcat (ticket #2191).
The patch adds escaping of the following chars in all URI parts: """, "<",
">", "\", "^", "`", "{", "|", "}". Note that comments are mostly preserved
to outline important characters being escaped.
Maxim Dounin [Mon, 28 Jun 2021 15:01:06 +0000 (18:01 +0300)]
Disabled requests with both Content-Length and Transfer-Encoding.
HTTP clients are not allowed to generate such requests since Transfer-Encoding
introduction in RFC 2068, and they are not expected to appear in practice
except in attempts to perform a request smuggling attack. While handling of
such requests is strictly defined, the most secure approach seems to reject
them.
Maxim Dounin [Mon, 28 Jun 2021 15:01:04 +0000 (18:01 +0300)]
Added CONNECT method rejection.
No valid CONNECT requests are expected to appear within nginx, since it
is not a forward proxy. Further, request line parsing will reject
proper CONNECT requests anyway, since we don't allow authority-form of
request-target. On the other hand, RFC 7230 specifies separate message
length rules for CONNECT which we don't support, so make sure to always
reject CONNECTs to avoid potential abuse.
Maxim Dounin [Mon, 28 Jun 2021 15:01:00 +0000 (18:01 +0300)]
Moved TRACE method rejection to a better place.
Previously, TRACE requests were rejected before parsing Transfer-Encoding.
This is not important since keepalive is not enabled at this point anyway,
though rejecting such requests after properly parsing other headers is
less likely to cause issues in case of further code changes.
After 2096b21fcd10, a single RST_STREAM(NO_ERROR) may not result in an error.
This change removes several unnecessary ctx->type checks for such a case.