Gzip: updated handling of zlib variant from Intel.
In current versions (all versions based on zlib 1.2.11, at least
since 2018) it no longer uses 64K hash and does not force window
bits to 13 if it is less than 13. That is, it needs just 16 bytes
more memory than normal zlib, so these bytes are simply added to
the normal size calculation.
Maxim Dounin [Sun, 28 Mar 2021 14:45:39 +0000 (17:45 +0300)]
Fixed handling of already closed connections.
In limit_req, auth_delay, and upstream code to check for broken
connections, tests for possible connection close by the client
did not work if the connection was already closed when relevant
event handler was set. This happened because there were no additional
events in case of edge-triggered event methods, and read events
were disabled in case of level-triggered ones.
Fix is to explicitly post a read event if the c->read->ready flag
is set.
Maxim Dounin [Sun, 28 Mar 2021 14:45:35 +0000 (17:45 +0300)]
Upstream: fixed non-buffered proxying with eventport.
For new data to be reported with eventport on Solaris,
ngx_handle_read_event() needs to be called after reading response
headers. To do so, ngx_http_upstream_process_non_buffered_upstream()
now called unconditionally if there are no prepread data. This
won't cause any read() syscalls as long as upstream connection
is not ready for reading (c->read->ready is not set), but will result
in proper handling of all events.
Maxim Dounin [Sun, 28 Mar 2021 14:45:31 +0000 (17:45 +0300)]
Resolver: added missing event handling after reading.
If we need to be notified about further events, ngx_handle_read_event()
needs to be called after a read event is processed. Without this,
an event can be removed from the kernel and won't be reported again,
notably when using oneshot event methods, such as eventport on Solaris.
While here, error handling is also added, similar to one present in
ngx_resolver_tcp_read(). This is not expected to make a difference
and mostly added for consistency.
Maxim Dounin [Sun, 28 Mar 2021 14:45:29 +0000 (17:45 +0300)]
Events: fixed "port_dissociate() failed" alerts with eventport.
If an attempt is made to delete an event which was already reported,
port_dissociate() returns an error. Fix is avoid doing anything if
ev->active is not set.
Possible alternative approach would be to avoid calling ngx_del_event()
at all if ev->active is not set. This approach, however, will require
something else to re-add the other event of the connection, since both
read and write events are dissociated if an event is reported on a file
descriptor. Currently ngx_eventport_del_event() re-associates write
event if called to delete read event, and vice versa.
Maxim Dounin [Thu, 25 Mar 2021 22:44:59 +0000 (01:44 +0300)]
Events: fixed expiration of timers in the past.
If, at the start of an event loop iteration, there are any timers
in the past (including timers expiring now), the ngx_process_events()
function is called with zero timeout, and returns immediately even
if there are no events. But the following code only calls
ngx_event_expire_timers() if time actually changed, so this results
in nginx spinning in the event loop till current time changes.
While such timers are not expected to appear under normal conditions,
as all such timers should be removed on previous event loop iterations,
they still can appear due to bugs, zero timeouts set in the configuration
(if this is not explicitly handled by the code), or due to external
time changes on systems without clock_gettime(CLOCK_MONOTONIC).
Fix is to call ngx_event_expire_timers() unconditionally. Calling
it on each event loop iteration is not expected to be significant from
performance point of view, especially compared to a syscall in
ngx_process_events().
Maxim Dounin [Thu, 25 Mar 2021 22:44:57 +0000 (01:44 +0300)]
HTTP/2: improved handling of "keepalive_timeout 0".
Without explicit handling, a zero timer was actually added, leading to
multiple unneeded syscalls. Further, sending GOAWAY frame early might
be beneficial for clients.
Sergey Kandaurov [Wed, 24 Mar 2021 11:03:33 +0000 (14:03 +0300)]
Cancel keepalive and lingering close on EOF better (ticket #2145).
Unlike in 75e908236701, which added the logic to ngx_http_finalize_request(),
this change moves it to a more generic routine ngx_http_finalize_connection()
to cover cases when a request is finalized with NGX_DONE.
In particular, this fixes unwanted connection transition into the keepalive
state after receiving EOF while discarding request body. With edge-triggered
event methods that means the connection will last for extra seconds as set in
the keepalive_timeout directive.
Maxim Dounin [Tue, 23 Mar 2021 13:52:23 +0000 (16:52 +0300)]
gRPC: fixed handling of padding on DATA frames.
The response size check introduced in 39501ce97e29 did not take into
account possible padding on DATA frames, resulting in incorrect
"upstream sent response body larger than indicated content length" errors
if upstream server used padding in responses with known length.
Fix is to check the actual size of response buffers produced by the code,
similarly to how it is done in other protocols, instead of checking
the size of DATA frames.
Currently listener contains rbtree with multiple nodes for single QUIC
connection: each corresponding to specific server id. Each udp node points
to same ngx_connection_t, which points to QUIC connection via c->udp field.
Thus when an event handler is called, it only gets ngx_connection_t with
c->udp pointing to QUIC connection. This makes it hard to obtain actual
node which was used to dispatch packet (it requires to repeat DCID lookup).
Additionally, ngx_quic_connection_t->udp field is only needed to keep a
pointer in c->udp. The node is not added into the tree and does not carry
useful information.
Sometimes it is required to process datagram properties at higher level (i.e.
QUIC is interested in source address which may change and IP options). The
patch adds ngx_udp_dgram_t structure used to pass packet-related information
in c->udp.
Roman Arutyunyan [Thu, 11 Mar 2021 12:25:11 +0000 (15:25 +0300)]
QUIC: do not copy input data.
Previously, when a new datagram arrived, data were copied from the UDP layer
to the QUIC layer via c->recv() interface. Now UDP buffer is accessed
directly.
Sergey Kandaurov [Wed, 31 Mar 2021 18:43:17 +0000 (21:43 +0300)]
QUIC: HKDF API compatibility with OpenSSL master branch.
OpenSSL 3.0 started to require HKDF-Extract output PRK length pointer
used to represent the amount of data written to contain the length of
the key buffer before the call. EVP_PKEY_derive() documents this.
See HKDF_Extract() internal implementation update in this change:
https://github.com/openssl/openssl/commit/5a285ad
Roman Arutyunyan [Mon, 15 Mar 2021 13:39:33 +0000 (16:39 +0300)]
QUIC: connection shutdown.
The function ngx_quic_shutdown_connection() waits until all non-cancelable
streams are closed, and then closes the connection. In HTTP/3 cancelable
streams are all unidirectional streams except push streams.
The function is called from HTTP/3 when client reaches keepalive_requests.
Maxim Dounin [Fri, 5 Mar 2021 14:16:32 +0000 (17:16 +0300)]
Mail: sending of the PROXY protocol to backends.
Activated with the "proxy_protocol" directive. Can be combined with
"listen ... proxy_protocol;" and "set_real_ip_from ...;" to pass
client address provided to nginx in the PROXY protocol header.
Maxim Dounin [Fri, 5 Mar 2021 14:16:24 +0000 (17:16 +0300)]
Mail: parsing of the PROXY protocol from clients.
Activated with the "proxy_protocol" parameter of the "listen" directive.
Obtained information is passed to the auth_http script in Proxy-Protocol-Addr,
Proxy-Protocol-Port, Proxy-Protocol-Server-Addr, and Proxy-Protocol-Server-Port
headers.
Maxim Dounin [Fri, 5 Mar 2021 14:16:19 +0000 (17:16 +0300)]
Mail: postponed session initialization under accept mutex.
Similarly to 40e8ce405859 in the stream module, this reduces the time
accept mutex is held. This also simplifies following changes to
introduce PROXY protocol support.
Maxim Dounin [Fri, 5 Mar 2021 14:16:17 +0000 (17:16 +0300)]
Mail: added missing event handling after reading data.
If we need to be notified about further events, ngx_handle_read_event()
needs to be called after a read event is processed. Without this,
an event can be removed from the kernel and won't be reported again,
notably when using oneshot event methods, such as eventport on Solaris.
For consistency, existing ngx_handle_read_event() call removed from
ngx_mail_read_command(), as this call only covers one of the code paths
where ngx_mail_read_command() returns NGX_AGAIN. Instead, appropriate
processing added to the callers, covering all code paths where NGX_AGAIN
is returned.
Maxim Dounin [Fri, 5 Mar 2021 14:16:16 +0000 (17:16 +0300)]
Mail: added missing event handling after blocking events.
As long as a read event is blocked (ignored), ngx_handle_read_event()
needs to be called to make sure no further notifications will be
triggered when using level-triggered event methods, such as select() or
poll().
Maxim Dounin [Fri, 5 Mar 2021 14:16:15 +0000 (17:16 +0300)]
Events: fixed eventport handling in ngx_handle_read_event().
The "!rev->ready" test seems to be a typo, introduced in the original
commit (719:f30b1a75fd3b). The ngx_handle_write_event() code properly
tests for "rev->ready" instead.
Due to this typo, read events might be unexpectedly removed during
proxying after an event on the other part of the proxied connection.
Catched by mail proxying tests.
Maxim Dounin [Mon, 1 Mar 2021 17:00:45 +0000 (20:00 +0300)]
Introduced strerrordesc_np() support.
The strerrordesc_np() function, introduced in glibc 2.32, provides an
async-signal-safe way to obtain error messages. This makes it possible
to avoid copying error messages.
Maxim Dounin [Mon, 1 Mar 2021 17:00:43 +0000 (20:00 +0300)]
Improved maximum errno detection.
Previously, systems without sys_nerr (or _sys_nerr) were handled with an
assumption that errors start at 0 and continuous. This is, however, not
something POSIX requires, and not true on some platforms.
Notably, on Linux, where sys_nerr is no longer available for newly linked
binaries starting with glibc 2.32, there are gaps in error list, which
used to stop us from properly detecting maximum errno. Further, on
GNU/Hurd errors start at 0x40000001.
With this change, maximum errno detection is moved to the runtime code,
now able to ignore gaps, and also detects the first error if needed.
This fixes observed "Unknown error" messages as seen on Linux with
glibc 2.32 and on GNU/Hurd.
Maxim Dounin [Mon, 1 Mar 2021 14:31:28 +0000 (17:31 +0300)]
HTTP/2: client_header_timeout before first request (ticket #2142).
With this change, behaviour of HTTP/2 becomes even closer to HTTP/1.x,
and client_header_timeout instead of keepalive_timeout is used before
the first request is received.
This fixes HTTP/2 connections being closed even before the first request
if "keepalive_timeout 0;" was used in the configuration; the problem
appeared in f790816a0e87 (1.19.7).
Maxim Dounin [Thu, 25 Feb 2021 20:42:25 +0000 (23:42 +0300)]
Contrib: vim syntax, default highlighting (ticket #2141).
Using default highlighting makes it possible to easily overrule
highlighting specified in the syntax file, see ":highlight-default"
in vim help for details.
Roman Arutyunyan [Thu, 18 Feb 2021 09:22:28 +0000 (12:22 +0300)]
QUIC: set idle timer when sending an ack-eliciting packet.
As per quic-transport-34:
An endpoint also restarts its idle timer when sending an ack-eliciting
packet if no other ack-eliciting packets have been sent since last receiving
and processing a packet.
Maxim Dounin [Thu, 11 Feb 2021 18:52:24 +0000 (21:52 +0300)]
HTTP/2: keepalive_timeout now armed once between requests.
Previously, PINGs and other frames extended possible keepalive time,
making it possible to keep an open HTTP/2 connection for a long time.
Now the connection is always closed as long as keepalive_timeout expires,
similarly to how it happens in HTTP/1.x.
Note that as a part of this change, incomplete frames are no longer
trigger a separate timeout, so http2_recv_timeout (replaced by
client_header_timeout in previous patches) is essentially cancelled.
The client_header_timeout is, however, used for SSL handshake and
while reading HEADERS frames.
Maxim Dounin [Thu, 11 Feb 2021 18:52:23 +0000 (21:52 +0300)]
HTTP/2: removed http2_idle_timeout and http2_max_requests.
Instead, keepalive_timeout and keepalive_requests are now used. This
is expected to simplify HTTP/2 code and usage. This also matches
directives used by upstream module for all protocols.
In case of default settings, this effectively changes maximum number
of requests per connection from 1000 to 100. This looks acceptable,
especially given that HTTP/2 code now properly supports lingering close.
Further, this changes default keepalive timeout in HTTP/2 from 300 seconds
to 75 seconds. This also looks acceptable, and larger than PING interval
used by Firefox (network.http.spdy.ping-threshold defaults to 58s),
the only browser to use PINGs.
Maxim Dounin [Thu, 11 Feb 2021 18:52:20 +0000 (21:52 +0300)]
HTTP/2: removed http2_recv_timeout.
Instead, the client_header_timeout is now used for HTTP/2 reading.
Further, the timeout is changed to be set once till no further data
left to read, similarly to how client_header_timeout is used in other
places.
Maxim Dounin [Thu, 11 Feb 2021 18:52:17 +0000 (21:52 +0300)]
HTTP/2: fixed reusing connections with active requests.
New connections are marked reusable by ngx_http_init_connection() if there
are no data available for reading. As a result, if SSL is not used,
ngx_http_v2_init() might be called when the connection is marked reusable.
If a HEADERS frame is immediately available for reading, this resulted
in connection being preserved in reusable state with an active request,
and possibly closed later as if during worker shutdown (that is, after
all active requests were finalized).
Fix is to explicitly mark connections non-reusable in ngx_http_v2_init()
instead of (incorrectly) assuming they are already non-reusable.
Maxim Dounin [Thu, 11 Feb 2021 18:52:11 +0000 (21:52 +0300)]
Additional connections reuse.
If ngx_drain_connections() fails to immediately reuse any connections
and there are no free connections, it now additionally tries to reuse
a connection again. This helps to provide at least one free connection
in case of HTTP/2 with lingering close, where merely trying to reuse
a connection once does not free it, but makes it reusable again,
waiting for lingering close.
Maxim Dounin [Thu, 11 Feb 2021 18:52:09 +0000 (21:52 +0300)]
Reuse of connections in lingering close.
This is particularly important in HTTP/2, where keepalive connections
are closed with lingering. Before the patch, reusing a keepalive HTTP/2
connection resulted in the connection waiting for lingering close to
remain in the reusable connections queue, preventing ngx_drain_connections()
from closing additional connections.
The patch fixes it by marking the connection reusable again, and so
moving it in the reusable connections queue. Further, it makes actually
possible to reuse such connections if needed.
Vladimir Homutov [Wed, 10 Feb 2021 11:10:14 +0000 (14:10 +0300)]
QUIC: distinguish reserved transport parameters in logging.
18.1. Reserved Transport Parameters
Transport parameters with an identifier of the form "31 * N + 27" for
integer values of N are reserved to exercise the requirement that
unknown transport parameters be ignored. These transport parameters
have no semantics, and can carry arbitrary values.
Roman Arutyunyan [Fri, 12 Feb 2021 11:40:33 +0000 (14:40 +0300)]
QUIC: improved setting the lost timer.
Setting the timer is brought into compliance with quic-recovery-34. Now it's
set from a single function ngx_quic_set_lost_timer() that takes into account
both loss detection and PTO. The following issues are fixed with this change:
- when in loss detection mode, discarding a context could turn off the
timer forever after switching to the PTO mode
- when in loss detection mode, sending a packet resulted in rescheduling the
timer as if it's always in the PTO mode
QUIC: disabled non-immediate ACKs for Initial and Handshake.
As per quic-transport-33:
An endpoint MUST acknowledge all ack-eliciting Initial and Handshake
packets immediately
If a packet carrying Initial or Handshake ACK was lost, a non-immediate ACK
should not be sent later. Instead, client is expected to send a new packet
to acknowledge.
Sending non-immediate ACKs for Initial packets can cause the client to
generate an inflated RTT sample.
The token generation in QUIC is reworked. Single host key is used to generate
all required keys of needed sizes using HKDF.
The "quic_stateless_reset_token_key" directive is removed. Instead, the
"quic_host_key" directive is used, which reads key from file, or sets it
to random bytes if not specified.