| Commit message (Collapse) | Author | Age |
|
|
|
|
|
| |
libcrypt is no longer part of glibc, so it might not be available.
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
|
| |
|
|
|
|
|
|
|
| |
Notably, Apple Silicon CPUs have 128 byte cache line size,
which is twice the default configured for generic aarch64.
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each AIO (thread IO) operation being run is now accompanied with 1-minute
timer. This timer prevents unexpected shutdown of the worker process while
an AIO operation is running, and logs an alert if the operation is running
for too long.
This fixes "open socket left" alerts during worker processes shutdown
due to pending AIO (or thread IO) operations while corresponding requests
have no timers. In particular, such errors were observed while reading
cache headers (ticket #2162), and with worker_shutdown_timeout.
|
|
|
|
|
|
|
|
|
|
|
| |
When graceful shutdown was requested, and then nginx was forced to
do fast shutdown, it used to (incorrectly) complain about open sockets
left in connections which weren't yet closed when fast shutdown
was requested.
Fix is to avoid complaining about open sockets when fast shutdown was
requested after graceful one. Abnormal termination, if requested with
the WINCH signal, can still happen though.
|
|
|
|
|
| |
MTU selection starts by doubling the initial MTU until the first failure.
Then binary search is used to find the path MTU.
|
|\ |
|
| | |
|
|\| |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Binary upgrades are not supported without master process, but it is,
however, possible, that nginx running with master process is asked
to upgrade binary, and the configuration file as available on disk
at this time includes "master_process off;".
If this happens, listening sockets inherited from the previous binary
will have ls[i].previous set. But the old cycle on initial process
startup, including startup after binary upgrade, is destroyed by
ngx_init_cycle() once configuration parsing is complete. As a result,
an attempt to dereference ls[i].previous in ngx_event_process_init()
accesses already freed memory.
Fix is to avoid looking into ls[i].previous if the old cycle is already
freed.
With this change it is also no longer needed to clear ls[i].previous in
worker processes, so the relevant code was removed.
|
| |
| |
| |
| |
| |
| | |
Previously, if an event was posted by a read event handler, called by
ngx_close_idle_connections(), that event was not processed until the next
event loop iteration, which could happen after a timeout.
|
|\| |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When reading exactly rev->available bytes, rev->available might become 0
after FIONREAD usage introduction in efd71d49bde0. On the next call of
ngx_readv_chain() on systems with EPOLLRDHUP this resulted in return without
any actions, that is, with rev->ready set, and this in turn resulted in no
timers set in event pipe, leading to socket leaks.
Fix is to reset rev->ready in ngx_readv_chain() when returning due to
rev->available being 0 with EPOLLRDHUP, much like it is already done in
ngx_unix_recv(). This ensures that if rev->available will become 0, on
systems with EPOLLRDHUP support appropriate EPOLLRDHUP-specific handling
will happen on the next ngx_readv_chain() call.
While here, also synced ngx_readv_chain() to match ngx_unix_recv() and
reset rev->ready when returning due to rev->available being 0 with kqueue.
This is mostly cosmetic change, as rev->ready is anyway reset when
rev->available is set to 0.
|
|\| |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In 7583:efd71d49bde0 (nginx 1.17.5) along with introduction of the
ioctl(FIONREAD) support proper handling of systems without EPOLLRDHUP
support in the kernel (but with EPOLLRDHUP in headers) was broken.
Before the change, rev->available was never set to 0 unless
ngx_use_epoll_rdhup was also set (that is, runtime test for EPOLLRDHUP
introduced in 6536:f7849bfb6d21 succeeded). After the change,
rev->available might reach 0 on systems without runtime EPOLLRDHUP
support, stopping further reading in ngx_readv_chain() and ngx_unix_recv().
And, if EOF happened to be already reported along with the last event,
it is not reported again by epoll_wait(), leading to connection hangs
and timeouts on such systems.
This affects Linux kernels before 2.6.17 if nginx was compiled
with newer headers, and, more importantly, emulation layers, such as
DigitalOcean's App Platform's / gVisor's epoll emulation layer.
Fix is to explicitly check ngx_use_epoll_rdhup before the corresponding
rev->pending_eof tests in ngx_readv_chain() and ngx_unix_recv().
|
| | |
|
| | |
|
| |
| |
| |
| |
| | |
The NGX_HAVE_ADDRINFO_CMSG macro is defined when at least one of methods
to deal with corresponding control message is available.
|
| | |
|
|\| |
|
| |
| |
| |
| |
| |
| | |
The SF_NOCACHE flag, introduced in FreeBSD 11 along with the new non-blocking
sendfile() implementation by glebius@, makes it possible to use sendfile()
along with the "directio" directive.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Starting with FreeBSD 11, there is no need to use AIO operations to preload
data into cache for sendfile(SF_NODISKIO) to work. Instead, sendfile()
handles non-blocking loading data from disk by itself. It still can, however,
return EBUSY if a page is already being loaded (for example, by a different
process). If this happens, we now post an event for the next event loop
iteration, so sendfile() is retried "after a short period", as manpage
recommends.
The limit of the number of EBUSY tolerated without any progress is preserved,
but now it does not result in an alert, since on an idle system event loop
iteration might be very short and EBUSY can happen many times in a row.
Instead, SF_NODISKIO is simply disabled for one call once the limit is
reached.
With this change, sendfile(SF_NODISKIO) is now used automatically as long as
sendfile() is enabled, and no longer requires "aio on;".
|
|\| |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
With sendfile in threads, "task already active" alerts might appear in logs
if a write event happens on the main HTTP/2 connection, triggering a sendfile
in threads while another thread operation is already running. Observed
with "aio threads; aio_write on; sendfile on;" and with thread event handlers
modified to post a write event to the main HTTP/2 connection (though can
happen without any modifications).
Similarly, sendfile() with AIO preloading on FreeBSD can trigger duplicate
aio operation, resulting in "second aio post" alerts. This is, however,
harder to reproduce, especially on modern FreeBSD systems, since sendfile()
usually does not return EBUSY.
Fix is to avoid starting a sendfile operation if other thread operation
is active by checking r->aio in the thread handler (and, similarly, in
aio preload handler). The added check also makes duplicate calls protection
redundant, so it is removed.
|
| |
| |
| |
| |
| | |
Full stream shutdown is now called from stream cleanup handler instead of
explicitly sending frames.
|
|\| |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On Linux starting with 2.6.16, sendfile() silently limits all operations
to MAX_RW_COUNT, defined as (INT_MAX & PAGE_MASK). This incorrectly
triggered the interrupt check, and resulted in 0-sized writev() on the
next loop iteration.
Fix is to make sure the limit is always checked, so we will return from
the loop if the limit is already reached even if number of bytes sent is
not exactly equal to the number of bytes we've tried to send.
|
|\| |
|
| |
| |
| |
| | |
This allows to build nginx on macOS with -Wdeprecated-declarations.
|
| |
| |
| |
| | |
This was broken by 2dfd313f22f2.
|
|\| |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In d1bde5c3c5d2, the number of preallocated iovec's for ngx_readv_chain()
was increased. Still, in some setups, the function might allocate memory
for iovec's from a connection pool, which is only freed when closing the
connection.
The ngx_readv_chain() function was modified to use only preallocated
memory, similarly to the ngx_writev_chain() change in 8e903522c17a.
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
To improve output performance, UDP segmentation offloading is used
if available. If there is a significant amount of data in an output
queue and path is verified, QUIC packets are not sent one-by-one,
but instead are collected in a buffer, which is then passed to kernel
in a single sendmsg call, using UDP GSO. Such method greatly decreases
number of system calls and thus system load.
|
|/
|
|
|
|
|
|
| |
Additionally, the ngx_init_srcaddr_cmsg() function is introduced which
initializes control message with connection local address.
The NGX_HAVE_ADDRINFO_CMSG macro is defined when at least one of methods
to deal with corresponding control message is available.
|
|
|
|
|
|
|
|
|
| |
Due to structure's alignment, some uninitialized memory contents may have
been passed between processes.
Zeroing was removed in 0215ec9aaa8a.
Reported by Johnny Wang.
|
| |
|
|
|
|
|
|
| |
The strerrordesc_np() function, introduced in glibc 2.32, provides an
async-signal-safe way to obtain error messages. This makes it possible
to avoid copying error messages.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, systems without sys_nerr (or _sys_nerr) were handled with an
assumption that errors start at 0 and continuous. This is, however, not
something POSIX requires, and not true on some platforms.
Notably, on Linux, where sys_nerr is no longer available for newly linked
binaries starting with glibc 2.32, there are gaps in error list, which
used to stop us from properly detecting maximum errno. Further, on
GNU/Hurd errors start at 0x40000001.
With this change, maximum errno detection is moved to the runtime code,
now able to ignore gaps, and also detects the first error if needed.
This fixes observed "Unknown error" messages as seen on Linux with
glibc 2.32 and on GNU/Hurd.
|
|
|
|
|
|
|
|
|
|
|
| |
Clearing cache based on free space left on a file system is
expected to allow better disk utilization in some cases, notably
when disk space might be also used for something other than nginx
cache (including nginx own temporary files) and while loading
cache (when cache size might be inaccurate for a while, effectively
disabling max_size cache clearing).
Based on a patch by Adam Bambuch.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With XFS, using "allocsize=64m" mount option results in large preallocation
being reported in the st_blocks as returned by fstat() till the file is
closed. This in turn results in incorrect cache size calculations and
wrong clearing based on max_size.
To avoid too aggressive cache clearing on such volumes, st_blocks values
which result in sizes larger than st_size and eight blocks (an arbitrary
limit) are no longer trusted, and we use st_size instead.
The ngx_de_fs_size() counterpart is intentionally not modified, as
it is used on closed files and hence not affected by this problem.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NFS on Linux is known to report wsize as a block size (in both f_bsize
and f_frsize, both in statfs() and statvfs()). On the other hand,
typical file system block sizes on Linux (ext2/ext3/ext4, XFS) are limited
to pagesize. (With FAT, block sizes can be at least up to 512k in
extreme cases, but this doesn't really matter, see below.)
To avoid too aggressive cache clearing on NFS volumes on Linux, block
sizes larger than pagesize are now ignored.
Note that it is safe to ignore large block sizes. Since 3899:e7cd13b7f759
(1.0.1) cache size is calculated based on fstat() st_blocks, and rounding
to file system block size is preserved mostly for Windows.
Note well that on other OSes valid block sizes seen are at least up
to 65536. In particular, UFS on FreeBSD is known to work well with block
and fragment sizes set to 65536.
|
| |
|
|
|
|
|
|
| |
Listening UNIX sockets were not removed on graceful shutdown, preventing
the next runs. The fix is to replace the custom socket closing code in
ngx_master_process_cycle() by the ngx_close_listening_sockets() call.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it possible to avoid looping for a long time while working
with a fast enough peer when data are added to the socket buffer faster
than we are able to read and process them (ticket #1431). This is
basically what we already do on FreeBSD with kqueue, where information
about the number of bytes in the socket buffer is returned by
the kevent() call.
With other event methods rev->available is now set to -1 when the socket
is ready for reading. Later in ngx_recv() and ngx_recv_chain(), if
full buffer is received, real number of bytes in the socket buffer is
retrieved using ioctl(FIONREAD). Reading more than this number of bytes
ensures that even with edge-triggered event methods the event will be
triggered again, so it is safe to stop processing of the socket and
switch to other connections.
Using ioctl(FIONREAD) only after reading a full buffer is an optimization.
With this approach we only call ioctl(FIONREAD) when there are at least
two recv()/readv() calls.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AIO support in nginx was originally developed against FreeBSD versions 4-6,
where the sival_ptr field was named as sigval_ptr (seemingly by mistake[1]),
which made nginx use the only name available then. The standard-complaint
name was restored in 2005 (first appeared in FreeBSD 7.0, 2008), retaining
compatibility with previous versions[2][3]. In DragonFly, similar changes
were committed in 2009[4], with backward compatibility recently removed[5].
The change switches to the standard name, retaining compatibility with old
FreeBSD versions.
[1] https://svnweb.freebsd.org/changeset/base/48621
[2] https://svnweb.freebsd.org/changeset/base/152029
[3] https://svnweb.freebsd.org/changeset/base/174003
[4] https://gitweb.dragonflybsd.org/dragonfly.git/commit/3693401
[5] https://gitweb.dragonflybsd.org/dragonfly.git/commit/7875042
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previous interface of ngx_open_dir() assumed that passed directory name
has a room for NGX_DIR_MASK at the end (NGX_DIR_MASK_LEN bytes). While all
direct users of ngx_dir_open() followed this interface, this also implied
similar requirements for indirect uses - in particular, via ngx_walk_tree().
Currently none of ngx_walk_tree() uses provides appropriate space, and
fixing this does not look like a right way to go. Instead, ngx_dir_open()
interface was changed to not require any additional space and use
appropriate allocations instead.
|
|
|
|
|
|
| |
Previously, "%uA" was used, which corresponds to ngx_atomic_uint_t.
Size of ngx_atomic_uint_t can be easily different from uint64_t,
leading to undefined results.
|
|
|
|
|
|
|
|
|
| |
The bug in question was fixed in glibc 2.3.2 and is no longer expected
to manifest itself on real servers. On the other hand, the workaround
causes compilation problems on various systems. Previously, we've
already fixed the code to compile with musl libc (fd6fd02f6a4d), and
now it is broken on Fedora 28 where glibc's crypt library was replaced
by libxcrypt. So the workaround was removed.
|
|
|
|
| |
No functional changes.
|