The subject of this post stems from a few lines in the Nginx logs that I happened to notice:
2014/07/31 18:51:59 [notice] 29056#0: gracefully shutting down
2014/07/31 18:51:59 [notice] 29056#0: exiting
2014/07/31 18:51:59 [notice] 29056#0: exit
2014/07/31 18:51:59 [notice] 28124#0: signal 17 (SIGCHLD) received
2014/07/31 18:51:59 [notice] 28124#0: worker process 29056 exited with code 0
2014/07/31 18:51:59 [notice] 28124#0: signal 29 (SIGIO) received
During a reload operation, Nginx received a SIGIO signal. In the previous post—while analyzing Nginx’s signal-handling code—I observed that when this signal is caught by Nginx, the signal handler merely sets the global variable ngx_sigio to 1, performing no other actions. Furthermore, this signal did not appear during reopen, quit, or stop operations.
What exactly is the purpose of this signal, and under what circumstances does Nginx receive it?
I found a relevant Q&A thread on Stack Overflow:
For async signaling code should do these steps: First you should allow your process receive SIGIO and then your socket or pipe should be put to async mode.
Search for these lines in your code:
fcntl(fd, F_SETOWN, getpid()); // allow the process to receive SIGIO
and
fcntl(fd, F_SETFL, FASYNC); // make socket/pipe non-blocking
or
fcntl(fd, F_SETFL, O_NONBLOCK); // make socket/pipe non-blocking
The fcntl manual (man 2 fcntl) describes F_SETOWN as follows:
F_SETOWN (int) Set the process ID or process group ID that will receive SIGIO and SIGURG signals for events on file descriptor fd to the ID given in arg. … If you set the O_ASYNC status flag on a file descriptor by using the F_SETFL command of fcntl(), a SIGIO signal is sent whenever input or output becomes possible on that file descriptor. …
from Linux Device Drivers
By enabling asynchronous notification, this application can receive a signal whenever data becomes available and need not concern itself with polling.
SIGIO is used to asynchronously retrieve I/O events in situations where one does not wish to—or cannot—employ polling mechanisms (such as select, epoll, etc.). Additionally, FASYNC is a legacy flag on UNIX systems; in modern BSD and Linux kernels, as well as their respective libc implementations, it has been superseded by O_ASYNC.
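To make the two steps from the Stack Overflow answer concrete, here is a minimal sketch in C, assuming Linux and a single-threaded process. It names the process as the SIGIO owner of one end of a socketpair, turns on O_ASYNC, and checks that a write on the other end raises the signal. The helper names enable_sigio and sigio_after_peer_write are made up for illustration and are not part of Nginx.

```c
#define _GNU_SOURCE             /* makes O_ASYNC visible under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t sigio_count = 0;

/* Like the Nginx handler, only record that the signal arrived. */
static void on_sigio(int signo)
{
    (void) signo;
    sigio_count++;
}

/* Hypothetical helper: ask the kernel to send SIGIO for fd to this process. */
static int enable_sigio(int fd)
{
    int flags;

    assert(fd >= 0);

    if (fcntl(fd, F_SETOWN, getpid()) == -1) {   /* who receives SIGIO */
        return -1;
    }
    flags = fcntl(fd, F_GETFL);
    if (flags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_ASYNC);  /* turn notification on */
}

/* Returns 1 if SIGIO followed a write by the peer, 0 if not, -1 on error. */
int sigio_after_peer_write(void)
{
    int              fds[2], i, seen;
    struct sigaction sa;
    struct timespec  ts = { 0, 1000000 };        /* 1 ms */

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigio;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGIO, &sa, NULL) == -1) {
        return -1;
    }
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1) {
        return -1;
    }
    sigio_count = 0;
    if (enable_sigio(fds[0]) == -1 || write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    /* Give the signal a moment to arrive (at most about 100 ms). */
    for (i = 0; i < 100 && sigio_count == 0; i++) {
        nanosleep(&ts, NULL);
    }
    seen = (sigio_count > 0);
    close(fds[0]);                               /* drop the async end first */
    close(fds[1]);
    return seen;
}
```

The handler is deliberately left installed after the function returns, because the default action for SIGIO is to terminate the process.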
Now, let’s take a look at the relevant code within Nginx:
/* os/unix/ngx_process.c:129 */
ngx_nonblocking(ngx_processes[s].channel[0])
...
ngx_nonblocking(ngx_processes[s].channel[1])
...
on = 1;
ioctl(ngx_processes[s].channel[0], FIOASYNC, &on)
...
fcntl(ngx_processes[s].channel[0], F_SETOWN, ngx_pid)
...
fcntl(ngx_processes[s].channel[0], F_SETFD, FD_CLOEXEC)
...
fcntl(ngx_processes[s].channel[1], F_SETFD, FD_CLOEXEC)
In slight contrast to the description found on Stack Overflow, Nginx utilizes ioctl(FIOASYNC) to enable asynchronous event notifications on the channel (within Nginx, a socketpair serves as the communication channel between the master process and the worker processes) in order to monitor events occurring on that channel.
So, under what circumstances does Nginx receive a SIGIO signal?
Nginx employs socketpair(AF_UNIX, SOCK_STREAM) for unidirectional communication with its worker processes. (Although socketpair on Linux is inherently full-duplex, the Nginx codebase utilizes it solely for unidirectional purposes: the master process sends commands—such as termination requests or instructions to reopen files—to the worker processes.) We designate the two endpoints of this pair as the “write end” and the “read end”—corresponding to channel[0] and channel[1], respectively.
In addition to maintaining their own specific “read end” file descriptors, each Nginx worker process also acquires and stores the “write end” descriptors belonging to the other worker processes. This acquisition occurs either through inheritance (from the master process) or via file descriptor passing (receiving descriptors sent by the master process through the “read end”). This design choice was presumably intended to facilitate inter-worker process communication; however, in the current version of the codebase, these “write end” descriptors held by the worker processes are not actually being utilized.
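File descriptor passing over a UNIX-domain socket uses an SCM_RIGHTS control message. The following is a generic sketch of that mechanism, not Nginx's own channel code; send_fd and recv_fd are hypothetical helper names, and a single data byte is sent alongside the descriptor for portability.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
    struct cmsghdr align;
    char           buf[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor fd over the UNIX socket sock; returns 0 or -1. */
int send_fd(int sock, int fd)
{
    char            byte = 'F';
    struct iovec    iov;
    struct msghdr   msg;
    struct cmsghdr *cm;
    union fd_ctl    ctl;

    assert(sock >= 0 && fd >= 0);

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;                  /* payload is a descriptor */
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock; returns the new fd or -1. */
int recv_fd(int sock)
{
    char            byte;
    int             fd;
    struct iovec    iov;
    struct msghdr   msg;
    struct cmsghdr *cm;
    union fd_ctl    ctl;

    assert(sock >= 0);

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET
        || cm->cmsg_type != SCM_RIGHTS
        || cm->cmsg_len != CMSG_LEN(sizeof(int)))
    {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cm), sizeof(int));     /* kernel installed a new fd */
    return fd;
}
```

The received descriptor is a new number in the receiving process, but it refers to the same kernel object as the one that was sent.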
When handling a "reload" operation, Nginx first creates new worker processes from the updated configuration file and then sends termination commands to the old worker processes:
/* ngx_master_process_cycle */
if (ngx_reconfigure) {
    ...
    cycle = ngx_init_cycle(cycle);
    ...
    ngx_start_worker_processes(cycle, ccf->worker_processes,
                               NGX_PROCESS_JUST_RESPAWN);
    ngx_start_cache_manager_processes(cycle, 1);
    /* allow new processes to start */
    ngx_msleep(100);
    ...
    ngx_signal_worker_processes(cycle, ngx_signal_value(NGX_SHUTDOWN_SIGNAL));
}
Upon the termination of an old worker process, all read-end and write-end descriptors it held are automatically closed and reclaimed by the operating system. Subsequently, the master process receives the SIGCHLD signal indicating the worker process’s exit, thereby triggering the worker process reclamation logic: ngx_reap_children.
/* ngx_reap_children */
for (i = 0; i < ngx_last_process; i++) {
    ...
    if (ngx_processes[i].exited) {
        if (!ngx_processes[i].detached) {
            /* close the master's copies of both ends of the exited worker's channel */
            ngx_close_channel(ngx_processes[i].channel, cycle->log);
            ngx_processes[i].channel[0] = -1;
            ngx_processes[i].channel[1] = -1;
            ch.pid = ngx_processes[i].pid;
            ch.slot = i;
            /* tell every remaining worker which slot has gone away */
            for (n = 0; n < ngx_last_process; n++) {
                if (ngx_processes[n].exited
                    || ngx_processes[n].pid == -1
                    || ngx_processes[n].channel[0] == -1)
                {
                    continue;
                }
                ngx_write_channel(ngx_processes[n].channel[0],
                                  &ch, sizeof(ngx_channel_t), cycle->log);
            }
            ...
        }
        ...
    }
    ...
}
Subsequently, upon receiving this command on their respective read ends, the other normally operating worker processes close the write end they had retained for the exited worker, which completes the reload.
Let us briefly summarize the aforementioned workflow (in chronological order):
- When a new worker process is created, it inherits the read and write ends of all existing worker processes from the master process. (During the worker process initialization phase, specifically ngx_worker_process_init, it closes all read-end file descriptors except for its own; it also closes the write-end descriptor corresponding to its own read end, while retaining the write-end descriptors belonging to all other processes, including those of Worker Process A.)
- Worker Process A terminates, closing all of its own read-end and write-end file descriptors.
- The master process closes the read-end and write-end file descriptors corresponding to Worker Process A that it maintains internally.
- The other worker processes close the write-end file descriptors corresponding to Worker Process A that they maintain internally.
Before drawing a final conclusion, let us briefly review a few key concepts:
- File descriptors are process-level resources.
- Within the kernel, each file descriptor corresponds to a specific data structure, and this structure can be shared among processes. In other words, file descriptors held by different processes can all reference the very same data structure in kernel space. In that case the kernel reclaims the shared structure, and truly closes the underlying file or socket, only once every process has closed its descriptor and the reference count has dropped to zero (as documented in man 2 close).
- As previously described, once fcntl or ioctl has been used to configure asynchronous signal notification for the write end, the master process receives a SIGIO signal under two conditions: when the read end is closed (just as with a TCP connection, where a "read event" occurs on the socket when the remote peer disconnects and a read system call then returns 0), or when the read end writes data (a scenario that does not occur in Nginx).
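The reference-counting point can be checked inside a single process. In this sketch, which assumes Linux or a BSD-like system where MSG_DONTWAIT is available, dup() stands in for the copy of a descriptor that another process would hold; peer_closed and eof_needs_last_close are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Non-blocking probe: 1 = EOF, 0 = peer still open and idle, -1 = anything else. */
static int peer_closed(int fd)
{
    char    c;
    ssize_t n;

    assert(fd >= 0);

    n = recv(fd, &c, 1, MSG_DONTWAIT | MSG_PEEK);
    if (n == 0) {
        return 1;
    }
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        return 0;
    }
    return -1;
}

/* Returns 1 if EOF shows up only after the last copy of the peer is closed. */
int eof_needs_last_close(void)
{
    int fds[2], copy, before, after;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1) {
        return -1;
    }
    copy = dup(fds[1]);            /* a second reference to the same socket */
    if (copy == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    close(fds[1]);                 /* one holder closes its descriptor */
    before = peer_closed(fds[0]);  /* the copy keeps the socket alive */

    close(copy);                   /* the last reference goes away */
    after = peer_closed(fds[0]);   /* now the peer really is closed */

    close(fds[0]);
    return before == 0 && after == 1;
}
```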
Based on the points above, we can explain the signal as follows. When the master process closes the read-end and write-end descriptors associated with Worker Process A, the write end (more precisely, the corresponding kernel data structure) is not reclaimed immediately, because the new worker process still retains a copy of Worker Process A's write-end descriptor. The read end, however, is no longer referenced by any other process, so closing it truly and completely closes it. Consequently the master process, which was registered as the owner of the write end, receives a SIGIO signal. It is not until the last step, after the other worker processes have received the master's instruction, that the write end associated with Worker Process A is finally released.
Several conclusions:
- The triggering of the SIGIO signal is a consequence of the unique workflow involved in a “reload” operation (other Nginx operational commands do not result in a scenario where both old and new worker processes coexist simultaneously; notably, the unexpected termination of a worker process can also give rise to a situation similar to a reload). Furthermore, Nginx receives a SIGIO signal once for every worker process currently enabled whenever a reload is performed (FIXME: What about the reliability of Linux signals?).
- It is incorrect—as is sometimes seen on Stack Overflow—to treat the FASYNC and O_NONBLOCK flags as equivalent, describing both simply as mechanisms for setting a file descriptor to non-blocking mode.
- The ioctl and fcntl system calls each encompass a wide range of functions, and consequently, there are instances where their capabilities overlap (e.g., FIOASYNC vs. O_ASYNC).
At this point, the questions raised at the beginning have been largely answered. Next, let us summarize the often-confusing fcntl and ioctl functions.
fcntl vs. ioctl
These two long-standing functions share some overlapping functionality. To facilitate a smoother analysis of the relevant Nginx code that follows, we will first summarize the distinctions between these two functions and the roles played by their associated flags.
fcntl – manipulate file descriptor. It performs one of the operations on the open file descriptor.
ioctl – control device. It manipulates the underlying device parameters of special files. In particular, many operating characteristics of character special files (e.g., terminals) may be controlled with ioctl() requests.
from APUE
The ioctl function has always been the catchall for I/O operations. It is included in the Single UNIX Specification only as an extension for dealing with STREAMS devices. UNIX System implementations, however, use it for many miscellaneous device operations. Some implementations have even extended it for use with regular files.
Each device driver can define its own set of ioctl commands. The system, however, provides generic ioctl commands for different classes of devices. Examples of some of the categories for these generic ioctl commands supported in FreeBSD are summarized below:
Category        Constant names
------------------------------
disk labels     DIOxxx
file I/O        FIOxxx
mag tape I/O    MTIOxxx
socket I/O      SIOxxx
terminal I/O    TIOxxx
from wikipedia
… the kernel is designed to be extensible, and may accept an extra module called a device driver which runs in kernel space and can directly address the device. An ioctl interface is a single system call by which userspace may communicate with device drivers. Requests on a device driver are vectored with respect to this ioctl system call, typically by a handle to the device and a request number. The basic kernel can thus allow the userspace to access a device driver without knowing anything about the facilities supported by the device, and without needing an unmanageably large collection of system calls.
…
On Unix operating systems, two other vectored call interfaces are popular: the fcntl system call configures open files, and is used in situations such as enabling non-blocking I/O; and the setsockopt system call configures open network sockets.
…
ioctl calls minimize the complexity of the kernel’s system call interface. However, by providing a place for developers to “stash” bits and pieces of kernel programming interfaces, ioctl calls complicate the overall user-to-kernel API. A kernel that provides several hundred system calls may provide several thousand ioctl calls.
…
Though the interface to ioctl calls appears somewhat different from conventional system calls, there is in practice little difference between an ioctl call and a system call; an ioctl call is simply a system call with a different dispatching mechanism. Many of the arguments against expanding the kernel system call interface could therefore be applied to ioctl interfaces.
from linuxforums.org
fcntl acts on a file descriptor to change the way the fd is to be handled; ioctl is to do commands on a device.
from stackoverflow
Prior to standardization there was ioctl(FIONBIO) and fcntl(O_NDELAY), but these behaved inconsistently between systems, and even within the same system. For example, it was common for FIONBIO to work on sockets and O_NDELAY to work on ttys, with a lot of inconsistency for things like pipes, fifos, and devices. And if you didn’t know what kind of file descriptor you had, you’d have to set both to be sure. But in addition, a non-blocking read with no data available was also indicated inconsistently; depending on the OS and the type of file descriptor the read may return 0, or -1 with errno EAGAIN, or -1 with errno EWOULDBLOCK. Even today, setting FIONBIO or O_NDELAY on Solaris causes a read with no data to return 0 on a tty or pipe, or -1 with errno EAGAIN on a socket. However 0 is ambiguous since it is also returned for EOF.
POSIX addressed this with the introduction of O_NONBLOCK, which has standardized behavior across different systems and file descriptor types. Because existing systems usually want to avoid any changes to behavior which might break backward compatibility, POSIX defined a new flag rather than mandating specific behavior for one of the others. Some systems like Linux treat all 3 the same, and also define EAGAIN and EWOULDBLOCK to the same value, but systems wishing to maintain some other legacy behavior for backward compatibility can do so when the older mechanisms are used.
New programs should use fcntl(…O_NONBLOCK…), as standardized by POSIX.
In summary:
- fcntl is used to manipulate the state of file descriptors, while ioctl is primarily used to interact with device drivers in kernel space; specifically, device drivers can define new operations via ioctl.
- In areas where their functionality overlaps, it is generally preferable to choose fcntl, as it offers superior standardization and portability.
non-blocking
FIONBIO – Enables nonblocking I/O. This effect is similar to setting the O_NONBLOCK flag with the fcntl subroutine. The third parameter to the ioctl subroutine for this command is a pointer to an integer that indicates whether nonblocking I/O is being enabled or disabled. A value of 0 disables non-blocking I/O.
For this function, the Nginx implementation code is as follows:
/*
 * ioctl(FIONBIO) sets a non-blocking mode with the single syscall
 * while fcntl(F_SETFL, O_NONBLOCK) needs to learn the current state
 * using fcntl(F_GETFL).
 *
 * ioctl() and fcntl() are syscalls at least in FreeBSD 2.x, Linux 2.2
 * and Solaris 7.
 *
 * ioctl() in Linux 2.4 and 2.6 uses BKL, however, fcntl(F_SETFL) uses it too.
 */
#if (NGX_HAVE_FIONBIO)
int
ngx_nonblocking(ngx_socket_t s)
{
    int  nb;
    nb = 1;
    return ioctl(s, FIONBIO, &nb);
}
#else
#define ngx_nonblocking(s)  fcntl(s, F_SETFL, fcntl(s, F_GETFL) | O_NONBLOCK)
#endif
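On Linux the two routes end up setting the same flag, which can be observed by enabling non-blocking mode with ioctl(FIONBIO) and then reading the flags back with fcntl(F_GETFL). The following is a small sketch under that assumption; fionbio_sets_o_nonblock is a hypothetical name, and other systems may behave differently.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 1 if ioctl(FIONBIO) is visible as O_NONBLOCK through fcntl and
 * an empty read fails with EAGAIN, 0 if not, -1 on a setup error.
 */
int fionbio_sets_o_nonblock(void)
{
    int     fds[2], nb = 1, flags, ok = 0;
    char    c;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1) {
        return -1;
    }
    assert(fds[0] >= 0 && fds[1] >= 0);

    flags = fcntl(fds[0], F_GETFL);
    if (flags == -1 || (flags & O_NONBLOCK)         /* must start out blocking */
        || ioctl(fds[0], FIONBIO, &nb) == -1)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    flags = fcntl(fds[0], F_GETFL);
    if (flags != -1 && (flags & O_NONBLOCK)) {
        /* Only read once the flag is confirmed, so this cannot block. */
        n = read(fds[0], &c, 1);
        ok = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
    }

    close(fds[0]);
    close(fds[1]);
    return ok;
}
```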
asynchronous events
- FIOASYNC – Enables a simple form of asynchronous I/O notification. This command causes the kernel to send SIGIO signal to a process or a process group when I/O is possible. Only sockets, ttys, and pseudo-ttys implement this functionality.
- O_ASYNC – If you set the O_ASYNC status flag on a file descriptor by using the F_SETFL command of fcntl(), a SIGIO signal is sent whenever input or output becomes possible on that file descriptor.
The Nginx implementation code is as follows:
on = 1;
if (ioctl(ngx_processes[s].channel[0], FIOASYNC, &on) == -1) {
    ngx_log_error(NGX_LOG_ALERT, cycle->log, ngx_errno,
                  "ioctl(FIOASYNC) failed while spawning \"%s\"", name);
    ngx_close_channel(ngx_processes[s].channel, cycle->log);
    return NGX_INVALID_PID;
}
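The overlap between FIOASYNC and O_ASYNC can be observed in the same way as for non-blocking mode. This sketch assumes Linux, where toggling FIOASYNC with ioctl is reflected in the O_ASYNC bit reported by fcntl(F_GETFL); has_o_async and fioasync_matches_o_async are hypothetical names.

```c
#define _GNU_SOURCE             /* makes O_ASYNC visible under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if O_ASYNC is set on fd, 0 if it is clear, -1 on error. */
static int has_o_async(int fd)
{
    int flags;

    assert(fd >= 0);

    flags = fcntl(fd, F_GETFL);
    if (flags == -1) {
        return -1;
    }
    return (flags & O_ASYNC) != 0;
}

/* Returns 1 if ioctl(FIOASYNC) toggles the flag that fcntl reports. */
int fioasync_matches_o_async(void)
{
    int fds[2], on = 1, off = 0, before, during, after;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1) {
        return -1;
    }

    before = has_o_async(fds[0]);                /* fresh socket: flag clear */

    if (ioctl(fds[0], FIOASYNC, &on) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    during = has_o_async(fds[0]);                /* flag set through ioctl */

    if (ioctl(fds[0], FIOASYNC, &off) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    after = has_o_async(fds[0]);                 /* and cleared again */

    close(fds[0]);
    close(fds[1]);
    return before == 0 && during == 1 && after == 0;
}
```

No owner is set and nothing is written here, so no SIGIO is generated while the flag is on.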
IO signal
- man 7 socket – Using F_SETOWN of fcntl(2) is equivalent to an ioctl(2) call with the FIOSETOWN argument.
- F_SETOWN – Set the process ID or process group ID that will receive SIGIO and SIGURG signals for events on file descriptor fd to the ID given in arg.
- FIOSETOWN – Set the process ID or process group ID that is to receive the SIGIO and SIGURG signals. Specifying a 0 value resets the socket such that no asynchronous signals are delivered. Specifying a process ID or a process group ID requests that sockets begin sending the SIGURG signal to the specified ID when out-of-band data arrives on the socket.
The Nginx implementation code is as follows:
if (fcntl(ngx_processes[s].channel[0], F_SETOWN, ngx_pid) == -1) {
    ngx_log_error(NGX_LOG_ALERT, cycle->log, ngx_errno,
                  "fcntl(F_SETOWN) failed while spawning \"%s\"", name);
    ngx_close_channel(ngx_processes[s].channel, cycle->log);
    return NGX_INVALID_PID;
}