.\"  Copyright (C) 2003  Davide Libenzi
.\"
.\" %%%LICENSE_START(GPLv2+_SW_3_PARA)
.\"  This program is free software; you can redistribute it and/or modify
.\"  it under the terms of the GNU General Public License as published by
.\"  the Free Software Foundation; either version 2 of the License, or
.\"  (at your option) any later version.
.\"
.\"  This program is distributed in the hope that it will be useful,
.\"  but WITHOUT ANY WARRANTY; without even the implied warranty of
.\"  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
.\"  GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public
.\" License along with this manual; if not, see
.\" <http://www.gnu.org/licenses/>.
.\" %%%LICENSE_END
.\"
.\"  Davide Libenzi <davidel@xmailserver.org>
.\"
.TH EPOLL 7 2017-09-15 "Linux" "Linux Programmer's Manual"
.SH NAME
epoll \- I/O event notification facility
.SH SYNOPSIS
.B #include <sys/epoll.h>
.SH DESCRIPTION
The
.B epoll
API performs a similar task to
.BR poll (2):
monitoring multiple file descriptors to see if I/O is possible on any of them.
The
.B epoll
API can be used either as an edge-triggered or a level-triggered
interface and scales well to large numbers of watched file descriptors.
The following system calls are provided to
create and manage an
.B epoll
instance:
.IP * 3
.BR epoll_create (2)
creates a new
.B epoll
instance and returns a file descriptor referring to that instance.
(The more recent
.BR epoll_create1 (2)
extends the functionality of
.BR epoll_create (2).)
.IP *
Interest in particular file descriptors is then registered via
.BR epoll_ctl (2).
The set of file descriptors currently registered on an
.B epoll
instance is sometimes called an
.I epoll
set.
.IP *
.BR epoll_wait (2)
waits for I/O events,
blocking the calling thread if no events are currently available.
.SS Level-triggered and edge-triggered
The
.B epoll
event distribution interface can behave either as edge-triggered
(ET) or as level-triggered (LT).
The difference between the two mechanisms
can be described as follows.
Suppose that
this scenario happens:
.IP 1. 3
The file descriptor that represents the read side of a pipe
.RI ( rfd )
is registered on the
.B epoll
instance.
.IP 2.
A pipe writer writes 2\ kB of data on the write side of the pipe.
.IP 3.
A call to
.BR epoll_wait (2)
is made, which returns
.I rfd
as a ready file descriptor.
.IP 4.
The pipe reader reads 1\ kB of data from
.IR rfd .
.IP 5.
A call to
.BR epoll_wait (2)
is made.
.PP
If the
.I rfd
file descriptor has been added to the
.B epoll
interface using the
.B EPOLLET
(edge-triggered)
flag, the call to
.BR epoll_wait (2)
done in step
.B 5
will probably hang despite the available data still present in the file
input buffer;
meanwhile the remote peer might be expecting a response based on the
data it already sent.
The reason for this is that edge-triggered mode
delivers events only when changes occur on the monitored file descriptor.
So, in step
.B 5
the caller might end up waiting for some data that is already present inside
the input buffer.
In the above example, an event on
.I rfd
will be generated because of the write done in
.B 2
and the event is consumed in
.BR 3 .
Since the read operation done in
.B 4
does not consume the whole buffer data, the call to
.BR epoll_wait (2)
done in step
.B 5
might block indefinitely.
.PP
An application that employs the
.B EPOLLET
flag should use nonblocking file descriptors to avoid having a blocking
read or write starve a task that is handling multiple file descriptors.
The suggested way to use
.B epoll
as an edge-triggered
.RB ( EPOLLET )
interface is as follows:
.RS
.TP 4
.B i
with nonblocking file descriptors; and
.TP
.B ii
by waiting for an event only after
.BR read (2)
or
.BR write (2)
return
.BR EAGAIN .
.RE
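.PP
As a minimal sketch of points i and ii,
assuming hypothetical helper names set_nonblocking() and drain_fd()
that are not part of the epoll API,
a reader might switch the descriptor to nonblocking mode and then
consume data until read(2) reports EAGAIN:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to nonblocking mode.
   Returns 0 on success, -1 on error (errno is set by fcntl()). */
static int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: read from a nonblocking fd until read(2)
   reports EAGAIN (no more data for now) or end of file.
   Returns the number of bytes consumed, or -1 on any other error. */
static ssize_t
drain_fd(int fd)
{
    char buf[4096];
    ssize_t total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));

        if (n > 0) {
            total += n;      /* A real program would process buf here */
        } else if (n == 0) {
            return total;    /* End of file */
        } else if (errno == EINTR) {
            continue;        /* Interrupted by a signal; retry */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return total;    /* Drained; safe to wait for the next event */
        } else {
            return -1;
        }
    }
}
```

A caller would typically invoke drain_fd() each time epoll_wait(2)
reports EPOLLIN for the descriptor,
and go back to epoll_wait(2) only after it returns.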
.PP
By contrast, when used as a level-triggered interface
(the default, when
.B EPOLLET
is not specified),
.B epoll
is simply a faster
.BR poll (2),
and can be used wherever the latter is used since it shares the
same semantics.
.PP
Since even with edge-triggered
.BR epoll ,
multiple events can be generated upon receipt of multiple chunks of data,
the caller has the option to specify the
.B EPOLLONESHOT
flag, to tell
.B epoll
to disable the associated file descriptor after the receipt of an event with
.BR epoll_wait (2).
When the
.B EPOLLONESHOT
flag is specified,
it is the caller's responsibility to rearm the file descriptor using
.BR epoll_ctl (2)
with
.BR EPOLL_CTL_MOD .
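.PP
The rearming step can be sketched as follows.
The helper name rearm_oneshot() is hypothetical,
and the sketch assumes that the descriptor was originally registered with
ev.data.fd set to the descriptor itself:

```c
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical helper: re-enable a descriptor that was registered with
   EPOLLONESHOT and has since delivered its single event.
   'events' is the interest mask to restore (for example, EPOLLIN). */
static int
rearm_oneshot(int epfd, int fd, uint32_t events)
{
    struct epoll_event ev;

    ev.events = events | EPOLLONESHOT;  /* Keep one-shot behavior */
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}
```

A worker would call rearm_oneshot() once it has finished handling
the event, so that the next event for the descriptor can be delivered.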
.SS Interaction with autosleep
If the system is in
.B autosleep
mode via
.I /sys/power/autosleep
and an event happens which wakes the device from sleep, the device
driver will keep the device awake only until that event is queued.
To keep the device awake until the event has been processed,
it is necessary to use the
.BR epoll_ctl (2)
.B EPOLLWAKEUP
flag.
.PP
When the
.B EPOLLWAKEUP
flag is set in the
.B events
field for a
.IR "struct epoll_event" ,
the system will be kept awake from the moment the event is queued,
through the
.BR epoll_wait (2)
call which returns the event until the subsequent
.BR epoll_wait (2)
call.
If the event should keep the system awake beyond that time,
then a separate
.I wake_lock
should be taken before the second
.BR epoll_wait (2)
call.
.SS /proc interfaces
The following interfaces can be used to limit the amount of
kernel memory consumed by epoll:
.\" Following was added in 2.6.28, but then removed in 2.6.29
.\" .TP
.\" .IR /proc/sys/fs/epoll/max_user_instances " (since Linux 2.6.28)"
.\" This specifies an upper limit on the number of epoll instances
.\" that can be created per real user ID.
.TP
.IR /proc/sys/fs/epoll/max_user_watches " (since Linux 2.6.28)"
This specifies a limit on the total number of
file descriptors that a user can register across
all epoll instances on the system.
The limit is per real user ID.
Each registered file descriptor costs roughly 90 bytes on a 32-bit kernel,
and roughly 160 bytes on a 64-bit kernel.
Currently,
.\" 2.6.29 (in 2.6.28, the default was 1/32 of lowmem)
the default value for
.I max_user_watches
is 1/25 (4%) of the available low memory,
divided by the registration cost in bytes.
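.PP
As an illustration only, the limit can be read from a program and
combined with the approximate per-registration costs quoted above.
The helper names and the two cost macros below are hypothetical;
the byte counts are the rough figures from this page, not exact values:

```c
#include <stdio.h>

/* Approximate per-registration cost in bytes, as quoted above. */
#define WATCH_COST_32BIT  90LL
#define WATCH_COST_64BIT 160LL

/* Hypothetical helper: read the current per-user limit.
   Returns the limit, or -1 if the file cannot be read or parsed. */
static long long
read_max_user_watches(void)
{
    FILE *fp = fopen("/proc/sys/fs/epoll/max_user_watches", "r");
    long long limit = -1;

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%lld", &limit) != 1)
        limit = -1;
    fclose(fp);
    return limit;
}

/* Hypothetical helper: rough kernel memory, in bytes, consumed by
   'watches' registrations at 'cost_per_watch' bytes each. */
static long long
watch_memory_bytes(long long watches, long long cost_per_watch)
{
    return watches * cost_per_watch;
}
```

For example, watch_memory_bytes(1000, WATCH_COST_64BIT) estimates the
cost of 1000 registrations on a 64-bit kernel at about 160000 bytes.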
.SS Example for suggested usage
While the usage of
.B epoll
when employed as a level-triggered interface does have the same
semantics as
.BR poll (2),
the edge-triggered usage requires more clarification to avoid stalls
in the application event loop.
In this example,
.I listen_sock
is a nonblocking socket on which
.BR listen (2)
has been called.
The function
.I do_use_fd()
uses the new ready file descriptor until
.B EAGAIN
is returned by either
.BR read (2)
or
.BR write (2).
An event-driven state machine application should, after having received
.BR EAGAIN ,
record its current state so that at the next call to
.I do_use_fd()
it will continue to
.BR read (2)
or
.BR write (2)
from where it stopped before.
.PP
.in +4n
.EX
#define MAX_EVENTS 10
struct epoll_event ev, events[MAX_EVENTS];
int listen_sock, conn_sock, nfds, epollfd, n;
struct sockaddr_storage addr;   /* Peer address filled in by accept() */
socklen_t addrlen;

/* Code to set up listening socket, \(aqlisten_sock\(aq,
   (socket(), bind(), listen()) omitted */

epollfd = epoll_create1(0);
if (epollfd == \-1) {
    perror("epoll_create1");
    exit(EXIT_FAILURE);
}

/* The listening socket is monitored in level-triggered mode */
ev.events = EPOLLIN;
ev.data.fd = listen_sock;
if (epoll_ctl(epollfd, EPOLL_CTL_ADD, listen_sock, &ev) == \-1) {
    perror("epoll_ctl: listen_sock");
    exit(EXIT_FAILURE);
}

for (;;) {
    nfds = epoll_wait(epollfd, events, MAX_EVENTS, \-1);
    if (nfds == \-1) {
        perror("epoll_wait");
        exit(EXIT_FAILURE);
    }

    for (n = 0; n < nfds; ++n) {
        if (events[n].data.fd == listen_sock) {
            /* accept() reads addrlen as the buffer size; set it first */
            addrlen = sizeof(addr);
            conn_sock = accept(listen_sock,
                               (struct sockaddr *) &addr, &addrlen);
            if (conn_sock == \-1) {
                perror("accept");
                exit(EXIT_FAILURE);
            }
            /* Connections are monitored in edge-triggered mode */
            setnonblocking(conn_sock);
            ev.events = EPOLLIN | EPOLLET;
            ev.data.fd = conn_sock;
            if (epoll_ctl(epollfd, EPOLL_CTL_ADD, conn_sock,
                        &ev) == \-1) {
                perror("epoll_ctl: conn_sock");
                exit(EXIT_FAILURE);
            }
        } else {
            do_use_fd(events[n].data.fd);
        }
    }
}
.EE
.in
.PP
When used as an edge-triggered interface, for performance reasons, it is
possible to register the file descriptor with the
.B epoll
instance
.RB ( EPOLL_CTL_ADD )
once by specifying
.RB ( EPOLLIN | EPOLLOUT ).
This allows you to avoid
continuously switching between
.B EPOLLIN
and
.B EPOLLOUT
by calling
.BR epoll_ctl (2)
with
.BR EPOLL_CTL_MOD .
.SS Questions and answers
.TP 4
.B Q0
What is the key used to distinguish the file descriptors registered in an
.B epoll
set?
.TP
.B A0
The key is the combination of the file descriptor number and
the open file description
(also known as an "open file handle",
the kernel's internal representation of an open file).
.TP
.B Q1
What happens if you register the same file descriptor on an
.B epoll
instance twice?
.TP
.B A1
You will probably get
.BR EEXIST .
However, it is possible to add a duplicate
.RB ( dup (2),
.BR dup2 (2),
.BR fcntl (2)
.BR F_DUPFD )
file descriptor to the same
.B epoll
instance.
.\" But a file descriptor duplicated by fork(2) can't be added to the
.\" set, because the [file *, fd] pair is already in the epoll set.
.\" That is a somewhat ugly inconsistency.  On the one hand, a child process
.\" cannot add the duplicate file descriptor to the epoll set.  (In every
.\" other case that I can think of, file descriptors duplicated by fork have
.\" similar semantics to file descriptors duplicated by dup() and friends.)  On
.\" the other hand, the very fact that the child has a duplicate of the
.\" file descriptor means that even if the parent closes its file descriptor,
.\" then epoll_wait() in the parent will continue to receive notifications for
.\" that file descriptor because of the duplicated file descriptor in the child.
.\"
.\" See http://thread.gmane.org/gmane.linux.kernel/596462/
.\" "epoll design problems with common fork/exec patterns"
.\"
.\" mtk, Feb 2008
This can be a useful technique for filtering events,
if the duplicate file descriptors are registered with different
.I events
masks.
.TP
.B Q2
Can two
.B epoll
instances wait for the same file descriptor?
If so, are events reported to both
.B epoll
file descriptors?
.TP
.B A2
Yes, and events would be reported to both.
However, careful programming may be needed to do this correctly.
.TP
.B Q3
Is the
.B epoll
file descriptor itself poll/epoll/selectable?
.TP
.B A3
Yes.
If an
.B epoll
file descriptor has events waiting, then it is
reported as readable.
.TP
.B Q4
What happens if one attempts to put an
.B epoll
file descriptor into its own file descriptor set?
.TP
.B A4
The
.BR epoll_ctl (2)
call fails
.RB ( EINVAL ).
However, you can add an
.B epoll
file descriptor inside another
.B epoll
file descriptor set.
.TP
.B Q5
Can I send an
.B epoll
file descriptor over a UNIX domain socket to another process?
.TP
.B A5
Yes, but it does not make sense to do this, since the receiving process
would not have copies of the file descriptors in the
.B epoll
set.
.TP
.B Q6
Will closing a file descriptor cause it to be removed from all
.B epoll
sets automatically?
.TP
.B A6
Yes, but be aware of the following point.
A file descriptor is a reference to an open file description (see
.BR open (2)).
Whenever a file descriptor is duplicated via
.BR dup (2),
.BR dup2 (2),
.BR fcntl (2)
.BR F_DUPFD ,
or
.BR fork (2),
a new file descriptor referring to the same open file description is
created.
An open file description continues to exist until all
file descriptors referring to it have been closed.
A file descriptor is removed from an
.B epoll
set only after all the file descriptors referring to the underlying
open file description have been closed
(or before if the file descriptor is explicitly removed using
.BR epoll_ctl (2)
.BR EPOLL_CTL_DEL ).
This means that even after a file descriptor that is part of an
.B epoll
set has been closed,
events may be reported for that file descriptor if other file
descriptors referring to the same underlying file description remain open.
.TP
.B Q7
If more than one event occurs between
.BR epoll_wait (2)
calls, are they combined or reported separately?
.TP
.B A7
They will be combined.
.TP
.B Q8
Does an operation on a file descriptor affect the
already collected but not yet reported events?
.TP
.B A8
You can do two operations on an existing file descriptor.
Remove would be meaningless for
this case.
Modify will reread available I/O.
.TP
.B Q9
Do I need to continuously read/write a file descriptor
until
.B EAGAIN
when using the
.B EPOLLET
flag (edge-triggered behavior)?
.TP
.B A9
Receiving an event from
.BR epoll_wait (2)
should suggest to you that the
file descriptor is ready for the requested I/O operation.
You must consider it ready until the next (nonblocking)
read/write yields
.BR EAGAIN .
When and how you will use the file descriptor is entirely up to you.
.IP
For packet/token-oriented files (e.g., datagram socket,
terminal in canonical mode),
the only way to detect the end of the read/write I/O space
is to continue to read/write until
.BR EAGAIN .
.IP
For stream-oriented files (e.g., pipe, FIFO, stream socket), the
condition that the read/write I/O space is exhausted can also be detected by
checking the amount of data read from / written to the target file
descriptor.
For example, if you call
.BR read (2)
by asking to read a certain amount of data and
.BR read (2)
returns a lower number of bytes, you
can be sure of having exhausted the read I/O space for the file
descriptor.
The same is true when writing using
.BR write (2).
(Avoid this latter technique if you cannot guarantee that
the monitored file descriptor always refers to a stream-oriented file.)
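.PP
The answers A1, A4, and A6 above can be explored with a small sketch.
The helper names try_add() and events_after_close() are hypothetical,
and error handling is deliberately minimal:

```c
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: try to register fd on epfd.
   Returns 0 on success, or the errno value from epoll_ctl(2). */
static int
try_add(int epfd, int fd, uint32_t events)
{
    struct epoll_event ev;

    ev.events = events;
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
        return errno;
    return 0;
}

/* Hypothetical demonstration of A6: register the read end of a pipe,
   duplicate it, close the registered descriptor, and then make the
   pipe readable.  Returns what epoll_wait(2) returns with a zero
   timeout, or -1 if the setup fails. */
static int
events_after_close(int epfd)
{
    int p[2], dupfd, n;
    struct epoll_event out;

    if (pipe(p) == -1)
        return -1;
    dupfd = dup(p[0]);
    if (dupfd == -1 || try_add(epfd, p[0], EPOLLIN) != 0)
        return -1;
    close(p[0]);              /* dupfd keeps the file description open */
    if (write(p[1], "x", 1) != 1)
        return -1;
    n = epoll_wait(epfd, &out, 1, 0);
    close(dupfd);             /* Last reference to the read end */
    close(p[1]);
    return n;
}
```

Calling try_add() twice with the same descriptor is expected to return
EEXIST the second time, while adding a dup(2) of that descriptor succeeds;
passing the epoll descriptor as both arguments is expected to return EINVAL.
events_after_close() is expected to report one event even though the
registered descriptor has been closed.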
.SS Possible pitfalls and ways to avoid them
.TP
.B o Starvation (edge-triggered)
.PP
If there is a large amount of I/O space,
it is possible that by trying to drain
it the other files will not get processed, causing starvation.
(This problem is not specific to
.BR epoll .)
.PP
The solution is to maintain a ready list
and mark the file descriptor as ready
in its associated data structure, thereby allowing the application to
remember which files need to be processed but still round robin amongst
all the ready files.
This also supports ignoring subsequent events you
receive for file descriptors that are already ready.
.TP
.B o If using an event cache...
.PP
If you use an event cache or store all the file descriptors returned from
.BR epoll_wait (2),
then make sure to provide a way to mark
its closure dynamically (i.e., caused by
a previous event's processing).
Suppose you receive 100 events from
.BR epoll_wait (2),
and in event #47 a condition causes event #13 to be closed.
If you remove the structure and
.BR close (2)
the file descriptor for event #13, then your
event cache might still say there are events waiting for that
file descriptor, causing confusion.
.PP
One solution for this is to call, during the processing of event 47,
.BR epoll_ctl ( EPOLL_CTL_DEL )
to delete file descriptor 13 and
.BR close (2),
then mark its associated
data structure as removed and link it to a cleanup list.
If you find another
event for file descriptor 13 in your batch processing,
you will discover the file descriptor had been
previously removed and there will be no confusion.
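.PP
One possible shape for this bookkeeping is sketched below.
The struct conn record and the helpers conn_remove() and conn_cleanup()
are hypothetical, and the sketch assumes that each registration stores a
pointer to its record in ev.data.ptr:

```c
#include <stdlib.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-descriptor record, referenced from ev.data.ptr. */
struct conn {
    int fd;
    int removed;                /* Nonzero once the fd has been closed */
    struct conn *next_cleanup;  /* Link in the deferred cleanup list */
};

/* Deregister and close c->fd, but keep the record alive so that
   later events in the same batch can see that it was removed. */
static void
conn_remove(int epfd, struct conn *c, struct conn **cleanup)
{
    if (c->removed)
        return;
    epoll_ctl(epfd, EPOLL_CTL_DEL, c->fd, NULL);
    close(c->fd);
    c->fd = -1;
    c->removed = 1;
    c->next_cleanup = *cleanup;
    *cleanup = c;
}

/* Free removed records once the whole batch has been processed.
   Returns the number of records freed. */
static int
conn_cleanup(struct conn **cleanup)
{
    int freed = 0;

    while (*cleanup != NULL) {
        struct conn *c = *cleanup;

        *cleanup = c->next_cleanup;
        free(c);
        freed++;
    }
    return freed;
}
```

While walking a batch, the loop would skip any event whose record has
removed set, and call conn_cleanup() only after the last event of the
batch has been handled.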
.SH VERSIONS
The
.B epoll
API was introduced in Linux kernel 2.5.44.
.\" Its interface should be finalized in Linux kernel 2.5.66.
Support was added to glibc in version 2.3.2.
.SH CONFORMING TO
The
.B epoll
API is Linux-specific.
Some other systems provide similar
mechanisms, for example, FreeBSD has
.IR kqueue ,
and Solaris has
.IR /dev/poll .
.SH NOTES
The set of file descriptors that is being monitored via
an epoll file descriptor can be viewed via the entry for
the epoll file descriptor in the process's
.IR /proc/[pid]/fdinfo
directory.
See
.BR proc (5)
for further details.
.PP
The
.BR kcmp (2)
.B KCMP_EPOLL_TFD
operation can be used to test whether a file descriptor
is present in an epoll instance.
.SH SEE ALSO
.BR epoll_create (2),
.BR epoll_create1 (2),
.BR epoll_ctl (2),
.BR epoll_wait (2),
.BR poll (2),
.BR select (2)
.SH COLOPHON
This page is part of release 4.15 of the Linux
.I man-pages
project.
A description of the project,
information about reporting bugs,
and the latest version of this page,
can be found at
\%https://www.kernel.org/doc/man\-pages/.