@node I/O Overview, I/O on Streams, Pattern Matching, Top
@c %MENU% Introduction to the I/O facilities
@chapter Input/Output Overview

Most programs need to do either input (reading data) or output (writing
data), or most frequently both, in order to do anything useful.  @Theglibc{}
provides such a large selection of input and output functions
that the hardest part is often deciding which function is most
appropriate!

This chapter introduces concepts and terminology relating to input
and output.  Other chapters relating to the GNU I/O facilities are:

@itemize @bullet
@item
@ref{I/O on Streams}, which covers the high-level functions
that operate on streams, including formatted input and output.

@item
@ref{Low-Level I/O}, which covers the basic I/O and control
functions on file descriptors.

@item
@ref{File System Interface}, which covers functions for operating on
directories and for manipulating file attributes such as access modes
and ownership.

@item
@ref{Pipes and FIFOs}, which includes information on the basic interprocess
communication facilities.

@item
@ref{Sockets}, which covers a more complicated interprocess communication
facility with support for networking.

@item
@ref{Low-Level Terminal Interface}, which covers functions for changing
how input and output to terminals or other serial devices are processed.
@end itemize


@menu
* I/O Concepts::       Some basic information and terminology.
* File Names::         How to refer to a file.
@end menu

@node I/O Concepts, File Names,  , I/O Overview
@section Input/Output Concepts

Before you can read or write the contents of a file, you must establish
a connection or communications channel to the file.  This process is
called @dfn{opening} the file.  You can open a file for reading, writing,
or both.
@cindex opening a file

The connection to an open file is represented either as a stream or as a
file descriptor.  You pass this as an argument to the functions that do
the actual read or write operations, to tell them which file to operate
on.  Certain functions expect streams, and others are designed to
operate on file descriptors.

When you have finished reading from or writing to the file, you can
terminate the connection by @dfn{closing} the file.  Once you have
closed a stream or file descriptor, you cannot do any more input or
output operations on it.
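
The open, operate, close sequence described above can be sketched with
both kinds of connection.  This is only an illustration: the function
names @code{count_bytes_stream} and @code{count_bytes_fd} are invented
for this example and are not part of the library, and the error
handling is kept to a minimum.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: count the bytes in FILENAME through the
   stream interface.  Returns -1 if the file cannot be opened.  */
long
count_bytes_stream (const char *filename)
{
  FILE *stream = fopen (filename, "r");   /* open for reading */
  long count = 0;

  if (stream == NULL)
    return -1;
  while (getc (stream) != EOF)
    count++;
  fclose (stream);                        /* STREAM is unusable from here on */
  return count;
}

/* Hypothetical helper: count the bytes in FILENAME through the file
   descriptor interface.  Returns -1 if the file cannot be opened or
   a read fails.  */
long
count_bytes_fd (const char *filename)
{
  int fd = open (filename, O_RDONLY);     /* open for reading */
  char buffer[4096];
  long count = 0;
  ssize_t n;

  if (fd < 0)
    return -1;
  while ((n = read (fd, buffer, sizeof buffer)) > 0)
    count += n;
  close (fd);                             /* FD is unusable from here on */
  return n < 0 ? -1 : count;
}
```

Both functions should report the same count for the same file; they
differ only in the kind of object that represents the open file.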

@menu
* Streams and File Descriptors::    The GNU C Library provides two ways
                                    to access the contents of files.
* File Position::                   The number of bytes from the
                                    beginning of the file.
@end menu

@node Streams and File Descriptors, File Position,  , I/O Concepts
@subsection Streams and File Descriptors

When you want to do input or output to a file, you have a choice of two
basic mechanisms for representing the connection between your program
and the file: file descriptors and streams.  File descriptors are
represented as objects of type @code{int}, while streams are represented
as @code{FILE *} objects.

File descriptors provide a primitive, low-level interface to input and
output operations.  Both file descriptors and streams can represent a
connection to a device (such as a terminal), or a pipe or socket for
communicating with another process, as well as a normal file.  But, if
you want to do control operations that are specific to a particular kind
of device, you must use a file descriptor; there are no facilities to
use streams in this way.  You must also use file descriptors if your
program needs to do input or output in special modes, such as
nonblocking (or polled) input (@pxref{File Status Flags}).

Streams provide a higher-level interface, layered on top of the
primitive file descriptor facilities.  The stream interface treats all
kinds of files pretty much alike---the sole exception being the three
styles of buffering that you can choose (@pxref{Stream Buffering}).

The main advantage of using the stream interface is that the set of
functions for performing actual input and output operations (as opposed
to control operations) on streams is much richer and more powerful than
the corresponding facilities for file descriptors.  The file descriptor
interface provides only simple functions for transferring blocks of
characters, but the stream interface also provides powerful formatted
input and output functions (@code{printf} and @code{scanf}) as well as
functions for character- and line-oriented input and output.
@c !!! glibc has dprintf, which lets you do printf on an fd.

Since streams are implemented in terms of file descriptors, you can
extract the file descriptor from a stream and perform low-level
operations directly on the file descriptor.  You can also initially open
a connection as a file descriptor and then make a stream associated with
that file descriptor.

In general, you should stick with using streams rather than file
descriptors, unless there is some specific operation you want to do that
can only be done on a file descriptor.  If you are a beginning
programmer and aren't sure what functions to use, we suggest that you
concentrate on the formatted input functions (@pxref{Formatted Input})
and formatted output functions (@pxref{Formatted Output}).

If you are concerned about portability of your programs to systems other
than GNU, you should also be aware that file descriptors are not as
portable as streams.  You can expect any system running @w{ISO C} to
support streams, but @nongnusystems{} may not support file descriptors at
all, or may only implement a subset of the GNU functions that operate on
file descriptors.  Most of the file descriptor functions in @theglibc{}
are included in the POSIX.1 standard, however.

@node File Position,  , Streams and File Descriptors, I/O Concepts
@subsection File Position

One of the attributes of an open file is its @dfn{file position}, which
keeps track of where in the file the next character is to be read or
written.  On @gnusystems{}, and all POSIX.1 systems, the file position
is simply an integer representing the number of bytes from the beginning
of the file.

The file position is normally set to the beginning of the file when it
is opened, and each time a character is read or written, the file
position is incremented.  In other words, access to the file is normally
@dfn{sequential}.
@cindex file position
@cindex sequential-access files

Ordinary files permit read or write operations at any position within
the file.  Some other kinds of files may also permit this.  Files which
do permit this are sometimes referred to as @dfn{random-access} files.
You can change the file position using the @code{fseek} function on a
stream (@pxref{File Positioning}) or the @code{lseek} function on a file
descriptor (@pxref{I/O Primitives}).  If you try to change the file
position on a file that doesn't support random access, you get the
@code{ESPIPE} error.
@cindex random-access files

Streams and descriptors that are opened for @dfn{append access} are
treated specially for output: output to such files is @emph{always}
appended sequentially to the @emph{end} of the file, regardless of the
file position.  However, the file position is still used to control where in
the file reading is done.
@cindex append-access files
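
A small sketch of append access on a stream follows.  The function
name @code{append_line} is hypothetical.  The call to @code{fseek} is
there only to show that repositioning does not redirect output on a
stream opened with mode @code{"a"}.

```c
#include <stdio.h>

/* Hypothetical helper: append LINE, followed by a newline, to
   FILENAME.  Returns 0 on success, -1 on failure.  */
int
append_line (const char *filename, const char *line)
{
  FILE *stream = fopen (filename, "a");   /* append access */

  if (stream == NULL)
    return -1;
  /* Moving the file position to the start has no effect on where
     output goes: the write below still lands at the end.  */
  fseek (stream, 0, SEEK_SET);
  fprintf (stream, "%s\n", line);
  return fclose (stream) == 0 ? 0 : -1;
}
```

Calling @code{append_line} twice on the same file leaves the two lines
in the order in which they were written.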

If you think about it, you'll realize that several programs can read a
given file at the same time.  In order for each program to be able to
read the file at its own pace, each program must have its own file
position, which is not affected by anything the other programs do.

In fact, each opening of a file creates a separate file position.
Thus, if you open a file twice even in the same program, you get two
streams or descriptors with independent file positions.

By contrast, if you open a descriptor and then duplicate it to get
another descriptor, these two descriptors share the same file position:
changing the file position of one descriptor will affect the other.
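
The difference between a second opening and a duplicate can be
observed directly.  The sketch below assumes a readable, non-empty
file.  The function name @code{compare_positions} is invented for this
example, and @code{dup} is used as the duplication primitive.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: read one byte through a descriptor for
   FILENAME and report where two related descriptors end up.
   *REOPENED receives the position of a second, separate opening;
   *DUPLICATED receives the position of a duplicate of the first
   descriptor.  Returns 0 on success, -1 on failure.  */
int
compare_positions (const char *filename, long *reopened, long *duplicated)
{
  int first = open (filename, O_RDONLY);
  int second = open (filename, O_RDONLY);    /* separate opening */
  int copy = first < 0 ? -1 : dup (first);   /* shares FIRST's position */
  char byte;
  int result = -1;

  if (first >= 0 && second >= 0 && copy >= 0
      && read (first, &byte, 1) == 1)
    {
      *reopened = (long) lseek (second, 0, SEEK_CUR);
      *duplicated = (long) lseek (copy, 0, SEEK_CUR);
      result = 0;
    }
  if (first >= 0)
    close (first);
  if (second >= 0)
    close (second);
  if (copy >= 0)
    close (copy);
  return result;
}
```

After one byte has been read through the first descriptor, the second
opening should still be at position 0 while the duplicate has advanced
to position 1.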

@node File Names,  , I/O Concepts, I/O Overview
@section File Names

In order to open a connection to a file, or to perform other operations
such as deleting a file, you need some way to refer to the file.  Nearly
all files have names that are strings---even files which are actually
devices such as tape drives or terminals.  These strings are called
@dfn{file names}.  You specify the file name to say which file you want
to open or operate on.

This section describes the conventions for file names and how the
operating system works with them.
@cindex file name

@menu
* Directories::                 Directories contain entries for files.
* File Name Resolution::        A file name specifies how to look up a file.
* File Name Errors::            Error conditions relating to file names.
* File Name Portability::       File name portability and syntax issues.
@end menu


@node Directories, File Name Resolution,  , File Names
@subsection Directories

In order to understand the syntax of file names, you need to understand
how the file system is organized into a hierarchy of directories.

@cindex directory
@cindex link
@cindex directory entry
A @dfn{directory} is a file that contains information to associate other
files with names; these associations are called @dfn{links} or
@dfn{directory entries}.  Sometimes, people speak of ``files in a
directory'', but in reality, a directory only contains pointers to
files, not the files themselves.

@cindex file name component
The name of a file contained in a directory entry is called a @dfn{file
name component}.  In general, a file name consists of a sequence of one
or more such components, separated by the slash character (@samp{/}).  A
file name which is just one component names a file with respect to its
directory.  A file name with multiple components names a directory, and
then a file in that directory, and so on.

Some other documents, such as the POSIX standard, use the term
@dfn{pathname} for what we call a file name, and either @dfn{filename}
or @dfn{pathname component} for what this manual calls a file name
component.  We don't use this terminology because a ``path'' is
something completely different (a list of directories to search), and we
think that ``pathname'' used for something else will confuse users.  We
always use ``file name'' and ``file name component'' (or sometimes just
``component'', where the context is obvious) in GNU documentation.  Some
macros use the POSIX terminology in their names, such as
@code{PATH_MAX}.  These macros are defined by the POSIX standard, so we
cannot change their names.

You can find more detailed information about operations on directories
in @ref{File System Interface}.

@node File Name Resolution, File Name Errors, Directories, File Names
@subsection File Name Resolution

A file name consists of file name components separated by slash
(@samp{/}) characters.  On the systems that @theglibc{} supports,
multiple successive @samp{/} characters are equivalent to a single
@samp{/} character.

@cindex file name resolution
The process of determining what file a file name refers to is called
@dfn{file name resolution}.  This is performed by examining the
components that make up a file name in left-to-right order, and locating
each successive component in the directory named by the previous
component.  Of course, each of the files that are referenced as
directories must actually exist, be directories instead of regular
files, and have the appropriate permissions to be accessible by the
process; otherwise the file name resolution fails.

@cindex root directory
@cindex absolute file name
If a file name begins with a @samp{/}, the first component in the file
name is located in the @dfn{root directory} of the process (usually all
processes on the system have the same root directory).  Such a file name
is called an @dfn{absolute file name}.
@c !!! xref here to chroot, if we ever document chroot. -rm

@cindex relative file name
Otherwise, the first component in the file name is located in the
current working directory (@pxref{Working Directory}).  This kind of
file name is called a @dfn{relative file name}.

@cindex parent directory
The file name components @file{.} (``dot'') and @file{..} (``dot-dot'')
have special meanings.  Every directory has entries for these file name
components.  The file name component @file{.} refers to the directory
itself, while the file name component @file{..} refers to its
@dfn{parent directory} (the directory that contains the link for the
directory in question).  As a special case, @file{..} in the root
directory refers to the root directory itself, since it has no parent;
thus @file{/..} is the same as @file{/}.

Here are some examples of file names:

@table @file
@item /a
The file named @file{a}, in the root directory.

@item /a/b
The file named @file{b}, in the directory named @file{a} in the root directory.

@item a
The file named @file{a}, in the current working directory.

@item /a/./b
This is the same as @file{/a/b}.

@item ./a
The file named @file{a}, in the current working directory.

@item ../a
The file named @file{a}, in the parent directory of the current working
directory.
@end table
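
One way to check that two spellings resolve to the same file is to
compare the device and file serial numbers that @code{stat} reports
for each.  This is a minimal sketch; the function name
@code{same_file} is made up for this example.

```c
#include <sys/stat.h>

/* Hypothetical helper: return 1 if file names A and B resolve to the
   same file, 0 if they resolve to different files, and -1 if either
   name cannot be resolved.  */
int
same_file (const char *a, const char *b)
{
  struct stat sa, sb;

  if (stat (a, &sa) != 0 || stat (b, &sb) != 0)
    return -1;
  /* Two names refer to the same file when both the device and the
     file serial number match.  */
  return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

With this helper, @code{same_file ("/", "/..")} and
@code{same_file (".", ".//.")} should both return 1 on the systems
described here.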

@c An empty string may ``work'', but I think it's confusing to
@c try to describe it.  It's not a useful thing for users to use--rms.
A file name that names a directory may optionally end in a @samp{/}.
You can specify a file name of @file{/} to refer to the root directory,
but the empty string is not a meaningful file name.  If you want to
refer to the current working directory, use a file name of @file{.} or
@file{./}.

Unlike some other operating systems, @gnusystems{} don't have any
built-in support for file types (or extensions) or file versions as part
of their file name syntax.  Many programs and utilities use conventions
for file names---for example, files containing C source code usually
have names suffixed with @samp{.c}---but there is nothing in the file
system itself that enforces this kind of convention.

@node File Name Errors, File Name Portability, File Name Resolution, File Names
@subsection File Name Errors

@cindex file name errors
@cindex usual file name errors

Functions that accept file name arguments usually detect these
@code{errno} error conditions relating to the file name syntax or
trouble finding the named file.  These errors are referred to throughout
this manual as the @dfn{usual file name errors}.

@table @code
@item EACCES
The process does not have search permission for a directory component
of the file name.

@item ENAMETOOLONG
This error is used either when the total length of a file name is
greater than @code{PATH_MAX}, or when an individual file name component
has a length greater than @code{NAME_MAX}.  @xref{Limits for Files}.

On @gnuhurdsystems{}, there is no imposed limit on overall file name
length, but some file systems may place limits on the length of a
component.

@item ENOENT
This error is reported when a file referenced as a directory component
in the file name doesn't exist, or when a component is a symbolic link
whose target file does not exist.  @xref{Symbolic Links}.

@item ENOTDIR
A file that is referenced as a directory component in the file name
exists, but it isn't a directory.

@item ELOOP
Too many symbolic links were resolved while trying to look up the file
name.  The system has an arbitrary limit on the number of symbolic links
that may be resolved in looking up a single file name, as a primitive
way to detect loops.  @xref{Symbolic Links}.
@end table
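
As an illustration of how these conditions surface in a program, the
sketch below attempts to open a file and hands back whatever
@code{errno} value @code{open} reported.  The function name
@code{open_error} is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to open FILENAME for reading.  Returns 0
   if that works, or the errno value that open reported otherwise.  */
int
open_error (const char *filename)
{
  int fd = open (filename, O_RDONLY);

  if (fd < 0)
    return errno;     /* for example ENOENT, ENOTDIR or EACCES */
  close (fd);
  return 0;
}
```

For example, if @file{notes} is a regular file, opening
@file{notes/x} should fail with @code{ENOTDIR}, whereas opening
@file{missing/x} when no @file{missing} exists should fail with
@code{ENOENT}.  The file names here are placeholders.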


@node File Name Portability,  , File Name Errors, File Names
@subsection Portability of File Names

The rules for the syntax of file names discussed in @ref{File Names},
are the rules normally used by @gnusystems{} and by other POSIX
systems.  However, other operating systems may use other conventions.

There are two reasons why it can be important for you to be aware of
file name portability issues:

@itemize @bullet
@item
If your program makes assumptions about file name syntax, or contains
embedded literal file name strings, it is more difficult to get it to
run under other operating systems that use different syntax conventions.

@item
Even if you are not concerned about running your program on machines
that run other operating systems, it may still be possible to access
files that use different naming conventions.  For example, you may be
able to access file systems on another computer running a different
operating system over a network, or read and write disks in formats used
by other operating systems.
@end itemize

The @w{ISO C} standard says very little about file name syntax, only that
file names are strings.  In addition to varying restrictions on the
length of file names and what characters can validly appear in a file
name, different operating systems use different conventions and syntax
for concepts such as structured directories and file types or
extensions.  Some concepts such as file versions might be supported by
some operating systems and not by others.

The POSIX.1 standard allows implementations to put additional
restrictions on file name syntax, concerning which characters are
permitted in file names and the length of file name and file name
component strings.  However, on @gnusystems{}, any character except
the null character is permitted in a file name string, and
on @gnuhurdsystems{} there are no limits on the length of file name
strings.
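
Because length limits vary between systems and even between file
systems, a program can ask for the limit at run time rather than
assume one.  The sketch below uses the POSIX function @code{pathconf}
with the @code{_PC_NAME_MAX} parameter.  The function name
@code{max_component_length} and its convention of returning 0 for
``no fixed limit'' are choices made for this example only.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: return the longest file name component
   allowed in DIRECTORY, 0 if the system reports no fixed limit
   there, or -1 on error.  */
long
max_component_length (const char *directory)
{
  long limit;

  errno = 0;
  limit = pathconf (directory, _PC_NAME_MAX);
  if (limit == -1)
    /* pathconf leaves errno untouched when there is no limit.  */
    return errno == 0 ? 0 : -1;
  return limit;
}
```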