.\" Copyright (c) 2007 Tim Kientzle
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd December 23, 2011
.Dt CPIO 5
.Os
.Sh NAME
.Nm cpio
.Nd format of cpio archive files
.Sh DESCRIPTION
The
.Nm
archive format collects any number of files, directories, and other
file system objects (symbolic links, device nodes, etc.) into a single
stream of bytes.
.Ss General Format
Each file system object in a
.Nm
archive comprises a header record with basic numeric metadata
followed by the full pathname of the entry and the file data.
The header record stores a series of integer values that generally
follow the fields in
.Va struct stat .
(See
.Xr stat 2
for details.)
The variants differ primarily in how they store those integers
(binary, octal, or hexadecimal).
The header is followed by the pathname of the
entry (the length of the pathname is stored in the header)
and any file data.
The end of the archive is indicated by a special record with
the pathname
.Dq TRAILER!!! .
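.Pp
Because each variant begins its header with a recognizable magic value
(given in the sections below), a reader can usually guess the variant
from the first few bytes of an entry.
The following C fragment is a minimal sketch of that idea, not code
taken from any cpio implementation.
The enumeration labels and the function name
.Fn cpio_guess_variant
are hypothetical, and the portable ASCII magic string
.Dq 070707
is inferred from the statement that its fields match those of the old
binary format.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical labels used only by this sketch. */
enum cpio_variant {
    CPIO_UNKNOWN,
    CPIO_BIN_LE,  /* old binary, little-endian 16-bit words */
    CPIO_BIN_BE,  /* old binary, big-endian 16-bit words */
    CPIO_ODC,     /* portable ASCII, magic "070707" (inferred) */
    CPIO_NEWC,    /* new ASCII, magic "070701" */
    CPIO_CRC      /* new CRC, magic "070702" */
};

/* Guess the variant from the first bytes of an entry header. */
enum cpio_variant cpio_guess_variant(const unsigned char *p, size_t len)
{
    if (len >= 6) {
        if (memcmp(p, "070707", 6) == 0)
            return CPIO_ODC;
        if (memcmp(p, "070701", 6) == 0)
            return CPIO_NEWC;
        if (memcmp(p, "070702", 6) == 0)
            return CPIO_CRC;
    }
    if (len >= 2) {
        /* Octal 070707 is 0x71C7 when held in a 16-bit word. */
        if (p[0] == 0xC7 && p[1] == 0x71)
            return CPIO_BIN_LE;
        if (p[0] == 0x71 && p[1] == 0xC7)
            return CPIO_BIN_BE;
    }
    return CPIO_UNKNOWN;
}
```

The ASCII magic strings are tested first because they are six bytes
long; the two-byte binary test is the fallback.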
.Ss PWB format
XXX Any documentation of the original PWB/UNIX 1.0 format? XXX
.Ss Old Binary Format
The old binary
.Nm
format stores numbers as 2-byte and 4-byte binary values.
Each entry begins with a header in the following format:
.Bd -literal -offset indent
struct header_old_cpio {
        unsigned short   c_magic;
        unsigned short   c_dev;
        unsigned short   c_ino;
        unsigned short   c_mode;
        unsigned short   c_uid;
        unsigned short   c_gid;
        unsigned short   c_nlink;
        unsigned short   c_rdev;
        unsigned short   c_mtime[2];
        unsigned short   c_namesize;
        unsigned short   c_filesize[2];
};
.Ed
.Pp
The
.Va unsigned short
fields here are 16-bit integer values; the two-element arrays
.Va c_mtime
and
.Va c_filesize
each hold one 32-bit integer value.
The fields are as follows:
.Bl -tag -width indent
.It Va magic
The integer value octal 070707.
This value can be used to determine whether this archive is
written with little-endian or big-endian integers.
.It Va dev , Va ino
The device and inode numbers from the disk.
These are used by programs that read
.Nm
archives to determine when two entries refer to the same file.
Programs that synthesize
.Nm
archives should be careful to set these to distinct values for each entry.
.It Va mode
The mode specifies both the regular permissions and the file type.
It consists of several bit fields as follows:
.Bl -tag -width "MMMMMMM" -compact
.It 0170000
This masks the file type bits.
.It 0140000
File type value for sockets.
.It 0120000
File type value for symbolic links.
For symbolic links, the link body is stored as file data.
.It 0100000
File type value for regular files.
.It 0060000
File type value for block special devices.
.It 0040000
File type value for directories.
.It 0020000
File type value for character special devices.
.It 0010000
File type value for named pipes or FIFOs.
.It 0004000
SUID bit.
.It 0002000
SGID bit.
.It 0001000
Sticky bit.
On some systems, this modifies the behavior of executables and/or directories.
.It 0000777
The lower 9 bits specify read/write/execute permissions
for world, group, and user following standard POSIX conventions.
.El
.It Va uid , Va gid
The numeric user id and group id of the owner.
.It Va nlink
The number of links to this file.
Directories always have a value of at least two here.
Note that hardlinked files include file data with every copy in the archive.
.It Va rdev
For block special and character special entries,
this field contains the associated device number.
For all other entry types, it should be set to zero by writers
and ignored by readers.
.It Va mtime
Modification time of the file, indicated as the number
of seconds since the start of the epoch,
00:00:00 UTC January 1, 1970.
The four-byte integer is stored with the most-significant 16 bits first
followed by the least-significant 16 bits.
Each of the two 16-bit values is stored in machine-native byte order.
.It Va namesize
The number of bytes in the pathname that follows the header.
This count includes the trailing NUL byte.
.It Va filesize
The size of the file.
Note that this archive format is limited to
four-gigabyte file sizes.
See
.Va mtime
above for a description of the storage of four-byte integers.
.El
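.Pp
The byte-order test on
.Va magic
and the split storage of the four-byte
.Va mtime
and
.Va filesize
values can be sketched in C as follows.
This is an illustration under stated assumptions rather than a
reference implementation: the function names and offset constants are
hypothetical, and the header is assumed to have been read into a byte
buffer with no padding between its thirteen 16-bit words.

```c
#include <stdint.h>

/* Byte offsets derived from the struct layout above (assumed unpadded). */
enum {
    CPIO_BIN_OFF_MTIME = 16,
    CPIO_BIN_OFF_NAMESIZE = 20,
    CPIO_BIN_OFF_FILESIZE = 22
};

/* Read one 16-bit header word; big_endian is nonzero for big-endian. */
uint16_t cpio_bin_u16(const unsigned char *p, int big_endian)
{
    if (big_endian)
        return (uint16_t)((p[0] << 8) | p[1]);
    return (uint16_t)((p[1] << 8) | p[0]);
}

/*
 * Decide the byte order from the magic word.
 * Returns 0 for little-endian, 1 for big-endian, -1 if neither matches.
 */
int cpio_bin_byte_order(const unsigned char *hdr)
{
    if (cpio_bin_u16(hdr, 0) == 070707)
        return 0;
    if (cpio_bin_u16(hdr, 1) == 070707)
        return 1;
    return -1;
}

/*
 * Read a four-byte value: the most-significant 16 bits come first,
 * and each 16-bit half uses the archive's own byte order.
 */
uint32_t cpio_bin_u32(const unsigned char *p, int big_endian)
{
    uint32_t hi = cpio_bin_u16(p, big_endian);
    uint32_t lo = cpio_bin_u16(p + 2, big_endian);

    return (hi << 16) | lo;
}
```

With these helpers, the file size of an entry whose header starts at
.Va hdr
would be read with
.Fn cpio_bin_u32
at offset
.Dv CPIO_BIN_OFF_FILESIZE ,
using the byte order reported by
.Fn cpio_bin_byte_order .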
.Pp
The pathname immediately follows the fixed header.
If the
.Va namesize
is odd, an additional NUL byte is added after the pathname.
The file data is then appended, padded with NUL
bytes to an even length.
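.Pp
The padding rule above determines how far a reader must advance to
reach the next header.
The following C sketch works through that arithmetic; the names are
hypothetical, and the 26-byte header size assumes the thirteen 16-bit
words of the structure are stored without padding.

```c
#include <stdint.h>

/* Thirteen 16-bit words (assumes no padding inside the header). */
#define CPIO_BIN_HEADER_BYTES 26u

/* Round n up to the next even number. */
uint64_t cpio_bin_pad2(uint64_t n)
{
    return n + (n & 1u);
}

/* Total bytes one entry occupies: header, padded name, padded data. */
uint64_t cpio_bin_entry_size(uint16_t namesize, uint32_t filesize)
{
    return CPIO_BIN_HEADER_BYTES
        + cpio_bin_pad2(namesize)
        + cpio_bin_pad2(filesize);
}
```

For example, the end-of-archive record has an eleven-byte name
(ten characters plus the NUL) and no data, so under these assumptions
it occupies 26 + 12 + 0 = 38 bytes.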
.Pp
Hardlinked files are not given special treatment;
the full file contents are included with each copy of the
file.
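.Pp
The bit fields listed under
.Va mode
above can be decoded with simple masks.
The C fragment below is a small sketch of that decoding; the function
names and the returned strings are hypothetical and chosen only for
illustration.

```c
#include <stdint.h>

/* Mask for the file type bits, as listed under "mode". */
#define CPIO_TYPE_MASK 0170000u

/* Return a descriptive label for the file type stored in mode. */
const char *cpio_file_type(uint32_t mode)
{
    switch (mode & CPIO_TYPE_MASK) {
    case 0140000: return "socket";
    case 0120000: return "symbolic link";
    case 0100000: return "regular file";
    case 0060000: return "block special";
    case 0040000: return "directory";
    case 0020000: return "character special";
    case 0010000: return "FIFO";
    default:      return "unknown";
    }
}

/* The lower 9 bits: read/write/execute permissions. */
uint32_t cpio_permissions(uint32_t mode)
{
    return mode & 0000777u;
}

/* Nonzero if the SUID bit is set. */
int cpio_is_setuid(uint32_t mode)
{
    return (mode & 0004000u) != 0;
}
```

Types that are not in the list, such as the Solaris extensions
mentioned later, fall through to the default case.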
.Ss Portable ASCII Format
.St -susv2
standardized an ASCII variant that is portable across all
platforms.
It is commonly known as the
.Dq old character
format or as the
.Dq odc
format.
It stores the same numeric fields as the old binary format, but
represents them as 6-character or 11-character octal values.
.Bd -literal -offset indent
struct cpio_odc_header {
        char    c_magic[6];
        char    c_dev[6];
        char    c_ino[6];
        char    c_mode[6];
        char    c_uid[6];
        char    c_gid[6];
        char    c_nlink[6];
        char    c_rdev[6];
        char    c_mtime[11];
        char    c_namesize[6];
        char    c_filesize[11];
};
.Ed
.Pp
The fields are identical to those in the old binary format.
The name and file body follow the fixed header.
Unlike the old binary format, there is no additional padding
after the pathname or file contents.
If the files being archived are themselves entirely ASCII, then
the resulting archive will be entirely ASCII, except for the
NUL byte that terminates the name field.
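.Pp
Reading a portable ASCII header amounts to converting fixed-width
octal digit strings that carry no terminator.
The following C sketch shows one way to do that; the function name and
the offset constants are hypothetical, the offsets are computed from the
field widths in the structure above, and rejecting any non-octal
character (including spaces) is an assumption of this sketch rather
than a stated requirement of the format.

```c
#include <stddef.h>
#include <stdint.h>

/* Eight 6-character fields, one 6-character and two 11-character fields. */
#define CPIO_ODC_HEADER_SIZE 76u
#define CPIO_ODC_OFF_NAMESIZE 59u
#define CPIO_ODC_OFF_FILESIZE 65u

/*
 * Parse a fixed-width octal field that has no terminator.
 * Returns 0 on success, -1 if a non-octal character is found.
 */
int cpio_parse_octal(const char *field, size_t width, uint64_t *out)
{
    uint64_t v = 0;
    size_t i;

    for (i = 0; i < width; i++) {
        if (field[i] < '0' || field[i] > '7')
            return -1;
        v = (v << 3) | (uint64_t)(field[i] - '0');
    }
    *out = v;
    return 0;
}
```

A 64-bit accumulator is used because the 11-character fields can hold
33-bit values, which do not fit in 32 bits.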
.Ss New ASCII Format
The "new" ASCII format uses 8-byte hexadecimal fields for
all numbers and splits device numbers into separate fields
for major and minor numbers.
.Bd -literal -offset indent
struct cpio_newc_header {
        char    c_magic[6];
        char    c_ino[8];
        char    c_mode[8];
        char    c_uid[8];
        char    c_gid[8];
        char    c_nlink[8];
        char    c_mtime[8];
        char    c_filesize[8];
        char    c_devmajor[8];
        char    c_devminor[8];
        char    c_rdevmajor[8];
        char    c_rdevminor[8];
        char    c_namesize[8];
        char    c_check[8];
};
.Ed
.Pp
Except as specified below, the fields here match those specified
for the old binary format above.
.Bl -tag -width indent
.It Va magic
The string
.Dq 070701 .
.It Va check
This field is always set to zero by writers and ignored by readers.
See the next section for more details.
.El
.Pp
The pathname is followed by NUL bytes so that the total size
of the fixed header plus pathname is a multiple of four.
Likewise, the file data is padded to a multiple of four bytes.
Note that this format supports only 4-gigabyte files (unlike the
older ASCII format, which supports 8-gigabyte files).
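.Pp
The hexadecimal fields and the four-byte alignment rule can be sketched
in C as follows.
This is an illustrative fragment, not a reference implementation: the
function and macro names are hypothetical, the 110-byte header size is
computed from the structure above, and accepting both upper-case and
lower-case hexadecimal digits is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* 6-byte magic plus thirteen 8-byte fields. */
#define CPIO_NEWC_HEADER_SIZE 110u

/* Parse one 8-character hexadecimal field; returns 0 on success, -1 on error. */
int cpio_parse_hex8(const char *field, uint32_t *out)
{
    uint32_t v = 0;
    size_t i;

    for (i = 0; i < 8; i++) {
        char c = field[i];
        uint32_t d;

        if (c >= '0' && c <= '9')
            d = (uint32_t)(c - '0');
        else if (c >= 'a' && c <= 'f')
            d = (uint32_t)(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F')
            d = (uint32_t)(c - 'A' + 10);
        else
            return -1;
        v = (v << 4) | d;
    }
    *out = v;
    return 0;
}

/* Round n up to the next multiple of four. */
uint64_t cpio_pad4(uint64_t n)
{
    return (n + 3u) & ~(uint64_t)3u;
}

/* Offset of the file data, measured from the start of the header. */
uint64_t cpio_newc_data_offset(uint32_t namesize)
{
    return cpio_pad4((uint64_t)CPIO_NEWC_HEADER_SIZE + namesize);
}

/* Total bytes one entry occupies, including both runs of padding. */
uint64_t cpio_newc_entry_size(uint32_t namesize, uint32_t filesize)
{
    return cpio_newc_data_offset(namesize) + cpio_pad4(filesize);
}
```

Note that the name padding is computed on the header and pathname
together, not on the pathname alone, because the header size of 110 is
not itself a multiple of four.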
.Pp
In this format, hardlinked files are handled by setting the
filesize to zero for each entry except the last one that
appears in the archive.
.Ss New CRC Format
The CRC format is identical to the new ASCII format described
in the previous section except that the magic field is set
to
.Dq 070702
and the
.Va check
field is set to the sum of all bytes in the file data.
This sum is computed treating all bytes as unsigned values
and using unsigned arithmetic.
Only the least-significant 32 bits of the sum are stored.
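.Pp
The checksum described above is a plain byte sum, which can be
sketched in C as follows.
The function name
.Fn cpio_crc_update
is hypothetical; it takes a running sum so that file data can be fed
in several pieces.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Add len bytes of file data to a running checksum.
 * Bytes are treated as unsigned, and uint32_t arithmetic wraps
 * modulo 2^32, which keeps only the least-significant 32 bits.
 */
uint32_t cpio_crc_update(uint32_t sum, const unsigned char *data, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        sum += data[i];
    return sum;
}
```

A writer would start with a sum of zero, call the function for each
block of file data, and store the final value in the
.Va check
field.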
.Ss HP variants
The
.Nm cpio
implementation distributed with HPUX used XXXX but stored
device numbers differently XXX.
.Ss Other Extensions and Variants
Sun Solaris uses additional file types to store extended file
data, including ACLs and extended attributes, as special
entries in cpio archives.
.Pp
XXX Others? XXX
.Sh SEE ALSO
.Xr cpio 1 ,
.Xr tar 5
.Sh STANDARDS
The
.Nm cpio
utility is no longer a part of POSIX or the Single UNIX Specification.
It last appeared in
.St -susv2 .
It has been supplanted in subsequent standards by
.Xr pax 1 .
The portable ASCII format is currently part of the specification for the
.Xr pax 1
utility.
.Sh HISTORY
The original cpio utility was written by Dick Haight
while working in AT&T's Unix Support Group.
It appeared in 1977 as part of PWB/UNIX 1.0, the
.Dq Programmer's Work Bench
derived from
.At v6
that was used internally at AT&T.
Both the old binary and old character formats were in use
by 1980, according to the System III source released
by SCO under their
.Dq Ancient Unix
license.
The character format was adopted as part of
.St -p1003.1-88 .
XXX when did "newc" appear?  Who invented it?  When did HP come out with their variant?  When did Sun introduce ACLs and extended attributes? XXX
.Sh BUGS
The
.Dq CRC
format is mis-named, as it uses a simple checksum and
not a cyclic redundancy check.
.Pp
The old binary format is limited to 16 bits for user id,
group id, device, and inode numbers.
It is limited to 4-gigabyte file sizes.
.Pp
The old ASCII format is limited to 18 bits for
the user id, group id, device, and inode numbers.
It is limited to 8-gigabyte file sizes.
.Pp
The new ASCII format is limited to 4-gigabyte file sizes.
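.Pp
The limits above follow directly from the field widths: six octal
digits hold 18 bits, eleven octal digits hold 33 bits, and eight
hexadecimal digits hold 32 bits.
The C fragment below is a small sketch of how a writer might check a
value against those limits before choosing a format; all names in it
are hypothetical.

```c
#include <stdint.h>

/* Largest values that fit each field width. */
#define CPIO_BIN_ID_MAX    0xFFFFu           /* 16 bits */
#define CPIO_ODC_ID_MAX    0777777u          /* 6 octal digits, 18 bits */
#define CPIO_ODC_SIZE_MAX  077777777777ull   /* 11 octal digits, 33 bits */
#define CPIO_NEWC_SIZE_MAX 0xFFFFFFFFu       /* 8 hex digits, 32 bits */

/* Each helper returns nonzero if the value fits the named field. */
int cpio_bin_id_fits(uint64_t v)
{
    return v <= CPIO_BIN_ID_MAX;
}

int cpio_odc_id_fits(uint64_t v)
{
    return v <= CPIO_ODC_ID_MAX;
}

int cpio_odc_size_fits(uint64_t size)
{
    return size <= CPIO_ODC_SIZE_MAX;
}

int cpio_newc_size_fits(uint64_t size)
{
    return size <= CPIO_NEWC_SIZE_MAX;
}
```

A file of exactly 4 gigabytes (4294967296 bytes) therefore does not
fit the new ASCII format, although it does fit the old ASCII format.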
.Pp
None of the cpio formats store user or group names,
which are essential when moving files between systems with
dissimilar user or group numbering.
.Pp
Especially when writing older cpio variants, it may be necessary
to map actual device/inode values to synthesized values that
fit the available fields.
With very large filesystems, this may be necessary even for
the newer formats.