<html>
<head>
<meta name="generator" content="groff -Thtml, see www.gnu.org">
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<meta name="Content-Style" content="text/css">
<style type="text/css">
       p       { margin-top: 0; margin-bottom: 0; vertical-align: top }
       pre     { margin-top: 0; margin-bottom: 0; vertical-align: top }
       table   { margin-top: 0; margin-bottom: 0; vertical-align: top }
       h1      { text-align: center }
</style>
<title></title>
</head>
<body>
TAR(5) BSD File Formats Manual TAR(5)

NAME

tar -- format of tape archive files

DESCRIPTION

The tar archive format collects any number of files, directories, and
other file system objects (symbolic links, device nodes, etc.) into a
single stream of bytes. The format was originally designed to be used
with tape drives that operate with fixed-size blocks, but is widely
used as a general packaging mechanism.

General Format

A tar archive consists of a series of 512-byte records. Each file
system object requires a header record which stores basic metadata
(pathname, owner, permissions, etc.) and zero or more records
containing any file data. The end of the archive is indicated by two
records consisting entirely of zero bytes.
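
The record layout described above can be sketched in C as follows. This
is only an illustration: it assumes that file data is padded up to a
whole number of records (which the record-oriented layout implies), and
the names TAR_RECORD_SIZE, tar_data_records, and tar_record_is_zero are
hypothetical rather than taken from any implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define TAR_RECORD_SIZE 512

/* Number of 512-byte data records needed to hold `size` bytes of file
 * data, assuming the final record is padded to a full record. */
static uint64_t tar_data_records(uint64_t size)
{
    return size / TAR_RECORD_SIZE + (size % TAR_RECORD_SIZE != 0);
}

/* Returns 1 if the record consists entirely of zero bytes, else 0. */
static int tar_record_is_zero(const unsigned char rec[TAR_RECORD_SIZE])
{
    for (size_t i = 0; i < TAR_RECORD_SIZE; i++) {
        if (rec[i] != 0)
            return 0;
    }
    return 1;
}
```

A reader built on this sketch would treat two consecutive records for
which tar_record_is_zero() returns 1 as the end of the archive.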

For compatibility with tape drives that use fixed block sizes,
programs that read or write tar files always read or write a fixed
number of records with each I/O operation. These "blocks" are always a
multiple of the record size. The maximum block size supported by early
implementations was 10240 bytes or 20 records. This is still the
default for most implementations although block sizes of 1 MiB (2048
records) or larger are commonly used with modern high-speed tape
drives. (Note: the terms "block" and "record" here are not entirely
standard; this document follows the convention established by John
Gilmore in documenting pdtar.)

Old-Style Archive Format

The original tar archive format has been extended many times to
include additional information that various implementors found
necessary. This section describes the variant implemented by the tar
command included in Version 7 AT&T UNIX, which seems to be the
earliest widely-used version of the tar program.

The header record for an old-style tar archive consists of the
following:

struct header_old_tar {
        char name[100];
        char mode[8];
        char uid[8];
        char gid[8];
        char size[12];
        char mtime[12];
        char checksum[8];
        char linkflag[1];
        char linkname[100];
        char pad[255];
};

All unused bytes in the header record are filled with nulls.

name

Pathname, stored as a null-terminated string. Early tar
implementations only stored regular files (including hardlinks to
those files). One common early convention used a trailing "/"
character to indicate a directory name, allowing directory permissions
and owner information to be archived and restored.

mode

File mode, stored as an octal number in ASCII.

uid, gid

User id and group id of owner, as octal numbers in ASCII.

size

Size of file, as an octal number in ASCII. For regular files only,
this indicates the amount of data that follows the header. In
particular, this field was ignored by early tar implementations when
extracting hardlinks. Modern writers should always store a zero length
for hardlink entries.

mtime

Modification time of file, as an octal number in ASCII. This indicates
the number of seconds since the start of the epoch, 00:00:00 UTC
January 1, 1970. Note that negative values should be avoided here, as
they are handled inconsistently.

checksum

Header checksum, stored as an octal number in ASCII. To compute the
checksum, set the checksum field to all spaces, then sum all bytes in
the header using unsigned arithmetic. This field should be stored as
six octal digits followed by a null and a space character. Note that
many early implementations of tar used signed arithmetic for the
checksum field, which can cause interoperability problems when
transferring archives between systems. Modern robust readers compute
the checksum both ways and accept the header if either computation
matches.

linkflag, linkname

In order to preserve hardlinks and conserve tape, a file with multiple
links is only written to the archive the first time it is encountered.
The next time it is encountered, the linkflag is set to an ASCII '1'
and the linkname field holds the first name under which this file
appears. (Note that regular files have a null value in the linkflag
field.)
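
The checksum rule in the field list above (treat the checksum field as
spaces, sum every header byte, and accept either the unsigned or the
signed sum) might look like the following C sketch. The offset 148 is
derived from the struct layout (100 + 8 + 8 + 8 + 12 + 12); the
function and macro names are hypothetical.

```c
#include <stddef.h>

#define TAR_RECORD_SIZE   512
#define TAR_CHKSUM_OFFSET 148   /* bytes preceding checksum[8] in the header */
#define TAR_CHKSUM_SIZE   8

/* Compute the header sum both ways: with bytes taken as unsigned
 * values and as signed 8-bit values. The checksum field itself is
 * counted as if it held eight ASCII spaces. */
static void tar_checksums(const unsigned char *hdr,
                          unsigned long *usum, long *ssum)
{
    unsigned long u = 0;
    long s = 0;

    for (size_t i = 0; i < TAR_RECORD_SIZE; i++) {
        unsigned char b = hdr[i];
        if (i >= TAR_CHKSUM_OFFSET &&
            i < TAR_CHKSUM_OFFSET + TAR_CHKSUM_SIZE)
            b = ' ';
        u += b;
        s += (b < 128) ? (long)b : (long)b - 256;  /* signed-char view */
    }
    *usum = u;
    *ssum = s;
}

/* Returns 1 if `stored` (the already-parsed checksum field) matches
 * either the unsigned or the signed computation. */
static int tar_checksum_ok(const unsigned char *hdr, unsigned long stored)
{
    unsigned long u;
    long s;

    tar_checksums(hdr, &u, &s);
    return stored == u || (s >= 0 && stored == (unsigned long)s);
}
```

The two sums differ only when the header contains bytes with the high
bit set, for example in pathnames with non-ASCII characters.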

Early tar implementations varied in how they terminated these fields.
The tar command in Version 7 AT&T UNIX used the following conventions
(this is also documented in early BSD manpages): the pathname must be
null-terminated; the mode, uid, and gid fields must end in a space and
a null byte; the size and mtime fields must end in a space; the
checksum is terminated by a null and a space. Early implementations
filled the numeric fields with leading spaces. This seems to have been
common practice until the IEEE Std 1003.1-1988 ("POSIX.1") standard
was released. For best portability, modern implementations should fill
the numeric fields with leading zeros.
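
Because of these differing conventions, a tolerant reader has to accept
leading spaces as well as leading zeros and stop at either a space or a
null. The following C sketch shows one way to do that; tar_parse_octal
is a hypothetical name, and the code assumes the field holds at most 21
octal digits so that the result fits in 64 bits (true for the 8- and
12-byte fields described here).

```c
#include <stddef.h>
#include <stdint.h>

/* Parse an octal header field of `len` bytes. Leading spaces are
 * skipped; parsing stops at the first space, null byte, or the end of
 * the field. Returns 0 on success, -1 on a non-octal character. */
static int tar_parse_octal(const char *field, size_t len, uint64_t *out)
{
    size_t i = 0;
    uint64_t v = 0;

    while (i < len && field[i] == ' ')      /* old-style space padding */
        i++;
    for (; i < len; i++) {
        char c = field[i];
        if (c == ' ' || c == '\0')          /* either terminator */
            break;
        if (c < '0' || c > '7')
            return -1;
        v = (v << 3) | (uint64_t)(c - '0');
    }
    *out = v;
    return 0;
}
```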

Pre-POSIX Archives

An early draft of IEEE Std 1003.1-1988 ("POSIX.1") served as the basis
for John Gilmore's pdtar program and many system implementations from
the late 1980s and early 1990s. These archives generally follow the
POSIX ustar format described below with the following variations:

The magic value consists of the five characters "ustar" followed by a
space. The version field contains a space character followed by a
null.

The numeric fields are generally filled with leading spaces (not
leading zeros as recommended in the final standard).

The prefix field is often not used, limiting pathnames to the 100
characters of old-style archives.

POSIX ustar Archives

IEEE Std 1003.1-1988 ("POSIX.1") defined a standard tar file format to
be read and written by compliant implementations of tar(1). This
format is often called the "ustar" format, after the magic value used
in the header. (The name is an acronym for "Unix Standard TAR".) It
extends the historic format with new fields:

struct header_posix_ustar {
        char name[100];
        char mode[8];
        char uid[8];
        char gid[8];
        char size[12];
        char mtime[12];
        char checksum[8];
        char typeflag[1];
        char linkname[100];
        char magic[6];
        char version[2];
        char uname[32];
        char gname[32];
        char devmajor[8];
        char devminor[8];
        char prefix[155];
        char pad[12];
};

typeflag

Type of entry. POSIX extended the earlier linkflag field with several
new type values:

"0"     Regular file. NUL should be treated as a synonym, for
        compatibility purposes.
"1"     Hard link.
"2"     Symbolic link.
"3"     Character device node.
"4"     Block device node.
"5"     Directory.
"6"     FIFO node.
"7"     Reserved.
Other   A POSIX-compliant implementation must treat any unrecognized
        typeflag value as a regular file. In particular, writers
        should ensure that all entries have a valid filename so that
        they can be restored by readers that do not support the
        corresponding extension. Uppercase letters "A" through "Z" are
        reserved for custom extensions. Note that sockets and whiteout
        entries are not archivable.
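
The typeflag table above maps naturally onto a switch statement. The
sketch below is illustrative only; the enum and function names are
hypothetical. It follows the rules stated above: NUL is a synonym for
"0", and any unrecognized value (including the reserved "7") falls back
to a regular file.

```c
/* Entry kinds defined by the POSIX ustar typeflag values. */
enum tar_entry_kind {
    TAR_REGULAR,
    TAR_HARDLINK,
    TAR_SYMLINK,
    TAR_CHARDEV,
    TAR_BLOCKDEV,
    TAR_DIRECTORY,
    TAR_FIFO
};

/* Classify a typeflag byte. Unrecognized values are treated as
 * regular files, as required of POSIX-compliant readers. */
static enum tar_entry_kind tar_entry_kind_from_typeflag(char typeflag)
{
    switch (typeflag) {
    case '1': return TAR_HARDLINK;
    case '2': return TAR_SYMLINK;
    case '3': return TAR_CHARDEV;
    case '4': return TAR_BLOCKDEV;
    case '5': return TAR_DIRECTORY;
    case '6': return TAR_FIFO;
    case '0':
    case '\0':      /* old-style archives store NUL for regular files */
    default:
        return TAR_REGULAR;
    }
}
```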

It is worth noting that the size field, in particular, has different
meanings depending on the type. For regular files, of course, it
indicates the amount of data following the header. For directories, it
may be used to indicate the total size of all files in the directory,
for use by operating systems that pre-allocate directory space. For
all other types, it should be set to zero by writers and ignored by
readers.

magic

Contains the magic value "ustar" followed by a NUL byte to indicate
that this is a POSIX standard archive. Full compliance requires the
uname and gname fields be properly set.

version

Version. This should be "00" (two copies of the ASCII digit zero) for
POSIX standard archives.

uname, gname

User and group names, as null-terminated ASCII strings. These should
be used in preference to the uid/gid values when they are set and the
corresponding names exist on the system.

devmajor, devminor

Major and minor numbers for character device or block device entry.

name, prefix

If the pathname is too long to fit in the 100 bytes provided by the
standard format, it can be split at any / character with the first
portion going into the prefix field. If the prefix field is not empty,
the reader will prepend the prefix value and a / character to the
regular name field to obtain the full pathname. The standard does not
require a trailing / character on directory names, though most
implementations still include this for compatibility reasons.
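
The way a reader rebuilds the full pathname from the prefix and name
fields can be sketched as below. The helper names are hypothetical. The
sketch assumes, as ustar permits, that either field may fill its entire
width with no terminating NUL, so lengths are found with memchr()
rather than strlen().

```c
#include <stddef.h>
#include <string.h>

/* Length of a header field that is NUL-terminated unless it fills
 * all `max` bytes. */
static size_t field_len(const char *f, size_t max)
{
    const char *nul = memchr(f, '\0', max);
    return nul ? (size_t)(nul - f) : max;
}

/* Join prefix and name into `out`, which must hold at least
 * 155 + 1 + 100 + 1 = 257 bytes. Returns the pathname length. */
static size_t ustar_full_path(const char prefix[155], const char name[100],
                              char out[257])
{
    size_t plen = field_len(prefix, 155);
    size_t nlen = field_len(name, 100);
    size_t pos = 0;

    if (plen > 0) {                 /* non-empty prefix: prefix + "/" */
        memcpy(out, prefix, plen);
        pos = plen;
        out[pos++] = '/';
    }
    memcpy(out + pos, name, nlen);
    pos += nlen;
    out[pos] = '\0';
    return pos;
}
```

With both fields completely filled, the result is 155 + 1 + 100 = 256
characters long, with the joining / as the 156th character.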

Note that all unused bytes must be set to NUL.

Field termination is specified slightly differently by POSIX than by
previous implementations. The magic, uname, and gname fields must have
a trailing NUL. The pathname, linkname, and prefix fields must have a
trailing NUL unless they fill the entire field. (In particular, it is
possible to store a 256-character pathname if it happens to have a /
as the 156th character.) POSIX requires numeric fields to be
zero-padded in the front, and requires them to be terminated with
either space or NUL characters.

Currently, most tar implementations comply with the ustar format,
occasionally extending it by adding new fields to the blank area at
the end of the header record.

Numeric Extensions

There have been several attempts to extend the range of sizes or times
supported by modifying how numbers are stored in the header.

One obvious extension to increase the size of files is to eliminate
the terminating characters from the various numeric fields. For
example, the standard only allows the size field to contain 11 octal
digits, reserving the twelfth byte for a trailing NUL character. Using
all 12 bytes for octal digits allows file sizes up to 64 GiB (8^12 =
2^36 bytes).

Another extension, utilized by GNU tar, star, and other newer tar
implementations, permits binary numbers in the standard numeric
fields. This is flagged by setting the high bit of the first byte. The
remainder of the field is treated as a signed two's-complement value.
This permits 95-bit values for the length and time fields and 63-bit
values for the uid, gid, and device numbers. In particular, this
provides a consistent way to handle negative time values. GNU tar
supports this extension for the length, mtime, ctime, and atime
fields. Joerg Schilling's star program and the libarchive library
support this extension for all numeric fields. Note that this
extension is largely obsoleted by the extended attribute record
provided by the pax interchange format.
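
A decoder for the binary form described above might look like the
following C sketch. It is not taken from GNU tar, star, or libarchive;
tar_parse_base256 is a hypothetical name, and the code assumes the
caller wants the result in an int64_t, so values that need more than 64
bits are rejected rather than truncated.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a numeric field stored in binary. The high bit of the first
 * byte is the flag; the remaining bits form a big-endian signed
 * two's-complement value. Returns 0 on success, -1 if the flag is not
 * set or the value does not fit in int64_t. */
static int tar_parse_base256(const unsigned char *field, size_t len,
                             int64_t *out)
{
    if (len == 0 || (field[0] & 0x80) == 0)
        return -1;

    int negative = (field[0] & 0x40) != 0;     /* sign bit of the value */
    unsigned char fill = negative ? 0xff : 0x00;
    uint64_t v = negative ? UINT64_MAX : 0;    /* start sign-extended */

    for (size_t i = 0; i < len; i++) {
        unsigned char b = field[i];
        if (i == 0)                            /* replace flag bit by sign */
            b = negative ? (unsigned char)(b | 0x80)
                         : (unsigned char)(b & 0x7f);
        if ((unsigned char)(v >> 56) != fill)  /* would shift out data */
            return -1;
        v = (v << 8) | b;
    }
    if (((v >> 63) != 0) != (negative != 0))   /* sign changed: overflow */
        return -1;

    /* Convert without relying on implementation-defined casts. */
    *out = negative ? -(int64_t)(~v) - 1 : (int64_t)v;
    return 0;
}
```

A reader would typically try this decoder first and fall back to octal
parsing when it returns -1 because the flag bit is clear.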

Another early GNU extension allowed base-64 values rather than octal.
This extension was short-lived and is no longer supported by any
implementation.

Pax Interchange Format

There are many attributes that cannot be portably stored in a POSIX
ustar archive. IEEE Std 1003.1-2001 ("POSIX.1") defined a "pax
interchange format" that uses two new types of entries to hold
text-formatted metadata that applies to following entries. Note that a
pax interchange format archive is a ustar archive in every respect.
The new data is stored in ustar-compatible archive entries that use
the "x" or "g" typeflag. In particular, older implementations that do
not fully support these extensions will extract the metadata into
regular files, where the metadata can be examined as necessary.

An entry in a pax interchange format archive consists of one or two
standard ustar entries, each with its own header and data. The first
optional entry stores the extended attributes for the following entry.
This optional first entry has an "x" typeflag and a size field that
indicates the total size of the extended attributes. The extended
attributes themselves are stored as a series of text-format lines
encoded in the portable UTF-8 encoding. Each line consists of a
decimal number, a space, a key string, an equals sign, a value string,
and a newline. The decimal number indicates the length of the entire
line, including the initial length field and the trailing newline. An
example of such a field is:

        25 ctime=1084839148.1212\n
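
Because the length prefix counts its own digits, a writer cannot simply
add up the other parts of the line. The C sketch below shows one way to
compute it; the function names are hypothetical, and the code assumes
the key and value are ordinary NUL-terminated C strings (a value with
embedded NUL bytes would need an explicit length instead).

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Number of decimal digits needed to print n. */
static size_t decimal_digits(size_t n)
{
    size_t d = 1;
    while (n >= 10) {
        n /= 10;
        d++;
    }
    return d;
}

/* Total length of a pax line, including the length field itself. */
static size_t pax_record_length(const char *key, const char *value)
{
    /* ' ' + key + '=' + value + '\n' */
    size_t rest = 1 + strlen(key) + 1 + strlen(value) + 1;
    size_t total = rest + decimal_digits(rest);

    /* Adding the digits can push the total past a power of ten,
     * in which case the length field needs one more digit. */
    if (decimal_digits(total) != decimal_digits(rest))
        total = rest + decimal_digits(total);
    return total;
}

/* Write one pax line into `out`. Returns the line length, or -1 if
 * the buffer is too small. */
static int pax_format_record(char *out, size_t outsz,
                             const char *key, const char *value)
{
    size_t total = pax_record_length(key, value);

    if (total + 1 > outsz)
        return -1;
    snprintf(out, outsz, "%zu %s=%s\n", total, key, value);
    return (int)total;
}
```

For the ctime example above, the parts other than the length field add
up to 23 bytes, and the two digits of "25" bring the total to 25.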

Keys in all lowercase are standard keys. Vendors can add their own
keys by prefixing them with an all uppercase vendor name and a period.
Note that, unlike the historic header, numeric values are stored using
decimal, not octal. A description of some common keys follows:

atime, ctime, mtime

File access, inode change, and modification times. These fields can be
negative or include a decimal point and a fractional value.

hdrcharset

The character set used by the pax extension values. By default, all
textual values in the pax extended attributes are assumed to be in
UTF-8, including pathnames, user names, and group names. In some
cases, it is not possible to translate local conventions into UTF-8.
If this key is present and the value is the six-character ASCII string
"BINARY", then all textual values are assumed to be in a
platform-dependent multi-byte encoding. Note that there are only two
valid values for this key: "BINARY" or "ISO-IR 10646 2000 UTF-8". No
other values are permitted by the standard, and the latter value
should generally not be used as it is the default when this key is not
specified. In particular, this flag should not be used as a general
mechanism to allow filenames to be stored in arbitrary encodings.

uname, uid, gname, gid

User name, group name, and numeric UID and GID values. The user name
and group name stored here are encoded in UTF-8 and can thus include
non-ASCII characters. The UID and GID fields can be of arbitrary
length.

linkpath

The full path of the linked-to file. Note that this is encoded in
UTF-8 and can thus include non-ASCII characters.

path

The full pathname of the entry. Note that this is encoded in UTF-8 and
can thus include non-ASCII characters.

realtime.*, security.*

These keys are reserved and may be used for future standardization.

size

The size of the file. Note that there is no length limit on this
field, allowing conforming archives to store files much larger than
the historic 8 GiB limit.

SCHILY.*

Vendor-specific attributes used by Joerg Schilling's star
implementation.

SCHILY.acl.access, SCHILY.acl.default, SCHILY.acl.ace

Stores the access, default and NFSv4 ACLs as textual strings in a
format that is an extension of the format specified by POSIX.1e draft
17. In particular, each user or group access specification can include
an additional colon-separated field with the numeric UID or GID. This
allows ACLs to be restored on systems that may not have complete user
or group information available (such as when NIS/YP or LDAP services
are temporarily unavailable).

SCHILY.devminor, SCHILY.devmajor

The full minor and major numbers for device nodes.

SCHILY.fflags

The file flags.

SCHILY.realsize

The full size of the file on disk. XXX explain? XXX

SCHILY.dev, SCHILY.ino, SCHILY.nlinks

The device number, inode number, and link count for the entry. In
particular, note that a pax interchange format archive using Joerg
Schilling's SCHILY.* extensions can store all of the data from struct
stat.

LIBARCHIVE.*

Vendor-specific attributes used by the libarchive library and programs
that use it.

LIBARCHIVE.creationtime

The time when the file was created. (This should not be confused with
the POSIX "ctime" attribute, which refers to the time when the file
metadata was last changed.)

LIBARCHIVE.xattr.namespace.key

Libarchive stores POSIX.1e-style extended attributes using keys of
this form. The key value is URL-encoded: All non-ASCII characters and
the two special characters "=" and "%" are encoded as "%" followed by
two uppercase hexadecimal digits. The value of this key is the
extended attribute value encoded in base 64. XXX Detail the base-64
format here XXX

VENDOR.*

XXX document other vendor-specific extensions XXX
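
The URL-encoding rule given for LIBARCHIVE.xattr keys can be
illustrated with a short C sketch. This follows only the rule as stated
in this document (non-ASCII bytes, "=" and "%" are escaped); it is not
libarchive's own code, and xattr_key_encode is a hypothetical name.

```c
#include <stddef.h>

/* Encode an extended attribute name for use in a pax key. Bytes with
 * the high bit set, '=' and '%' become '%' plus two uppercase hex
 * digits. Returns the encoded length, or -1 if `out` is too small. */
static int xattr_key_encode(const char *key, char *out, size_t outsz)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t pos = 0;

    for (const unsigned char *p = (const unsigned char *)key; *p; p++) {
        if (*p >= 0x80 || *p == '=' || *p == '%') {
            if (pos + 3 >= outsz)       /* 3 bytes plus final NUL */
                return -1;
            out[pos++] = '%';
            out[pos++] = hex[*p >> 4];
            out[pos++] = hex[*p & 0x0f];
        } else {
            if (pos + 1 >= outsz)       /* 1 byte plus final NUL */
                return -1;
            out[pos++] = (char)*p;
        }
    }
    if (pos >= outsz)
        return -1;
    out[pos] = '\0';
    return (int)pos;
}
```

For example, the attribute name "user.a=b%c" would be stored under the
key suffix "user.a%3Db%25c".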

Any values stored in an extended attribute override the corresponding
values in the regular tar header. Note that compliant readers should
ignore the regular fields when they are overridden. This is important,
as existing archivers are known to store non-compliant values in the
standard header fields in this situation. There are no limits on
length for any of these fields. In particular, numeric fields can be
arbitrarily large. All text fields are encoded in UTF-8. Compliant
writers should store only portable 7-bit ASCII characters in the
standard ustar header and use extended attributes whenever a text
value contains non-ASCII characters.

In addition to the x entry described above, the pax interchange format
also supports a g entry. The g entry is identical in format, but
specifies attributes that serve as defaults for all subsequent archive
entries. The g entry is not widely used.

Besides the new x and g entries, the pax interchange format has a few
other minor variations from the earlier ustar format. The most
troubling one is that hardlinks are permitted to have data following
them. This allows readers to restore any hardlink to a file without
having to rewind the archive to find an earlier entry. However, it
creates complications for robust readers, as it is no longer clear
whether or not they should ignore the size field for hardlink entries.

GNU Tar Archives

The GNU tar program started with a pre-POSIX format similar to that
described earlier and has extended it using several different
mechanisms: It added new fields to the empty space in the header (some
of which was later used by POSIX for conflicting purposes); it allowed
the header to be continued over multiple records; and it defined new
entries that modify following entries (similar in principle to the x
entry described above, but each GNU special entry is single-purpose,
unlike the general-purpose x entry). As a result, GNU tar archives are
not POSIX compatible, although more lenient POSIX-compliant readers
can successfully extract most GNU tar archives.

Packit 08bd4c
Packit 08bd4c

struct

Packit 08bd4c
header_gnu_tar {

Packit 08bd4c
Packit 08bd4c
Packit 08bd4c
       cellspacing="0" cellpadding="0">
Packit 08bd4c
Packit 08bd4c
Packit 08bd4c
Packit 08bd4c
Packit 08bd4c
Packit 08bd4c

char name[100];

Packit 08bd4c
Packit 08bd4c
Packit 08bd4c
Packit 08bd4c
Packit 08bd4c
        char mode[8];
        char uid[8];
        char gid[8];
        char size[12];
        char mtime[12];
        char checksum[8];
        char typeflag[1];
        char linkname[100];
        char magic[6];
        char version[2];
        char uname[32];
        char gname[32];
        char devmajor[8];
        char devminor[8];
        char atime[12];
        char ctime[12];
        char offset[12];
        char longnames[4];
        char unused[1];
        struct {
                char offset[12];
                char numbytes[12];
        } sparse[4];
        char isextended[1];
        char realsize[12];
        char pad[17];
};
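
The field sizes above should add up to exactly one 512-byte record. The following is a minimal sketch that lets the compiler confirm this. It assumes the structure begins with a 100-byte name field (as in the other tar header layouts), and the struct name is chosen here for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative copy of the GNU header layout; name[100] is assumed first. */
struct header_gnu_tar {
    char name[100];
    char mode[8];
    char uid[8];
    char gid[8];
    char size[12];
    char mtime[12];
    char checksum[8];
    char typeflag[1];
    char linkname[100];
    char magic[6];
    char version[2];
    char uname[32];
    char gname[32];
    char devmajor[8];
    char devminor[8];
    char atime[12];
    char ctime[12];
    char offset[12];
    char longnames[4];
    char unused[1];
    struct {
        char offset[12];
        char numbytes[12];
    } sparse[4];
    char isextended[1];
    char realsize[12];
    char pad[17];
};

/* Every member is a char array, so no padding is inserted between fields. */
static_assert(sizeof(struct header_gnu_tar) == 512,
    "GNU tar header must fill one 512-byte record");
static_assert(offsetof(struct header_gnu_tar, typeflag) == 156,
    "typeflag is expected at byte 156");
static_assert(offsetof(struct header_gnu_tar, offset) == 369,
    "offset is expected at bytes 369-380");
```

If either assertion fails to compile, the local copy of the layout has drifted from the one documented here.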

typeflag

GNU tar uses the following special entry types, in addition
to those defined by POSIX:

7       GNU tar treats type "7" records identically to type "0"
        records, except on one obscure RTOS where they are used
        to indicate the pre-allocation of a contiguous file on
        disk.

D       This indicates a directory entry. Unlike the
        POSIX-standard "5" typeflag, the header is followed by
        data records listing the names of files in this
        directory. Each name is preceded by an ASCII "Y" if the
        file is stored in this archive or "N" if the file is not
        stored in this archive. Each name is terminated with a
        null, and an extra null marks the end of the name list.
        The purpose of this entry is to support incremental
        backups; a program restoring from such an archive may
        wish to delete files on disk that did not exist in the
        directory when the archive was made.

        Note that the "D" typeflag specifically violates POSIX,
        which requires that unrecognized typeflags be restored as
        normal files. In this case, restoring the "D" entry as a
        file could interfere with subsequent creation of the
        like-named directory.
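
The name list carried by a "D" entry can be walked with a few lines of C. This is only a sketch of the layout described above (flag byte, null-terminated name, extra null at the end); the function name and callback type are hypothetical, and the flag byte is passed through without checking that it is "Y" or "N".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: flag is the byte preceding the name. */
typedef void (*dumpdir_cb)(char flag, const char *name, void *ctx);

/*
 * Walk the body of a "D" entry. Returns the number of names seen,
 * or -1 if a name or the final terminating null is missing.
 */
static int
dumpdir_walk(const char *data, size_t len, dumpdir_cb cb, void *ctx)
{
    size_t pos = 0;
    int count = 0;

    assert(data != NULL);
    while (pos < len && data[pos] != '\0') {
        char flag = data[pos++];
        const char *name = data + pos;
        const char *end = memchr(name, '\0', len - pos);

        if (end == NULL)
            return -1;          /* name runs past the buffer */
        if (cb != NULL)
            cb(flag, name, ctx);
        pos += (size_t)(end - name) + 1;
        count++;
    }
    /* The loop must have stopped on the extra null, not on the buffer end. */
    return (pos < len) ? count : -1;
}
```

A restore program could use the callback to collect the names and then compare them with the directory on disk.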

K       The data for this entry is a long linkname for the
        following regular entry.

L       The data for this entry is a long pathname for the
        following regular entry.

M       This is a continuation of the last file on the previous
        volume. GNU multi-volume archives guarantee that each
        volume begins with a valid entry header. To ensure this,
        a file may be split, with part stored at the end of one
        volume, and part stored at the beginning of the next
        volume. The "M" typeflag indicates that this entry
        continues an existing file. Such entries can only occur
        as the first or second entry in an archive (the latter
        only if the first entry is a volume label). The size
        field specifies the size of this entry. The offset field
        at bytes 369-380 specifies the offset where this file
        fragment begins. The realsize field specifies the total
        size of the file (which must equal size plus offset).
        When extracting, GNU tar checks that the header file name
        is the one it is expecting, that the header offset is in
        the correct sequence, and that the sum of offset and size
        is equal to realsize.
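
The size check described for "M" entries can be sketched as follows. The helper names are hypothetical, and the sketch assumes all three fields hold plain octal digits; values that GNU tar stores in a binary form are not decoded here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Parse an octal field: skip leading spaces, stop at the first non-octal byte. */
static uint64_t
tar_octal(const char *field, size_t len)
{
    uint64_t v = 0;
    size_t i = 0;

    assert(field != NULL);
    while (i < len && field[i] == ' ')
        i++;
    for (; i < len && field[i] >= '0' && field[i] <= '7'; i++)
        v = (v << 3) | (uint64_t)(field[i] - '0');
    return v;
}

/* Returns nonzero when offset + size equals realsize for an "M" entry. */
static int
gnu_multivol_consistent(const char size[12], const char offset[12],
    const char realsize[12])
{
    return tar_octal(offset, 12) + tar_octal(size, 12) ==
        tar_octal(realsize, 12);
}
```

A reader that finds the check failing would typically report the volume as out of sequence rather than extract the fragment.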

N       Type "N" records are no longer generated by GNU tar.
        They contained a list of files to be renamed or
        symlinked after extraction; this was originally used to
        support long names. The contents of this record are a
        text description of the operations to be done, in the
        form "Rename %s to %s\n" or "Symlink %s to %s\n"; in
        either case, both filenames are escaped using K&R C
        syntax. Due to security concerns, "N" records are now
        generally ignored when reading archives.

S       This is a "sparse" regular file. Sparse files are stored
        as a series of fragments. The header contains a list of
        fragment offset/length pairs. If more than four such
        entries are required, the header is extended as
        necessary with "extra" header extensions (an older
        format that is no longer used), or "sparse" extensions.

V       The name field should be interpreted as a tape/volume
        header name. This entry should generally be ignored on
        extraction.

magic

The magic field holds the five characters "ustar" followed
by a space. Note that POSIX ustar archives have a trailing
null instead of the space.

version

The version field holds a space character followed by a
null. Note that POSIX ustar archives use two copies of the
ASCII digit "0".
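
The magic and version conventions above are enough to tell a GNU header from a POSIX ustar header. A minimal sketch follows; the enum and function names are hypothetical, and the byte positions 257 and 263 are derived from the field sizes in the header structure, assuming it starts with a 100-byte name field.

```c
#include <assert.h>
#include <string.h>

enum tar_flavor { TAR_OTHER, TAR_POSIX_USTAR, TAR_GNU };

/* hdr points at one 512-byte header record. */
static enum tar_flavor
tar_detect_flavor(const unsigned char *hdr)
{
    assert(hdr != NULL);
    /* GNU: "ustar" + space, then space + null. */
    if (memcmp(hdr + 257, "ustar ", 6) == 0 &&
        memcmp(hdr + 263, " \0", 2) == 0)
        return TAR_GNU;
    /* POSIX ustar: "ustar" + null, then "00". */
    if (memcmp(hdr + 257, "ustar\0", 6) == 0 &&
        memcmp(hdr + 263, "00", 2) == 0)
        return TAR_POSIX_USTAR;
    return TAR_OTHER;
}
```

Comparing both fields together avoids misreading a header that has a valid magic but an unexpected version.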

atime, ctime

The time the file was last accessed and the time of last
change of file information, stored in octal as with mtime.

longnames

This field is apparently no longer used.

Sparse offset / numbytes

Each such structure specifies a single fragment of a sparse
file. The two fields store values as octal numbers. The
fragments are each padded to a multiple of 512 bytes in the
archive. On extraction, the list of fragments is collected
from the header (including any extension headers), and the
data is then read and written to the file at appropriate
offsets.
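
Because each fragment is padded to a multiple of 512 bytes, the space the fragments occupy in the archive can differ from the sum of their lengths. The sketch below illustrates that arithmetic; the decoded fragment structure and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of one offset/numbytes pair. */
struct sparse_frag {
    uint64_t offset;    /* where the fragment belongs in the file */
    uint64_t numbytes;  /* length of the fragment */
};

/* Bytes the fragments occupy in the archive, each rounded up to 512. */
static uint64_t
sparse_archived_bytes(const struct sparse_frag *frags, size_t n)
{
    uint64_t total = 0;

    assert(frags != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += (frags[i].numbytes + 511) & ~(uint64_t)511;
    return total;
}
```

An extractor would read that many bytes in total, writing only the first numbytes of each padded fragment at its offset.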

isextended

If this is set to non-zero, the header will be followed by
additional "sparse header" records. Each such record
contains information about as many as 21 additional sparse
blocks as shown here:

struct gnu_sparse_header {
        struct {
                char offset[12];
                char numbytes[12];
        } sparse[21];
        char isextended[1];
        char padding[7];
};
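
With four fragments in the main header and up to 21 in each extension record, the number of extension records follows from the fragment count. The sketch below shows that calculation and checks that the extension structure fills one record; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct gnu_sparse_header {
    struct {
        char offset[12];
        char numbytes[12];
    } sparse[21];
    char isextended[1];
    char padding[7];
};

/* 21 * 24 + 1 + 7 bytes: the record is exactly 512 bytes long. */
static_assert(sizeof(struct gnu_sparse_header) == 512,
    "sparse extension must fill one 512-byte record");

/* Extension records needed for nfrags fragments (4 fit in the main header). */
static size_t
gnu_sparse_ext_records(size_t nfrags)
{
    const size_t in_header = 4, per_ext = 21;

    if (nfrags <= in_header)
        return 0;
    return (nfrags - in_header + per_ext - 1) / per_ext;
}
```

Every record except the last would carry a non-zero isextended byte to signal that another one follows.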

realsize

A binary representation of the file's complete size, with a
much larger range than the POSIX file size. In particular,
with M type files, the current entry is only a portion of
the file. In that case, the POSIX size field will indicate
the size of this entry; the realsize field will indicate the
total size of the file.

GNU tar pax archives

GNU tar 1.14 (XXX check this XXX) and later will write pax
interchange format archives when you specify the --posix
flag. This format follows the pax interchange format
closely, using some SCHILY tags and introducing new keywords
to store sparse file information. There have been three
iterations of the sparse file support, referred to as "0.0",
"0.1", and "1.0".

GNU.sparse.numblocks, GNU.sparse.offset, GNU.sparse.numbytes,
GNU.sparse.size

The "0.0" format used an initial GNU.sparse.numblocks
attribute to indicate the number of blocks in the file, a
pair of GNU.sparse.offset and GNU.sparse.numbytes to
indicate the offset and size of each block, and a single
GNU.sparse.size to indicate the full size of the file. This
is not the same as the size in the tar header because the
latter value does not include the size of any holes. This
format required that the order of attributes be preserved
and relied on readers accepting multiple appearances of the
same attribute names, which is not officially permitted by
the standards.

GNU.sparse.map

The "0.1" format used a single attribute that stored a
comma-separated list of decimal numbers. Each pair of
numbers indicated the offset and size, respectively, of a
block of data. This does not work well if the archive is
extracted by an archiver that does not recognize this
extension, since many pax implementations simply discard
unrecognized attributes.
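
Decoding a GNU.sparse.map value amounts to reading decimal numbers two at a time. The following is a sketch under the description above; the structure and function names are hypothetical, and the parser is stricter than a real reader might need to be (it rejects an odd number of values).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct sparse_frag {
    uint64_t offset;
    uint64_t numbytes;
};

/* Parse "offset,size,offset,size,..."; returns the pair count or -1. */
static int
parse_sparse_map(const char *map, struct sparse_frag *out, size_t max)
{
    const char *p = map;
    size_t n = 0;

    assert(map != NULL && out != NULL);
    while (*p != '\0') {
        uint64_t v[2];

        for (int k = 0; k < 2; k++) {
            char *end;

            if (*p < '0' || *p > '9')
                return -1;      /* each value must start with a digit */
            errno = 0;
            v[k] = strtoull(p, &end, 10);
            if (errno != 0)
                return -1;      /* value out of range */
            p = end;
            if (*p == ',')
                p++;
            else if (*p != '\0' || k == 0)
                return -1;      /* stray byte, or offset without size */
        }
        if (n == max)
            return -1;          /* caller's array is too small */
        out[n].offset = v[0];
        out[n].numbytes = v[1];
        n++;
    }
    return (int)n;
}
```

For example, the value "0,512,4096,100" would describe two blocks: 512 bytes at offset 0 and 100 bytes at offset 4096.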

GNU.sparse.major, GNU.sparse.minor, GNU.sparse.name,
GNU.sparse.realsize

The "1.0" format stores the sparse block map in one or more
512-byte blocks prepended to the file data in the entry
body. The pax attributes indicate the existence of this map
(via the GNU.sparse.major and GNU.sparse.minor fields) and
the full size of the file. The GNU.sparse.name holds the
true name of the file. To avoid confusion, the name stored
in the regular tar header is a modified name so that
extraction errors will be apparent to users.

Solaris Tar

XXX More Details Needed XXX

Solaris tar (beginning with SunOS XXX 5.7 ?? XXX) supports
an "extended" format that is fundamentally similar to pax
interchange format, with the following differences:

•       Extended attributes are stored in an entry whose type
        is X, not x, as used by pax interchange format. The
        format of this entry appears to be the same as that
        described above for the x entry.

•       An additional A header is used to store an ACL for the
        following regular entry. The body of this entry
        contains a seven-digit octal number followed by a zero
        byte, followed by the textual ACL description. The
        octal value is the number of ACL entries plus a
        constant that indicates the ACL type: 01000000 for
        POSIX.1e ACLs and 03000000 for NFSv4 ACLs.
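
The numeric prefix of a Solaris A entry can be split into its type constant and entry count as sketched below. The function name is hypothetical, and the sketch assumes the entry count is always smaller than 01000000, so that the two parts can be separated with bit masks.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Decode the seven octal digits and zero byte that start the body of a
 * Solaris "A" entry. On success, stores the type constant (for example
 * 01000000 or 03000000) and the entry count, and returns 0.
 */
static int
solaris_acl_prefix(const char *body, size_t len,
    unsigned long *type, unsigned long *count)
{
    unsigned long v = 0;

    assert(body != NULL && type != NULL && count != NULL);
    if (len < 8 || body[7] != '\0')
        return -1;              /* prefix is not terminated as expected */
    for (size_t i = 0; i < 7; i++) {
        if (body[i] < '0' || body[i] > '7')
            return -1;          /* not an octal digit */
        v = (v << 3) | (unsigned long)(body[i] - '0');
    }
    *type = v & 07000000UL;     /* assumed: top octal digit selects the type */
    *count = v & 0777777UL;     /* assumed: count fits in the low six digits */
    return 0;
}
```

The textual ACL description begins at byte 8 of the body, immediately after the zero byte.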

AIX Tar

XXX More details needed XXX

AIX Tar uses a ustar-formatted header with the type A for
storing coded ACL information. Unlike the Solaris format,
AIX tar writes this header after the regular file body to
which it applies. The pathname in this header is either
NFS4 or AIXC to indicate the type of ACL stored. The actual
ACL is stored in platform-specific binary format.

Mac OS X Tar

The tar distributed with Apple's Mac OS X stores most
regular files as two separate files in the tar archive. The
two files have the same name except that the first one has
"._" prepended to the last path element. This special file
stores an AppleDouble-encoded binary blob with additional
metadata about the second file, including ACL, extended
attributes, and resources. To recreate the original file on
disk, each separate file can be extracted and the Mac OS X
copyfile() function can be used to unpack the separate
metadata file and apply it to the regular file. Conversely,
the same function provides a "pack" option to encode the
extended metadata from a file into a separate file whose
contents can then be put into a tar archive.
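
The naming rule for the metadata file can be sketched as a small helper that inserts "._" before the last path element. The function name is hypothetical, and the sketch assumes the path does not end in a slash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the name of the AppleDouble companion entry for path.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int
appledouble_name(const char *path, char *out, size_t outlen)
{
    const char *slash;
    size_t dirlen;
    int n;

    assert(path != NULL && out != NULL);
    slash = strrchr(path, '/');
    /* Keep everything up to and including the last slash, if any. */
    dirlen = (slash != NULL) ? (size_t)(slash - path) + 1 : 0;
    n = snprintf(out, outlen, "%.*s._%s", (int)dirlen, path, path + dirlen);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

A reader that sees an entry named this way immediately before the matching regular entry can treat the pair as one file plus its metadata.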

Note that the Apple extended attributes interact badly with
long filenames. Since each file is stored with the full
name, a separate set of extensions needs to be included in
the archive for each one, doubling the overhead required for
files with long names.

Summary of tar type codes

The following list is a condensed summary of the type codes
used in tar header records generated by different tar
implementations. More details about specific implementations
can be found above:

NUL     Early tar programs stored a zero byte for regular files.
0       POSIX standard type code for a regular file.
1       POSIX standard type code for a hard link description.
2       POSIX standard type code for a symbolic link description.
3       POSIX standard type code for a character device node.
4       POSIX standard type code for a block device node.
5       POSIX standard type code for a directory.
6       POSIX standard type code for a FIFO.
7       POSIX reserved.
7       GNU tar used for pre-allocated files on some systems.
A       Solaris tar ACL description stored prior to a regular
        file header.
A       AIX tar ACL description stored after the file body.
D       GNU tar directory dump.
K       GNU tar long linkname for the following header.
L       GNU tar long pathname for the following header.
M       GNU tar multivolume marker, indicating the file is a
        continuation of a file from the previous volume.
N       GNU tar long filename support. Deprecated.
S       GNU tar sparse regular file.
V       GNU tar tape/volume header name.
X       Solaris tar general-purpose extension header.
g       POSIX pax interchange format global extensions.
x       POSIX pax interchange format per-file extensions.
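
A reader can use the summary above to decide which family of rules applies to an entry. The sketch below groups the type codes by origin; the enum and function names are hypothetical, byte 156 is assumed to hold the typeflag (as derived from the field sizes), and the ambiguous codes 7 and A are resolved in the simplest way.

```c
#include <assert.h>
#include <stddef.h>

enum tar_type_origin {
    TYPE_UNKNOWN,
    TYPE_HISTORIC,          /* NUL, written by early tar programs */
    TYPE_POSIX,             /* 0 through 7 */
    TYPE_PAX,               /* g and x */
    TYPE_GNU,               /* D K L M N S V */
    TYPE_SOLARIS,           /* X */
    TYPE_SOLARIS_OR_AIX     /* A: both use it, with different placement */
};

/* hdr points at one 512-byte header record. */
static enum tar_type_origin
tar_classify_typeflag(const unsigned char *hdr)
{
    assert(hdr != NULL);
    switch (hdr[156]) {
    case '\0':
        return TYPE_HISTORIC;
    case '0': case '1': case '2': case '3':
    case '4': case '5': case '6':
    case '7':               /* POSIX reserved; GNU tar also used it */
        return TYPE_POSIX;
    case 'g': case 'x':
        return TYPE_PAX;
    case 'D': case 'K': case 'L': case 'M':
    case 'N': case 'S': case 'V':
        return TYPE_GNU;
    case 'X':
        return TYPE_SOLARIS;
    case 'A':
        return TYPE_SOLARIS_OR_AIX;
    default:
        return TYPE_UNKNOWN;
    }
}
```

Telling a Solaris A entry from an AIX one needs more context than the type code, such as whether the entry precedes or follows the file it describes.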

SEE ALSO

ar(1), pax(1), tar(1)

STANDARDS

The tar utility is no longer a part of POSIX or the Single
UNIX Specification. It last appeared in Version 2 of the
Single UNIX Specification ("SUSv2"). It has been supplanted
in subsequent standards by pax(1). The ustar format is
currently part of the specification for the pax(1) utility.
The pax interchange file format is new with IEEE Std
1003.1-2001 ("POSIX.1").

HISTORY

A tar command appeared in Seventh Edition Unix, which was
released in January 1979. It replaced the tp program from
Fourth Edition Unix, which in turn replaced the tap program
from First Edition Unix. John Gilmore's pdtar public-domain
implementation (circa 1987) was highly influential and
formed the basis of GNU tar (circa 1988). Joerg Schilling's
star archiver is another open-source (CDDL) archiver
(originally developed circa 1985) which features complete
support for pax interchange format.

This documentation was written as part of the libarchive and
bsdtar project by Tim Kientzle <kientzle@FreeBSD.org>.

BSD December 27, 2016 BSD

Packit 08bd4c

Packit 08bd4c
</body>
Packit 08bd4c
</html>