# Welcome to libarchive!

The libarchive project develops a portable, efficient C library that
can read and write streaming archives in a variety of formats.  It
also includes implementations of the common `tar`, `cpio`, and `zcat`
command-line tools that use the libarchive library.

## Questions?  Issues?

* http://www.libarchive.org is the home for ongoing
  libarchive development, including documentation,
  and links to the libarchive mailing lists.
* To report an issue, use the issue tracker at
  https://github.com/libarchive/libarchive/issues
* To submit an enhancement to libarchive, please
  submit a pull request via GitHub: https://github.com/libarchive/libarchive/pulls

## Contents of the Distribution

This distribution bundle includes the following major components:

* **libarchive**: a library for reading and writing streaming archives
* **tar**: the 'bsdtar' program is a full-featured 'tar' implementation built on libarchive
* **cpio**: the 'bsdcpio' program is a different interface to essentially the same functionality
* **cat**: the 'bsdcat' program is a simple replacement tool for zcat, bzcat, xzcat, and such
* **examples**: Some small example programs that you may find useful.
* **examples/minitar**: a compact sample demonstrating use of libarchive.
* **contrib**:  Various items sent to me by third parties; please contact the authors with any questions.

The top-level directory contains the following information files:

* **NEWS** - highlights of recent changes
* **COPYING** - what you can do with this
* **INSTALL** - installation instructions
* **README** - this file
* **CMakeLists.txt** - input for "cmake" build tool, see INSTALL
* **configure** - configuration script, see INSTALL for details.  If your copy of the source lacks a `configure` script, you can try to construct it by running the `build/autogen.sh` script (or use `cmake`).

The following files in the top-level directory are used by the 'configure' script:
* `Makefile.am`, `aclocal.m4`, `configure.ac` - used to build this distribution, only needed by maintainers
* `Makefile.in`, `config.h.in` - templates used by configure script

## Documentation

In addition to the informational articles and documentation
in the online [libarchive Wiki](https://github.com/libarchive/libarchive/wiki),
the distribution also includes a number of manual pages:

 * bsdtar.1 explains the use of the bsdtar program
 * bsdcpio.1 explains the use of the bsdcpio program
 * bsdcat.1 explains the use of the bsdcat program
 * libarchive.3 gives an overview of the library as a whole
 * archive_read.3, archive_write.3, archive_write_disk.3, and
   archive_read_disk.3 provide detailed calling sequences for the read
   and write APIs
 * archive_entry.3 details the "struct archive_entry" utility class
 * archive_internals.3 provides some insight into libarchive's
   internal structure and operation.
 * libarchive-formats.5 documents the file formats supported by the library
 * cpio.5, mtree.5, and tar.5 provide detailed information about these
   popular archive formats, including hard-to-find details about
   modern cpio and tar variants.

The manual pages above are provided in the 'doc' directory in
a number of different formats.

You should also read the copious comments in `archive.h` and the
source code for the sample programs for more details.  Please let us
know about any errors or omissions you find.

## Supported Formats

Currently, the library automatically detects and reads the following formats:
  * Old V7 tar archives
  * POSIX ustar
  * GNU tar format (including GNU long filenames, long link names, and sparse files)
  * Solaris 9 extended tar format (including ACLs)
  * POSIX pax interchange format
  * POSIX octet-oriented cpio
  * SVR4 ASCII cpio
  * Binary cpio (big-endian or little-endian)
  * ISO9660 CD-ROM images (with optional Rock Ridge or Joliet extensions)
  * ZIP archives (with uncompressed or "deflate" compressed entries, including support for encrypted Zip archives)
  * GNU and BSD 'ar' archives
  * 'mtree' format
  * 7-Zip archives
  * Microsoft CAB format
  * LHA and LZH archives
  * RAR archives (with some limitations due to RAR's proprietary status)
  * XAR archives

The library also detects and handles any of the following before evaluating the archive:
  * uuencoded files
  * files with RPM wrapper
  * gzip compression
  * bzip2 compression
  * compress/LZW compression
  * lzma, lzip, and xz compression
  * lz4 compression
  * lzop compression

The library can create archives in any of the following formats:
  * POSIX ustar
  * POSIX pax interchange format
  * "restricted" pax format, which will create ustar archives except for
    entries that require pax extensions (for long filenames, ACLs, etc.).
  * Old GNU tar format
  * Old V7 tar format
  * POSIX octet-oriented cpio
  * SVR4 "newc" cpio
  * shar archives
  * ZIP archives (with uncompressed or "deflate" compressed entries)
  * GNU and BSD 'ar' archives
  * 'mtree' format
  * ISO9660 format
  * 7-Zip archives
  * XAR archives

When creating archives, the result can be filtered with any of the following:
  * uuencode
  * gzip compression
  * bzip2 compression
  * compress/LZW compression
  * lzma, lzip, and xz compression
  * lz4 compression
  * lzop compression

## Notes about the Library Design

The following notes address many of the most common
questions we are asked about libarchive:

* This is a heavily stream-oriented system.  That means that
  it is optimized to read or write the archive in a single
  pass from beginning to end.  For example, this allows
  libarchive to process archives too large to store on disk
  by processing them on-the-fly as they are read from or
  written to a network or tape drive.  This also makes
  libarchive useful for tools that need to produce
  archives on-the-fly (such as web servers that provide
  archived contents of a user's account).

* In-place modification and random access to the contents
  of an archive are not directly supported.  For some formats,
  this is not an issue: For example, tar.gz archives are not
  designed for random access.  In some other cases, libarchive
  can re-open an archive and scan it from the beginning quickly
  enough to provide the needed abilities even without true
  random access.  Of course, some applications do require true
  random access; those applications should consider alternatives
  to libarchive.

* The library is designed to be extended with new compression and
  archive formats.  The only requirement is that the format be
  readable or writable as a stream and that each archive entry be
  independent.  There are articles on the libarchive Wiki explaining
  how to extend libarchive.

* On read, compression and format are always detected automatically.

* The same API is used for all formats; in particular, it's very
  easy for software using libarchive to transparently handle
  any of libarchive's archiving formats.

* Libarchive's automatic support for decompression can be used
  without archiving by explicitly selecting the "raw" and "empty"
  formats.

* I've attempted to minimize static link pollution.  If you don't
  explicitly invoke a particular feature (such as support for a
  particular compression or format), it won't get pulled in to
  statically-linked programs.  In particular, if you don't explicitly
  enable a particular compression or decompression support, you won't
  need to link against the corresponding compression or decompression
  libraries.  This also reduces the size of statically-linked
  binaries in environments where that matters.

* The library is generally _thread safe_ depending on the platform:
  it does not define any global variables of its own.  However, some
  platforms do not provide fully thread-safe versions of key C library
  functions.  On those platforms, libarchive will use the non-thread-safe
  functions.  Patches to improve this are of great interest to us.

* In particular, libarchive's modules to read or write a directory
  tree do use `chdir()` to optimize the directory traversals.  This
  can cause problems for programs that expect to do disk access from
  multiple threads.  Of course, those modules are completely
  optional and you can use the rest of libarchive without them.

* The library is _not_ thread aware, however.  It does no locking
  or thread management of any kind.  If you create a libarchive
  object and need to access it from multiple threads, you will
  need to provide your own locking.

* On read, the library accepts whatever blocks you hand it.
  Your read callback is free to pass the library a byte at a time
  or mmap the entire archive and give it to the library at once.
  On write, the library always produces correctly-blocked output.

* The object-style approach allows you to have multiple archive streams
  open at once.  bsdtar uses this in its "@archive" extension.

* The archive itself is read/written using callback functions.
  You can read an archive directly from an in-memory buffer or
  write it to a socket, if you wish.  There are some utility
  functions that provide easy-to-use "open file" and similar capabilities.

* The read/write APIs are designed to allow individual entries
  to be read or written to any data source:  You can create
  a block of data in memory and add it to a tar archive without
  first writing a temporary file.  You can also read an entry from
  an archive and write the data directly to a socket.  If you want
  to read/write entries to disk, there are convenience functions to
  make this especially easy.

* Note: The "pax interchange format" is a POSIX standard extended tar
  format that should be used when the older _ustar_ format is not
  appropriate.  It has many advantages over other tar formats
  (including the legacy GNU tar format) and is widely supported by
  current tar implementations.
