<html>
<head>

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-15"/>
<title>Ogg Vorbis Documentation</title>

<style type="text/css">
body {
  margin: 0 18px 0 18px;
  padding-bottom: 30px;
  font-family: Verdana, Arial, Helvetica, sans-serif;
  color: #333333;
  font-size: .8em;
}

a {
  color: #3366cc;
}

img {
  border: 0;
}

#xiphlogo {
  margin: 30px 0 16px 0;
}

#content p {
  line-height: 1.4;
}

h1, h1 a, h2, h2 a, h3, h3 a {
  font-weight: bold;
  color: #ff9900;
  margin: 1.3em 0 8px 0;
}

h1 {
  font-size: 1.3em;
}

h2 {
  font-size: 1.2em;
}

h3 {
  font-size: 1.1em;
}

li {
  line-height: 1.4;
}

#copyright {
  margin-top: 30px;
  line-height: 1.5em;
  text-align: center;
  font-size: .8em;
  color: #888888;
  clear: both;
}
</style>

</head>

<body>

Ogg logical bitstream framing

Ogg bitstreams

The Ogg transport bitstream is designed to provide framing, error
protection and seeking structure for higher-level codec streams that
consist of raw, unencapsulated data packets, such as the Vorbis audio
codec or Theora video codec.

Application example: Vorbis

Vorbis encodes short-time blocks of PCM data into raw packets of
bit-packed data. These raw packets may be used directly by transport
mechanisms that provide their own framing and packet-separation
mechanisms (such as UDP datagrams). For stream-based storage (such as
files) and transport (such as TCP streams or pipes), Vorbis uses the
Ogg bitstream format to provide framing/sync, sync recapture
after error, landmarks during seeking, and enough information to
properly separate data back into packets at the original packet
boundaries without relying on decoding to find packet boundaries.

Design constraints for Ogg bitstreams

  1. True streaming; we must not need to seek to build a 100%
     complete bitstream.
  2. Use no more than approximately 1-2% of bitstream bandwidth for
     packet boundary marking, high-level framing, sync and seeking.
  3. Specification of absolute position within the original sample
     stream.
  4. Simple mechanism to ease limited editing, such as a simplified
     concatenation mechanism.
  5. Detection of corruption, recapture after error and direct, random
     access to data at arbitrary positions in the bitstream.

Logical and Physical Bitstreams

A logical Ogg bitstream is a contiguous stream of
sequential pages belonging only to the logical bitstream. A
physical Ogg bitstream is constructed from one or more
logical Ogg bitstreams (the simplest physical bitstream
is simply a single logical bitstream). We describe below the exact
formatting of an Ogg logical bitstream. Combining logical
bitstreams into more complex physical bitstreams is described in the
Ogg bitstream overview. The exact
mapping of raw Vorbis packets into a valid Ogg Vorbis physical
bitstream is described in the Vorbis I Specification.

Bitstream structure

An Ogg stream is structured by dividing incoming packets into
segments of up to 255 bytes and then wrapping a group of contiguous
packet segments into a variable length page preceded by a page
header. Both the header size and page size are variable; the page
header contains sizing information and checksum data to determine
header/page size and data integrity.

The bitstream is captured (or recaptured) by looking for the beginning
of a page, specifically the capture pattern. Once the capture pattern
is found, the decoder verifies page sync and integrity by computing
and comparing the checksum. At that point, the decoder can extract the
packets themselves.

Packet segmentation

Packets are logically divided into multiple segments before encoding
into a page. Note that the segmentation and fragmentation process is a
logical one; it's used to compute page header values and the original
page data need not be disturbed, even when a packet spans page
boundaries.

The raw packet is logically divided into [n] 255-byte segments and a
last fractional segment of < 255 bytes. A packet may well
consist only of the trailing fractional segment, and a fractional
segment may be zero length. The segment sizes, called "lacing values",
are then saved and placed into the header segment table.

An example should make the basic concept clear:

<tt>
raw packet:
  ___________________________________________
 |______________packet data__________________| 753 bytes

lacing values for page header segment table: 255,255,243
</tt>

We simply add the lacing values for the total size; the last lacing
value for a packet is always the value that is less than 255. Note
that this encoding both avoids imposing a maximum packet size and
imposes minimum overhead on small packets. By contrast, simply using
two bytes at the head of every packet would cap the packet size at 32k
and penalize small packets (<255 bytes, the typical case) with twice
the segmentation overhead. Using the lacing
values as suggested, small packets see the minimum possible
byte-aligned overhead (1 byte) and large packets, over 512 bytes or
so, see a fairly constant ~0.5% overhead on encoding space.

Note that a lacing value of 255 implies that a second lacing value
follows in the packet, and a value of < 255 marks the end of the
packet after that many additional bytes. A packet of 255 bytes (or a
multiple of 255 bytes) is terminated by a lacing value of 0:

<tt>
raw packet:
  _______________________________
 |________packet data____________|          255 bytes

lacing values: 255, 0
</tt>

Note also that a 'nil' (zero length) packet is not an error; it
consists of nothing more than a lacing value of zero in the header.
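
The three cases above (a fractional last segment, an exact multiple of 255, and a nil packet) can be sketched in a few lines of Python. The function name `lacing_values` is hypothetical and chosen only for this illustration; it is not part of any Ogg library.

```python
def lacing_values(packet_size):
    """Return the lacing values for one packet of packet_size bytes."""
    if packet_size < 0:
        raise ValueError("packet size must be non-negative")
    full_segments, remainder = divmod(packet_size, 255)
    # n full 255-byte segments, then one terminating value below 255.
    # The terminator is 0 for an exact multiple of 255 and for a nil packet.
    return [255] * full_segments + [remainder]


print(lacing_values(753))  # [255, 255, 243]
print(lacing_values(255))  # [255, 0]
print(lacing_values(0))    # [0]
```

Summing the returned list always gives back the packet size, which is how a decoder recovers it.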

Packets spanning pages

Packets are not restricted to beginning and ending within a page,
although individual segments are, by definition, required to do so.
Packets are not restricted to a maximum size, although excessively
large packets in the data stream are discouraged; the Ogg
bitstream specification strongly recommends a nominal page size of
approximately 4-8kB (large packets are foreseen as being useful for
initialization data at the beginning of a logical bitstream).

After segmenting a packet, the encoder may decide not to place all the
resulting segments into the current page. In that case, the encoder places
the lacing values of the segments it wishes to belong to the current
page into the current segment table, then finishes the page. The next
page begins with the first value in its segment table belonging to
the next packet segment, thus continuing the packet. Data in the
packet body must also correspond properly to the lacing values in the
spanned pages: the segment data corresponding to the lacing values of
the first page belongs in that page, and the packet segments listed in
the segment table of the following page must begin the page body of
that following page.
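
As a rough illustration of the splitting step, the sketch below divides one packet's lacing values between the current page and however many later pages are needed. The helper name `split_across_pages` and the `free_slots` parameter are assumptions made for this example; here the only reason to end a page is a full segment table, whereas a real encoder may end a page earlier for other reasons.

```python
MAX_SEGMENTS = 255  # a segment table holds at most 255 lacing values


def split_across_pages(lacing, free_slots):
    """Split one packet's lacing values across pages.

    lacing     -- the packet's lacing values, in order
    free_slots -- unused segment-table entries left in the current page
    Returns one list of lacing values per page the packet touches.
    """
    if not 0 <= free_slots <= MAX_SEGMENTS:
        raise ValueError("free_slots must be between 0 and 255")
    pages = [list(lacing[:free_slots])]
    rest = list(lacing[free_slots:])
    while rest:
        # Each following page starts with the next segment of the same packet.
        pages.append(rest[:MAX_SEGMENTS])
        rest = rest[MAX_SEGMENTS:]
    return pages


# A 753-byte packet with only two table entries left in the current page:
print(split_across_pages([255, 255, 243], 2))  # [[255, 255], [243]]
```

In this example the first 510 bytes of packet data go into the body of the current page and the remaining 243 bytes begin the body of the next one.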

The last mechanic to spanning a page boundary is to set the header
flag in the new page to indicate that the first lacing value in the
segment table continues rather than begins a packet; a header flag of
0x01 is set to indicate a continued packet. Although mandatory, the flag
is not actually algorithmically necessary; one could inspect the
preceding segment table to determine if the packet is new or
continued. Adding the information to the header flag allows a
simpler design (with no overhead) that need only inspect the current
page header after frame capture. This also allows faster error
recovery in the event that the packet originates in a corrupt
preceding page, implying that the previous page's segment table
cannot be trusted.

Note that a packet can span an arbitrary number of pages; the above
spanning process is repeated for each spanned page boundary. Also, a
'zero termination' on a packet size that is an exact multiple of 255
must appear even if the lacing value appears in the next page as a
zero-length continuation of the current packet. The header flag
should be set to 0x01 to indicate that the packet spanned, even though
the span is a nil case as far as data is concerned.
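
The decoder side of spanning can be sketched as follows: walk the segment tables page by page, carry a partial packet across page boundaries, and check the 0x01 flag against what the previous table implied. The function `packet_sizes` and its input shape (a list of flag/table pairs) are hypothetical. Raising an exception on a mismatch is a simplification for the example; a real decoder would more likely discard the partial packet and carry on.

```python
CONTINUED = 0x01  # header flag: first lacing value continues a packet


def packet_sizes(pages):
    """Recover packet sizes from (header_type_flag, segment_table) pairs."""
    sizes = []
    pending = 0        # bytes collected so far for an unfinished packet
    in_packet = False  # True while the last lacing value seen was 255
    for flags, table in pages:
        if bool(flags & CONTINUED) != in_packet:
            raise ValueError("continuation flag disagrees with previous page")
        for lacing in table:
            pending += lacing
            in_packet = True
            if lacing < 255:
                # A value below 255 (including 0) terminates the packet.
                sizes.append(pending)
                pending = 0
                in_packet = False
    return sizes


# A 753-byte packet spanning two pages, followed by a 100-byte packet:
print(packet_sizes([(0x00, [255, 255]), (0x01, [243, 100])]))  # [753, 100]
# A 255-byte packet whose zero terminator lands on the next page:
print(packet_sizes([(0x00, [255]), (0x01, [0])]))  # [255]
```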

The encoding looks odd, but is properly optimized for speed and the
expected case of the majority of packets being between 50 and 200
bytes (note that it is designed such that packets of wildly different
sizes can be handled within the model; placing packet size
restrictions on the encoder would have only slightly simplified the
design of page generation while increasing overall encoder complexity).

The main point behind tracking individual packets (and packet
segments) is to allow more flexible encoding tricks that require
explicit knowledge of packet size. An example is simple bandwidth
limiting, implemented by simply truncating packets in the nominal case
if the packet is arranged so that the least sensitive portion of the
data comes last.

Page header

The headering mechanism is designed to avoid copying and re-assembly
of the packet data (i.e., making the packet segmentation process a
logical one); the header can be generated directly from incoming
packet data. The encoder buffers packet data until it finishes a
complete page, at which point it writes the header followed by the
buffered packet segments.

capture_pattern

A header begins with a capture pattern that simplifies identifying
pages; once the decoder has found the capture pattern it can do a more
intensive job of verifying that it has in fact found a page boundary
(as opposed to an inadvertent coincidence in the byte stream).

<tt>
 byte value

  0  0x4f 'O'
  1  0x67 'g'
  2  0x67 'g'
  3  0x53 'S'
</tt>
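
A minimal sketch of the first capture step, scanning a buffer for the four-byte pattern, might look like this. The generator name `find_capture_patterns` is hypothetical. Every offset it yields is only a candidate page start, since the same four bytes can occur by coincidence inside packet data; the checksum described later in the header is what confirms a real page.

```python
CAPTURE_PATTERN = bytes([0x4F, 0x67, 0x67, 0x53])  # 'O' 'g' 'g' 'S'


def find_capture_patterns(data):
    """Yield each offset in data where the capture pattern occurs."""
    offset = data.find(CAPTURE_PATTERN)
    while offset != -1:
        yield offset
        # Resume one byte later so no candidate is skipped.
        offset = data.find(CAPTURE_PATTERN, offset + 1)


print(list(find_capture_patterns(b"xxOggSyyOggS")))  # [2, 8]
```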

stream_structure_version

The capture pattern is followed by the stream structure revision:

<tt>
 byte value

  4  0x00
</tt>

header_type_flag

The header type flag identifies this page's context in the bitstream:

<tt>
 byte value

  5  bitflags: 0x01: unset = fresh packet
                       set = continued packet
               0x02: unset = not first page of logical bitstream
                       set = first page of logical bitstream (bos)
               0x04: unset = not last page of logical bitstream
                       set = last page of logical bitstream (eos)
</tt>
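
The three bit flags can be tested independently with bit masks. The sketch below is illustrative only; the constant names and the `describe_header_type` helper are assumptions, not identifiers from any Ogg implementation.

```python
CONTINUED = 0x01  # first lacing value continues a packet from the previous page
BOS = 0x02        # first page of the logical bitstream
EOS = 0x04        # last page of the logical bitstream


def describe_header_type(flag):
    """Decode byte 5 of a page header into its three boolean flags."""
    return {
        "continued": bool(flag & CONTINUED),
        "bos": bool(flag & BOS),
        "eos": bool(flag & EOS),
    }


print(describe_header_type(0x02))
print(describe_header_type(0x05))
```

The flags combine freely, so 0x05 describes a final page that also begins with a continued packet.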

absolute granule position

(This is packed in the same way the rest of Ogg data is packed; LSb
of LSB first. Note that the 'position' data specifies a 'sample'
number: for example, in CD-quality audio a sample is four octets, 16
bits for left and 16 bits for right, while in video it would likely be
the frame number. It is up to the specific codec in use to define the
semantic meaning of the granule position value.) The position specified
is the total samples encoded after including all packets finished on
this page (packets begun on this page but continuing on to the next
page do not count). The rationale here is that the position specified
in the frame header of the last page tells how long the data coded by
the bitstream is. A truncated stream will still return the proper number
of samples that can be decoded fully.

A special value of '-1' (in two's complement) indicates that no packets
finish on this page.

<tt>
 byte value

  6  0xXX LSB
  7  0xXX
  8  0xXX
  9  0xXX
 10  0xXX
 11  0xXX
 12  0xXX
 13  0xXX MSB
</tt>
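
Since the field is eight bytes stored least significant byte first, it can be read as a signed little-endian 64-bit integer. The sketch below uses Python's standard `struct` module; the function name `read_granule_position` is hypothetical, and returning `None` for the special value -1 is a choice made for this example.

```python
import struct


def read_granule_position(header):
    """Read bytes 6-13 of a page header as a signed 64-bit integer.

    Returns None when the field holds -1 (no packet finishes on the page).
    """
    (granule,) = struct.unpack_from("<q", header, 6)
    return None if granule == -1 else granule


# Header prefix: capture pattern, version 0, flags 0, then the position.
prefix = b"OggS" + bytes([0x00, 0x00])
print(read_granule_position(prefix + (44100).to_bytes(8, "little")))
print(read_granule_position(prefix + b"\xff" * 8))
```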

stream serial number

Ogg allows for separate logical bitstreams to be mixed at page
granularity in a physical bitstream. The most common case would be
sequential arrangement, but it is possible to interleave pages for
two separate bitstreams to be decoded concurrently. The serial
number is the means by which physical pages are associated with
a particular logical stream. Each logical stream must have a unique
serial number within a physical stream:

<tt>
 byte value

 14  0xXX LSB
 15  0xXX
 16  0xXX
 17  0xXX MSB
</tt>

page sequence no

Page counter; lets us know if a page is lost (useful where packets
span page boundaries).

<tt>
 byte value

 18  0xXX LSB
 19  0xXX
 20  0xXX
 21  0xXX MSB
</tt>
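
One way the serial number and page counter might be used together is sketched below: read both 32-bit little-endian fields, then compare consecutive counters within one logical stream to estimate how many pages were lost. The function names are hypothetical, and treating the counter as wrapping modulo 2^32 is an assumption of this example rather than something stated above.

```python
import struct


def read_serial_and_sequence(header):
    """Return (serial number, page sequence number) from bytes 14-21."""
    serial, sequence = struct.unpack_from("<II", header, 14)
    return serial, sequence


def missing_pages(previous_sequence, current_sequence):
    """Estimate pages lost between two consecutive pages of one stream.

    Assumes the 32-bit counter wraps around (an assumption of this sketch).
    """
    return (current_sequence - previous_sequence - 1) % 2**32


header = b"OggS" + bytes(10) + struct.pack("<II", 0x12345678, 7)
print(read_serial_and_sequence(header))
print(missing_pages(7, 8))   # 0: pages are consecutive
print(missing_pages(7, 10))  # 2: pages 8 and 9 never arrived
```

The comparison only makes sense between pages that carry the same serial number, since each logical stream counts its own pages.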

page checksum

32-bit CRC value (direct algorithm, initial val and final XOR = 0,
generator polynomial=0x04c11db7). The value is computed over the
entire header (with the CRC field in the header set to zero) and then
continued over the page. The CRC field is then filled with the
computed value.

(A thorough discussion of CRC algorithms can be found in
<a href="http://www.ross.net/crc/download/crc_v3.txt">"A
Painless Guide to CRC Error Detection Algorithms"</a> by Ross
Williams, ross@ross.net.)

<tt>
 byte value

 22  0xXX LSB
 23  0xXX
 24  0xXX
 25  0xXX MSB
</tt>
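
The parameters above (polynomial 0x04c11db7, initial value 0, no final XOR, no bit reflection) translate into a short table-driven routine. This is a sketch for illustration, not the code of any particular implementation; the names `ogg_crc`, `page_checksum` and `verify_page` are hypothetical.

```python
POLY = 0x04C11DB7


def _make_table():
    """Build the 256-entry lookup table for the unreflected CRC."""
    table = []
    for byte in range(256):
        reg = byte << 24
        for _ in range(8):
            if reg & 0x80000000:
                reg = ((reg << 1) ^ POLY) & 0xFFFFFFFF
            else:
                reg = (reg << 1) & 0xFFFFFFFF
        table.append(reg)
    return table


CRC_TABLE = _make_table()


def ogg_crc(data):
    """CRC-32 with initial value 0 and no final XOR, MSB-first."""
    crc = 0
    for byte in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ CRC_TABLE[(crc >> 24) ^ byte]
    return crc


def page_checksum(page):
    """Checksum of a whole page with the CRC field (bytes 22-25) zeroed."""
    return ogg_crc(page[:22] + bytes(4) + page[26:])


def verify_page(page):
    """Compare the stored checksum (LSB first) with a fresh computation."""
    stored = int.from_bytes(page[22:26], "little")
    return stored == page_checksum(page)


# Build a tiny page: 26 header bytes, one segment of 3 bytes, then the body.
page = bytearray(b"OggS" + bytes(22) + bytes([1, 3]) + b"abc")
page[22:26] = page_checksum(bytes(page)).to_bytes(4, "little")
print(verify_page(bytes(page)))
```

Flipping any byte of the page after the checksum has been written makes `verify_page` return `False`, which is the signal to drop sync and look for the next capture pattern.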

page_segments

The number of segment entries to appear in the segment table. The
maximum number of 255 segments (255 bytes each) sets the maximum
possible physical page size at 65307 bytes or just under 64kB. Thus
we know that a header corrupted so as to destroy sizing/alignment
information will not cause a runaway bitstream: we'll read in the
page according to the corrupted size information, which is guaranteed
to be a reasonable size regardless, notice the checksum mismatch, drop
sync and then look for recapture.

<tt>
 byte value

 26 0x00-0xff (0-255)
</tt>
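
The 65307-byte figure follows from the header layout: 27 fixed header bytes (bytes 0 through 26), one lacing byte per segment, and up to 255 bytes of data per segment. The constant names below are chosen for this illustration only.

```python
FIXED_HEADER_BYTES = 27  # bytes 0-26, up to and including page_segments
MAX_SEGMENTS = 255       # largest value of the one-byte page_segments field
MAX_SEGMENT_BYTES = 255  # largest lacing value

max_header_size = FIXED_HEADER_BYTES + MAX_SEGMENTS  # header plus segment table
max_body_size = MAX_SEGMENTS * MAX_SEGMENT_BYTES     # all segments full
max_page_size = max_header_size + max_body_size

print(max_header_size, max_body_size, max_page_size)  # 282 65025 65307
```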

segment_table (containing packet lacing values)

The lacing values for each packet segment physically appearing in
this page are listed in contiguous order.

<tt>
 byte value

 27 0x00-0xff (0-255)
 [...]
 n  0x00-0xff (0-255, n=page_segments+26)
</tt>

Total page size is calculated directly from the known header size and
lacing values in the segment table. Packet data segments follow
immediately after the header.
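
A minimal sketch of that calculation: the header is 27 bytes plus one byte per segment-table entry, and the body is the sum of the lacing values. The function name `page_sizes` is hypothetical, and the checks it performs are deliberately basic.

```python
def page_sizes(data, offset=0):
    """Return (header_size, body_size) for the page starting at offset."""
    if data[offset:offset + 4] != b"OggS":
        raise ValueError("no capture pattern at this offset")
    if len(data) < offset + 27:
        raise ValueError("truncated page header")
    page_segments = data[offset + 26]
    table = data[offset + 27:offset + 27 + page_segments]
    if len(table) != page_segments:
        raise ValueError("truncated segment table")
    header_size = 27 + page_segments
    body_size = sum(table)  # each lacing value is the size of one segment
    return header_size, body_size


# One 753-byte packet: segment table 255, 255, 243.
page = b"OggS" + bytes(22) + bytes([3, 255, 255, 243]) + bytes(753)
print(page_sizes(page))  # (30, 753)
```

The packet data for the page then occupies `body_size` bytes starting at `offset + header_size`.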

Page headers typically impose a flat 0.25-0.5% space overhead assuming
nominal ~8k page sizes. The segmentation table needed for exact
packet recovery in the streaming layer adds approximately 0.5-1%
nominal assuming expected encoder behavior in the 44.1kHz, 128kbps
stereo encodings.
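
These figures can be checked with rough arithmetic. The sketch below assumes an 8192-byte page and packets of 100 to 200 bytes; both numbers are assumptions picked to match the ranges quoted in this document, not measurements of any encoder, and the helper names are hypothetical.

```python
FIXED_HEADER_BYTES = 27  # page header without the segment table


def fixed_header_overhead(page_size):
    """Fraction of a page taken by the fixed part of the header."""
    return FIXED_HEADER_BYTES / page_size


def lacing_overhead(packet_size):
    """Segment-table bytes needed per byte of packet data."""
    lacing_bytes = packet_size // 255 + 1
    return lacing_bytes / packet_size


print(f"{fixed_header_overhead(8192):.2%}")  # fixed header on an ~8k page
print(f"{lacing_overhead(100):.2%}")         # one lacing byte per 100-byte packet
print(f"{lacing_overhead(200):.2%}")         # one lacing byte per 200-byte packet
```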

  The Xiph Fish Logo is a
  trademark (™) of Xiph.Org.

  These pages © 1994 - 2005 Xiph.Org. All rights reserved.

</body>
</html>