% -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
%!TEX root = Vorbis_I_spec.tex
% $Id$
\section{Embedding Vorbis into an Ogg stream} \label{vorbis:over:ogg}

\subsection{Overview}

This document describes using Ogg logical and physical transport
streams to encapsulate Vorbis compressed audio packet data into file
form.

The \xref{vorbis:spec:intro} provides an overview of the construction
of Vorbis audio packets.

The \href{oggstream.html}{Ogg
bitstream overview} and \href{framing.html}{Ogg logical
bitstream and framing spec} provide detailed descriptions of Ogg
transport streams. This specification document assumes a working
knowledge of the concepts covered in these named background
documents. Please read them first.

\subsubsection{Restrictions}

The Ogg/Vorbis I specification currently dictates that Ogg/Vorbis
streams use Ogg transport streams in degenerate, unmultiplexed
form only. That is:

\begin{itemize}
\item
A meta-headerless Ogg file encapsulates the Vorbis I packets.

\item
The Ogg stream may be chained, i.e., contain multiple, contiguous logical streams (links).

\item
The Ogg stream must be unmultiplexed (only one stream, a Vorbis audio stream, per link).

\end{itemize}

This does not mean that Vorbis cannot currently be multiplexed
with other media types into a multi-stream Ogg file. At the
time this document was written, Ogg was becoming a popular container
for low-bitrate movies consisting of DivX video and Vorbis audio.
However, a `Vorbis I audio file' is taken to imply Vorbis audio
existing alone within a degenerate Ogg stream. A compliant `Vorbis
audio player' is not required to implement Ogg support beyond the
specific support of Vorbis within a degenerate Ogg stream (naturally,
application authors are encouraged to support full multiplexed Ogg
handling).

\subsubsection{MIME type}

The MIME type of Ogg files depends on the context. Specifically, complex
multimedia and applications should use \literal{application/ogg},
while visual media should use \literal{video/ogg}, and audio should use
\literal{audio/ogg}. Vorbis data encapsulated in Ogg may appear
in any of those types. RTP-encapsulated Vorbis should use
\literal{audio/vorbis} + \literal{audio/vorbis-config}.

\subsection{Encapsulation}

Ogg encapsulation of a Vorbis packet stream is straightforward.

\begin{itemize}

\item
The first Vorbis packet (the identification header), which
uniquely identifies a stream as Vorbis audio, is placed alone in the
first page of the logical Ogg stream. This results in a first Ogg
page of exactly 58 bytes at the very beginning of the logical stream.

\item
This first page is marked `beginning of stream' in the page flags.

\item
The second and third Vorbis packets (comment and setup
headers) may span one or more pages beginning on the second page of
the logical stream. However many pages they span, the third header
packet finishes the page on which it ends. The next (first audio) packet
must begin on a fresh page.

\item
The granule position of these first pages containing only headers is zero.

\item
The first audio packet of the logical stream begins a fresh Ogg page.

\item
Packets are placed into Ogg pages in order until the end of stream.

\item
The last page is marked `end of stream' in the page flags.

\item
Vorbis packets may span page boundaries.

\item
The granule position of pages containing Vorbis audio is in units
of PCM audio samples (per channel; a stereo stream's granule position
does not increment at twice the speed of a mono stream).

\item
The granule position of a page represents the end PCM sample
position of the last packet \emph{completed} on that
page. The `last PCM sample' is the last complete sample returned by
decode, not an internal sample awaiting lapping with a
subsequent block. A page that is entirely spanned by a single
packet (that completes on a subsequent page) has no granule
position, and the granule position is set to `-1'.

Note that the last decoded (fully lapped) PCM sample from a packet
is not necessarily the middle sample from that block. If, e.g., the
current Vorbis packet encodes a ``long block'' and the next Vorbis
packet encodes a ``short block'', the last decodable sample from the
current packet will be at position (3*long\_block\_length/4) -
(short\_block\_length/4).

\item
The granule (PCM) position of the first page need not indicate
that the stream started at position zero. Although the granule
position belongs to the last completed packet on the page and a
valid granule position must be positive, by
inference it may indicate that the PCM position of the beginning
of audio is positive or negative.

\begin{itemize}
\item
A positive starting value simply indicates that this stream begins at
some positive time offset, potentially within a larger
program. This is a common case when connecting to the middle
of a broadcast stream.

\item
A negative value indicates that
output samples preceding time zero should be discarded during
decoding; this technique is used to allow sample-granularity
editing of the stream start time of already-encoded Vorbis
streams. The number of samples to be discarded must not exceed
the overlap-add span of the first two audio packets.

\end{itemize}

In both of these cases in which the initial audio PCM starting
offset is nonzero, the second finished audio packet must flush the
page on which it appears and the third packet must begin a fresh page.
This allows the decoder to always be able to perform PCM position
adjustments before needing to return any PCM data from synthesis,
resulting in correct positioning information without any additional
seeking logic.

\begin{note}
Failure to do so should, at worst, cause a
decoder implementation to return incorrect positioning information
for seeking operations at the very beginning of the stream.
\end{note}

\item
A granule position on the final page in a stream that indicates
less audio data than the final packet would normally return is used to
end the stream on other than even frame boundaries. The difference
between the actual available data returned and the declared amount
indicates how many trailing samples to discard from the decoding
process.

\end{itemize}
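
The arithmetic behind several of the rules above can be checked by hand.
The worked examples below are illustrative sketches only: the symbols
$L$, $S$, $G$, $N$, $P$ and $G_{\mathrm{end}}$ are notation introduced
here rather than names used by this specification, and all block sizes
and granule positions are assumed sample values.

The 58-byte size of the first page follows from the fixed sizes
involved: a 27-byte Ogg page header, a segment table holding a single
lacing value, and the 30-byte Vorbis identification header:
\[
27 + 1 + 30 = 58 \ \mathrm{bytes}.
\]

For the position of the last fully lapped sample, write $L$ for
long\_block\_length and $S$ for short\_block\_length, and assume
$L = 2048$ and $S = 256$. A long block followed by a short block gives
\[
\frac{3L}{4} - \frac{S}{4} = 1536 - 64 = 1472,
\]
whereas a long block followed by another long block gives
$3L/4 - L/4 = L/2 = 1024$, which is the middle sample of the block.

For a nonzero starting offset, let $G$ be the granule position of the
first audio page and $N$ the number of PCM samples that decode returns
for the packets completed on that page. The implied PCM position of the
beginning of audio is then $G - N$. Assuming $N = 1024$ and $G = 824$,
\[
G - N = 824 - 1024 = -200,
\]
so the first 200 returned samples precede time zero and are discarded.
With $G = 45124$ instead, $G - N = 44100$, a positive starting offset.

For the end of the stream, let $P$ be the PCM position that decoding
the final packet in full would reach and $G_{\mathrm{end}}$ the granule
position declared on the final page. Assuming $P = 441344$ and
$G_{\mathrm{end}} = 441000$,
\[
P - G_{\mathrm{end}} = 441344 - 441000 = 344
\]
trailing samples are discarded.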