#LyX 1.6.1 created this file. For more info see http://www.lyx.org/
\lyxformat 345
\begin_document
\begin_header
\textclass scrbook
\use_default_options true
\language english
\inputencoding auto
\font_roman default
\font_sans default
\font_typewriter default
\font_default_family default
\font_sc false
\font_osf false
\font_sf_scale 100
\font_tt_scale 100
\graphics default
\paperfontsize 10
\spacing single
\use_hyperref false
\papersize letterpaper
\use_geometry true
\use_amsmath 2
\use_esint 2
\cite_engine basic
\use_bibtopic false
\paperorientation portrait
\leftmargin 2cm
\topmargin 2cm
\rightmargin 2cm
\bottommargin 2cm
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\defskip medskip
\quotes_language english
\papercolumns 1
\papersides 1
\paperpagestyle headings
\tracking_changes false
\output_changes false
\author ""
\author ""
\end_header
\begin_body
\begin_layout Title
The Speex Manual
\begin_inset Newline newline
\end_inset
Version 1.2
\end_layout
\begin_layout Author
Jean-Marc Valin
\end_layout
\begin_layout Standard
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Standard
Copyright
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
copyright
\end_layout
\end_inset
2002-2008 Jean-Marc Valin/Xiph.org Foundation
\end_layout
\begin_layout Standard
Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.1 or any later
version published by the Free Software Foundation; with no Invariant Section,
with no Front-Cover Texts, and with no Back-Cover.
A copy of the license is included in the section entitled "GNU Free Documentati
on License".
\end_layout
\begin_layout Standard
\begin_inset Newpage newpage
\end_inset
\begin_inset CommandInset toc
LatexCommand tableofcontents
\end_inset
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Standard
\begin_inset FloatList table
\end_inset
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Chapter
Introduction to Speex
\end_layout
\begin_layout Standard
The Speex codec (
\family typewriter
http://www.speex.org/
\family default
) exists because there is a need for a speech codec that is open-source
and free from software patent royalties.
These are essential conditions for being usable in any open-source software.
In essence, Speex is to speech what Vorbis is to audio/music.
Unlike many other speech codecs, Speex is not designed for mobile phones
but rather for packet networks and voice over IP (VoIP) applications.
File-based compression is of course also supported.
\end_layout
\begin_layout Standard
The Speex codec is designed to be very flexible and support a wide range
of speech quality and bit-rate.
Support for very good quality speech also means that Speex can encode wideband
speech (16 kHz sampling rate) in addition to narrowband speech (telephone
quality, 8 kHz sampling rate).
\end_layout
\begin_layout Standard
Designing for VoIP instead of mobile phones means that Speex is robust to
lost packets, but not to corrupted ones.
This is based on the assumption that in VoIP, packets either arrive unaltered
or don't arrive at all.
Because Speex is targeted at a wide range of devices, it has modest (adjustable
) complexity and a small memory footprint.
\end_layout
\begin_layout Standard
All the design goals led to the choice of CELP
\begin_inset Index
status collapsed
\begin_layout Plain Layout
CELP
\end_layout
\end_inset
as the encoding technique.
One of the main reasons is that CELP has long proved that it could work
reliably and scale well to both low bit-rates (e.g.
DoD CELP @ 4.8 kbps) and high bit-rates (e.g.
G.728 @ 16 kbps).
\end_layout
\begin_layout Section
Getting help
\begin_inset CommandInset label
LatexCommand label
name "sec:Getting-help"
\end_inset
\end_layout
\begin_layout Standard
As with many open-source projects, there are several ways to get help with Speex.
These include:
\end_layout
\begin_layout Itemize
This manual
\end_layout
\begin_layout Itemize
Other documentation on the Speex website (http://www.speex.org/)
\end_layout
\begin_layout Itemize
Mailing list: Discuss any Speex-related topic on speex-dev@xiph.org (not
just for developers)
\end_layout
\begin_layout Itemize
IRC: The main channel is #speex on irc.freenode.net.
Note that because of time-zone differences, it may take a while to get a
response, so please be patient.
\end_layout
\begin_layout Itemize
Email the author privately at jean-marc.valin@usherbrooke.ca
\series bold
only
\series default
for private/delicate topics you do not wish to discuss publicly.
\end_layout
\begin_layout Standard
Before asking for help (mailing list or IRC),
\series bold
it is important to first read this manual
\series default
(OK, so if you made it here it's already a good sign).
It is generally considered rude to ask on a mailing list about topics that
are clearly detailed in the documentation.
On the other hand, it's perfectly OK (and encouraged) to ask for clarifications
about something covered in the manual.
This manual does not (yet) cover everything about Speex, so everyone is
encouraged to ask questions, send comments, feature requests, or just let
us know how Speex is being used.
\end_layout
\begin_layout Standard
Here are some additional guidelines related to the mailing list.
Before reporting bugs in Speex to the list, it is strongly recommended
(if possible) to first test whether these bugs can be reproduced using
the speexenc and speexdec (see Section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Command-line-encoder/decoder"
\end_inset
) command-line utilities.
Bugs reported based on 3rd party code are both harder to find and far too
often caused by errors that have nothing to do with Speex.
\end_layout
\begin_layout Section
About this document
\end_layout
\begin_layout Standard
This document is organized as follows.
Section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Feature-description"
\end_inset
describes the different Speex features and defines many basic terms that
are used throughout this manual.
Section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Command-line-encoder/decoder"
\end_inset
documents the standard command-line tools provided in the Speex distribution.
Section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Programming-with-Speex"
\end_inset
includes detailed instructions about programming using the libspeex
\begin_inset Index
status collapsed
\begin_layout Plain Layout
libspeex
\end_layout
\end_inset
API.
Section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Formats-and-standards"
\end_inset
has some information related to Speex and standards.
\end_layout
\begin_layout Standard
The last three sections describe the algorithms used in Speex.
These sections require signal processing knowledge, but are not required
for merely using Speex.
They are intended for people who want to understand how Speex really works
and/or want to do research based on Speex.
Section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Introduction-to-CELP"
\end_inset
explains the general idea behind CELP, while sections
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Speex-narrowband-mode"
\end_inset
and
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Speex-wideband-mode"
\end_inset
are specific to Speex.
\end_layout
\begin_layout Standard
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Chapter
Codec description
\begin_inset CommandInset label
LatexCommand label
name "sec:Feature-description"
\end_inset
\end_layout
\begin_layout Standard
This section describes Speex and its features in more detail.
\end_layout
\begin_layout Section
Concepts
\end_layout
\begin_layout Standard
Before introducing all the Speex features, here are some concepts in speech
coding that make the rest of the manual easier to understand.
Although some are general concepts in speech/audio processing, others are
specific to Speex.
\end_layout
\begin_layout Subsection*
Sampling rate
\begin_inset Index
status collapsed
\begin_layout Plain Layout
sampling rate
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The sampling rate, expressed in hertz (Hz), is the number of samples taken
from a signal per second.
For a sampling rate of
\begin_inset Formula $F_{s}$
\end_inset
kHz, the highest frequency that can be represented is equal to
\begin_inset Formula $F_{s}/2$
\end_inset
kHz (
\begin_inset Formula $F_{s}/2$
\end_inset
is known as the Nyquist frequency).
This is a fundamental property in signal processing and is described by
the sampling theorem.
Speex is mainly designed for three different sampling rates: 8 kHz, 16
kHz, and 32 kHz.
These are respectively referred to as narrowband
\begin_inset Index
status collapsed
\begin_layout Plain Layout
narrowband
\end_layout
\end_inset
, wideband
\begin_inset Index
status collapsed
\begin_layout Plain Layout
wideband
\end_layout
\end_inset
and ultra-wideband
\begin_inset Index
status collapsed
\begin_layout Plain Layout
ultra-wideband
\end_layout
\end_inset
.
\end_layout
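\begin_layout Standard
As a worked illustration of the sampling theorem, applying
\begin_inset Formula $F_{s}/2$
\end_inset
to the three sampling rates listed above gives the highest representable
frequency for narrowband, wideband and ultra-wideband respectively:
\begin_inset Formula \[
\frac{8}{2}=4\,\mathrm{kHz},\qquad\frac{16}{2}=8\,\mathrm{kHz},\qquad\frac{32}{2}=16\,\mathrm{kHz}\]
\end_inset
\end_layout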
\begin_layout Subsection*
Bit-rate
\end_layout
\begin_layout Standard
When encoding a speech signal, the bit-rate is defined as the number of
bits per unit of time required to encode the speech.
It is measured in
\emph on
bits per second
\emph default
(bps), or more commonly in
\emph on
kilobits per second
\emph default
.
It is important to make the distinction between
\emph on
kilo
\series bold
bits
\series default
\emph default
\emph on
per second
\emph default
(k
\series bold
b
\series default
ps) and
\emph on
kilo
\series bold
bytes
\series default
\emph default
\emph on
per second
\emph default
(k
\series bold
B
\series default
ps).
\end_layout
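\begin_layout Standard
To make the distinction concrete, here is a small worked conversion.
It assumes that one byte is 8 bits and that the kilo prefix means 1000 in
both units, which is an assumption made for this illustration:
\begin_inset Formula \[
8\,\mathrm{kbps}=8000\,\mathrm{bits/s}=\frac{8000}{8}\,\mathrm{bytes/s}=1\,\mathrm{kBps}\]
\end_inset
\end_layout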
\begin_layout Subsection*
Quality
\begin_inset Index
status collapsed
\begin_layout Plain Layout
quality
\end_layout
\end_inset
(variable)
\end_layout
\begin_layout Standard
Speex is a lossy codec, which means that it achieves compression at the
expense of fidelity of the input speech signal.
Unlike some other speech codecs, it is possible to control the trade-off
made between quality and bit-rate.
The Speex encoding process is controlled most of the time by a quality
parameter that ranges from 0 to 10.
In constant bit-rate
\begin_inset Index
status collapsed
\begin_layout Plain Layout
constant bit-rate
\end_layout
\end_inset
(CBR) operation, the quality parameter is an integer, while for variable
bit-rate (VBR), the parameter is a float.
\end_layout
\begin_layout Subsection*
Complexity
\begin_inset Index
status collapsed
\begin_layout Plain Layout
complexity
\end_layout
\end_inset
(variable)
\end_layout
\begin_layout Standard
With Speex, it is possible to vary the complexity allowed for the encoder.
This is done by controlling how the search is performed with an integer
ranging from 1 to 10, in a way that's similar to the -1 to -9 options of the
\emph on
gzip
\emph default
and
\emph on
bzip2
\emph default
compression utilities.
For normal use, the noise level at complexity 1 is between 1 and 2 dB higher
than at complexity 10, but the CPU requirements for complexity 10 are about
5 times higher than for complexity 1.
In practice, the best trade-off is between complexity 2 and 4, though higher
settings are often useful when encoding non-speech sounds like DTMF
\begin_inset Index
status collapsed
\begin_layout Plain Layout
DTMF
\end_layout
\end_inset
tones.
\end_layout
\begin_layout Subsection*
Variable Bit-Rate
\begin_inset Index
status collapsed
\begin_layout Plain Layout
variable bit-rate
\end_layout
\end_inset
(VBR)
\end_layout
\begin_layout Standard
Variable bit-rate (VBR) allows a codec to change its bit-rate dynamically
to adapt to the
\begin_inset Quotes eld
\end_inset
difficulty
\begin_inset Quotes erd
\end_inset
of the audio being encoded.
In the example of Speex, sounds like vowels and high-energy transients
require a higher bit-rate to achieve good quality, while fricatives (e.g.
s and f sounds) can be coded adequately with fewer bits.
For this reason, VBR can achieve a lower bit-rate for the same quality, or
better quality for a given bit-rate.
Despite its advantages, VBR has two main drawbacks: first, by only specifying
quality, there's no guarantee about the final average bit-rate.
Second, for some real-time applications like voice over IP (VoIP), what
counts is the maximum bit-rate, which must be low enough for the communication
channel.
\end_layout
\begin_layout Subsection*
Average Bit-Rate
\begin_inset Index
status collapsed
\begin_layout Plain Layout
average bit-rate
\end_layout
\end_inset
(ABR)
\end_layout
\begin_layout Standard
Average bit-rate solves one of the problems of VBR, as it dynamically adjusts
VBR quality in order to meet a specific target bit-rate.
Because the quality/bit-rate is adjusted in real-time (open-loop), the
global quality will be slightly lower than that obtained by encoding in
VBR with exactly the right quality setting to meet the target average bit-rate.
\end_layout
\begin_layout Subsection*
Voice Activity Detection
\begin_inset Index
status collapsed
\begin_layout Plain Layout
voice activity detection
\end_layout
\end_inset
(VAD)
\end_layout
\begin_layout Standard
When enabled, voice activity detection detects whether the audio being encoded
is speech or silence/background noise.
VAD is always implicitly activated when encoding in VBR, so the option
is only useful in non-VBR operation.
In this case, Speex detects non-speech periods and encodes them with just
enough bits to reproduce the background noise.
This is called
\begin_inset Quotes eld
\end_inset
comfort noise generation
\begin_inset Quotes erd
\end_inset
(CNG).
\end_layout
\begin_layout Subsection*
Discontinuous Transmission
\begin_inset Index
status collapsed
\begin_layout Plain Layout
discontinuous transmission
\end_layout
\end_inset
(DTX)
\end_layout
\begin_layout Standard
Discontinuous transmission is an addition to VAD/VBR operation that makes it
possible to stop transmitting completely when the background noise is stationary.
In file-based operation, since we cannot just stop writing to the file,
only 5 bits are used for such frames (corresponding to 250 bps).
\end_layout
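\begin_layout Standard
As a consistency check on these two figures, dividing the frame payload by
the quoted bit-rate gives the frame duration they imply, which corresponds
to 50 frames per second:
\begin_inset Formula \[
\frac{5\,\mathrm{bits}}{250\,\mathrm{bits/s}}=0.02\,\mathrm{s}=20\,\mathrm{ms}\]
\end_inset
\end_layout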
\begin_layout Subsection*
Perceptual enhancement
\begin_inset Index
status collapsed
\begin_layout Plain Layout
perceptual enhancement
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Perceptual enhancement is a part of the decoder which, when turned on, attempts
to reduce the perception of the noise/distortion produced by the encoding/decod
ing process.
In most cases, perceptual enhancement brings the sound further from the
original
\emph on
objectively
\emph default
(e.g.
considering only SNR), but in the end it still
\emph on
sounds
\emph default
better (subjective improvement).
\end_layout
\begin_layout Subsection*
Latency and algorithmic delay
\begin_inset Index
status collapsed
\begin_layout Plain Layout
algorithmic delay
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Every speech codec introduces a delay in the transmission.
For Speex, this delay is equal to the frame size, plus some amount of
\begin_inset Quotes eld
\end_inset
look-ahead
\begin_inset Quotes erd
\end_inset
required to process each frame.
In narrowband operation (8 kHz), the delay is 30 ms, while for wideband
(16 kHz), the delay is 34 ms.
These values don't account for the CPU time it takes to encode or decode
the frames.
\end_layout
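\begin_layout Standard
The look-ahead implied by these figures can be estimated by subtracting the
frame size from the total delay.
Assuming 20 ms frames (a value that is not stated in this section, so it
should be treated as an assumption), this gives roughly:
\begin_inset Formula \[
30-20=10\,\mathrm{ms}\;\mathrm{(narrowband)},\qquad34-20=14\,\mathrm{ms}\;\mathrm{(wideband)}\]
\end_inset
\end_layout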
\begin_layout Section
Codec
\end_layout
\begin_layout Standard
The main characteristics of Speex can be summarized as follows:
\end_layout
\begin_layout Itemize
Free software/open-source
\begin_inset Index
status collapsed
\begin_layout Plain Layout
open-source
\end_layout
\end_inset
, patent
\begin_inset Index
status collapsed
\begin_layout Plain Layout
patent
\end_layout
\end_inset
and royalty-free
\end_layout
\begin_layout Itemize
Integration of narrowband
\begin_inset Index
status collapsed
\begin_layout Plain Layout
narrowband
\end_layout
\end_inset
and wideband
\begin_inset Index
status collapsed
\begin_layout Plain Layout
wideband
\end_layout
\end_inset
using an embedded bit-stream
\end_layout
\begin_layout Itemize
Wide range of bit-rates available (from 2.15 kbps to 44 kbps)
\end_layout
\begin_layout Itemize
Dynamic bit-rate switching (AMR) and Variable Bit-Rate
\begin_inset Index
status collapsed
\begin_layout Plain Layout
variable bit-rate
\end_layout
\end_inset
(VBR) operation
\end_layout
\begin_layout Itemize
Voice Activity Detection
\begin_inset Index
status collapsed
\begin_layout Plain Layout
voice activity detection
\end_layout
\end_inset
(VAD, integrated with VBR) and discontinuous transmission (DTX)
\end_layout
\begin_layout Itemize
Variable complexity
\begin_inset Index
status collapsed
\begin_layout Plain Layout
complexity
\end_layout
\end_inset
\end_layout
\begin_layout Itemize
Embedded wideband structure (scalable sampling rate)
\end_layout
\begin_layout Itemize
Ultra-wideband sampling rate at 32 kHz
\end_layout
\begin_layout Itemize
Intensity stereo encoding option
\end_layout
\begin_layout Itemize
Fixed-point implementation
\end_layout
\begin_layout Section
Preprocessor
\end_layout
\begin_layout Standard
This part refers to the preprocessor module introduced in the 1.1.x branch.
The preprocessor is designed to be used on the audio
\emph on
before
\emph default
running the encoder.
The preprocessor provides three main functionalities:
\end_layout
\begin_layout Itemize
noise suppression
\end_layout
\begin_layout Itemize
automatic gain control (AGC)
\end_layout
\begin_layout Itemize
voice activity detection (VAD)
\end_layout
\begin_layout Standard
The denoiser can be used to reduce the amount of background noise present
in the input signal.
This provides higher quality speech whether or not the denoised signal
is encoded with Speex (or at all).
However, when using the denoised signal with the codec, there is an additional
benefit.
Speech codecs in general (Speex included) tend to perform poorly on noisy
input, which tends to amplify the noise.
The denoiser greatly reduces this effect.
\end_layout
\begin_layout Standard
Automatic gain control (AGC) is a feature that deals with the fact that
the recording volume may vary by a large amount between different setups.
The AGC provides a way to adjust a signal to a reference volume.
This is useful for voice over IP because it removes the need for manual
adjustment of the microphone gain.
A secondary advantage is that by setting the microphone gain to a conservative
(low) level, it is easier to avoid clipping.
\end_layout
\begin_layout Standard
The voice activity detector (VAD) provided by the preprocessor is more advanced
than the one directly provided in the codec.
\end_layout
\begin_layout Section
Adaptive Jitter Buffer
\end_layout
\begin_layout Standard
When transmitting voice (or any content for that matter) over UDP or RTP,
packets may be lost, arrive with different delays, or even arrive out of order.
The purpose of a jitter buffer is to reorder packets and buffer them long
enough (but no longer than necessary) so they can be sent to be decoded.
\end_layout
\begin_layout Section
Acoustic Echo Canceller
\end_layout
\begin_layout Standard
In any hands-free communication system (Fig.
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Acoustic-echo-model"
\end_inset
), speech from the remote end is played in the local loudspeaker, propagates
in the room and is captured by the microphone.
If the audio captured from the microphone is sent directly to the remote
end, then the remote user hears an echo of their own voice.
An acoustic echo canceller is designed to remove the acoustic echo before
it is sent to the remote end.
It is important to understand that the echo canceller is meant to improve
the quality on the
\series bold
remote
\series default
end.
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
begin{center}
\end_layout
\end_inset
\begin_inset Graphics
filename echo_path.eps
width 10cm
\end_inset
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
end{center}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
Acoustic echo model
\begin_inset CommandInset label
LatexCommand label
name "fig:Acoustic-echo-model"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Section
Resampler
\end_layout
\begin_layout Standard
In some cases, it may be useful to convert audio from one sampling rate
to another.
There are many reasons for that.
It can be for mixing streams that have different sampling rates, for supporting
sampling rates that the soundcard doesn't support, for transcoding, etc.
That's why there is now a resampler that is part of the Speex project.
This resampler can be used to convert between any two arbitrary rates (the
only requirement is that their ratio be a rational number) and there is control
over the quality/complexity tradeoff.
\end_layout
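\begin_layout Standard
As an illustration of the rational-ratio requirement, consider converting
from 44.1 kHz to 48 kHz.
These two rates are chosen here only as an example and are not taken from
the text above.
Dividing both rates by their greatest common divisor (300) gives the ratio
in lowest terms:
\begin_inset Formula \[
\frac{48000}{44100}=\frac{160}{147}\]
\end_inset
\end_layout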
\begin_layout Section
Integration
\end_layout
\begin_layout Standard
Knowing
\emph on
how
\emph default
to use each of the components is not that useful unless we know
\emph on
where
\emph default
to use them.
Figure
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Integration-VoIP"
\end_inset
shows where each of the components would be used in a typical VoIP client.
Components in dotted lines are optional, though they may be very useful
in some circumstances.
Several important points follow from this figure.
The AEC must be placed as close as possible to the playback and capture.
Only the resampling may be closer.
Also, it is very important to use the same clock for both mic capture and
speaker/headphones playback.
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
begin{center}
\end_layout
\end_inset
\begin_inset Graphics
filename components.eps
width 80text%
\end_inset
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
end{center}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
Integration of all the components in a VoIP client.
\begin_inset CommandInset label
LatexCommand label
name "fig:Integration-VoIP"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Chapter
Compiling and Porting
\end_layout
\begin_layout Standard
Compiling Speex under UNIX/Linux or any other platform supported by autoconf
(e.g.
Win32/cygwin) is as easy as typing:
\end_layout
\begin_layout LyX-Code
% ./configure [options]
\end_layout
\begin_layout LyX-Code
% make
\end_layout
\begin_layout LyX-Code
% make install
\end_layout
\begin_layout Standard
The options supported by the Speex configure script are:
\end_layout
\begin_layout Description
--prefix=<path> Specifies the base path for installing Speex (e.g.
/usr)
\end_layout
\begin_layout Description
--enable-shared/--disable-shared Whether to compile shared libraries
\end_layout
\begin_layout Description
--enable-static/--disable-static Whether to compile static libraries
\end_layout
\begin_layout Description
--disable-wideband Disable the wideband part of Speex (typically to save
space)
\end_layout
\begin_layout Description
--enable-valgrind Enable extra hints for valgrind for debugging purposes
(do not use by default)
\end_layout
\begin_layout Description
--enable-sse Enable use of SSE instructions (x86/float only)
\end_layout
\begin_layout Description
--enable-fixed-point
\begin_inset Index
status collapsed
\begin_layout Plain Layout
fixed-point
\end_layout
\end_inset
Compile Speex for a processor that does not have a floating point unit
(FPU)
\end_layout
\begin_layout Description
--enable-arm4-asm Enable assembly specific to the ARMv4 architecture (gcc
only)
\end_layout
\begin_layout Description
--enable-arm5e-asm Enable assembly specific to the ARMv5E architecture (gcc
only)
\end_layout
\begin_layout Description
--enable-fixed-point-debug Use only for debugging the fixed-point
\begin_inset Index
status collapsed
\begin_layout Plain Layout
fixed-point
\end_layout
\end_inset
code (very slow)
\end_layout
\begin_layout Description
--enable-ti-c55x Enable support for the TI C5x family
\end_layout
\begin_layout Description
--enable-blackfin-asm Enable assembly specific to the Blackfin DSP architecture
(gcc only)
\end_layout
\begin_layout Section
Platforms
\end_layout
\begin_layout Standard
Speex is known to compile and work on a large number of architectures, both
floating-point and fixed-point.
In general, any architecture that can natively compute the multiplication
of two signed 16-bit numbers (32-bit result) and runs at a sufficient clock
rate (architecture-dependent) is capable of running Speex.
Architectures on which Speex is
\series bold
known
\series default
to work (it probably works on many others) are:
\end_layout
\begin_layout Itemize
x86 & x86-64
\end_layout
\begin_layout Itemize
Power
\end_layout
\begin_layout Itemize
SPARC
\end_layout
\begin_layout Itemize
ARM
\end_layout
\begin_layout Itemize
Blackfin
\end_layout
\begin_layout Itemize
Coldfire (68k family)
\end_layout
\begin_layout Itemize
TI C54xx & C55xx
\end_layout
\begin_layout Itemize
TI C6xxx
\end_layout
\begin_layout Itemize
TriMedia (experimental)
\end_layout
\begin_layout Standard
Operating systems on top of which Speex is known to work include (it probably
works on many others):
\end_layout
\begin_layout Itemize
Linux
\end_layout
\begin_layout Itemize
\begin_inset Formula $\mu$
\end_inset
Clinux
\end_layout
\begin_layout Itemize
MacOS X
\end_layout
\begin_layout Itemize
BSD
\end_layout
\begin_layout Itemize
Other UNIX/POSIX variants
\end_layout
\begin_layout Itemize
Symbian
\end_layout
\begin_layout Standard
The source code directory includes additional information for compiling on
certain architectures or operating systems in README.xxx files.
\end_layout
\begin_layout Section
Porting and Optimising
\end_layout
\begin_layout Standard
Here are a few things to consider when porting or optimising Speex for a
new platform or an existing one.
\end_layout
\begin_layout Subsection
CPU optimisation
\end_layout
\begin_layout Standard
The single factor that will affect the CPU usage of Speex the most is whether
it is compiled for floating point or fixed-point.
If your CPU/DSP does not have a floating-point unit (FPU), then compiling
as fixed-point will be orders of magnitude faster.
If there is an FPU present, then it is important to test which version
is faster.
On the x86 architecture, floating-point is
\series bold
generally
\series default
faster, but not always.
To compile Speex as fixed-point, you need to pass --enable-fixed-point to the
configure script or define the FIXED_POINT macro for the compiler.
As of 1.2beta3, it is possible to disable the floating-point compatibility
API, which means that your code can link without a float emulation library.
To do that, configure with --disable-float-api or define the DISABLE_FLOAT_API
macro.
Until the VBR feature is ported to fixed-point, you will also need to configure
with --disable-vbr or define DISABLE_VBR.
\end_layout
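\begin_layout Standard
As a sketch, the options discussed above could be combined as follows when
building for a target without an FPU.
Whether all three switches are appropriate depends on the Speex version and
on the target, so this is an illustration rather than a recommended configuration:
\end_layout
\begin_layout LyX-Code
% ./configure --enable-fixed-point --disable-float-api --disable-vbr
\end_layout
\begin_layout LyX-Code
% make
\end_layout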
\begin_layout Standard
Other important things to check on some DSP architectures are:
\end_layout
\begin_layout Itemize
Make sure the cache is set to write-back mode
\end_layout
\begin_layout Itemize
If the chip has SRAM instead of cache, make sure as much code and data as
possible are in SRAM, rather than in RAM
\end_layout
\begin_layout Standard
If you are going to be writing assembly, then the following functions are
\series bold
usually
\series default
the first ones you should consider optimising:
\end_layout
\begin_layout Itemize
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
filter_mem16()
\end_layout
\end_inset
\end_layout
\begin_layout Itemize
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
iir_mem16()
\end_layout
\end_inset
\end_layout
\begin_layout Itemize
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
vq_nbest()
\end_layout
\end_inset
\end_layout
\begin_layout Itemize
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
pitch_xcorr()
\end_layout
\end_inset
\end_layout
\begin_layout Itemize
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
interp_pitch()
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The filtering functions
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
filter_mem16()
\end_layout
\end_inset
and
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
iir_mem16()
\end_layout
\end_inset
are implemented in the direct form II transposed (DF2T).
However, for architectures based on multiply-accumulate (MAC), DF2T requires
frequent reload of the accumulator, which can make the code very slow.
For these architectures (e.g.
Blackfin and Coldfire), a better approach is to implement those functions
as direct form I (DF1), which is easier to express in terms of MAC.
When doing that however,
\series bold
it is important to make sure that the DF1 implementation still matches
the original DF2T behaviour when it comes to memory values
\series default
.
This is necessary because the filter is time-varying and must compute exactly
the same value (not counting machine rounding) on any encoder or decoder.
\end_layout
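\begin_layout Standard
For reference, the two structures can be written as follows for a generic
filter of order
\begin_inset Formula $N$
\end_inset
with numerator coefficients
\begin_inset Formula $b_{k}$
\end_inset
and denominator coefficients
\begin_inset Formula $a_{k}$
\end_inset
(with
\begin_inset Formula $a_{0}=1$
\end_inset
).
This is the textbook formulation, with symbols introduced here purely for
illustration; it is not a transcription of the Speex source.
Direct form I computes each output from past inputs and past outputs:
\begin_inset Formula \[
y[n]=\sum_{k=0}^{N}b_{k}x[n-k]-\sum_{k=1}^{N}a_{k}y[n-k]\]
\end_inset
Direct form II transposed instead keeps
\begin_inset Formula $N$
\end_inset
memory values
\begin_inset Formula $m_{1},\ldots,m_{N}$
\end_inset
that are updated after each output sample:
\begin_inset Formula \begin{eqnarray*}
y[n] & = & b_{0}x[n]+m_{1}[n-1]\\
m_{k}[n] & = & b_{k}x[n]-a_{k}y[n]+m_{k+1}[n-1],\quad k=1,\ldots,N-1\\
m_{N}[n] & = & b_{N}x[n]-a_{N}y[n]\end{eqnarray*}
\end_inset
With constant coefficients and zero initial memory, both forms produce the
same output.
When the coefficients change over time, the DF2T memory values carry
contributions computed with the earlier coefficients, so a DF1 implementation
has to reproduce that memory behaviour explicitly.
\end_layout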
\begin_layout Subsection
Memory optimisation
\end_layout
\begin_layout Standard
Memory optimisation is mainly something that should be considered for small
embedded platforms.
For PCs, Speex is already so tiny that it's just not worth doing any of
the things suggested here.
There are several ways to reduce the memory usage of Speex, both in terms
of code size and data size.
For optimising code size, the trick is to first remove features you do
not need.
Some examples of things that can easily be disabled
\series bold
if you don't need them
\series default
are:
\end_layout
\begin_layout Itemize
Wideband support (--disable-wideband)
\end_layout
\begin_layout Itemize
Support for stereo (removing stereo.c)
\end_layout
\begin_layout Itemize
VBR support (--disable-vbr or DISABLE_VBR)
\end_layout
\begin_layout Itemize
Static codebooks that are not needed for the bit-rates you are using (*_table.c
files)
\end_layout
\begin_layout Standard
Speex also has several methods for allocating temporary arrays.
When using a compiler that supports C99 properly (as of 2007, Microsoft
compilers don't, but gcc does), it is best to define VAR_ARRAYS.
That makes use of the variable-size array feature of C99.
The next best is to define USE_ALLOCA so that Speex can use alloca() to
allocate the temporary arrays.
Note that on many systems, alloca() is buggy so it may not work.
If neither VAR_ARRAYS nor USE_ALLOCA is defined, then Speex falls back
to allocating a large
\begin_inset Quotes eld
\end_inset
scratch space
\begin_inset Quotes erd
\end_inset
and doing its own internal allocation.
The main disadvantage of this solution is that it is wasteful.
It needs to allocate enough stack for the worst case scenario (worst bit-rate,
highest complexity setting, ...) and by default, the memory isn't shared between
multiple encoder/decoder states.
Still, if the
\begin_inset Quotes eld
\end_inset
manual
\begin_inset Quotes erd
\end_inset
allocation is the only option left, there are a few things that can be
improved.
By overriding the speex_alloc_scratch() call in os_support.h, it is possible
to always return the same memory area for all states
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
In this case, one must be careful with threads
\end_layout
\end_inset
.
In addition to that, by redefining the NB_ENC_STACK and NB_DEC_STACK (or
similar for wideband), it is possible to only allocate memory for a scenario
that is known in advance.
In this case, it is important to measure the amount of memory required
for the specific sampling rate, bit-rate and complexity level being used.
\end_layout
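\begin_layout Standard
The idea of returning the same memory area for all states can be sketched
 as follows.
 This is only an illustration under stated assumptions, not code from libspeex:
 shared_scratch_alloc() and SHARED_SCRATCH_BYTES are hypothetical names,
 the size is an arbitrary placeholder that must be replaced by a measured
 value, and the way the function is substituted for speex_alloc_scratch()
 is not shown.
\end_layout

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size: measure the real requirement for your configuration. */
#define SHARED_SCRATCH_BYTES 16384

/* One static area reused by every state. Not safe across threads. */
static union {
    long double align;                          /* forces a strict alignment */
    unsigned char bytes[SHARED_SCRATCH_BYTES];
} shared_scratch;

/* Return the shared area if the request fits, NULL otherwise.
 * The requested part is cleared so each caller starts from zeroed memory. */
void *shared_scratch_alloc(int size)
{
    assert(size >= 0);
    if ((size_t)size > sizeof shared_scratch.bytes)
        return NULL;
    memset(shared_scratch.bytes, 0, (size_t)size);
    return shared_scratch.bytes;
}
```

\begin_layout Standard
Because every caller receives the same address, two states must never run
 concurrently when this kind of allocator is used.
\end_layout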
\begin_layout Standard
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Chapter
Command-line encoder/decoder
\begin_inset CommandInset label
LatexCommand label
name "sec:Command-line-encoder/decoder"
\end_inset
\end_layout
\begin_layout Standard
The base Speex distribution includes a command-line encoder (
\emph on
speexenc
\emph default
) and decoder (
\emph on
speexdec
\emph default
).
These tools produce and read Speex files encapsulated in the Ogg container.
Although it is possible to encapsulate Speex in any container, Ogg is the
recommended container for files.
This section describes how to use the command line tools for Speex files
in Ogg.
\end_layout
\begin_layout Section
\emph on
speexenc
\begin_inset Index
status collapsed
\begin_layout Plain Layout
speexenc
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The
\emph on
speexenc
\emph default
utility is used to create Speex files from raw PCM or wave files.
It can be used by calling:
\end_layout
\begin_layout LyX-Code
speexenc [options] input_file output_file
\end_layout
\begin_layout Standard
The value '-' for input_file or output_file corresponds respectively to
stdin and stdout.
The valid options are:
\end_layout
\begin_layout Description
--narrowband
\begin_inset space ~
\end_inset
(-n) Tell Speex to treat the input as narrowband (8 kHz).
This is the default
\end_layout
\begin_layout Description
--wideband
\begin_inset space ~
\end_inset
(-w) Tell Speex to treat the input as wideband (16 kHz)
\end_layout
\begin_layout Description
--ultra-wideband
\begin_inset space ~
\end_inset
(-u) Tell Speex to treat the input as
\begin_inset Quotes eld
\end_inset
ultra-wideband
\begin_inset Quotes erd
\end_inset
(32 kHz)
\end_layout
\begin_layout Description
--quality
\begin_inset space ~
\end_inset
n Set the encoding quality (0-10), default is 8
\end_layout
\begin_layout Description
--bitrate
\begin_inset space ~
\end_inset
n Encoding bit-rate (use bit-rate n or lower)
\end_layout
\begin_layout Description
--vbr Enable VBR (Variable Bit-Rate), disabled by default
\end_layout
\begin_layout Description
--abr
\begin_inset space ~
\end_inset
n Enable ABR (Average Bit-Rate) at n kbps, disabled by default
\end_layout
\begin_layout Description
--vad Enable VAD (Voice Activity Detection), disabled by default
\end_layout
\begin_layout Description
--dtx Enable DTX (Discontinuous Transmission), disabled by default
\end_layout
\begin_layout Description
--nframes
\begin_inset space ~
\end_inset
n Pack n frames in each Ogg packet (this saves space at low bit-rates)
\end_layout
\begin_layout Description
--comp
\begin_inset space ~
\end_inset
n Set encoding speed/quality tradeoff.
The higher the value of n, the slower the encoding (default is 3)
\end_layout
\begin_layout Description
-V Verbose operation, print bit-rate currently in use
\end_layout
\begin_layout Description
--help
\begin_inset space ~
\end_inset
(-h) Print the help
\end_layout
\begin_layout Description
--version
\begin_inset space ~
\end_inset
(-v) Print version information
\end_layout
\begin_layout Subsection*
Speex comments
\end_layout
\begin_layout Description
--comment Add the given string as an extra comment.
This may be used multiple times.
\end_layout
\begin_layout Description
--author Author of this track.
\end_layout
\begin_layout Description
--title Title for this track.
\end_layout
\begin_layout Subsection*
Raw input options
\end_layout
\begin_layout Description
--rate
\begin_inset space ~
\end_inset
n Sampling rate for raw input
\end_layout
\begin_layout Description
--stereo Consider raw input as stereo
\end_layout
\begin_layout Description
--le Raw input is little-endian
\end_layout
\begin_layout Description
--be Raw input is big-endian
\end_layout
\begin_layout Description
--8bit Raw input is 8-bit unsigned
\end_layout
\begin_layout Description
--16bit Raw input is 16-bit signed
\end_layout
\begin_layout Section
\emph on
speexdec
\begin_inset Index
status collapsed
\begin_layout Plain Layout
speexdec
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The
\emph on
speexdec
\emph default
utility is used to decode Speex files and can be used by calling:
\end_layout
\begin_layout LyX-Code
speexdec [options] speex_file [output_file]
\end_layout
\begin_layout Standard
The value '-' for speex_file or output_file corresponds respectively to
stdin and stdout.
Also, when no output_file is specified, the file is played to the soundcard.
The valid options are:
\end_layout
\begin_layout Description
--enh Enable post-filter (default)
\end_layout
\begin_layout Description
--no-enh Disable post-filter
\end_layout
\begin_layout Description
--force-nb Force decoding in narrowband
\end_layout
\begin_layout Description
--force-wb Force decoding in wideband
\end_layout
\begin_layout Description
--force-uwb Force decoding in ultra-wideband
\end_layout
\begin_layout Description
--mono Force decoding in mono
\end_layout
\begin_layout Description
--stereo Force decoding in stereo
\end_layout
\begin_layout Description
--rate
\begin_inset space ~
\end_inset
n Force decoding at n Hz sampling rate
\end_layout
\begin_layout Description
--packet-loss
\begin_inset space ~
\end_inset
n Simulate n % random packet loss
\end_layout
\begin_layout Description
-V Verbose operation, print bit-rate currently in use
\end_layout
\begin_layout Description
--help
\begin_inset space ~
\end_inset
(-h) Print the help
\end_layout
\begin_layout Description
--version
\begin_inset space ~
\end_inset
(-v) Print version information
\end_layout
\begin_layout Standard
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Chapter
Using the Speex Codec API (
\emph on
libspeex
\emph default
\begin_inset Index
status collapsed
\begin_layout Plain Layout
libspeex
\end_layout
\end_inset
)
\begin_inset CommandInset label
LatexCommand label
name "sec:Programming-with-Speex"
\end_inset
\end_layout
\begin_layout Standard
The
\emph on
libspeex
\emph default
library contains all the functions for encoding and decoding speech with
the Speex codec.
When linking on a UNIX system, one must add
\emph on
-lspeex -lm
\emph default
to the compiler command line.
One important thing to know is that
\series bold
libspeex calls are reentrant, but not thread-safe
\series default
.
That means that it is fine to use calls from many threads, but
\series bold
calls using the same state from multiple threads must be protected by mutexes
\series default
.
Examples of code can also be found in Appendix
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Sample-code"
\end_inset
and the complete API documentation is included in the Documentation section
of the Speex website (http://www.speex.org/).
\end_layout
\begin_layout Section
Encoding
\begin_inset CommandInset label
LatexCommand label
name "sub:Encoding"
\end_inset
\end_layout
\begin_layout Standard
In order to encode speech using Speex, one first needs to:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
#include <speex/speex.h>
\end_layout
\end_inset
Then in the code, a Speex bit-packing struct must be declared, along with
a Speex encoder state:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
SpeexBits bits;
\end_layout
\begin_layout Plain Layout
void *enc_state;
\end_layout
\end_inset
The two are initialized by:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_bits_init(&bits);
\end_layout
\begin_layout Plain Layout
enc_state = speex_encoder_init(&speex_nb_mode);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
For wideband coding,
\emph on
speex_nb_mode
\emph default
will be replaced by
\emph on
speex_wb_mode
\emph default
.
In most cases, you will need to know the frame size used at the sampling
rate you are using.
You can get that value in the
\emph on
frame_size
\emph default
variable (expressed in
\series bold
samples
\series default
, not bytes) with:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_encoder_ctl(enc_state,SPEEX_GET_FRAME_SIZE,&frame_size);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
In practice,
\emph on
frame_size
\emph default
will correspond to 20 ms when using 8, 16, or 32 kHz sampling rate.
There are many parameters that can be set for the Speex encoder, but the
most useful one is the quality parameter that controls the quality vs bit-rate
tradeoff.
This is set by:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_encoder_ctl(enc_state,SPEEX_SET_QUALITY,&quality);
\end_layout
\end_inset
where
\emph on
quality
\emph default
 is an integer value ranging from 0 to 10 (inclusive).
The mapping between quality and bit-rate is described in Table
\begin_inset CommandInset ref
LatexCommand ref
reference "cap:quality_vs_bps"
\end_inset
for narrowband.
\end_layout
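\begin_layout Standard
The 20 ms frame duration mentioned above translates into a sample count
 as sketched below.
 The helpers frame_samples() and frame_bytes() are illustrative names and
 are not part of libspeex; real code should keep querying SPEEX_GET_FRAME_SIZE
 rather than hard-coding the result.
\end_layout

```c
#include <assert.h>
#include <stddef.h>

/* Frame duration stated in the text for 8, 16 and 32 kHz. */
#define FRAME_MS 20

/* Samples per frame for a sampling rate given in Hz. */
int frame_samples(int rate_hz)
{
    assert(rate_hz > 0);
    return rate_hz * FRAME_MS / 1000;
}

/* Bytes needed to hold one frame of input when samples are stored as short. */
size_t frame_bytes(int rate_hz)
{
    return (size_t)frame_samples(rate_hz) * sizeof(short);
}
```

\begin_layout Standard
This gives 160, 320 and 640 samples per frame at 8, 16 and 32 kHz respectively.
\end_layout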
\begin_layout Standard
Once the initialization is done, for every input frame:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_bits_reset(&bits);
\end_layout
\begin_layout Plain Layout
speex_encode_int(enc_state, input_frame, &bits);
\end_layout
\begin_layout Plain Layout
nbBytes = speex_bits_write(&bits, byte_ptr, MAX_NB_BYTES);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
where
\emph on
input_frame
\emph default
is a
\emph on
(
\emph default
short
\emph on
*)
\emph default
pointing to the beginning of a speech frame,
\emph on
byte_ptr
\emph default
is a
\emph on
(char *)
\emph default
where the encoded frame will be written,
\emph on
MAX_NB_BYTES
\emph default
is the maximum number of bytes that can be written to
\emph on
byte_ptr
\emph default
without causing an overflow and
\emph on
nbBytes
\emph default
is the number of bytes actually written to
\emph on
byte_ptr
\emph default
(the encoded size in bytes).
Before calling speex_bits_write, it is possible to find the number of bytes
that need to be written by calling
\family typewriter
speex_bits_nbytes(&bits)
\family default
.
\end_layout
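\begin_layout Standard
One possible way to choose a value for MAX_NB_BYTES is to derive it from
 the bit-rate.
 The sketch below assumes constant bit-rate operation and 20 ms frames; frame_bi
ts() and packet_bytes() are illustrative helpers, not libspeex functions.
 With VBR, the highest bit-rate that can occur would have to be used instead.
\end_layout

```c
#include <assert.h>

/* Assumed frame duration in milliseconds. */
#define FRAME_MS 20

/* Bits produced for one frame at a constant bit-rate in bits per second. */
int frame_bits(int bitrate_bps)
{
    assert(bitrate_bps > 0);
    return bitrate_bps * FRAME_MS / 1000;
}

/* Bytes needed for nframes frames packed back to back,
 * rounded up to a whole number of bytes. */
int packet_bytes(int bitrate_bps, int nframes)
{
    assert(nframes > 0);
    return (frame_bits(bitrate_bps) * nframes + 7) / 8;
}
```

\begin_layout Standard
For example, a frame encoded at 8000 bits per second holds 160 bits and
 therefore fits in 20 bytes.
\end_layout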
\begin_layout Standard
It is still possible to use the
\emph on
speex_encode()
\emph default
function, which takes a
\emph on
(float *)
\emph default
for the audio.
However, this would make an eventual port to an FPU-less platform (like
ARM) more complicated.
Internally,
\emph on
speex_encode()
\emph default
and
\emph on
speex_encode_int()
\emph default
are processed in the same way.
Whether the encoder uses the fixed-point version is only decided by the
compile-time flags, not at the API level.
\end_layout
\begin_layout Standard
After you're done with the encoding, free all resources with:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_bits_destroy(&bits);
\end_layout
\begin_layout Plain Layout
speex_encoder_destroy(enc_state);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
That's about it for the encoder.
\end_layout
\begin_layout Section
Decoding
\begin_inset CommandInset label
LatexCommand label
name "sub:Decoding"
\end_inset
\end_layout
\begin_layout Standard
In order to decode speech using Speex, you first need to:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
#include <speex/speex.h>
\end_layout
\end_inset
You also need to declare a Speex bit-packing struct
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
SpeexBits bits;
\end_layout
\end_inset
and a Speex decoder state
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
void *dec_state;
\end_layout
\end_inset
The two are initialized by:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_bits_init(&bits);
\end_layout
\begin_layout Plain Layout
dec_state = speex_decoder_init(&speex_nb_mode);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
For wideband decoding,
\emph on
speex_nb_mode
\emph default
will be replaced by
\emph on
speex_wb_mode
\emph default
.
If you need to obtain the size of the frames that will be used by the decoder,
you can get that value in the
\emph on
frame_size
\emph default
variable (expressed in
\series bold
samples
\series default
, not bytes) with:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_decoder_ctl(dec_state, SPEEX_GET_FRAME_SIZE, &frame_size);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
There is also a parameter that can be set for the decoder: whether or not
to use a perceptual enhancer.
This can be set by:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_decoder_ctl(dec_state, SPEEX_SET_ENH, &enh);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
where
\emph on
enh
\emph default
is an int with value 0 to have the enhancer disabled and 1 to have it enabled.
As of 1.2-beta1, the default is now to enable the enhancer.
\end_layout
\begin_layout Standard
Again, once the decoder initialization is done, for every input frame:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_bits_read_from(&bits, input_bytes, nbBytes);
\end_layout
\begin_layout Plain Layout
speex_decode_int(dec_state, &bits, output_frame);
\end_layout
\end_inset
where input_bytes is a
\emph on
(char *)
\emph default
containing the bit-stream data received for a frame,
\emph on
nbBytes
\emph default
is the size (in bytes) of that bit-stream, and
\emph on
output_frame
\emph default
is a
\emph on
(short *)
\emph default
and points to the area where the decoded speech frame will be written.
Passing NULL as the second argument (instead of &bits) indicates that the
 bits for the current frame are not available.
When a frame is lost, the Speex decoder will do its best to "guess" the
correct signal.
\end_layout
\begin_layout Standard
As for the encoder, the
\emph on
speex_decode()
\emph default
function can still be used, with a
\emph on
(float *)
\emph default
as the output for the audio.
After you're done with the decoding, free all resources with:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_bits_destroy(&bits);
\end_layout
\begin_layout Plain Layout
speex_decoder_destroy(dec_state);
\end_layout
\end_inset
\end_layout
\begin_layout Section
Codec Options (speex_*_ctl)
\begin_inset CommandInset label
LatexCommand label
name "sub:Codec-Options"
\end_inset
\end_layout
\begin_layout Quote
\align center
\emph on
Entities should not be multiplied beyond necessity -- William of Ockham.
\end_layout
\begin_layout Quote
\align center
\emph on
Just because there's an option for it doesn't mean you have to turn it on
-- me.
\end_layout
\begin_layout Standard
The Speex encoder and decoder support many options and requests that can
be accessed through the
\emph on
speex_encoder_ctl
\emph default
and
\emph on
speex_decoder_ctl
\emph default
functions.
These functions are similar to the
\emph on
ioctl
\emph default
system call and their prototypes are:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
void speex_encoder_ctl(void *encoder, int request, void *ptr);
\end_layout
\begin_layout Plain Layout
void speex_decoder_ctl(void *decoder, int request, void *ptr);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Although these functions expose many options, the defaults are usually good for many applications
and
\series bold
optional settings should only be used when one understands them and knows
that they are needed
\series default
.
A common error is to attempt to set many unnecessary settings.
\end_layout
\begin_layout Standard
Here is a list of the values allowed for the requests.
Some only apply to the encoder or the decoder.
Because the last argument is of type
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
void *
\end_layout
\end_inset
, the
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
_ctl()
\end_layout
\end_inset
functions are
\series bold
not type safe
\series default
, and should thus be used with care.
The type
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
is the same as the C99
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
int32_t
\end_layout
\end_inset
type.
\end_layout
\begin_layout Description
SPEEX_SET_ENH
\begin_inset Formula $\ddagger$
\end_inset
Set perceptual enhancer
\begin_inset Index
status collapsed
\begin_layout Plain Layout
perceptual enhancement
\end_layout
\end_inset
to on (1) or off (0) (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
, default is on)
\end_layout
\begin_layout Description
SPEEX_GET_ENH
\begin_inset Formula $\ddagger$
\end_inset
Get perceptual enhancer status (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_GET_FRAME_SIZE Get the number of samples per frame for the current
mode (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_SET_QUALITY
\begin_inset Formula $\dagger$
\end_inset
Set the encoder speech quality (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
from 0 to 10, default is 8)
\end_layout
\begin_layout Description
SPEEX_GET_QUALITY
\begin_inset Formula $\dagger$
\end_inset
Get the current encoder speech quality (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
from 0 to 10)
\end_layout
\begin_layout Description
SPEEX_SET_MODE
\begin_inset Formula $\dagger$
\end_inset
Set the mode number, as specified in the RTP spec (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_GET_MODE
\begin_inset Formula $\dagger$
\end_inset
Get the current mode number, as specified in the RTP spec (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_SET_VBR
\begin_inset Formula $\dagger$
\end_inset
Set variable bit-rate (VBR) to on (1) or off (0) (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
, default is off)
\end_layout
\begin_layout Description
SPEEX_GET_VBR
\begin_inset Formula $\dagger$
\end_inset
Get variable bit-rate
\begin_inset Index
status collapsed
\begin_layout Plain Layout
variable bit-rate
\end_layout
\end_inset
(VBR) status (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_SET_VBR_QUALITY
\begin_inset Formula $\dagger$
\end_inset
Set the encoder VBR speech quality (float 0.0 to 10.0, default is 8.0)
\end_layout
\begin_layout Description
SPEEX_GET_VBR_QUALITY
\begin_inset Formula $\dagger$
\end_inset
Get the current encoder VBR speech quality (float 0 to 10)
\end_layout
\begin_layout Description
SPEEX_SET_COMPLEXITY
\begin_inset Formula $\dagger$
\end_inset
Set the CPU resources allowed for the encoder (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
from 1 to 10, default is 2)
\end_layout
\begin_layout Description
SPEEX_GET_COMPLEXITY
\begin_inset Formula $\dagger$
\end_inset
Get the CPU resources allowed for the encoder (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
from 1 to 10, default is 2)
\end_layout
\begin_layout Description
SPEEX_SET_BITRATE
\begin_inset Formula $\dagger$
\end_inset
Set the bit-rate to use the closest value not exceeding the parameter (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
in bits per second)
\end_layout
\begin_layout Description
SPEEX_GET_BITRATE Get the current bit-rate in use (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
in bits per second)
\end_layout
\begin_layout Description
SPEEX_SET_SAMPLING_RATE Set real sampling rate (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
in Hz)
\end_layout
\begin_layout Description
SPEEX_GET_SAMPLING_RATE Get real sampling rate (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
in Hz)
\end_layout
\begin_layout Description
SPEEX_RESET_STATE Reset the encoder/decoder state to its original state,
clearing all memories (no argument)
\end_layout
\begin_layout Description
SPEEX_SET_VAD
\begin_inset Formula $\dagger$
\end_inset
Set voice activity detection
\begin_inset Index
status collapsed
\begin_layout Plain Layout
voice activity detection
\end_layout
\end_inset
(VAD) to on (1) or off (0) (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
, default is off)
\end_layout
\begin_layout Description
SPEEX_GET_VAD
\begin_inset Formula $\dagger$
\end_inset
Get voice activity detection (VAD) status (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_SET_DTX
\begin_inset Formula $\dagger$
\end_inset
Set discontinuous transmission
\begin_inset Index
status collapsed
\begin_layout Plain Layout
discontinuous transmission
\end_layout
\end_inset
(DTX) to on (1) or off (0) (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
, default is off)
\end_layout
\begin_layout Description
SPEEX_GET_DTX
\begin_inset Formula $\dagger$
\end_inset
Get discontinuous transmission (DTX) status (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_SET_ABR
\begin_inset Formula $\dagger$
\end_inset
Set average bit-rate
\begin_inset Index
status collapsed
\begin_layout Plain Layout
average bit-rate
\end_layout
\end_inset
 (ABR) to a value n (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
in bits per second)
\end_layout
\begin_layout Description
SPEEX_GET_ABR
\begin_inset Formula $\dagger$
\end_inset
Get average bit-rate (ABR) setting (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
in bits per second)
\end_layout
\begin_layout Description
SPEEX_SET_PLC_TUNING
\begin_inset Formula $\dagger$
\end_inset
Tell the encoder to optimize encoding for a certain percentage of packet
loss (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
in percent)
\end_layout
\begin_layout Description
SPEEX_GET_PLC_TUNING
\begin_inset Formula $\dagger$
\end_inset
Get the current tuning of the encoder for PLC (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
in percent)
\end_layout
\begin_layout Description
SPEEX_SET_VBR_MAX_BITRATE
\begin_inset Formula $\dagger$
\end_inset
Set the maximum bit-rate allowed in VBR operation (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
in bits per second)
\end_layout
\begin_layout Description
SPEEX_GET_VBR_MAX_BITRATE
\begin_inset Formula $\dagger$
\end_inset
Get the current maximum bit-rate allowed in VBR operation (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
in bits per second)
\end_layout
\begin_layout Description
SPEEX_SET_HIGHPASS Set the high-pass filter on (1) or off (0) (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
, default is on)
\end_layout
\begin_layout Description
SPEEX_GET_HIGHPASS Get the current high-pass filter status (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
\begin_inset Formula $\dagger$
\end_inset
applies only to the encoder
\end_layout
\begin_layout Description
\begin_inset Formula $\ddagger$
\end_inset
applies only to the decoder
\end_layout
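\begin_layout Standard
Because the _ctl() functions take a void * argument, one common way to reduce
 the risk of passing the wrong type is to hide each request behind a small
 typed wrapper.
 The sketch below only demonstrates the pattern: DemoState, demo_ctl() and
 the DEMO_* request codes are stand-ins invented for this example and are
 not part of libspeex.
\end_layout

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in request codes and state, invented for this sketch. */
enum { DEMO_SET_QUALITY = 1, DEMO_SET_VBR_QUALITY = 2 };

typedef struct {
    int32_t quality;
    float vbr_quality;
} DemoState;

/* Stand-in for a ctl-style function: the type behind ptr depends on request,
 * so the compiler cannot check it. */
int demo_ctl(void *state, int request, void *ptr)
{
    DemoState *st = state;
    switch (request) {
    case DEMO_SET_QUALITY:
        st->quality = *(int32_t *)ptr;
        return 0;
    case DEMO_SET_VBR_QUALITY:
        st->vbr_quality = *(float *)ptr;
        return 0;
    default:
        return -1;  /* unknown request */
    }
}

/* Typed wrappers: the argument type is now checked at compile time. */
int demo_set_quality(DemoState *st, int32_t quality)
{
    assert(quality >= 0 && quality <= 10);
    return demo_ctl(st, DEMO_SET_QUALITY, &quality);
}

int demo_set_vbr_quality(DemoState *st, float quality)
{
    assert(quality >= 0.0f && quality <= 10.0f);
    return demo_ctl(st, DEMO_SET_VBR_QUALITY, &quality);
}
```

\begin_layout Standard
The same approach can be applied to the real requests, keeping in mind that
 most of them expect an spx_int32_t while the VBR quality requests expect
 a float.
\end_layout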
\begin_layout Section
Mode queries
\begin_inset CommandInset label
LatexCommand label
name "sub:Mode-queries"
\end_inset
\end_layout
\begin_layout Standard
Speex modes have a query system similar to the speex_encoder_ctl and speex_decod
er_ctl calls.
Since modes are read-only, it is only possible to get information about
a particular mode.
The function used to do that is:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
void speex_mode_query(SpeexMode *mode, int request, void *ptr);
\end_layout
\end_inset
The admissible values for request are (unless otherwise noted, the values
are returned through
\emph on
ptr
\emph default
):
\end_layout
\begin_layout Description
SPEEX_MODE_FRAME_SIZE Get the frame size (in samples) for the mode
\end_layout
\begin_layout Description
SPEEX_SUBMODE_BITRATE Get the bit-rate for a submode number specified through
\emph on
ptr
\emph default
(integer in bps).
\end_layout
\begin_layout Section
Packing and in-band signalling
\begin_inset Index
status collapsed
\begin_layout Plain Layout
in-band signalling
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Sometimes it is desirable to pack more than one frame per packet (or other
basic unit of storage).
The proper way to do it is to call speex_encode
\begin_inset Formula $N$
\end_inset
times before writing the stream with speex_bits_write.
In cases where the number of frames is not determined by an out-of-band
mechanism, it is possible to include a terminator code.
That terminator consists of the code 15 (decimal) encoded with 5 bits,
as shown in Table
\begin_inset CommandInset ref
LatexCommand ref
reference "cap:quality_vs_bps"
\end_inset
.
Note that as of version 1.0.2, calling speex_bits_write automatically inserts
the terminator so as to fill the last byte.
This doesn't involve any overhead and makes sure Speex can always detect
 when there are no more frames in a packet.
\end_layout
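\begin_layout Standard
To make the terminator concrete, the sketch below appends the code 15 on
 5 bits (binary 01111) after some frame bits.
 BitWriter, bw_init(), bw_put() and bw_put_terminator() are hypothetical
 helpers written for this example; they are not the SpeexBits API, and the
 most-significant-bit-first order is an assumption of the sketch.
\end_layout

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    unsigned char buf[64];
    size_t nbits;              /* number of bits written so far */
} BitWriter;

void bw_init(BitWriter *bw)
{
    memset(bw, 0, sizeof *bw);
}

/* Append the nbits low-order bits of value, most significant bit first
 * (assumed bit order). */
void bw_put(BitWriter *bw, unsigned value, int nbits)
{
    assert(nbits >= 0 && nbits <= 16);
    assert(bw->nbits + (size_t)nbits <= sizeof bw->buf * 8);
    for (int i = nbits - 1; i >= 0; i--) {
        if ((value >> i) & 1u)
            bw->buf[bw->nbits / 8] |= (unsigned char)(0x80u >> (bw->nbits % 8));
        bw->nbits++;
    }
}

/* Append the terminator: code 15 written on 5 bits. */
void bw_put_terminator(BitWriter *bw)
{
    bw_put(bw, 15, 5);
}
```

\begin_layout Standard
In real code this is not needed, since speex_bits_write takes care of the
 terminator as described above.
\end_layout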
\begin_layout Standard
It is also possible to send in-band
\begin_inset Quotes eld
\end_inset
messages
\begin_inset Quotes erd
\end_inset
to the other side.
All these messages are encoded as
\begin_inset Quotes eld
\end_inset
pseudo-frames
\begin_inset Quotes erd
\end_inset
of mode 14 which contain a 4-bit message type code, followed by the message.
Table
\begin_inset CommandInset ref
LatexCommand ref
reference "cap:In-band-signalling-codes"
\end_inset
lists the available codes, their meaning and the size of the message that
follows.
Most of these messages are requests that are sent to the encoder or decoder
on the other end, which is free to comply or ignore them.
By default, all in-band messages are ignored.
\end_layout
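\begin_layout Standard
The layout of an in-band message can be sketched as a mode number, a 4-bit
 message type code and the message itself.
 In the example below, inband_message() is a hypothetical helper, not a libspeex
 function, and writing the mode number 14 on 5 bits (like the terminator)
 is an assumption of this sketch.
\end_layout

```c
#include <assert.h>
#include <stdint.h>

#define INBAND_MODE      14  /* mode number of in-band pseudo-frames */
#define INBAND_MODE_BITS  5  /* assumption: same width as the terminator */
#define INBAND_CODE_BITS  4  /* width of the message type code */

/* Build the bit pattern of one in-band message, right-aligned in the result.
 * *total_bits receives the number of meaningful bits. */
uint64_t inband_message(unsigned code, uint32_t payload, int payload_bits,
                        int *total_bits)
{
    assert(code < 16);
    assert(payload_bits >= 1 && payload_bits <= 32);
    /* The payload must fit in the announced number of bits. */
    assert(payload_bits == 32 || (payload >> payload_bits) == 0);

    uint64_t bits = INBAND_MODE;
    bits = (bits << INBAND_CODE_BITS) | code;
    bits = (bits << payload_bits) | payload;
    *total_bits = INBAND_MODE_BITS + INBAND_CODE_BITS + payload_bits;
    return bits;
}
```

\begin_layout Standard
For instance, a request with code 0 and a 1-bit message set to 1 occupies
 10 bits under these assumptions.
\end_layout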
\begin_layout Standard
\begin_inset Float table
placement htbp
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
begin{center}
\end_layout
\end_inset
\begin_inset Tabular
<lyxtabular version="3" rows="17" columns="3">
<features>
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Code
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Size (bits)
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Content
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Asks decoder to set perceptual enhancement off (0) or on (1)
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Asks (if 1) the encoder to be less
\begin_inset Quotes eld
\end_inset
aggressive
\begin_inset Quotes erd
\end_inset
due to high packet loss
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
2
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Asks encoder to switch to mode N
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Asks encoder to switch to mode N for low-band
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Asks encoder to switch to mode N for high-band
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Asks encoder to switch to quality N for VBR
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
6
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Request acknowledge (0=no, 1=all, 2=only for in-band data)
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Asks encoder to set CBR (0), VAD (1), DTX (3), VBR (5), VBR+DTX (7)
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
8
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
8
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Transmit (8-bit) character to the other end
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
9
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
8
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Intensity stereo information
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
10
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
16
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Announce maximum bit-rate acceptable (N in bytes/second)
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
11
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
16
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
reserved
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
12
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
32
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Acknowledge receiving packet N
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
13
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
32
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
reserved
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
14
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
64
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
reserved
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
15
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
64
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
reserved
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
end{center}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
In-band signalling codes
\begin_inset CommandInset label
LatexCommand label
name "cap:In-band-signalling-codes"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
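\begin_layout Standard
The payload sizes in the table above are what let a decoder skip a request
it does not understand.
A minimal sketch of that lookup follows; the helper name inband_payload_bits()
is hypothetical and is not part of the Speex API, and the 1-bit size used
for codes 0 and 1 is taken from the first rows of the same table.
\end_layout

```c
#include <assert.h>

/* Payload size, in bits, of an in-band request, indexed by its 4-bit code.
   Values follow the in-band signalling table: codes 0-1 carry 1 bit,
   2-7 carry 4 bits, 8-9 carry 8, 10-11 carry 16, 12-13 carry 32 and
   14-15 carry 64. */
static const int inband_bits[16] = {
    1, 1, 4, 4, 4, 4, 4, 4, 8, 8, 16, 16, 32, 32, 64, 64
};

/* Hypothetical helper: number of payload bits that follow a request code,
   i.e. how many bits to skip when the request is not understood. */
int inband_payload_bits(int code)
{
    assert(code >= 0 && code < 16);
    return inband_bits[code];
}
```

\begin_layout Standard
For instance, an application that does not handle code 10 would still advance
past its 16 payload bits before reading the next frame.
\end_layout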
\begin_layout Standard
Finally, applications may define custom in-band messages using mode 13.
The size of the message in bytes is encoded with 5 bits, so that the decoder
can skip it if it doesn't know how to interpret it.
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Chapter
Speech Processing API (
\emph on
libspeexdsp
\emph default
)
\end_layout
\begin_layout Standard
As of version 1.2beta3, the non-codec parts of the Speex package are now
in a separate library called
\emph on
libspeexdsp
\emph default
.
This library includes the preprocessor, the acoustic echo canceller, the
jitter buffer, and the resampler.
In a UNIX environment, it can be linked into a program by adding
\emph on
-lspeexdsp -lm
\emph default
to the compiler command line.
Just like for libspeex,
\series bold
libspeexdsp calls are reentrant, but not thread-safe
\series default
.
That means that it is fine to use calls from many threads, but
\series bold
calls using the same state from multiple threads must be protected by mutexes
\series default
.
\end_layout
\begin_layout Section
Preprocessor
\begin_inset CommandInset label
LatexCommand label
name "sub:Preprocessor"
\end_inset
\end_layout
\begin_layout Standard
\noindent
In order to use the Speex preprocessor
\begin_inset Index
status collapsed
\begin_layout Plain Layout
preprocessor
\end_layout
\end_inset
, you first need to:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
#include <speex/speex_preprocess.h>
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\noindent
Then, a preprocessor state can be created as:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
SpeexPreprocessState *preprocess_state = speex_preprocess_state_init(frame_size,
sampling_rate);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\noindent
and it is recommended to use the same value for
\family typewriter
frame_size
\family default
as is used by the encoder (20
\emph on
ms
\emph default
).
\end_layout
\begin_layout Standard
For each input frame, you need to call:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_preprocess_run(preprocess_state, audio_frame);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\noindent
where
\family typewriter
audio_frame
\family default
is used both as input and output.
In cases where the output audio is not useful for a certain frame, it is
possible to use instead:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_preprocess_estimate_update(preprocess_state, audio_frame);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\noindent
This call will update all the preprocessor internal state variables without
computing the output audio, thus saving some CPU cycles.
\end_layout
\begin_layout Standard
The behaviour of the preprocessor can be changed using:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_preprocess_ctl(preprocess_state, request, ptr);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\noindent
which is used in the same way as the encoder and decoder equivalent.
Options are listed in Section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Preprocessor-options"
\end_inset
.
\end_layout
\begin_layout Standard
The preprocessor state can be destroyed using:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_preprocess_state_destroy(preprocess_state);
\end_layout
\end_inset
\end_layout
\begin_layout Subsection
Preprocessor options
\begin_inset CommandInset label
LatexCommand label
name "sub:Preprocessor-options"
\end_inset
\end_layout
\begin_layout Standard
As with the codec, the preprocessor also has options that can be controlled
using an ioctl()-like call.
The available options are:
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_SET_DENOISE Turns denoising on (1) or off (0) (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_GET_DENOISE Get denoising status (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_SET_AGC Turns automatic gain control (AGC) on (1) or off (0)
(
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_GET_AGC Get AGC status (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_SET_VAD Turns voice activity detector (VAD) on (1) or off (0)
(
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_GET_VAD Get VAD status (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_SET_AGC_LEVEL
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_GET_AGC_LEVEL
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_SET_DEREVERB Turns reverberation removal on (1) or off (0)
(
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_GET_DEREVERB Get reverberation removal status (
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_SET_DEREVERB_LEVEL Not working yet, do not use
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_GET_DEREVERB_LEVEL Not working yet, do not use
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_SET_DEREVERB_DECAY Not working yet, do not use
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_GET_DEREVERB_DECAY Not working yet, do not use
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_SET_PROB_START
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_GET_PROB_START
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_SET_PROB_CONTINUE
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_GET_PROB_CONTINUE
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_SET_NOISE_SUPPRESS Set maximum attenuation of the noise
in dB (negative
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_GET_NOISE_SUPPRESS Get maximum attenuation of the noise
in dB (negative
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_SET_ECHO_SUPPRESS Set maximum attenuation of the residual
echo in dB (negative
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_GET_ECHO_SUPPRESS Get maximum attenuation of the residual
echo in dB (negative
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_SET_ECHO_SUPPRESS_ACTIVE Set maximum attenuation of the
echo in dB when near end is active (negative
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_GET_ECHO_SUPPRESS_ACTIVE Get maximum attenuation of the
echo in dB when near end is active (negative
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
spx_int32_t
\end_layout
\end_inset
)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_SET_ECHO_STATE Set the associated echo canceller for residual
echo suppression (pointer or NULL for no residual echo suppression)
\end_layout
\begin_layout Description
SPEEX_PREPROCESS_GET_ECHO_STATE Get the associated echo canceller (pointer)
\end_layout
\begin_layout Section
Echo Cancellation
\begin_inset CommandInset label
LatexCommand label
name "sub:Echo-Cancellation"
\end_inset
\end_layout
\begin_layout Standard
The Speex library now includes an echo cancellation
\begin_inset Index
status collapsed
\begin_layout Plain Layout
echo cancellation
\end_layout
\end_inset
algorithm suitable for Acoustic Echo Cancellation
\begin_inset Index
status collapsed
\begin_layout Plain Layout
acoustic echo cancellation
\end_layout
\end_inset
(AEC).
In order to use the echo canceller, you first need to
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
#include <speex/speex_echo.h>
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Then, an echo canceller state can be created by:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
SpeexEchoState *echo_state = speex_echo_state_init(frame_size, filter_length);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
where
\family typewriter
frame_size
\family default
is the amount of data (in samples) you want to process at once and
\family typewriter
filter_length
\family default
is the length (in samples) of the echo cancelling filter you want to use
(also known as
\shape italic
tail length
\shape default
\begin_inset Index
status collapsed
\begin_layout Plain Layout
tail length
\end_layout
\end_inset
).
It is recommended to use a frame size on the order of 20 ms (or equal to
the codec frame size) and make sure it is easy to perform an FFT of that
size (powers of two are better than prime sizes).
The recommended tail length is approximately one third of the room
reverberation time.
For example, in a small room, reverberation time is in the order of 300
ms, so a tail length of 100 ms is a good choice (800 samples at 8000 Hz
sampling rate).
\end_layout
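\begin_layout Standard
The sizing rules above reduce to simple arithmetic.
The following sketch shows it under the assumptions stated in the text (20
ms frames, tail length of about one third of the reverberation time); the
helper names ms_to_samples() and tail_length_samples() are hypothetical
and are not part of libspeexdsp.
\end_layout

```c
#include <assert.h>

/* Convert a duration in milliseconds to a sample count at the given
   sampling rate (in Hz). */
int ms_to_samples(int duration_ms, int sampling_rate)
{
    assert(duration_ms >= 0 && sampling_rate > 0);
    return (int)((long long)duration_ms * sampling_rate / 1000);
}

/* Rule of thumb from the text: the tail length is about one third of
   the room reverberation time. */
int tail_length_samples(int reverb_ms, int sampling_rate)
{
    return ms_to_samples(reverb_ms / 3, sampling_rate);
}
```

\begin_layout Standard
With a 300 ms reverberation time at 8000 Hz this gives the 800-sample tail
length quoted above, and a 20 ms frame is 160 samples; these are the values
that would be passed as filter_length and frame_size to speex_echo_state_init().
\end_layout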
\begin_layout Standard
Once the echo canceller state is created, audio can be processed by:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_echo_cancellation(echo_state, input_frame, echo_frame, output_frame);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
where
\family typewriter
input_frame
\family default
is the audio as captured by the microphone,
\family typewriter
echo_frame
\family default
is the signal that was played in the speaker (and needs to be removed)
and
\family typewriter
output_frame
\family default
is the signal with echo removed.
\end_layout
\begin_layout Standard
One important thing to keep in mind is the relationship between
\family typewriter
input_frame
\family default
and
\family typewriter
echo_frame
\family default
.
It is important that, at any time, any echo that is present in the input
has already been sent to the echo canceller as
\family typewriter
echo_frame
\family default
.
In other words, the echo canceller cannot remove a signal that it hasn't
yet received.
On the other hand, the delay between the input signal and the echo signal
must be small enough because otherwise part of the echo cancellation filter
is wasted.
In the ideal case, your code would look like:
\begin_inset listings
lstparams "breaklines=true"
inline false
status open
\begin_layout Plain Layout
write_to_soundcard(echo_frame, frame_size);
\end_layout
\begin_layout Plain Layout
read_from_soundcard(input_frame, frame_size);
\end_layout
\begin_layout Plain Layout
speex_echo_cancellation(echo_state, input_frame, echo_frame, output_frame);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
If you wish to further reduce the echo present in the signal, you can do
so by associating the echo canceller to the preprocessor (see Section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Preprocessor"
\end_inset
).
This is done by calling:
\begin_inset listings
lstparams "breaklines=true"
inline false
status open
\begin_layout Plain Layout
speex_preprocess_ctl(preprocess_state, SPEEX_PREPROCESS_SET_ECHO_STATE,echo_stat
e);
\end_layout
\end_inset
in the initialisation.
\end_layout
\begin_layout Standard
As of version 1.2-beta2, there is an alternative, simpler API that can be
used instead of
\emph on
speex_echo_cancellation()
\emph default
.
When audio capture and playback are handled asynchronously (e.g.
in different threads or using the
\emph on
poll()
\emph default
or
\emph on
select()
\emph default
system call), it can be difficult to keep track of which input_frame goes
with which echo_frame.
Instead, the playback context/thread can simply call:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_echo_playback(echo_state, echo_frame);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
every time an audio frame is played.
Then, the capture context/thread calls:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_echo_capture(echo_state, input_frame, output_frame);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
for every frame captured.
Internally,
\emph on
speex_echo_playback()
\emph default
simply buffers the playback frame so it can be used by
\emph on
speex_echo_capture()
\emph default
to call
\emph on
speex_echo_cancellation()
\emph default
.
A side effect of using this alternate API is that the playback audio is
delayed by two frames, which is the normal delay caused by the soundcard.
When capture and playback are already synchronised,
\emph on
speex_echo_cancellation()
\emph default
is preferable since it gives better control on the exact input/echo timing.
\end_layout
\begin_layout Standard
The echo cancellation state can be destroyed with:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_echo_state_destroy(echo_state);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
It is also possible to reset the state of the echo canceller so it can be
reused without the need to create another state with:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
speex_echo_state_reset(echo_state);
\end_layout
\end_inset
\end_layout
\begin_layout Subsection
Troubleshooting
\end_layout
\begin_layout Standard
There are several things that may prevent the echo canceller from working
properly.
One of them is a bug (or something suboptimal) in the code, but there are
many others you should consider first:
\end_layout
\begin_layout Itemize
Using a different soundcard to do the capture and playback will
\series bold
not
\series default
work, regardless of what you may think.
The only exception to that is if the two cards can be made to have their
sampling clock
\begin_inset Quotes eld
\end_inset
locked
\begin_inset Quotes erd
\end_inset
on the same clock source.
If not, the clocks will always have a small amount of drift, which will
prevent the echo canceller from adapting.
\end_layout
\begin_layout Itemize
The delay between the record and playback signals must be minimal.
Any signal played has to
\begin_inset Quotes eld
\end_inset
appear
\begin_inset Quotes erd
\end_inset
on the playback (far end) signal slightly before the echo canceller
\begin_inset Quotes eld
\end_inset
sees
\begin_inset Quotes erd
\end_inset
it in the near end signal, but excessive delay means that part of the filter
length is wasted.
In the worst situations, the delay is such that it is longer than the filter
length, in which case, no echo can be cancelled.
\end_layout
\begin_layout Itemize
When it comes to echo tail length (filter length), longer is
\series bold
not
\series default
better.
Actually, the longer the tail length, the longer it takes for the filter
to adapt.
Of course, a tail length that is too short will not cancel enough echo,
but the most common problem seen is that people set a very long tail length
and then wonder why no echo is being cancelled.
\end_layout
\begin_layout Itemize
Non-linear distortion cannot (by definition) be modeled by the linear adaptive
filter used in the echo canceller and thus cannot be cancelled.
Use good audio gear and avoid saturation/clipping.
\end_layout
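\begin_layout Standard
The interaction between delay and tail length described in the list above
can be illustrated with a deliberately simplified model.
The helper usable_filter_length() below is hypothetical and is not part
of libspeexdsp; it only assumes that every sample of delay between the playback
and capture signals uses up one sample of the filter.
\end_layout

```c
#include <assert.h>

/* Simplified model: the part of the filter (in samples) that is left to
   model the echo once the playback-to-capture delay has been absorbed.
   Returns 0 when the delay is at least as long as the filter, in which
   case no echo can be cancelled. */
int usable_filter_length(int filter_length, int delay)
{
    assert(filter_length > 0 && delay >= 0);
    return delay >= filter_length ? 0 : filter_length - delay;
}
```

\begin_layout Standard
For an 800-sample filter, a delay of 160 samples leaves 640 samples of useful
filter, while a delay of 800 samples or more leaves none.
\end_layout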
\begin_layout Standard
Also useful is reading
\emph on
Echo Cancellation Demystified
\emph default
by Alexey Frunze
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
http://www.embeddedstar.com/articles/2003/7/article20030720-1.html
\end_layout
\end_inset
, which explains the fundamental principles of echo cancellation.
The details of the algorithm described in the article are different, but
the general ideas of echo cancellation through adaptive filters are the
same.
\end_layout
\begin_layout Standard
As of version 1.2beta2, a new
\family typewriter
echo_diagnostic.m
\family default
tool is included in the source distribution.
The first step is to define DUMP_ECHO_CANCEL_DATA during the build.
This causes the echo canceller to automatically save the near-end, far-end
and output signals to files (aec_rec.sw, aec_play.sw and aec_out.sw).
These are exactly what the AEC receives and outputs.
From there, it is necessary to start Octave and type:
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=Matlab"
inline false
status open
\begin_layout Plain Layout
echo_diagnostic('aec_rec.sw', 'aec_play.sw', 'aec_diagnostic.sw', 1024);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The value of 1024 is the filter length and can be changed.
There will be some (hopefully) useful messages printed and echo cancelled
audio will be saved to aec_diagnostic.sw.
If even that output is bad (almost no cancellation) then there is probably
a problem with the playback or recording process.
\end_layout
\begin_layout Section
Jitter Buffer
\end_layout
\begin_layout Standard
The jitter buffer can be enabled by including:
\begin_inset listings
lstparams "breaklines=true"
inline false
status open
\begin_layout Plain Layout
#include <speex/speex_jitter.h>
\end_layout
\end_inset
and a new jitter buffer state can be initialised by:
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "breaklines=true"
inline false
status open
\begin_layout Plain Layout
JitterBuffer *state = jitter_buffer_init(step);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
where the
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
step
\end_layout
\end_inset
argument is the default time step (in timestamp units) used for adjusting
the delay and doing concealment.
A value of 1 is always correct, but higher values may be more convenient
sometimes.
For example, if you are only able to do concealment on 20ms frames, there
is no point in the jitter buffer asking you to do it on one sample.
Another example is that for video, it makes no sense to adjust the delay
by less than a full frame.
The value provided can always be changed at a later time.
\end_layout
\begin_layout Standard
The jitter buffer API is based on the
\begin_inset listings
inline true
status open
\begin_layout Plain Layout
JitterBufferPacket
\end_layout
\end_inset
type, which is defined as:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
typedef struct {
\end_layout
\begin_layout Plain Layout
char *data; /* Data bytes contained in the packet */
\end_layout
\begin_layout Plain Layout
spx_uint32_t len; /* Length of the packet in bytes */
\end_layout
\begin_layout Plain Layout
spx_uint32_t timestamp; /* Timestamp for the packet */
\end_layout
\begin_layout Plain Layout
spx_uint32_t span; /* Time covered by the packet (timestamp units)
*/
\end_layout
\begin_layout Plain Layout
} JitterBufferPacket;
\end_layout
\end_inset
\end_layout
\begin_layout Standard
As an example, for audio the timestamp field would be what is obtained from
the RTP timestamp field and the span would be the number of samples that
are encoded in the packet.
For Speex narrowband, span would be 160 if only one frame is included in
the packet.
\end_layout
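\begin_layout Standard
As a small illustration of the span field, the sketch below computes it
for a packet carrying one or more Speex narrowband frames, using the 160
samples per frame stated above.
The helper narrowband_span() is hypothetical and is not part of the jitter
buffer API.
\end_layout

```c
#include <assert.h>

#define NB_FRAME_SAMPLES 160  /* one Speex narrowband frame, as stated above */

/* Span (in timestamp units, here samples) covered by a packet that
   holds nb_frames narrowband frames. */
unsigned int narrowband_span(unsigned int nb_frames)
{
    assert(nb_frames > 0);
    return nb_frames * NB_FRAME_SAMPLES;
}
```

\begin_layout Standard
The result is the value that would be stored in the span field of a JitterBufferPacket
before the packet is handed to the jitter buffer.
\end_layout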
\begin_layout Standard
When a packet arrives, it needs to be inserted into the jitter buffer by:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
JitterBufferPacket packet;
\end_layout
\begin_layout Plain Layout
/* Fill in each field in the packet struct */
\end_layout
\begin_layout Plain Layout
jitter_buffer_put(state, &packet);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
When the decoder is ready to decode a packet, the packet to be decoded can
be obtained by:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
spx_int32_t start_offset;
\end_layout
\begin_layout Plain Layout
err = jitter_buffer_get(state, &packet, desired_span, &start_offset);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
If
\begin_inset listings
inline true
status open
\begin_layout Plain Layout
jitter_buffer_put()
\end_layout
\end_inset
and
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
jitter_buffer_get()
\end_layout
\end_inset
are called from different threads, then
\series bold
you need to protect the jitter buffer state with a mutex
\series default
.
\end_layout
\begin_layout Standard
Because the jitter buffer is designed not to use an explicit timer, it needs
to be told about the time explicitly.
This is done by calling:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
jitter_buffer_tick(state);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
This needs to be done periodically in the playing thread.
This will be the last jitter buffer call before going to sleep (until more
data is played back).
In some cases, it may be preferable to use
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
jitter_buffer_remaining_span(state, remaining);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The second argument is used to specify that we are still holding data that
has not been written to the playback device.
For instance, if 256 samples were needed by the soundcard (specified by
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
desired_span
\end_layout
\end_inset
), but
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
jitter_buffer_get()
\end_layout
\end_inset
returned 320 samples, we would have
\begin_inset listings
inline true
status open
\begin_layout Plain Layout
remaining=64
\end_layout
\end_inset
.
\end_layout
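\begin_layout Standard
The value of the second argument is just the difference between what was
obtained and what was played.
A minimal sketch, using a hypothetical helper remaining_samples() that is
not part of the jitter buffer API:
\end_layout

```c
#include <assert.h>

/* Samples obtained from the jitter buffer that have not yet been written
   to the playback device. Returns 0 when nothing is left over. */
int remaining_samples(int returned_span, int desired_span)
{
    assert(returned_span >= 0 && desired_span >= 0);
    return returned_span > desired_span ? returned_span - desired_span : 0;
}
```

\begin_layout Standard
With the numbers from the example above (320 samples returned, 256 needed),
the helper yields 64, which is what would be passed as the second argument
of jitter_buffer_remaining_span().
\end_layout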
\begin_layout Section
Resampler
\end_layout
\begin_layout Standard
Speex includes a resampling module.
To make use of the resampler, it is necessary to include its header file:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
#include <speex/speex_resampler.h>
\end_layout
\end_inset
\end_layout
\begin_layout Standard
For each stream that is to be resampled, it is necessary to create a resampler
state with:
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
SpeexResamplerState *resampler;
\end_layout
\begin_layout Plain Layout
resampler = speex_resampler_init(nb_channels, input_rate, output_rate, quality,
&err);
\end_layout
\end_inset
\end_layout
\begin_layout Standard
where
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
nb_channels
\end_layout
\end_inset
is the number of channels that will be used (either interleaved or non-interlea
ved),
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
input_rate
\end_layout
\end_inset
is the sampling rate of the input stream,
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
output_rate
\end_layout
\end_inset
is the sampling rate of the output stream and
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
quality
\end_layout
\end_inset
is the requested quality setting (0 to 10).
The quality parameter is useful for controlling the quality/complexity/latency
tradeoff.
Using a higher quality setting means less noise/aliasing, a higher complexity
and a higher latency.
Usually, a quality of 3 is acceptable for most desktop uses and quality
10 is mostly recommended for pro audio work.
Quality 0 usually has a decent sound (certainly better than using linear
interpolation resampling), but artifacts may be heard.
\end_layout
\begin_layout Standard
The actual resampling is performed using
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
err = speex_resampler_process_int(resampler, channelID, in, &in_length,
out, &out_length);
\end_layout
\end_inset
where
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
channelID
\end_layout
\end_inset
is the ID of the channel to be processed.
For a mono stream, use 0.
The
\emph on
in
\emph default
pointer points to the first sample of the input buffer for the selected
channel and
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
out
\end_layout
\end_inset
points to the first sample of the output.
The sizes of the input and output buffers are specified by
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
in_length
\end_layout
\end_inset
and
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
out_length
\end_layout
\end_inset
respectively.
Upon completion, these values are replaced by the number of samples read
and written by the resampler.
Unless an error occurs, either all input samples will be read or all output
samples will be written to (or both).
For floating-point samples, the function
\begin_inset listings
inline true
status open
\begin_layout Plain Layout
speex_resampler_process_float()
\end_layout
\end_inset
behaves similarly.
\end_layout
\begin_layout Standard
It is also possible to process multiple channels at once.
To do that, you can use
\begin_inset listings
inline true
status open
\begin_layout Plain Layout
speex_resampler_process_interleaved_int()
\end_layout
\end_inset
or
\begin_inset listings
inline true
status open
\begin_layout Plain Layout
speex_resampler_process_interleaved_float()
\end_layout
\end_inset
.
The arguments are the same except that there is no
\begin_inset listings
inline true
status collapsed
\begin_layout Plain Layout
channelID
\end_layout
\end_inset
argument.
Note that the
\series bold
length parameters are per-channel
\series default
.
So if you have 1024 samples for each of 4 channels, you pass 1024 and not
4096.
\end_layout
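\begin_layout Standard
As a rough worked example (the rates and block sizes here are illustrative assumptions, not values required by the library), consider converting 4 interleaved channels from 44.1 kHz to 48 kHz with 1024 input samples per channel.
The number of output samples produced per channel is approximately
\begin_inset Formula \[
1024\times\frac{48000}{44100}\approx1114.6\ ,\]
\end_inset
so an output capacity of 1115 samples per channel should normally be enough for the whole input block to be consumed, although the exact count can vary slightly from call to call because of the filter state.
The interleaved buffers then hold
\begin_inset Formula $4\times1024=4096$
\end_inset
and
\begin_inset Formula $4\times1115=4460$
\end_inset
values, while the lengths passed to the function remain 1024 and 1115.
\end_layout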
\begin_layout Standard
The resampler allows changing the quality and input/output sampling frequencies
on the fly without glitches.
This can be done with calls such as
\begin_inset listings
inline true
status open
\begin_layout Plain Layout
speex_resampler_set_quality()
\end_layout
\end_inset
and
\begin_inset listings
inline true
status open
\begin_layout Plain Layout
speex_resampler_set_rate()
\end_layout
\end_inset
.
The only side effect is that the filter has to be recomputed, which consumes
many CPU cycles.
\end_layout
\begin_layout Standard
When resampling a file, it is often desirable to have the output file perfectly
synchronised with the input.
To do that, you need to call
\begin_inset listings
inline true
status open
\begin_layout Plain Layout
speex_resampler_skip_zeros()
\end_layout
\end_inset
\series bold
before
\series default
you start processing any samples.
For real-time applications (e.g.
VoIP), this is not recommended because the first processed frame will
be shorter to compensate for the delay (the skipped zeros).
To destroy a resampler state, just call
\begin_inset listings
inline true
status open
\begin_layout Plain Layout
speex_resampler_destroy()
\end_layout
\end_inset
.
\end_layout
\begin_layout Section
Ring Buffer
\end_layout
\begin_layout Standard
In some cases, it is necessary to interface components that use different
block sizes.
For example, it is possible that the soundcard does not support reading/writing
in blocks of 20
\begin_inset space ~
\end_inset
ms, or complicated resampling ratios may mean that the blocks don't
always have the same duration.
In those cases, it is often necessary to buffer a bit of audio using a
ring buffer.
\end_layout
\begin_layout Standard
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Chapter
Formats and standards
\begin_inset Index
status collapsed
\begin_layout Plain Layout
standards
\end_layout
\end_inset
\begin_inset CommandInset label
LatexCommand label
name "sec:Formats-and-standards"
\end_inset
\end_layout
\begin_layout Standard
Speex can encode speech in both narrowband and wideband and provides different
bit-rates.
However, not all features need to be supported by a certain implementation
or device.
In order to be called
\begin_inset Quotes eld
\end_inset
Speex compatible
\begin_inset Quotes erd
\end_inset
(whatever that means), an implementation must implement at least a basic
set of features.
\end_layout
\begin_layout Standard
At the minimum, all narrowband modes of operation MUST be supported at the
decoder.
This includes the decoding of a wideband bit-stream by the narrowband decoder
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
The wideband bit-stream contains an embedded narrowband bit-stream which
can be decoded alone
\end_layout
\end_inset
.
If present, a wideband decoder MUST be able to decode a narrowband stream,
and MAY either be able to decode all wideband modes or be able to decode
the embedded narrowband part of all modes (which includes ignoring the
high-band bits).
\end_layout
\begin_layout Standard
For encoders, at least one narrowband or wideband mode MUST be supported.
The main reason why all encoding modes do not have to be supported is that
some platforms may not be able to handle the complexity of encoding in
some modes.
\end_layout
\begin_layout Section
RTP
\begin_inset Index
status collapsed
\begin_layout Plain Layout
RTP
\end_layout
\end_inset
Payload Format
\end_layout
\begin_layout Standard
The RTP payload draft is included in appendix
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:IETF-draft"
\end_inset
and the latest version is available at
\begin_inset Flex URL
status collapsed
\begin_layout Plain Layout
http://www.speex.org/drafts/latest
\end_layout
\end_inset
.
This draft has been sent (2003/02/26) to the Internet Engineering Task
Force (IETF) and will be discussed at the March 18th meeting in San Francisco.
\end_layout
\begin_layout Section
MIME Type
\end_layout
\begin_layout Standard
For now, you should use the MIME type audio/x-speex for Speex-in-Ogg.
We will apply for type
\family typewriter
audio/speex
\family default
in the near future.
\end_layout
\begin_layout Section
Ogg
\begin_inset Index
status collapsed
\begin_layout Plain Layout
Ogg
\end_layout
\end_inset
file format
\end_layout
\begin_layout Standard
Speex bit-streams can be stored in Ogg files.
In this case, the first packet of the Ogg file contains the Speex header
described in table
\begin_inset CommandInset ref
LatexCommand ref
reference "cap:ogg_speex_header"
\end_inset
.
All integer fields in the headers are stored as little-endian.
The
\family typewriter
speex_string
\family default
field must contain the
\begin_inset Quotes eld
\end_inset
\family typewriter
Speex
\family default
\begin_inset space ~
\end_inset
\begin_inset space ~
\end_inset
\begin_inset space ~
\end_inset
\begin_inset Quotes erd
\end_inset
(with 3 trailing spaces), which identifies the bit-stream.
The next field,
\family typewriter
speex_version
\family default
, contains the version of Speex that encoded the file.
For now, refer to speex_header.[ch] for more info.
The
\emph on
beginning of stream
\emph default
(
\family typewriter
b_o_s
\family default
) flag is set to 1 for the header.
The header packet has
\family typewriter
packetno=0
\family default
and
\family typewriter
granulepos=0
\family default
.
\end_layout
\begin_layout Standard
The second packet contains the Speex comment header.
The format used is the Vorbis comment format described here: http://www.xiph.org/
ogg/vorbis/doc/v-comment.html .
This packet has
\family typewriter
packetno=1
\family default
and
\family typewriter
granulepos=0
\family default
.
\end_layout
\begin_layout Standard
The third and subsequent packets each contain one or more (number found
in header) Speex frames.
These are identified with
\family typewriter
packetno
\family default
starting from 2 and the
\family typewriter
granulepos
\family default
is the number of the last sample encoded in that packet.
The last of these packets has the
\emph on
end of stream
\emph default
(
\family typewriter
e_o_s
\family default
) flag set to 1.
\end_layout
\begin_layout Standard
\begin_inset Float table
placement htbp
wide true
sideways false
status open
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
begin{center}
\end_layout
\end_inset
\begin_inset Tabular
<lyxtabular version="3" rows="16" columns="3">
<features>
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Field
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Type
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Size
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
speex_string
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
char[]
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
8
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
speex_version
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
char[]
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
20
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
speex_version_id
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
int
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
header_size
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
int
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
rate
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
int
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
mode
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
int
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
mode_bitstream_version
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
int
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
nb_channels
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
int
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
bitrate
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
int
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
frame_size
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
int
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
vbr
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
int
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
frames_per_packet
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
int
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
extra_headers
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
int
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
reserved1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
int
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
reserved2
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
int
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
end{center}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
Ogg/Speex header packet
\begin_inset CommandInset label
LatexCommand label
name "cap:ogg_speex_header"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
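\begin_layout Standard
As a quick consistency check on table
\begin_inset CommandInset ref
LatexCommand ref
reference "cap:ogg_speex_header"
\end_inset
, the sizes of the 2 character fields and the 13 integer fields add up to
\begin_inset Formula \[
8+20+13\times4=80\ \mathrm{bytes}\ .\]
\end_inset
Assuming the fields are stored back to back in the listed order with no padding, the byte offset of a field is the sum of the sizes that precede it.
For instance,
\family typewriter
rate
\family default
would start at offset
\begin_inset Formula $8+20+4+4=36$
\end_inset
.
\end_layout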
\begin_layout Standard
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
clearpage
\end_layout
\end_inset
\end_layout
\begin_layout Chapter
Introduction to CELP Coding
\begin_inset Index
status collapsed
\begin_layout Plain Layout
CELP
\end_layout
\end_inset
\begin_inset CommandInset label
LatexCommand label
name "sec:Introduction-to-CELP"
\end_inset
\end_layout
\begin_layout Quote
\align center
\emph on
Do not meddle in the affairs of poles, for they are subtle and quick to
leave the unit circle.
\end_layout
\begin_layout Standard
Speex is based on CELP, which stands for Code Excited Linear Prediction.
This section attempts to introduce the principles behind CELP, so if you
are already familiar with CELP, you can safely skip to section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Speex-narrowband-mode"
\end_inset
.
The CELP technique is based on three ideas:
\end_layout
\begin_layout Enumerate
The use of a linear prediction (LP) model to model the vocal tract
\end_layout
\begin_layout Enumerate
The use of (adaptive and fixed) codebook entries as input (excitation) of
the LP model
\end_layout
\begin_layout Enumerate
The search performed in closed-loop in a
\begin_inset Quotes eld
\end_inset
perceptually weighted domain
\begin_inset Quotes erd
\end_inset
\end_layout
\begin_layout Standard
This chapter describes the basic ideas behind CELP and is still a work in
progress.
\end_layout
\begin_layout Section
Source-Filter Model of Speech Production
\end_layout
\begin_layout Standard
The source-filter model of speech production assumes that the vocal cords
are the source of spectrally flat sound (the excitation signal), and that
the vocal tract acts as a filter to spectrally shape the various sounds
of speech.
While still an approximation, the model is widely used in speech coding
because of its simplicity.
Its use is also the reason why most speech codecs
(Speex included) perform badly on music signals.
The different phonemes can be distinguished by their excitation (source)
and spectral shape (filter).
Voiced sounds (e.g.
vowels) have an excitation signal that is periodic and that can be approximated
by an impulse train in the time domain or by regularly-spaced harmonics
in the frequency domain.
On the other hand, fricatives (such as the "s", "sh" and "f" sounds) have
an excitation signal that is similar to white Gaussian noise.
So-called voiced fricatives (such as "z" and "v") have an excitation signal
composed of a harmonic part and a noisy part.
\end_layout
\begin_layout Standard
The source-filter model is usually tied to the use of linear prediction.
The CELP model is based on the source-filter model, as can be seen from the
CELP decoder illustrated in Figure
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:The-CELP-model"
\end_inset
.
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
begin{center}
\end_layout
\end_inset
\begin_inset Graphics
filename celp_decoder.eps
width 45page%
keepAspectRatio
\end_inset
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
end{center}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
The CELP model of speech synthesis (decoder)
\begin_inset CommandInset label
LatexCommand label
name "fig:The-CELP-model"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Section
Linear Prediction Coefficients (LPC)
\begin_inset Index
status collapsed
\begin_layout Plain Layout
linear prediction
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Linear prediction is at the base of many speech coding techniques, including
CELP.
The idea behind it is to predict the signal
\begin_inset Formula $x[n]$
\end_inset
using a linear combination of its past samples:
\end_layout
\begin_layout Standard
\begin_inset Formula \[
y[n]=\sum_{i=1}^{N}a_{i}x[n-i]\]
\end_inset
where
\begin_inset Formula $y[n]$
\end_inset
is the linear prediction of
\begin_inset Formula $x[n]$
\end_inset
.
The prediction error is thus given by:
\begin_inset Formula \[
e[n]=x[n]-y[n]=x[n]-\sum_{i=1}^{N}a_{i}x[n-i]\]
\end_inset
\end_layout
\begin_layout Standard
The goal of the LPC analysis is to find the best prediction coefficients
\begin_inset Formula $a_{i}$
\end_inset
which minimize the quadratic error function:
\begin_inset Formula \[
E=\sum_{n=0}^{L-1}\left[e[n]\right]^{2}=\sum_{n=0}^{L-1}\left[x[n]-\sum_{i=1}^{N}a_{i}x[n-i]\right]^{2}\]
\end_inset
That can be done by making all derivatives
\begin_inset Formula $\frac{\partial E}{\partial a_{k}}$
\end_inset
(for
\begin_inset Formula $k=1,\ldots,N$
\end_inset
) equal to zero:
\begin_inset Formula \[
\frac{\partial E}{\partial a_{k}}=\frac{\partial}{\partial a_{k}}\sum_{n=0}^{L-1}\left[x[n]-\sum_{i=1}^{N}a_{i}x[n-i]\right]^{2}=0\]
\end_inset
\end_layout
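\begin_layout Standard
To make the intermediate step explicit (a sketch using the autocorrelation method, writing
\begin_inset Formula $k$
\end_inset
for the index of the coefficient being differentiated and assuming the windowed signal is zero outside the analysis frame), setting the derivative with respect to
\begin_inset Formula $a_{k}$
\end_inset
to zero gives
\begin_inset Formula \[
\sum_{n}x[n]x[n-k]=\sum_{i=1}^{N}a_{i}\sum_{n}x[n-i]x[n-k]\ ,\qquad k=1,\ldots,N\ .\]
\end_inset
Writing
\begin_inset Formula $R(m)$
\end_inset
for the auto-correlation of
\begin_inset Formula $x[n]$
\end_inset
at lag
\begin_inset Formula $m$
\end_inset
, this reads
\begin_inset Formula $R(k)=\sum_{i=1}^{N}a_{i}R(|i-k|)$
\end_inset
, which is one row of a linear system in the coefficients.
\end_layout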
\begin_layout Standard
For an order
\begin_inset Formula $N$
\end_inset
filter, the filter coefficients
\begin_inset Formula $a_{i}$
\end_inset
are found by solving the
\begin_inset Formula $N\times N$
\end_inset
linear system
\begin_inset Formula $\mathbf{Ra}=\mathbf{r}$
\end_inset
, where
\begin_inset Formula \[
\mathbf{R}=\left[\begin{array}{cccc}
R(0) & R(1) & \cdots & R(N-1)\\
R(1) & R(0) & \cdots & R(N-2)\\
\vdots & \vdots & \ddots & \vdots\\
R(N-1) & R(N-2) & \cdots & R(0)\end{array}\right]\]
\end_inset
\begin_inset Formula \[
\mathbf{r}=\left[\begin{array}{c}
R(1)\\
R(2)\\
\vdots\\
R(N)\end{array}\right]\]
\end_inset
with
\begin_inset Formula $R(m)$
\end_inset
, the auto-correlation
\begin_inset Index
status collapsed
\begin_layout Plain Layout
auto-correlation
\end_layout
\end_inset
of the signal
\begin_inset Formula $x[n]$
\end_inset
, computed as:
\end_layout
\begin_layout Standard
\begin_inset Formula \[
R(m)=\sum_{i=0}^{L-1}x[i]x[i-m]\]
\end_inset
\end_layout
\begin_layout Standard
Because
\begin_inset Formula $\mathbf{R}$
\end_inset
is Hermitian Toeplitz, the Levinson-Durbin
\begin_inset Index
status collapsed
\begin_layout Plain Layout
Levinson-Durbin
\end_layout
\end_inset
algorithm can be used, making the solution to the problem
\begin_inset Formula $\mathcal{O}\left(N^{2}\right)$
\end_inset
instead of
\begin_inset Formula $\mathcal{O}\left(N^{3}\right)$
\end_inset
.
Also, it can be proven that all the roots of the prediction error filter
\begin_inset Formula $A(z)=1-\sum_{i=1}^{N}a_{i}z^{-i}$
\end_inset
are within the unit circle, which means that
\begin_inset Formula $1/A(z)$
\end_inset
is always stable.
This holds in theory; in practice, because of finite precision, two techniques
are commonly used to make sure the filter is stable.
First, we multiply
\begin_inset Formula $R(0)$
\end_inset
by a number slightly above one (such as 1.0001), which is equivalent to
adding noise to the signal.
Also, we can apply a window to the auto-correlation, which is equivalent
to filtering in the frequency domain, reducing sharp resonances.
\end_layout
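\begin_layout Standard
As an illustration with made-up values (not taken from real speech), take
\begin_inset Formula $N=2$
\end_inset
with
\begin_inset Formula $R(0)=1$
\end_inset
,
\begin_inset Formula $R(1)=0.5$
\end_inset
and
\begin_inset Formula $R(2)=0.1$
\end_inset
.
In one common sign convention for the reflection coefficients
\begin_inset Formula $k_{i}$
\end_inset
, the Levinson-Durbin recursion then proceeds as
\begin_inset Formula \begin{eqnarray*}
k_{1} & = & R(1)/R(0)=0.5\\
E_{1} & = & R(0)\left(1-k_{1}^{2}\right)=0.75\\
k_{2} & = & \left(R(2)-k_{1}R(1)\right)/E_{1}=-0.2\\
a_{2} & = & k_{2}=-0.2\\
a_{1} & = & k_{1}\left(1-k_{2}\right)=0.6\end{eqnarray*}
\end_inset
Substituting back confirms both rows of the system:
\begin_inset Formula $R(0)a_{1}+R(1)a_{2}=0.5=R(1)$
\end_inset
and
\begin_inset Formula $R(1)a_{1}+R(0)a_{2}=0.1=R(2)$
\end_inset
.
The roots of
\begin_inset Formula $z^{2}-0.6z+0.2$
\end_inset
have magnitude
\begin_inset Formula $\sqrt{0.2}\approx0.447$
\end_inset
, so they lie inside the unit circle as expected.
\end_layout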
\begin_layout Section
Pitch Prediction
\begin_inset Index
status collapsed
\begin_layout Plain Layout
pitch
\end_layout
\end_inset
\end_layout
\begin_layout Standard
During voiced segments, the speech signal is periodic, so it is possible
to take advantage of that property by approximating the excitation signal
\begin_inset Formula $e[n]$
\end_inset
by a gain times the past of the excitation:
\end_layout
\begin_layout Standard
\begin_inset Formula \[
e[n]\simeq p[n]=\beta e[n-T]\ ,\]
\end_inset
where
\begin_inset Formula $T$
\end_inset
is the pitch period and
\begin_inset Formula $\beta$
\end_inset
is the pitch gain.
We call that long-term prediction since the excitation is predicted from
\begin_inset Formula $e[n-T]$
\end_inset
with
\begin_inset Formula $T\gg N$
\end_inset
.
\end_layout
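\begin_layout Standard
For a sense of scale (assuming 8 kHz sampling and a linear prediction order of
\begin_inset Formula $N=10$
\end_inset
, both stated here only for illustration), a voice with a fundamental frequency of 100 Hz has a pitch period of
\begin_inset Formula \[
T=\frac{8000}{100}=80\ \mathrm{samples}\ ,\]
\end_inset
which is well beyond the 10 past samples covered by the short-term predictor.
\end_layout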
\begin_layout Section
Innovation Codebook
\end_layout
\begin_layout Standard
The final excitation
\begin_inset Formula $e[n]$
\end_inset
will be the sum of the pitch prediction and an
\emph on
innovation
\emph default
signal
\begin_inset Formula $c[n]$
\end_inset
taken from a fixed codebook, hence the name
\emph on
Code
\emph default
Excited Linear Prediction.
The final excitation is given by
\end_layout
\begin_layout Standard
\begin_inset Formula \[
e[n]=p[n]+c[n]=\beta e[n-T]+c[n]\ .\]
\end_inset
The quantization of
\begin_inset Formula $c[n]$
\end_inset
is where most of the bits in a CELP codec are allocated.
It represents the information that couldn't be obtained either from linear
prediction or pitch prediction.
In the
\emph on
z
\emph default
-domain we can represent the final signal
\begin_inset Formula $X(z)$
\end_inset
as
\begin_inset Formula \[
X(z)=\frac{C(z)}{A(z)\left(1-\beta z^{-T}\right)}\]
\end_inset
\end_layout
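\begin_layout Standard
This expression can be obtained in two steps, taking
\begin_inset Formula $A(z)=1-\sum_{i=1}^{N}a_{i}z^{-i}$
\end_inset
as the prediction error filter implied by the definition of
\begin_inset Formula $e[n]$
\end_inset
.
The prediction error equation gives
\begin_inset Formula $E(z)=A(z)X(z)$
\end_inset
, and the excitation equation gives
\begin_inset Formula $E(z)=\beta z^{-T}E(z)+C(z)$
\end_inset
, so that
\begin_inset Formula \[
E(z)=\frac{C(z)}{1-\beta z^{-T}}\ ,\qquad X(z)=\frac{E(z)}{A(z)}=\frac{C(z)}{A(z)\left(1-\beta z^{-T}\right)}\ .\]
\end_inset
\end_layout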
\begin_layout Section
Noise Weighting
\begin_inset Index
status collapsed
\begin_layout Plain Layout
error weighting
\end_layout
\end_inset
\begin_inset Index
status collapsed
\begin_layout Plain Layout
analysis-by-synthesis
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Most (if not all) modern audio codecs attempt to
\begin_inset Quotes eld
\end_inset
shape
\begin_inset Quotes erd
\end_inset
the noise so that it appears mostly in the frequency regions where the
ear cannot detect it.
For example, the ear is more tolerant to noise in parts of the spectrum
that are louder and
\emph on
vice versa
\emph default
.
In order to maximize speech quality, CELP codecs minimize the mean square
of the error (noise) in the perceptually weighted domain.
This means that a perceptual noise weighting filter
\begin_inset Formula $W(z)$
\end_inset
is applied to the error signal in the encoder.
In most CELP codecs,
\begin_inset Formula $W(z)$
\end_inset
is a pole-zero weighting filter derived from the linear prediction coefficients
(LPC), generally using bandwidth expansion.
With the spectral envelope represented by the synthesis filter
\begin_inset Formula $1/A(z)$
\end_inset
, CELP codecs typically derive the noise weighting filter as
\begin_inset Formula \begin{equation}
W(z)=\frac{A(z/\gamma_{1})}{A(z/\gamma_{2})}\ ,\label{eq:gamma-weighting}\end{equation}
\end_inset
where
\begin_inset Formula $\gamma_{1}=0.9$
\end_inset
and
\begin_inset Formula $\gamma_{2}=0.6$
\end_inset
in the Speex reference implementation.
If a filter
\begin_inset Formula $A(z)$
\end_inset
has (complex) poles at
\begin_inset Formula $p_{i}$
\end_inset
in the
\begin_inset Formula $z$
\end_inset
-plane, the filter
\begin_inset Formula $A(z/\gamma)$
\end_inset
will have its poles at
\begin_inset Formula $p'_{i}=\gamma p_{i}$
\end_inset
, making it a flatter version of
\begin_inset Formula $A(z)$
\end_inset
.
\end_layout
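\begin_layout Standard
In terms of the filter coefficients (a short derivation, taking
\begin_inset Formula $A(z)=1-\sum_{i=1}^{N}a_{i}z^{-i}$
\end_inset
as the assumed form of the prediction error filter), replacing
\begin_inset Formula $z$
\end_inset
by
\begin_inset Formula $z/\gamma$
\end_inset
gives
\begin_inset Formula \[
A(z/\gamma)=1-\sum_{i=1}^{N}a_{i}\gamma^{i}z^{-i}\ ,\]
\end_inset
so each coefficient
\begin_inset Formula $a_{i}$
\end_inset
is simply scaled by
\begin_inset Formula $\gamma^{i}$
\end_inset
.
With
\begin_inset Formula $\gamma=0.9$
\end_inset
, for instance, the first three scale factors are 0.9, 0.81 and 0.729.
\end_layout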
\begin_layout Standard
The weighting filter is applied to the error signal used to optimize the
codebook search through analysis-by-synthesis (AbS).
This results in a spectral shape of the noise that tends towards
\begin_inset Formula $1/W(z)$
\end_inset
.
While the simplicity of the model has been an important reason for the
success of CELP, it remains that
\begin_inset Formula $W(z)$
\end_inset
is a very rough approximation for the perceptually optimal noise weighting
function.
Fig.
\begin_inset CommandInset ref
LatexCommand ref
reference "cap:Standard-noise-shaping"
\end_inset
illustrates the noise shaping that results from Eq.
\begin_inset CommandInset ref
LatexCommand ref
reference "eq:gamma-weighting"
\end_inset
.
Throughout this manual, we refer to
\begin_inset Formula $W(z)$
\end_inset
as the noise weighting filter and to
\begin_inset Formula $1/W(z)$
\end_inset
as the noise shaping filter (or curve).
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
begin{center}
\end_layout
\end_inset
\begin_inset Graphics
filename ref_shaping.eps
width 45page%
keepAspectRatio
\end_inset
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
end{center}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
Standard noise shaping in CELP.
Arbitrary y-axis offset.
\begin_inset CommandInset label
LatexCommand label
name "cap:Standard-noise-shaping"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Section
Analysis-by-Synthesis
\end_layout
\begin_layout Standard
One of the main principles behind CELP is called Analysis-by-Synthesis (AbS),
meaning that the encoding (analysis) is performed by perceptually optimising
the decoded (synthesis) signal in a closed loop.
In theory, the best CELP stream would be produced by trying all possible
bit combinations and selecting the one that produces the best-sounding
decoded signal.
This is obviously not possible in practice for two reasons: the required
complexity is beyond any currently available hardware and the
\begin_inset Quotes eld
\end_inset
best sounding
\begin_inset Quotes erd
\end_inset
selection criterion implies a human listener.
\end_layout
\begin_layout Standard
In order to achieve real-time encoding using limited computing resources,
the CELP optimisation is broken down into smaller, more manageable, sequential
searches using the perceptual weighting function described earlier.
\end_layout
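\begin_layout Standard
One generic way to write the criterion minimized by each of these searches (a sketch only; the symbols
\begin_inset Formula $\hat{x}[n]$
\end_inset
for the candidate decoded signal and
\begin_inset Formula $w[n]$
\end_inset
for the impulse response of
\begin_inset Formula $W(z)$
\end_inset
are introduced here for illustration) is
\begin_inset Formula \[
E_{w}=\sum_{n}\left(\left(w*\left(x-\hat{x}\right)\right)[n]\right)^{2}\ ,\]
\end_inset
where
\begin_inset Formula $*$
\end_inset
denotes convolution.
At each stage, the candidate parameters giving the smallest
\begin_inset Formula $E_{w}$
\end_inset
are kept before moving on to the next search.
\end_layout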
\begin_layout Standard
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Chapter
The Speex Decoder Specification
\end_layout
\begin_layout Section
Narrowband decoder
\end_layout
\begin_layout Standard
<Insert decoder figure here>
\end_layout
\begin_layout Subsection
Narrowband modes
\end_layout
\begin_layout Standard
There are 7 different narrowband bit-rates defined for Speex, ranging from
250 bps to 24.6 kbps, although the modes below 5.9 kbps should not be used
for speech.
The bit-allocation for each mode is detailed in table
\begin_inset CommandInset ref
LatexCommand ref
reference "cap:bits-narrowband"
\end_inset
.
Each frame starts with the mode ID encoded with 4 bits which allows a range
from 0 to 15, though only the first 7 values are used (the others are reserved).
The parameters are listed in the table in the order they are packed in
the bit-stream.
All frame-based parameters are packed before sub-frame parameters.
The parameters for a certain sub-frame are all packed before the following
sub-frame is packed.
The
\begin_inset Quotes eld
\end_inset
OL
\begin_inset Quotes erd
\end_inset
in the parameter description means that the parameter is an open loop estimatio
n based on the whole frame.
\end_layout
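\begin_layout Standard
The quoted bit-rates translate directly into frame sizes.
Assuming 20 ms narrowband frames, that is 50 frames per second, the two extremes correspond to
\begin_inset Formula \[
\frac{250}{50}=5\ \mathrm{bits\ per\ frame}\ ,\qquad\frac{24600}{50}=492\ \mathrm{bits\ per\ frame}\ .\]
\end_inset
\end_layout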
\begin_layout Standard
\begin_inset Float table
placement h
wide true
sideways false
status open
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
begin{center}
\end_layout
\end_inset
\begin_inset Tabular
<lyxtabular version="3" rows="12" columns="11">
<features>
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Parameter
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Update rate
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
2
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
6
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
8
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Wideband bit
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Mode ID
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
LSP
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
18
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
18
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
18
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
18
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
30
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
30
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
30
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
18
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
OL pitch
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
OL pitch gain
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
OL Exc gain
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Fine pitch
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
sub-frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Pitch gain
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
sub-frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Innovation gain
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
sub-frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Innovation VQ
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
sub-frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
16
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
20
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
35
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
48
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
64
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
96
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
10
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Total
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
43
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
119
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
160
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
220
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
300
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
364
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
492
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
79
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
end{center}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
Bit allocation for narrowband modes
\begin_inset CommandInset label
LatexCommand label
name "cap:bits-narrowband"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
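\begin_layout Standard
As a rough cross-check of the table, the sketch below sums the per-field bit counts for each mode and converts the total into a bit-rate.
The field widths are copied from the table; the names NbBitAlloc, nb_frame_bits and nb_bitrate are hypothetical and are not part of libspeex.
It assumes four sub-frames per frame and a frame duration of 20 ms (50 frames per second), which is not stated in this section.
\end_layout

```c
#include <assert.h>

/* Bits per field for one narrowband mode, copied from the table. */
typedef struct {
    int lsp, ol_pitch, ol_pitch_gain, ol_exc_gain;     /* once per frame */
    int fine_pitch, pitch_gain, innov_gain, innov_vq;  /* once per sub-frame */
} NbBitAlloc;

/* One entry per mode ID 0..8 (hypothetical table name). */
static const NbBitAlloc nb_bit_alloc[9] = {
    { 0, 0, 0, 0, 0, 0, 0,  0},
    {18, 7, 4, 5, 0, 0, 1,  0},
    {18, 7, 0, 5, 0, 5, 0, 16},
    {18, 0, 0, 5, 7, 5, 1, 20},
    {18, 0, 0, 5, 7, 5, 1, 35},
    {30, 0, 0, 5, 7, 7, 3, 48},
    {30, 0, 0, 5, 7, 7, 3, 64},
    {30, 0, 0, 5, 7, 7, 3, 96},
    {18, 7, 4, 5, 0, 0, 0, 10}
};

enum {
    NB_SUBFRAMES = 4,          /* sub-frames per frame */
    NB_FRAMES_PER_SECOND = 50  /* assumption: 20 ms frames */
};

/* Total bits in one frame: wideband bit + mode ID + frame fields
   + four copies of the sub-frame fields. */
int nb_frame_bits(int mode)
{
    const NbBitAlloc *a;
    assert(mode >= 0 && mode <= 8);
    a = &nb_bit_alloc[mode];
    return 1 + 4
         + a->lsp + a->ol_pitch + a->ol_pitch_gain + a->ol_exc_gain
         + NB_SUBFRAMES * (a->fine_pitch + a->pitch_gain
                           + a->innov_gain + a->innov_vq);
}

/* Bit-rate in bits per second under the 50 frames/s assumption. */
int nb_bitrate(int mode)
{
    return nb_frame_bits(mode) * NB_FRAMES_PER_SECOND;
}
```

\begin_layout Standard
Under these assumptions, nb_frame_bits() reproduces the Total row of the table for every mode, and nb_bitrate() gives the 250 bps and 24.6 kbps end points quoted above.
\end_layout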
\begin_layout Subsection
LSP decoding
\end_layout
\begin_layout Standard
Depending on the mode, LSP parameters are encoded using either 18 bits or
30 bits.
\end_layout
\begin_layout Standard
Interpolation
\end_layout
\begin_layout Standard
Safe margin
\end_layout
\begin_layout Subsection
Adaptive codebook
\end_layout
\begin_layout Standard
For rates of 8 kbit/s and above, the pitch period is encoded for each sub-frame.
The real period is
\begin_inset Formula $T=p_{i}+17$
\end_inset
where
\begin_inset Formula $p_{i}$
\end_inset
is a value encoded with 7 bits and 17 corresponds to the minimum pitch.
Since a 7-bit value is at most 127, the maximum period is 144.
At 5.95 kbit/s (mode 2), the pitch period is similarly encoded, but only
once for the frame.
Each sub-frame then has a 2-bit offset that is added to the pitch value
of the frame.
In that case, the pitch for each sub-frame is equal to
\begin_inset Formula $T-1+\mathrm{offset}$
\end_inset
.
For rates below 5.95 kbit/s, only the per-frame pitch is used and the pitch
is constant for all sub-frames.
\end_layout
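\begin_layout Standard
The pitch rules above can be sketched in a few lines of C.
This is only an illustration of the two formulas in the text, not the libspeex implementation; the function names nb_pitch_period and nb_subframe_pitch are hypothetical.
\end_layout

```c
#include <assert.h>

enum {
    NB_PITCH_MIN = 17,   /* minimum pitch period */
    NB_PITCH_MAX = 144   /* maximum pitch period (127 + 17) */
};

/* Pitch period from the 7-bit encoded value: T = p_i + 17. */
int nb_pitch_period(unsigned p_i)
{
    assert(p_i <= 127);  /* p_i is a 7-bit value */
    return (int)p_i + NB_PITCH_MIN;
}

/* Sub-frame pitch when a per-frame pitch T is combined with a
   2-bit sub-frame offset: T - 1 + offset. */
int nb_subframe_pitch(int frame_pitch, unsigned offset)
{
    assert(frame_pitch >= NB_PITCH_MIN && frame_pitch <= NB_PITCH_MAX);
    assert(offset <= 3); /* offset is a 2-bit value */
    return frame_pitch - 1 + (int)offset;
}
```

\begin_layout Standard
With a 2-bit offset, the sub-frame pitch can therefore range from one sample below the frame pitch to two samples above it.
\end_layout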
\begin_layout Standard
Speex uses a 3-tap predictor for rates of 5.95 kbit/s and above.
The three gain values are obtained from a 5-bit or a 7-bit codebook, depending
on the mode.
\end_layout
\begin_layout Subsection
Innovation codebook
\end_layout
\begin_layout Standard
The innovation uses a split codebook whose size and entries depend on the bit-rate.
\end_layout
\begin_layout Standard
A 5-bit gain is encoded on a per-frame basis.
\end_layout
\begin_layout Standard
Depending on the mode, the gain is refined at a higher resolution for each sub-frame.
\end_layout
\begin_layout Standard
The innovation sub-vectors are concatenated and the gain is applied.
\end_layout
\begin_layout Subsection
Perceptual enhancement
\end_layout
\begin_layout Standard
Perceptual enhancement is optional and implementation-defined.
\end_layout
\begin_layout Subsection
Bit-stream definition
\end_layout
\begin_layout Standard
This section defines the bit-stream that is transmitted on the wire.
One Speex packet consists of one frame header and four sub-frames:
\end_layout
\begin_layout Standard
\begin_inset Tabular
<lyxtabular version="3" rows="1" columns="5">
<features>
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Frame Header
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Subframe 1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Subframe 2
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Subframe 3
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Subframe 4
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\begin_layout Standard
The frame header has a variable length that depends on the decoding mode and submode.
The narrowband frame header is defined as follows:
\end_layout
\begin_layout Standard
\begin_inset Tabular
<lyxtabular version="3" rows="1" columns="6">
<features>
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
wb-bit
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
modeid
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
LSP
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
OL-pitch
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
OL-pitchgain
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
OL-ExcGain
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\begin_layout Standard
wb-bit: Wideband bit (1 bit); 0 = narrowband, 1 = wideband
\end_layout
\begin_layout Standard
modeid: Mode identifier (4 bits)
\end_layout
\begin_layout Standard
LSP: Line Spectral Pairs (0, 18 or 30 bits)
\end_layout
\begin_layout Standard
OL-pitch: Open Loop Pitch (0 or 7 bits)
\end_layout
\begin_layout Standard
OL-pitchgain: Open Loop Pitch Gain (0 or 4 bits)
\end_layout
\begin_layout Standard
OL-ExcGain: Open Loop Excitation Gain (0 or 5 bits)
\end_layout
\begin_layout Standard
...
\end_layout
\begin_layout Standard
Each subframe is defined as follows:
\end_layout
\begin_layout Standard
\begin_inset Tabular
<lyxtabular version="3" rows="1" columns="4">
<features>
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
FinePitch
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
PitchGain
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
InnovationGain
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Innovation VQ
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\begin_layout Standard
FinePitch: (0 or 7 bits)
\end_layout
\begin_layout Standard
PitchGain: (0, 5, or 7 bits)
\end_layout
\begin_layout Standard
InnovationGain: (0, 1 or 3 bits)
\end_layout
\begin_layout Standard
Innovation VQ: (0 to 96 bits)
\end_layout
\begin_layout Standard
...
\end_layout
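\begin_layout Standard
To illustrate the layout above, the following sketch reads the narrowband frame header fields from a byte buffer.
It is not the libspeex bit-unpacking code: BitReader, read_bits, NbFrameHeader and nb_read_header are hypothetical names, the field widths are taken from the bit-allocation table, and the most-significant-bit-first order within each byte is an assumption that this section does not state.
\end_layout

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal bit reader over a byte buffer (hypothetical helper). */
typedef struct {
    const unsigned char *data;
    size_t size;  /* buffer length in bytes */
    size_t pos;   /* current position in bits */
} BitReader;

/* Reads n bits (0..32), assuming MSB-first order within each byte.
   Bits past the end of the buffer are read as zero. */
uint32_t read_bits(BitReader *br, int n)
{
    uint32_t v = 0;
    assert(n >= 0 && n <= 32);
    while (n-- > 0) {
        size_t byte = br->pos >> 3;
        uint32_t bit = 0;
        if (byte < br->size)
            bit = (br->data[byte] >> (7 - (br->pos & 7))) & 1u;
        v = (v << 1) | bit;
        br->pos++;
    }
    return v;
}

/* Raw (still quantised) header fields. */
typedef struct {
    uint32_t wideband, mode_id;
    uint32_t lsp, ol_pitch, ol_pitch_gain, ol_exc_gain;
} NbFrameHeader;

/* Header field widths per mode ID 0..8, from the bit-allocation table:
   LSP, OL pitch, OL pitch gain, OL excitation gain. */
static const int nb_header_bits[9][4] = {
    { 0, 0, 0, 0}, {18, 7, 4, 5}, {18, 7, 0, 5},
    {18, 0, 0, 5}, {18, 0, 0, 5}, {30, 0, 0, 5},
    {30, 0, 0, 5}, {30, 0, 0, 5}, {18, 7, 4, 5}
};

/* Returns 0 on success, -1 for a wideband frame or a mode ID
   that is not listed in the table. */
int nb_read_header(BitReader *br, NbFrameHeader *h)
{
    const int *w;
    h->wideband = read_bits(br, 1);
    h->mode_id = read_bits(br, 4);
    if (h->wideband != 0 || h->mode_id > 8)
        return -1;
    w = nb_header_bits[h->mode_id];
    h->lsp = read_bits(br, w[0]);
    h->ol_pitch = read_bits(br, w[1]);
    h->ol_pitch_gain = read_bits(br, w[2]);
    h->ol_exc_gain = read_bits(br, w[3]);
    return 0;
}
```

\begin_layout Standard
After the header has been read, the four sub-frames would follow in order, each with the FinePitch, PitchGain, InnovationGain and Innovation VQ widths of the selected mode.
\end_layout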
\begin_layout Subsection
Sample decoder
\end_layout
\begin_layout Standard
This section contains some sample source code, showing how a basic Speex
decoder can be implemented.
The sample decoder supports narrowband submode 3 only and omits advanced
features such as perceptual enhancement and VBR.
\end_layout
\begin_layout Standard
...
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand lstinputlisting
filename "nb_celp.c"
lstparams "caption={Sample Decoder}"
\end_inset
\end_layout
\begin_layout Subsection
Lookup tables
\end_layout
\begin_layout Standard
The Speex decoder includes a set of lookup tables and codebooks, which are
used to convert between values of different domains.
These include:
\end_layout
\begin_layout Standard
- Excitation 10x16 (3200 bps)
\end_layout
\begin_layout Standard
- Excitation 10x32 (4000 bps)
\end_layout
\begin_layout Standard
- Excitation 20x32 (2000 bps)
\end_layout
\begin_layout Standard
- Excitation 5x256 (12800 bps)
\end_layout
\begin_layout Standard
- Excitation 5x64 (9600 bps)
\end_layout
\begin_layout Standard
- Excitation 8x128 (7000 bps)
\end_layout
\begin_layout Standard
- Codebook for 3-tap pitch prediction gain (Normal and Low Bitrate)
\end_layout
\begin_layout Standard
- Codebook for LSPs in narrowband CELP mode
\end_layout
\begin_layout Standard
...
\end_layout
\begin_layout Standard
The exact lookup tables are included here for reference.
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand lstinputlisting
filename "../libspeex/exc_5_64_table.c"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand lstinputlisting
filename "../libspeex/exc_5_256_table.c"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand lstinputlisting
filename "../libspeex/exc_8_128_table.c"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand lstinputlisting
filename "../libspeex/exc_10_16_table.c"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand lstinputlisting
filename "../libspeex/exc_10_32_table.c"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand lstinputlisting
filename "../libspeex/exc_20_32_table.c"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand lstinputlisting
filename "../libspeex/gain_table.c"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand lstinputlisting
filename "../libspeex/gain_table_lbr.c"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand lstinputlisting
filename "../libspeex/lsp_tables_nb.c"
\end_inset
\end_layout
\begin_layout Section
Wideband embedded decoder
\end_layout
\begin_layout Standard
The wideband decoder uses a QMF.
The narrowband (low-band) signal is decoded using the narrowband decoder.
\end_layout
\begin_layout Standard
For the high band, the decoder is similar to the narrowband decoder, with
the main difference being that there is no adaptive codebook.
\end_layout
\begin_layout Standard
The gain is encoded per sub-frame.
\end_layout
\begin_layout Chapter
Speex narrowband mode
\begin_inset CommandInset label
LatexCommand label
name "sec:Speex-narrowband-mode"
\end_inset
\begin_inset Index
status collapsed
\begin_layout Plain Layout
narrowband
\end_layout
\end_inset
\end_layout
\begin_layout Standard
This section looks at how Speex works for narrowband (
\begin_inset Formula $8\:\mathrm{kHz}$
\end_inset
sampling rate) operation.
The frame size for this mode is
\begin_inset Formula $20\:\mathrm{ms}$
\end_inset
, corresponding to 160 samples.
Each frame is also subdivided into 4 sub-frames of 40 samples each.
\end_layout
\begin_layout Standard
Many design decisions were also based on the original goals and assumptions:
\end_layout
\begin_layout Itemize
Minimizing the amount of information extracted from past frames (for robustness
to packet loss)
\end_layout
\begin_layout Itemize
Dynamically-selectable codebooks (LSP, pitch and innovation)
\end_layout
\begin_layout Itemize
Sub-vector fixed (innovation) codebooks
\end_layout
\begin_layout Section
Whole-Frame Analysis
\begin_inset Index
status collapsed
\begin_layout Plain Layout
linear prediction
\end_layout
\end_inset
\end_layout
\begin_layout Standard
In narrowband, Speex frames are 20 ms long (160 samples) and are subdivided
into 4 sub-frames of 5 ms each (40 samples).
For most narrowband bit-rates (8 kbps and above), the only parameters encoded
at the frame level are the Line Spectral Pairs (LSP) and a global excitation
gain
\begin_inset Formula $g_{frame}$
\end_inset
, as shown in Fig.
\begin_inset CommandInset ref
LatexCommand ref
reference "cap:Frame-open-loop-analysis"
\end_inset
.
All other parameters are encoded at the sub-frame level.
\end_layout
\begin_layout Standard
Linear prediction analysis is performed once per frame using an asymmetric
Hamming window centered on the fourth sub-frame.
Because linear prediction coefficients (LPC) are not robust to quantization,
they are first converted to line spectral pairs (LSP)
\begin_inset Index
status collapsed
\begin_layout Plain Layout
line spectral pair
\end_layout
\end_inset
.
The LSPs are considered to be associated with the
\begin_inset Formula $4^{th}$
\end_inset
sub-frame, and the LSPs associated with the first 3 sub-frames are linearly
interpolated using the current and previous LSP coefficients.
The LSP coefficients are then converted back to the LPC filter
\begin_inset Formula $\hat{A}(z)$
\end_inset
.
The non-quantized interpolated filter is denoted
\begin_inset Formula $A(z)$
\end_inset
and can be used for the weighting filter
\begin_inset Formula $W(z)$
\end_inset
because it does not need to be available to the decoder.
\end_layout
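\begin_layout Standard
The interpolation described above can be sketched as follows.
This is a minimal illustration, not the reference implementation: the function
name and the interpolation weight (sub+1)/4 are assumptions, chosen so that
the 4th sub-frame uses the current frame's LSPs unchanged.
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
```c
#include <assert.h>

#define NB_SUBFRAMES 4   /* sub-frames per narrowband frame */

/* Linearly interpolate the LSPs for sub-frame `sub` (0..3).
 * The weight (sub + 1) / 4 is an assumption of this sketch: it gives
 * sub-frame 3 (the 4th) exactly the current frame's LSPs. */
void lsp_interpolate_sketch(const float *prev_lsp, const float *cur_lsp,
                            float *out_lsp, int order, int sub)
{
    assert(sub >= 0 && sub < NB_SUBFRAMES);
    float w = (float)(sub + 1) / NB_SUBFRAMES;
    for (int i = 0; i < order; i++)
        out_lsp[i] = (1.0f - w) * prev_lsp[i] + w * cur_lsp[i];
}
```
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Calling the function with sub = 3 returns the current LSPs, while sub = 0
returns a vector weighted three quarters towards the previous frame.
\end_layout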
\begin_layout Standard
To make Speex more robust to packet loss, no prediction is applied on the
LSP coefficients prior to quantization.
The LSPs are encoded using vector quantization (VQ) with 30 bits for higher
quality modes and 18 bits for lower quality.
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
begin{center}
\end_layout
\end_inset
\begin_inset Graphics
filename speex_analysis.eps
width 35page%
\end_inset
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
end{center}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
Frame open-loop analysis
\begin_inset CommandInset label
LatexCommand label
name "cap:Frame-open-loop-analysis"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Section
Sub-Frame Analysis-by-Synthesis
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
begin{center}
\end_layout
\end_inset
\begin_inset Graphics
filename speex_abs.eps
lyxscale 75
width 40page%
\end_inset
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
end{center}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
Analysis-by-synthesis closed-loop optimization on a sub-frame.
\begin_inset CommandInset label
LatexCommand label
name "cap:Sub-frame-AbS"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The analysis-by-synthesis (AbS) encoder loop is described in Fig.
\begin_inset CommandInset ref
LatexCommand ref
reference "cap:Sub-frame-AbS"
\end_inset
.
There are three main aspects where Speex significantly differs from most
other CELP codecs.
First, while most recent CELP codecs make use of fractional pitch estimation
with a single gain, Speex uses an integer to encode the pitch period, but
uses a 3-tap predictor (3 gains).
The adaptive codebook contribution
\begin_inset Formula $e_{a}[n]$
\end_inset
can thus be expressed as:
\begin_inset Formula \begin{equation}
e_{a}[n]=g_{0}e[n-T-1]+g_{1}e[n-T]+g_{2}e[n-T+1]\label{eq:adaptive-3tap}\end{equation}
\end_inset
where
\begin_inset Formula $g_{0}$
\end_inset
,
\begin_inset Formula $g_{1}$
\end_inset
and
\begin_inset Formula $g_{2}$
\end_inset
are the jointly quantized pitch gains and
\begin_inset Formula $e[n]$
\end_inset
is the codec excitation memory.
It is worth noting that when the pitch is smaller than the sub-frame size,
we repeat the excitation at a period
\begin_inset Formula $T$
\end_inset
.
For example, when
\begin_inset Formula $n-T+1\geq0$
\end_inset
, we use
\begin_inset Formula $n-2T+1$
\end_inset
instead.
In most modes, the pitch period is encoded with 7 bits in the
\begin_inset Formula $\left[17,144\right]$
\end_inset
range and the
\begin_inset Formula $g_{i}$
\end_inset
gains are vector-quantized using 7 bits at higher bit-rates (15
kbps narrowband and above) and 5 bits at lower bit-rates (11 kbps narrowband
and below).
\end_layout
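\begin_layout Standard
As an illustration of the 3-tap predictor, the sketch below computes the
adaptive codebook contribution for one sub-frame.
The function name and the layout of the excitation memory are assumptions
made for this example.
An index that does not yet lie in the past is moved back by whole periods
T until it does, which generalizes the n-2T+1 case mentioned above.
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
```c
#include <assert.h>

#define SUBFRAME_SIZE 40   /* samples per narrowband sub-frame */

/* Compute the 3-tap adaptive codebook contribution for one sub-frame.
 * past[past_len - k] holds e[-k], the excitation k samples before the
 * start of the current sub-frame (this memory layout is an assumption). */
void adaptive_3tap_sketch(const float *past, int past_len, int T,
                          const float g[3], float ea[SUBFRAME_SIZE])
{
    assert(T >= 1 && past_len >= T + 1);
    for (int n = 0; n < SUBFRAME_SIZE; n++) {
        float acc = 0.0f;
        for (int tap = 0; tap < 3; tap++) {
            int idx = n - T - 1 + tap;   /* n-T-1, n-T, n-T+1 */
            while (idx >= 0)             /* not in the past yet: */
                idx -= T;                /* go back one more period */
            acc += g[tap] * past[past_len + idx];
        }
        ea[n] = acc;
    }
}
```
\end_layout
\end_inset
\end_layout
\begin_layout Standard
With gains (0, 1, 0) and a past excitation that is periodic with period T,
the output simply continues that periodic signal over the sub-frame.
\end_layout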
\begin_layout Standard
Many current CELP codecs use moving average (MA) prediction to encode the
fixed codebook gain.
This provides slightly better coding at the expense of introducing a dependency
on previously encoded frames.
A second difference is that Speex encodes the fixed codebook gain as the
product of the global excitation gain
\begin_inset Formula $g_{frame}$
\end_inset
with a sub-frame gain correction
\begin_inset Formula $g_{subf}$
\end_inset
.
This increases robustness to packet loss by eliminating the inter-frame
dependency.
The sub-frame gain correction is encoded before the fixed codebook is searched
(not closed-loop optimized) and uses between 0 and 3 bits per sub-frame,
depending on the bit-rate.
\end_layout
\begin_layout Standard
The third difference is that Speex uses sub-vector quantization of the innovatio
n (fixed codebook) signal instead of an algebraic codebook.
Each sub-frame is divided into sub-vectors of lengths ranging between 5
and 20 samples.
Each sub-vector is chosen from a bitrate-dependent codebook and all sub-vectors
are concatenated to form a sub-frame.
As an example, the 3.95 kbps mode uses a sub-vector size of 20 samples with
32 entries in the codebook (5 bits).
This means that the innovation is encoded with 10 bits per sub-frame, or
2000 bps.
On the other hand, the 18.2 kbps mode uses a sub-vector size of 5 samples
with 256 entries in the codebook (8 bits), so the innovation uses 64 bits
per sub-frame, or 12800 bps.
\end_layout
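\begin_layout Standard
The bit counts in the two examples above follow from the sub-vector size
and the codebook size.
The sketch below reproduces that arithmetic; the function names are
hypothetical and the codebook size is assumed to be a power of two.
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
```c
#include <assert.h>

enum { SUBFRAME_SIZE = 40, SUBFRAMES_PER_SECOND = 200 };

/* Bits needed to index a codebook with `entries` entries
 * (entries is assumed to be a power of two). */
static int index_bits(int entries)
{
    int bits = 0;
    while ((1 << bits) < entries)
        bits++;
    return bits;
}

/* One codebook index per sub-vector, sub-vectors tile the sub-frame. */
int innovation_bits_per_subframe(int subvect_size, int entries)
{
    assert(subvect_size > 0 && SUBFRAME_SIZE % subvect_size == 0);
    return (SUBFRAME_SIZE / subvect_size) * index_bits(entries);
}

/* 4 sub-frames per 20 ms frame gives 200 sub-frames per second. */
int innovation_bps(int subvect_size, int entries)
{
    return innovation_bits_per_subframe(subvect_size, entries)
           * SUBFRAMES_PER_SECOND;
}
```
\end_layout
\end_inset
\end_layout
\begin_layout Standard
For a sub-vector size of 20 with 32 entries this gives 10 bits per sub-frame
and 2000 bps; for a size of 5 with 256 entries it gives 64 bits and 12800
bps.
\end_layout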
\begin_layout Section
Bit-rates
\end_layout
\begin_layout Standard
So far, no MOS (Mean Opinion Score
\begin_inset Index
status collapsed
\begin_layout Plain Layout
mean opinion score
\end_layout
\end_inset
) subjective evaluation has been performed for Speex.
In order to give an idea of the quality achievable with it, table
\begin_inset CommandInset ref
LatexCommand ref
reference "cap:quality_vs_bps"
\end_inset
presents my own subjective opinion of it.
Different people will perceive the quality differently, and the person who
designed the codec often has a bias (one way or another) when it comes to
subjective evaluation.
Finally, for most codecs (including Speex), encoding quality sometimes varies
depending on the input.
The complexity figures are only approximate (within 0.5 mflops, using
the lowest complexity setting).
Decoding requires approximately 0.5 mflops
\begin_inset Index
status collapsed
\begin_layout Plain Layout
complexity
\end_layout
\end_inset
in most modes (1 mflops with perceptual enhancement).
\end_layout
\begin_layout Standard
\begin_inset Float table
placement h
wide true
sideways false
status open
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
begin{center}
\end_layout
\end_inset
\begin_inset Tabular
<lyxtabular version="3" rows="17" columns="5">
<features>
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Mode
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Quality setting
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Bit-rate
\begin_inset Index
status collapsed
\begin_layout Plain Layout
bit-rate
\end_layout
\end_inset
(bps)
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
mflops
\begin_inset Index
status collapsed
\begin_layout Plain Layout
complexity
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Quality/description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
250
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
No transmission (DTX)
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
2,150
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
6
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Vocoder (mostly for comfort noise)
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
2
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
2
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5,950
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
9
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Very noticeable artifacts/noise, good intelligibility
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3-4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
8,000
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
10
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Artifacts/noise sometimes noticeable
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5-6
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
11,000
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
14
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Artifacts usually noticeable only with headphones
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7-8
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
15,000
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
11
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Need good headphones to tell the difference
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
6
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
9
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
18,200
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
17.5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Hard to tell the difference even with good headphones
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
10
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
24,600
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
14.5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Completely transparent for voice, good quality music
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
8
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3,950
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
10.5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Very noticeable artifacts/noise, good intelligibility
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
9
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
reserved
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
10
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
reserved
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
11
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
reserved
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
12
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
reserved
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
13
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Application-defined, interpreted by callback or skipped
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
14
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Speex in-band signaling
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
15
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Terminator code
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
end{center}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
Quality versus bit-rate
\begin_inset CommandInset label
LatexCommand label
name "cap:quality_vs_bps"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
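\begin_layout Standard
Because a narrowband frame lasts 20 ms, there are 50 frames per second and
each bit-rate in the table corresponds to a fixed number of bits per frame.
The sketch below shows the conversion; the function names are hypothetical,
and rounding up to whole bytes assumes that each frame is padded and sent
on its own.
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
```c
#include <assert.h>

#define FRAMES_PER_SECOND 50   /* one 20 ms frame every 1/50 s */

/* Number of bits in one frame at the given bit-rate. */
int bits_per_frame(int bit_rate_bps)
{
    assert(bit_rate_bps >= 0 && bit_rate_bps % FRAMES_PER_SECOND == 0);
    return bit_rate_bps / FRAMES_PER_SECOND;
}

/* Bytes needed if a single frame is padded to a byte boundary
 * (an assumption of this sketch). */
int bytes_per_frame(int bit_rate_bps)
{
    return (bits_per_frame(bit_rate_bps) + 7) / 8;
}
```
\end_layout
\end_inset
\end_layout
\begin_layout Standard
For example, the 8,000 bps mode uses 160 bits (20 bytes) per frame and the
2,150 bps mode uses 43 bits per frame.
\end_layout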
\begin_layout Section
Perceptual enhancement
\begin_inset Index
status collapsed
\begin_layout Plain Layout
perceptual enhancement
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\series bold
This section was only valid for version 1.1.12 and earlier.
It does not apply to version 1.2-beta1 (and later), for which the new perceptual
enhancement is not yet documented.
\end_layout
\begin_layout Standard
This part of the codec only applies to the decoder and can even be changed
without affecting inter-operability.
For that reason, the implementation provided and described here should
only be considered as a reference implementation.
The enhancement system is divided into two parts.
First, the synthesis filter
\begin_inset Formula $S(z)=1/A(z)$
\end_inset
is replaced by an enhanced filter:
\begin_inset Formula \[
S'(z)=\frac{A\left(z/a_{2}\right)A\left(z/a_{3}\right)}{A\left(z\right)A\left(z/a_{1}\right)}\]
\end_inset
where
\begin_inset Formula $a_{1}$
\end_inset
and
\begin_inset Formula $a_{2}$
\end_inset
depend on the mode in use and
\begin_inset Formula $a_{3}=\frac{1}{r}\left(1-\frac{1-ra_{1}}{1-ra_{2}}\right)$
\end_inset
with
\begin_inset Formula $r=.9$
\end_inset
.
The second part of the enhancement consists of using a comb filter to enhance
the pitch in the excitation domain.
\end_layout
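\begin_layout Standard
The two ingredients of the enhanced filter can be sketched as follows.
The first function evaluates the expression for a3 given above; the second
computes the coefficients of a filter of the form A(z/a), assuming the
convention A(z) = 1 + sum of a_k z^-k, in which case each coefficient a_k
is scaled by the k-th power of the factor.
The function names are hypothetical.
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
```c
#include <assert.h>

/* a3 = (1/r) * (1 - (1 - r*a1) / (1 - r*a2)), as in the text. */
float enhancer_a3(float a1, float a2, float r)
{
    assert(r != 0.0f && r * a2 != 1.0f);
    return (1.0f / r) * (1.0f - (1.0f - r * a1) / (1.0f - r * a2));
}

/* Coefficients of A(z/factor): lpc[k-1] holds a_k (k = 1..order) and
 * out[k-1] receives a_k * factor^k. The leading 1 is implicit. */
void bandwidth_expand(const float *lpc, float *out, int order, float factor)
{
    float scale = factor;
    for (int k = 0; k < order; k++) {
        out[k] = lpc[k] * scale;
        scale *= factor;
    }
}
```
\end_layout
\end_inset
\end_layout
\begin_layout Standard
When a1 and a2 are equal, a3 evaluates to zero.
\end_layout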
\begin_layout Standard
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Chapter
Speex wideband mode (sub-band CELP)
\begin_inset Index
status collapsed
\begin_layout Plain Layout
wideband
\end_layout
\end_inset
\begin_inset CommandInset label
LatexCommand label
name "sec:Speex-wideband-mode"
\end_inset
\end_layout
\begin_layout Standard
For wideband, the Speex approach uses a
\emph on
q
\emph default
uadrature
\emph on
m
\emph default
irror
\emph on
f
\emph default
ilter
\begin_inset Index
status collapsed
\begin_layout Plain Layout
quadrature mirror filter
\end_layout
\end_inset
(QMF) to split the band in two.
The 16 kHz signal is thus divided into two 8 kHz signals, one representing
the low band (0-4 kHz), the other the high band (4-8 kHz).
The low band is encoded with the narrowband mode described in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Speex-narrowband-mode"
\end_inset
in such a way that the resulting
\begin_inset Quotes eld
\end_inset
embedded narrowband bit-stream
\begin_inset Quotes erd
\end_inset
can also be decoded with the narrowband decoder.
Since the low band encoding has already been described, only the high band
encoding is described in this section.
\end_layout
\begin_layout Section
Linear Prediction
\end_layout
\begin_layout Standard
The linear prediction part used for the high-band is very similar to what
is done for narrowband.
The only difference is that only 12 bits are used to encode the high-band
LSPs, using a multi-stage vector quantizer (MSVQ).
The first stage quantizes the 10 coefficients with 6 bits, and the quantization
error is then quantized with another 6 bits.
\end_layout
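\begin_layout Standard
On the decoder side, a two-stage quantizer of this kind amounts to adding
a second-stage correction to the first-stage vector.
The sketch below illustrates the idea only: the function name and the flat
codebook layout are assumptions, and any offsets or scaling applied by the
actual codebooks are left out.
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
```c
#include <assert.h>

/* Two-stage VQ decode: first-stage vector plus second-stage correction.
 * Each codebook is stored as consecutive vectors of `order` values
 * (hypothetical layout); each index is 6 bits wide, so 0..63. */
void msvq2_decode_sketch(const float *stage1_cb, const float *stage2_cb,
                         int order, int idx1, int idx2, float *lsp)
{
    assert(idx1 >= 0 && idx1 < 64 && idx2 >= 0 && idx2 < 64);
    for (int i = 0; i < order; i++)
        lsp[i] = stage1_cb[idx1 * order + i] + stage2_cb[idx2 * order + i];
}
```
\end_layout
\end_inset
\end_layout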
\begin_layout Section
Pitch Prediction
\end_layout
\begin_layout Standard
That part is easy: there's no pitch prediction for the high-band.
There are two reasons for that.
First, there is usually little harmonic structure in this band (above 4
kHz).
Second, it would be very hard to implement since the QMF folds the 4-8
kHz band into 4-0 kHz (reversing the frequency axis), which means that
the location of the harmonics is no longer at multiples of the fundamental
(pitch).
\end_layout
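\begin_layout Standard
The effect of the frequency reversal on the harmonics can be checked with
a little arithmetic.
The sketch below assumes an ideal fold in which a component at f Hz in the
4-8 kHz band appears at 8000 - f Hz; the function names are hypothetical.
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
```c
#include <assert.h>

/* Frequency (Hz) at which a 4-8 kHz component appears after the fold
 * (ideal fold assumed: f maps to 8000 - f). */
int folded_hz(int f_hz)
{
    assert(f_hz >= 4000 && f_hz <= 8000);
    return 8000 - f_hz;
}

/* Non-zero if f_hz is an integer multiple of the fundamental f0_hz. */
int is_harmonic(int f_hz, int f0_hz)
{
    assert(f0_hz > 0);
    return f_hz % f0_hz == 0;
}
```
\end_layout
\end_inset
\end_layout
\begin_layout Standard
For a 210 Hz fundamental, the harmonics at 4200 Hz and 4410 Hz fold to 3800
Hz and 3590 Hz, neither of which is a multiple of 210 Hz.
\end_layout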
\begin_layout Section
Excitation Quantization
\end_layout
\begin_layout Standard
The high-band excitation is coded in the same way as for narrowband.
\end_layout
\begin_layout Section
Bit allocation
\end_layout
\begin_layout Standard
For the wideband mode, the entire narrowband frame is packed before the
high-band is encoded.
The narrowband part of the bit-stream is as defined in table
\begin_inset CommandInset ref
LatexCommand ref
reference "cap:bits-narrowband"
\end_inset
.
The high-band follows, as described in table
\begin_inset CommandInset ref
LatexCommand ref
reference "cap:bits-wideband"
\end_inset
.
For wideband, the mode ID is the same as the Speex quality setting and
is defined in table
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:wideband-quality"
\end_inset
.
This also means that a wideband frame may be correctly decoded by a narrowband
 decoder, with the only caveat being that if more than one frame is packed
 in the same packet, the decoder must skip the high-band parts in order
 to stay in sync with the bit-stream.
\end_layout
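\begin_layout Standard
A minimal sketch of how a narrowband decoder might skip a high-band part
 is given below.
It assumes that bits are read most-significant bit first and uses a small
 hypothetical bit reader rather than the Speex bit-packing API; the per-mode
 sizes are the totals from the high-band bit allocation table.
Mode IDs for which the table gives no size are treated as errors.
\end_layout

```c
#include <assert.h>
#include <stddef.h>

/* Total high-band bits per frame for mode IDs 0..4, including the
   wideband bit and the 3-bit mode ID (the "Total" row of the table). */
static const int highband_bits[5] = { 4, 36, 112, 192, 352 };

/* Hypothetical MSB-first bit reader; not part of the Speex API. */
typedef struct {
    const unsigned char *data;
    size_t nbits;   /* number of valid bits in data */
    size_t pos;     /* index of the next bit to read */
} bit_reader;

static unsigned read_bits(bit_reader *br, int n)
{
    unsigned v = 0;
    assert(n >= 0 && n <= 16 && br->pos + (size_t)n <= br->nbits);
    while (n-- > 0) {
        unsigned bit = (br->data[br->pos >> 3] >> (7 - (br->pos & 7))) & 1u;
        v = (v << 1) | bit;
        br->pos++;
    }
    return v;
}

/* If a high-band part starts at the current position, skip over it.
   Returns the number of bits skipped, 0 if there is nothing to skip,
   or -1 (position unchanged) for an unknown mode or truncated data. */
int skip_highband(bit_reader *br)
{
    size_t start = br->pos;
    assert(br->pos <= br->nbits);
    if (br->nbits - br->pos < 4)
        return 0;                       /* too short for a high-band header */
    if (read_bits(br, 1) == 0) {        /* wideband bit not set */
        br->pos = start;
        return 0;
    }
    unsigned mode = read_bits(br, 3);   /* 3-bit mode ID */
    if (mode > 4 || start + (size_t)highband_bits[mode] > br->nbits) {
        br->pos = start;
        return -1;
    }
    br->pos = start + (size_t)highband_bits[mode];
    return highband_bits[mode];
}
```

\begin_layout Standard
A decoder would call such a function after each narrowband frame, before
 attempting to decode the next frame in the same packet.
\end_layout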
\begin_layout Standard
\begin_inset Float table
placement h
wide true
sideways false
status open
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
begin{center}
\end_layout
\end_inset
\begin_inset Tabular
<lyxtabular version="3" rows="7" columns="7">
<features>
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Parameter
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Update rate
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
2
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Wideband bit
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Mode ID
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
LSP
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
12
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
12
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
12
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
12
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Excitation gain
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
sub-frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Excitation VQ
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
sub-frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
20
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
40
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
80
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Total
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
frame
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
36
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
112
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
192
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
352
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
end{center}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
Bit allocation for the high-band in wideband mode (number of bits per parameter,
 one column per high-band mode ID)
\begin_inset CommandInset label
LatexCommand label
name "cap:bits-wideband"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
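\begin_layout Standard
The totals in the table can be reproduced from the per-parameter figures.
The sketch below assumes four sub-frames per frame, which is consistent
 with every total in the table; the structure and function names are hypothetical.
\end_layout

```c
#include <assert.h>

#define NB_HIGHBAND_MODES   5
#define SUBFRAMES_PER_FRAME 4   /* assumption, consistent with the table totals */

typedef struct {
    int wideband_bit;     /* per frame */
    int mode_id;          /* per frame */
    int lsp;              /* per frame */
    int excitation_gain;  /* per sub-frame */
    int excitation_vq;    /* per sub-frame */
} highband_alloc;

/* One entry per column (high-band mode 0..4) of the table. */
static const highband_alloc alloc[NB_HIGHBAND_MODES] = {
    { 1, 3,  0, 0,  0 },
    { 1, 3, 12, 5,  0 },
    { 1, 3, 12, 4, 20 },
    { 1, 3, 12, 4, 40 },
    { 1, 3, 12, 4, 80 },
};

/* Bits used by the high-band part of one frame in the given mode. */
int highband_frame_bits(int mode)
{
    assert(mode >= 0 && mode < NB_HIGHBAND_MODES);
    const highband_alloc *a = &alloc[mode];
    return a->wideband_bit + a->mode_id + a->lsp
         + SUBFRAMES_PER_FRAME * (a->excitation_gain + a->excitation_vq);
}
```

\begin_layout Standard
For mode 2, for instance, this gives 1 + 3 + 12 + 4 x (4 + 20) = 112 bits,
 matching the last row of the table.
\end_layout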
\begin_layout Standard
\begin_inset Float table
placement h
wide true
sideways false
status open
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
begin{center}
\end_layout
\end_inset
\begin_inset Tabular
<lyxtabular version="3" rows="12" columns="3">
<features>
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<column alignment="center" valignment="top" width="0pt">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Mode/Quality
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Bit-rate
\begin_inset Index
status collapsed
\begin_layout Plain Layout
bit-rate
\end_layout
\end_inset
(bps)
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Quality/description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3,950
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Barely intelligible (mostly for comfort noise)
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5,750
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Very noticeable artifacts/noise, poor intelligibility
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
2
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7,750
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Very noticeable artifacts/noise, good intelligibility
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
9,800
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Artifacts/noise sometimes annoying
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
4
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
12,800
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Artifacts/noise usually noticeable
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
5
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
16,800
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Artifacts/noise sometimes noticeable
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
6
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
20,600
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Need good headphones to tell the difference
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
23,800
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Need good headphones to tell the difference
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
8
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
27,800
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Hard to tell the difference even with good headphones
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
9
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
34,200
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Hard to tell the difference even with good headphones
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
10
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
42,200
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Completely transparent for voice, good quality music
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
end{center}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
Quality versus bit-rate for the wideband encoder
\begin_inset CommandInset label
LatexCommand label
name "tab:wideband-quality"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
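\begin_layout Standard
When sizing buffers or packets, the bit-rates in the table can be converted
 to a frame size.
The sketch below assumes 20 ms frames (50 frames per second), which is not
 stated in this table; the array simply copies the bit-rate column and the
 function names are hypothetical.
\end_layout

```c
#include <assert.h>

#define FRAMES_PER_SECOND 50   /* assumes 20 ms frames */
#define NB_QUALITIES      11

/* Bit-rate column of the table, indexed by quality 0..10 (bps). */
static const int wideband_bps[NB_QUALITIES] = {
    3950, 5750, 7750, 9800, 12800, 16800, 20600, 23800, 27800, 34200, 42200
};

/* Bits in one encoded wideband frame at the given quality. */
int wideband_frame_bits(int quality)
{
    assert(quality >= 0 && quality < NB_QUALITIES);
    return wideband_bps[quality] / FRAMES_PER_SECOND;
}

/* Bytes needed to hold one frame, rounded up to a whole byte. */
int wideband_frame_bytes(int quality)
{
    return (wideband_frame_bits(quality) + 7) / 8;
}
```

\begin_layout Standard
Under that assumption, quality 10 corresponds to 844 bits, or 106 bytes,
 per frame.
\end_layout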
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
clearpage
\end_layout
\end_inset
\end_layout
\begin_layout Chapter
\start_of_appendix
Sample code
\begin_inset CommandInset label
LatexCommand label
name "sec:Sample-code"
\end_inset
\end_layout
\begin_layout Standard
This section shows sample code for encoding and decoding speech using the
 Speex API.
The two programs can be used together to encode and decode a file by calling:
\family typewriter
\begin_inset Newline newline
\end_inset
% sampleenc in_file.sw | sampledec out_file.sw
\family default
\begin_inset Newline newline
\end_inset
where both files are raw (headerless) files with 16 bits per sample, in
 the machine's native endianness.
\end_layout
\begin_layout Section
sampleenc.c
\end_layout
\begin_layout Standard
sampleenc takes a raw 16 bits/sample file, encodes it and outputs a Speex
stream to stdout.
Note that the packing used is
\series bold
not
\series default
compatible with that of speexenc/speexdec.
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand lstinputlisting
filename "sampleenc.c"
lstparams "caption={Source code for sampleenc},label={sampleenc-source-code},numbers=left,numberstyle={\\footnotesize}"
\end_inset
\end_layout
\begin_layout Section
sampledec.c
\end_layout
\begin_layout Standard
sampledec reads a Speex stream from stdin, decodes it and outputs it to
a raw 16 bits/sample file.
Note that the packing used is
\series bold
not
\series default
compatible with that of speexenc/speexdec.
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand lstinputlisting
filename "sampledec.c"
lstparams "caption={Source code for sampledec},label={sampledec-source-code},numbers=left,numberstyle={\\footnotesize}"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Chapter
Jitter Buffer for Speex
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand lstinputlisting
filename "../speexclient/speex_jitter_buffer.c"
lstparams "caption={Example of using the jitter buffer for Speex packets},label={example-speex-jitter},numbers=left,numberstyle={\\footnotesize}"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Chapter
IETF RTP Profile
\begin_inset CommandInset label
LatexCommand label
name "sec:IETF-draft"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand verbatiminput
filename "draft-ietf-avt-rtp-speex-05-tmp.txt"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Chapter
Speex License
\begin_inset CommandInset label
LatexCommand label
name "sec:Speex-License"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset CommandInset include
LatexCommand verbatiminput
filename "../COPYING"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset Newpage newpage
\end_inset
\end_layout
\begin_layout Chapter
GNU Free Documentation License
\end_layout
\begin_layout Standard
Version 1.1, March 2000
\end_layout
\begin_layout Standard
Copyright (C) 2000 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted
to copy and distribute verbatim copies of this license document, but changing
it is not allowed.
\end_layout
\begin_layout Section*
0.
PREAMBLE
\end_layout
\begin_layout Standard
The purpose of this License is to make a manual, textbook, or other written
document "free" in the sense of freedom: to assure everyone the effective
freedom to copy and redistribute it, with or without modifying it, either
commercially or noncommercially.
Secondarily, this License preserves for the author and publisher a way
to get credit for their work, while not being considered responsible for
modifications made by others.
\end_layout
\begin_layout Standard
This License is a kind of "copyleft", which means that derivative works
of the document must themselves be free in the same sense.
It complements the GNU General Public License, which is a copyleft license
designed for free software.
\end_layout
\begin_layout Standard
We have designed this License in order to use it for manuals for free software,
because free software needs free documentation: a free program should come
with manuals providing the same freedoms that the software does.
But this License is not limited to software manuals; it can be used for
any textual work, regardless of subject matter or whether it is published
as a printed book.
We recommend this License principally for works whose purpose is instruction
or reference.
\end_layout
\begin_layout Section*
1.
APPLICABILITY AND DEFINITIONS
\end_layout
\begin_layout Standard
This License applies to any manual or other work that contains a notice
placed by the copyright holder saying it can be distributed under the terms
of this License.
The "Document", below, refers to any such manual or work.
Any member of the public is a licensee, and is addressed as "you".
\end_layout
\begin_layout Standard
A "Modified Version" of the Document means any work containing the Document
or a portion of it, either copied verbatim, or with modifications and/or
translated into another language.
\end_layout
\begin_layout Standard
A "Secondary Section" is a named appendix or a front-matter section of the
Document that deals exclusively with the relationship of the publishers
or authors of the Document to the Document's overall subject (or to related
matters) and contains nothing that could fall directly within that overall
subject.
(For example, if the Document is in part a textbook of mathematics, a Secondary
Section may not explain any mathematics.) The relationship could be a matter
of historical connection with the subject or with related matters, or of
legal, commercial, philosophical, ethical or political position regarding
them.
\end_layout
\begin_layout Standard
The "Invariant Sections" are certain Secondary Sections whose titles are
designated, as being those of Invariant Sections, in the notice that says
that the Document is released under this License.
\end_layout
\begin_layout Standard
The "Cover Texts" are certain short passages of text that are listed, as
Front-Cover Texts or Back-Cover Texts, in the notice that says that the
Document is released under this License.
\end_layout
\begin_layout Standard
A "Transparent" copy of the Document means a machine-readable copy, represented
in a format whose specification is available to the general public, whose
contents can be viewed and edited directly and straightforwardly with generic
text editors or (for images composed of pixels) generic paint programs
or (for drawings) some widely available drawing editor, and that is suitable
for input to text formatters or for automatic translation to a variety
of formats suitable for input to text formatters.
A copy made in an otherwise Transparent file format whose markup has been
designed to thwart or discourage subsequent modification by readers is
not Transparent.
A copy that is not "Transparent" is called "Opaque".
\end_layout
\begin_layout Standard
Examples of suitable formats for Transparent copies include plain ASCII
without markup, Texinfo input format, LaTeX input format, SGML or XML using
a publicly available DTD, and standard-conforming simple HTML designed
for human modification.
Opaque formats include PostScript, PDF, proprietary formats that can be
read and edited only by proprietary word processors, SGML or XML for which
the DTD and/or processing tools are not generally available, and the machine-ge
nerated HTML produced by some word processors for output purposes only.
\end_layout
\begin_layout Standard
The "Title Page" means, for a printed book, the title page itself, plus
such following pages as are needed to hold, legibly, the material this
License requires to appear in the title page.
For works in formats which do not have any title page as such, "Title Page"
means the text near the most prominent appearance of the work's title,
preceding the beginning of the body of the text.
\end_layout
\begin_layout Section*
2.
VERBATIM COPYING
\end_layout
\begin_layout Standard
You may copy and distribute the Document in any medium, either commercially
or noncommercially, provided that this License, the copyright notices,
and the license notice saying this License applies to the Document are
reproduced in all copies, and that you add no other conditions whatsoever
to those of this License.
You may not use technical measures to obstruct or control the reading or
further copying of the copies you make or distribute.
However, you may accept compensation in exchange for copies.
If you distribute a large enough number of copies you must also follow
the conditions in section 3.
\end_layout
\begin_layout Standard
You may also lend copies, under the same conditions stated above, and you
may publicly display copies.
\end_layout
\begin_layout Section*
3.
COPYING IN QUANTITY
\end_layout
\begin_layout Standard
If you publish printed copies of the Document numbering more than 100, and
the Document's license notice requires Cover Texts, you must enclose the
copies in covers that carry, clearly and legibly, all these Cover Texts:
Front-Cover Texts on the front cover, and Back-Cover Texts on the back
cover.
Both covers must also clearly and legibly identify you as the publisher
of these copies.
The front cover must present the full title with all words of the title
equally prominent and visible.
You may add other material on the covers in addition.
Copying with changes limited to the covers, as long as they preserve the
title of the Document and satisfy these conditions, can be treated as verbatim
copying in other respects.
\end_layout
\begin_layout Standard
If the required texts for either cover are too voluminous to fit legibly,
you should put the first ones listed (as many as fit reasonably) on the
actual cover, and continue the rest onto adjacent pages.
\end_layout
\begin_layout Standard
If you publish or distribute Opaque copies of the Document numbering more
than 100, you must either include a machine-readable Transparent copy along
with each Opaque copy, or state in or with each Opaque copy a publicly-accessib
le computer-network location containing a complete Transparent copy of the
Document, free of added material, which the general network-using public
has access to download anonymously at no charge using public-standard network
protocols.
If you use the latter option, you must take reasonably prudent steps, when
you begin distribution of Opaque copies in quantity, to ensure that this
Transparent copy will remain thus accessible at the stated location until
at least one year after the last time you distribute an Opaque copy (directly
or through your agents or retailers) of that edition to the public.
\end_layout
\begin_layout Standard
It is requested, but not required, that you contact the authors of the Document
well before redistributing any large number of copies, to give them a chance
to provide you with an updated version of the Document.
\end_layout
\begin_layout Section*
4.
MODIFICATIONS
\end_layout
\begin_layout Standard
You may copy and distribute a Modified Version of the Document under the
conditions of sections 2 and 3 above, provided that you release the Modified
Version under precisely this License, with the Modified Version filling
the role of the Document, thus licensing distribution and modification
of the Modified Version to whoever possesses a copy of it.
In addition, you must do these things in the Modified Version:
\end_layout
\begin_layout Itemize
A.
Use in the Title Page (and on the covers, if any) a title distinct from
that of the Document, and from those of previous versions (which should,
if there were any, be listed in the History section of the Document).
You may use the same title as a previous version if the original publisher
of that version gives permission.
\end_layout
\begin_layout Itemize
B.
List on the Title Page, as authors, one or more persons or entities responsible
for authorship of the modifications in the Modified Version, together with
at least five of the principal authors of the Document (all of its principal
authors, if it has less than five).
\end_layout
\begin_layout Itemize
C.
State on the Title page the name of the publisher of the Modified Version,
as the publisher.
\end_layout
\begin_layout Itemize
D.
Preserve all the copyright notices of the Document.
\end_layout
\begin_layout Itemize
E.
Add an appropriate copyright notice for your modifications adjacent to
the other copyright notices.
\end_layout
\begin_layout Itemize
F.
Include, immediately after the copyright notices, a license notice giving
the public permission to use the Modified Version under the terms of this
License, in the form shown in the Addendum below.
\end_layout
\begin_layout Itemize
G.
Preserve in that license notice the full lists of Invariant Sections and
required Cover Texts given in the Document's license notice.
\end_layout
\begin_layout Itemize
H.
Include an unaltered copy of this License.
\end_layout
\begin_layout Itemize
I.
Preserve the section entitled "History", and its title, and add to it an
item stating at least the title, year, new authors, and publisher of the
Modified Version as given on the Title Page.
If there is no section entitled "History" in the Document, create one stating
the title, year, authors, and publisher of the Document as given on its
Title Page, then add an item describing the Modified Version as stated
in the previous sentence.
\end_layout
\begin_layout Itemize
J.
Preserve the network location, if any, given in the Document for public
access to a Transparent copy of the Document, and likewise the network
locations given in the Document for previous versions it was based on.
These may be placed in the "History" section.
You may omit a network location for a work that was published at least
four years before the Document itself, or if the original publisher of
the version it refers to gives permission.
\end_layout
\begin_layout Itemize
K.
In any section entitled "Acknowledgements" or "Dedications", preserve the
section's title, and preserve in the section all the substance and tone
of each of the contributor acknowledgements and/or dedications given therein.
\end_layout
\begin_layout Itemize
L.
Preserve all the Invariant Sections of the Document, unaltered in their
text and in their titles.
Section numbers or the equivalent are not considered part of the section
titles.
\end_layout
\begin_layout Itemize
M.
Delete any section entitled "Endorsements".
Such a section may not be included in the Modified Version.
\end_layout
\begin_layout Itemize
N.
Do not retitle any existing section as "Endorsements" or to conflict in
title with any Invariant Section.
\end_layout
\begin_layout Standard
If the Modified Version includes new front-matter sections or appendices
that qualify as Secondary Sections and contain no material copied from
the Document, you may at your option designate some or all of these sections
as invariant.
To do this, add their titles to the list of Invariant Sections in the Modified
Version's license notice.
These titles must be distinct from any other section titles.
\end_layout
\begin_layout Standard
You may add a section entitled "Endorsements", provided it contains nothing
but endorsements of your Modified Version by various parties--for example,
statements of peer review or that the text has been approved by an organization
as the authoritative definition of a standard.
\end_layout
\begin_layout Standard
You may add a passage of up to five words as a Front-Cover Text, and a passage
of up to 25 words as a Back-Cover Text, to the end of the list of Cover
Texts in the Modified Version.
Only one passage of Front-Cover Text and one of Back-Cover Text may be
added by (or through arrangements made by) any one entity.
If the Document already includes a cover text for the same cover, previously
added by you or by arrangement made by the same entity you are acting on
behalf of, you may not add another; but you may replace the old one, on
explicit permission from the previous publisher that added the old one.
\end_layout
\begin_layout Standard
The author(s) and publisher(s) of the Document do not by this License give
permission to use their names for publicity for or to assert or imply
endorsement of any Modified Version.
\end_layout
\begin_layout Section*
5.
COMBINING DOCUMENTS
\end_layout
\begin_layout Standard
You may combine the Document with other documents released under this License,
under the terms defined in section 4 above for modified versions, provided
that you include in the combination all of the Invariant Sections of all
of the original documents, unmodified, and list them all as Invariant Sections
of your combined work in its license notice.
\end_layout
\begin_layout Standard
The combined work need only contain one copy of this License, and multiple
identical Invariant Sections may be replaced with a single copy.
If there are multiple Invariant Sections with the same name but different
contents, make the title of each such section unique by adding at the end
of it, in parentheses, the name of the original author or publisher of
that section if known, or else a unique number.
Make the same adjustment to the section titles in the list of Invariant
Sections in the license notice of the combined work.
\end_layout
\begin_layout Standard
In the combination, you must combine any sections entitled "History" in
the various original documents, forming one section entitled "History";
likewise combine any sections entitled "Acknowledgements", and any sections
entitled "Dedications".
You must delete all sections entitled "Endorsements."
\end_layout
\begin_layout Section*
6.
COLLECTIONS OF DOCUMENTS
\end_layout
\begin_layout Standard
You may make a collection consisting of the Document and other documents
released under this License, and replace the individual copies of this
License in the various documents with a single copy that is included in
the collection, provided that you follow the rules of this License for
verbatim copying of each of the documents in all other respects.
\end_layout
\begin_layout Standard
You may extract a single document from such a collection, and distribute
it individually under this License, provided you insert a copy of this
License into the extracted document, and follow this License in all other
respects regarding verbatim copying of that document.
\end_layout
\begin_layout Section*
7.
AGGREGATION WITH INDEPENDENT WORKS
\end_layout
\begin_layout Standard
A compilation of the Document or its derivatives with other separate and
independent documents or works, in or on a volume of a storage or distribution
medium, does not as a whole count as a Modified Version of the Document,
provided no compilation copyright is claimed for the compilation.
Such a compilation is called an "aggregate", and this License does not
apply to the other self-contained works thus compiled with the Document,
on account of their being thus compiled, if they are not themselves derivative
works of the Document.
\end_layout
\begin_layout Standard
If the Cover Text requirement of section 3 is applicable to these copies
of the Document, then if the Document is less than one quarter of the entire
aggregate, the Document's Cover Texts may be placed on covers that surround
only the Document within the aggregate.
Otherwise they must appear on covers around the whole aggregate.
\end_layout
\begin_layout Section*
8.
TRANSLATION
\end_layout
\begin_layout Standard
Translation is considered a kind of modification, so you may distribute
translations of the Document under the terms of section 4.
Replacing Invariant Sections with translations requires special permission
from their copyright holders, but you may include translations of some
or all Invariant Sections in addition to the original versions of these
Invariant Sections.
You may include a translation of this License provided that you also include
the original English version of this License.
In case of a disagreement between the translation and the original English
version of this License, the original English version will prevail.
\end_layout
\begin_layout Section*
9.
TERMINATION
\end_layout
\begin_layout Standard
You may not copy, modify, sublicense, or distribute the Document except
as expressly provided for under this License.
Any other attempt to copy, modify, sublicense or distribute the Document
is void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under this
License will not have their licenses terminated so long as such parties
remain in full compliance.
\end_layout
\begin_layout Section*
10.
FUTURE REVISIONS OF THIS LICENSE
\end_layout
\begin_layout Standard
The Free Software Foundation may publish new, revised versions of the GNU
Free Documentation License from time to time.
Such new versions will be similar in spirit to the present version, but
may differ in detail to address new problems or concerns.
See http://www.gnu.org/copyleft/.
\end_layout
\begin_layout Standard
Each version of the License is given a distinguishing version number.
If the Document specifies that a particular numbered version of this License
"or any later version" applies to it, you have the option of following
the terms and conditions either of that specified version or of any later
version that has been published (not as a draft) by the Free Software
Foundation.
If the Document does not specify a version number of this License, you
may choose any version ever published (not as a draft) by the Free Software
Foundation.
\end_layout
\begin_layout Standard
\begin_inset CommandInset index_print
LatexCommand printindex
\end_inset
\end_layout
\end_body
\end_document