Internet: netlib@nac.no
A similar collection of statistical software is available from
statlib@temper.stat.cmu.edu.
The symbolic algebra system REDUCE is supported by reduce-netlib@rand.org.
Naval Surface Warfare Center (E43)
[Witold Waldman, Witold.Waldman@dsto.defence.gov.au]
The U.S. DoD's Federal-Standard-1016 based 4800 bps code excited linear
prediction voice coder version 3.2 (CELP 3.2) Fortran and C simulation
source codes are available for worldwide distribution (on DOS
diskettes, but configured to compile on Sun SPARC stations) from NTIS
and DTIC. Example input and processed speech files are included. A
Technical Information Bulletin (TIB), "Details to Assist in
Implementation of Federal Standard 1016 CELP," and the official
standard, "Federal Standard 1016, Telecommunications: Analog to
Digital Conversion of Radio Voice by 4,800 bit/second Code Excited
Linear Prediction (CELP)," are also available.
FS-1016 CELP 3.2 may also be obtained from file://svr-ftp.eng.cam.ac.uk/pub/comp.speech/coding/celp_3.2a.tar.Z or file://ftp.super.org/pub/speech/celp_3.2a.tar.Z.
LPC is available from ftp://ftp.super.org/pub/speech/lpc10-1.0.tar.gz or file://svr-ftp.eng.cam.ac.uk/pub/speech/coding/lpc10-1.0.tar.gz.
MATLAB software for LPC-10 is available from
http://www.eas.asu.edu/~spanias/srtcrs.html.
Also, PostScript copies of tutorials on speech coding can be found at
http://www.eas.asu.edu/~spanias/papers.html.
[Andreas Spanias, spanias@asu.edu]
Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch,
The Federal Standard 1016 4800 bps CELP Voice Coder, Digital Signal
Processing, Academic Press, 1991, Vol. 1, No. 3, p. 145-155.
Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch,
The DoD 4.8 kbps Standard (Proposed Federal Standard 1016),
in Advances in Speech Coding, ed. Atal, Cuperman and Gersho,
Kluwer Academic Publishers, 1991, Chapter 12, p. 121-133.
Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, The
Proposed Federal Standard 1016 4800 bps Voice Coder: CELP, Speech
Technology Magazine, April/May 1990, p. 58-64.
Additional information on CELP can also be found in the
comp.speech FAQ.
The U.S. Federal Standard 1015 (NATO STANAG 4198) is described in:
Thomas E. Tremain, The Government Standard Linear Predictive Coding
Algorithm: LPC-10, Speech Technology Magazine, April 1982, pp. 40-49.
[Most of the above from Joe Campbell, jpcampb@afterlife.ncsc.mil, with
additions from Dan Frankowski, dfrankow@winternet.com, and Ed Hall, edhall@rand.org]
Note that this is NOT a G.722 coder. The ADPCM standard is
much more complicated, probably resulting in better quality sound but
also in much more computational overhead.
This is also available as:
file://svr-ftp.eng.cam.ac.uk/pub/comp.speech/coding/G711_G722_G723.tar.gz
[From Dan Frankowski, dfrankow@winternet.com; Jack Jansen, Jack.Jansen@cwi.nl]
The Communications and Operating Systems Research Group (KBS) at the
Technische Universitaet Berlin is currently working on a set of
UNIX-based tools for computer-mediated telecooperation that will be
made freely available.
As part of this effort we are publishing an implementation of the
European GSM 06.10 provisional standard for full-rate speech
transcoding, prI-ETS 300 036, which uses RPE/LTP (residual pulse
excitation/long term prediction) coding at 13 kbit/s.
GSM 06.10 compresses frames of 160 13-bit samples (8 kHz sampling
rate, i.e. a frame rate of 50 Hz) into 260 bits; for compatibility
with typical UNIX applications, our implementation turns frames of 160
16-bit linear samples into 33-byte frames (1650 Bytes/s).
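The frame arithmetic in the paragraph above can be checked with a few lines of Python. This is only a sketch of the bookkeeping, using the figures quoted above; the constant names are made up for illustration and are not part of the TU Berlin library.

```python
# Frame bookkeeping for GSM 06.10 full rate, using only figures quoted above.
SAMPLE_RATE_HZ = 8000       # sampling rate
FRAME_SAMPLES = 160         # samples per frame
FRAME_BITS = 260            # coded bits per frame in the standard
PACKED_FRAME_BYTES = 33     # frame size produced by the implementation

frames_per_second = SAMPLE_RATE_HZ // FRAME_SAMPLES        # frame rate in Hz
standard_bit_rate = FRAME_BITS * frames_per_second         # bit/s
packed_byte_rate = PACKED_FRAME_BYTES * frames_per_second  # bytes/s
extra_bits = PACKED_FRAME_BYTES * 8 - FRAME_BITS           # bits beyond the 260
input_frame_bytes = FRAME_SAMPLES * 2                      # 16-bit linear input
compression_ratio = input_frame_bytes / PACKED_FRAME_BYTES

print(frames_per_second, standard_bit_rate, packed_byte_rate, extra_bits)
print(round(compression_ratio, 2))
```

A 33-byte frame holds 264 bits, so each packed frame has 4 bits beyond the 260 coded bits, and 16-bit linear input shrinks by a factor of a little under 10.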
The quality of the algorithm is good enough for reliable speaker
recognition; even music often survives transcoding in recognizable
form (given the bandwidth limitations of 8 kHz sampling rate).
The interfaces offered are a front end modeled after compress(1), and
a library API. Compression and decompression run faster than realtime
on most SPARCstations. The implementation has been verified against the
ETSI standard test patterns.
Jutta Degener (jutta@cs.tu-berlin.de), Carsten Bormann (cabo@cs.tu-berlin.de)
Communications and Operating Systems Research Group, TU Berlin
[From Dan Frankowski, dfrankow@winternet.com; Jutta Degener, jutta@cs.tu-berlin.de]
This book is available in paperback and makes a good desk reference.
An algorithm implementation that matches a large body of psychoacoustical
work, but which is computationally very intensive, is presented in the paper:
The definitive papers describing the use of such a perceptual pitch
detector as applied to the classical pitch literature are:
The current work that argues for a pure spectral method starts with the work
of Goldstein:
Two approaches are worth considering if something approximating pitch
is appropriate. The people at IRCAM have proposed a harmonic analysis
approach that can be implemented on a DSP:
The classic paper for time domain (peak picking) pitch algorithms is:
[The above from Malcolm Slaney, Interval Research, and John Lazzaro,
U.C. Berkeley.]
AES/EBU is a bit-serial communications protocol for transmitting
digital audio data through a single transmission line. It provides two
channels of audio data (up to 24 bits per sample), a method for
communication control and status information ("channel status bits"),
and some error detection capabilities. Clocking information (i.e.,
sample rate) is derived from the AES/EBU bit stream, and is thus
controlled by the transmitter. The standard mandates use of 32 kHz,
44.1 kHz, or 48 kHz sample rates, but some interfaces can be made to
work at other sample rates.
AES/EBU provides both "professional" and "consumer" modes. The big
difference is in the format of the channel status bits mentioned above.
The professional mode bits include alphanumeric channel origin and
destination data, time of day codes, sample number codes, word length,
and other goodies. The consumer mode bits have much less information,
but do include information on copy protection (naturally). Additionally,
the standard provides for "user data", which is a bit stream containing
user-defined (i.e., manufacturer-defined) data. According to Tim
Channon, "CD user data is almost raw CD subcode; DAT is StartID and
SkipID. In professional mode, there is an SDLC protocol or, if DAT,
it may be the same as consumer mode."
Three physical connection media are commonly used with AES/EBU:
balanced (differential), using two wires and shield in three-wire microphone
cable with XLR connectors; unbalanced (single-ended), using audio coax cable
with RCA jacks; and optical (via fiber optics).
[The above from Phil Lapsley and Tim Channon,
tchannon@black.demon.co.uk]
Painter, E. M., and Spanias, A. S. (1997 and revised 1999). A Review of
Algorithms for Perceptual Coding of Digital Audio Signals. (PostScript, 3MB)
http://www.eas.asu.edu/~spanias/papers.html
[Andreas Spanias, spanias@asu.edu]
Desktop Sparc machines come with routines to convert between linear and
mu-law samples. On a desktop Sparc, see the man page for audio_ulaw2linear
in /usr/demo/SOUND/man.
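For readers without access to such routines, the conversion can be sketched in a few lines. The following is a minimal Python sketch of the widely used segment-based mu-law mapping between 16-bit linear samples and 8-bit codes (bias 0x84, clip 32635), as commonly associated with G.711. The function names are illustrative assumptions, not the Sparc audio_ulaw2linear API, and the sketch should be checked against a reference implementation before serious use.

```python
BIAS = 0x84    # 132, added so every magnitude has a leading 1 in bits 7..14
CLIP = 32635   # largest magnitude that still fits after adding BIAS

def linear_to_ulaw(sample):
    """Map a 16-bit signed linear sample to an 8-bit mu-law code."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    exponent = magnitude.bit_length() - 8           # segment number, 0..7
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF   # codes are sent inverted

def ulaw_to_linear(code):
    """Map an 8-bit mu-law code back to a 16-bit signed linear sample."""
    code = ~code & 0xFF
    exponent = (code >> 4) & 0x07
    mantissa = code & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if code & 0x80 else magnitude

if __name__ == "__main__":
    for s in (0, 100, -100, 1000, 32767, -32768):
        c = linear_to_ulaw(s)
        print(s, hex(c), ulaw_to_linear(c))
```

The quantization is fine near zero and coarse near full scale, which is the point of companding: small signals keep roughly the same relative precision as large ones.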
Michael Villeret, et al., A New Digital Technique for Implementation
of Any Continuous PCM Companding Law, IEEE Int. Conf. on Communications,
1973, vol. 1, pp. 11.12-11.17.
MIL-STD-188-113, Interoperability and Performance Standards
for Analog-to-Digital Conversion Techniques, 17 February 1987.
TI Digital Signal Processing Applications with the TMS320 Family
(TI literature number SPRA012A), pp. 169-198.
[From Joe Campbell; Craig Reese, cfreese@super.org; Sepehr Mehrabanzad,
sepehr@falstaff.dev.cdx.mot.com; Keith Kendall, KLK3%mimi@magic.itg.ti.com]
CD players use a 44.1 kHz sample rate, whereas DAT uses a 48
kHz sample rate. This means that you must do sample rate
conversion before you can get data from a CD player directly
into a DAT deck.
[From Ed Hall, edhall@rand.org:]
For a start, look at Multirate Digital Signal Processing
by Crochiere and Rabiner (see FAQ section 1.1).
Almost any technique for producing good digital low-pass
filters will be adaptable to sample-rate conversion. 44.1:48
and vice-versa is pretty hairy, though, because the lowest
whole-number ratio is 147:160. To do all that in one go
would require a FIR with thousands of coefficients, of which
only 1/147th or 1/160th are used for each sample--the real
problem is memory, not CPU for most DSP chips. You could
chain several interpolators and decimators, as suggested by
factoring the ratio into 3*7*7:2*2*2*2*2*5. This adds
complexity, but reduces the number of coefficients required
by a considerable amount.
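The ratio and its factorization are easy to verify. The sketch below reduces 48000:44100 to lowest terms and factors both sides; prime_factors is a hypothetical helper written for this illustration.

```python
from fractions import Fraction

def prime_factors(n):
    """Return the prime factors of n in ascending order, with repetition."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

ratio = Fraction(48000, 44100)      # DAT rate over CD rate, reduced automatically
up, down = ratio.numerator, ratio.denominator

print(up, down)                     # interpolation and decimation factors
print(prime_factors(down), prime_factors(up))
```

Each small factor can become one stage in a chain of interpolators and decimators, which is what keeps the per-stage filters short.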
[From Lou Scheffer:]
Theory of operation: 44.1 and 48 are in the ratio 147/160.
To convert from 44.1 to 48, for example, we (conceptually):
So we need to design an FIR filter that is flat to 20 kHz,
and down at least X dB at 24 kHz. How big does X need to
be? You might think about 100 dB, since the max signal size
is roughly +-32767, and the input quantization +- 1/2, so we
know the input had a signal to broadband noise ratio of 98
dB at most. However, the noise in the stopband
(20 kHz-3.5 MHz) is all folded into the passband by the
decimation in step 3, so we need another 22 dB (that's 160
in dB) to account for the noise folding. Thus 120 dB
rejection yields a broadband noise equal to the original
quantizing noise. If you are a fanatic, you can shoot for
130 dB to make the original quantizing errors dominate, and
a 22.05 kHz cutoff to eliminate even ultrasonic aliasing.
You will pay for your fanaticism with a penance of more
taps, however.
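The decibel figures above can be reproduced numerically. This sketch assumes the 98 dB figure is the usual ratio of a full-scale sine wave to uniformly distributed quantization error of +-1/2 LSB; that interpretation is an assumption on my part, not something stated in the text.

```python
import math

FULL_SCALE = 32767          # peak of a 16-bit signal
UPSAMPLE_FACTOR = 160       # noise from 160 bands folds back on decimation

signal_rms = FULL_SCALE / math.sqrt(2)   # RMS of a full-scale sine
noise_rms = 1 / math.sqrt(12)            # RMS of uniform error in [-1/2, +1/2]

input_snr_db = 20 * math.log10(signal_rms / noise_rms)
folding_db = 10 * math.log10(UPSAMPLE_FACTOR)    # power ratio of 160, in dB
required_rejection_db = input_snr_db + folding_db

print(round(input_snr_db, 1), round(folding_db, 1), round(required_rejection_db, 1))
```

The sum comes out at roughly 120 dB, matching the rejection quoted above.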
A paper available as
file://ccrma-ftp.stanford.edu/pub/DSP/Tutorials/BandlimitedInterpolation.eps.Z
explains the algorithm. Free source code, as well as an HTML
discussion of the algorithm, is available
at http://ccrma-www.stanford.edu/~jos/resample/. It all works quite well.
[From Kevin Bradley, kb+@andrew.cmu.edu:]
There is an implementation of polyphase resampling for various
rates as a part of the Sox audio toolkit at
http://home.sprynet.com/~cbagwell/sox.html. See file polyphas.c for details.
Sox also contains an implementation of bandlimited interpolation
and linear interpolation, and serves as a ready vehicle for module
experimentation.
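To illustrate the polyphase idea in general terms, here is a small pure-Python sketch of rational resampling by up/down. It is not the code in Sox's polyphas.c; the function names, the Hamming-windowed sinc design, and the zero_crossings parameter are all assumptions chosen for brevity. The key point is that the zero-stuffed signal is never built: each output sample touches only every up-th filter coefficient.

```python
import math

def design_lowpass(num_taps, cutoff, gain):
    """Hamming-windowed sinc; cutoff in cycles/sample, DC gain scaled to `gain`."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        if t == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    scale = gain / sum(taps)
    return [c * scale for c in taps]

def resample(x, up, down, zero_crossings=8):
    """Resample x by the ratio up/down (causal, so the output is delayed)."""
    widest = max(up, down)
    num_taps = 2 * zero_crossings * widest + 1
    h = design_lowpass(num_taps, 0.5 / widest, up)
    out = []
    for m in range(len(x) * up // down):
        t = m * down             # position in the virtual zero-stuffed stream
        k = t // up              # newest input sample at or before t
        tap = t - k * up         # polyphase branch: first coefficient to use
        acc = 0.0
        while tap < num_taps and k >= 0:
            acc += h[tap] * x[k] # only every up-th coefficient is touched
            tap += up
            k -= 1
        out.append(acc)
    return out

if __name__ == "__main__":
    y = resample([1.0] * 200, 3, 2)
    print(len(y), round(y[150], 3))
```

For 44.1 kHz to 48 kHz one would call resample(x, 160, 147). A production converter would need a much longer and more carefully designed filter than this default to reach the stopband rejection discussed earlier.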
[From Fritz M. Rothacher, f.rothacher@ieee.org:]
You can add my Ph.D. thesis on sample-rate conversion to the FAQ:
Fritz M. Rothacher, Sample-Rate Conversion: Algorithms and VLSI
Implementation, Ph.D. thesis, Integrated Systems Lab, Swiss
Federal Institute of Technology, ETH Zuerich, 1995, ISBN 3-89191-873-9
It can also be downloaded from my homepage at
http://www.guest.iis.ee.ethz.ch/~rota.
The mathematical theory behind wavelets (and other related transforms) is
given in the appendix of the XWPL reference manual. The XWPL manual can be found at
http://venus.javeriana.edu.co/WAVELETS/.
Other sources of information on wavelets are:
A good introductory book on wavelets:
A more thorough book:
A couple more interesting papers:
Mac Cody's articles in Dr. Dobb's Journal, April 1992 and April 1993
Paper by Ingrid Daubechies in IEEE Trans. on Info. Theory, vol. 36,
no. 5, Sept. 1990, and a book titled "Ten Lectures on Wavelets" deal
with the mathematical aspects of the WT.
Binaries are available for the following platforms:
Sun Sparcstations running SunOS 4.1 or Solaris 2.3,
NeXT machines running NeXTstep 3.0 or higher, with an X server,
Silicon Graphics machines (IRIS),
DEC Alpha AXP running OSF/1 1.2 or higher,
i386/i486 PC compatible with Linux 0.99.
There is also a sample data directory containing interesting signals.
[From Fazal Majid majid@math.yale.edu]:
The programs have been tested on Sparcstations running SunOS
4.1.n with MATLAB 4.1. However, the "mex" code is generic and
should run on other platforms (you may have to tinker with the
Makefiles a little bit to make this work). There are several
utility routines all of them callable from MATLAB. All the C
files (leading to the mex files) can also be directly
accessed from other C or Fortran code.
A collection of papers and tech. reports from the DSP
group is also available. You could obtain this distribution
of software and papers by anonymous ftp from cml.rice.edu.
To report problems or bugs, or to get installation info for
non-SUN/non-unix platforms, send mail to wlet-tools@rice.edu
(or ramesh@dsp.rice.edu).
For all the gory details, I suggest the paper:
Andrew Reilly and Gordon Frazer and Boualem Boashash: Analytic signal
generation---tips and traps, IEEE Transactions on Signal Processing,
no. 11, vol. 42, Nov. 1994, pp. 3241-3245.
For comp.dsp, the gist is:
If your original filter design produced an impulse response with
an even number of taps, then the filtering in 3 will introduce a
spurious half-sample delay (resampling the real signal component),
but that does not matter for many applications, and such filters
have other features to recommend them.
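As a point of comparison, the textbook starting point for Hilbert transformer coefficients is the ideal impulse response h[n] = 2/(pi*n) for odd n and 0 for even n, truncated and windowed. The sketch below computes such an odd-length set of taps with a Hamming window. It is not the design procedure from the paper cited above, and hilbert_taps is a hypothetical name used only here.

```python
import math

def hilbert_taps(num_taps):
    """Windowed ideal Hilbert transformer; num_taps must be odd."""
    if num_taps % 2 == 0:
        raise ValueError("this sketch only handles an odd number of taps")
    mid = num_taps // 2
    taps = []
    for n in range(num_taps):
        k = n - mid                       # offset from the centre tap
        ideal = 0.0 if k % 2 == 0 else 2.0 / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

if __name__ == "__main__":
    for k, c in enumerate(hilbert_taps(11)):
        print(k, round(c, 5))
```

With an odd number of taps the response is antisymmetric about the centre and every second coefficient is zero, so the matching delay for the real path is a whole number of samples.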
Andrew Reilly [Reilly@zeta.org.au]
Q2.1: Where can I get public domain algorithms for general-purpose DSP?
Updated 12/31/96
Netlib
EARN/BITNET: netlib%nac.no@norunix.bitnet
X.400: s=netlib; o=nac; c=no;
EUNET/uucp: nac!netlib
NSWC Library
Report No.: NSWC TR 90-21, January 1990
by Alfred H. Morris, Jr.
Dahlgren, VA 22448-5000
U.S.A.
IEEE Press book "Programs For Digital Signal Processing"
Q2.2: What are CELP and LPC? Where can I get the source for CELP and LPC?
Updated 9/24/98
NTIS
U.S. Department of Commerce
5285 Port Royal Road
Springfield, VA 22161
USA
(800) 553-6847
Q2.3: What is ADPCM? Where can I get source for it?
Updated: 1/7/97
ADPCM stands for Adaptive Differential Pulse Code Modulation. It is a
family of speech compression and decompression algorithms. A common
implementation takes 16-bit linear PCM samples and converts
them to 4-bit samples, yielding a compression ratio of 4:1.
adpcm_coder(short inbuf[], char outbuf[], int nsample,
struct adpcm_state *state);
adpcm_decoder(char inbuf[], short outbuf[], int nsample,
struct adpcm_state *state);
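To make the 4:1 idea concrete, here is a deliberately simplified toy coder in Python. It is not the algorithm behind the adpcm_coder/adpcm_decoder routines declared above, nor any standardized ADPCM; the step-size rule, limits, and all names are invented for illustration. It shows the three ingredients: code the difference from a prediction, adapt the step size from the codes alone, and pack two 4-bit codes per byte.

```python
from dataclasses import dataclass

@dataclass
class ToyState:
    pred: int = 0     # running estimate of the signal, shared by both ends
    step: int = 16    # current quantizer step size

def _apply(state, code):
    """Update the prediction and step size from one 4-bit code."""
    mag = code & 7
    delta = mag * state.step + state.step // 2
    if code & 8:                      # bit 3 is the sign
        delta = -delta
    state.pred = max(-32768, min(32767, state.pred + delta))
    if mag >= 4:                      # large codes: signal moving fast, widen
        state.step = min(state.step * 2, 4096)
    elif mag <= 1:                    # small codes: narrow the step
        state.step = max(state.step * 3 // 4, 4)
    return state.pred

def toy_encode(samples):
    """Turn 16-bit samples into 4-bit codes (one int in 0..15 per sample)."""
    state, codes = ToyState(), []
    for s in samples:
        diff = s - state.pred
        code = min(abs(diff) // state.step, 7) | (8 if diff < 0 else 0)
        codes.append(code)
        _apply(state, code)           # track exactly what the decoder will see
    return codes

def toy_decode(codes):
    """Rebuild samples by replaying the same state updates."""
    state = ToyState()
    return [_apply(state, c) for c in codes]

def pack_nibbles(codes):
    """Pack two 4-bit codes per byte, padding with a zero code if needed."""
    padded = codes + [0] * (len(codes) % 2)
    return bytes((padded[i] << 4) | padded[i + 1] for i in range(0, len(padded), 2))

if __name__ == "__main__":
    samples = [1000] * 40
    codes = toy_encode(samples)
    print(len(pack_nibbles(codes)), toy_decode(codes)[-5:])
```

Forty 16-bit samples occupy 80 bytes, while the packed codes occupy 20 bytes. Because the encoder runs the same state update as the decoder, the two stay in step without any side information.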
Q2.4: What is GSM? Where can I get source for it?
Updated 1/7/96
Fax: +49.30.31425156, Phone: +49.30.31424315
Q2.5: How does pitch perception work, and how do I implement it on my DSP chip?
Updated 6/3/98
B.C.J. Moore, An Introduction to the Psychology of Hearing,
Academic Press, London, 1997.
Malcolm Slaney and Richard Lyon, "A Perceptual Pitch Detector,"
Proceedings of the International Conference of Acoustics, Speech,
and Signal Processing, 1990, Albuquerque, New Mexico.
Available for ftp at
ftp://worldserver.com/pub/malcolm/ICASSP90.psc.Z
Ray Meddis and M. J. Hewitt. "Virtual pitch and phase
sensitivity of a computer model of the auditory periphery. "
Journal of the Acoustical Society of America 89 (6 1991): 2866-2882
and 2883-2894.
J. Goldstein, "An optimum processor theory for the
central formation of the pitch of complex tones," Journal
of the Acoustical Society of America 54, 1496-1516, 1973.
Boris Doval and Xavier Rodet, "Estimation of Fundamental Frequency
of Musical Sound Signals," Proceedings of the 1991 International
Conference on Acoustics, Speech, and Signal Processing, Toronto,
Volume 5, pp. 3657-3660.
B. Gold and L. Rabiner, "Parallel processing techniques for estimating
pitch periods of speech in the time domain," Journal of the Acoustical
Society of America, 46, pp 441-448, 1969.
Q2.6: What standards exist for digital audio? What is AES/EBU? What is S/PDIF?
Updated 1/8/97
Q2.6.1: Where can I get copies of ITU (formerly CCITT) standards?
Q2.6.2: What standards are there for digital audio?
AES/EBU
S/P-DIF
Q2.7: What is mu-law encoding? Where can I get source for it?
Updated 9/13/99
Q2.8: How can I do CD <-> DAT sample rate conversion?
Updated 9/13/99
Q2.9: Wavelets
Updated 6/3/98
Q2.9.1: What are wavelets? Where can I get more information?
Q2.9.2: What are some good books and papers on wavelets?
Olivier Rioul and Martin Vetterli, Wavelets and Signal Processing,
IEEE Signal Processing Magazine, Oct. 1991, pp. 14-38.
Randy K. Young, Wavelet Theory and Its Applications,
Kluwer Academic Publishers, ISBN 0-7923-9271-X, 1993.
Ali N. Akansu and Richard A. Haddad,
Multiresolution Signal Decomposition: Transforms, Subbands, Wavelets,
Academic Press, Inc., ISBN 0-12-047140-X
Wavelets and Filter banks: Theory and Design, IEEE Transactions on
Signal Processing, Vol. 40, No.9, Sept. 1992, pp 2207-2232
Q2.9.3: Where can I get some software for wavelets?
ftp://pascal.math.yale.edu/pub/wavelets/software/xwpl
Rice Wavelet Tools
Q2.10: How do I calculate the coefficients for a Hilbert transformer?
Updated 6/3/98