Sound on the PC
Julian J. Bunn
January 1995
This article appeared in a revised form in Issue 8/95 of Speaker Builder.
Introduction
Those of you who have been following trends in the computing industry will know
that desktop computer multimedia is presently undergoing a technology
explosion. This is evidenced by the vast variety of hardware add-ons such as
sound and video frame grabber cards, CD-ROM drives, purpose-built desktop
loudspeakers, as well as software offerings like multimedia application
authoring tools, voice synthesis and recognition tools, image processing
packages and so on. In this article I will provide some background material
that I hope will be thought-provoking, together with some guesses as to how
advances in the technology may affect the desktop computer's role as a tool for the
amateur loudspeaker builder. The end of the article includes a prediction about
how these areas of technology will merge and evolve towards an integrated home
multimedia system based on a sophisticated audio/video computer. One use of
such a system may well be to correct in real time for deficiencies in room
acoustics and in loudspeaker performance or placement.
Digital Sound Processing
Over the last few years exciting new products have appeared in the domestic
digital audio marketplace, at one time an area populated almost exclusively by
CD players. Witness the burgeoning field of home cinema amplifiers with
sophisticated on-board chips that allow almost unlimited creativity in terms of
digital signal processing and various artificial sonic effects.
During the same period, the growth in the range of products for producing sound
on home computers has been prodigious, so much so that it is now estimated that
over half of all home computers being sold are capable of generating and
recording sound. These computers typically contain extra printed circuit cards
designed specifically to handle sound. Only a few years ago, the electronics on
the board, and the software techniques used for playing and recording sound,
were of low quality: typically eight bit mono samples, sampling rates of 8 kHz,
and a poor signal to noise ratio. This situation has improved considerably, so
that today the norm is sixteen bits per stereo channel, with sampling rates up
to 44.1 kHz,
and signal to noise ratios and response curves that qualify such cards for
inclusion under the "HiFi" umbrella. Additionally, the techniques for
generating and recording sounds have advanced, with the result that home
computers can sound rather pleasing, and no longer like an old 78 played across
a bad phone line. Methods for compressing audio and video data are well
advanced too. This, coupled with cheaper, faster and larger capacity hard disk
drives, means that satisfactory sizes of audio/video tracks or clips may now be
stored within the computer. Finally, the speed of CPU chips (for example the
Pentium), the greater memory bandwidth afforded by wider (more bits) and faster
buses, and the larger amounts of RAM that are usually installed all result in
data rates sufficient for jerk-free multimedia, and open up interesting
possibilities for processing the digital audio and video signals in
real time.
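To put these figures in perspective, the uncompressed data rate follows directly from the sample size, the number of channels and the sampling rate. The short Python sketch below works through the arithmetic for the "old" and "new" formats mentioned above, taking 44.1 kHz (the CD rate) as the upper sampling rate. The function name and the per-minute figure are illustrative choices of mine.

```python
def pcm_bytes_per_second(bits_per_sample, channels, sample_rate_hz):
    """Uncompressed PCM data rate in bytes per second."""
    return bits_per_sample // 8 * channels * sample_rate_hz

old = pcm_bytes_per_second(8, 1, 8000)      # eight bit mono at 8 kHz
new = pcm_bytes_per_second(16, 2, 44100)    # sixteen bit stereo at 44.1 kHz

for label, rate in (("8 bit mono, 8 kHz", old), ("16 bit stereo, 44.1 kHz", new)):
    megabytes_per_minute = rate * 60 / 1_000_000
    print(f"{label}: {rate} bytes/s, {megabytes_per_minute:.2f} MB per minute")
```

The roughly twenty-fold difference between the two rates is one reason why compression and larger hard disks matter so much for storing audio clips.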
The SoundBlaster Standard
The generic home PC is a desktop device running MS-DOS and, maybe, Windows 3.1,
with an Intel (or clone) x86 architecture chip as the CPU. PCs like this by far
outnumber any other sort of computer, whether in homes or in offices. Of
course, computer sound is not the exclusive province of the PC; it is standard
on the Macintosh, as well as on Sun and most other high-end "workstations"
(Silicon Graphics desktop workstations even include a small video camera for
use in video-conferencing as well as a microphone and loudspeakers). Although
sound card technology is not PC-specific, I will concentrate on the PC in this
article.
The de-facto standard for the PC is the SoundBlaster card, manufactured
by Creative Labs, a Singapore-based company. This card was first introduced in
the 1980s. Being "SoundBlaster compatible" is a major marketing consideration
for other card manufacturers, since vast numbers of PC games require it. The
SoundBlaster "standard" includes the specification of a certain set of
programmable registers that perform functions such as receiving command strings
from the application, returning information on the card set-up to the
application, setting the play/record mode, altering the mixer settings,
starting and stopping DMA transfers, and so on. Creative's stranglehold on the
standard is unlikely to last indefinitely as newer operating systems allow
programmers to shield themselves from the hardware details by the use of
appropriate software "drivers".
Digital sound basics
Since the PC cannot directly manipulate analogue signals it has to deal with
digital units that are, in general, multiples of an 8 bit "byte". This means
that both ADC (Analogue to Digital) and DAC (Digital to Analogue) converter
chips must be present to convert the signals to and from a format that can be
handled. The basic sound generation operation is to convert the value
represented by one byte into a voltage level. Since an eight bit byte can
represent up to 256 (i.e. 2 to the power of 8) different values, the
voltage level generated can take this number of distinct values. Conversely, the basic
sound recording operation is to convert a voltage level into a byte
value. By stringing together a series of bytes that each represent a different
voltage level, the waveform of a sound is emulated. By manipulating the string
of bytes in various ways the resulting sound wave may be altered. Digital
Signal Processing (DSP) is the term used to refer to the methods by which
signals are treated algorithmically. Sound synthesis is a special DSP technique
for generating a digital signal that, when converted to analogue and played
through a transducer, sounds like a musical instrument. Sound synthesis has
been an especially important area of development in sound card technology, and
various methods are commonly used.
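Returning to the basic conversion between bytes and voltage levels described above, the following Python sketch builds a short string of unsigned eight bit values that trace out a sine wave, and then converts each byte back into a voltage. It is only an illustration: the 1 V peak output swing, the 440 Hz tone and the function names are my assumptions, not properties of any particular card.

```python
import math

LEVELS = 256          # an eight bit byte can hold 2**8 distinct values
V_PEAK = 1.0          # assumed output swing of +/- 1 V (illustrative only)

def voltage_to_byte(v):
    """ADC step: map a voltage in [-V_PEAK, +V_PEAK] to a byte value 0..255."""
    v = max(-V_PEAK, min(V_PEAK, v))          # clip anything out of range
    return round((v + V_PEAK) / (2 * V_PEAK) * (LEVELS - 1))

def byte_to_voltage(b):
    """DAC step: map a byte value 0..255 back to a voltage level."""
    return b / (LEVELS - 1) * 2 * V_PEAK - V_PEAK

def sine_samples(freq_hz, sample_rate_hz, count):
    """A string of bytes whose voltage levels trace out a sine wave."""
    return bytes(
        voltage_to_byte(V_PEAK * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz))
        for n in range(count)
    )

samples = sine_samples(440, 8000, 16)
print(list(samples))
print([round(byte_to_voltage(b), 3) for b in samples])
```

Because only 256 levels are available, the voltage recovered from a byte can differ from the original by up to half a step, which is the source of the quantisation noise that made early eight bit cards sound rough.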
FM synthesis
This is based on the idea of "operators". The more operators, the more
satisfactory the synthesis of the sound. One or more sine tones (the
"carriers") are modified by one or more sine tone operators. The frequencies
of the operators determine how the carriers are modified: the resulting
sound is frequency modulated. This is a very general technique that makes it
possible not only to emulate traditional sounds, but also to generate completely new
sounds. One disadvantage of FM synthesis is that the synthesised sounds of real
instruments are rarely very realistic. The initial SoundBlaster cards sported
FM chips which were made (at that time) only by Yamaha. FM synthesis is thus
likely to be around for some time, since it is required for SoundBlaster
compatibility.
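The simplest arrangement, one sine carrier modified by one sine operator, can be sketched in a few lines of Python. This is a minimal illustration and not a model of the Yamaha chips: real FM hardware adds amplitude envelopes, feedback and several operator layouts. The function name, the modulation "index" parameter and the frequencies chosen are my assumptions.

```python
import math

def fm_tone(carrier_hz, modulator_hz, index, sample_rate_hz, count):
    """Two-operator FM: a sine modulator varies the phase of a sine carrier."""
    out = []
    for n in range(count):
        t = n / sample_rate_hz
        modulator = math.sin(2 * math.pi * modulator_hz * t)
        out.append(math.sin(2 * math.pi * carrier_hz * t + index * modulator))
    return out

plain = fm_tone(440, 220, 0.0, 44100, 1000)    # index 0: just the sine carrier
bright = fm_tone(440, 220, 5.0, 44100, 1000)   # larger index: more sidebands
```

Changing only the ratio of the two frequencies and the index already gives a wide range of timbres, which is why so few operators can produce so many different, if not always realistic, sounds.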
Wave table synthesis
In contrast to FM synthesis, Wave Table synthesis allows extremely faithful
simulation of real instrument sounds, since it makes use of digitised
recordings (in the form of "wave files") of real instrument sounds. Boards
offering Wave Table synthesis usually come with a selection of available
instrument sounds. If a new instrument sound is needed, then it can be
downloaded from a Wave Table file repository over a network or by
modem, or purchased on a diskette, etc. A disadvantage of Wave Table
synthesis is that it requires a lot of RAM to store the Wave Tables, although
some chips (e.g. the Yamaha OPL4) have a permanent ROM that contains those that
are commonly required.
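The essential trick of Wave Table playback, reading a stored recording back at different speeds to obtain different pitches, can be sketched as below. The table here is a synthetic stand-in for a digitised instrument recording, and the function name and the use of linear interpolation are my assumptions. Real chips also handle looping points, envelopes and many simultaneous voices.

```python
import math

def play_wavetable(table, step, count):
    """Read `count` samples from a looped table, advancing `step` entries per
    output sample. Linear interpolation fills in between stored entries."""
    out = []
    pos = 0.0
    size = len(table)
    for _ in range(count):
        i = int(pos)
        frac = pos - i
        a = table[i]
        b = table[(i + 1) % size]
        out.append(a + (b - a) * frac)
        pos = (pos + step) % size
    return out

# Stand-in for a recorded instrument: one cycle of a sine plus its second harmonic
table = [math.sin(2 * math.pi * k / 64) + 0.5 * math.sin(4 * math.pi * k / 64)
         for k in range(64)]

original = play_wavetable(table, 1.0, 64)     # the pitch as stored
octave_up = play_wavetable(table, 2.0, 64)    # twice the step: one octave higher
```

The memory cost mentioned above is visible even here: every instrument needs its own table, whereas an FM voice needs only a handful of parameters.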
Digital Signal Processing
In this context we mean the inclusion on the sound card of one or more DSP
chips (as opposed to FM or Wave Table chips etc.). This will almost certainly
be the future means of handling computer-based sound. The key advantage of DSP
is that it is a technique that gives the developer full control over how sounds
are generated or treated: it does not rely on a fixed method built into
chip logic, as FM and Wave Table synthesis do. DSP chips are programmed to
apply an algorithmic process to a digitised audio signal or to directly
generate an audio signal. The details of the process are then up to the
application designer. Some examples are the addition of reverberation to an
existing signal, the application of an FFT for voice recognition purposes, and
the simulation of the sound of each digit on a touch-tone telephone. Two
disadvantages of DSP today are that the chips tend to be expensive and are
not easily programmed; both objections are likely to become less and less
valid.
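The touch-tone example is easy to sketch, since each key simply sounds two sine tones at once. The Python fragment below uses the standard touch-tone (DTMF) row and column frequencies. The function names, the 8 kHz sampling rate and the 0.1 s duration are my assumptions, and a real DSP chip would be programmed in its own instruction set rather than in Python.

```python
import math

# Standard touch-tone (DTMF) keypad: each key sounds one row and one column tone
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477)
KEYPAD = ("123", "456", "789", "*0#")

def dtmf_frequencies(key):
    """Return the (row, column) frequency pair in Hz for a keypad character."""
    for row, keys in enumerate(KEYPAD):
        if len(key) == 1 and key in keys:
            return ROW_HZ[row], COL_HZ[keys.index(key)]
    raise ValueError(f"not a touch-tone key: {key!r}")

def dtmf_tone(key, sample_rate_hz=8000, duration_s=0.1):
    """Digitised signal for one key press: the sum of its two sine tones."""
    low, high = dtmf_frequencies(key)
    count = round(sample_rate_hz * duration_s)
    return [
        0.5 * (math.sin(2 * math.pi * low * n / sample_rate_hz)
               + math.sin(2 * math.pi * high * n / sample_rate_hz))
        for n in range(count)
    ]

tone = dtmf_tone("5")
print(dtmf_frequencies("5"), len(tone))
```

The point is that nothing in the hardware is specific to telephones: the same chip could be given a reverberation or FFT program instead.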
Sound Chips
The heart of the sound card, then, is the chip set that controls the DSP. Audio
DSP is just a special case of a general need throughout the industry for chips
that can process digitized signal data at high rates and in great precision.
Consequently the silicon industry is ramping up production capacity and pouring
research and development money into new chip designs. We'll take some specific
examples that are tagged for the audio markets. Yamaha's OPL4 combines
20 FM and 24 Wave Table synthesised voices on the same chip. An optional effect
processor provides surround sound, echo and reverberation. The Wave Table data
are stored in ROM, and can be in 8, 12 or 16 bit sample format. Analog
Devices recently announced the AD1845 chip, which incorporates full duplex
record and play and variable sampling rates. This company's chips
are already widely distributed on boards from manufacturers such as Orchid
Technology, Hewlett Packard, and Kurzweil Music Systems. The latter company
markets a "MASS Sound Engine" with 32-voice Wave Table synthesis and effects
like echo, reverberation, flanging and pitch shifting. Texas Instruments'
TMS320C30 and TMS320C40 DSP devices are used by a number of board manufacturers
who offer PC hardware and software products for analysing audio signals.
Companies such as Sonitech and Loughborough Sound Images have
product ranges that include sophisticated spectrum analysis, speech
recognition, and filtering tools based on these TI chips. Hitachi are
working on chips dedicated to audio/video applications. Intel has
announced its intention to incorporate DSP in the next generation of
Pentium-based PCs. TriMedia/Philips are working on a multimedia chip
known as a programmable DSP/CPU, which, as its name suggests, combines CPU and
DSP partitions on the same chip. This will probably be programmable from C.
The Versatile PC
Taking into account the technology trends and equipment already available, we
look now at how the tasks of the PC will likely diversify in the domestic
setting in years to come.
The speaker builder's PC
The PC already comes into its own as a tool for the speaker builder when
crossovers need to be designed, box dimensions calculated, optimum loudspeaker
positions in a room estimated, and so on. Many such tools exist both as
commercial products, and in the public domain or as Shareware. They are all
passive tools, however, in the sense that you need to type in parameters and
measurements before the PC can make the required calculations. More exciting
possibilities are now emerging, where the PC itself takes care of gathering the
data, which it then analyses and displays in a meaningful form. The obvious
examples of this are several products that turn the PC into an audio frequency
response measurement tool: you just take care of positioning the microphone,
and the PC plots the response of the equipment.
We can also imagine a tool that completely automates the process of designing a
crossover for a loudspeaker enclosure and drivers. Imagine the situation where
you have built the box and installed the drivers in it, and you want to build a
crossover that produces the flattest response curve (even though this may not
in fact produce the best sound!). Leave your copy of Vance Dickason in the
bookcase, and turn your PC on instead. This PC contains two sound cards: you're
going to use one to drive the tweeter, and one to drive the woofer. Connect the
tweeter to the output of one card, and the woofer to the output of the other.
Connect a microphone to the input of one sound card, and place the microphone a
short distance from the loudspeaker. Now fire up the automatic crossover design
software, and watch the PC screen as it displays the progress of its
deliberations! After a few moments it prints out the circuit diagram and LCR
values of the optimum crossover for your box and drivers. It asks if you'd like
to listen to some music in order to evaluate what the system would sound like,
or if you'd like to alter the order of the crossover, and see if that would
produce any improvement. In order to do all this, the crossover design software
uses a simulation of the response of an Nth-order crossover with given LCR
values to send the appropriate (different) signals to the tweeter and woofer,
and then records and analyses the resulting signal from the microphone.
According to this analysis, it simulates an adjustment of the LCR values in
order to approach a flat response curve. Varying the simulated LCR values so as
to fit the recorded response to a flat curve by the method of least squares is
a computationally simple task.
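The fitting step might be sketched as follows. This is only an illustration under heavy assumptions: the two driver responses are invented formulas standing in for real microphone measurements, the crossover is first order, the tweeter level is a single gain (negative values meaning reversed polarity), and a brute-force grid search takes the place of a proper least squares solver. All names and numbers are hypothetical.

```python
import math

def lowpass(f, fc):
    """First-order low-pass transfer function at frequency f, corner fc."""
    return 1 / (1 + 1j * f / fc)

def highpass(f, fc):
    """First-order high-pass transfer function at frequency f, corner fc."""
    return (1j * f / fc) / (1 + 1j * f / fc)

# Hypothetical stand-ins for the measured driver responses
def woofer(f):
    return lowpass(f, 4000.0)            # assumed to roll off above about 4 kHz

def tweeter(f):
    return 1.4 * highpass(f, 1200.0)     # assumed louder, rolling off below 1.2 kHz

FREQS = [100 * 2 ** (k / 3) for k in range(22)]   # third-octave points, 100 Hz to 12.8 kHz

def flatness_error(fc, tweeter_gain):
    """Sum of squared deviations (in dB) of the summed response from its mean."""
    levels = []
    for f in FREQS:
        total = (woofer(f) * lowpass(f, fc)
                 + tweeter_gain * tweeter(f) * highpass(f, fc))
        levels.append(20 * math.log10(abs(total)))
    mean = sum(levels) / len(levels)
    return sum((lv - mean) ** 2 for lv in levels)

# Candidate crossover frequencies and tweeter gains, in both polarities
GAINS = [sign * k / 20 for sign in (1, -1) for k in range(4, 25)]
candidates = [(fc, g) for fc in range(1000, 5001, 250) for g in GAINS]
best_fc, best_gain = min(candidates, key=lambda p: flatness_error(*p))

R = 8.0                                          # assumed nominal impedance, ohms
L_mH = R / (2 * math.pi * best_fc) * 1e3         # series inductor for the woofer
C_uF = 1 / (2 * math.pi * best_fc * R) * 1e6     # series capacitor for the tweeter
print(f"fc = {best_fc} Hz, tweeter gain = {best_gain:+.2f}, "
      f"L = {L_mH:.2f} mH, C = {C_uF:.2f} uF")
```

In a real tool the invented `woofer` and `tweeter` functions would be replaced by responses measured through the microphone, and the tweeter gain would be realised with resistors.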
A PC in your hi-fi system
Modern consumer audio systems allow the acoustic response of one venue to be
superimposed on a piece of music recorded in a different venue. These systems
offer the owner varying amounts of control over the parameters of such audio
signal conversion. We can imagine the audio signal conversion component of
such a system being replaced by an audio DSP-capable PC. In this case the only
limit to the variety of effects that can be achieved is the flexibility of the
DSP computer programs running in the PC and our imagination. We can imagine a
set of software DSP "tools" that we can bolt together as building blocks to
achieve the processing effect or monitoring we want. Here is a list of some of
the more obvious:
An FFT tool that displays the frequency components of the source signal (i.e. a
spectrum analyser),
A graphic equalizer that allows us to boost or suppress bands of frequency in
the signal,
A stereo L-R difference signal extractor, that we use to detect mono, or use as
an extra output,
A Dolby Pro-Logic emulator, that in software extracts the surround sound
information from signals of this type,
A digital filter, whose coefficients are calculated to reproduce the acoustic
environment of an arbitrary venue,
A convolver, which convolves the digital filter above with an audio signal to
reproduce the sound in the venue,
A room response calculator, used with a microphone placed in the room, which
unfolds (deconvolves) the source signal from the measured signal in the room,
and yields the coefficients of the digital filter for the room,
An inverter, which flattens out the measured room response, by using the
inverse of the measured room response.
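A few of these building blocks are simple enough to sketch directly. The Python fragment below shows a stereo difference extractor, a convolver and a crude spectrum analyser, and then bolts two of them together. It is a toy under stated assumptions: the function names are mine, the "venue" is an invented impulse response consisting of the direct sound plus one echo, and a real-time system would use an FFT and dedicated DSP hardware rather than these slow direct loops.

```python
import cmath

def lr_difference(left, right):
    """L-R difference signal: all zeros means the source is mono."""
    return [l - r for l, r in zip(left, right)]

def convolve(signal, impulse_response):
    """Direct convolution of a signal with a digital filter's coefficients."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def spectrum(signal):
    """Magnitudes of a naive DFT (an FFT gives the same numbers, only faster)."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * m / n)
                    for m, x in enumerate(signal)))
            for k in range(n // 2 + 1)]

# Bolting blocks together: impose a toy venue on a dry signal, then analyse it
dry = [1.0, 0.5, 0.25, 0.0]
venue = [1.0, 0.0, 0.0, 0.6]      # hypothetical: direct sound plus one echo
wet = convolve(dry, venue)
print(wet)
print([round(m, 3) for m in spectrum(wet)])
```

The room response calculator and the inverter work in the opposite direction, dividing out a response rather than applying it, and need more care than this sketch suggests because a measured response may be close to zero at some frequencies.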
Futures
The main identifiable trend in PC-based audio today is clearly that of
integrating audio DSP on the motherboard of the PC. This not only does away
with sound card installation hassles, but also allows the manufacturer to
properly integrate and optimise the power of the sound chip set with the rest
of the electronics. We are likely to see full duplex play and record,
increasingly sophisticated on-board DSP algorithms (including perhaps
multi-channel surround sound decoders), higher sampling rates, better
filters, and 32 bit sampling, rising to 64 bits in the longer term.
A discussion of PC-based audio futures would not be complete without mentioning
areas which, in the end, will most likely be combined into a single device that
acts as a home control centre. These include the integration of the telephone
system in the PC, with the full functionality of an answering machine and fax
machine, together with voice recognition software that can identify the caller,
voice synthesis software that can formulate an appropriate reply(!), and
software that will allow the caller to interact with the PC by using, for
example, the buttons on a touchtone telephone. With the PC an integral part of
the home telephone system, it can provide you access to remote computers via
dial-up lines, allowing you to download or upload audio/video files, and even
to play them in real time once compression algorithms have advanced
sufficiently. Access across dial-up lines to the Internet, and hence to the vast
wealth of online information there, is another highly attractive possibility.
Those of us fortunate enough to be able to use the Internet-based World Wide
Web, with its fully interactive choice of many terabytes of audio and/or visual
data files held on computers spread around the globe, already appreciate the
exciting possibilities of the fully networked home computer.
Summary
This has been a personal view of what I believe the future holds for PC-based
audio. It is primarily an enthusiast's, rather than an expert's, opinion.