Facing the Codec Challenge
The latest versions of Windows Media Audio and RealAudio go head to head with MP3
Playing Fair
We used the same procedure as in our last round — a direct comparison between
the original and the codec-compressed versions of musical selections either
from CDs or custom 16-bit PCM digital audio recordings, in both cases copied
bit for bit to a computer’s hard drive. The scoring reflects how close
the compressed versions came to matching the sound of the uncompressed originals.
For the listening tests, we used a computerized ABX system. Each comparison involved three tracks: one codec-processed version and two identical copies of the uncompressed original. One of the copies served as the reference track and was assigned to button X in the custom software we used, while the codec-processed version and the other copy of the original were randomly assigned to buttons A and B. In each case, the listener had to decide whether A or B sounded exactly the same as the reference, X; the remaining track was then taken to be the compressed version. The button judged to match X was given a score of 5, while the other button received a score from 4.9 down to 1.0 depending on how closely the listener felt it matched the reference.
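The button logic of a single trial can be sketched in a few lines of Python. This is only an illustration of the procedure as described, not the custom software used in the tests; the function names `assign_buttons` and `score_trial`, and the choice to reject ratings outside 1.0 to 4.9, are assumptions made for the example.

```python
import random

def assign_buttons(rng):
    """Randomly place the codec-processed track and the second copy of the
    original on buttons A and B. Button X always holds the reference copy."""
    tracks = ["codec", "original"]
    rng.shuffle(tracks)
    return {"A": tracks[0], "B": tracks[1], "X": "reference"}

def score_trial(buttons, picked_as_match, rating_for_other):
    """The button the listener judges identical to X gets 5.0; the other
    button gets the listener's rating, which must lie between 1.0 and 4.9."""
    if picked_as_match not in ("A", "B"):
        raise ValueError("listener must pick A or B")
    if not 1.0 <= rating_for_other <= 4.9:
        raise ValueError("rating must be between 1.0 and 4.9")
    other = "B" if picked_as_match == "A" else "A"
    # Scores are returned keyed by the hidden track identity, not the button.
    return {buttons[picked_as_match]: 5.0, buttons[other]: rating_for_other}

# Hypothetical trial: the listener says A matches X and rates B at 3.5.
rng = random.Random()
buttons = assign_buttons(rng)
print(score_trial(buttons, "A", 3.5))
```

Because the assignment is random, a listener who cannot hear a difference will hand the 5 to the codec-processed track about half the time, which is what keeps the comparison honest.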
We used 11 highly varied musical segments and a track of audience applause
(see “The Test Tracks” on page 100) fed through five types of codec
processing: MP3 at 128 kbps (or MP3/128 for short), WMA at 128 and 64 kbps (WMA/128
and WMA/64), and RealAudio 8 at 132 and 64 kbps (Real/132 and Real/64). We tested
RealAudio at 132 rather than 128 kbps because 132 kbps is the ATRAC3 data rate
closest to 128 kbps, and it’s also the rate used in Real.com’s music-download
service. In all, there were 60 critical-listening comparisons. Like any scientifically
controlled listening test, this one was grueling — not for the faint of
heart or ears.
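The count of 60 follows directly from crossing the 12 selections with the five codec settings. A minimal sketch, using placeholder track names (the actual selections are listed in “The Test Tracks”):

```python
from itertools import product

# Placeholder names: 11 musical segments plus the applause track.
TRACKS = [f"music_{i:02d}" for i in range(1, 12)] + ["applause"]
CODECS = ["MP3/128", "WMA/128", "WMA/64", "Real/132", "Real/64"]

# Every track is paired with every codec setting exactly once.
comparisons = list(product(TRACKS, CODECS))
print(len(TRACKS), "tracks x", len(CODECS), "codecs =", len(comparisons), "comparisons")
# prints: 12 tracks x 5 codecs = 60 comparisons
```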
The test was designed to be as fair as possible. The order of the 12 selections
was randomized in the listening sessions so that each comparison would be an
independent listening judgment. The listeners were unaware not only of the
identities of tracks A and B in each trial but also of which codec was being
tested, and two of the listeners didn’t even know which codecs were included
in this round.
The extra veil of secrecy was drawn to prevent any listener bias, particularly of the anti-Microsoft variety, from influencing the results (one of the listeners is a die-hard Macintosh partisan). A Sony Vaio MXS10 computer held all of the audio data, randomly generated the button assignments, and kept track of the scores given to each button in each trial. The unscrambled scores eventually made their way into the five bar graphs printed at right.
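The bookkeeping behind "unscrambling" can be pictured roughly as follows. This is a hedged sketch, not the program that ran on the test computer: the log layout, the `unscramble` function, and the placeholder codec labels are all assumptions, and the numbers are made up purely to exercise the code.

```python
import random
from collections import defaultdict

def unscramble(log):
    """Map per-button scores back to codec names using the hidden key.

    Each log entry is assumed to hold the codec under test, which button
    ('A' or 'B') carried the codec-processed track, and the scores the
    listener gave to buttons A and B."""
    by_codec = defaultdict(list)
    for entry in log:
        codec_score = entry["scores"][entry["codec_button"]]
        by_codec[entry["codec"]].append(codec_score)
    return dict(by_codec)

# Made-up entries with placeholder codec labels; not results from the tests.
log = [
    {"codec": "codec_1", "codec_button": "B", "scores": {"A": 5.0, "B": 4.2}},
    {"codec": "codec_2", "codec_button": "A", "scores": {"A": 3.1, "B": 5.0}},
    {"codec": "codec_1", "codec_button": "A", "scores": {"A": 4.6, "B": 5.0}},
]
random.Random().shuffle(log)  # trials are presented in random order
print(unscramble(log))         # the hidden key still groups scores by codec
```

The listener only ever sees buttons and scores; the codec label and the button that carried it stay with the computer until the session is over.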
The three listeners were contributing technical editor Daniel Kumin, contributor Frank Doris, and me. Besides our technical chops, we’re also musicians: Dan and Frank are pretty mean guitar players, and I’m a violinist. Ironically, this seems to help the judgment process by allowing us to ignore the music that’s going on and concentrate only on the sound. To protect the guilty, I’ll refer to us only by the colors in the graphs (thus, listener Red, Blue, or Orange), and I’m not telling who is represented by which color.
