Audio Tutorial: A Gentle Intro (and tutorial) to Digital Audio

by Lynne Schalman and Steve Bergen
(you can e-mail us both at )
Last revised 12/23/07

  1. Define these computer words that will be used in discussing audio and digital music: byte, K, megabyte? (answer)

  2. What are some file types used in recording sound? How do you create or change the type of file? (answer)

  3. What is meant by sampling rate? What are some common sample rates? (answer)

  4. Which audio type is used for CDs? What sampling rate? (answer)

  5. What does it mean to "record to RAM" vs "record to disk"? (answer)

  6. Explain why one second of stereo sound recorded for CD quality would consume 176K. (answer)

  7. How much data on a typical CD? Approximately what is the maximum number of minutes of sound that can fit onto a CD? (answer)

  8. The format called MP3 is short-hand for MPEG-1 Audio Layer 3 (often loosely called "MPEG3"). What do you call a software program that can change an audio file from AIFF to MP3? (answer)

  9. An MP3 file is approximately what fraction of an AIFF file? (answer)

  10. Why are MIDI files so small (e.g. xxx.mid)? (answer)

  11. How come some audio files will play on one computer and not on another computer? (answer)

Useful links and files you may wish to obtain (legally, we hope)

  • Define these computer words that will be used in discussing audio and digital music: byte, K, megabyte?

    The basic unit of information is the BYTE which is equivalent to one character or symbol of text. For example, the word HELLO is a 5 byte word. Because spaces are just as significant as letters, NEW YORK is one 8 BYTE word, which is the way most New Yorkers say it anyway.

    1000 BYTES = 1 K: How many BYTES on a piece of paper? Well, a good approach is to consider double-spaced text with wide margins. For example, you might have 50 characters across the page, with a total of 20 lines down. Therefore, we have 20 X 50 bytes or 1000 bytes on the page. Computer people refer to 1000 bytes as 1 KILOBYTE and then abbreviate it as 1 K.

    Just as 1000 bytes equals 1 kilobyte (1K), 1 million bytes (1000K) is 1 MEGABYTE (usually pronounced MEG) and 1 billion bytes (1000 megabytes) is 1 GIGABYTE. Hard disks typically range from 40 meg to 8 gigabytes. CDs hold up to 650,000K, which we call 650 meg or over 1/2 a GIGABYTE.
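    The unit arithmetic above can be sketched in a few lines of Python. This is only an illustration using the tutorial's round numbers (1 K = 1000 bytes); the function name describe_size is made up for this example.

```python
# Decimal units as used in this tutorial (1 K = 1000 bytes).
UNITS = [("gig", 1_000_000_000), ("meg", 1_000_000), ("K", 1_000)]

def describe_size(num_bytes):
    """Return a size such as '650 meg' using the tutorial's 1000-based units."""
    for name, size in UNITS:
        if num_bytes >= size:
            return f"{num_bytes / size:g} {name}"
    return f"{num_bytes} bytes"

print(len("NEW YORK"))             # 8 characters (the space counts), so 8 bytes
print(describe_size(20 * 50))      # one double-spaced page: 1 K
print(describe_size(650_000_000))  # a CD: 650 meg
```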

    This chart summarizes basic computer literacy for various disks

    What are some file types used in recording sound? How do you create or change the type of file?

    AIF, WAV, AU, MID, MP3 are five types. There are many more. Click here for descriptions of these various file types.

    Whenever you are using "audio software" to record a sound with a microphone, there are different choices that you can make using the SAVE AS feature. Generally, there is a button or arrow that lists the various types of files that you can save with that software. A good exercise with a new piece of audio software is to create a sound file and then save the sound file in multiple file formats, each time including the type within the name. For example, let us say you sing my MP3 song (based on My Blue Heaven):

    While others might buy
    I don't understand why
    I'm happy in my MP3 heaven

    I download all the music that I want to keep
    My teacher says that it's illegal, but I don't lose no sleep.

    Now save the file in aiff format and call it myblueaiff. Save it again (using SAVE AS) in au format and call it myblueau. Save it again in wav format and call it mybluewav. If you are on a PC, you might find an extension created for you. If on a Mac, no 3 letter extension is added. But these files will be useful for you in terms of learning which software will play what file types!

    What is meant by sampling rate? What are some common sample rates?

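    The sampling rate is the number of times each second that the sound wave is "inspected" (measured). Each inspection records the wave's amplitude at that instant; the note and tone we hear come from how those amplitudes change from one sample to the next.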

    I sometimes teach people how to say the word "NOTHING" at Summercore and vary the sampling rate. If we have a sampling rate of 2, we might hear: N ... TH. If we change the sample rate to 4, we might hear: N .. UH ... TH ... G. If we change the sample rate to 6, we might hear N ... UH ...TH ... I ... N ... G and it would begin to be a better capture of the sound.

    Typical sample rates used by recording software range from about 11,000 per second to about 44,000 per second (the exact standard rates are 11,025, 22,050 and 44,100). That means that the software is capable of "listening" to the sound and analyzing it that many times per second. Each time it "listens," it assigns a number to what it hears. The abbreviation kHz stands for "kilohertz" or thousand per second. Therefore, it is common to see a sample rate between 11 kHz and 44 kHz.
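    To make "listening many times per second" concrete, here is a minimal Python sketch that measures a pure tone at the three standard rates. The 440 Hz test tone and the function name sample_sine are assumptions chosen for illustration, not something the recording software exposes.

```python
import math

def sample_sine(frequency_hz, sample_rate, duration_s=1.0):
    """Measure a sine wave's amplitude sample_rate times per second."""
    count = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * frequency_hz * n / sample_rate)
            for n in range(count)]

# One second of a 440 Hz tone at each common rate: more samples, finer capture.
for rate in (11_025, 22_050, 44_100):
    samples = sample_sine(440, rate)
    print(rate, "samples per second ->", len(samples), "numbers stored")
```

    Each number in the list is one "listen"; doubling the rate doubles how many numbers have to be stored for the same second of sound.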

    Which audio type is used for CDs? What sampling rate?

    When you record audio for a CD, the specs are 44.1 kHz (usually rounded to 44 kHz) and 16 bit stereo. Here kHz means "kilohertz" or thousands per second, so 44.1 kHz means 44,100 samples per second. The file type is AIFF. There can be no variations here. There is something called the "Red Book standard for CD audio" that requires all CD audio to be 16 bit and 44.1 kHz.

    By the way, even with a fixed number of samples per second (such as 44,100), the file size can vary depending on whether you record at 8 bit or 16 bit. This description refers to how many different numbers are available to describe the sound that is heard at each sample time. 8 bit means 256 possibilities per sample while 16 bit means 65,536 possibilities per sample.

    You can get anything you want at Alice's restaurant ... 10 seconds (230K) ... taught to me by my wonderful guitar teacher (Bob Lasher) ... author of course is Arlo Guthrie who was recently quoted as saying "the internet is the last bastion of real freedom; it's the world's biggest democracy."

    Note that this file is 230K for 10 seconds, because I recorded this file in mono at 22 kHz. If I had recorded it for "CD quality" then each second would consume 176K and so 10 seconds would consume 1760K.

    "You can get any bug you fear
    When Y2K hits next year
    Walk right into the computer lab
    You may find that it's very sad."
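    The size comparison for the 10 second clip can be checked with a short Python sketch. The function name audio_bytes is hypothetical, and the 8 bit setting for the mono clip is an assumption (the text only says mono at 22 kHz); it gives a figure close to the 230K quoted.

```python
def audio_bytes(sample_rate, bits_per_sample, channels, seconds):
    """Uncompressed size: samples per second x bytes per sample x channels x time."""
    return sample_rate * (bits_per_sample // 8) * channels * seconds

print(2 ** 8)    # 256 possibilities per 8 bit sample
print(2 ** 16)   # 65536 possibilities per 16 bit sample

# Assumed settings for the 10 second clip: 22,050 samples/sec, 8 bit, mono.
print(audio_bytes(22_050, 8, 1, 10))    # 220500 bytes, roughly 220K

# The same 10 seconds at CD quality: 44,100 samples/sec, 16 bit, stereo.
print(audio_bytes(44_100, 16, 2, 10))   # 1764000 bytes, roughly 1760K
```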

    What does it mean to "record to RAM" vs "record to disk"?

    All of the numbers (the samples) need to be recorded somewhere when you digitize a sound. Most software programs digitize to RAM (the computer's memory chips). Typically, the software might have 5, 10 or 20 meg of free RAM to use. Once this is filled, no more of the sound can be stored. Software like this limits you from recording long songs or long narrations, but this is the TYPICAL approach. More sophisticated software will be able to offer the option of "saving to disk" instead of only "recording to RAM."

    On the Mac, the standard audio recording utility allows just 10 seconds of recording time. A program such as Sound Effects gives us the ability to record (and edit) 1 to 1.5 minutes of sound while Ultra Recorder has the feature allowing you to save to disk and therefore record a longer piece.

    On the PC, the standard recording software found within Multimedia within accessories allows about 1 minute of standard recording. A program such as Total Recorder gives us far better flexibility.
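    A rough Python sketch of why recording to RAM runs out so quickly: divide the free memory by the bytes consumed each second. The function name seconds_that_fit is made up for this example, and the 5, 10 and 20 meg figures are the ones mentioned in the answer above.

```python
def seconds_that_fit(free_bytes, sample_rate, bytes_per_sample, channels):
    """How many seconds of uncompressed sound fit in the given amount of memory."""
    bytes_per_second = sample_rate * bytes_per_sample * channels
    return free_bytes / bytes_per_second

for meg in (5, 10, 20):
    free = meg * 1_000_000
    cd_quality = seconds_that_fit(free, 44_100, 2, 2)   # 16 bit stereo
    low_quality = seconds_that_fit(free, 22_050, 1, 1)  # 8 bit mono
    print(f"{meg} meg: {cd_quality:.0f} sec at CD quality, "
          f"{low_quality:.0f} sec at 22 kHz 8 bit mono")
```

    At CD quality even 10 meg of RAM holds less than a minute, which is why the "save to disk" option matters for longer pieces.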

    Explain why one second of stereo sound recorded for CD quality would consume 176K.

    Understanding Digital Sound (One second of stereo = 176K)

    With the language of sample rates and bytes, we are now able to understand how sound is stored digitally, e.g. on a CD or computer. A sampling rate of 44,100 means that the sound wave is "inspected" that many times each second and its amplitude at that instant is recorded. There are 65,536 choices for each sample. Since 65,536 = 256 x 256 and each byte has 256 patterns, this means that each sample consumes 2 bytes of data. Thus 44,100 samples in one second would be 88,200 bytes or 88K. This now gets replicated for both the left and right channels (stereo), yielding 176,400 bytes or 176K.
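    One way to check the 176K figure is to build one second of CD-quality sound and count the bytes. The sketch below uses Python's standard wave module, so it writes the WAV format rather than AIFF (an assumption made only because WAV support ships with Python), and it keeps the file in memory instead of on disk. The function name one_second_of_silence is hypothetical.

```python
import io
import wave

SAMPLE_RATE = 44_100      # samples per second
CHANNELS = 2              # stereo: left and right
BYTES_PER_SAMPLE = 2      # 16 bit = 2 bytes

def one_second_of_silence():
    """Build a one-second CD-quality WAV file in memory and return its bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(BYTES_PER_SAMPLE)
        wav_file.setframerate(SAMPLE_RATE)
        # A run of zero bytes is digital silence for 16 bit samples.
        wav_file.writeframes(bytes(SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE))
    return buffer.getvalue()

wav_bytes = one_second_of_silence()
with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
    audio_size = (wav_file.getnframes() * wav_file.getnchannels()
                  * wav_file.getsampwidth())

print(audio_size)       # 176400 bytes of sound data
print(len(wav_bytes))   # a little more, because the file also carries a header
```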

    How much data on a typical CD? What is the maximum number of minutes of sound that can fit onto a CD?

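    Since the previous answer explained that one second of CD-quality stereo is 176K, one minute of sound would be 60 x 176K or 10,560K, i.e. about 10 megabytes. We see that 65 minutes would therefore equal roughly 650 megabytes, filling up a typical CD! A CD used for computer data holds a maximum of about 650 megabytes. (Commercial audio CDs are rated at 74 minutes, because audio tracks also use space that a data CD sets aside for extra error correction.) DVDs can get recorded on both sides, but CDs get recorded on just one side, the side WITHOUT the label.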
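    The same estimate can be worked in Python without rounding along the way. This is a sketch under the tutorial's assumptions (650 meg capacity, 1 meg = 1,000,000 bytes); the constant names are made up for the example. With unrounded figures the answer lands a little under the 65 minute estimate, because a minute is really closer to 10.6 meg than to 10.

```python
BYTES_PER_SECOND = 44_100 * 2 * 2    # samples x bytes per sample x channels
CD_CAPACITY_BYTES = 650_000_000      # the 650 meg figure used in this tutorial

bytes_per_minute = BYTES_PER_SECOND * 60
minutes_on_cd = CD_CAPACITY_BYTES / bytes_per_minute

print(bytes_per_minute)              # 10584000 bytes, about 10.6 meg
print(f"{minutes_on_cd:.1f} minutes of CD-quality stereo in 650 meg")
```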

    The format called MP3 is short-hand for MPEG-1 Audio Layer 3 (often loosely called "MPEG3"). What do you call a software program that can change an audio file from AIFF to MP3?

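    We call the software programs "MP3 rippers" because they can "rip" large CD files into smaller files. (Strictly speaking, the "ripper" copies the track off the CD and the "MP3 encoder" does the shrinking, but many programs do both.) There are also "MP3 players" that can play MP3 songs without being able to do the "ripping" task as well.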

    MP3 FILES ON THE WEB

    The MP3 file format condenses a song while keeping it very close to CD quality (a little detail is discarded to make the file smaller). If you go to the Don't Think Y2K page of www.summercore.com, you will see an explanation of these details with samples that you can download!

    We in the "Summercore" Teaching Company use a PC program called musicmaker to turn CD tracks into MP3 files. For example, the Don't Think Y2K file on an audio CD is 33 meg. When we used musicmaker (available from www.musicmaker.com) to compress the song, the file was 3.1 meg, yet sounded very close to CD quality. The MP3 files are easily moved from your hard drive to the Rio player via a parallel cable into the LPT1 port. To some extent MP3 is like the JPEG of the audio world!

    The Rio RMP300 portable music player stores up to 32 meg of songs in RAM. According to the Rio manual by Diamond Multimedia Systems (www.diamondmm.com):

    "MP3 files can be legally made for personal use from your favorite CDs that you have purchased"

    An MP3 file is approximately what fraction of an AIFF file?

    Generally, the MP3 file is about 1/10th the size. For example, my song DON'T THINK Y2K, IT'S ALL RIGHT is a 31 meg aif file on a CD (dont.aif) while we have posted dont.mp3 on www.summercore.com which is about 3 meg. The song itself is 3 min 11 seconds long. Of course the original tune is by Bob Dylan, but the new lyrics (copyrighted) are by me with Bob Lasher (Nobles '68) performing with voice, guitar and harmonica.
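    The "about 1/10th" claim can be checked against the figures quoted in this tutorial (33 meg on the CD, 3.1 meg as an MP3, 3 min 11 seconds long). The Python sketch below is only arithmetic on those numbers; the function names are made up for the example.

```python
def compression_ratio(original_bytes, compressed_bytes):
    """How many times smaller the compressed file is."""
    return original_bytes / compressed_bytes

def average_kbps(file_bytes, seconds):
    """Average data rate in thousands of bits per second."""
    return file_bytes * 8 / seconds / 1000

song_seconds = 3 * 60 + 11           # 3 min 11 seconds

print(round(compression_ratio(33_000_000, 3_100_000), 1))  # 10.6
print(round(average_kbps(3_100_000, song_seconds)))        # 130
```

    A result near 130 thousand bits per second is in line with the 128 kbps setting that many MP3 programs offer, which is where the rough 1/10th rule of thumb comes from.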

    Why are MIDI files so small (e.g. xxx.mid)?

    MIDI files are files of notes, not audio recordings. They tell a designated instrument on your computer which notes to play. Accordingly, MIDI files of long songs will be under 100K, since all that is being stored is the pattern of notes.
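    A back-of-the-envelope Python sketch shows the difference in scale. The figure of 8 bytes per note is an assumption for illustration (a MIDI "note on" and "note off" message are 3 bytes each, plus roughly a byte of timing for each), and the 3,000 note, 3 minute song is hypothetical.

```python
BYTES_PER_NOTE = 8   # assumed: note-on (3) + note-off (3) + about 2 bytes of timing

def midi_estimate(note_count):
    """Rough size of the note data in a MIDI file."""
    return note_count * BYTES_PER_NOTE

def cd_audio_bytes(seconds):
    """Size of the same stretch of time as CD-quality stereo sound."""
    return 44_100 * 2 * 2 * seconds

notes_in_song = 3000   # hypothetical song
song_seconds = 180     # hypothetical length: 3 minutes

print(midi_estimate(notes_in_song))   # 24000 bytes, i.e. 24K
print(cd_audio_bytes(song_seconds))   # 31752000 bytes, i.e. about 32 meg
```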


    How come some audio files will play on one computer and not on another computer?

    There are many versions of Netscape and Internet Explorer. There are many different operating systems (e.g. Windows 95, Windows 98, Mac OS 7.5, Mac OS 8.5). Occasionally, a sound has been recorded in a format that plays on one system and not the other.