Audio Tutorial: A Gentle Intro (and tutorial) to Digital Audio
by Lynne Schalman and Steve Bergen
(you can e-mail us both at )
Last revised 12/23/07
The basic unit of information is the BYTE which is equivalent to one character or symbol of text. For example, the word HELLO is a 5 byte word. Because spaces are just as significant as letters, NEW YORK is one 8 BYTE word, which is the way most New Yorkers say it anyway.
1000 BYTES = 1 K: How many BYTES on a piece of paper? Well, a good approach is to consider double-spaced text with wide margins. For example, you might have 50 characters across the page, with a total of 20 lines down. Therefore, we have 20 X 50 bytes or 1000 bytes on the page. Computer people refer to 1000 bytes as 1 KILOBYTE and then abbreviate it as 1 K.
Just as 1000 bytes equals 1 kilobyte (1K), 1 million bytes (1000K) is 1 MEGABYTE (usually pronounced MEG) and 1 billion bytes (1000 megabytes) is 1 GIGABYTE. Hard disks typically range from 40 meg to 8 gigabytes. CDs hold up to 650,000K, which we call 650 meg or over 1/2 a GIGABYTE.
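The byte arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: it assumes plain ASCII text (one byte per character) and the round convention used here of 1 K = 1000 bytes, and the names in it are made up for illustration.

```python
# Assumption: plain ASCII text, so one character = one byte.
K = 1000          # bytes in 1 K (the round-number convention used in this tutorial)
MEG = 1000 * K    # bytes in 1 meg
GIG = 1000 * MEG  # bytes in 1 gigabyte

def size_in_bytes(text):
    """Count the bytes in a piece of ASCII text (spaces count too)."""
    return len(text.encode("ascii"))

page_bytes = 50 * 20      # 50 characters across, 20 lines down
cd_bytes = 650_000 * K    # a 650,000K CD

print(size_in_bytes("HELLO"))     # 5
print(size_in_bytes("NEW YORK"))  # 8
print(page_bytes / K)             # 1.0 (one page is about 1 K)
print(cd_bytes / MEG)             # 650.0
print(cd_bytes / GIG)             # 0.65
```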
This chart summarizes basic computer literacy for various disks
AIF, WAV, AU, MID, and MP3 are five types of audio files. There are many more.
Whenever you are using "audio software" to record a sound with a microphone, there are different choices that you can make using the SAVE AS feature. Generally, there is a button or arrow that contains the various types of files that you can save with that software. A good exercise with a new piece of audio software is to create a sound file and then save the sound file in multiple file formats, each time including the type within the name. For example, let us say you sing my MP3 song (based on My Blue Heaven)
I download all the music that I want to keep
My teacher says that it's illegal, but I don't lose no sleep.
A sampling rate is the number of times each second that the sound wave is "inspected" to determine its qualities: note, tone and amplitude.
I sometimes teach people how to say the word "NOTHING" at Summercore and vary the sampling rate. If we have a sampling rate of 2, we might hear: N ... TH. If we change the sample rate to 4, we might hear: N .. UH ... TH ... G. If we change the sample rate to 6, we might hear N ... UH ...TH ... I ... N ... G and it would begin to be a better capture of the sound.
Typical sample rates used by recording software might range from 11,000 per second to 44,000 per second. That means that the software is capable of "listening" to the sound and analyzing it that many times per second. Each time it "listens," it assigns a number to what it hears. The abbreviation kHz stands for "kilohertz" or thousands per second. Therefore, it is common to see a sample rate between 11 kHz and 44 kHz.
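To make the idea of "listening" many times per second concrete, here is a small Python sketch. It does not record from a microphone; it assumes a made-up pure 440 Hz tone and simply measures it at the chosen sample rate, so that each sample is one number. The function name and the tone are illustrative assumptions, not part of any recording program.

```python
import math

def sample_tone(frequency_hz, sample_rate, seconds=1.0):
    """Measure a pure tone sample_rate times per second.

    Returns one amplitude (a number between -1.0 and 1.0) per sample.
    """
    count = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * frequency_hz * n / sample_rate)
            for n in range(count)]

# The higher the sample rate, the more numbers describe the same second of sound.
for rate in (11_000, 22_000, 44_000):
    samples = sample_tone(440, rate)
    print(rate, "samples per second ->", len(samples), "numbers to store")
```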
When you record audio for a CD, the specs are 44.1 kHz (usually rounded to 44 kHz) and 16 bit stereo. So 44.1 kHz means 44,100 samples per second. The file type is AIFF. There can be no variations here. There is something called the "Red Book standard for CD audio" that requires all CD audio to be 16 bit and 44.1 kHz.
By the way, even with a fixed number of samples per second (such as 44,000), the file size can vary depending on whether you record at 8 bit or 16 bit. The bit depth refers to how many different numbers are available to describe the sound that is heard at each sample time. 8 bit means 256 possibilities per sample while 16 bit means 65,536 possibilities per sample.

You can get anything you want at Alice's restaurant ... 10 seconds (230K) ... taught to me by my wonderful guitar teacher (Bob Lasher) ... author of course is Arlo Guthrie who was recently quoted as saying "the internet is the last bastion of real freedom; it's the world's biggest democracy."

Note that this file is 230K for 10 seconds, because I recorded this file in mono at 22 kHz. If I had recorded it for "CD Quality" then each second would consume 176K and so 10 seconds would consume 1760K.
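The link between bit depth and file size can be sketched in Python as follows. This is a rough illustration only: it assumes uncompressed audio with no file header, it uses the exact CD rate of 44,100 samples per second, and the function names are made up for this example.

```python
def levels(bits):
    """How many different values one sample can take at this bit depth."""
    return 2 ** bits

def bytes_per_second(sample_rate, bits, channels):
    """Uncompressed size of one second of sound (file headers ignored)."""
    return sample_rate * (bits // 8) * channels

print(levels(8))    # 256
print(levels(16))   # 65536

print(bytes_per_second(44_100, 8, 1))    # 44100  (8 bit mono)
print(bytes_per_second(44_100, 16, 1))   # 88200  (16 bit mono)
print(bytes_per_second(44_100, 16, 2))   # 176400 (16 bit stereo, "CD Quality")
```

Doubling the bit depth doubles the file size, and so does going from mono to stereo.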
All of the numbers (the samples) need to be recorded somewhere when you digitize a sound. Most software programs digitize to RAM. Typically, the software might have 5, 10 or 20 meg of free RAM to use. Once this is filled, no more of the sound can be stored. Software like this limits you from recording long songs or long narrations, but this is the TYPICAL approach. More sophisticated software will be able to offer the option of "saving to disk" instead of only "recording to RAM."
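To see why recording to RAM limits the length of a recording, here is a hedged Python sketch that estimates how many seconds fit in a given amount of free memory. It assumes uncompressed samples and 1 meg = 1,000,000 bytes; the function name and the two quality settings are illustrative choices, not the behavior of any particular program.

```python
def seconds_that_fit(free_ram_meg, sample_rate, bits, channels):
    """Estimate how many seconds of uncompressed sound fit in free RAM."""
    bytes_per_second = sample_rate * (bits // 8) * channels
    return free_ram_meg * 1_000_000 / bytes_per_second

for meg in (5, 10, 20):
    cd_quality = seconds_that_fit(meg, 44_100, 16, 2)   # 16 bit stereo
    low_quality = seconds_that_fit(meg, 22_050, 8, 1)   # 8 bit mono
    print(f"{meg} meg: {cd_quality:.0f} s at CD quality, "
          f"{low_quality:.0f} s at 22 kHz 8 bit mono")
```

Even 20 meg of RAM holds less than two minutes of CD quality stereo, which is why "saving to disk" matters for long songs.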
On the Mac, the standard audio recording utility allows just 10 seconds of recording time. A program such as Sound Effects gives us the ability to record (and edit) 1 to 1.5 minutes of sound while Ultra Recorder has the feature allowing you to save to disk and therefore record a longer piece.
On the PC, the standard recording software found under Multimedia within Accessories allows about 1 minute of standard recording. A program such as Total Recorder gives us far better flexibility.
Understanding Digital Sound (One second of stereo = 176K)
With the language of sample rates and bytes, we are now able to understand how sound is stored digitally, e.g. on a CD or computer. A sampling rate of 44,100 means that the sound waves are "inspected" that many times each second to determine their qualities: note, tone and amplitude. There are 65,536 choices for each specific sampled sound. Since 65,536 = 256 x 256 and each byte has 256 patterns, this means that each sampled sound consumes 2 bytes of data. Thus 44,100 samples in one second would be 88,200 bytes or 88K. This now gets replicated for both the left and right channels (stereo), yielding 176K.
Since one second of stereo consumes 176K, one minute of data would be 60 x 176K or 10,560K, i.e. about 10 megabytes. We see that 65 min would therefore equal about 650 megabytes, filling up a typical music CD! Any CD, whether computer or music, holds a maximum of 650 megabytes. DVDs can get recorded on both sides, but CDs get recorded on just one side, the side WITHOUT the label.
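The same estimate can be run without rounding 10,560K down to 10 megabytes. This is a back-of-the-envelope sketch that takes the 176K per second and 650,000K figures from this tutorial as given; the function name is made up for illustration.

```python
def minutes_that_fit(capacity_k, k_per_second=176):
    """Minutes of stereo sound that fit in capacity_k, at k_per_second."""
    k_per_minute = 60 * k_per_second
    return capacity_k / k_per_minute

print(60 * 176)                             # 10560 (K per minute)
print(round(minutes_that_fit(650_000), 1))  # 61.6
```

With the unrounded figure the answer comes out a little over an hour, in the same ballpark as the 65 minutes obtained by rounding.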
We call the software programs "MP3 rippers" because they can "rip" large CD files into smaller files. There are also "MP3 players" that can play MP3 songs without being able to do the "ripping" task as well.
MP3 FILES ON THE WEB
The MP3 file format condenses a song while retaining close to CD quality. If you go to the Don't Think Y2K page of www.summercore.com, you will see an explanation of these details with samples that you can download!
We in the "Summercore" Teaching Company use a PC program called musicmaker to turn CD tracks into MP3 files. For example the Don't Think Y2K file on an audio CD is 33 meg. When we used musicmaker (available from www.musicmaker.com) to compress the song, the file was 3.1 meg, yet retained its CD quality. The mp3 files are easily moved from your hard drive to the Rio player via a parallel cable into the LPT1 port. To some extent MP3 is like the JPEG of the audio world!
The Rio RMP300 portable music player stores up to 32 meg of songs in RAM. According to the Rio manual by Diamond Multimedia Systems (www.diamondmm.com)
Generally, the MP3 file is about 1/10th the size. For example, my song DON'T THINK Y2K, IT'S ALL RIGHT is a 31 meg aif file on a CD (dont.aif) while we have posted dont.mp3 on www.summercore.com which is about 3 meg. The song itself is 3 min 11 seconds long. Of course the original tune is by Bob Dylan, but the new lyrics (copyrighted) are by me with Bob Lasher (Nobles '68) performing with voice, guitar and harmonica.
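The "about 1/10th" rule of thumb can be checked against the numbers quoted for this song. The Python sketch below is only an estimate: it assumes 16 bit stereo at 44,100 samples per second for the CD version and 1 meg = 1,000,000 bytes, and the function names are hypothetical.

```python
CD_BYTES_PER_SECOND = 44_100 * 2 * 2   # samples x bytes per sample x channels

def cd_size_meg(minutes, seconds):
    """Estimated uncompressed CD size of a song, in meg."""
    total_seconds = minutes * 60 + seconds
    return total_seconds * CD_BYTES_PER_SECOND / 1_000_000

def compression_ratio(original_meg, compressed_meg):
    """How many times smaller the compressed file is."""
    return original_meg / compressed_meg

print(round(cd_size_meg(3, 11), 1))        # 33.7 (estimate for a 3 min 11 sec song)
print(round(compression_ratio(31, 3), 1))  # 10.3 (31 meg aif vs. about 3 meg mp3)
```

The estimate of roughly 33 meg for a song of this length is close to the file sizes quoted above, and the ratio comes out near 10 to 1.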
MIDI files are files of notes, not audio recordings. They play a designated instrument on your computer. Accordingly, MIDI files of long songs will be under 100K, since all that is being stored is the pattern of notes.