
Exploring the Impacts of
Pervasive Computing

The Recorder Quest

Volume 4
Number 6
Oct. 18, 2000


by Scott Tilley

Maybe it's just that Halloween is fast approaching, but lately what should be relatively simple tasks seem to morph into endless adventures with the dark side of computing. In this instance, it was the seemingly innocuous desire to record lectures so that they could be automatically transcribed to text for later use. This led to a two-week quest to find the right recorder, analog or digital, and the right transcription software. It turns out that just getting the right recorder is far more arduous than it should be. Victrola anyone?

Like many professors, I like to record my lectures. When I'm feeling magnanimous, I put the recording online for students to listen to at their leisure. Less altruistically, I hope that the recording captures a performance of sufficiently high quality that, when combined with the slides I use, it will contribute to a book on whatever course I happen to be teaching. There are two important steps in converting lecture audio to book text. The first step is to record the lecture itself. The second step is to convert the audio from the recorded lecture into text.

The first step led to my quest for a voice recorder of sufficiently high quality that the audio can later be transcribed to text. The transcription can be done manually, although this is a time-consuming and error-prone process. Ideally, the transcription can be done automatically, using one of the popular voice recognition software packages available. However, before the transcription can take place, the recording must be made.

I spent nearly two weeks on this seemingly simple task, researching which recorder to choose, seeing what software comes with the recorder, and reading reviews from users and the trade press.  The first choice to make is analog versus digital. Analog recorders are a lot less expensive than digital recorders, but they are, well, analog. They use little tapes and don't come with any fancy software that facilitates the downloading of the recorded material and its subsequent transcription. Digital recorders are more expensive than their analog siblings, and are definitely the way of the future. But in my opinion, they are not yet ready for use in recording long lectures. 

The Sony ICD-R100PC

The first digital recorder I purchased was the Sony ICD-R100PC. This device (shown at right) cost $199 a few weeks ago, but magically the price had dropped to $149 when I actually went to make the purchase. Like most Sony audio products, the quality of the recording is excellent. The recorder is very light, weighing just 3 ounces. It is powered by two AAA batteries, which provide about 8 hours of recording time. It has the necessary external microphone jack and an earphone jack for recording and playback.

To download the audio to a computer, a small cable (provided) connects the recorder to a parallel port. Sony's choice of the parallel port, rather than the more modern USB port, is unfortunate. Equally unfortunate is that the software that is included for managing the download process runs only on Windows 95/98/Me, not Windows 2000. Moreover, the recordings are stored in a proprietary compressed format rather than a more common choice, such as MP3, RealAudio, or Windows Media. The included software is used to transfer the recorded files from the proprietary format to WAV format. 

My main problem with the ICD-R100PC was recording time, or rather the lack of it. The recorder has built-in memory that can record 64 minutes in SP mode. This is at an 11 kHz sampling rate, which is good enough to produce a high-quality voice recording that might then be processed by a speech recognition system. The problem is that my lectures are 90 minutes long (or longer, with double lectures lasting 3 hours with a break halfway through), so 64 minutes is just not long enough. When the recorder is placed in LP mode, the recording time stretches to 150 minutes, but the audio quality is reduced significantly. The sampling rate drops to 7 kHz, which is barely acceptable to the human ear but nowhere near good enough for a speech recognition program to process. Since the unit's 16MB of memory is not expandable, I had to return it and try something else.
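To put numbers on the shortfall, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above (16MB of memory, 64 minutes in SP mode, 150 minutes in LP mode) and assumes that storage is consumed at a constant rate while recording, which the proprietary format may or may not do. The function names are mine, purely for illustration.

```python
# Figures taken from the article: 16 MB of fixed memory,
# 64 minutes in SP mode, 150 minutes in LP mode.
MEMORY_MB = 16
SP_MINUTES = 64
LP_MINUTES = 150


def mb_per_minute(memory_mb: float, minutes: float) -> float:
    """Average storage consumed by one minute of recording.

    Assumes a constant recording rate (an assumption, not a Sony spec).
    """
    return memory_mb / minutes


def memory_needed_mb(lecture_minutes: float, rate_mb_per_min: float) -> float:
    """Storage a lecture of the given length would need at that rate."""
    return lecture_minutes * rate_mb_per_min


sp_rate = mb_per_minute(MEMORY_MB, SP_MINUTES)  # 0.25 MB per minute
lp_rate = mb_per_minute(MEMORY_MB, LP_MINUTES)  # roughly 0.107 MB per minute

# A single lecture (90 minutes) and a double lecture (180 minutes).
for minutes in (90, 180):
    needed = memory_needed_mb(minutes, sp_rate)
    print(f"{minutes}-minute lecture in SP mode: {needed:.1f} MB "
          f"(recorder has {MEMORY_MB} MB)")
```

Under that assumption, a single 90-minute lecture in SP mode would need about 22.5MB and a double lecture about 45MB, both well beyond the fixed 16MB.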

The Sony ICD-MS1

My next choice was another Sony product, the very cutting-edge ICD-MS1 (shown at right). This device is unique in that it uses removable memory sticks to store its digital recordings. This means that one is not limited to the 64 minutes of high-quality audio in SP mode, as I was with the ICD-R100PC. This is a potentially great idea, but a costly one: the unit costs $299. It does ship with a 16MB memory stick, but as this is only enough for 64 minutes (as discussed above), I had to buy another one at an additional cost of $159 for a 64MB stick. As an aside, the ICD-MS1 can use regular memory sticks, or the newer (and more expensive) "Magic Gate" memory sticks.

The quality of the recording was again excellent. The recorder remains very light, weighing just 3.1 ounces. It is powered by two AAA batteries, which provide about 4 hours of recording time. It also has the necessary external microphone jack and earphone jack for recording and playback. In LP mode the ICD-MS1 will record 131 minutes on the 16MB stick, less than the ICD-R100PC's 150 minutes because the ICD-MS1 samples at 8 kHz in LP mode. This produces marginally better long-play recordings, but still not good enough for processing by a speech recognition program.

Because of the removable storage option, my main problem with the ICD-MS1 wasn't recording time. Rather, it was with the removable storage itself. Memory sticks are proprietary Sony technology, which means that the only computers that can read a memory stick directly are Sony PCs. There are adapters that one can buy to add memory stick support to any PC, including a floppy disk device, a PC Card device, and a USB device, but all add to the cost of the overall package. For example, the USB adapter costs about $60. Adding the cost of the recorder and the extra memory raises the total cost of this digital recording solution to about $575.

Like the ICD-R100PC, the ICD-MS1 stores recordings in a proprietary compressed format (although one with a different file extension, so I don't know if the difference is cosmetic or real). The included software is used to convert the recorded files on the memory stick from the proprietary format to WAV format. My notebook computer is a Sony Vaio, so it does have a memory stick slot. "Great!" I thought. "I'll just pop the memory stick into the Vaio and download the recording at RAM speeds."

Unfortunately, I never got to try the conversion software because I couldn't download the recording from the memory stick at all. The memory stick slot requires a special driver, available only from Sony, which currently only supports Windows 98/Me. I use Windows 2000, so again I was out of luck. There was no way for me to transfer the audio from the digital voice recorder to my computer for further processing. Since I had already considered all other digital voice recorders that I knew about, my only choice was to return the ICD-MS1 and opt for a low-tech analog solution.

The Sony M-637V

My final choice was the Sony M-637V, an old-fashioned but high-quality analog recorder (shown at right). This unit costs just $59 and includes a clip-on microphone. The recording quality is actually comparable to the two digital recorders discussed above. It is powered by two AA batteries and boasts an impressive 24 hours of continuous use on one set of batteries. At 6 3/8 ounces, the recorder is larger and heavier than the digital models, but still small enough to sit comfortably in a shirt pocket.

The recording time varies depending on the length of the micro-cassette tapes used. The longest I've found are 90-minute tapes that record 45 minutes per side in SP mode, or 90 minutes per side in LP mode. The sound quality is of course better in SP mode. The big difference between this form of removable media and memory sticks is price: a package of six 90-minute tapes costs under $10. Such a huge price discrepancy, coupled with the difference in cost of the recorder itself, has to be taken very seriously.
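For a rough sense of scale, the media prices quoted in this article can be turned into a cost per hour of SP-quality recording capacity. The sketch below is only an estimate: it treats "under $10" as exactly $10, and it assumes that a 64MB memory stick holds four times what the 16MB stick does (64 minutes), which I never measured. Memory sticks can of course be erased and reused, so this compares the price of capacity, not the price of each lecture.

```python
# Prices and capacities quoted in the article.
TAPE_PACK_PRICE = 10.00     # upper bound: "under $10" for six tapes
TAPES_PER_PACK = 6
TAPE_SP_MINUTES = 90        # 45 minutes per side, two sides

STICK_PRICE = 159.00        # 64 MB memory stick
STICK_MB = 64
SP_MB_PER_MINUTE = 16 / 64  # a 16 MB stick holds 64 minutes in SP mode


def cost_per_hour(price: float, minutes: float) -> float:
    """Price of one hour of recording capacity on a given medium."""
    return price / (minutes / 60)


tape_cost = cost_per_hour(TAPE_PACK_PRICE / TAPES_PER_PACK, TAPE_SP_MINUTES)

# Assumption: capacity scales linearly with the size of the stick.
stick_minutes = STICK_MB / SP_MB_PER_MINUTE
stick_cost = cost_per_hour(STICK_PRICE, stick_minutes)

print(f"Micro-cassette: about ${tape_cost:.2f} per hour of capacity")
print(f"Memory stick:   about ${stick_cost:.2f} per hour of capacity")
```

With these inputs the tape works out to a little over a dollar per hour of capacity, against roughly $37 per hour for the memory stick.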

The main drawback with an analog device is downloading the audio from the recorder to the computer. This is done in real-time, which means 90 minutes of silent tape replay, with the earphone jack output connected to the microphone jack input on my PC. I use Sound Forge XP to capture the recordings and save the lectures in both WAV and RealAudio format. So far, this solution has worked fine. However, I've not yet had a chance to run the voice recognition software on the saved files. That's next.
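Before handing a captured file to a speech recognition package, it is worth confirming that the WAV file really was saved at a usable sampling rate. The following is a minimal sketch using Python's standard `wave` module. It is not part of my actual Sound Forge workflow, it handles only uncompressed PCM WAV files, and the 11,000 Hz threshold is simply my assumption based on the 11 kHz SP rate discussed earlier.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return basic properties of a PCM WAV file using the standard library."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_minutes": frames / rate / 60,
        }


def meets_minimum_rate(info: dict, minimum_hz: int = 11_000) -> bool:
    """Check the sampling rate against an assumed minimum.

    The default threshold is an assumption drawn from the article's
    11 kHz SP figure, not a requirement of any particular recognizer.
    """
    return info["sample_rate_hz"] >= minimum_hz
```

Calling `describe_wav("lecture.wav")` (a hypothetical file name) on a capture should report a duration close to the length of the tape side that was played back. A much shorter duration suggests the capture was cut off.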


Copyright © S.R. Tilley & Associates
