by Scott Tilley
The first step led to my quest for a voice recorder of sufficiently high quality that the audio could later be transcribed to text. The transcription can be done manually, although this is a time-consuming and error-prone process. Ideally, it would be done automatically, using one of the popular voice recognition software packages available. However, before the transcription can take place, the recording must be made. I spent nearly two weeks on this seemingly simple task, researching which recorder to choose, seeing what software comes with each recorder, and reading reviews from users and the trade press.

The first choice to make is analog versus digital. Analog recorders are a lot less expensive than digital recorders, but they are, well, analog. They use little tapes and don't come with any fancy software that facilitates downloading the recorded material and transcribing it. Digital recorders are more expensive than their analog siblings, and are definitely the way of the future. But in my opinion, they are not yet ready for use in recording long lectures.

The Sony ICD-R100PC
To download the audio to a computer, a small cable (provided) connects the recorder to a parallel port. Sony's choice of the parallel port, rather than the more modern USB port, is unfortunate. Equally unfortunate is that the included software for managing the download process runs only on Windows 95/98/Me, not Windows 2000. Moreover, the recordings are stored in a proprietary compressed format rather than a more common one, such as MP3, RealAudio, or Windows Media. The included software is used to convert the recorded files from the proprietary format to WAV format.

My main problem with the ICD-R100PC was recording time, or rather the lack of it. The recorder has built-in memory that can record 64 minutes in SP mode. This is at an 11 kHz sampling rate, which is good enough to produce a high-quality voice recording that might then be processed by a speech recognition system. The problem is that my lectures are 90 minutes long (or longer, with double lectures lasting 3 hours with a break halfway through), so 64 minutes is just not long enough. When the recorder is placed in LP mode, the recording time stretches to 150 minutes, but the audio quality is reduced significantly. The sampling rate drops to 7 kHz, which is barely acceptable to the human ear but nowhere near good enough for a speech recognition program to process. Since the unit's 16 MB of memory is not expandable, I had to return it and try something else.

The Sony ICD-MS1
The quality of the recording was again excellent. The recorder remains very light, weighing just 3.1 ounces. It is powered by two AAA batteries and provides about 4 hours of recording time. It also has the necessary external microphone jack and earphone jack for recording and playback. In LP mode the ICD-MS1 will record 131 minutes, less than the ICD-R100PC's 150 minutes because the ICD-MS1 samples at 8 kHz in LP mode rather than 7 kHz. This produces marginally better long-play recordings, but still not good enough for processing by a speech recognition program.

Because of the removable storage option, my main problem with the ICD-MS1 wasn't recording time. Rather, it was with the removable storage itself. Memory sticks are proprietary Sony technology, which means that the only computer that can read the contents of a memory stick is a Sony PC. There are adapters that one can buy for adding memory stick processing to any PC, including a floppy disk device, a PC Card device, and a USB device, but all add to the cost of the overall package. For example, the USB adapter costs about $60. Adding the cost of the recorder and the extra memory raises the total cost of this digital recording solution to about $575.

Like the ICD-R100PC, the ICD-MS1 stores recordings in a proprietary compressed format (although one with a different file extension, so I don't know if the difference is cosmetic or real). The included software is used to convert the recorded files on the memory stick from the proprietary format to WAV format. My notebook computer is a Sony Vaio, so it does have a memory stick slot. "Great!" I thought. I'll just pop the memory stick into the Vaio and download the recording at RAM speeds. Unfortunately, I never got to try the conversion software because I couldn't download the recording from the memory stick at all. The memory stick slot requires a special driver, available only from Sony, which currently supports only Windows 98/Me. I use Windows 2000, so again I was out of luck.
There was no way for me to transfer the audio from the digital voice recorder to my computer for further processing. Since I had already considered all other digital voice recorders that I knew about, my only choice was to return the ICD-MS1 and opt for a low-tech analog solution.

The Sony M-637V
The recording time varies depending on the length of the micro-cassette tapes used. The longest I've found are 90-minute tapes that record 45 minutes per side in SP mode, or 90 minutes per side in LP mode. The sound quality is of course better in SP mode. The big difference between this form of removable media and memory sticks is price: a package of six 90-minute tapes costs under $10. Such a huge price discrepancy, coupled with the difference in cost of the recorder itself, has to be taken very seriously.

The main drawback with an analog device is downloading the audio from the recorder to the computer. This is done in real time, which means 90 minutes of silent tape replay, with the earphone jack output connected to the microphone jack input on my PC. I use Sound Forge XP to capture the recordings and save the lectures in both WAV and RealAudio format. So far, this solution has worked fine. However, I've not yet had a chance to run the voice recognition software on the saved files. That's next.
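
To put the recording-time figures quoted above side by side, here is a rough Python sketch that works out how many "loads" (one fill of built-in memory, one memory stick, or one tape side) each option needs for a 90-minute or 3-hour lecture. The capacities are the ones given in this article. The WAV size estimate assumes uncompressed 11.025 kHz, 16-bit, mono audio, which is my assumption for illustration and not a capture setting taken from the article. The function and variable names are likewise made up for this example.

```python
import math

# Recording capacity in minutes per "load" (one fill of built-in memory,
# one memory stick, or one tape side), as quoted in the article.
# The ICD-MS1 SP figure is not given in the text, so it is left out.
CAPACITY_MIN = {
    ("ICD-R100PC", "SP"): 64,
    ("ICD-R100PC", "LP"): 150,
    ("ICD-MS1", "LP"): 131,
    ("M-637V tape side", "SP"): 45,
    ("M-637V tape side", "LP"): 90,
}


def loads_needed(lecture_min, capacity_min):
    """Number of memory loads or tape sides needed to cover a lecture."""
    return math.ceil(lecture_min / capacity_min)


def wav_size_mb(minutes, sample_rate_hz=11025, bytes_per_sample=2, channels=1):
    """Size of uncompressed PCM audio in megabytes (10**6 bytes).

    The defaults (11.025 kHz, 16-bit, mono) are assumptions for this
    sketch, not settings taken from the article.
    """
    return minutes * 60 * sample_rate_hz * bytes_per_sample * channels / 1e6


if __name__ == "__main__":
    for lecture_min in (90, 180):
        print(f"{lecture_min}-minute lecture:")
        for (device, mode), capacity in CAPACITY_MIN.items():
            n = loads_needed(lecture_min, capacity)
            print(f"  {device} ({mode}): {n} load(s) of {capacity} min")
        print(f"  WAV capture: about {wav_size_mb(lecture_min):.0f} MB")
```

For a 90-minute lecture, SP mode on micro-cassette comes out at two tape sides, that is, one tape flipped halfway through. The ICD-R100PC in SP mode would need two fills of its fixed memory, which is not possible in a single sitting. Under the stated assumptions, the captured WAV file for one such lecture would be roughly 119 MB.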