The idea of speaking into my computer and having it correctly type what I say has intrigued me since I saw the Star Trek episode Assignment: Earth, in which Gary Seven dictates to his IBM Selectric typewriter while plotting to sabotage a NASA launch.
The thought that I can now actually say – and have my computer type – the phrase, “the museum is open Monday to Friday from 9 am to 6 pm, Saturday from 9 am to 3 pm, Sunday from noon to 4 pm, closed major holidays,” makes me positively giddy – covering Disney World doesn’t look so daunting anymore.
It was with this light thought that I cheerfully set about installing IBM’s new SimplySpeaking Gold (remember: IBM made the Selectric! No one gets fired for buying IBM!), touted by Big Blue as the software that would change the world. My father was with me, and as I was describing what the software would do (‘yeah, that’s it… I can just talk into it and it will type what I say,’) he was shooting me looks of open dubiousness, if not mild derision.
“You’re skeptical,” I said.
“I’m not skeptical,” he said, “I know it won’t work.”
“Why,” I asked, knowingly, “would IBM offer a 30 day money back guarantee on it if it didn’t work?”
“I don’t know,” said my father, “but it won’t work.”
Chuckling to myself (what does he know?) I set to installing SimplySpeaking Gold. Following the directions to the letter, I donned the little headset that came with the software. The training session lasted about half an hour, after which I started talking and it started typing.
Unfortunately, those two actions were entirely independent. It was as if I had installed Tourette’s Syndrome for Windows 95. I said, “Hey, look Dad, I’m talking and this thing is typing,” and it typed, “pay stark land vice talking in myths saying it is typing.” (“Typing,” I noticed later, was one word it consistently spelled correctly, along with “SimplySpeaking Gold.”) I said, “This system sucks.” It typed, “cheese feet and ducks.” Okay, it wasn’t really that bad – I am exaggerating a little (just a little) – but it was, in fact, terrible.
I returned it the following day. Later I spoke with a software salesman, who told me that almost everyone who bought the IBM software at his shop (one of New York’s largest) brought it back.
“That’s not to say it’s bad,” he was careful to say, “it’s just that a lot of people bring it back.”
This salesman went on to tell me that a lot of the people who were disappointed with IBM really liked Dragon NaturallySpeaking, but that that software was much more difficult to learn than IBM’s. Since I thought that learning IBM’s was simply a matter of training myself to speak in the manner of one of those VCR manuals that has been translated from the original Korean via Swahili, I was game for anything.
To be fair, IBM’s ViaVoice is said (well, said by IBM) to be better than SimplySpeaking. But in an article in the San Francisco Chronicle, David Einstein reported something hauntingly similar to my experience:
“…when I said, ‘This is my first dictation’ ViaVoice wrote ‘This is mild irritation.’ I repeated the sentence and it came out, ‘This is missus sophistication.’”
Why, that is much better!
My next test was with Dragon’s NaturallySpeaking. With doubt in my heart, I installed the software and went through its training session. One thing that struck me immediately was that while I was reading through the training session’s text (it gives you a choice of three; I chose Dave Barry’s Adventures in Cyberspace) it was recognizing my voice right out of the box.
But I was truly astounded when, after finishing the session, I was able to write a long letter with very few mistakes: this thing actually works! Don’t believe it? Come over to my house and I’ll show you (two of my neighbours are going out to buy it after one demo).
For example, I’m writing the following five paragraphs by speaking into my computer. It’s an absolutely joyous thing: I’m sitting here with my feet on my desk speaking absolutely normally and watching it type everything I say.
And okay, there are some drawbacks (like the fact that it just wrote “arson” instead of “all are some”, and I had to go back and correct): I sit at my desk wearing this funky headset and looking for all the world like a Time-Life operator ready to take your phone call (“Good morning, my name is Nick, are you calling about our Sports Illustrated swimsuit issue?”).
But the fact is, I can dictate into this thing at about 100 words per minute after three days of use – and the folks at Dragon say that this will only improve over time.
I have noticed that in the last few days of using this software intensely it has made the same mistakes on a couple of occasions. But it also learns incredibly quickly. I only had to train “Minas Gerais” and “Sao Paulo” once, and never even had to tell it to recognize Rio de Janeiro. Handy, when I’m working on Brazil (it also recognized, after training, “rodoviária” and “real”, which are pronounced decidedly not as they’re written).
But you’ve got to have patience (it just wrote “patients”), and realize that it will take about a solid week before you begin to get close to 96% recognition.
The mistakes NaturallySpeaking made while I recited the last five paragraphs were “good morning, my name is neck”; “with my field on my desk”; and the aforementioned “arson” and “patients”. Still, that’s not so bad. Earlier OCR scanning devices made far more mistakes, and for most of the friends of mine who can’t type to save their lives, a couple of mistakes in each paragraph is a far happier situation than a blank page.
But NaturallySpeaking – or its presence – did cause some problems on my machine. After running it and other programs simultaneously, my computer crashed – but it turned out to be a Microsoft problem, and I had to download a small patch to fix it. You’ll also need a relatively good machine: while Dragon says you need at least a 133 MHz Pentium, 32MB of RAM and 65MB of hard drive space, I’d say that’s conservative.
Another good question is whether you can dictate into a tape recorder on the road – some smarter authors (and now I) use a tape recorder for mapping (“J&R Music World on the south side of Park Row 200 metres south of John St”) and it would be a hoot to have the machine transcribe it. Well, short of spending upwards of $250 on a MiniDisc recorder, you’re out of luck: traditional minicassette and other analog recorders just don’t have the quality to work with NaturallySpeaking.
NaturallySpeaking has several models to choose from, but the recognition engine is the same on all – bells and whistles change as you spend more money. But their basic Point & Speak model (US$59 RRP in the US) allows you to do everything I did here. The Personal and Preferred editions (US$99 and US$149 to US$159) have greater customization abilities, and very expensive Deluxe editions are available as well. SimplySpeaking Gold sells for US$139 in the US.