There will be others writing comprehensive reviews that give a good idea of what ViaVoice has to offer. But there is nothing like hearing how a program fares in day to day use. As a world class, truly terrible typist, I felt that ViaVoice and I were made for each other. So… was this article written simply by leaning back and letting the thoughts verbally stream onto the page? You’ve gotta be kidding…
Circa 1994 / 3:30 PM
“Buster. Open File.” The teachers packing the small room for a demonstration of early voice recognition strain forward to watch. Nothing happens. “Buster! Open file!” Still nothing. The presenter adjusts his mic and tries again. “BUSTER! OPEN FILE!” One teacher whispers to another, “I’ve been up to my ears in little Busters all day. What I don’t need is a computer with an attitude.”
How exciting it was back then. Full-fledged voice recognition seemed the next giant step for computing and the Mac was leading the way. Then, sort of like the manned landings on the moon, we went off track. Eventually, the PC world got various speech recognition programs, but the Mac remained tied to the keyboard except for some limited verbal commands such as Apple’s “Speakable Items” and Dragon’s no longer supported “PowerSecretary.”
In the last few months, this has begun to change.
VoicePower Pro by GT Value Europe Ltd., a discrete speech product utilizing the Dragon recognition engine, was released with an American dictionary in late November. http://www.voicepowerpro.com
IBM’s ViaVoice Millennium Edition, with continuous speech recognition and a large active vocabulary, became available for the Mac in early December. http://www4.ibm.com/software/speech/mac
MacSpeech, a continuous speech product built around the Philips dictation engine, plans to take full advantage of the Macintosh hardware and operating system when it ships later this year. http://www.macspeech.com
Dragon Systems (whose NaturallySpeaking 4.0 Professional for Windows is considered by many to be the best of the current Windows-based products) is now planning a Mac program with continuous speech. It may require OS X. http://www.dragonsys.com/news/pr/051099.apple.html
(For a fuller discussion of these and other options for the Mac, be sure and check out Susan Fulton’s excellent article at “Computing Out Loud.” http://www.out-loud.com/mac.html)
ViaVoice: First Impressions and Getting Started
While eagerly awaiting ViaVoice, my main concern was that though it might do well in a predictable format with a limited vocabulary, I would find it impossible to write using a more creative language. But I was pleasantly surprised–amazed actually–at the vocabulary already built into the program (certainly richer than AppleWorks’ spell checker). In fact, the training process involves reading passages from “Treasure Island” as well as “A Ghost Story” by Mark Twain.
The training was lengthy, but fairly smooth. I read all six passages in three sittings, allowing ViaVoice to process each pair while I got my voice back. Though this could have been done in one (long) day, spreading the training session over several days not only gives your voice a chance to rest but also, hopefully, trains VV to recognize ‘you’ under slightly different circumstances.
I then imported some things I had written for My Mac into SimpleText to have VV analyze my working vocabulary. Surprisingly, it was aware of many unexpected words, but continually wanted help with common, but capitalized, words such as ‘As,’ ‘But,’ and ‘Did.’ Though it is not difficult to retrain such words, it is time consuming, and makes me wonder if all I am doing is confusing the computer further.
Next I spent several evenings reading old My Mac columns into ViaVoice and correcting after each paragraph. This is done, not in an application such as “Word” or “AppleWorks,” but in ViaVoice’s own “SpeakPad,” as VV must not only recognize your words, it evaluates them in terms of context. This makes for a very slow program, but it is quite often accurate in deciding whether you intended to write ‘to,’ ‘too,’ or ‘two.’ That’s the good news.
The bad news, at least for a person who uses lots of quotes, parentheses, and varied punctuation, is that ordinary text becomes a tongue twister, fraught with potential for mistakes. VV continues, even with constant retraining, to hear my “Quote” as two words so that, unless I ‘swallow’ my command, the word “to” follows every quotation mark. And constantly saying “open parentheses” and “close parentheses” makes “Sister Susie sells seashells by the seashore” seem a snap. By now I’ve surely got spittle spots all over the screen.
The inevitable repetition of long terms such as “Question mark” and “Exclamation point” makes dictation unnecessarily exhausting. Why can’t we make macros for these often used commands and then train them to our voice? If VV could learn shortcuts for these common terms and could learn that we use, for instance, a certain inflection for basic commands, it would go a long way toward making the program both more accurate and more user-friendly.
Which brings up another feature that seems particularly weak. No matter how thoroughly we train ViaVoice to recognize our own vocabularies, there seems to be no way to train the commands needed to run the program. And, at least in my experience, VV does a better job recognizing unusual vocabulary words than it does in coping with its own commands.
There is nothing more irritating than saying “Uppercase on” and having everything come out underlined… line after line after line. No amount of “Underline off”–not even “UNDERLINE OFF !@#$@”–was able to break the cycle. I find it essential to keep one hand on the mouse and the other on the keyboard.
Using an ‘attention word’ (such as “Buster”) before beginning the command, rather than simply pausing before speaking, is helpful in alerting the computer that a command is on the way. This helps to keep VV from writing out “italicize on” when all you want is some italics on the page. I was rather planning to use the word “Dolly” as it is both short and unlikely to come up in my regular writing. Appropriate too, as my current computer is a Umax S900 clone.
Unfortunately, VV has only one option. Computer. The word “Computer” is not only another three syllables to add to an already long command, those of us who write about computers could find it popping up in all sorts of unwanted places. Therefore, I gave up on using an attention word and take my chances with the bare commands. Dolly, I need you.
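For the curious, here is a minimal sketch in Python of how an attention word could separate commands from dictation. It is only an illustration of the idea, not how ViaVoice actually works: the function name `split_dictation`, the list of command phrases, and the two-word command length are all assumptions made for the example.

```python
def split_dictation(words, attention_word="computer", commands=None):
    """Separate spoken commands from dictated text.

    A phrase counts as a command only when it directly follows the
    attention word; everything else is typed out as ordinary text.
    """
    if commands is None:
        # Hypothetical two-word commands, borrowed from the examples above.
        commands = {("uppercase", "on"), ("underline", "off"), ("italicize", "on")}
    text, found = [], []
    i = 0
    while i < len(words):
        if words[i].lower() == attention_word:
            # Look at the two words that follow the attention word.
            phrase = tuple(w.lower() for w in words[i + 1:i + 3])
            if phrase in commands:
                found.append(" ".join(phrase))
                i += 3  # skip the attention word and the command itself
                continue
        text.append(words[i])
        i += 1
    return " ".join(text), found


# With a custom attention word, the command is pulled out of the text.
print(split_dictation("Dolly italicize on hello".split(), attention_word="dolly"))
# With "computer" as the only choice, the word still passes through as
# plain text when no known command follows it.
print(split_dictation("turn the computer on".split()))
```

In this sketch the word “computer” is only swallowed when a known command follows it, which is one way a program might keep the word available to those of us who write about computers.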
Working with ViaVoice: “Golden Blocks and the Three Beers”
I had assumed that after doing extensive pre-training, ViaVoice, with its excellent vocabulary, would do an equally excellent job of understanding what I said. But this was not to be. Although surprisingly good with many unlikely words, VV finds simple words and phrases a challenge. Mind you, it is being used (with varying success) by everyone from doctors to journalists to school kids. And with voices as varied as a Minnesota clip to a Texas drawl. It even has to cope with speakers from Canada to Australia.
But, although I consider my accent as pretty mainstream USA and tend to speak reasonably clearly–though admittedly too fast–ViaVoice and I are often far apart. We regularly argue over things like ‘while’ vs. ‘what.’ For example, I admit “Goldie Locks” may be asking a little much in the way of recognition.
But when I replay my sentence “Mama Bear said, let’s go for a walk while our breakfast is cooling.” I definitely hear the words “while our.” VV, equally definite, gives me five choices, all containing “what are”. Okay, I think, let’s try again. No retraining, just speak a little more clearly. This time I get “Small beer said, let’s go for a walk walleye breakfast is cooling.” Like I say, a great, if unexpected, vocabulary.
By the way, even 95% accuracy requires a lot of correction, and don’t expect your spell checker to catch your mistakes. Any words, intended or not, are going to be real words, correctly spelled. That is both the advantage and the disadvantage of a dictation program.
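To put that 95% in perspective, here is a rough back-of-the-envelope sketch in Python. The word counts are assumptions chosen for illustration (roughly 500 words to a page), not figures from IBM, and the calculation simply treats every word as equally likely to be misheard.

```python
def expected_errors(word_count, accuracy):
    """Expected number of misrecognized words, assuming each word
    is recognized correctly with the same probability `accuracy`."""
    return round(word_count * (1.0 - accuracy))


# Assumed lengths: a 500-word page and a 1500-word (three page) article.
for words in (500, 1500):
    print(words, "words at 95% accuracy:", expected_errors(words, 0.95), "corrections")
```

By this estimate a single page needs about 25 fixes and a three page article about 75, each one a correctly spelled word that a spell checker will pass right over.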
When it comes to dictating, ViaVoice is painfully slow. I am using a Umax S900 with 96MB RAM (and Virtual Memory +10 as instructed in the manual), upgraded with a Newer 300/200 G3 card and OS 8.6. Apple Profiler thinks I have a Power Macintosh 8500. But even those with real G4 Macs and large amounts of RAM find speed a problem. I certainly understand that VV has a difficult job in translating our variable speech into context-sensitive English. And I am continually amazed at what it can do.
But trying to dictate ‘off the cuff’ to something several sentences behind your spoken word is like trying to rub your stomach and pat your head. Very confusing. It works best to have a written copy to read from (which sort of defeats the purpose) or to close your eyes and just start talking. No peeking until you are ready to correct and transfer into your main application.
But don’t wait too long! Another disadvantage is that ViaVoice is not notably stable. Crashes happen. Even after doubling the memory to the IBM SpeakPad and to the background engine, random crashes can wipe out everything. This is partly because of the way SpeakPad saves your files.
If you save with the default “save dictation session information” on, your files are theoretically able to be reopened with your voice still intact. But the default can lead to huge files (a three page article was over 16 MB!) and it’s like playing Russian Roulette to expect that the smaller (2 to 6 MB) files will reopen. In my experience, they usually don’t. Therefore, I save without dictation. The file reopens, but there is no longer any way to determine just what I had in mind when ViaVoice typed out “the potlatch were yes and win it.” So be sure to correct before you close.
A “MacVoice” in the wilderness:
If you want to try ViaVoice for yourself, but are a little nervous about going it alone, there is a new list just for you. Eric Prentice, who hosts a variety of Mac lists, has recently begun one for ViaVoice. Not only does it have a devoted group of members, some of the people working in the field also take part. If you are interested in what others are learning about voice recognition, researching problems and solutions, or would simply like to have your own views heard, check out MacVoice at http://www.themacintoshguy.com/lists.
MacVoice || A place for discussion of everything having to do with speech recognition on the Macintosh. Send a message to MacVoice-ON@themacintoshguy.com to join the list.
There are a number of doctors on the MacVoice List. It is heartwarming to think of all those doctors’ offices warmed by Macs in all shapes and colors, though I can’t help but wonder what their VV reports actually say.
Presumably, once the medical terms have been added to the vocabulary, the somewhat predictable format used by many specialists lends itself fairly well to ViaVoice. A few of the doctors have reported good success, others are not so pleased. Still others seem to be anxiously awaiting microphone support for their iBooks.
Even so, I am a little leery of using ViaVoice to keep those medical records up to date. I decided to check it out. “ViaVoice,” I said, “write ‘cancer of the right kidney.'” “Sure thing,” said VV, and typed out ‘vampire of the fright kitchen.’ It makes me wonder. If your doctor were to use ViaVoice, could you go in for surgery… and come out with a stake through the heart?
So, what’s the verdict? “Potboiler will win”
Fun. Frustrating. Exciting. Exasperating. If ViaVoice isn’t quite ready for prime time, neither am I. In fact, much as I looked forward to a stable, easy to use continuous speech program, I can see that I will have to change the way I write–and think–before I can make the switch from keyboard to voice. On the other hand, our son, who truly hates to type, has begun sending us lengthy, well written emails to help my husband set up an iMac video studio at home. And he did it within a week of receiving ViaVoice. It all depends on your needs, your motivation, and the sort of writing you want to do.
However, unless you are a person accustomed to thinking out loud, perhaps by dictating your thoughts orally, you may find it hard to create ‘on the fly,’ as it were. Out there in PC land, where more complete dictation programs are already available, I haven’t seen the smoke rising as liberated souls burn their keyboards.
But what has all this got to do with potboilers? Well, one of ViaVoice’s little idiosyncrasies is that before I say a thing, mysterious words sometimes spill onto the screen. Maybe it’s all that heavy breathing, but when my computer suddenly starts giving me messages from beyond, I listen. “Potboiler will win,” it whispers. Now, ‘Potboiler’ wasn’t in the training sessions. And it’s not a word I’m likely to sub-vocalize. I think ViaVoice knows something. VV and I may just have to head for Las Vegas or take up horse racing.