Speech!

Speech is, by far, one of the neatest features of the Mac. It’s also one of the least used. It seems that everybody turns off speech recognition for one reason or another: some people say it’s too slow, while others say it uses too much memory. That’s a shame, because speech is not only useful but also a lot of fun.

There are two kinds of speech on a Mac: Text to Speech (TTS) and Speech Recognition. Together, they form PlainTalk. Text to Speech is the easier of the two to use. If you open any speech-savvy application, such as SimpleText, you can type some text and have it read back to you in a variety of voices, depending upon your system. There is also a Mexican Spanish TTS system available on the Internet, which lets your computer read Spanish back with a Mexican Spanish accent. In PlainTalk 1.5, TTS can also read aloud the alert boxes that pop up.

While having your Mac talk is fun, it is limited in its use. Speech recognition, the second part of PlainTalk, is where the real fun begins. With it turned on, you can open files, empty the trash, find out the time and date, close windows, zoom windows and even change the view of windows. Those are just a few of the things the Finder will do. Many speech-savvy applications can be controlled by voice as well.

Such is the case with Netscape Navigator and Internet Explorer. There is an application called SurfTalk that works in the background, adding speech recognition to your Web browser. The spoken commands it adds include opening bookmark pages, adding bookmarks, opening hyperlinks, going back and forward, and stopping and reloading pages. The only thing you can’t do is speak an address; you still have to type that.

Now you can see that PlainTalk can be useful. The memory requirements are about one megabyte, varying somewhat according to your system setup. The speed depends on two things: how much noise is in the room and how fast your computer is. If there’s a lot of background noise, recognition is less likely to work. As for the computer, the faster it is, the faster speech will run. My 6100 with 16 megs of RAM has never run into memory problems, and speech recognition isn’t that slow on it. I would suggest trying it on your computer to see what it’s like.

The more technical of you may be wondering how PlainTalk works. It’s a fairly straightforward process that takes in the voice, transforms it into usable data, and matches it with a preset word or phrase.

To take in the voice, the Mac uses the microphone. All you have to do is talk and the Mac will listen. It first converts the incoming sound into digital values, and then processes those values further. This step, known as signal processing, turns the digital values into a sequence of patterns, much like the patterns the human ear picks out of sound.
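To give a rough feel for what "turning digital values into patterns" means, here is a small Python sketch. It is only an illustration, not how PlainTalk does it: real recognizers extract far richer features, and the function name, the frame size and the choice of "energy" as the pattern are all my own assumptions.

```python
def frame_energies(samples, frame_size=4):
    """Group digital sample values into fixed-size frames and return
    the mean energy of each frame (a very crude 'pattern')."""
    energies = []
    # Step through the samples one whole frame at a time; a trailing
    # partial frame is dropped.
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(sum(s * s for s in frame) / frame_size)
    return energies


if __name__ == "__main__":
    # Made-up sample values: silence, then a louder stretch.
    print(frame_energies([0, 0, 0, 0, 1, 1, 1, 1, 2, 2]))
```

The point is simply that a long stream of raw numbers gets boiled down to a much shorter sequence that is easier to compare against known words.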

The next step, recognition search, takes the patterns and compares them to preset words and sounds. The application supplies these so the recognizer knows what is allowed or expected; when a match is found, it triggers a command to do something.
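The idea of matching what was heard against a preset list can be sketched in a few lines of Python. This is a toy under stated assumptions: it compares text strings rather than sound patterns, it uses the standard difflib module rather than anything Apple uses, and the command table and action names are made up for the example.

```python
import difflib

# Hypothetical command table: preset phrase -> action name.
# This is not PlainTalk's real vocabulary.
COMMANDS = {
    "empty the trash": "empty_trash",
    "what time is it": "tell_time",
    "close window": "close_window",
    "zoom window": "zoom_window",
}


def recognize(heard, cutoff=0.6):
    """Return the action for the preset phrase closest to `heard`,
    or None if nothing is similar enough."""
    matches = difflib.get_close_matches(
        heard.lower(), list(COMMANDS), n=1, cutoff=cutoff
    )
    return COMMANDS[matches[0]] if matches else None


if __name__ == "__main__":
    print(recognize("emty the trash"))  # slightly garbled input
    print(recognize("xyzzy"))           # not in the preset list
```

Because the recognizer only has to pick from a short list of expected phrases, a slightly garbled input can still land on the right command, while something completely unexpected is rejected.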

The good part about Apple’s recognition is that the software needs no training period before it works well. Unless you have a strong accent, it will work very well on the first try.

To make the computer talk requires a little more work. The first step is known as text processing. This turns the written text into a form the computer can read aloud, which includes working out how to say abbreviations.
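As a small illustration of text processing, the Python sketch below expands a few abbreviations into the words that would be spoken. The abbreviation table and the function name are my own inventions for the example; a real TTS system handles numbers, dates and much more, and I am not claiming this is how PlainTalk does it.

```python
import re

# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {
    "Dr.": "Doctor",
    "St.": "Street",
    "megs": "megabytes",
    "TTS": "text to speech",
}

# Match any listed abbreviation, but only when it stands alone as a word.
# Longer entries are tried first so they are not shadowed by shorter ones.
_PATTERN = re.compile(
    r"(?<!\w)("
    + "|".join(re.escape(k) for k in sorted(ABBREVIATIONS, key=len, reverse=True))
    + r")(?!\w)"
)


def expand_abbreviations(text):
    """Replace each known abbreviation with its spoken form."""
    return _PATTERN.sub(lambda m: ABBREVIATIONS[m.group(1)], text)


if __name__ == "__main__":
    print(expand_abbreviations("Dr. Lee has 16 megs of RAM"))
```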

The next step is to create the rhythm for the text. This is known as prosodic processing, and it is what makes the computer sound natural instead of just reciting words. For example, where the text has a comma, the computer has to pause; this step is how it knows when to pause.
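Here is a minimal Python sketch of the pausing idea: it splits text at punctuation and attaches a pause length to each phrase. The pause lengths are numbers I picked for illustration, not values from PlainTalk, and real prosody also covers pitch and emphasis, which this ignores.

```python
import re

# Assumed pause lengths in milliseconds, chosen only for illustration.
PAUSE_MS = {",": 200, ";": 300, ":": 300, ".": 500, "?": 500, "!": 500}


def plan_pauses(text):
    """Split text into (phrase, pause_ms) pairs at punctuation marks."""
    plan = []
    # Each match is a run of non-punctuation plus the mark that ends it,
    # if there is one.
    for phrase, mark in re.findall(r"([^,;:.?!]+)([,;:.?!]?)", text):
        phrase = phrase.strip()
        if phrase:
            plan.append((phrase, PAUSE_MS.get(mark, 0)))
    return plan


if __name__ == "__main__":
    for phrase, pause in plan_pauses("Hello, world. How are you?"):
        print(phrase, pause)
```

A comma gets a short pause and a period a longer one, which is roughly what a human reader does without thinking about it.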

The last step is signal processing: actually creating the sound. This takes the words and the rhythm and computes a waveform that sounds like the desired voice. It is done entirely in software, but requires many computations.

You can get all of this software, as well as links to third-party developers, at Apple’s Speech Web site, http://www.speech.apple.com


Brian Koponen (briankop@mail.idt.net)
