Say What?

Commentary

I wrote this week's to hire a recent speech recognition facility in Microsoft word 2040. That is, I should say that I won't display you would help of the speech recognition facility.

I'm sorry. That first paragraph was unedited. What I really meant to say was, "I wrote this week's tirade using the speech recognition facility in Microsoft Word 2003. That is, I should say that I wrote it despite the alleged help of the speech recognition facility." But the first paragraph is what the software came up with when I spoke those two sentences.

Considering how difficult it is for machines to comprehend speech, the results really weren't all that bad, but they still weren't particularly helpful. I'm not even close to being the world's fastest or most accurate typist, but I could have written this tirade in a small fraction of the time if I had used my fingers exclusively rather than speaking it and then correcting the numerous mistakes. Furthermore, I'm willing to bet that using the same speech recognition software to, for example, create U.N. documents without doing any manual editing would lead to a catastrophic international incident in 30 seconds or less.

I suppose I should be patient, which is not something I'm very good at. The software gets better the more you train it, or so they say. I've already gone through three 10-minute training sessions, and the software still types things like "ring sessions" when I say "training sessions" or, worse and much funnier, "all readable eagle will be" when I say "a rambling old fogy." That's hardly the sort of thing that an aspiring young author, to say nothing of a rambling old fogy like myself, would want.

The problem is partly my fault. I like to have the radio playing in the background when I work. It's nothing raucous, just an all-jazz station, but despite keeping the volume low, the sound still occasionally reaches my computer's microphone. Every once in a while, a few garbled lines of a Cole Porter song creep into my article and I have to delete them. I'm sorry, but if you want the words to I Love Paris, you'll have to find them yourself.

The problem of background noise suggests another issue. If it's ever perfected, speech recognition may be fine for me. I work in a lonely writer's garret. (Actually, it's not a garret. It's a room in my condo that I've turned into an office. Unlike a garret, the room is not in an attic or a loft. It's on the lower of two levels, but "garret" sounds more poetic than "home office.") Being alone, I can limit background noises, but what about people who work in the cubicle warrens that are typical of today's workplaces? All of the ambient noise is going to confuse their computers.

Worse, tempers will no doubt flare with the swelling cacophony caused by everyone's dictation, as they continually raise their voices so their computers can hear them over the other cubicle dwellers who are also competing to be heard. I certainly wouldn't want to work in that environment without strict gun controls in place.

If you decide to try out this speech recognition stuff, I have one recommendation. If the phone rings, remember to turn off your computer's microphone when you answer the call. Before modifications, this tirade contained an incredibly bad transcription of my end of a call from a belligerent telemarketer who tried to get me to change my long-distance phone plan. I deleted the text because you probably wouldn't be interested in the conversation and I'm not sure that my editor would have allowed all of my words--even after they were mangled by the software's considerable mishearing and misspelling.

Turning off the microphone after the phone rings won't entirely solve the problem, but, if you're fast enough, all that you'll need to delete is "a a," which is how the software transcribed the ring of my phone. I've got to remember to get a less anemic phone.

I don't want you to think that I'm taking too narrow a view of speech recognition. I realize that it is being used for more than just dictation. Some companies are using it to replace their "press one for frustration, press two for aggravation" automated telephone attendants. The first time I came across a phone system with speech recognition capabilities was probably a year or two ago. At the time, I had an HP Colorado tape drive and an HP OfficeJet G85 printer. I had a problem with the tape drive.

When I called the HP support number, an electronic voice provided the usual greeting. It then said something to the effect of, "Say which product you're calling about. For example, say Colorado tape drive or OfficeJet G85." I was dumbfounded. I wasn't sure whether an exceptionally clever developer had programmed extra-sensory perception into the system, whether I was the beneficiary of a colossal cosmic coincidence, or whether the system used caller ID to look up my product registration information. My money is on the last of those.

When I said "Colorado tape drive," the system understood and immediately transferred me to the correct department, which was both good news and bad. Writing these tirades has led to a serious occupational hazard: Whenever I find technology that, quite unexpectedly, works exactly the way it's supposed to, I have mixed emotions. One part of me is immensely grateful for not having been once again frustrated by technology. Another part says, "Darn, now I'll have to find another topic for this week's tirade." (I usually say something other than "darn," but it's not my intention to offend in any way, so I'll leave my words to your imagination.)

I shouldn't have worried about not having anything to write about. I ran into another Silicon Sam phone attendant only a few weeks ago. To say that the latest experience wasn't quite as pleasant as the first would be a gross understatement.

I recently upgraded my plain old cell phone to a smart phone. Unfortunately, my cell phone company totally screwed up the data services portion of the invoice for the new plan. Since the mistake was obscenely in the company's favor, I immediately called to address the problem. I got a speech recognition system.

After greeting me, the system spoke the phone number that I was calling from and asked if that was the number that I was calling about. Because I was not calling from my cell phone, I answered "no."

The system then said, "Speak the number you are calling about."

I enunciated my cell phone number as clearly as I could and the system responded with, "I didn't get that. Please say it again." After a couple of tries, it got it.

The system then said something like, "Please say what you are calling about." It didn't give many hints as to what it was looking for, so I gave it my best shot. "Cell phone data services billing," I said.

"I didn't understand that. Please try again."

"Cell phone data services."

"I didn't understand that. Please try again."

"Data services."

"Thank you. I'll transfer you."

In addition to cell phones, the company also offers cable services, Internet access, and a few other services, which meant that the probability of being transferred to the right department was slim. I'm not a lucky guy, so naturally the system put me through to the Internet technical service department. I got a human who, of course, couldn't help me with my cellular data billing problem, but he promised to transfer me to the correct department. Instead, I was sent back to the front door of phone system hell. I had to start all over again.

After getting a few more wrong departments, I finally found someone who was able to transfer me to the right person. That person told me the name of her department, which is the magic phrase that I was expected to pronounce clearly for the phone system when calling about this specific issue. How I was supposed to know that in advance is beyond me, but never mind. I copied down the department name and immediately lost the scrap of paper that I wrote it on, so I fervently hope that I don't have any more problems.

You may be wondering why I'm even bothering with speech recognition. Even if you're not wondering, I'm going to tell you anyway because the publisher is expecting a few more words here. I'm working on another project that I really don't want to do, and I'm grasping at any and every stupid distraction that I can possibly use as an admittedly irrational excuse to procrastinate. Speech recognition seemed to be a good candidate. It was. I've wasted an unbelievable amount of time using speech recognition. Unfortunately, that's put me much closer to the project deadline, making me even more panicked than usual, which is saying a lot since I'm normally an exceptionally nervous person.

I told a bit of a fib in the preceding paragraph. In truth, there were actually two writing projects that I was trying to avoid when I began to play with speech recognition, but one was this week's tirade, and I seem to have taken care of that. The other project is not at all tirade-related. That reminds me of something. This column probably isn't the best place to say this, but one of my favorite authors, Douglas Adams, once said, "I love deadlines. I love the whooshing noise they make as they go by."

It might be a good idea at this point if someone called 911 and asked them to send an ambulance to the home of Victoria, the editor. I think she may have passed out after reading the Douglas Adams quote.

Joel Klebanoff is a consultant, a writer, and president of Klebanoff Associates, Inc., a Toronto, Canada-based marketing communications firm. Joel has 25 years' experience working in IT, first as a programmer/analyst and then as a marketer. He holds a Bachelor of Science in computer science and an MBA, both from the University of Toronto. He would like to say that he used speech recognition software to type this sentence. He would like to say that, but he didn't because he didn't want to fix all of the software's mistakes.

Joel Klebanoff

Joel Klebanoff is a consultant, writer, and formerly president of Klebanoff Associates, Inc., a Toronto-based marketing communications firm. He has 30 years' experience in various IT capacities and now specializes in writing articles, white papers, and case studies for IT vendors and publications across North America. Joel is also the author of BYTE-ing Satire, a compilation of a year's worth of his columns. He holds a BS in computer science and an MBA, both from the University of Toronto.


