TechTip: Watson, APIs, and You, Part 1



If there’s anything hotter than that new bacon-kale-champagne salad, it’s Watson and its family of APIs. But what is Watson, and what does it mean for you?

Everywhere you look today, you see the little Watson symbol: in magazines, in TV ads, everywhere. It has become the Good Housekeeping seal of approval on any product or idea, stamping it with the authority and superiority of artificial intelligence.

But do very many of us really know what that stands for and what’s involved? Over the next few TechTips, we’re going to spend a little time digging into Watson and discovering just what makes Watson so special and how it may impact your life over the coming years. 

As we go through this, we will be looking at two basic entities: Watson itself and the APIs it has spawned. 

What Are APIs?

What a ridiculous question.

We all know what APIs are. They’re these things, little applications, that do stuff, usually stuff that we don’t want to take time to do ourselves. Like calculating the cosine of an angle. 
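
As a minimal illustration of that classic, do-one-thing kind of API, here is Python's standard math module handling the cosine for us. The 60-degree angle is just an example value, not anything from a Watson service.

```python
import math

# The caller only needs the interface: pass an angle in radians, get its cosine back.
angle_degrees = 60.0  # example value chosen for illustration
cosine = math.cos(math.radians(angle_degrees))

# Rounded for display, since floating-point results are rarely exact.
print(round(cosine, 6))
```

We never see how the cosine is computed, and we don't need to. That is the whole point of the old-style API.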

If you look at Wikipedia, it defines an API as “a set of subroutine definitions, protocols, and tools for building application software.” It goes on about how APIs are used as the building blocks of software routines and as communication modules between different applications. In other words, they’re bit players, doing specialized but not very glamorous functions. And at one time that was a pretty good definition.

But over the years, and particularly in the last few, that definition has undergone some rapid changes. Today, APIs have more to do with encapsulated, larger, more robust applications than they do with being building blocks that do just one thing. What separates an API from a more general application suite is that the API is not built to just stand on its own. Rather, it is designed to be plugged into an application that you have built, enhancing that application's functionality.

This was true in the past, but what is different now is the size and scope of what the API is designed to do. As we said, in the olden days, APIs were sort of point solutions, designed to do a very specific thing and nothing more. Today’s APIs are broader in scope and can really be considered application suites in their own right. But they are application suites that are built so that you can plug them into your own custom solution, and that breadth makes all the difference in terms of an API’s power and utility. 

For example, a Watson API that we will look at a bit down the road is Expressive TTS, which evaluates a written passage for emotionally oriented content. This is far more complex and broad-based than the APIs of my youth and is representative of the new breed of APIs that are being released. And, of course, it is Watson and so has a sort of supernatural vibe to it. As I said above, references to Watson are everywhere, but what is it really?

What Is Watson?

First, let’s get rid of the elephant in the room. We all know that IBM screwed up, once more, when they named it. It should have been called Skynet. It would have been a cultural tour de force, but it’s too late now, and the relatively unexciting moniker of “Watson” is what we are stuck with, although I feel it is fitting that Dr. John Watson, late of Her Majesty’s 5th Northumberland Fusiliers, finally received some recognition for his service in the Anglo-Afghan wars. 

If you look it up, Watson is a “cognitive technology that can think like a human,” although if that is true, I am really not sure how useful it will be. In a number of spots, I have seen it described as a “question-answering platform,” often with the qualifier that the questions can be asked in “plain language.” 

Digging into it, Watson is a hardware and software entity that uses powerful parallel processing to simultaneously search for multiple answers to a question and then choose the answer that appears most often.
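
To make that idea concrete, here is a toy sketch in Python of "ask several searchers at once, then keep the answer that shows up most often." Everything in it is an assumption made for illustration: the three searcher functions are hypothetical stand-ins that return canned answers, and nothing here reflects how IBM actually implemented Watson.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical searchers: each one stands in for an independent search strategy
# and returns a single candidate answer (canned values, for illustration only).
def keyword_search(question):
    return "Chicago"

def title_search(question):
    return "Chicago"

def passage_search(question):
    return "Toronto"

def most_common_answer(question, searchers):
    """Run every searcher concurrently and return (answer, votes) for the top answer."""
    with ThreadPoolExecutor(max_workers=len(searchers)) as pool:
        candidates = list(pool.map(lambda search: search(question), searchers))
    # Counter tallies how often each candidate answer was proposed.
    return Counter(candidates).most_common(1)[0]

answer, votes = most_common_answer(
    "Which city is it?",  # placeholder question
    [keyword_search, title_search, passage_search],
)
print(answer, votes)
```

A thread pool is only the simplest way to show the "simultaneously" part on one machine. The real system spread its work across a whole cluster.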

Hardware-wise, Watson runs on 90 Power 750 servers, each with a 3.5GHz POWER7 eight-core processor with four threads per core, for a total of 2,880 POWER7 processor threads and 16 terabytes of RAM. For the Jeopardy! competition, all of the data Watson referenced was kept in RAM, as disk access would have been too slow to compete with humans.
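
If you want to check the thread count yourself, the arithmetic is simple. This short Python snippet just multiplies the figures quoted above, assuming one eight-core processor per server as described.

```python
# Figures as quoted in the text; one eight-core processor per server is assumed.
servers = 90
cores_per_server = 8
threads_per_core = 4

total_threads = servers * cores_per_server * threads_per_core
print(total_threads)  # 2880
```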

On the software side, Watson is written in a hodgepodge of languages, including C++, Prolog, and Java, running on a Linux server with an Apache Hadoop framework. The key, however, is the use of the IBM DeepQA architecture.

DeepQA is a content analysis approach that takes in questions in natural language rather than in a structured query language.

The first thing it does with a question is decompose it, taking each part and looking for both the concepts on which each word is based and the relationships that word (or phrase) may have with other words and phrases in the question. Using this information, it can then build a “reconstruction” of the initial question that represents the “information need” for that question. Watson then searches its resources to assemble evidence related to that information need.

Once this process is completed, Watson begins a highly parallel process of constructing a set of answers using a number of different algorithms. Each answer comes with a set of supporting evidence, which is used to assign that answer a relative confidence level.
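
Here is a rough Python sketch of that last step, strictly as a thought experiment. The three "algorithms" are hypothetical functions returning made-up (answer, evidence score) pairs, and averaging the scores and then normalizing them is a scoring rule assumed for this example, not DeepQA's actual method.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical answer generators: each returns (answer, evidence_score) pairs.
# The answers and scores are invented purely for illustration.
def algorithm_a(question):
    return [("Chicago", 0.9), ("Toronto", 0.3)]

def algorithm_b(question):
    return [("Chicago", 0.7)]

def algorithm_c(question):
    return [("Toronto", 0.4), ("Boston", 0.2)]

def rank_candidates(question, algorithms):
    """Return (answer, relative_confidence) pairs, highest confidence first."""
    # Run all generators concurrently, mimicking the "highly parallel" step.
    with ThreadPoolExecutor(max_workers=len(algorithms)) as pool:
        results = list(pool.map(lambda algo: algo(question), algorithms))

    # Gather every piece of evidence under the answer it supports.
    evidence = {}
    for pairs in results:
        for answer, score in pairs:
            evidence.setdefault(answer, []).append(score)

    # Assumed scoring rule: average the evidence, then normalize so the
    # confidences are relative to one another and sum to 1.
    raw = {answer: sum(scores) / len(scores) for answer, scores in evidence.items()}
    total = sum(raw.values())
    relative = [(answer, value / total) for answer, value in raw.items()]
    return sorted(relative, key=lambda pair: pair[1], reverse=True)

ranked = rank_candidates("Which city is it?", [algorithm_a, algorithm_b, algorithm_c])
for answer, confidence in ranked:
    print(f"{answer}: {confidence:.2f}")
```

The useful part is the shape of the result: not one answer but a ranked list, each entry carrying a number that says how much the evidence backs it up.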

This is a great leap forward in terms of artificial intelligence (forming multiple answers and evaluating their potential correctness and doing it lickety-split), but it does leave me wondering if Watson is able to come up with anything new or outside of the conventional wisdom. But that’s a small point when you consider how many questions there are that can be answered within the confines of what we know now.

And the Future?

Today, Watson is being groomed for a number of missions as IBM looks for something more important than winning prize money on TV. Cancer research is one area that is being touted, and Watson is already being used to help design treatment plans based on the patient’s history and state. Business analysis applications are also under study. Over the next few TechTips, we will look at some of the APIs that IBM has already constructed and how you might be able to use them.

Obviously, this is just the start of what Watson and DeepQA can do, and by the time the robot police round up all of the survivors and move us to internment camps in Nebraska, today’s Watson will probably look pretty primitive.