We are going to turn our attention over the next two months to the Watson Studio and Watson Knowledge Catalog. Come on in and look around.
Unlike the APIs we have been looking at for a couple of years, each of which has a single focus and performs just one sort of task, Watson Studio is more of an environment, a place where you can work on a broad range of activities.
In short, Watson Studio is all about data. Your data. The data you will be feeding into Watson. The data you will be using to make your decisions.
Everyone knows that you can’t, for the most part, just load your data into a big CSV file and run it into Watson and get brilliant output.
The fact is that, even though Watson has facilities to deal with “natural language” (the kind of language you and I use on social media and in normal conversation), it’s still reasonable to expect that some amount of data manipulation will have to take place before the analysis.
And this is what data scientists do (in addition to claiming that their jobs are “sexy”). They look over data, make sure it’s consistent and clean, and then feed it in.
And Watson Studio is an environment in which you can do that.
Watson Studio and Watson Knowledge Catalog
The Watson Studio and Watson Knowledge Catalog apps are closely related and tightly integrated, and both deal with getting your data ready to use. Studio is the simpler of the two. I could talk about them together, but there’s enough to say about each that they deserve separate articles.
Studio lets you create projects that can be used to both access and prep your data. Here are some fun facts about how Studio does that.
Studio provides three roles, each with different permissions: Administrators, Editors, and Viewers. As you might expect, admins can do anything. Editors have fairly broad rights. They can look at and comment on everything and can add or edit data assets, but they can’t remove them and can’t do any of the management functions. Viewers can do very little (view analytic assets and comment on them).
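If it helps to see those three roles side by side, here is a minimal sketch of the permission rules as I just described them. This is not IBM's API. The action names and the can() helper are made up purely for illustration, and the real product may slice permissions more finely than this.

```python
from enum import Enum


class Role(Enum):
    ADMIN = "admin"
    EDITOR = "editor"
    VIEWER = "viewer"


# Hypothetical action names, invented for this sketch (not Watson Studio identifiers).
ACTIONS = {
    "view",
    "comment",
    "add_asset",
    "edit_asset",
    "remove_asset",
    "manage_project",
}

# Admins can do anything; editors can add and edit but not remove or manage;
# viewers can only look and comment.
PERMISSIONS = {
    Role.ADMIN: set(ACTIONS),
    Role.EDITOR: {"view", "comment", "add_asset", "edit_asset"},
    Role.VIEWER: {"view", "comment"},
}


def can(role, action):
    """Return True if the given role is allowed to perform the action."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return action in PERMISSIONS[role]
```

Calling can(Role.EDITOR, "edit_asset") returns True, while can(Role.EDITOR, "remove_asset") returns False, which is the editor restriction in a nutshell.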
Data Assets Points
This functionality allows you to import data from either the cloud or onsite locations, including your organization’s catalogs. You can then upload files to the project’s storage.
You then bring in streaming information and review it with the Streams Designer Tool. Many times, we think of streaming only in terms of social media, but this tool allows you to build your own streaming source and then bring that into your universe.
Finally, you clean and shape the data with the Data Refinery tool. Cleaning, of course, involves removing bad data, formatting improperly formatted data, eliminating incomplete data, and, especially, removing duplicate data. The other side, shaping, involves filtering, sorting, removing columns, and other activities that will result in data that is clean and well-organized. This is an area where data scientists typically spend a great deal of their time, but this tool helps speed up the process.
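To make "cleaning" and "shaping" a little more concrete, here is a small sketch in plain Python of what those steps look like when you do them by hand. It has nothing to do with how Data Refinery works internally. The sample rows, the column names, and the age cutoff are all invented for the example.

```python
# Made-up sample data with the usual problems baked in.
raw_rows = [
    {"name": "erin", "city": "Austin", "age": "41", "notes": "new"},
    {"name": " alice ", "city": "Boston", "age": "34", "notes": "x"},
    {"name": "Bob", "city": "Denver", "age": "", "notes": "y"},        # incomplete
    {"name": "Alice", "city": "boston", "age": "34", "notes": "z"},    # duplicate once formatted
    {"name": "Carol", "city": "Austin", "age": "abc", "notes": ""},    # bad data
    {"name": "Dave", "city": "Chicago", "age": "17", "notes": ""},
]


def clean(rows):
    """Fix formatting, then drop incomplete, bad, and duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        name = row["name"].strip().title()
        city = row["city"].strip().title()
        age_text = row["age"].strip()
        if not (name and city and age_text):
            continue  # incomplete row
        if not age_text.isdigit():
            continue  # age is not a usable number
        key = (name, city, int(age_text))
        if key in seen:
            continue  # duplicate of a row we already kept
        seen.add(key)
        cleaned.append(
            {"name": name, "city": city, "age": int(age_text), "notes": row["notes"]}
        )
    return cleaned


def shape(rows, min_age=18):
    """Filter by age, sort by name, and remove the notes column."""
    kept = [r for r in rows if r["age"] >= min_age]
    kept.sort(key=lambda r: r["name"])
    return [{k: v for k, v in r.items() if k != "notes"} for r in kept]


tidy = shape(clean(raw_rows))
```

Six messy rows go in and two tidy ones come out (Alice and Erin). Multiply that by a few million rows and a few dozen columns and you can see why a tool that speeds this up is welcome.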
Data Analysis Options
The final step in using Watson Studio is to analyze the data and draw your conclusions. And there are a number of ways in which this can be done.
The first is to use either the Python Jupyter notebooks or RStudio.
Alternatively, you can build, train, test, and deploy solutions based on machine-learning or deep-learning models. You would use a Spark instance and the Spark Flow Editor to set up the necessary flows and weights, which can then be adjusted as part of the training. In addition, you can go all Han Solo and do some deep-learning experimentation with neural networks. On a more mundane plane, you can use Studio and deep learning to classify images (training the Visual Recognition API).
Finally, and I think this is neat, you can create dashboards to show data visualizations without having to do any coding.
So, are ya? Scared, that is? Sounds pretty highfalutin and complex, don’t it?
Yeah, it probably is.
But if you start on the home page for the Studio and the Knowledge Catalog, IBM will lead you along. Of course, you have to be moderately intelligent and willing to spend some time digging into things, but it should be pretty easy.
Besides, what choice do you have? You’re living in the 21st century. You and your company have to be ready to move forward in this century, and it looks like this will be the century of machine intelligence.
Next month, more info, but this time on the Knowledge Catalog.