Last month, we looked at Watson Studio. Now we come to Watson Studio’s big brother, the Watson Knowledge Catalog. Let’s see what that’s about.
This month, I want to look at Watson Knowledge Catalog. I tend to think of it as Watson Studio’s big brother, but I’m not sure if that’s actually true. You decide when we get to the end of this. I trust your judgment.
One of the big differences between Studio and Knowledge Catalog is the idea of catalogs. With Studio, everything was set up for a project. But with Knowledge Catalog, the catalog is the big dog.
Catalog Management Systems
Catalog Management Systems are oriented around centralizing data and content that would otherwise be scattered throughout various vertical silos. This is a major problem in most companies. And the larger the company, the more problematic this becomes.
Having a catalog management system allows you to overcome that problem and make the data that’s really important to a company easily visible and organized. This content can be whatever you want to include: engineering drawings, customer information, pricing, or any other type of data, visual or textual.
Components of Watson Knowledge Catalog
The Watson Knowledge Catalog application, in IBM’s own words, “provides a secure enterprise catalog management platform that is supported by a data policy framework.”
And now, to translate this into eighth-grade English, working in reverse order.
- “Data policy framework” is a fancy name for a set of rules that determine just who is allowed to see what data. It should be tied into the company’s business rules, which determine what level of information each particular company role is authorized to see.
- “Enterprise catalog management platform”—Any time you see the word “enterprise,” you know that this is something designed to support the needs of a large company. In the jaded world that I inhabit, it also seems to be synonymous with “expensive,” but I’m sure that’s not the case with IBM. Seriously though (because I was totally kidding in the previous sentence), “enterprise” is a knight in full armor. Or to put it another way, it’s the Full Monty. The point is, it’s a robust system that can handle the most rigorous of requirements.
- “Secure”—Security is a key element and one that IBM has not overlooked.
In the end, the Knowledge Catalog is fully functional. The only question left is, what does it really do?
The first step is to index everything that you want to keep track of. Whatever you index becomes an asset in the catalog, and assets come in one of two kinds: data assets or analytic assets.
A data asset includes the data, information on how to access the data, the format of the data, a classification, and who is authorized to access that data. Data assets come in two basic flavors: structured (e.g., a spreadsheet or a database) and unstructured (e.g., text documents, including PDFs, MS docs, and Google docs).
Analytic assets are things like Jupyter notebooks (which are often used in data science applications) or an actual trained model.
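To make the pieces of an asset a little more concrete, here is a minimal sketch of what a catalog entry might carry. This is not IBM's data model or API. The class and field names (CatalogAsset, AssetKind, authorized_roles, and so on) are made up for illustration and simply mirror the list above: how to reach the data, its format, a classification, and who may access it.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class AssetKind(Enum):
    DATA = "data"          # structured or unstructured data
    ANALYTIC = "analytic"  # notebooks, trained models


@dataclass
class CatalogAsset:
    """Hypothetical catalog entry; every field name here is an assumption."""
    name: str
    kind: AssetKind
    location: str          # how to access the data (connection or path)
    data_format: str       # e.g. "csv", "pdf", "ipynb"
    classification: str    # e.g. "confidential"
    authorized_roles: Set[str] = field(default_factory=set)

    def can_access(self, role: str) -> bool:
        # Access is granted only to roles listed on the asset.
        return role in self.authorized_roles


# Example entries: one data asset and one analytic asset.
pricing = CatalogAsset(
    name="pricing-sheet",
    kind=AssetKind.DATA,
    location="db://sales/pricing",   # made-up connection string
    data_format="table",
    classification="confidential",
    authorized_roles={"sales_manager"},
)
churn_notebook = CatalogAsset(
    name="churn-analysis",
    kind=AssetKind.ANALYTIC,
    location="notebooks/churn.ipynb",
    data_format="ipynb",
    classification="internal",
    authorized_roles={"data_scientist", "sales_manager"},
)
```

Note that the entry stores a location rather than the data itself, which matches the idea that the catalog indexes things without necessarily moving them.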
Assets can be added in a number of ways.
- Leave your data where it is and just add a connection to the asset to the catalog.
- Discover and add all assets associated with a relational database that you have a connection to.
- Upload files to the dedicated, encrypted storage area associated with the catalog.
- Publish assets associated with a Watson Studio project.
- Add data sets from the Watson IBM Community.
- Import assets from the InfoSphere Information Governance Catalog.
You can always find assets that you want to add in a few ways. Search with keywords and filters that look at subject tags and other properties. Check out reviews of assets to see if they fit in with what you are looking for. Or choose from highly rated or suggested assets based on your previous usage, dreams, and the inevitable “other factors.” OK, maybe not dreams. That’s a bit creepy.
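If you want a feel for how keyword-plus-filter searching works in general, here is a small sketch. It is only an illustration of the idea, not how the Knowledge Catalog is implemented. The Entry class, the search function, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Entry:
    """Hypothetical searchable catalog entry."""
    name: str
    tags: Tuple[str, ...]
    rating: float  # average review score, assumed to run from 0 to 5


def search(entries: List[Entry], keyword: str,
           tag: Optional[str] = None, min_rating: float = 0.0) -> List[Entry]:
    """Keyword match on the name, optional tag and rating filters,
    with the best-rated hits first."""
    keyword = keyword.lower()
    hits = [
        e for e in entries
        if keyword in e.name.lower()
        and (tag is None or tag in e.tags)
        and e.rating >= min_rating
    ]
    return sorted(hits, key=lambda e: e.rating, reverse=True)


catalog = [
    Entry("Customer churn history", ("customer", "sales"), 4.5),
    Entry("Customer addresses", ("customer",), 3.0),
    Entry("Engineering drawings", ("engineering",), 4.8),
]

# Keyword only, then keyword plus a rating filter.
all_customer = search(catalog, "customer")
top_customer = search(catalog, "customer", tag="customer", min_rating=4.0)
```

Here all_customer holds both customer entries with the higher-rated one first, while top_customer keeps only the entry rated 4.0 or better.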
Work with Assets in Projects
The thing about the Knowledge Catalog is that it’s exactly what it says it is: a catalog. It’s a collection of things that you can use. But to use those assets you’re accumulating, you need to move them over to Watson Studio and create a project. You can also create projects in Studio to accumulate assets.
One of the key elements of the Knowledge Catalog is the fact that it controls access through the use of data policies. Data policies set up a series of rules that indicate who can access what in the catalog.
The way that the Knowledge Catalog does this is three-tiered. You start by defining categories of assets. Then, you attach policies to the categories. And finally, to the policies you attach rules that specifically determine who can access what.
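The three tiers described above (categories, then policies, then rules) can be sketched in a few lines of code. Treat this as a toy model under stated assumptions, not IBM's policy engine. The class names, the role names, and the deny-by-default behavior are all choices I made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Rule:
    """Tier 3: says whether one role is allowed or denied."""
    role: str
    allowed: bool


@dataclass
class Policy:
    """Tier 2: a named group of rules."""
    name: str
    rules: List[Rule] = field(default_factory=list)

    def permits(self, role: str) -> bool:
        # Assumption: deny by default, and any explicit deny wins.
        decisions = [r.allowed for r in self.rules if r.role == role]
        return bool(decisions) and all(decisions)


@dataclass
class Category:
    """Tier 1: a category of assets with policies attached."""
    name: str
    policies: List[Policy] = field(default_factory=list)

    def permits(self, role: str) -> bool:
        # Assumption: every attached policy must permit the role.
        return bool(self.policies) and all(p.permits(role) for p in self.policies)


pii_policy = Policy("pii-access", [Rule("data_steward", True), Rule("analyst", False)])
customer_info = Category("customer-information", [pii_policy])

# Hypothetical mapping from asset name to its category.
asset_categories: Dict[str, Category] = {"customer-list.csv": customer_info}


def can_view(asset_name: str, role: str) -> bool:
    category = asset_categories.get(asset_name)
    return category is not None and category.permits(role)
```

With this setup, a data_steward can view customer-list.csv, while an analyst, an unlisted role, or anyone asking about an uncataloged asset is turned away. Denying by default is the conservative choice for a sketch. A real deployment would follow whatever the company's business rules dictate.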
Together, Watson Studio and the Watson Knowledge Catalog provide both the cataloging and the processing of data and analytic assets. And the amount of control you place on all of this is up to you.