Whether you're a novice to the subject of big data, an in-the-field expert, or a manager trying to judge the applicability of big data in your organization, you will find something of value in Big Data Governance.
Big data is getting a lot of press in the industry, and though the term may sound like an oxymoron devised by an overly caffeinated marketing executive, big data is more than marketing hyperbole. It's a real and growing topic of interest and study by enterprises and governmental organizations worldwide.
Sunil Soares' new book, Big Data Governance: An Emerging Imperative, does a wonderful job of framing the requirements of data governance across the myriad opportunities that big data creates. Aimed at both technical and non-technical managers working in large organizations, Big Data Governance is a primer composed of case studies and frameworks that show how organizations can approach big data and take advantage of its potential by applying the industry-respected disciplines of data governance.
Defining Big Data
It's estimated that humans create 2.5 quintillion bytes of data every day. IDC estimates that the amount of information in the digital universe exceeds 1.8 zettabytes (1.8 trillion gigabytes) and is doubling every two years. In fact, 90 percent of this data has been created in the last two years alone.
This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few. These data sets (some organized, some disorganized, some structured, but most unstructured) are the subject elements of big data.
But what is big data in reality? The attributes of big data are often identified by four factors: the volume of information collected; the velocity at which it arrives or changes; the variety of the types of data that are captured; and the veracity of that information.
So what's the challenge?
If you can imagine standing in front of a fire hose of constantly changing information, overwhelming in its scope and intensity, that your systems cannot begin to accept or manipulate, you'll begin to comprehend the raw power of big data that your organization experiences every day. The challenge, and the subject of Soares' book, is learning how to harness that raw power and finding the tools your organization needs to make use of the information. Those tools are the processes of data governance, the discipline of constructing and maintaining standards for information. In other words, big data governance is the process of transforming vast quantities of bits and bytes into usable, actionable information.
About the Book
Big Data Governance is a comprehensive compendium of data governance frameworks and step-by-step processes applicable to the challenges of big data. It contains over 50 case study snippets illustrating the opportunities and experiences of real organizations in the field. These span industry sectors and functions including healthcare, utilities, retail, telecommunications, insurance, customer service, and IT.
At over 300 pages in length, the book offers the reader the opportunity to dive deeply into the topic at almost any level of detail. Yet its accessibility is perfect for managers who wish to quickly come up to speed with the evolving and rigorous disciplines of big data governance.
By far the most informative chapters are in Part Two, "Big Data Governance Disciplines," which covers everything from big data governance organization strategies, metadata, privacy, data quality, and business process integration to managing the lifecycle of big data. The writing and the comprehensiveness of the book reflect the richness of Soares' own experience as well as that of the numerous contributing authors.
The book is vendor- and platform-agnostic, with a chapter, "The Big Data Platforms," that discusses IBM, Oracle, SAP, Microsoft, HP, Informatica, SAS, Teradata, EMC, Amazon, Google, Pentaho, and Talend authoritatively. There is an ample Glossary of Terms, and the book is well indexed.
Whether you're a novice to the subject of big data, an in-the-field expert, or a manager trying to judge the applicability of big data in your organization, you will find something of value in Big Data Governance. Soares has done an excellent job of assembling an important information tool for his readers.