Get beyond the hype and conquer your IBM i data.
Written by SEQUEL Software, a division of Help/Systems
With organizations collecting more customer, transactional, and relational data than ever, data volumes are expanding past their systems' processing capacity. But big data doesn't have to be alarming. In this white paper, learn:
- Why big data matters
- The three Vs of big data: Volume, velocity, and variety
- How to build a conquering strategy
- The keys to solving the IBM i partitioning paradox
- What tool you need to go beyond the fear and hype of big data
Organizations have started to explore what they can do with their volumes of data, which has led to an intense interest in big data technology. Before investing in new solutions, it is critical to understand the scope of information growth, the challenges associated with big data access and analysis, and how to overcome those issues.
In 2005, the International Data Corporation (IDC) estimated the total volume of enterprise data at 130 exabytes. That number is on pace to double every two years, reaching 40,000 exabytes by 2020. Because traditional databases were not equipped to handle that volume, big data emerged as both a problem and a solution. On one hand, pulling together large volumes of information from numerous platforms and systems can offer more comprehensive insight not afforded by traditional analytics. On the other, big data, by definition, has such a large volume that legacy databases and software don't have the necessary architecture or functionality to deliver data quickly and effectively.
"Data at scale" is a popular data management practice for growing volumes on IBM i. Unfortunately, it creates a partitioning paradox. In dealing with rapidly escalating volumes, IBM i operators use table (or range) partitioning. This allows IT to break files into multiple components. Up to 32,767 data partitions can be created on IBM i, making it a viable option for breaking through capacity constraints.
However, legacy software such as Query/400 can access only one partition member at a time. In a big data environment, IT departments often need information from several members, which means running a separate query against each one. The issue of volume is not isolated to IBM i, either: it also creates demand for new storage hardware and makes processing tasks more resource-intensive.
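The one-member-at-a-time limitation can be sketched in plain Python. This is a toy simulation, not actual Query/400 or Db2 for i syntax: partition members are modeled as lists in a dictionary, and the member names and record fields are invented for illustration.

```python
# Toy model: a partitioned file as a dict of members, each holding
# one year's order records. Names and values are hypothetical.
partitions = {
    "ORDERS_2020": [{"id": 1, "total": 250}, {"id": 2, "total": 75}],
    "ORDERS_2021": [{"id": 3, "total": 400}],
    "ORDERS_2022": [{"id": 4, "total": 120}, {"id": 5, "total": 310}],
}

def query_member(member, min_total):
    """Legacy-style access: each call can see only one partition member."""
    return [r for r in partitions[member] if r["total"] >= min_total]

# With single-member access, a question spanning all years requires
# one query per member plus a manual merge step.
large_orders = []
for member in partitions:
    large_orders.extend(query_member(member, min_total=200))

print(len(large_orders))  # three separate queries to gather three records
```

A tool that queries across all partitions at once collapses that loop into a single request, which is the capability the next sections argue a big data strategy needs.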
Solutions are available for the sole purpose of managing big data. However, big data initiatives need to address the core issues associated with the technology:
• Volume of information
• Speed at which it must be accessed
• Wide range of data types and sources
Big Interest in Big Data Analytics
While big data creates challenges for IT professionals and business users, many decision-makers believe it's worth it. For instance, IDC's "Big Data, Analytics, and Cloud Drive Enterprise Software Growth in 2012" report attributed much of the revenue growth of the application development and deployment (AD&D) market to enterprise interest in analytics. Considering that AD&D accounted for 24 percent of all software revenue in 2012, the big data sector has definitely benefited from healthy investment.
Organizations that overcome the obstacles of big data can gain significant value from it, so it's important to adopt business processes and technology solutions that streamline data collection and management.
Gartner revealed that much of the pressure to adopt analytics solutions comes from the data center, with 42 percent of IT leaders already benefiting from big data deployments or planning to invest in the technology within a year. In fact, according to Gartner, one of the most prominent concerns among adopters is that their deployments have come too late, putting them behind the competition.
Big Data Challenges: The Three Vs
The defining attributes of big data (volume, variety, and velocity) make it challenging to deal with. The amount of data entering business technology ecosystems exceeds what traditional technology solutions were designed to handle, and it's only getting larger. In InformationWeek's "2013 State of Storage Survey," the percentage of respondents managing 100 terabytes of data or more nearly doubled between 2009 and 2012.
The exploding data volume issue is further complicated by the fact that information is coming in from a greater variety of sources. For IT managers, this means data must be collected from numerous systems that may be running different platforms.
When it comes to data variety, a large portion of data in enterprise IT systems today is considered unstructured, adding another layer of complexity to the problem. This type of information does not have a defined data model and does not fit well into traditional relational databases. Considering that researchers from Dataversity estimate between 40 and 80 percent of enterprise information is unstructured, it is essential to ensure that this information can be effectively categorized and searched.
The final V of big data—velocity—means that companies must be able to find and use critical information quickly. The value of any given data diminishes over time. For example, an IT employee looking at the number of scheduled jobs completed on time would benefit from knowing which ones failed sooner rather than later; the same concept applies to data analytics initiatives whether they deal with large or small amounts of information.
Want to Know More?