Real-time, heterogeneous data sharing puts information where it is needed, when it is needed, and in the formats that best serve diverse needs across the enterprise.
Editor's note: This article is an extraction from the Vision Solutions white paper "Real-Time Database Sharing: What Can It Do for Your Business," available free from the MC Press White Paper Center.
Managers striving to use smarter data to create smarter businesses need clear pictures of the state of their enterprise, without having to rely on isolated snapshots from a variety of incompatible data sources. Yet, in many organizations, data silos segregated by department or business function are built on incongruent platforms. In these environments, fashioning an integrated enterprise view presents major data formatting, transformation, and consistency challenges.
To overcome these obstacles, IT needs a way to seamlessly replicate data, in real time, from, for example, DB2 to Oracle, Informix, or SQL Server and back again. And, to make the solution practical, it must be done without the need for custom programming or large-scale integration efforts and without impacting the operation of existing applications.
The data silos that give rise to this challenge occur for a variety of reasons. Corporate mergers and acquisitions bring together disparate systems. In some cases, the benefits of selecting separate best-of-breed applications outweigh the costs of supporting a variety of platforms. New or upgraded applications run on new technologies, but legacy applications continue to run on old platforms. Or departmental IT autonomy may result in a lack of enterprise platform standards.
If all applications and data could remain isolated, any platform differences among them wouldn't create data integration problems, but that's not the case. Applications must often share data. What's more, business intelligence initiatives may introduce new technologies into the mix, creating the need for yet more data sharing. For example, production applications may employ DB2 for i databases, while users want to employ familiar spreadsheet and business intelligence tools that require, say, SQL Server databases.
The requirement may be even more complex. For example, an organization may need to stream data in real time not only from DB2, but also from Oracle, Sybase, Informix, and SQL Server databases running on a variety of IBM i, AIX, Windows, Solaris, Linux, and HP-UX platforms, all into, for example, a Teradata data warehouse.
These challenges are not new, but traditional solutions often don't suffice anymore. In the past, organizations integrated data by periodically copying it to a centralized data store. However, those processes took months to plan and implement. And the solutions didn't scale well, nor were they flexible enough to accommodate evolving requirements.
As an alternative, some companies use an application integration product or message-oriented middleware to provide a general framework for sending and receiving data between otherwise isolated applications. However, applications must be significantly reworked to use the middleware, and additional expertise is needed to support the messaging layer, which can require a significant investment.
Another approach is to use data replication to automatically transmit data updates from one database to another. This is the kernel of a preferred solution, but many replication products support only identical source and target databases. Furthermore, most do not offer the data extraction or transformation features that are usually critical requirements for optimizing data warehouses.
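To make the replication-plus-transformation idea concrete, here is a minimal, hypothetical sketch of one replication cycle between heterogeneous schemas, using two in-memory SQLite databases as stand-ins. The table and column names are illustrative; commercial products capture changes from the database journal or log rather than re-reading tables, but the capture-transform-apply shape is the same.

```python
import sqlite3

# Stand-ins for a source database and a differently-formatted target database.
src = sqlite3.connect(":memory:")
tgt = sqlite3.connect(":memory:")

# Hypothetical source schema: monetary amounts stored as integer cents.
src.execute("CREATE TABLE orders (id INTEGER, amount_cents INTEGER)")
src.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 2000), (2, 4550)])

# Hypothetical target schema: the same data, but amounts stored as dollars.
tgt.execute("CREATE TABLE orders (id INTEGER, amount_dollars REAL)")

# Capture, transform, and apply: read source rows, convert the data format,
# and write the result to the target.
for order_id, cents in src.execute("SELECT id, amount_cents FROM orders"):
    tgt.execute("INSERT INTO orders VALUES (?, ?)", (order_id, cents / 100.0))
tgt.commit()

print(tgt.execute("SELECT id, amount_dollars FROM orders").fetchall())
# [(1, 20.0), (2, 45.5)]
```

The transformation step in the loop is where a real product would also apply filtering or enrichment when feeding a data warehouse.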
Most RDBMS vendors provide replication services for their own databases, but replication among heterogeneous databases may require an add-on gateway product. These products are often difficult to manage, and they may not support replication between different releases of the same database, which greatly limits their scope and flexibility.
A far better solution is a product that, out of the box, supports data replication among heterogeneous hardware, operating systems, and databases. These tools can provide access across the enterprise to accurate, real-time data, while eliminating the need to undertake expensive, complex data-integration projects.
Make Versus Buy: The Quest for Return on Investment
Some IT departments consider building replication capabilities in-house, but they usually abandon the project after assessing the requirements of a heterogeneous replication solution that will be flexible enough to meet not just today's needs, but also those that will evolve over time.
When forecasting the ROI on an in-house replication development project, IT managers typically apply a fully burdened labor rate of US $125–150K per year when calculating the following costs:
- Development Costs: Data capture mechanisms must be developed for every database that will be a data source, and data apply mechanisms for every target database. If the solution must support two-way replication, mechanisms to handle data collisions are also necessary; a data collision occurs when the same data is updated simultaneously in different locations. Facilities will also have to be coded to handle data format differences between the participating databases. And it may be necessary to transform data to accommodate business intelligence requirements when using replication to feed data warehouses.
- Test System Costs: RDBMS licenses, LPAR resources, storage, and additional RDBMS add-on products such as gateways may be needed to create a test environment for an internally developed solution.
- Maintenance Costs: Maintaining an internally developed data-sharing solution requires ongoing coding changes to accommodate new or altered data schemas. And, unless comprehensive monitoring and management tools are developed along with the replication facilities, operators may spend considerably more time managing the solution than would be the case with a packaged solution. Furthermore, turnover in the IT department results in the loss of people familiar with your data and application structures, which can dramatically increase the costs of all of these activities.
- Opportunity Costs: Time that IT staff spends developing, testing, and maintaining a replication solution is time that they don't have available to work on other critical projects.
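The collision handling mentioned under development costs is one of the trickier pieces to build in-house. As a hedged illustration, here is a sketch of a common resolution rule, "last writer wins," which keeps whichever conflicting update carries the latest timestamp. The row structure and field names are assumptions for the example; real products support several resolution policies.

```python
from datetime import datetime, timezone

def resolve_collision(update_a: dict, update_b: dict) -> dict:
    """Return the winning update when the same row was changed on two sites
    between replication cycles (last-writer-wins policy)."""
    return max(update_a, update_b, key=lambda u: u["updated_at"])

# The same row (id 7) updated on two sites before replication could run.
site_a = {"row_id": 7, "status": "shipped",
          "updated_at": datetime(2011, 3, 1, 10, 0, tzinfo=timezone.utc)}
site_b = {"row_id": 7, "status": "cancelled",
          "updated_at": datetime(2011, 3, 1, 10, 5, tzinfo=timezone.utc)}

winner = resolve_collision(site_a, site_b)
print(winner["status"])  # cancelled
```

Even this simple policy assumes synchronized clocks across sites; alternatives such as site priority or manual review queues add further development effort, which is part of why in-house projects are often abandoned.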
The data-sharing solution you deploy should provide central, standardized, easy-to-manage, click-and-go facilities for sharing data between disparate databases and application silos without requiring programming changes or major integration projects and without impacting the performance of production systems. What's more, it should fully leverage the flexibility of open standards technology. This will allow the solution to adapt quickly to IT and business changes. The result will be a cost-effective solution that quickly generates an ongoing ROI.
For more information, please download the Vision Solutions white paper "Real-Time Database Sharing: What Can It Do for Your Business" from the MC Press White Paper Center.