Grid computing is an important new technology that enables secure, distributed, collaborative applications to function across a virtualized environment. An application designed to use grid technology can harness the unused power of idle servers to dramatically speed up computational simulations or spread the force of a peak load across multiple servers. Grid technology can also enhance collaboration between researchers across the globe by providing integrated access to research databases.
This article describes the basics of grid computing and provides examples of how grid technology can be used in business and science. It introduces important grid standards, including the relationship between Web Services and grid services, and describes the grid products that are currently available.
What Is Grid Computing?
Let's start with the basics: networking. At one level, a network is an infrastructure of wires, cables, switches, and routers that link together groups of computers. From another angle, a network is a collection of software running on those computers that uses that physical infrastructure to communicate through well-known protocols. We use networks all the time for email, Web browsing, file transfers, and much more.
Since at least the late 1990s, computer researchers have discussed ways of creating enhanced computer networks that do more than simply enable communication. These researchers wondered how computer users could take better advantage of idle computer resources. They wondered how data could be shared more efficiently among widely distributed physical locations.
In 2001, a trio of researchers published a seminal paper, "The Anatomy of the Grid," that addressed these questions. In their paper, Ian Foster, Carl Kesselman, and Steven Tuecke formally defined the grid computing field and proposed a set of open standards and protocols that grid computing technologies would use. Foster and his colleagues asserted that the main problem that grid computing should address is "coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations."
What Does Grid Computing Provide?
A virtual organization consists of companies or other organizations that share selected computing resources and data for a common purpose. While virtual organizations can vary widely in scope and purpose, they all use grid computing to accomplish two main goals: resource sharing and data sharing.
A virtual organization might include researchers from multiple universities sharing research data using grid computing. The Network for Earthquake Engineering Simulation (NEES) uses grid technology to help researchers across the United States share data and collaborate more effectively.
A more complex virtual organization might consist of numerous businesses or corporate departments that share computational resources for a common purpose. Such business collaborations need not be permanent; a virtual organization's grid infrastructure can be temporary, providing needed resources for a limited number of joint operations.
Many existing grid computing applications use otherwise idle computational resources to dramatically reduce the time required to perform calculations or simulations. For example, a company could use grid technology to borrow the computing power of idle email servers or file servers to speed up financial forecasting simulations during peak loads.
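The idea of spreading a simulation across spare machines can be sketched in a few lines of Python. This is only an illustration under stated assumptions: local threads stand in for idle servers, the function names (run_simulation_batch, forecast_mean_return) are hypothetical, and the "forecast" is a toy model that averages random returns. A real grid would dispatch each batch to a remote machine through grid middleware, and Python threads will not make this particular calculation faster; they only show the split-and-combine shape of the work.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_simulation_batch(seed, trials):
    """Simulate `trials` yearly returns and return their sum.

    Each batch uses its own seeded generator, so batches are independent
    and the overall result is reproducible.
    """
    rng = random.Random(seed)
    return sum(rng.gauss(0.05, 0.2) for _ in range(trials))


def forecast_mean_return(total_trials, workers):
    """Split the trials into equal batches, run one batch per worker,
    and combine the partial sums into a single average."""
    batch = total_trials // workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Worker i (a stand-in for idle server i) gets seed i and one batch.
        partial_sums = list(
            pool.map(run_simulation_batch, range(workers), [batch] * workers)
        )
    return sum(partial_sums) / (batch * workers)


if __name__ == "__main__":
    print(forecast_mean_return(total_trials=4000, workers=4))
```

Because each batch depends only on its seed and size, the batches can run anywhere and in any order, which is the property that makes this kind of workload a natural fit for a grid.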
Grid technology can also be used within a single organization. Emergent Game Technologies (formerly Butterfly.net) uses grid technology to provide effective load balancing of online games.
The Globus Alliance and IBM list other prominent examples where grid computing has been successfully used in science and business environments.
Grid Standards and Grid Toolkits
As grid computing has grown, organizations such as the Global Grid Forum and the Globus Alliance have worked to establish standards for grid technology. The ideas that were proposed in "The Anatomy of the Grid" were more formally defined in "The Physiology of the Grid." This paper introduced the Open Grid Services Architecture (OGSA), which forms the foundation of modern grid computing. OGSA defines grid services, which are essentially stateful Web Services that form the basic building blocks of grid computing. It also defines standard mechanisms for publishing, discovering, and managing grid services.
The Globus Alliance provides programming toolkits that can be used to develop and deploy grid services. These toolkits implement the OGSA and are available under a BSD-like open source license.
The Globus Toolkit 3.0 (GT3), released in June 2003, uses the Open Grid Services Infrastructure (OGSI) to provide the first reference implementation of OGSA. It gives programmers the tools needed to develop and deploy grid services, including Java APIs for creating and managing them and a number of standard grid services developed by Globus. Like Web Services, deployed grid services are generally hosted within an enterprise Java server; Globus includes a basic J2EE container with the toolkit for development and testing purposes. For additional information about GT3, refer to the GT3 Documentation and the Globus Toolkit 3 Core White Paper.
The IBM Grid Toolbox V3 for Multiplatforms is an enhanced OGSI-based grid toolkit based on GT3. It provides a simple installer and an easy-to-use Web interface. Additionally, it provides a more mature Java hosting environment for grid services.
OGSI-based grid services are based on Web Service standards, but they include several important extensions. Web Services provide a cross-platform mechanism for computer programs to communicate over the Internet using XML messages. Because the messages are standard XML, programs that communicate through Web Services need not be written in the same programming language. If you are not familiar with Web Services, the GT3 Programmers' Tutorial provides a more detailed overview.
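To make the XML message exchange concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The message layout (request, param, and response elements) and the "add" operation are invented for illustration; real Web Services use SOAP envelopes and WSDL-described operations, which this sketch deliberately leaves out.

```python
import xml.etree.ElementTree as ET


def build_request(operation, params):
    """Client side: encode an operation name and its parameters as XML text."""
    root = ET.Element("request", {"operation": operation})
    for name, value in params.items():
        ET.SubElement(root, "param", {"name": name}).text = str(value)
    return ET.tostring(root, encoding="unicode")


def handle_request(xml_text):
    """Service side: parse the XML, perform the operation, reply in XML."""
    root = ET.fromstring(xml_text)
    if root.get("operation") != "add":
        raise ValueError("unsupported operation")
    total = sum(float(param.text) for param in root.findall("param"))
    response = ET.Element("response")
    response.text = str(total)
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    message = build_request("add", {"a": 2, "b": 3})
    print(message)
    print(handle_request(message))
```

The two functions share nothing but the text of the message, so either side could be rewritten in another language without the other noticing.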
Unlike Web Services, grid services can be stateful and transient, and their life cycle can be managed. Taken together, these extensions have important implications. Each client can be presented with its own instance of a grid service. Each instance can hold data that persists for the lifetime of that instance. And each instance can persist for a specified time period and then be deleted.
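These three properties (per-client instances, per-instance data, and a limited lifetime) can be modeled in plain Python. The sketch below is not the GT3 API; the class names CounterFactory and CounterInstance, the handle format, and the injectable clock are all assumptions made for illustration.

```python
import itertools
import time


class CounterInstance:
    """One stateful service instance; its data lives as long as it does."""

    def __init__(self, expires_at):
        self.expires_at = expires_at
        self.value = 0

    def add(self, amount):
        self.value += amount
        return self.value


class CounterFactory:
    """Creates a separate instance per request and removes expired ones."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so lifetimes can be tested
        self._instances = {}
        self._ids = itertools.count(1)

    def create(self, lifetime_seconds):
        """Create a new instance and return a handle the client can keep."""
        handle = f"counter-{next(self._ids)}"
        expires_at = self._clock() + lifetime_seconds
        self._instances[handle] = CounterInstance(expires_at)
        return handle

    def lookup(self, handle):
        """Return the live instance; raises KeyError if unknown or expired."""
        self._destroy_expired()
        return self._instances[handle]

    def _destroy_expired(self):
        now = self._clock()
        expired = [h for h, inst in self._instances.items()
                   if inst.expires_at <= now]
        for handle in expired:
            del self._instances[handle]


if __name__ == "__main__":
    factory = CounterFactory()
    mine = factory.create(lifetime_seconds=60)
    yours = factory.create(lifetime_seconds=60)
    factory.lookup(mine).add(5)
    print(factory.lookup(mine).value, factory.lookup(yours).value)
```

Each client holds only a handle. The state stays with the instance on the hosting side and disappears when the instance's lifetime runs out.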
On January 20, 2004, Globus, IBM, and HP jointly announced the WS-Resource Framework (WSRF). Globus defines WSRF as "a set of six Web services specifications that define what is termed the WS-Resource approach to modeling and managing state in a Web services context." While earlier grid service specifications were built on Web Service technology with grid-specific extensions, WSRF was designed to provide a common set of specifications for grid services and Web Services. The adoption of WSRF signals the convergence of grid computing and Web Service technologies.
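One way to picture the WS-Resource approach is a service that keeps no state of its own and instead operates on a separately stored resource named in each message. The following sketch is a loose illustration of that separation, not an implementation of any WSRF specification; the ResourceStore class, the dictionary-based message format, and the action names are all hypothetical.

```python
class ResourceStore:
    """Holds state separately from the service that operates on it."""

    def __init__(self):
        self._resources = {}

    def create(self, key, initial_properties):
        self._resources[key] = dict(initial_properties)

    def get(self, key):
        return self._resources[key]


def handle_message(store, message):
    """Stateless handler: the message names the resource it applies to,
    and all state lives in the store rather than in the handler."""
    resource = store.get(message["resource_key"])
    action = message["action"]
    if action == "get_property":
        return resource[message["name"]]
    if action == "set_property":
        resource[message["name"]] = message["value"]
        return message["value"]
    raise ValueError(f"unsupported action: {action}")


if __name__ == "__main__":
    store = ResourceStore()
    store.create("job-42", {"status": "queued"})
    handle_message(store, {"resource_key": "job-42", "action": "set_property",
                           "name": "status", "value": "running"})
    print(handle_message(store, {"resource_key": "job-42",
                                 "action": "get_property", "name": "status"}))
```

Because the handler itself remembers nothing between calls, it behaves like an ordinary Web Service, while the resource it addresses carries the state.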
WSRF supersedes OGSI as the basis for new OGSA implementations. Ian Foster described the relationship in his keynote introducing WSRF at GlobusWORLD 2004: "WSRF restates OGSI in Web Service terms." The Globus Toolkit 4.0 (GT4), released on April 29, 2005, features an OGSA implementation based on WSRF. GT4 programming APIs are available for Java, C, and Python. Because of the differences between OGSI and WSRF, legacy grid services written for GT3 will likely require some porting to run under GT4.
The Globus Toolkit Official Documentation includes installation guides, API documentation, migration guides, and numerous reference documents for each Globus Toolkit release. For users new to grid technology and eager to try practical examples of grid services, the GT3 Programmers' Tutorial by Borja Sotomayor provides an excellent step-by-step guide to programming grid services. No prior experience with grid services is required. An updated tutorial for GT4 is also available. The Globus Documentation Project hosts these tutorials.