For years, analysts and systems engineers have been stumped by a basic conflict: How do you maximize performance while sustaining increasing numbers of users and increasing quantities of data? As you add more users or more data to an information system, performance degrades, and though different systems degrade at different rates, the trend is invariable: The more demand you add, the less performance you achieve.
During the 1980s and 1990s, the credo was to throw more hardware resources at diminishing performance curves. Even so, it was not uncommon for IT computing resources to run at dreadfully low utilization--between 50 and 60% of capacity--to guarantee sub-second response time. IT's rationale was simple: To sustain adequate performance--particularly in Windows-based server systems--it was cheaper to add more memory or disk than to try to resolve structural bottlenecks or fine-tune the operating system.
Performance and Capacity Advantages of Mini and Mainframe Systems
In comparison to PC servers, mainframes and mini-computers (like the AS/400) have traditionally had better track records for maintaining performance while leaving less capacity unused. For instance, it's not uncommon to hear of iSeries systems providing sub-second response even while running at better than 80% of capacity. The reason is that the i5/OS operating system (and OS/400 before it) virtualizes the physical resources of the hardware so that elements such as storage, memory, and especially workloads can be fine-tuned and better managed.
However, as PC server hardware prices dropped, it became simpler for IT to deploy ever more turnkey PC servers, networked together, in an attempt to resolve performance issues for Windows-based systems. In other words, instead of rallying behind systems that offered better performance management tools, most IT departments chose to implement the lower-priced systems.
Step by step, IT began to equate high levels of server performance with lower levels of capacity utilization and increased network and operating system complexity.
Typical Growth Patterns Toward Complexity
Consider the typical scenario of IT growth in a Midwestern manufacturing company. Its management chose to buy a new application for use by its production department. It was a Windows Server application that promised to fill the requirements of the department nicely. However, the turnkey package required that the application run on its own server hardware, with its own custom-tailored configuration. The salesman said the best performance would be achieved if there were plenty of memory and disk available. Memory and disk were cheap, so IT brought in the system.
Likewise, the product design department had a need for a Computer Aided Design (CAD) system, and it too needed its own piece of customized hardware running a different operating system.
Meanwhile the company's accounting department invested in an ERP system running on an i5 running i5/OS.
Department by department, the manufacturing company automated its workgroups, only to discover that--at the end of the automation process--it had created a complex network of servers and applications that, by design, were running at only 60% of overall potential capacity.
Complexity Complicates Performance
The problem with this kind of uncoordinated systems growth is that the inherent complexity of this network soon begins limiting IT's ability to sustain the performance of the overall information system.
For instance, as the company moves towards implementing e-business applications across the Internet, the IT environment will quickly become highly complex. More than likely, the information system will consist of some combination of routers, edge servers, Web servers, Web application servers, EJB servers, legacy transaction servers, and database servers, many of which may run on different hardware and operating systems.
How can IT ensure that the overall system is performing as expected in this type of multi-tiered environment? The answer is, IT can't. Unfortunately, basic questions about performance will remain unanswered:
- Are work requests completing successfully? If not, where are they failing?
- Are successful work requests completing within the expected response time? If not, where are the bottlenecks?
- How many work requests were completed over some period of time compared to prior periods? Is the workload growing?
- Are system-level resources being used for optimal performance? If not, can they be dynamically redirected to alleviate bottlenecks?
To accurately answer these questions, you must have the ability to do the following:
- Identify work requests based on business priority.
- Track the performance of work requests across server and subsystem boundaries.
- Manage the underlying physical and network resources to achieve your specified performance goals.
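The second capability above--tracking a work request across server and subsystem boundaries--is typically achieved by tagging each request with a correlation identifier at the edge of the network and recording how long each tier spends on it. The following Python sketch is purely illustrative (the tier names, timings, and helper functions are hypothetical, not EWLM's actual interface), but it shows how per-tier timings keyed by a shared identifier let you locate a bottleneck:

```python
import time
import uuid

# Hypothetical per-tier timing log: correlation ID -> list of (tier, seconds)
timings = {}

def record(corr_id, tier, seconds):
    """Record how long one tier spent on a work request."""
    timings.setdefault(corr_id, []).append((tier, seconds))

def handle_request():
    """Simulate one request flowing through three tiers of the stack."""
    corr_id = str(uuid.uuid4())  # tag the request once, at the edge
    for tier, work in [("web", 0.01), ("app", 0.03), ("db", 0.12)]:
        start = time.perf_counter()
        time.sleep(work)  # stand-in for real work in that tier
        record(corr_id, tier, time.perf_counter() - start)
    return corr_id

def bottleneck(corr_id):
    """Return the tier that consumed the most time for this request."""
    return max(timings[corr_id], key=lambda t: t[1])[0]

cid = handle_request()
print(bottleneck(cid))  # prints "db", the slowest simulated tier
```

Without a shared identifier propagated across tiers, each server sees only its own slice of the work, which is exactly why a heterogeneous network cannot answer the questions above on its own.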
But in a heterogeneous environment such as the one described above, it's not possible to treat all the components of the overall information system as a single resource. In fact, it's not even always possible to identify an individual work request or transaction.
Not so, of course, with the iSeries i5. Yet though the i5 and its predecessor iSeries and AS/400 systems have many good tools to answer these kinds of questions, today these systems are only one part of a larger complex of devices and platforms.
IBM's Virtualization Engine Suite
To address these performance issues, IBM began developing the concept of a Virtualization Engine Suite nearly four years ago. This suite of tools was designed to help organizations better manage the capacity and performance of their overall information systems.
Today, IBM's Virtualization Engine Suite for Servers is a pre-tested, multi-platform set of tools for a variety of server operating system environments. But these tools are not for everyone. They only make sense in the most complex heterogeneous environments.
The Virtualization Engine Suite is packaged in two flavors: one for IBM operating systems like AIX and i5/OS, and one for Windows and Solaris. Both, according to IBM, are built from interchangeable components that make up the Virtualization Engine's architecture:
- Enterprise Workload Manager (EWLM): This component enables the customer to automatically monitor and manage multi-tiered, distributed, heterogeneous or homogeneous workloads across an IT infrastructure to better achieve defined business goals for end-user services.
- Systems Provisioning Capability: This component creates a virtual representation of pooled resources that are shared between different workloads. Systems provisioning is delivered by IBM Tivoli Provisioning Manager 2.1. This capability supports the separation of the physical view of resources from the logical view.
- IBM Director Multiplatform: This component helps deliver a common, consistent, cross-platform systems management solution for IBM servers, storage, and operating systems. It provides a single administrative console for management tasks (operating system management, storage management, distributed systems management, and platform management), a common management infrastructure for upward integration with Tivoli, and a management foundation for the implementation of an on demand architecture.
- Virtualization Engine Console: The console is based on the IBM Integrated Solutions Console framework to provide a consolidated view for managing the virtualized enterprise resources. The Virtualization Engine Console is used to manage the IT environment by looking at the overall systems beyond operating system boundaries and helping to better maximize resource sharing.
- IBM Grid Toolbox: This component is based upon the Globus Toolkit V3.0, and its purpose is to create a connection between various combinations of resources to construct a security-rich, robust computing grid infrastructure.
In other words, IBM's Virtualization Engine provides the ability for large customers that have many different operating systems and network components to begin looking at their entire infrastructure as a single entity. By "virtualizing" these resources, administrators can begin to measure how their systems are responding as a whole and then pool and manage these resources according to the workload requirements.
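The pooling idea can be reduced to a toy model: a shared pool of processing capacity is divided among workloads in proportion to their current demand, rather than each workload owning a fixed, mostly idle server. The sketch below is a simplification of the concept, not IBM's implementation--the workload names and units are hypothetical:

```python
def allocate(pool_capacity, demands):
    """Split a shared capacity pool among workloads in proportion to demand.

    pool_capacity: total processing units available in the shared pool.
    demands: mapping of workload name -> requested units.
    Returns a mapping of workload name -> granted units.
    """
    total = sum(demands.values())
    if total <= pool_capacity:
        return dict(demands)  # pool can satisfy everyone in full
    scale = pool_capacity / total  # oversubscribed: scale all grants down
    return {name: units * scale for name, units in demands.items()}

# Three departmental workloads contending for a 10-unit pool.
grants = allocate(10, {"erp": 6, "cad": 6, "web": 3})
print(grants)  # each grant scaled by 10/15
```

Contrast this with the manufacturing company described earlier, where each department's dedicated server held 40% idle capacity that no other workload could touch.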
Using Open Standards for Performance Monitoring and Connectivity
IBM's strategy for its Virtualization Engine Suite is to pull together the different components using industry-standard APIs and protocols and to rely heavily upon its own product base of middleware tools. These tools include WebSphere built with J2EE and its Tivoli management suite of products.
The Virtualization Engine also relies heavily upon the technology provided by IBM's Power 5 processors. The Power 5 technology allows as many as 10 virtual partitions to run multiple versions of operating systems simultaneously. These partitions are controlled by a supervising program called a Hypervisor. The Hypervisor supports partitioning and controls the multiple operating system environments. The Power 5 technology also enables virtual local area networks (VLANs) to virtualize the resources available in the physical network. It also allows virtual I/O to enable adapters and other devices to be emulated in memory.
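The partitioning described above can be illustrated with a toy model: each partition receives a fractional entitlement of the physical processors, and the hypervisor refuses to overcommit the machine or exceed the partition limit. This sketch is an illustration only--the class, method names, and checks are simplifications, not the Power 5 Hypervisor's real interface:

```python
class ToyHypervisor:
    """Illustrative model: partitions share physical CPUs by fractional entitlement."""

    MAX_PARTITIONS = 10  # the per-system partition count cited above

    def __init__(self, physical_cpus):
        self.physical_cpus = physical_cpus
        self.partitions = {}  # partition name -> entitled CPU fraction

    def create_partition(self, name, entitlement):
        """Reserve a fractional share of the physical CPUs for one OS image."""
        if len(self.partitions) >= self.MAX_PARTITIONS:
            raise RuntimeError("partition limit reached")
        if sum(self.partitions.values()) + entitlement > self.physical_cpus:
            raise RuntimeError("not enough physical capacity")
        self.partitions[name] = entitlement

hv = ToyHypervisor(physical_cpus=4)
hv.create_partition("i5os_prod", 2.5)   # e.g., an i5/OS production image
hv.create_partition("linux_test", 0.5)  # a small Linux test image
print(sum(hv.partitions.values()))      # 3.0 of 4 CPUs committed
```

The point of the model is that entitlements are fractions of a pooled resource, so an operating system image no longer maps one-to-one onto a physical box.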
i5/OS Similarities to the Virtualization Engine
Meanwhile, if all this sounds suspiciously similar to what occurs within the i5/OS (OS/400) operating system itself, you shouldn't be too surprised. i5/OS has traditionally offered more virtual management capability than any other operating system in IBM's eServer line. Many of the concepts of pooled resources, work requests, and resource allocation for performance monitoring derive from IBM's long experience with both mainframe and OS/400 operating systems. That's one of the reasons these mainframe and mini-computer systems have historically been more productive than PC server systems.
The difference is that now IBM is extending these same concepts of virtualization across the entire IT infrastructure, piecing together computing resources to provide management services for the entire information system. The Virtualization Engine offers large organizations the opportunity to construct an i5-like infrastructure--composed of multiple servers, operating systems, and devices--that functions as one large, heavily managed and controlled i5 information system.
IBM's Strategies for the Future of Computing Performance
Of course, IBM considers today's Virtualization Engine technologies transitional. Its ultimate goal is to move companies toward an on demand infrastructure in which all computing resources are virtualized. This will require increasing the power of processor technology, continuing the development of open standards for grid computing, and building more comprehensive supervisory Hypervisors that can link together and control new computing devices and mechanisms.
When IBM's model for computing has become completely virtualized, according to analysts, our IT organizations will finally enter an era in which all computing resources can be sold as a commodity, like water from a tap. If you want to compute, you turn the spigot to on. If you want more computing power, you turn the spigot harder.
This is IBM's goal for performance as well. Instead of trying to balance the age-old conundrum of performance against capacity, IBM wants to provide an unlimited computing service to its customers, to be billed as it is used. In that light, IBM's Virtualization Engine can be seen as merely the first step in that long road toward its goal of computing resources available on demand.
Editor's Note: To read more about the Virtualization Engine, see "Practical Virtualization."
Thomas M. Stockwell is Editor in Chief of MC Press Online, LP.