A couple of years ago, the relational database of the AS/400 had no name. Then, IBM referred to it as DB2/400. As of OS/400 V4R4, however, IBM has christened the relational-database-with-no-name as Universal Database (UDB). Find out the history of UDB and what new features will be coming with this new, improved database.
The AS/400's integrated database in V4R4 has been renamed DB2 Universal Database (DB2 UDB). However, we AS/400 advocates know that DB2/400 by any other name would smell as sweet. So why the new name? Isn't UDB just the same old DB2? UDB is the same database we have come to know and love but with a number of powerful new features. (These features won't come immediately with V4R4, though; they will not be available until the third quarter of 1999.) DB2 UDB has become IBM's cross-platform relational database (RDB). For us, this is a significant point because our good old DB2 is now on multiple hardware and software platforms, including Intel/Windows NT/98, Intel/OS/2, PowerPC/AIX, SPARC/Solaris, and HPPA/HPUX. With the increased interest in multiplatform Internet applications, our knowledge of DB2, and hence UDB, is becoming very marketable.
Get with the SQL Program
If you're not an SQL proponent, stop reading, because all the enhancements to DB2/400 that come with its new name are available only through SQL. But before you sign off, give me a moment to sell you on SQL. SQL is the true cross-platform language; DB2/400 is one of the few RDBs that allows you to create, read, update, delete, and otherwise maintain your data outside of SQL. Speed comparisons of record-level versus SQL access aside, SQL is the best way to describe your data, the best way to perform set-based processing, and the best way to perform ad hoc updates and retrievals. (And many SQL champions will argue the record-level access issue with you, though they tend to be experts in query optimization.) When RDBs were first invented, the intended method for manipulating data was always relational algebra. Now, you can quit reading, that is, if you want to ignore the power and ubiquity of SQL.
A History Lesson
UDB was not developed recently; it has been on many IBM platforms for several years now. When UDB first became available, however, it was not called a universal database but rather an object-relational database (ORDB). A half-dozen years ago, vendors of RDB systems (such as Oracle, IBM, and Computer Associates) came out with ORDBs to compete with a new class of database systems: object-oriented databases (OODBs). OODBs came into vogue with the advent of object-oriented programming languages, initially with the success of C++ but more recently with Java. These OO languages dealt with objects, which, conceptually, are data structures tightly coupled with the functions that manipulate the data of those structures. RDBs are also structures of data that can be mapped to OO languages. The problem, however, was that RDBs did not have the facilities to handle the nontraditional data types that modern applications required, such as images, movie clips, graphics, spreadsheets, and also lengthy character-based documents such as legal records or international shipping specifications.
I worked with an OODB for almost two years, and even though I am an OO champion, I was not an OODB advocate. OODBs had problems. They reflected a completely new technology filled with unforeseen glitches. But the biggest problem I had was that OODBs had no query facilities. There was no way to verify database update code without writing more code, and then who's to say that the verification code was correct? Also, several years ago, there was no way to create ad hoc reports with OODBs. Today, even though OODBs are making query facilities available, they're still kludgy. The point is, when I'm developing business applications using C++, Java, RPG, or some other programming language, I want my RDB.
Regardless of what I thought, OODBs were gaining in popularity, and to compete, RDB vendors developed what were initially called ORDBs, which provided BLOBs (binary large objects) and CLOBs (character large objects) to support the nontraditional data as well as user-defined data types and user-defined SQL functions. But ORDBs aren't really object-oriented; they don't have inheritance or polymorphism capabilities. Hence, in my opinion, the change in name from object-relational database to universal database.
Today's applications need controlled storage for large pieces of data. Just look at the C drive on your PC, and you'll see a variety of data types, including spreadsheets, images, and documents. Now, associate these data types with applications. Consider a real estate application, for example, which might require the controlled storage of graphical images of homes along with spreadsheet analyses of cost estimates. In this same package, a mortgage application might have lengthy, character-based legal documents as well as signatures that are not quite as long but, being binary, are nontraditional. Current applications also need data types that are strongly typed. For instance, a data type called Money might be a fixed-point, two-decimal, 10-digit field. The Money data type would have type-safety because you can't add an unrelated integer value, such as a U.S. Postal Service ZIP code, to it. Applications today also need reusable functions that are more tightly coupled with data than with traditional application functions. UDB gives us these capabilities so we don't have to resort to OODBs (and OO languages) to create modern applications.
IBM says its latest RDBs are universal because they have universal access, application support, extensibility, scalability, reliability, and management. They have universal access because UDB is accessible from a wide variety of protocols (see Figure 1). They have universal application support with Domino, Java, Net.Data, and legacy languages. They are universally extensible with their BLOBs, CLOBs, user-defined data types and functions, and datalinks (yet another strategy for supporting nontraditional data that I'll get into later). These RDBs are universally scalable because of the symmetrical multiprocessing (SMP) and very large database (VLDB) capabilities of IBM platforms (the AS/400, for instance, supports up to 128 terabytes of disk storage for data warehousing). They are universally reliable because of IBM's history of stable platforms (the AS/400 is known for its 99.9 percent availability rate). They have universal management because UDB can be configured and maintained with a graphical interface that is the same regardless of the UDB platform.
The BLOB Meets Rochester
Going back to my real estate example, see Figure 2 for the SQL syntax required to create a file (table in SQL parlance) called House. It's pretty much the same old SQL create-table syntax but for the BLOBs and CLOBs. The HouseImage field is defined as a 2-MB BLOB. The Contract field is a 1-MB CLOB. The Signature field is another BLOB field that proves John Doe did, in fact, sign the contract. But what about that SalesPerson field, with its Datalink data type? Often, your LOBs may not reside on the same machine as your UDB, or you simply don't want LOBs stored in your RDB. In UDB, the datalink data type allows you to specify the URL that refers to a LOB on your local or a remote machine. This allows you to keep LOBs in the Integrated File System (IFS), which is optimized for byte-stream nontraditional data like images. The SalesPerson field can store a URL that points to the location of LOB data on some platform connected via the Internet.
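As a rough sketch of how a datalink value might be stored and read back, the statements below use DB2's DLVALUE and DLURLCOMPLETE scalar functions against the House table in Figure 2. The host name, file path, and sample name values are made up for illustration, and the sketch assumes the column was created without link control.

```sql
-- Store a URL (hypothetical server and path) in the Datalink column
INSERT INTO House (LastName, FirstName, MiddleInit, SalesPerson)
VALUES ('Doe', 'John', 'Q',
        DLVALUE('http://myserver/sales/jsmith.jpg'));

-- Read the complete URL back out of the Datalink column
SELECT LastName, DLURLCOMPLETE(SalesPerson)
  FROM House;
```

Only the reference lives in the table; the image itself stays wherever the URL points.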
User-defined Data Types
The House file does not contain a field for the sale price. Perhaps I should have created one using the SQL Decimal data type, for example Decimal(10,2).
But remember that I talked about a Money data type, so why don't I go ahead and create this new data type with the following SQL statement:
CREATE DISTINCT TYPE Money AS Decimal(10,2) WITH COMPARISONS;
Now, I have a new data type called Money. The Decimal data type declaration that follows the AS clause is the source type; it must be one of SQL's built-in data types. The WITH COMPARISONS clause serves as a reminder that instances of the new distinct type can be compared with each other using six comparison operators: =, <, <=, >, >=, and <>. This clause allows comparisons between fields of the same distinct type. Because they can be compared, so too can they be used by the SELECT statement's ORDER BY, GROUP BY, and DISTINCT clauses.
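To see the distinct type at work, here is a minimal sketch that gives the House table a price column typed as Money and then sorts on it. The column name SalePrice is an assumption made for this example.

```sql
-- SalePrice is a hypothetical column name
ALTER TABLE House ADD COLUMN SalePrice Money;

-- Money values can be compared with one another, so ORDER BY works
SELECT LastName, FirstName, SalePrice
  FROM House
 ORDER BY SalePrice DESC;

-- DISTINCT also relies on comparing Money values
SELECT DISTINCT SalePrice
  FROM House;
```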
The problem with user-defined data types is that the type-safety is so strong that you can't easily perform mathematical operations on dissimilar data types. To add a standard decimal data type field to a type of field such as my new Money data type, for instance, you would need to use a technique known as casting. For example, suppose I want to increase all salaries by 5 percent. If the Salary field is of my Money type, I would have to use the following SQL statement:
UPDATE Employee SET Salary = Money(decimal(Salary) * 1.05);
The function called decimal casts the value of the Salary field to be of the decimal data type so it can be multiplied by 1.05. The Money function takes the result of that calculation (a decimal value) and converts it to the Money data type. These two functions (the decimal function that converts a Money parameter to a decimal and the Money function that converts a decimal parameter into a Money value) were automatically generated when I created the distinct data type Money. You would think that these function calls increase processing, but in practice, they process very efficiently because, after all, the Money data type really is a decimal.
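The same conversion can also be written with the standard CAST specification instead of function notation. The sketch below assumes the Employee table and Salary column from the UPDATE statement above; the 50000.00 threshold is an arbitrary value chosen for illustration.

```sql
-- Same effect as Money(decimal(Salary) * 1.05), written with CAST
UPDATE Employee
   SET Salary = CAST(CAST(Salary AS Decimal(10,2)) * 1.05 AS Money);

-- A literal must be converted to Money before it can be
-- compared with a Money column
SELECT Salary
  FROM Employee
 WHERE Salary > Money(50000.00);
```

Which notation to use is largely a matter of taste; the function form is shorter, while CAST makes the target type explicit.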
You might expect to be able to add two values typed as my new Money data type by taking the following approach:
SELECT 'Total price: ', StickerPrice + RustProofing
FROM Cars -- hypothetical table name; the original statement does not show one
WHERE Make = 'HONDA' AND Model = 'CIVIC' AND Year = 1997;
Not so, however. You need to think of the plus (+) operator as a function, the add function. When you want to add two integers together, SQL already has an add function to do that, but when you want to add two values of a user-defined data type, you need to create an add function. Thankfully, the process is pretty easy:
CREATE FUNCTION "+"(Money, Money)
RETURNS Money
SOURCE "+"(Decimal(), Decimal());
The SOURCE clause says to convert each Money parameter to a decimal value and to use the built-in decimal data type's plus function to perform the addition operation. Optionally, I could have created a function called add:
CREATE FUNCTION add (Money, Money)
RETURNS Money
SOURCE "+"(Decimal(), Decimal());
But then the SELECT statement above would have to say add (StickerPrice, RustProofing). The use of the plus operator is more intuitive.
If you want to be able to multiply a Money value by an integer, you have to create a multiplication function:
CREATE FUNCTION "*"(Money, Integer)
RETURNS Money
SOURCE "*"(Decimal(), Integer);
And if you further want to perform aggregate functions, such as averaging, you have to explicitly define such functions:
CREATE FUNCTION avg (Money)
RETURNS Money
SOURCE avg (Decimal());
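Once sourced functions like these exist, queries can apply the operators directly to Money columns. A brief sketch, again assuming an Employee table with a Money-typed Salary column:

```sql
-- Resolves to the multiplication function defined for (Money, Integer)
SELECT Salary * 2
  FROM Employee;

-- Resolves to the avg function defined for Money
SELECT avg(Salary)
  FROM Employee;
```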
So far, all the functions I have created turn around and ask SQL built-in functions to do the dirty work, but you can also program these functions yourself with your favorite high-level language (HLL). You may want to create functions that define behaviors that are significant to the distinct data type. For performance, you might want to create functions that execute some sort of predicate processing to decrease the resulting set before transferring the data to the client:
CREATE FUNCTION orderSelect (Money, Varchar(30))
RETURNS TABLE (orderNum Integer, customerNum Integer, total Money)
EXTERNAL NAME ORD023RG
Cross-platform Databases, Applications, and Programmers
A couple of years ago, the AS/400 was considered a has-been. Now, the AS/400 has been reborn with a RISC-based, 64-bit, reliable, and secure operating system and an RDB that is standard across a wide variety of platforms and operating systems. Thanks to this rebirth, our applications and careers seem to have a bright future. DB2 UDB also gives us the vehicle to transport our applications, databases, and careers to other platforms. The only potential caveat is that we'll have to use SQL to get there, a small price of admission.
ODBC Microsoft Windows Database Access Standard
OLE DB Microsoft Windows Information Access Standard
ADO Microsoft Windows Information Access Classes
JDBC Java-based Database Connectivity
SQLJ Java-based Embedded SQL
SQL Standard SQL
DRDA X/Open Distributed Database Standard
CLI X/Open Database Access Standard
EDA/SQL IBI's EDA SQL Standard
DAL Apple's RDB Standard APIs
Net.Data Internet Data Access
Figure 1: UDB has what IBM considers universal access with support for a wide variety of protocols.
Create Table House (
  LastName Char ( 25 ),
  FirstName Char ( 15 ),
  MiddleInit Char ( 15 ),
  HouseImage BLOB ( 2M ),
  Contract CLOB ( 1M ),
  Signature BLOB ( 1M ),
  SalesPerson Datalink ( 50 )
);
Figure 2: The SQL syntax to create a table that contains character and binary large objects is simple.