Patch Those Leaky Interfaces

In today's high-tech IT world, it's common for shops to share data on multiple platforms. For those trusted with the task of writing the interfaces to share the data, there are many annoyances. For example, who hasn't sent data from one platform to an export file, updated flags in the database to signal the data has been sent, and then discovered that the export file never made it to its final destination on the remote system? Worse, even when the data does make it to the remote system, it's common to encounter an error resulting in partial data updates, which often makes restarting the entire process a mess.

Fortunately, for those sharing data between Microsoft's SQL Server and the iSeries, there's a splendid aid at your disposal: distributed transactions (DTs). DTs are functionally similar to local database transactions in that they have a beginning boundary, data modification statements, and an ending boundary whereupon the data changes that occurred are either committed or rolled back. However, distributed transactions extend the concept further by allowing data modification statements to occur against databases on multiple platforms.

Think of how moving data between disparate systems would be simplified with distributed transactions:
1. A transaction boundary is created.
2. Data is moved from the source platform to the destination platform.
3. The source platform marks its data as sent.
4. If everything is successful, all of the changes are committed on both systems.
5. If there is a failure, all of the changes are rolled back. When the error condition is fixed, the process can easily resume.

Since transactions involve the "all or nothing" concept, the programmer is assured the data is successfully changed on both platforms or on neither. Never again need we fuss over where to pick back up in the multiplatform processing cycle or reset flags to send data again!
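
As a rough preview of that flow in T-SQL, the sketch below moves unsent rows to a linked server and flags them as sent inside one transaction. The linked server name REMOTE, the four-part name REMOTE.RDBNAME.APPLIB.ORDERS, and the local Orders table with its Sent column are hypothetical placeholders, not objects from this article's example.

```sql
-- Hypothetical names: REMOTE (linked server), RDBNAME.APPLIB.ORDERS (remote
-- table), and dbo.Orders with a Sent flag (local table).
Set XAct_Abort On   -- any run-time error rolls back the whole transaction

-- 1. Create the transaction boundary
Begin Distributed Transaction

-- 2. Move data from the source platform to the destination platform
Insert Into REMOTE.RDBNAME.APPLIB.ORDERS (ORDERID, CUSTID, AMOUNT)
Select OrderID, CustID, Amount
  From dbo.Orders
 Where Sent=0

-- 3. Mark the source rows as sent
Update dbo.Orders
   Set Sent=1
 Where Sent=0

-- 4. Commit on both systems. With XAct_Abort on, a failure in step 2 or 3
--    ends the batch and rolls back both sets of changes (step 5).
Commit Transaction
```

A production version would need to pin down exactly which rows were sent (for example, by key range), because new unsent rows could arrive between steps 2 and 3.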

The SQL Server documentation gives a good introduction to DTs and explains how they work. This article covers the basics of performing a DT using SQL Server's Transact-SQL (T-SQL).

Setup

To set up a SQL Server/iSeries environment capable of performing a DT, you need to configure the following items:

  • iSeries files to participate in a DT must be journaled.
  • The Client Access V5R1 ODBC driver must be installed on the SQL Server machine.
  • A linked server to the iSeries must be configured.
  • The SQL Server Distributed Transaction Coordinator (DTC) must be started.


The first requirement is that the iSeries physical files to be modified must be journaled. Tables created in a schema (library) created by the CREATE SCHEMA statement are automatically journaled. To verify if a physical file is journaled, use the Display File Description (DSPFD) command. If it is not journaled, use the Start Journal Physical File (STRJRNPF) command to start journaling. If you need help with iSeries journaling concepts (journals, receivers, etc.) see chapters 19 and 20 of the Backup and Recovery Guide.

The second requirement involves installing the Client Access ODBC Driver (V5R1 or higher with the latest service pack) on the SQL Server machine. OS/400 has to be at V5R1 or higher as well. (Starting with V5R2, Client Access has been renamed to iSeries Access, but I will refer to it as Client Access here.) In case you're wondering, the Client Access OLE DB provider IBMDA400 does not currently support distributed transaction processing and therefore cannot be used.

Once the Client Access ODBC Driver is installed, configure an ODBC data source to the iSeries under the System Data Source Name (DSN) tab. For this article, I named my DSN "ISERIES" and used the default options.

The third requirement is to configure a linked server (requires SQL Server 7.0 and above--SQL Server 2000 is used here). A linked server definition allows SQL Server to access tables from a remote database as though they were part of its own local database.

To configure the linked server, start the SQL Server Enterprise Manager. Navigate the tree hierarchy and select the server you want to work with. Expand the server and then expand the "Security" node. Right-click on "Linked Servers" and choose "New Linked Server." In the linked server name box, enter ISERIES again, for consistency with the ODBC DSN. This linked server name will be used to refer to iSeries tables when working with T-SQL.

Under server type, choose "Other Data Source" and select the "Microsoft OLE DB Provider for ODBC Drivers" in the provider name combo box. Under "Product Name" enter "DB2 for iSeries." In the "Data Source" box, enter a valid iSeries DSN (if following along with this example, enter ISERIES.) In the "Provider String" box, you may optionally enter any DSN overrides. For example, to make the iSeries library TESTDATA the default library, enter DBQ=TESTDATA, where DBQ is the Client Access ODBC Driver's keyword to override the library list.

Next, you need to establish the security credentials for the linked server. Click on the Security tab of the "Linked Server Properties" window. In this window, SQL Server gives the option to define a login cross-reference to link the credentials of a specific SQL Server user to a specific iSeries user, but for simplicity, I will not use this feature in this example. In the bottom half of the window, there are options for login definitions not specified in the cross-reference list. Choose the "Be made using this security context" option (SQL Server 7.0's option is "They will be mapped to") and enter a valid iSeries user name and password in the boxes below. Whenever SQL Server attempts to talk to the iSeries linked server, it will use the login information specified here. The linked server has now been configured. Click the OK button.
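
The same linked server can also be defined in T-SQL with the system stored procedures sp_addlinkedserver and sp_addlinkedsrvlogin. The following is a sketch of what the Enterprise Manager steps above roughly correspond to, not a tested script. MSDASQL is the provider name for the Microsoft OLE DB Provider for ODBC Drivers, and the user name and password shown are placeholders.

```sql
-- Define the linked server over the ODBC DSN named ISERIES
Exec sp_addlinkedserver
     @server     = 'ISERIES',
     @srvproduct = 'DB2 for iSeries',
     @provider   = 'MSDASQL',
     @datasrc    = 'ISERIES',
     @provstr    = 'DBQ=TESTDATA'   -- optional DSN override

-- Map every SQL Server login to one iSeries user profile
-- (MYUSER and MYPASSWORD are placeholders)
Exec sp_addlinkedsrvlogin
     @rmtsrvname  = 'ISERIES',
     @useself     = 'false',
     @locallogin  = Null,
     @rmtuser     = 'MYUSER',
     @rmtpassword = 'MYPASSWORD'
```

Passing Null for @locallogin applies the mapping to all local logins, which matches the "Be made using this security context" option described above.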

The last step involves starting the Microsoft SQL Server DTC service. The DTC, which can be started from the SQL Server Service Manager utility, is responsible for handling DT processing across multiple database servers.

Accessing Data on a Linked Server

To verify that the linked server is set up correctly, run a distributed query (DQ). A DQ is a T-SQL query that accesses data on a linked server. One way to run a DQ is to specify a four-part table name in the FROM clause of a SELECT. Specifically, for an iSeries-linked server, the four-part table name is specified as follows:

SELECT *
  FROM linked server.RDB name.schema name.table name


For example, if your linked server name is called "ISERIES," your iSeries' relational database (RDB) name is S1024000 (it's usually the same as your system name), your schema (library) is LIVEDATA, and your table is ORDERS, you would enter the following to retrieve the table's data:

SELECT *
  FROM ISERIES.S1024000.LIVEDATA.ORDERS


This will allow SQL Server to query the ORDERS table on your iSeries as though it were local to SQL Server. Start the Query Analyzer utility, and try it! In fact, using the four-part syntax shown above, you can place an iSeries table in the FROM, JOIN, subquery, or nested select portion of a SELECT statement. Better still, linked server tables can also participate in INSERT, UPDATE, and DELETE statements (provided the linked server's ODBC or OLE DB driver is capable, which is the case with the Client Access ODBC driver).
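
To illustrate, here is a small sketch of a join and an update that use the four-part name. The local Customers table and the CUSTID, ORDERID, and STATUS columns are assumptions made up for the example; substitute names from your own tables.

```sql
-- Hypothetical: a local Customers table, and CUSTID, ORDERID, and STATUS
-- columns in the iSeries ORDERS table.

-- Join a local SQL Server table to an iSeries table
Select C.CustomerName, O.ORDERID
  From Customers C
  Join ISERIES.S1024000.LIVEDATA.ORDERS O
    On O.CUSTID = C.CustomerID

-- Change iSeries data directly from T-SQL
Update ISERIES.S1024000.LIVEDATA.ORDERS
   Set STATUS = 'S'
 Where ORDERID = 1001
```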

Another way to run a DQ is to use the OPENQUERY function. OPENQUERY submits a passthrough query to the backend database engine for processing and returns the results as though they were a SQL Server table. OPENQUERY requires two parameters: a linked server name and an SQL statement. The following is an example of how to use OPENQUERY:

SELECT *
  FROM OPENQUERY(ISERIES,'Select * From LiveData.Orders')


The main difference between the two examples is that, with the four-part table name syntax, SQL Server queries less efficiently than with OPENQUERY. OPENQUERY avoids much of SQL Server's overhead by submitting a SQL statement directly to the linked server's database engine. To do this, however, the SQL statement supplied to OPENQUERY must conform to the linked server's SQL dialect. In other words, you can't submit a T-SQL statement to an iSeries linked server.
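
As a sketch of what "conforming to the linked server's dialect" means in practice, the statement inside the quotes below is DB2 SQL rather than T-SQL. The ORDERID and ORDERDATE columns are hypothetical.

```sql
-- The quoted statement runs on the iSeries, so it uses DB2 syntax
-- (CURRENT DATE and a labeled duration) instead of T-SQL's GETDATE().
-- ORDERID and ORDERDATE are hypothetical columns.
Select *
  From OpenQuery(ISERIES,'SELECT ORDERID, ORDERDATE
                            FROM LIVEDATA.ORDERS
                           WHERE ORDERDATE >= CURRENT DATE - 7 DAYS')
```

Because the WHERE clause is part of the passthrough statement, the iSeries evaluates it and only the matching rows are returned to SQL Server.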

Many DQ performance considerations are beyond the scope of this article. For some of the iSeries-specific performance considerations, see "Running Distributed Queries with SQL/400 and SQL Server 7.0" in the September/October 2000 issue of AS/400 Network Expert. For more information on DQs, see the SQL Server T-SQL documentation on the OPENQUERY, OPENROWSET, and distributed query topics.

Running a Distributed Transaction

Now, we're at the heart of the topic. For this demonstration, on the SQL Server side, I'll be using the NORTHWINDCS sample database, which is included with Office XP (you could also use the sample database called NORTHWIND that comes with Office 2000). I'll focus on a particular table called Products, which is the Product Master table.

For this example, assume that an identical Products table exists on the iSeries and that these two tables need to be synchronized at five-minute intervals. The structure of the Products tables for each platform is shown in Figure 1.

CREATE TABLE dbo.Products (
ProductID int IDENTITY (1, 1) NOT NULL,
ProductName varchar (40) NOT NULL,
SupplierID int NULL,
CategoryID int NULL,
QuantityPerUnit varchar (20) NULL,
UnitPrice money NULL,
UnitsInStock smallint NULL,
UnitsOnOrder smallint NULL,
ReorderLevel smallint NULL,
Discontinued bit NOT NULL)

CREATE TABLE NORTHWIND.PRODUCTS (
PRODUCTID INTEGER DEFAULT NULL,
PRODUCTNAME FOR COLUMN PRODU00001
    VARCHAR(60) CCSID 37 DEFAULT NULL,
SUPPLIERID INTEGER DEFAULT NULL,
CATEGORYID INTEGER DEFAULT NULL,
QUANTITYPERUNIT FOR COLUMN QUANT00001
    VARCHAR(20) CCSID 37 DEFAULT NULL,
UNITPRICE DOUBLE PRECISION DEFAULT NULL,
UNITSINSTOCK FOR COLUMN UNITS00001 SMALLINT DEFAULT NULL,
UNITSONORDER FOR COLUMN UNITS00002 SMALLINT DEFAULT NULL,
REORDERLEVEL FOR COLUMN REORD00001 SMALLINT DEFAULT NULL,
DISCONTINUED FOR COLUMN DISCO00001 SMALLINT DEFAULT NULL,
SYNCHRONIZED FOR COLUMN SYNCHRONIZ
    SMALLINT NOT NULL DEFAULT 0,
CONSTRAINT NORTHWIND.PRODUCTS_PRIKEY PRIMARY KEY(PRODUCTID))

Figure 1: These are the Products tables from the NORTHWIND database as they exist within SQL Server and the iSeries. The Synchronized column was added to both for tracking an item change.

For simplicity, assume that the synchronization will flow in only one direction. The Products table on the SQL Server side is the "master"--that is, changes to the Products table have to be done through a SQL Server application. Further, changes to the iSeries table will only be those resulting from the synchronization process.

To try this scenario, open Query Analyzer, select the NorthwindCS database, and issue the following SQL statement to add a "synchronized" flag to the Products table:

ALTER TABLE dbo.Products ADD Synchronized Bit Not Null Default 1


Next, create a schema (library) on your iSeries called NORTHWIND using the "CREATE SCHEMA NORTHWIND" SQL statement. Create the Products table in schema NORTHWIND using the second CREATE TABLE statement shown in Figure 1 (remember to use the appropriate SQL naming convention). This table will be journaled automatically. Finally, copy the Products table data from SQL Server to the iSeries using the distributed query shown in Figure 2.

Use NorthwindCS
Go
--
-- NOTE:
-- Replace S1024000 with your system's RDB name
-- 
Insert Into ISERIES.S1024000.NORTHWIND.PRODUCTS
Select *
  From Products A
 Where ProductID Not In
    (Select * 
       From OpenQuery(ISERIES,'SELECT PRODUCTID
                                 FROM NORTHWIND.PRODUCTS
                                 WITH NC'))

Go

Figure 2: This distributed query will insert data into the iSeries Products table from the SQL Server Products table.

Look at Figure 2's INSERT statement. The four-part table name identifies the iSeries table that receives the data. The SELECT portion reads the SQL Server Products table and uses a subselect against the iSeries Products table to make sure a duplicate record isn't inserted. (Of course, all records will be inserted the first time through.)

In the subselect, though, the iSeries Products table is referenced through the OPENQUERY function instead of the four-part table name. The reason is performance: OPENQUERY hands the SELECT directly to the iSeries database engine and avoids much of the overhead SQL Server incurs when it processes a four-part name.

Now that the tables are synchronized, subsequent inserts, changes, and deletes to the SQL Server table have to be tracked and moved to the iSeries table. Figure 3 shows a complete T-SQL stored procedure to do this.

Create Procedure spSynchronizeProductsTable
As
--
-- Synchronize iSeries and SQL Server Products 
-- Table for all records that have changed.
--
-- 
--
-- Procedure Status Returns:
--   0 - OK
--   1 - Failed Status
--
--
-- NOTE: Everywhere S1024000 occurs, replace
--       with your system's RDB name
--

-- Cursor Variables
Declare @ProductID  int
Declare @ProductName  varchar(40)
Declare @SupplierID  int
Declare @CategoryID  int
Declare @QuantityPerUnit varchar(20)
Declare @UnitPrice  money
Declare @UnitsInStock  smallint
Declare @UnitsOnOrder  smallint
Declare @ReorderLevel  smallint
Declare @Discontinued  bit
Declare @Syncrhonized  bit

Declare @Status  int

--
-- Turning on XACT_Abort alleviates the need for the 
-- OLE DB/ODBC connection to support nested
-- transactions
--
Set XAct_Abort On

Set NoCount On -- Turn this off during debugging
Set @Status=0


--***************************************************
--
-- SECTION 1
--
-- Insert all new records into the iSeries table
-- 
Insert Into ISERIES.S1024000.NORTHWIND.PRODUCTS
Select *
  From Products A
 Where ProductID Not In
    (Select * 
       From OpenQuery(ISERIES,'SELECT PRODUCTID
                                 FROM NORTHWIND.PRODUCTS
                                 WITH NC'))

--***************************************************
-- SECTION 2
--
-- Read all changed product records
--
Declare ProductSync Cursor For
 Select *
   From Products
  Where Synchronized=0 -- Changed Products
    For Update Of Synchronized


-- Open Cursor & Fetch First Record
Open ProductSync

Fetch ProductSync Into @ProductID,@ProductName,
                       @SupplierID,@CategoryID,
                       @QuantityPerUnit,@UnitPrice,
                       @UnitsInStock,@UnitsOnOrder,
                       @ReorderLevel,@Discontinued,
                       @Syncrhonized

While @@Fetch_Status=0
Begin
    Begin Distributed Transaction

    -- This example assumes the ProductID can't be changed!
    Update ISERIES.S1024000.NORTHWIND.PRODUCTS
       Set ProductName=@ProductName,
           SupplierID=@SupplierID,
           CategoryID=@CategoryID,
           QuantityPerUnit=@QuantityPerUnit,
           UnitPrice=@UnitPrice,
           UnitsInStock=@UnitsInStock,
           UnitsOnOrder=@UnitsOnOrder,
           ReorderLevel=@ReorderLevel,
           Discontinued=@Discontinued
     Where ProductID=@ProductID

    -- Set Local Products Table "Synchronized" Flag
    Update Products
       Set Synchronized=1
     Where Current Of ProductSync

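    -- Commit or roll back the distributed transaction.
    -- Note: @@Error reflects only the most recent
    -- statement (the local Update above).
    If @@Error=0
        Commit
    Else
    Begin
        Set @Status=1 -- Failed status
        Rollback
    End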

    Fetch ProductSync Into @ProductID,@ProductName,
                           @SupplierID,@CategoryID,
                           @QuantityPerUnit,@UnitPrice,
                           @UnitsInStock,@UnitsOnOrder,
                           @ReorderLevel,@Discontinued,
                           @Syncrhonized
End
-- Close and deallocate cursor
Close ProductSync
Deallocate ProductSync


--***************************************************
-- SECTION 3
--
-- Delete all records in the iSeries table
-- that are no longer in SQL Server
--
Delete From ISERIES.S1024000.NORTHWIND.PRODUCTS
 Where ProductID Not In (
               Select ProductID 
                 From Products) 

Set XAct_Abort Off

Return @Status

Figure 3: This stored procedure will propagate adds, updates, and deletes from the SQL Server Products table to the iSeries Products table.

Notice that XAct_Abort is set to On. This is done to avoid nested transactions, which the iSeries ODBC driver does not support. By default, SQL Server protects each statement within a transaction so that a failed statement can be rolled back by itself (a partial rollback). Against a linked server, that protection requires a nested transaction inside the one started by BEGIN DISTRIBUTED TRANSACTION, which causes the CA ODBC driver to error out. With XAct_Abort set to On, any run-time error rolls back the entire transaction instead, so the nested transaction is not needed. By implication, this setting also prevents SQL Server from doing partial rollbacks.

The first code section is a repeat of the code already shown in Figure 2. An INSERT statement is used to move all new records from the SQL Server Products table to the iSeries table.

The second section involves reflecting all changes to the products in the SQL Server table on the iSeries. A cursor is opened against the local Products table to select all products that have changed. Inside the loop, the BEGIN DISTRIBUTED TRANSACTION statement is executed to start a transaction for each item. In this case, each product update will be treated as a single transaction. If your situation requires either all or none of the Product updates to occur, you can specify the BEGIN and COMMIT transaction boundaries outside of the loop.
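
If all of the product updates must succeed or fail together, the boundaries move outside the loop. The following is a minimal sketch of that variation, not a replacement for Figure 3. To stay short it carries only the UnitPrice column, where a real version would carry every column as Figure 3 does.

```sql
Set XAct_Abort On

Declare @ProductID int, @UnitPrice money, @Synchronized bit

Declare ProductSync Cursor For
 Select ProductID, UnitPrice, Synchronized
   From Products
  Where Synchronized=0
    For Update Of Synchronized

-- One transaction now covers every changed product
Begin Distributed Transaction

Open ProductSync
Fetch ProductSync Into @ProductID, @UnitPrice, @Synchronized

While @@Fetch_Status=0
Begin
    Update ISERIES.S1024000.NORTHWIND.PRODUCTS
       Set UnitPrice=@UnitPrice
     Where ProductID=@ProductID

    Update Products
       Set Synchronized=1
     Where Current Of ProductSync

    Fetch ProductSync Into @ProductID, @UnitPrice, @Synchronized
End

Close ProductSync
Deallocate ProductSync

-- Reached only if no statement failed. With XAct_Abort on, an error
-- ends the batch and rolls back the changes on both platforms.
Commit Transaction
```

The trade-off is that one large transaction holds its locks on both systems until the final commit, so the per-product boundaries in Figure 3 are usually kinder to other users.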

Inside the loop, an UPDATE is issued against the iSeries table to refresh every non-key column. After that update is completed, SQL Server's Synchronized column is set to true (1) to indicate that the two tables are in sync for the given ProductID. After the second update is completed, the transaction is committed or rolled back, depending on whether an error occurred. This is where the power of the DT shines: The SQL Server Synchronized flag will not be set to true unless the data is successfully placed on the iSeries.

The third and final section deletes all products from the iSeries table that no longer exist in the SQL Server table. Again, the four-part table name is specified, and a NOT IN subquery is used to see if the ProductID on the iSeries still exists in the SQL Server Products table. You probably noticed that the INSERT and DELETE statements were not embedded inside of a BEGIN DISTRIBUTED TRANSACTION block. This is because DT processing isn't required here, since data is being updated on only one platform.

Writing that stored procedure was relatively painless--it's hardly different from a procedure written to synchronize two local tables! However, there are still two additional requirements to make the synchronization take place. The first requirement is to set the Synchronized flag to False (0) whenever a product is changed. You can do this through either the application program or an update trigger. The second necessity is to schedule this stored procedure to run at regular intervals using SQL Server Agent or some other scheduling mechanism.
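
As one possible way to meet the first requirement, an update trigger can reset the flag whenever a row changes. The sketch below is an assumption about how such a trigger might look (the name trProductsMarkChanged is made up). It skips updates that assign the Synchronized column themselves, so the stored procedure in Figure 3 can still set the flag to 1.

```sql
Create Trigger trProductsMarkChanged
    On dbo.Products
   For Update
As
Set NoCount On

-- Update(Synchronized) is true when the triggering statement assigns the
-- Synchronized column, as Figure 3's procedure does; leave those rows alone.
If Not Update(Synchronized)
    Update P
       Set Synchronized=0
      From dbo.Products P
      Join inserted I
        On I.ProductID=P.ProductID
```

New rows need no trigger of their own, because the first section of the procedure copies any ProductID that is missing from the iSeries table.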

Does It Really Work?

If you're still following along in this example, you can now see for yourself how this works. Open the NorthwindCS.ADP Client/Server sample database with Microsoft Access. Go to the database window, choose the Tables tab, and double-click on the Products table to open it. Delete a few records, insert a few new records, and change a few records. For the changed records, set the synchronized flag to False (0). (To delete existing records, you will have to remove the referential integrity constraint between the Order Details and Products table.) Issue the CREATE PROCEDURE statement shown in Figure 3, then execute it as follows:

Use NorthwindCS
Go

Exec spSynchronizeProductsTable


When you query the data on the iSeries, all of your modifications to the SQL Server table should be reflected.
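
One way to check the result without leaving Query Analyzer is to capture the procedure's return status and then look for mismatches between the two tables. This is a sketch only, and it compares just ProductName; it should return no rows when the tables agree on that column.

```sql
Use NorthwindCS
Go

-- Capture the return status described in the procedure's header comments
Declare @Status int
Exec @Status = spSynchronizeProductsTable
Select @Status As SyncStatus

-- Rows that exist on only one side, or whose names differ
Select L.ProductID   As LocalID,
       R.PRODUCTID   As RemoteID,
       L.ProductName As LocalName,
       R.PRODUCTNAME As RemoteName
  From Products L
  Full Outer Join OpenQuery(ISERIES,'SELECT PRODUCTID, PRODUCTNAME
                                       FROM NORTHWIND.PRODUCTS
                                       WITH NC') R
    On R.PRODUCTID=L.ProductID
 Where L.ProductID Is Null
    Or R.PRODUCTID Is Null
    Or L.ProductName<>R.PRODUCTNAME
```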

iSeries-Side Cursors

In case you need to create a DT involving a cursor on the iSeries, Figure 4 shows how to do this.

Set Transaction Isolation Level Read Uncommitted
Go
Use NorthwindCS
Go   
-- 
--V5R1 CA SI06804 gives "invalid parameter marker message" 
--
--
-- NOTE: Everywhere S1024000 occurs, replace
--       with your system's RDB name
--


-- Cursor Variables 
Declare @ProductID  Int
Declare @ProductName  Varchar(40)
Declare @SupplierID  Int
Declare @CategoryID  Int
Declare @QuantityPerUnit Varchar(20)
Declare @UnitPrice  Money
Declare @UnitsInStock  Smallint
Declare @UnitsOnOrder  Smallint
Declare @ReorderLevel  Smallint
Declare @Discontinued  Smallint
Declare @Synchronized  Smallint


Set NoCount On
--
-- NOTE: iSeries Data Source should be set
--       to COMMIT=*NONE so that the 
--       isolation level can be set
--
Set XAct_Abort On
--
-- The repeatable read or serializable transaction 
-- isolation level is required for an updateable 
-- cursor on the linked server.
--
-- If this level is too restrictive, consider using
--  a READ ONLY cursor with UPDATE statements.
--
Set Transaction Isolation Level Repeatable Read

--
-- NOTE: The cursor is required to be opened inside
--       the transaction boundary
--
-- 
Begin Distributed Transaction
--
-- Updateable cursors in SQL Server require a
-- unique key on the linked server table.
--
-- If your cursor receives a fetch_status of -2,
-- check to make sure all column names are 
-- placed in upper case.
-- (i.e. "ProductID"<>"PRODUCTID")
--
Declare ProductSync Cursor For
 Select *
   From ISERIES.S1024000.NORTHWIND.PRODUCTS
  Where SYNCHRONIZED=0 
    For Update Of SYNCHRONIZED

Open ProductSync

Fetch ProductSync Into @ProductID,@ProductName,
                       @SupplierID,@CategoryID,
                       @QuantityPerUnit,@UnitPrice,
                       @UnitsInStock,@UnitsOnOrder,
                       @ReorderLevel,@Discontinued,
                       @Synchronized

While @@Fetch_Status=0
Begin
    Update Products
       Set ProductName=@ProductName,
           SupplierID=@SupplierID,
           CategoryID=@CategoryID,
           QuantityPerUnit=@QuantityPerUnit,
           UnitPrice=@UnitPrice,
           UnitsInStock=@UnitsInStock,
           UnitsOnOrder=@UnitsOnOrder,
           ReorderLevel=@ReorderLevel,
           Discontinued=@Discontinued
     Where ProductID=@ProductID


    Update ISERIES.S1024000.NORTHWIND.PRODUCTS
       Set SYNCHRONIZED=1
     Where Current Of ProductSync

    Fetch ProductSync Into @ProductID,@ProductName,
                           @SupplierID,@CategoryID,
                           @QuantityPerUnit,@UnitPrice,
                           @UnitsInStock,@UnitsOnOrder,
                           @ReorderLevel,@Discontinued,
                           @Synchronized
End

Close ProductSync
Deallocate ProductSync

If @@Error=0
    Commit
Else
    Rollback

Set Transaction Isolation Level Read Committed

Set XAct_Abort Off

Figure 4: This T-SQL shows how to use an updateable cursor on the iSeries within a distributed transaction.

The major difference between this code and the code in Figure 3 (other than the table reversal) is that the transaction boundary has to be placed before the cursor declaration. This means that all of the records are processed within a single transaction. To have an updateable cursor on a linked server, SQL Server requires that the isolation level be set to repeatable read or serializable. These isolation levels are restrictive in terms of record locking, so use updateable cursors sparingly.

The one other thing to be aware of is that I had to modify the ODBC DSN with a default commitment control level of *NONE. Without this setting, I would erratically get error messages stating that the required transaction isolation level could not be achieved.

Trials and Tribulations of New Technology

Even though DTs are extremely useful and will continue to grow in popularity, there are still pitfalls. While the end product looks easy enough, it takes quite a bit of fiddling to get everything to work correctly. Listed below are some of the major things I battled with:

Linked Server Errors Cause Processing to Halt

Even though the code shows a tidy Commit and Rollback, the fact is that, when a linked server error occurs, the entire procedure stops with an error severity of 16. As far as I can tell, there is no way to trap these errors. (If someone knows a way around this, please let me and everyone else know by posting a note to the forum associated with this article.) If, for example, a record on the iSeries is locked so that it can't be changed, the procedure will just stop instead of allowing a programmatic response to the condition. This is the worst drawback I encountered.

Case-Sensitive Names

Be careful when entering four-part table names because the RDB name, schema, and table names should be entered in uppercase. In a few cases, when I used an iSeries side cursor, column names seemed to be case-sensitive as well.

Unique Indexes

If you need an updateable iSeries side cursor, SQL Server requires that the table have a unique index. If for some reason your base table isn't able to have a unique index, you can use a read-only cursor with individual UPDATE statements to change the data.
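
A sketch of that read-only alternative follows. It is an untested assumption built from the pieces in Figure 4: the cursor is declared Static, which makes it a read-only snapshot, and each iSeries row is flagged with an ordinary UPDATE keyed on PRODUCTID rather than with WHERE CURRENT OF. Only UnitPrice is carried to keep the example short, and the UPDATE assumes PRODUCTID values identify rows well enough for your data.

```sql
Set XAct_Abort On

Declare @ProductID int, @UnitPrice money

Begin Distributed Transaction

-- Static cursors are read-only, so no unique index is required
Declare ProductSync Cursor Static For
 Select PRODUCTID, UNITPRICE
   From ISERIES.S1024000.NORTHWIND.PRODUCTS
  Where SYNCHRONIZED=0

Open ProductSync
Fetch ProductSync Into @ProductID, @UnitPrice

While @@Fetch_Status=0
Begin
    Update Products
       Set UnitPrice=@UnitPrice
     Where ProductID=@ProductID

    -- Ordinary UPDATE instead of WHERE CURRENT OF
    Update ISERIES.S1024000.NORTHWIND.PRODUCTS
       Set SYNCHRONIZED=1
     Where PRODUCTID=@ProductID

    Fetch ProductSync Into @ProductID, @UnitPrice
End

Close ProductSync
Deallocate ProductSync

Commit Transaction
```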

Service Pack Levels

This is the real killer. I toyed around with various Client Access levels and service packs and received varied results. Here is the exact configuration I tested with:

  • OS/400 V5R1 with Group Database Fix SF99501-04
  • Windows XP Professional with Service Pack 1
  • SQL Server 2000 (with no service pack and Service Packs 1 and 3)
  • Client Access V5R1 SI05361 and SI06804
  • iSeries Access V5R2 SI07675, SI06631 (SI05853 didn't work)


Things are a little too fragile for my liking. Unfortunately, it seems that the CA ODBC driver's ability to work with DQs and DTs changes from service pack to service pack. For instance, I had complete success with everything shown in this article using CA V5R2 SI07675. However, SI05853 was a complete flop. The V5R1 SI06804 did everything except for the iSeries-side updateable cursor.

My only reason for sharing this information is that it was frustrating trying to find the right combination of software levels to make the thing work!

Ensuring the Veracity and Timeliness of Shared Data

As the requirements for sharing data between platforms in real time increase, so will the popularity of DTs. Their ease of use and ability to guarantee the "all or nothing" concept among multiple database servers make them an ideal candidate for fulfilling many cross-platform interface requirements.

Michael Sansoterra is a programmer/analyst at SilverLake Resources, an IT services firm based in Grand Rapids, Michigan.