In my 20+ years of experience in software development, I've met only a few (lucky) system architects who had the luxury of designing new software solutions from scratch. More often, a system architect faces the challenge of integrating existing components written in legacy programming languages with new components written in modern languages such as Java. For complex applications, the transition from a legacy code base to Java may take months, if not years. In addition, rewriting existing software components may be impractical or impossible. In this case, a business process implementation can be split between legacy code and Java. For example, you may wish to retain a robust order processing module and augment it with a Java module that allows customers to place orders over the Internet. Both modules work on the same transactional resources (customer, order, and stock data) and ideally should be able to share the same transaction context. In this article, I present a multithreaded framework that allows for concurrent execution of Java and legacy components that access and manipulate shared database resources.
The Distributed Transaction 101
The improved transaction processing architecture, implemented back in V5R2, allows i5/OS to fully and efficiently support the transaction model represented by the X/Open Distributed Transaction Specification (DTS) and the Java JTA specification. The cornerstone of this improved architecture, sometimes referred to as New Transaction Services (NTS), is the decoupling of transactions from threads, processes, and activation groups. An internal MI object, a transaction object, has been added to facilitate the separation of a transaction context from a given activation group.
In NTS, the locks and commit blocks are scoped to a transaction object. The associated space of a transaction object contains the commitment definition for the transaction.
The association of a transaction object to an XA thread of control is temporary. To accommodate both X/Open DTP and JTA specifications, i5/OS implements the concept of the XA thread of control in two ways:
- A system thread acts as the XA thread of control. This is the default behavior. When a global transaction represented by its global transaction identifier (XID) is activated—for example by the xa_start(XID) function—the transaction handle (a unique identifier) and the associated transaction object's address are set in the calling thread's Thread Control Block (TCB). After the thread has been associated with a transaction, each SQL statement processed in that thread uses the transaction context associated with that thread. The thread can be disassociated from the transaction by calling the xa_end(XID) function. At this time, the transaction information in the Thread Control Block is set to NULL.
- In V5R4, a second method was formalized in which the SQL connection acts as the XA thread of control. The application servers implementing JTA—such as WebSphere, as well as clients that call XA through ZDA, DRDA, or XDA servers—use this connection thread of control model. When using connection as the thread of control natively on i5, transaction associations are started and ended via the SqlSetConnectAttr CLI API rather than the xa_start and xa_end APIs.
Figure 1: The transaction is decoupled from an XA thread of control.
According to the X/Open DTS specification, only one thread can be associated with a transaction context at any given time. This means that multiple threads can work on the same transaction, but this work needs to be multiplexed. In some environments, this may fulfill the requirements. For example, a Java client attaches to a new global transaction, performs some transactional work, and detaches from the transaction. The transaction is then passed to a legacy program that attaches to this transaction, performs some additional transactional work, and detaches from the transaction. At this point, a transaction monitor may decide to complete the transaction. This works because the locks and the commitment definition are scoped to a given transaction object. In high-volume transactional systems, however, there are typically multiple threads that wish to manipulate overlapping sets of resources.
The simple multiplexing scenario described above does not provide the required concurrency levels. The threads need to wait for their turn to attach to the transaction or spawn another transaction, which often results in lock conflicts and deadlocks. These issues are addressed by loosely coupled transactions that share locks.
Huh? OK, let me explain. The global transactions are said to be "loosely coupled" when the transaction identifiers (XIDs) of two transaction branches have the same global transaction identifier (GTRID) but different branch qualifiers (BQUALs). By default, loosely coupled transaction branches do not share locks. On DB2 for i5/OS, this means that there are two separate transaction objects for these two branches. Starting with V5R4, however, there is an option that allows loosely coupled transactions to share locks. In other words, multiple loosely coupled branches can now share the same transaction object. In this case, each thread can have its own transaction branch within a resource manager. Multiple threads can continue transactional work on the same resources without ever running into lock conflicts.
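To make the definition concrete, here is a minimal Java sketch of the GTRID/BQUAL comparison. The SimpleXid and XidRelations names are hypothetical and are not the helper classes shipped with the sample application; only the javax.transaction.xa.Xid interface is standard.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import javax.transaction.xa.Xid;

// Hypothetical, minimal Xid implementation used only for illustration.
final class SimpleXid implements Xid {
    private final int formatId;
    private final byte[] gtrid;
    private final byte[] bqual;

    SimpleXid(int formatId, String gtrid, String bqual) {
        this.formatId = formatId;
        this.gtrid = gtrid.getBytes(StandardCharsets.UTF_8);
        this.bqual = bqual.getBytes(StandardCharsets.UTF_8);
    }

    @Override public int getFormatId() { return formatId; }
    @Override public byte[] getGlobalTransactionId() { return gtrid.clone(); }
    @Override public byte[] getBranchQualifier() { return bqual.clone(); }
}

final class XidRelations {
    private XidRelations() { }

    // Two branches are loosely coupled when their GTRIDs match but their BQUALs differ.
    static boolean looselyCoupled(Xid a, Xid b) {
        return Arrays.equals(a.getGlobalTransactionId(), b.getGlobalTransactionId())
                && !Arrays.equals(a.getBranchQualifier(), b.getBranchQualifier());
    }
}
```

For instance, XidRelations.looselyCoupled(new SimpleXid(42, "ORDER-1", "B1"), new SimpleXid(42, "ORDER-1", "B2")) evaluates to true, while two Xids with different GTRIDs, or two identical Xids, do not qualify.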
Generally, a transaction branch can be in one of five states: Active, Idle, Prepared, Rollback Only, and Heuristically Completed. The allowable state transitions are defined in Table 6-4 in Distributed Transaction Processing: The XA Specification. Note that the lock-sharing functionality for loosely coupled transaction branches is not architected by XA. Rather, it is an extension to the spec. In the DB2 implementation, the last transaction branch to be completed commits or rolls back the changes for all the transaction branches with the same GTRID. The xa_prepare requests for the other transaction branches complete those transaction branches and return XA_RDONLY. However, changes made and locks acquired for those transaction branches remain pending until the last transaction branch with the same GTRID runs to completion. If xa_rollback is requested for one transaction branch while others are not yet completed, DB2 handles the request by marking the other branches rollback-only. So it is not possible for some branches to commit while others roll back. The xa_commit requests for other transaction branches receive an XA_ROLLBACK return code if xa_rollback was already requested for one or more of the loosely coupled transaction branches that share locks. Note also that it is not valid to request xa_rollback for one of the transaction branches before all are prepared because the transaction manager must carry out a two-phase commit if there are multiple transaction branches.
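The prepare and commit rules for branches that share locks can be modeled in a few lines. The toy class below is an in-memory illustration of the behavior described above, not DB2 code: SharedLockGroup and its methods are hypothetical names, rollback handling is left out, and only the XA_RDONLY and XA_OK constants come from javax.transaction.xa.XAResource.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import javax.transaction.xa.XAResource;

// Toy model of one GTRID whose loosely coupled branches share locks (illustration only).
final class SharedLockGroup {
    private final Set<String> openBranches = new LinkedHashSet<>(); // branches not yet completed
    private boolean committed;

    void start(String bqual) {
        openBranches.add(bqual);
    }

    // Every branch except the last open one completes at prepare time and reports XA_RDONLY.
    int prepare(String bqual) {
        requireOpen(bqual);
        if (openBranches.size() > 1) {
            openBranches.remove(bqual); // branch is done; its changes and locks stay pending
            return XAResource.XA_RDONLY;
        }
        return XAResource.XA_OK; // last branch now carries the pending work of the whole group
    }

    // Only the last remaining branch can commit; doing so commits the work of all branches.
    void commit(String bqual) {
        requireOpen(bqual);
        if (openBranches.size() != 1) {
            throw new IllegalStateException("other branches are not yet prepared");
        }
        openBranches.remove(bqual);
        committed = true;
    }

    boolean isCommitted() {
        return committed;
    }

    private void requireOpen(String bqual) {
        if (!openBranches.contains(bqual)) {
            throw new IllegalStateException("branch unknown or already completed: " + bqual);
        }
    }
}
```

With three branches started, preparing the first two yields XA_RDONLY, preparing the third yields XA_OK, and only that third branch accepts a commit.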
Sample Application Walkthrough
- JTAMonitor.java—This is a Java class that uses JTA to manage global transactions. It creates new branches, passes them to other components, and completes them. It also contains logic that can be used to complete orphaned transactions or query the status of global transactions.
- JInsertCoffees.java—This is a Java class that inserts a new row into a database table called COFFEES. The insert is performed in the context of an existing global transaction that has been previously started by JTAMonitor.
- CUpdCoff.c—This is a native ILE C program that updates the row previously inserted by JInsertCoffees. The update is performed in the context of a global transaction started by JTAMonitor.
- CUpdateCoffeesWrapper.java—This is a wrapper class that uses Program Call Markup Language (PCML) implemented in the IBM Toolbox for Java to call the native CUpdCoff program from Java.
- TestXATransactions.java—This is the main Java program that calls various components to perform transactional work.
The application also contains a number of Java helper classes that are necessary to perform such mundane tasks as configuring a DB2 for i5/OS data source, creating a new XID object, printing out the list of global transactions, and so forth. You'll find a short description of those classes in the readme.1st file that is contained in the downloadable material that accompanies this article.
For illustration purposes, the global transaction identifier (GTRID) and branch qualifier (BQUAL) are passed to the software components as input parameters. In the test scenario, the JTAMonitor starts three transaction branches with the same GTRID but different BQUALs. Next, a row is inserted through Java. The C program is executed twice, each time within a different transaction branch. So the row is updated two times. If all components execute successfully, JTAMonitor commits the transaction. Otherwise, a rollback is requested.
Let's examine the source code to see how these tasks get accomplished. Here's a code excerpt from TestXATransactions.java (most of the error-handling was removed for clarity):
Figure 2: This is the source code of
In Figure 2, a CoffeeBean is instantiated first. This bean contains the information about a new coffee brand to be inserted into the COFFEES table. Then another CoffeeBean is instantiated. This time, I use a constructor that accepts just two parameters: cof_id and sales. This bean is later used to update an existing row in the table. The cof_id is used to locate the row in the table and sales to update the SALES column. Next, the AS400JDBCXADataSource object is created. This class is part of the IBM Toolbox for Java, and it implements the XADataSource interface. The data source is then configured. The XADataSourceConfigurator helper class sets the data source properties necessary to connect to System i, such as system name, user ID, and user password. In addition, it sets the XALooselyCoupledSupport property to 1 to enable the loosely coupled transactions that share locks. This is shown in the following code snippet:
The data source is used by other software components to produce XAConnection and XAResource objects. Next, the JTAMonitor is created. As mentioned, this class is responsible for the distributed transaction management. I discuss the JTAMonitor implementation in the section below.
The main purpose of the sample application is to illustrate how to perform concurrent transactional work on the same set of resources. This requires that multiple threads be created and executed in parallel. To implement this behavior, I use the concurrency framework introduced in Java 1.5, which is based on the ExecutorService, Callable, and Future interfaces. An ExecutorService object executes submitted Callable tasks. It contains a pool of worker threads that run the tasks concurrently and asynchronously. A Callable task implements the Callable interface, which is similar to Runnable in that both are designed for classes whose instances can be executed by another thread. A Callable, however, returns a result and may throw an exception. The result and the possible exception are encapsulated in a Future object that the ExecutorService returns when the task is submitted; the result becomes available once the worker thread finishes.
So, in Figure 2, an ExecutorService object is instantiated, and its worker thread pool size is set to 3. Then the reference to the previously configured data source object is passed to the JTAMonitor. The beginTransaction method is called on the JTAMonitor to start the first branch of the global transaction. Then, two other branches are also started. Note that all three transaction branches have the same global transaction identifier (GTRID) but different branch qualifiers (BQUALs). Next, a JInsertCoffees object is created. Since this class implements Callable, it can be submitted to the ExecutorService for execution. The result is retrieved by calling the get method on the Future object. If an exception is thrown in JInsertCoffees, it gets encapsulated in an ExecutionException and returned to the caller. The original exception can be extracted by calling the getCause method on the ExecutionException object. The original exception that can be thrown in JInsertCoffees is actually a custom exception called TrnsMonitorException. It is a wrapper that allows me to return XA, SQL, and application-specific exceptions. It is re-thrown to force a rollback.
The two updates follow the same pattern as the insert. This time, however, two update requests are submitted concurrently. Note that both updates attempt to modify the same row (cof_id = 10). The loosely coupled support guarantees that there are no locking conflicts. In the very unlikely case where two threads attempt the update at the very same instant, the access conflict is resolved by DB2 through the internal seizes.
The rest of the TestXATransactions flow is pretty straightforward. If control reaches the prepare step, the insert and two updates must have succeeded, so all three branches are prepared. Then a commit is attempted. The commit needs to be in a try-catch block because two branches are completed at prepare time. In the DB2 for i5/OS implementation, all three transaction branches share the same transaction object. Two branches are marked as read-only, and only one branch is marked as read-write. According to the XA DTP spec, the read-only branches get completed at prepare time (since, by definition, there is nothing to commit). So, all transactional work is indirectly (through the associated transaction object) scoped to the read-write transaction branch. Only this transaction branch can be committed. In a typical scenario, where there is a pool of loosely coupled transactions, the last transaction that was prepared can also be committed. At this point, the transactional work represented by the transaction object is committed, the locks are released, and the object is purged from the system. If errors occur during the TestXATransactions execution, the program attempts to gracefully complete the outstanding branches. Should the rollback fail, the list of existing transactions is printed to the console so that the administrator can take appropriate action.
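The submit, get, and unwrap pattern described above can be sketched with plain java.util.concurrent classes. This is a simplified stand-in, not the TestXATransactions source: WorkFailedException plays the role of TrnsMonitorException, UpdateTask plays the role of a transactional component, and no database work is performed.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the article's TrnsMonitorException.
class WorkFailedException extends Exception {
    WorkFailedException(String message) { super(message); }
}

// Hypothetical stand-in for a component that works within one transaction branch.
class UpdateTask implements Callable<Integer> {
    private final String bqual;
    private final boolean fail;

    UpdateTask(String bqual, boolean fail) {
        this.bqual = bqual;
        this.fail = fail;
    }

    @Override
    public Integer call() throws WorkFailedException {
        if (fail) {
            throw new WorkFailedException("update failed in branch " + bqual);
        }
        return 1; // pretend that one row was updated
    }
}

final class ConcurrentWorkDemo {
    private ConcurrentWorkDemo() { }

    // Submits two tasks to a pool of three workers and returns the total row count.
    // A failure inside a task is unwrapped from ExecutionException and re-thrown.
    static int runUpdates(boolean secondFails) throws WorkFailedException {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            Future<Integer> first = pool.submit(new UpdateTask("BQUAL2", false));
            Future<Integer> second = pool.submit(new UpdateTask("BQUAL3", secondFails));
            return first.get() + second.get(); // get() blocks until the task has finished
        } catch (ExecutionException e) {
            if (e.getCause() instanceof WorkFailedException) {
                throw (WorkFailedException) e.getCause(); // original exception, forces a rollback upstream
            }
            throw new IllegalStateException("unexpected failure", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a task", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling ConcurrentWorkDemo.runUpdates(false) returns the combined row count of both tasks, while runUpdates(true) surfaces the WorkFailedException raised inside the second worker thread.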
Now that the flow of the sample application has been outlined, let's focus on the most critical coding techniques used in various components:
JTAMonitor implements methods to start, prepare, commit, and roll back transaction branches. All these methods follow very similar logic. For example, here's an excerpt from the beginTransaction method:
Figure 3: This is the beginTransaction method implemented in JTAMonitor.java.
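As a rough companion to Figure 3, the start-then-end pattern of such a method might look like the following sketch. It is not the JTAMonitor source: the BranchStarter name is hypothetical, the XAConnection is assumed to have been obtained by the caller from XADataSource.getXAConnection(), and the TMNOFLAGS and TMSUCCESS flags are my assumption about how the branch is started and left idle.

```java
import java.sql.SQLException;
import javax.sql.XAConnection;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Hypothetical helper; illustrates the pattern only.
final class BranchStarter {
    private BranchStarter() { }

    // Starts a new transaction branch and detaches from it right away.
    // The caller owns the XAConnection and decides when to close it.
    static void beginBranch(XAConnection xaConnection, Xid xid)
            throws SQLException, XAException {
        XAResource xaResource = xaConnection.getXAResource();

        // Create the branch and associate this connection with it (branch becomes Active).
        xaResource.start(xid, XAResource.TMNOFLAGS);

        // No transactional work is done here, so detach immediately (branch becomes Idle).
        // Assumption: TMSUCCESS is the flag used to end the association.
        xaResource.end(xid, XAResource.TMSUCCESS);
    }
}
```

Keeping the method free of SQL mirrors the division of labor in the sample application, where the monitor only manages branches and other classes do the transactional work.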
In Figure 3, an Xid object is instantiated first. This object represents the transaction context and is used by the resource manager (DB2, in this case) to identify a specific global transaction. Next, the XADataSource produces an XAConnection. In a DB2 for i5/OS implementation, the transactional work submitted over a given connection is actually executed by one of the database prestart jobs. Since the sample application uses the IBM Toolbox for Java JDBC driver, the database connections are served by QZDASOINIT jobs. The XAConnection, in turn, produces an XAResource object. The newly created XAResource object is then used to start a new global transaction with the GTRID and BQUAL values encapsulated in the Xid object. The start method not only creates a new transaction context on the resource manager but also automatically associates the current connection with the transaction context. The transaction state changes to Active. The beginTransaction method performs no transactional work. Its purpose is to initiate a new transaction branch. The transactional work is performed by other specialized classes such as JInsertCoffees. So at this point, I disassociate the current thread from the transaction context. The transaction state changes from Active to Idle. For example, JInsertCoffees uses the transaction branch started by the beginTransaction method to insert a new row into the COFFEES table. Here's a code fragment that represents the core functionality of the JInsertCoffees class:
Figure 4: This pseudocode illustrates the core functionality of JInsertCoffees.
In Figure 4, the Xid implementation object is created first. Then the XADataSource produces an XAConnection, and the necessary XAResource object is instantiated. This is a pattern that you'll see throughout the sample application. It keeps the classes that manipulate the data independent of the resource manager. Next, the XAResource is used to associate the transactional work performed by JInsertCoffees with the transaction context represented by a given Xid. Remember that at this point the transaction branch state changes from Idle to Active. A logical database connection is then obtained from the XAConnection, and an SQL INSERT statement is prepared and executed. Finally, the thread is disassociated from the branch so that other threads can gain control over it.
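The attach, work, and detach sequence just described can be sketched as follows. This is an approximation under stated assumptions rather than the JInsertCoffees source: the CoffeeInserter name is hypothetical, the XAConnection is assumed to come from the caller's XADataSource, the INSERT statement assumes that only the COF_ID and SALES columns must be supplied, and the use of TMJOIN to re-attach to an idle branch and TMFAIL on error are my assumptions.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.XAConnection;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Hypothetical helper; illustrates the pattern only.
final class CoffeeInserter {
    // Assumed column list; the real COFFEES table may require more columns.
    private static final String INSERT_SQL =
            "INSERT INTO COFFEES (COF_ID, SALES) VALUES (?, ?)";

    private CoffeeInserter() { }

    // Inserts one row within an existing, currently idle branch and returns the row count.
    static int insertCoffee(XAConnection xaConnection, Xid xid, int cofId, int sales)
            throws SQLException, XAException {
        XAResource xaResource = xaConnection.getXAResource();

        // Attach to the branch that was started earlier (Idle to Active).
        xaResource.start(xid, XAResource.TMJOIN);
        boolean succeeded = false;
        try (Connection connection = xaConnection.getConnection();
             PreparedStatement insert = connection.prepareStatement(INSERT_SQL)) {
            insert.setInt(1, cofId);
            insert.setInt(2, sales);
            int rows = insert.executeUpdate();
            succeeded = true;
            return rows;
        } finally {
            // Detach so that another thread can take over the branch (Active to Idle).
            xaResource.end(xid, succeeded ? XAResource.TMSUCCESS : XAResource.TMFAIL);
        }
    }
}
```

The method never commits or rolls back; completing the branch stays with the component that manages the global transaction.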
Similar logic is employed in the legacy CUpdCoff program. In this case, however, I use the XA APIs implemented in C rather than the JTA APIs in Java. Let's quickly examine the relevant excerpt from the C source: