Check out SQL Server 2005's new features that simplify integration with other databases.
As the old saying goes,
"good things come to those who wait." At last, the long-anticipated release of
SQL Server 2005 has finally arrived, and there are plenty of good things in it.
This article focuses on how SQL Server 2005 makes integration with the DB2
database easier thanks to linked server enhancements.
What Is a Linked Server?
Introduced way back in SQL Server 7.0, linked servers
allow SQL Server to query OLE DB and ODBC data sources. This means a data source
can be another SQL Server, a DB2 database, a spreadsheet, a text file, or almost
any other data store. Once the linked server has been defined, remote data
sources can participate in joins with local tables or other linked servers.
Depending on the capability of the OLE DB provider, the linked server data
source may even be updateable. For instance, SQL Server can be used to update or
delete rows in the table of a DB2 database based on records in a local
table.
As a brief review of using linked servers in Transact-SQL (T-SQL
is SQL Server's dialect of SQL), say we have a linked server called DB2400 that
points to a DB2 database on the iSeries. Data can be queried from the linked
server by using a four-part naming convention or by using the OpenQuery
function:
-- Query DB2 table using four part table name Select * From DB2400.S104X824.QIWS.QCUSTCDT A Join CustInfo.dbo.customers B On B.CustomerId=A.CustID
-- Query DB2 table using OpenQuery function Select * From OpenQuery(DB2400, 'Select * From QIWS.QCUSTCDT') As Remote Join CustInfo.dbo.customers B On B.CustomerId=Remote.CustID
Of course, more complicated queries can be constructed. For a thorough
review of linked servers and their capabilities, see "Running Distributed
Queries with SQL/400 and SQL Server 7.0" (AS/400 Network Expert,
September/October 2000) and "Patch Those
Leaky Interfaces" (MC Mag Online, June 2003).
Setting Up a Linked Server
A linked server must be configured before it can be
used. Setup requires two steps:
- Install the desired ODBC driver and set
up a data source name (or install the desired OLE DB provider). This article
uses the ODBC driver that comes with iSeries Access V5R3.
- Create the linked server by using T-SQL statements (or by using the new SQL
Server Management Studio object explorer graphical interface. Expand the Server
Objects node to find the Linked Server definitions. Note: This location for
maintaining linked servers is a change from the SQL Server 2005 beta and prior
editions of SQL Server, where linked servers were located under the Security
node.)
The code below demonstrates how to create a linked server
called DB2400 that is based on an ODBC connection with a DSN name called
AS400:
-- -- Create Linked Server using ODBC DSN "AS400" -- sp_addlinkedserver @server=N'DB2400', @srvproduct=N'DB2 UDB for iSeries', @provider=N'MSDASQL', @datasrc=N'AS400', @provstr='CMT=0;SYSTEM=as400.mycompany.com', @catalog='S104X824' go -- -- Define the credentials that will be used to -- access objects hosted by the Linked Server -- sp_addlinkedsrvlogin @rmtsrvname=N'DB2400', @useself='false', @rmtuser=N'MyUser', @rmtpassword='MyPassword' go -- -- RPC option is required for doing EXEC AT -- EXEC sp_serveroption 'DB2400', 'rpc out', true go
The built-in stored procedure sp_addlinkedserver is used to register
a linked server. The provider name in this case is MSDASQL, which is the OLE DB
wrapper around the ODBC APIs. The data source (datasrc) parm contains the name
of the ODBC DSN, which is configured under administrative tools (on Windows 2000
and above). The provider string (provstr) parameter overrides any of the ODBC
DSN's default settings. Finally, the catalog parameter should be the name of the
relational database entry for your iSeries (by default, it is the same as the
system name.)
Stored procedure sp_addlinkedsrvlogin defines the login
credentials for the server. In the example, the rmtUser and rmtPassword entries
dictate the credentials to be used when SQL Server attempts to access a resource
on the linked server.
Finally, in order for a linked server to use some
of the new linked server features, the rpc out setting must be set to true. This
can be accomplished programmatically by using sp_serveroption, as shown in the
example.
To remove the linked server entry, issue the following
commands:
sp_droplinkedsrvlogin @rmtsrvname=N'DB2400',@locallogin=Null go sp_dropserver @server=N'DB2400' go
Latest Enhancements
SQL Server's ability to access remote data sources
via linked servers often makes the programmer's life easier by easily allowing
remote data to be joined and updated using T-SQL. However, linked servers have
suffered from a few deficiencies, including the inabilities to use parameter
markers, execute stored procedures, and execute Data Definition Language (DDL)
statements such as CREATE TABLE. Under these circumstances, Data Transformation
Services (DTS) or another tool was used.
These other programming options
may no longer be needed because linked servers have increased capability. The
EXEC command has been enhanced with an AT clause to specify that an SQL
statement should execute on a linked server:
EXEC statement AT
linked server.
Keep in mind that the SQL statement submitted with
EXEC AT will be in the SQL dialect of the remote server (in this case
DB2).
Using the DB2400 linked server (defined above), here is a sample of
how to use EXEC AT to retrieve data from the iSeries:
-- -- As with OpenQuery, enhanced EXEC support allows pass -- through queries to be executed on a linked server. -- EXEC ('SELECT * FROM DATALIB.OrderHdr WHERE OrderID>1') AT DB2400
Here is an example of issuing a DDL statement followed by an insert to a
remote DB2 table using the four-part table convention:
-- -- EXEC AT can issue DDL statements -- EXEC ('Create Table DATALIB.OrderTemp (OrderId Int Not Null, PartId Char(15) Not Null)') AT DB2400 GO
Insert Into DB2400.S104X824.DATALIB.ORDERTEMP Values(1,'BICYCLE') GO
This kind of power allows a developer to write a program capable of
complex cross-platform database access entirely in T-SQL.
Further,
parameter markers can now be used to create optimized, re-useable statements.
Here is an example of how the previous insert statement can be done using the
new EXEC AT syntax:
-- Insert Example Declare @OrderId Int Declare @PartId Char(15) Set @OrderId=2 Set @PartId='MOTORCYCLE' EXEC ('Insert Into DATALIB.OrderTemp Values (?,?) With NC',@OrderId,@PartId) AT DB2400
Update statements are allowable as well:
-- Update example Declare @OrderId Int Declare @PartId Char(15) Set @OrderId=2 Set @PartId='MOTORCYCLE1' EXEC ('Update DATALIB.OrderTemp Set PartId=? Where OrderId=?',@PartId,@OrderId) AT DB2400
Keep in mind that, while most Data Manipulation Language (DML)
statements can also be accomplished with the traditional four-part syntax, EXEC
AT allows the additional flexibility of using things like built-in functions and
user-defined functions that are defined only on the host. For example, if you
need to use a datalink-related DB2 function, you would only be able to invoke
this function using EXEC AT because T-SQL knows nothing about the DB2 datalink
data type and related functions.
Here's an example of a stored procedure
call that accepts a parameter.
Set NoCount On Declare @OrderID Int Set @OrderID=10249 -- -- Call Parameterized Stored Procedure -- Exec ('Call DATALIB.GetOrders (?)', @OrderID) AT DB2400
To retrieve the result of an output variable from a stored procedure,
specify the OUTPUT keyword after the variable name, as follows:
-- Test Stored Proc with output variable Set NoCount On Declare @OrderID Int Exec ('Call DATALIB.GETNEXTORDER (?)', @OrderID OUTPUT) AT DB2400
Select @OrderID
If the stored procedure produces a result set, it will be returned as
well. This can be beneficial in a client/server environment for applications
using both DB2 and SQL Server calls because both systems can be accessed from a
single SQL Server connection. This setup relieves the need to have DB2
connectivity installed and configured on each desktop.
One drawback of
EXEC AT is the limited programmatic support for SELECT and stored procedure
result sets. While EXEC AT provides the ability to specify complex and
parameterized SELECTs that will execute more efficiently on the remote server,
EXEC AT does not allow joins or programmatic access to the result set. However,
this limitation can be overcome by returning the results from EXEC AT to a
temporary table using the INSERT/EXEC statement.
-- -- The following Insert Exec example requires -- a Distributed Transaction so the MS-DTC -- service must be running. -- Set NoCount On Declare @OrderID Int Set @OrderID = 10248
If Object_ID('tempdb..#tmpOrderHdr') Is Not Null Drop Table #tmpOrderHdr
Create Table #tmpOrderHdr (OrderID Int Not Null Primary Key, CustomerID VarChar(5) Not Null)
-- -- Insert data into local SQL Server table -- from parameterized SELECT in DB2 -- Insert Into #tmpOrderHdr Exec ('SELECT OrderID,CustomerID FROM DATALIB.OrderHdr WHERE OrderID=? With NC', @OrderID) AT DB2400 -- -- Do programmatic access to result set -- such as cursor processing or JOINs here. -- Select * From #tmpOrderHdr
Alternatively, INSERT/EXEC can be used against a stored procedure instead
of a SELECT:
-- -- A DB2 stored procedure can be executed -- Set @OrderID = 10249 Insert Into #tmpOrderHdr Exec ('Call DATALIB.GetOrders (?)', @OrderID) AT DB2400
Running INSERT/EXEC against a linked server requires a distributed
transaction. In a nutshell, a distributed transaction (DT) creates an
environment where database consistency and atomicity concepts are broadened
across multiple database (even heterogeneous) servers. In a DT, data
modifications must be successful on multiple systems in order for the changes to
be committed. Whereas transactions in the DB2 world normally require journaling,
it is not required in this scenario to run an INSERT/EXEC statement.
To
use a distributed transaction, the following setup is required:
- The MS-DTC
(distributed transaction coordinator) service must be running.
- You must have iSeries Access V5R1 or higher (make sure to use the latest
service pack as there have been various bugs).
- The MS-DTC service should have support for XA transactions enabled. On a
Windows XP machine, this can be accomplished by selecting Control Panel ->
Administrative Tools -> Component Services. Expand the component services and
computers node. Right-click on My Computer and choose Properties. Select the
MSDTC tab and choose the Security Configuration button. Next, click on Enable XA
Transactions. Figure 1 shows a sample configuration screen shot.
Figure 1: Here's an example of what your configuration screen would look
like. (Click image to enlarge.)
For more information on distributed transactions, click here.
OLE DB Provider IBMDASQL
Linked servers use Microsoft's OLE DB technology.
OLE DB providers are often preferred over ODBC drivers because using ODBC
requires an extra programmatic interface layer between the OLE DB and ODBC
standards. However, in the case of the iSeries, the ODBC driver offers the
better of the two options because it supports more features. Because IBM
continues to enhance the OLE DB providers for the iSeries, I thought it
worthwhile to test one of them.
iSeries Access V5R3 comes with a new
SQL-only OLE DB provider called IBMDASQL. Below is a T-SQL script that defines a
linked server called DB2400OLEDB that uses the IBMDASQL provider.
sp_addlinkedserver @server=N'DB2400OLEDB', @srvproduct=N'DB2400 UDB for iSeries', -- IBMDA400/IBMDASQL are OLE DB providers -- for DB2 UDB for iSeries. IBMDASQL is -- available with iSeries Access V5R3 @provider=N'IBMDASQL', -- System Name @datasrc=N'AS400.MYCOMPANY.COM', @catalog='S104X824'
go
sp_addlinkedsrvlogin @rmtsrvname=N'DB2400OLEDB', @useself='false', @rmtuser=N'MyUser', @rmtpassword='MyPassword' go -- -- RPC option is required for doing EXEC AT -- EXEC sp_serveroption 'DB2400OLEDB', 'rpc out', true
By substituting the DB2400OLEDB linked server name in the prior T-SQL
examples, I found a mixture of things that did and didn't work (including
differences in what worked in the SQL Server 2005 June CTP beta vs. the released
product). Because of these issues, I wouldn't recommend using IBMDASQL at this
time.
The Integration Factor
When integrating data from heterogeneous data
sources, what used to take loads of time to program now can be done quickly
thanks to these linked server enhancements. And if SQL Server 2005's linked
servers don't offer enough data access versatility, custom stored procedures and
table-valued functions can be written in a .NET programming language such as C#
or VB.NET. I'll discuss this concept in Part 2 of this series.
If you're
thinking of upgrading to SQL Server 2005, know that there are plenty of new
tools to help with data integration challenges, whether with the iSeries edition
of DB2, Oracle, or MySQL. Microsoft has done a good job of creating versatile
and easy-to-use programming tools to make disparate systems talk
seamlessly.
Michael Sansoterra is a developer at i3
Business Solutions, an IT services firm based in Grand Rapids, Michigan. You can
reach him at
This e-mail address is being protected from spam bots, you need JavaScript enabled to view it
. |