Use these new functions to easily extend your applications to the web.
If you've been waiting for a DB2 for i feature that is unprecedented in power and simplicity, get ready to become "unhinged." Further, prepare to hammer on your IBM i admin to install the latest database fixes, because DB2 for i 7.1 Group PTF Level 22 introduces a suite of functions that allow developers to make all manner of HTTP requests. (Actually, Level 22 had a problem, so you should skip right to the latest level, which is currently 23.) Right out of the box and without fancy programming, your apps can easily speak to a cloud server or intranet application server to download a file, contact a web service, or retrieve the source of a web page, to name a few possibilities.
Although I'll try to include reference links where appropriate, this tip assumes the reader has some familiarity with HTTP requests and responses.
If you interrogate the SYSFUNCS catalog (or drill down through IBM Navigator for i's database node), you will find the following new HTTP functions in the SYSTOOLS schema:
From the list, you can see there are several function sets that can be used to send an HTTP request using any of the standard HTTP methods (also known as verbs), including GET, PUT, POST, DELETE, and HEAD. Quite often, the GET verb is used to retrieve data. The particular verb you'll need to use is dictated by the web application you're communicating with. For example, if you wanted to upload an IFS file programatically to a Microsoft SkyDrive account using the available REST API, you would use either the PUT or the POST verb.
In addition to the HTTP functions, there are four companion functions that are often used with HTTP requests. I’m going to take a slight detour to discuss these first:
The Base64 functions are for encoding and decoding binary content that has been encoded as text for safe transmission over the web. For example, email attachments and binary data from a web service (such as an image) are often Base64 encoded.
The URL encode and decode functions are used to make sure that pieces of a URL are constructed properly because certain characters in a URL have special meaning—e.g., forward slash (/), question mark (?), ampersand (&), etc. These special characters need to be handled properly lest they be misinterpreted by the HTTP server.
As a quick example of why this is necessary, say I want to construct a URL to do a Google search on "A&W Root Beer." My URL may look like this:
https://www.google.com/search?q=A&;W Root beer&client=AS400
In this example, q and client are parameter names in the URL. However, because the ampersand (&) is in the search text and this special character is also used to signal a new parameter in the URL, the web server is going to think W is a new parameter that has been passed improperly. The way to fix this is to make sure that the URL is “safe” by encoding the variable content. The URLENCODE function will replace spaces with a plus sign (+) and replace special characters with a percent sign (%) followed by their hex value.
Here is an example of how the function ensures parameter values are safely passed to the HTTP server:
|| SYSTOOLS.URLENCODE('A&W Root beer','UTF-8')
|| '&client=' || SYSTOOLS.URLENCODE('AS400','UTF-8')
The result of the expression is this (note the ampersand becomes %26):
Technically, we didn't need to encode the client value of "AS400" because there are no special characters, but generally all non-hardcoded parameter values in a URL should be run through the encoding function to be safe. The second function parameter specifies the encoding. UTF-8 (i.e., 8-bit unicode) is the default and recommended value, but other standard encoding values such as UTF-16, US-ASCII, and ISO-8859-1 are valid.
Getting Data from the Web
Back to the task at hand. There's a lot of ground to cover with all of these functions, but for now I'm just going to focus on the HTTPGETCLOB and HTTPGETBLOB scalar functions. These functions submit an HTTP "GET" request to a specified HTTP server and return the result as a BLOB or CLOB.
You can think of these gems as a non-GUI web browser in a genie's lamp (or should I say in a scalar function?). For example, running this statement with our sample URL from above will return Google's HTML response in a single column:
These functions accept two parameters: URL and HTTPHeaders. Although left as an empty string here, the HTTPHeaders parameter is a CLOB that accepts an XML formatted string of request header fields and values such as User-Agent that can be passed to the server. In this case, no headers were sent.
The abridged results (with inserted line breaks for readability) look like this, pretty much the same thing you will see if you paste the same URL query in your favorite browser and then choose the option to view the page source:
<html itemscope="itemscope" itemtype="http://schema.org/WebPage">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta itemprop="image" content="/images/google_favicon_128.png">
<title>A&W Root beer - Google Search</title>
It's cool that we can easily retrieve HTML. The only problem is that it can be a pain in the neck to parse. Therefore, let's move on to a better use for these functions: the ability to call a parameterized web service and retrieve a result.
Note: Many organizations are publishing data via web services. Before you attempt to retrieve and parse an HTML web page, check whether there's a web service that can supply the same information. While it's possible to parse the Google HTML results from the sample URL, it's much easier to call the equivalent Google web service-based API to return just the search result data. That saves the headache of removing the HTML markup from the data returned by a web page.
Let's say you need a web service that accepts a U.S. zip code and returns the corresponding city and state. Searching the web, you find this free web service URL that can be incorporated into your app:
When reviewing this URL in a browser, you see there are four means of invoking the service:
? SOAP 1.1
? SOAP 1.2
? HTTP GET
? HTTP POST
If you're not familiar with SOAP, it originally stood for Simple Object Access Protocol (and now it's just "SOAP"). And, contrary to the name, SOAP requests can get quite complicated. When using SOAP, you're required to create an XML envelope containing parameters, etc. Here's an example of what a browser or "client" will pass to the HTTP server for a SOAP request:
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
SOAP requires quite a bit of XML, even for the simplest of web service (WS) invocations, so I usually avoid calling a WS via SOAP if something easier is available. This is not to say SOAP is bad, but for many web services with simple parameter data, it can be overkill.
In this case, the HTTP GET and HTTP POST options indicate the WS provider also offers a representational state transfer (i.e., "REST") option to invoke the service. Invoking REST services is often simpler than SOAP because:
- 1.Parameters are passed in the URL instead of XML.
- 2.You can fiddle with them interactively and see what they do using just about any web browser. For example, just slap this URL into your web browser: http://www.webservicex.net/uszip.asmx/GetInfoByZIP?USZip=49525. You'll see the XML response. (REST services do not always return XML; they can also be plain text, binary, etc.)
- 3.An XML result from a REST service is generally easier to parse.
So, using these cool new DB2 for i HTTP functions, it's easy to let DB2 get the WS data. Run this SQL statement and watch it pull data from the WS:
SELECT data FROM (VALUES(SYSTOOLS.HTTPGETCLOB('http://www.webservicex.net/uszip.asmx/GetInfoByZIP?USZip=49525',''))) WS(data);
Here's the XML result (the same thing you'll see if you paste the URL in a browser):
<?xml version="1.0" encoding="utf-8"?>
Note: If you're using SQL in the STRSQL green-screen utility, you'll need to CAST the result to VARCHAR (<32K) in order to see the results.
The new XMLTABLE table function can be used to dynamically extract the values from the XML result:
City VARCHAR(128) PATH 'CITY',
State VARCHAR(1024) PATH 'STATE',
Zip VARCHAR(255) PATH 'ZIP',
AreaCode VARCHAR(255) PATH 'AREA_CODE',
TimeZone VARCHAR(255) PATH 'TIME_ZONE'
) AS WebServiceResult;
Of course, languages like C, RPG, or Java can also be used to parse the results of an XML result.
Because you can retrieve cloud data as a scalar or tabular result, you can now connect DB2 for i to the cloud and let data from the cloud participate in queries against your local database: the ultimate in heterogeneous queries!
Downloading File Data from the Web
For the final example, consider HTTPGETBLOB. HTTPGETBLOB has the same parameters as HTTPGETCLOB, and they pretty much work the same; the main difference is that the result returned from the server will remain unaltered. When retrieving binary data or text data that you don't want translated in any fashion (such as ASCII to EBCDIC), use HTTPGETBLOB.
Here's a sample RPG program that demonstrates how to download a PDF from the web directly to a file on the IFS in the /tmp folder using HTTGETPBLOB (don't forget to add error handling):
D PDF S SQLTYPE(BLOB_FILE)
D WebURL S 255 Varying
PDF_FO=SQFOVR; // Overwrite
SELECT SYSTOOLS.HTTPGETBLOB(:WebURL,'') PDF_DATA
The HTTPGETBLOB function is used to retrieve a Lionel trains PDF brochure from lionel.com. Once the PDF is returned as a BLOB, it's passed to the special BLOB_FILE SQL data type (RPG variable :PDF), which is configured to automatically dump its content to the specified IFS file.
In the past, I've implemented similar functionality to download a PDF from a web server using an RPG sockets program. This new method is trivial in comparison! It goes without saying, you'll normally want to implement a parameterized URL and IFS file name. Once parameterized, this program will allow the download and storage of any valid URL; the equivalent of right-clicking on a hyperlink in a browser and choosing "Save As."
Application Design Considerations
When retrieving data over the web, be careful about performance considerations. Obviously, there is the inherent latency in making a call over the Internet. Moreover, when returning large amounts of data in a table function like XMLTABLE, DB2 for i doesn't always choose the best join order for the tables (and in fact DB2 doesn't have any way to know how many rows will be returned) unless you create a wrapper table function and specify a cardinality.
Further, these functions require Java 1.6 to be installed, which means there's overhead when using the Java Virtual Machine (JVM). These functions have been relatively quick on the IBM i 7.1 machines I've been on, but during my many days working with customers on small boxes on V5R3 or V5R4, I have a history of thinking that Java takes too much overhead to use in a large number of interactive jobs. Hopefully, that's changing.
As an alternative to using these functions (or if you're not at IBM i 7.1 yet), you can always write your own RPG sockets program to act as an HTTP client. Other free options are IBM's Integrated Web Services for i tool, or Scott Klement's HTTP API library. You can write your high-level language (RPG, C, COBOL, etc.) code in such a way as to allow DB2 to tap into your code via an external function or external stored procedure call. So you can in fact still have the best of both worlds and avoid the Java overhead if needed.
Finally, make sure to carefully consider error handling. Whatever is on the web is outside of your control, so the application should know what to do if something isn't working right. Take extra care when using "free" web services because you never know how long they'll be available.
All in all, this suite of DB2 functions is spectacular! They allow you to easily exchange data with HTTP-enabled applications whether on the Internet or your intranet. They can be used to retrieve HTML from a web page, contact a web service, consume an RSS feed, push an IFS file to a server, or download a file (image, PDF, text, etc.) from a server. Best of all, since the functions can be run interactively in any SQL environment, testing, prototyping, and development are a snap.
Stick around. In the future, I'll demonstrate how to pass HTTP headers in a request, how to examine the response headers, and how to perform basic authentication when accessing data that requires a user name and password.
This reference is for DB2 LUW, but most of the concepts apply to IBM i. When trying the examples, remember to use the SYSTOOLS schema instead of the DB2XML schema. The examples include consuming an RSS feed and listing unread emails in a GMail inbox.