|Use SQL to Export and Delimit Files on the IBM i|
|Programming - SQL|
|Written by Thomas Snyder|
|Wednesday, 20 January 2010 01:00|
Create CSV files from multiple physical files.
In a previous article, I discussed how to "Delimit Data on the IBM i the Easy Way." In this article, we'll take it a step further and discuss alternate methods of exporting data from multiple physical files and delimited text files using comma-separated values (CSV).
The CPYF command gives you some pretty powerful capabilities when copying files. But you can do even more with SQL. There are many ways to perform the same task with the resources available on the IBM i. You just have to decide how much code you want to write and be aware of the resources that are available.
Starting the Interactive SQL Tool
Throughout most of this article, we will be using the interactive SQL tool available from the IBM i command line. To start the interactive SQL tool, type STRSQL on the command line. For more detailed information on the interactive SQL tool, refer to one of my previous articles, "Use Interactive SQL from the IBM i Command Line."
Creating a Copy of an Existing File
Let's start off simple. To begin using the SQL commands, create a copy of an existing physical file as is. This can easily be done with other commands, such as CPYF, but this will be a good start for our SQL endeavors.
We will be using the MC_PHONE file for our example, which has the following DDS:
A R MCFMT
A MCACCT 6S 0 COLHDG('ACCOUNT NUMBER')
A MCAREA 3A COLHDG('AREA CODE')
A MCPHONE 7A COLHDG('PHONE NBR')
A MCDATE 8S 0 COLHDG('CHANGE DATE')
We will be copying our mc_phone file into a new file named mc_phone1.
create table mylib/mc_phone1
as (select * from mylib/mc_phone)
Here is a quick breakdown of what the command is doing:
If you execute the DSPFFD command on the resulting mc_phone1 file, you will see that the field definitions are the same as those in the original mc_phone file.
And to verify that the data was copied into our new mc_phone1 file, we can run the following statement on the mc_phone1 table.
select * from mc_phone1
And here you have the results of the query to show the data that we will be working with:
Filter Data While Dropping and Mixing Field Positions
CPYF does the job of filtering the data, and you can use the CRTFILE(*YES) option to automatically create the target file. But if you need to change the field positions, you will need to create the target file first and use the FMTOPT(*MAP) option. And if you want to drop fields between the source and the target file, you will need to also use the FMTOPT(*DROP) option.
You can easily support this in SQL without the preexisting file by specifying which fields will be in your output file and which order to put them in.
For this example, we will only copy the records that were created in 2010, and we will drop the date table from the results. And just for good measure, we will also move the account number to be the third field instead of the first.
create table mylib/mc_phone2
as (select mcarea, mcphone, mcacct from mylib/mc_phone
where mcdate > 20091231)
Instead of using the asterisks in the SELECT statement, we specified which fields to use and which order to put them in in the target file.
If you execute the DSPFFD command on the resulting mc_phone2 file, you will see that the mcdate field does not exist. You can also see that the field order has been changed to be as expected.
Delimiting and Filtering the Data with SQL
CPYF can be used to delimit the data by setting defaults in the delimiter fields of the target file, as discussed in my previous article. But with SQL, you can add the delimiters to the target file while you are creating it by concatenating the delimiter characters to the fields.
TABS are common delimiters, and you can use them by specifying the Hex EBCDIC Tab value of X'05'. But let's make it a little more interesting this time by creating a CSV file that we can easily download and use with Excel.
Using SQL, we also have the ability to filter the data, and to top it all off, we don't need the additional DDS! All we have to do is insert the constant values to be used with our source data to create the delimiters. When the new fields are created, we need to specify the field name immediately after the constant value is specified to be used.
create table mylib/mc_phone3
as (select '"' delimit1, mcarea,
'","' delimit2, mcphone,
'","' delimit3, mcacct,
from mylib/mc_phone where mcdate > 20091231)
In theory, this statement should generate the results that we're looking for. And if you're using the data transfer utility with iSeries Access, it probably will. But if you download the file using FTP, you may find some additional and undesirable characters included with your delimiters.
No problem. We can get around that by encoding our fields to the EBCDIC CCSID of 37 or whatever CCSID you prefer.
create table mylib/mc_phone3
as (select cast('"' as char(1) ccsid 37) delimit1, mcarea,
cast('","' as char(3) ccsid 37) delimit2, mcphone,
cast('","' as char(3) ccsid 37) delimit3, mcacct,
cast('"' as char(1) ccsid 37) delimit4
from mylib/mc_phone where mcdate > 20091231)
If you execute DSPFFD on mc_phone3, you will see that the new delimited fields are included in the data. And if you view the data using a SELECT statement, you can see that the new data is delimited.
Figure 4: Running DSPFFD on mc_phone3 shows that the new delimited fields are included in the data.
Now if we download the file, providing a .csv extension to the file, we can easily open the file in a spreadsheet. The .csv extension will help the application identify the expected format of the data as CSV.
Retrieving the Data from Multiple Source Files
Now that we know how to insert delimiters, let's really explore the reason that you would use SQL by retrieving the data from multiple source files.
For this example, we will be using an additional file called MC_ACCT, which contains only the account number, first name, and last name. Here is the DDS for the MC_ACCT file:
A R MAFMT
A MAACCT 6S 0 COLHDG('ACCOUNT NUMBER')
A MAFNAME 32A COLHDG('FIRST NAME')
A MALNAME 32A COLHDG('LAST NAME')
A K MAACCT
Suppose you already have 100 things on your plate and a request comes through for a new process that requires critical business information (simply a list of records from a few files) to be sent to an external system—and the boss needs it yesterday! Sure, you could write an RPG program to do it, or you could do it in SQL using a single statement!
create table mylib/mc_phone4
as (select cast('"' as char(1) ccsid 37) delimit1, a.maacct,
cast('","' as char(3) ccsid 37) delimit2, a.mafname,
cast('","' as char(3) ccsid 37) delimit3, a.malname,
cast('","' as char(3) ccsid 37) delimit4, p.mcarea,
cast('","' as char(3) ccsid 37) delimit5, p.mcphone,
cast('"' as char(1) ccsid 37) delimit6
from mylib/mc_acct a, mylib/mc_phone p
where a.maacct = p.mcacct)
The single letters a and p are references to the mc_acct and mc_phone files, respectively, and are assigned in the from portion of the command. The link to the tables is specified in the where portion, which identifies the account number fields maacct and mcacct.
And there you have it: a quick data export comprised of multiple files in CSV format using a single line of code!
The Right Tools for Any Job
As you can see, there are different ways to perform different tasks, based upon the level of complexity of the goal that you are trying to reach. There are different commands available, such as CPYFRMIMPF and CPYF, or you can simply use SQL and do it all. But then again, RPG can do everything here and more. Again, my point is that there are different tools for different tasks. If you know what they are, you can use them where most appropriate.
|Last Updated on Monday, 18 January 2010 15:06|