Using OPNQRYF to Join Files


Joining is combining data from two files into one record format. Relational databases join files so that applications have all the information they need.

There's more than one way to join files on the AS/400. You can create permanent join logical files by using DDS or SQL views, or you can create temporary joins that last only as long as you need them by using Open Query File (OPNQRYF).

Creating join logical files through DDS was described in "If You Can't Beat 'em, Join 'em" (MC, May 1995). If you're not familiar with join logical files on the AS/400, I encourage you to read that article.

Typically, you create join logical files when you need to constantly retrieve data from multiple related files. You may have an order entry system that has an invoice header file and an invoice detail file. When your application retrieves information from the invoice header file, it also retrieves information from the invoice detail file. Rather than define two files and create the code to process them, with a join logical file you can create one file that logically connects both files; you can reference and process one join logical file instead of two files.

The system automatically maintains the logical view; when records are added to or deleted from any of the files that make up a join logical file, the system, by default, immediately updates the logical view.

Once created, join logical files normally remain on the system as permanent files, always ready to release information. Many cases requiring joins, however, are better served by temporary joins. If you're creating all your joins through join logical files, you could be wasting system resources and hurting the performance of your system. For example, permanently joining files for a monthly report may waste system resources by maintaining a logical view that is used only once a month.

If you use the OPNQRYF command to join files, the logical view is temporary. It resides on the system only for the duration of the job that requires it, so no system resources are consumed maintaining the logical view.

This article explains OPNQRYF's join capabilities and shows you how to implement common types of joins with the OPNQRYF command. First, however, I'll discuss why joining is necessary.

Why Joining Is Necessary

Joining is a key concept in the relational model of database management systems (DBMSs). One goal of relational DBMSs is to avoid having inconsistent or missing information (or "anomalies") in the database.

If our database software did not permit joins, we would have to either put all the information required to process a sale into one file or write a program to retrieve the information from numerous files. Shown below is a list of the types of information that might be necessary to complete a sales transaction.

* Customer number

* Customer's name

* Customer's address

* Customer's phone number

* Item number

* Item description

* Item quantity in inventory

* Item unit price

* Quantity sold

* Date sold

There are some problems with the first method, putting everything into one file. First, it is possible to lose information. We can't store information about a customer who does not currently have any orders with us. Similarly, if there are currently no orders for a particular item, we can't find out how many of those items we have in inventory, how much the item sells for, or even what it's called.

Second, it is possible to have conflicting information in the database. If a customer moves, we have to find every order for him and change his address in all of them. If we don't find all of his orders, we have different addresses on file. It is also possible that we could miskey his address or phone number when entering a new order. The computer has no way to know which of the contradicting values is valid.

To avoid these anomalies, the relational model splits the file into separate files of related information; that is, it "normalizes" the file. The sales file becomes three files: a customer master file, an item master file, and a sales transactions file.

* Customer master: customer number, customer's name, customer's address, customer's phone number

* Item master: item number, item description, item quantity in inventory, item unit price

* Sales transactions: customer number, item number, quantity sold, date sold

With this method, we no longer have to worry about contradicting or missing information, but the sales transactions no longer have all the data they need. This is where joining comes in. By joining the three files, we can view the data the way we had it organized at first, yet store it in the database without the possibility of introducing anomalies.
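To make the idea concrete, here is a minimal Python sketch (not AS/400 code) of rebuilding the wide sales record from the three normalized files. The customers, items, and sales below are invented for illustration, and the field names are assumptions rather than anything defined in this article.

```python
# Hypothetical sample data; none of these values come from a real system.
customers = {
    100: {"name": "Acme Supply", "address": "12 Main St", "phone": "555-0100"},
    200: {"name": "Baker Tools", "address": "9 Oak Ave", "phone": "555-0200"},
}
items = {
    "A1": {"description": "Widget", "on_hand": 40, "unit_price": 2.50},
    "B2": {"description": "Gadget", "on_hand": 15, "unit_price": 10.00},
}
sales = [
    {"customer": 100, "item": "A1", "quantity": 4, "date": "1995-06-01"},
    {"customer": 100, "item": "B2", "quantity": 1, "date": "1995-06-02"},
]


def join_sales(sales, customers, items):
    """Return one wide record per sale by looking up its customer and item."""
    joined = []
    for sale in sales:
        record = dict(sale)                          # sales transaction fields
        record.update(customers[sale["customer"]])   # customer master fields
        record.update(items[sale["item"]])           # item master fields
        joined.append(record)
    return joined


wide_records = join_sales(sales, customers, items)
```

Customer 200 has no sales, yet its name and address are still on file, and a change of address for customer 100 is made in exactly one place. Those are the two anomalies that the single-file design could not avoid.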

Why Join with OPNQRYF?

There are some good reasons to join files temporarily with OPNQRYF rather than permanently with DDS or SQL views.

* Joining with OPNQRYF allows you to avoid keeping a join logical file on disk when it is not being used. Logical files increase system overhead because a change to a physical file requires the system to change its logical files as well. Unless you use a join logical file frequently or the join logical file is over stable data, it may be better to use OPNQRYF.

* OPNQRYF permits you to have key fields from more than one physical file.

* OPNQRYF can join logical files. You can join only physical files with DDS or SQL views.

* Join logical files can join only on matching field values. OPNQRYF allows you to join on unequal conditions and calculated values.

* OPNQRYF will perform a type of join, the exception join, that isn't available through join logical files.

Types of Joins

Some relational database systems support only one type of join, the inner join, but the AS/400 supports three. I'll explain each one and illustrate it by joining data from a personnel system. Figure 1 contains sample data about employees (PEREMP) and their dependents (DEPEND).

An inner join disregards a data record if complementary data is not in another file. Figure 2 shows the results of an inner join of the employees file and the dependents file. Bob Green (employee 97) has no dependents, so an inner join does not include any information for him.

An outer join includes all records from the primary file, even when complementary data doesn't exist. The form of outer join supported by the AS/400 is called a partial outer join or left outer join. Records from the primary file (the first file) are included in the resultant table even if no complementary records exist. Unmatched records from other files (secondary files) are not included.

If you use an outer join to join the employee master to the dependents file, you get information about all employees, including Bob Green (see Figure 3). The dependent name, dependent relationship, and dependent date of birth fields have default values in the record for Bob.

An exception join finds only unmatched records. Joining the employee master to the dependents master retrieves only one record: Bob Green's (see Figure 4). Many relational systems do not have this type of join.
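The differences among the three join types can be sketched in a few lines of Python. This is only a model of the behavior described above, not the way the AS/400 implements joins; it uses the employee numbers and names from Figure 1, and the `join` function and its `kind` argument are hypothetical names chosen for the illustration.

```python
# Sample data from Figure 1, reduced to employee number and name.
PEREMP = [
    (1, "Bigg, Paul"), (55, "Black, Jack"), (97, "Green, Bob"),
    (101, "Smith, Joe"), (105, "White, Moe"), (119, "Brown, Pat"),
]
DEPEND = [
    (1, "Bigg, Sue"), (55, "Black, Betty"), (101, "Smith, Kay"),
    (101, "Smith, Bud"), (101, "Smith, Tyler"), (105, "White, Ann"),
    (119, "Brown, John"), (119, "Brown, Don"),
]


def join(primary, secondary, kind="inner"):
    """Join on employee number; kind is 'inner', 'outer', or 'exception'."""
    result = []
    for eno, ename in primary:
        matches = [dname for deno, dname in secondary if deno == eno]
        if matches and kind != "exception":
            # Inner and outer joins keep every matched pair.
            result.extend((eno, ename, dname) for dname in matches)
        elif not matches and kind != "inner":
            # Outer and exception joins keep unmatched primary records.
            # The blank stands in for the default value of a character field.
            result.append((eno, ename, ""))
    return result


inner = join(PEREMP, DEPEND, "inner")
outer = join(PEREMP, DEPEND, "outer")
exception = join(PEREMP, DEPEND, "exception")
```

The inner join yields the eight records of Figure 2, the outer join adds a ninth record for Bob Green with a blank dependent name, and the exception join yields Bob Green's record alone.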

Unfortunately, it is possible to enter invalid data into a database or to delete a master file record that has corresponding data in other files. The referential integrity features of DB2/400, introduced in V3R1, can help you avoid such situations. In the meantime, if you think it's possible that a record might not have a match, you may want to use an outer join instead of an inner join.

Making OPNQRYF Join Files

To make OPNQRYF join files, you must fill in certain parameters correctly.

In the FILE parameter, list at least two files that you want to join. The first file is the primary join file, and the rest are secondary join files. This distinction matters only for outer and exception joins.

In the FORMAT parameter, give the name of a file that OPNQRYF can use for a pattern when it passes the joined data to a program. Use a production file if you have one that suits your needs. Most often, however, you'll need to create a dummy physical file with the fields you need from all the joined files.

In the JDFTVAL parameter, specify what type of join you want. JDFTVAL(*NO), the default, gives you an inner join. Specify *YES for an outer join, and *ONLYDFT for an exception join. If any file in the FILE parameter is a join logical file, this parameter must match the join logical file.

In the JFLD parameter, list the fields used to match up the files. If a field has the same name in two or more files, qualify it with its file name (e.g., PEREMP/ENO) or with the relative position of the file in the FILE parameter (e.g., 1/ENO). In Figure 1, the ENO field would match the DENO field. Since the fields have different names, no qualification is necessary.

You may join fields on inequalities as well, but that is unusual. You may also join on two or more fields.

If you do not list matching fields in the JFLD parameter, every record of one file is joined to every record of another file. (This join is known as the Cartesian product.) Generally, you should avoid such a loosely defined join.
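A quick Python sketch shows why an unrestricted join grows so fast. It is an illustration only: the two lists hold the employee numbers found in the PEREMP and DEPEND records of Figure 1, and the variable names are assumptions.

```python
from itertools import product

# Employee numbers as they appear in Figure 1: six PEREMP records
# and eight DEPEND records.
employee_numbers = [1, 55, 97, 101, 105, 119]
dependent_employee_numbers = [1, 55, 101, 101, 101, 105, 119, 119]

# With no join fields, every employee pairs with every dependent.
cartesian = list(product(employee_numbers, dependent_employee_numbers))

# Matching on employee number keeps only the pairs that belong together.
matched = [(eno, deno) for eno, deno in cartesian if eno == deno]
```

Six employees and eight dependents produce 48 joined records, of which only 8 pair an employee with his or her own dependent. With files of realistic size, the unrestricted result becomes enormous.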

In the JORDER parameter, you control the order by which the system joins the files. For example, if you're joining files A, B, and C, you can force the system to join file C first, file A second, and file B third. The JORDER parameter is permitted only for inner joins. Because the system may want to rearrange the file order for performance reasons, the default value of *ANY will probably yield the best performance results. The system may retrieve records in different sequences, depending on which file is considered primary and on what order the secondary join files are processed in.

Unless you have some compelling reason to use this parameter, let the system join the files any way it wants.

In the MAPFLD parameter, qualify fields from the FORMAT file that are found in two or more of the joined files. We won't have to qualify examples that join PEREMP and DEPEND, since no field has the same name in both files.

Examples

Let's take a look at using the parameters to create the three types of joins I have described.

The inner and outer joins return data from both files, so I define a record format that contains all of the fields from both files, although defining all fields from both files is not a requirement. Figure 5 contains the DDS for physical file PERWRK. This file doesn't contain data and doesn't need a member. You will use it in the FORMAT parameter for the inner and outer joins. Any high-level language (HLL) program that reads the resulting data from the join will be coded to read PERWRK.

The PEREMP and DEPEND files each have employee number fields: ENO in PEREMP and DENO in DEPEND. The JFLD parameter will tell OPNQRYF to join the files based on the values of these fields.

Shown here is the OPNQRYF command for the inner join, along with the other CL commands needed to complete this job.

    OVRDBF     FILE(PERWRK) +
                 TOFILE(PEREMP) +
                 SHARE(*YES)
    OPNQRYF    FILE((PEREMP) (DEPEND)) +
                 FORMAT(PERWRK) +
                 KEYFLD((ENO) (DBIRTH)) +
                 JFLD((ENO DENO)) +
                 JDFTVAL(*NO)
    CALL       PGM(DEPLIST)
    CLOF       OPNID(PEREMP)
    DLTOVR     FILE(PERWRK)

The FILE parameter of the OPNQRYF command says the data is in PEREMP and DEPEND. The FORMAT parameter tells OPNQRYF to make the joined data look like the records in PERWRK. The KEYFLD parameter tells OPNQRYF how to sort the data. The JFLD parameter matches ENO of the PEREMP file to DENO of the DEPEND file. JDFTVAL(*NO) specifies that any PEREMP records that have no matches in DEPEND are to be ignored.

Program DEPLIST is written in a language such as RPG or COBOL and is coded to read PERWRK and write to a printer. The data, however, is not in PERWRK, but in PEREMP and DEPEND. That's why the override is there. The override to the primary join file, PEREMP, forces DEPLIST to read from the open data path created by OPNQRYF.

When DEPLIST ends, the Close File (CLOF) and Delete Override (DLTOVR) commands tidy up the job.

To get the outer join results shown in Figure 3, change one line. Instead of accepting the default JDFTVAL(*NO), specify JDFTVAL(*YES).

The exception join shown in Figure 4 is different. There is no matching data in the DEPEND file, so those fields always have default values. You could use the PERWRK format for this type of join, but since the PEREMP fields are the only ones with meaningful information, use PEREMP as the format file.

    OVRDBF     FILE(PEREMP) SHARE(*YES)
    OPNQRYF    FILE((PEREMP) (DEPEND)) +
                 FORMAT(PEREMP) +
                 KEYFLD((ENO)) +
                 JFLD((ENO DENO)) +
                 JDFTVAL(*ONLYDFT)
    CALL       PGM(EMPLIST)
    CLOF       OPNID(PEREMP)
    DLTOVR     FILE(PEREMP)

Program EMPLIST is written to read PEREMP, which it may do under other circumstances. Under the control of OPNQRYF, it reads only the PEREMP records that have no matching records in the DEPEND file.

Notice that the PEREMP file has a field called ESUPRV, which has the supervisor's employee number. Suppose you need to know who supervises whom. You can find out by joining the PEREMP file to itself!

You'll need a file to store supervisor and employee information. Call it SUPRWRK. As the DDS in Figure 6 shows, SUPRWRK has room for the employee's name and number and his supervisor's name and number. However, SUPRWRK does not contain data and need not have a member. The real data is in PEREMP.

Use this OPNQRYF command for this type of inner join.

    OVRDBF     FILE(SUPRWRK) +
                 TOFILE(PEREMP) +
                 SHARE(*YES)
    OPNQRYF    FILE((PEREMP) (PEREMP)) +
                 FORMAT(SUPRWRK) +
                 KEYFLD((SNAME) (ENAME)) +
                 JFLD((1/ENO 2/ESUPRV)) +
                 MAPFLD((SNO '1/ENO') +
                        (SNAME '1/ENAME') +
                        (ENO '2/ENO') +
                        (ENAME '2/ENAME'))
    CALL       PGM(SUPRLIST)
    CLOF       OPNID(PEREMP)
    DLTOVR     FILE(SUPRWRK)

Listing PEREMP twice in the FILE parameter causes OPNQRYF to treat PEREMP as two different files so that one record of PEREMP can be joined to another (see Figure 7). The JFLD parameter matches the employee number field of one record to the supervisor field of another.

To distinguish between the data in the two records, use the MAPFLD parameter. In this example, the ENO and ENAME fields of the supervisor's record are called SNO and SNAME in the resultant table. I chose not to rename the ENO and ENAME fields of the employee's records, but I could have.

Program SUPRLIST is written as if it will read the SUPRWRK file, but when it begins to run, it reads the data that OPNQRYF retrieves from PEREMP.
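The logic of joining a file to itself can be modeled in Python as two passes over the same list. This sketch is not how OPNQRYF works internally; it uses the ENO, ENAME, and ESUPRV values from Figure 1, and `supervisor_list` is a hypothetical name chosen for the illustration.

```python
# (ENO, ENAME, ESUPRV) from Figure 1; None stands in for the NULL supervisor.
PEREMP = [
    (1, "Bigg, Paul", None), (55, "Black, Jack", 1), (97, "Green, Bob", 1),
    (101, "Smith, Joe", 97), (105, "White, Moe", 55), (119, "Brown, Pat", 97),
]


def supervisor_list(employees):
    """Inner-join the file to itself: supervisor ENO = employee ESUPRV."""
    rows = []
    for sno, sname, _ in employees:            # first copy: supervisors
        for eno, ename, esuprv in employees:   # second copy: employees
            if sno == esuprv:
                rows.append((sno, sname, eno, ename))
    # Order by supervisor name, then employee name,
    # in the spirit of KEYFLD((SNAME) (ENAME)).
    return sorted(rows, key=lambda row: (row[1], row[3]))


rows = supervisor_list(PEREMP)
```

The five resulting rows match Figure 7. Paul Bigg appears only as a supervisor because his own supervisor field matches no employee number, and an inner join drops unmatched records.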

Join Me Again!

Joining files with OPNQRYF gives you plenty of options. OPNQRYF also allows you to reduce the system overhead that's dedicated to maintaining logical files. There's much more I can tell you about joining files. For instance, consider these possibilities.

* Joining more than two files.

* Joining with more than one common join field.

* Joining on inequalities.

* Using the Cartesian product.

* Joining with the QRYSLT parameter.

* Joining on data of different types.

* Joining on expressions.

Look for an article on additional joining methods in an upcoming issue of MC.

Ted Holt is an associate technical editor for Midrange Computing.



Figure 1: Employee and Dependents Files PEREMP and DEPEND

    PEREMP

    ENO  ENAME        EPHONE    ESUPRV  EHIRED
      1  Bigg, Paul   234-4321  NULL    10/21/80
     55  Black, Jack  345-7777  1       8/21/82
     97  Green, Bob   345-6789  1       8/17/81
    101  Smith, Joe   234-5678  97      3/30/93
    105  White, Moe   234-7654  55      2/18/92
    119  Brown, Pat   234-8888  97      4/19/94

    DEPEND

    DENO  DNAME         DRELAT  DBIRTH
       1  Bigg, Sue     Spouse  9/14/47
      55  Black, Betty  Spouse  11/23/53
     101  Smith, Kay    Spouse  5/29/70
     101  Smith, Bud    Child   1/06/92
     101  Smith, Tyler  Child   1/08/94
     105  White, Ann    Child   4/16/90
     119  Brown, John   Spouse  8/02/71
     119  Brown, Don    Child   5/25/94

Figure 2: Inner Join of PEREMP and DEPEND

    ENO  ENAME        EPHONE    ESUPRV  EHIRED    DNAME         DRELAT  DBIRTH
      1  Bigg, Paul   234-4321  NULL    10/21/80  Bigg, Sue     Spouse  9/14/47
     55  Black, Jack  345-7777  1       8/21/82   Black, Betty  Spouse  11/23/53
    101  Smith, Joe   234-5678  97      3/30/93   Smith, Kay    Spouse  5/29/70
    101  Smith, Joe   234-5678  97      3/30/93   Smith, Bud    Child   1/06/92
    101  Smith, Joe   234-5678  97      3/30/93   Smith, Tyler  Child   1/08/94
    105  White, Moe   234-7654  55      2/18/92   White, Ann    Child   4/16/90
    119  Brown, Pat   234-8888  97      4/19/94   Brown, John   Spouse  8/02/71
    119  Brown, Pat   234-8888  97      4/19/94   Brown, Don    Child   5/25/94

Figure 3: Outer Join of PEREMP and DEPEND

    ENO  ENAME        EPHONE    ESUPRV  EHIRED    DNAME         DRELAT  DBIRTH
      1  Bigg, Paul   234-4321  NULL    10/21/80  Bigg, Sue     Spouse  9/14/47
     55  Black, Jack  345-7777  1       8/21/82   Black, Betty  Spouse  11/23/53
     97  Green, Bob   345-6789  1       8/17/81
    101  Smith, Joe   234-5678  97      3/30/93   Smith, Kay    Spouse  5/29/70
    101  Smith, Joe   234-5678  97      3/30/93   Smith, Bud    Child   1/06/92
    101  Smith, Joe   234-5678  97      3/30/93   Smith, Tyler  Child   1/08/94
    105  White, Moe   234-7654  55      2/18/92   White, Ann    Child   4/16/90
    119  Brown, Pat   234-8888  97      4/19/94   Brown, John   Spouse  8/02/71
    119  Brown, Pat   234-8888  97      4/19/94   Brown, Don    Child   5/25/94

Figure 4: Exception Join of PEREMP and DEPEND

    ENO  ENAME        EPHONE    ESUPRV  EHIRED
     97  Green, Bob   345-6789  1       8/17/81

Figure 5: DDS for Physical File PERWRK

     A          R PERWRKR
     A            ENO       R               REFFLD(ENO PEREMP)
     A            ENAME     R               REFFLD(ENAME PEREMP)
     A            EPHONE    R               REFFLD(EPHONE PEREMP)
     A            ESUPRV    R               REFFLD(ESUPRV PEREMP)
     A            EHIRED    R               REFFLD(EHIRED PEREMP)
     A            DNAME     R               REFFLD(DNAME DEPEND)
     A            DRELAT    R               REFFLD(DRELAT DEPEND)
     A            DBIRTH    R               REFFLD(DBIRTH DEPEND)

Figure 6: DDS for SUPRWRK

     A                                      REF(EMPMAS)
     A          R SUPRWRKR
     A            SNO       R               REFFLD(ENO)
     A            SNAME     R               REFFLD(ENAME)
     A            ENO       R               REFFLD(ENO)
     A            ENAME     R               REFFLD(ENAME)

Figure 7: Inner Join of PEREMP and PEREMP

    SNO  SNAME        ENO  ENAME
      1  Bigg, Paul    55  Black, Jack
      1  Bigg, Paul    97  Green, Bob
     55  Black, Jack  105  White, Moe
     97  Green, Bob   119  Brown, Pat
     97  Green, Bob   101  Smith, Joe