In January, I took a new gig with a new client and was exposed to the most obscure field names that I had ever seen in my life. They had absolutely nothing to do with the underlying type of data, and there was no rhyme or reason to the naming convention. Now, this is normally not too much of a problem, but the system I was designing had to access over 70 files out of 1,000 in a single library, each file having about 100 fields and at least four logical files (indexes). In addition, most of the physical files had a minimum of 1 million records to contend with. As a result, I spent a lot of time trying to optimize the data access of my queries, and I did not relish using my handy decoder ring each time I wanted to search for an index that might be appropriate for the query I was optimizing.
Display File Field Description (DSPFFD) is a wonderful utility if you like browsing through green data, but what I needed in this situation was a friendly listing of indexes that decoded obfuscated field names into their column headers. Such a listing would have been a handy reference to have during the query optimization process. I developed a spreadsheet, which can be downloaded from the MC Web site at www.midrangecomputing.com/mc, to aid in query optimization and illuminate the dark world of the target data.
This spreadsheet uses ActiveX Data Objects (ADO) to retrieve a list of indexes for a target physical file on the AS/400. It then uses another ADO method to retrieve column headers for fields in the target physical file so it can print a report of available indexes. The macro that produces the report uses a technique called disconnected recordsets to manipulate the collection of column headers as it iterates through each index entry retrieved from the AS/400. This technique is a valuable addition to your toolbox because it allows you to massage and manipulate recordsets that you obtain from databases. The resulting spreadsheets are also useful, so let me walk you through the macro so you can see how it's done!
First, the user of the spreadsheet enters the library, table name, data source name (DSN), user ID, and password in an Excel form. Clicking the List Indexes button executes the macro GetIndexInfo. GetIndexInfo starts by declaring an ADO connection object and connecting to the specified data source. (This spreadsheet is not AS/400-specific; it connects to any ODBC data source and produces an index report on any defined database table.)
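A minimal sketch of that connection step might look like the following. The Sub name and the variables DSN, UserID, and Password are hypothetical stand-ins for the values read from the form, and the snippet assumes ADO is referenced in the Visual Basic Editor. Setting CursorLocation to adUseClient is an assumption here rather than something the macro is shown to do; client-side cursors are what allow a recordset to be disconnected later.

```vba
' Sketch only: names below are illustrative, not taken from the downloadable macro.
Sub ConnectSketch()
    Dim con1 As ADODB.Connection
    Dim DSN As String, UserID As String, Password As String

    ' Hypothetical values; the real macro reads these from the Excel form.
    DSN = "MyAS400"
    UserID = "MYUSER"
    Password = "MYPASS"

    Set con1 = New ADODB.Connection
    ' Assumption: client-side cursors, so recordsets can be disconnected later.
    con1.CursorLocation = adUseClient
    ' ODBC-style connection string built from DSN, user ID, and password.
    con1.Open "DSN=" & DSN & ";UID=" & UserID & ";PWD=" & Password

    Debug.Print "Connected: " & (con1.State = adStateOpen)
    con1.Close
End Sub
```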
ADO connection objects have a method available called OpenSchema, which can be used to retrieve information about the database that ADO is connected to. The OpenSchema method takes arguments that allow you both to specify the type of information that you want to retrieve about the database and to limit the scope of the query to only certain tables, views, or indexes. The first argument of OpenSchema is an enumerator (which is an integer constant) indicating the schema that you want to open. (There are enumerators for column information, indexes, tables, and other listings that you can ask for.) The second argument is actually an array of arguments that affect records returned by OpenSchema. Each different schema enumeration takes different arguments, which can be found in the Microsoft Platform SDK documentation for database access, ADO, and OLE DB.
Consider the adSchemaColumns enumerator for a moment. This enumerator returns a recordset containing a list of database columns. If it is called with all arguments empty, it returns all columns in all tables in the entire database. (This could take quite a lot of time, so don't try this at home!) Here are two examples of arguments passed to OpenSchema requesting the adSchemaColumns collection:
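Going by the description that follows, the two calls would look roughly like this sketch. The Sub wrapper is hypothetical, con1 is assumed to be an open ADO connection, and F and B are the placeholder library and table names used in the text. For adSchemaColumns, the four array positions are catalog, schema (the library), table name, and column name.

```vba
' Sketch only: reconstructed from the surrounding description.
Sub ColumnSchemaSketch(con1 As ADODB.Connection)
    Dim x As ADODB.Recordset

    ' Example 1: all columns of table B in library F
    Set x = con1.OpenSchema(adSchemaColumns, Array(Empty, "F", "B", Empty))
    x.Close

    ' Example 2: all columns of every table in library F
    Set x = con1.OpenSchema(adSchemaColumns, Array(Empty, "F", Empty, Empty))
    x.Close
End Sub
```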
The keyword Empty specifies that you do not care about that particular argument to the query. The first example creates x as a recordset containing all columns for the B table in library F on your AS/400; the second returns a recordset of all columns for all tables in library F on your AS/400. Once you have the recordsets in hand, you can easily navigate the recordset to find information about columns in your physical files.
The macro detailed in this article is available as an Excel spreadsheet that you can download from the MC Web site at www.midrangecomputing.com/mc/. The only requirements for running the macro are an ODBC connection to your AS/400, such as the Client Access ODBC driver, and a working version of ADO installed and referenced. ADO can be obtained at www.microsoft.com/data. To reference ADO in the spreadsheet, go to Tools/References in the Visual Basic Editor and make sure that ADO is selected.
The macro, a snippet of which is shown in Figure 1 (you can download the code in its entirety at www.midrangecomputing.com/mc/), first generates a recordset containing all column definitions for the specified physical file. This is accomplished by using the OpenSchema method with the adSchemaColumns enumerator and by passing the target library and table name that the spreadsheet user specified. Next, the spreadsheet uses the MoveLast method of the created recordset to move to the last record, ensuring that all column information has been returned to the client recordset. The spreadsheet then sets the ActiveConnection property of the recordset to Nothing, disconnecting the recordset from the AS/400. At this point, the recordset is still open and can still be searched, but all searches and manipulation are local to the client and do not involve positioning the cursor on the AS/400, resulting in a significant performance improvement when navigating the column recordset.
The macro then opens a recordset of indexes available on the target physical file. This is accomplished by using the OpenSchema method with the adSchemaIndexes enumerator and by again passing an array of arguments that indicate the physical file and library you are interested in. The macro goes into a While loop (so it can process each record in the recordset) and begins printing index information into the spreadsheet. Inside the While loop, things get interesting. The first If/Then condition checks whether or not the index name has changed. If it has, the macro moves down the spreadsheet one line and prints both the name of the index (offset 5 in the recordset) and whether or not the index is unique (offset 7). It then sets the IXNAME variable equal to the current index name it is processing so the index name is not printed unless it changes. (Fields returned by the OpenSchema method are also documented in the Microsoft Platform SDK data access documentation.)
Next, the macro examines offset 20 in the recordset. This offset indicates whether the column indexed is returned in ascending or descending order. The macro uses this value to print the appropriate human-readable response on the spreadsheet. The macro then prints the name of the indexed column (offset 17) and uses the column name to set a filter on the column recordset. The filter is just like an SQL WHERE clause. The filter restricts the disconnected recordset to only records that match the filter condition.
The column recordset has a field called COLUMN_NAME. The filter tells ADO to show only those records in the column recordset where COLUMN_NAME equals the name of the indexed column. The other records in the recordset do not disappear; they are just masked by ADO so that, to the program, it appears that only the matching record is in the recordset. A byproduct of setting a filter against a disconnected recordset is that the record pointer moves to the first record that matches the criteria. The next line, If Not cs.EOF, ensures that a matching column definition was found for the indexed column. When a matching column definition is found, the macro prints offset 27 of the column recordset, which holds the column description of the database field. The macro then moves to the next index record via the MoveNext method and loops.
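The filter behavior can be tried without an AS/400 at all. The sketch below builds a small in-memory recordset (the Sub name, column names, and descriptions are made up for illustration) and applies the same kind of filter the macro uses. It is late-bound through CreateObject, so the numeric literals stand in for the ADO constants adUseClient (3), adVarChar (200), and adFilterNone (0).

```vba
' Sketch only: sample data is invented to stand in for AS/400 column definitions.
Sub FilterSketch()
    Dim cs As Object
    Dim FldName As String

    Set cs = CreateObject("ADODB.Recordset")
    cs.CursorLocation = 3                       ' adUseClient
    cs.Fields.Append "COLUMN_NAME", 200, 10     ' adVarChar
    cs.Fields.Append "DESCRIPTION", 200, 50     ' adVarChar
    cs.Open                                     ' no connection: in-memory only

    cs.AddNew
    cs.Fields("COLUMN_NAME").Value = "XQ01A"
    cs.Fields("DESCRIPTION").Value = "Customer Number"
    cs.Update
    cs.AddNew
    cs.Fields("COLUMN_NAME").Value = "XQ02B"
    cs.Fields("DESCRIPTION").Value = "Order Date"
    cs.Update

    ' Mask every row except the one whose COLUMN_NAME matches.
    FldName = "XQ02B"
    cs.Filter = "COLUMN_NAME = '" & FldName & "'"
    If Not cs.EOF Then
        Debug.Print FldName & " -> " & cs.Fields("DESCRIPTION").Value
    End If

    ' Clearing the filter makes the masked rows visible again.
    cs.Filter = 0                               ' adFilterNone
    Debug.Print "Rows after clearing filter: " & cs.RecordCount
    cs.Close
End Sub
```

Referring to fields by name, as this sketch does, is an alternative to the numeric offsets used in the macro; it is a little more verbose but survives changes in column order.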
When all index records have been processed, the macro drops out of the loop and defines the PrintArea of the spreadsheet. PrintArea is a property of the PageSetup object, which is associated with the current worksheet. Defining the PrintArea ensures that, if the spreadsheet user clicks File/Print, the spreadsheet prints only information returned by the last call to the macro and no empty or erroneous cells. I find this to be a nice convenience to add to any spreadsheet, so I wanted to make sure that I pointed the method out to you.
Where Can You Go from Here?
Disconnected recordsets give you the ability to retrieve a recordset from your AS/400 and then peruse it at your leisure during the processing of a macro or script. The technique works best with small recordsets (usually those with fewer than 2,000 records), where it makes sense to retrieve the data once and then use the Filter or Find methods to navigate it during processing. For example, I built a graphical front-end to analyze sales data for a company. The list of salesmen had about 800 records but was needed by several different screens in the application. I retrieved the recordset at the start of the program and never had to hit the AS/400 for that information again. This technique saves bandwidth and makes the application feel much more responsive to users, as the data is retrieved only once but used over and over by different processes and screens.
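As a rough illustration of the Find approach, the sketch below caches a tiny made-up salesman list in memory and looks up one entry without going back to the server. The Sub name, field names, and data are hypothetical; in a real application the recordset would come from a query and then be disconnected. The numeric literals stand in for the ADO constants adUseClient (3), adInteger (3), and adVarChar (200).

```vba
' Sketch only: an in-memory stand-in for a cached, disconnected recordset.
Sub FindSketch()
    Dim salesmen As Object

    Set salesmen = CreateObject("ADODB.Recordset")
    salesmen.CursorLocation = 3                   ' adUseClient
    salesmen.Fields.Append "SLSNO", 3             ' adInteger
    salesmen.Fields.Append "SLSNAME", 200, 30     ' adVarChar
    salesmen.Open

    salesmen.AddNew
    salesmen.Fields("SLSNO").Value = 101
    salesmen.Fields("SLSNAME").Value = "Alvarez"
    salesmen.Update
    salesmen.AddNew
    salesmen.Fields("SLSNO").Value = 102
    salesmen.Fields("SLSNAME").Value = "Baker"
    salesmen.Update

    ' Find searches forward from the current row, so start at the top.
    salesmen.MoveFirst
    salesmen.Find "SLSNO = 102"
    If Not salesmen.EOF Then
        Debug.Print "Found: " & salesmen.Fields("SLSNAME").Value
    Else
        Debug.Print "No match"
    End If
    salesmen.Close
End Sub
```

Find accepts a single-column criterion and leaves the other rows visible, whereas Filter masks the non-matching rows; which one fits depends on whether later code needs to walk the whole set.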
In the index report generated by this spreadsheet, you also have a neat little utility that may help you during optimization of Query/400 and SQL queries. Explore other schemas that can be returned by the OpenSchema method, and you may find other information that you want to retrieve. There's a wealth of info out there.
' ... Code that dims objects and
' ... connects to AS/400 and prints all of the
' ... column headers is here but removed for space
' ... Open CS, which is a recordset
' ... of columns in the target table
ArgArray = Array(Empty, Library, Table, Empty)
Set cs = con1.OpenSchema(adSchemaColumns, ArgArray)
' move to end of recordset and disconnect
cs.MoveLast
Set cs.ActiveConnection = Nothing
' ... Open the list of indexes
ArgArray = Array(Empty, Library, Empty, Empty, Table)
Set rs = con1.OpenSchema(adSchemaIndexes, ArgArray)
' R is the current offset of the spreadsheet
' while we read indexes
While Not rs.EOF
    ' ...code to print index info
    ' ...removed for space
    ' next line gets the field name
    FldName = rs.Fields(17).Value
    ' set the filter on the cs recordset
    ' to only show the current column
    cs.Filter = "COLUMN_NAME = '" & FldName & "'"
    ' if we found the column description
    ' print the column description of the field
    If Not cs.EOF Then
        ActiveCell.Offset(R, 4).Value = cs.Fields(27).Value
    End If
    ' move to the next line of the spreadsheet
    R = R + 1
    ' move to the next index record
    rs.MoveNext
Wend
' set the print area to be only what we just retrieved
Worksheets("TableIndexes").PageSetup.PrintArea = "A1:E" & R + 4
Application.ScreenUpdating = True
' move the pointer to the top of the spreadsheet
Figure 1: This macro snippet retrieves the list of indexes for an AS/400 table and uses a disconnected recordset to add the column descriptions to the spreadsheet.