Marketing wants Web analytics in a simple and modern view to validate campaigns.
Perhaps your ISP gives you canned reports on Web site activity that don't show you what you need to know or don't offer very good historical reporting. Or perhaps you'd like a better way to do more in-depth or ad hoc analysis. If your organization would like to improve how it tracks, reports, and analyzes Web site activity, consider Help/Systems' SEQUEL for the task.
About Web Logs
Web server log files typically contain one record for each time a file is accessed. For example, when a visitor views a Web page containing three images, the server writes four records to the log: one for the HTML page plus three for the image files. Each Web log record is a variable-length character string that identifies the date, time, file name, DNS, referring DNS, browser type, and other information about the "hit." There's a wealth of data in each record, but the challenge is transforming the huge volume of data into useful information.
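To make the record layout concrete, here is a minimal Python sketch that splits one log record into named fields. It assumes the server writes the common Apache/NCSA "combined" format; the article does not say which format your ISP uses, and the sample line, the LOG_PATTERN name, and the parse_record function are illustrative only.

```python
import re
from datetime import datetime

# Assumption: Apache/NCSA "combined" log format. Adjust the pattern
# if your Web server writes a different layout.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_record(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert the bracketed timestamp and the status code to typed values.
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    return record

# Made-up sample record for illustration.
sample = ('192.0.2.10 - - [10/Oct/2006:13:55:36 -0700] '
          '"GET /products/index.html HTTP/1.0" 200 2326 '
          '"http://www.example.com/start.html" "Mozilla/4.08"')
record = parse_record(sample)
```

Lines that do not fit the assumed layout come back as None, so a caller can count or log them rather than silently dropping data.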
That's where SEQUEL comes in.
Transforming Data into Information
ANZLOGFILE is a SEQUEL script that pulls log files from an ISP's server and converts the data into a physical file on the System i. ANZLOGFILE runs from a job scheduler at a time when the day's new Web activity logs are available. Specifically, it…
- Uses a secure login and password to automatically initiate an FTP session with the UNIX, Windows/NT, or IBM Web server.
- FTPs the log file to the System i.
- Executes a view named ANZPARSE1 that parses and converts the variable-length records in the log file into a fixed-format physical file.
- Deletes records from the new file that are not needed for analysis, including records for graphics files (jpg, gif, png…); Cascading Style Sheets (CSS); calls to CGI, PL, and other scripts; internal access; and known Web crawler or spider activity.
- Performs a reverse-DNS lookup and updates the records with a DNS name when one can be determined.
- Executes a view named ANZUPDATE that adds records to a relational row-and-column physical file named LOGPARSE, which holds historical information used for analysis and reporting. The SEQUEL views, scripts, and tables described below use the LOGPARSE file.
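The filtering and reverse-DNS steps in the list above can be sketched outside SEQUEL as follows. This is not the ANZLOGFILE script itself: the exclusion lists, the internal address prefixes, and the function names are hypothetical and would need tuning for a real site.

```python
import socket

# Hypothetical exclusion lists; tune these to your own site.
SKIP_EXTENSIONS = (".jpg", ".gif", ".png", ".css", ".cgi", ".pl")
CRAWLER_MARKERS = ("bot", "crawler", "spider")
INTERNAL_PREFIXES = ("10.", "192.168.")

def keep_for_analysis(path, agent, ip):
    """Return True if a hit should stay in the analysis file."""
    page = path.split("?", 1)[0].lower()  # ignore any query string
    if page.endswith(SKIP_EXTENSIONS):
        return False  # graphics, style sheets, and scripts
    if ip.startswith(INTERNAL_PREFIXES):
        return False  # internal access
    agent = agent.lower()
    return not any(marker in agent for marker in CRAWLER_MARKERS)

def host_name(ip, lookup=socket.gethostbyaddr):
    """Reverse-DNS lookup; returns None when no name can be determined."""
    try:
        return lookup(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are subclasses
        return None
```

The lookup parameter exists only so the function can be exercised without network access; in normal use the default socket.gethostbyaddr does the work.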
How Much Activity Does Each Page Generate?
Figure 1: PAGESUM summarizes page hits by date.
Figure 2: TARGETSUM delivers an access count for target pages.
Figure 3: TARGETURL shows the visitor's IP address and host name.
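For readers who want to see the arithmetic behind summaries like these, the sketch below counts hits per page per date and total hits per page. It is a Python illustration, not the PAGESUM or TARGETSUM view definitions; the sample rows and function names are made up and stand in for records from a history file such as LOGPARSE.

```python
from collections import Counter
from datetime import date

# Made-up sample rows: (hit date, target page).
hits = [
    (date(2006, 10, 10), "/index.html"),
    (date(2006, 10, 10), "/products.html"),
    (date(2006, 10, 10), "/index.html"),
    (date(2006, 10, 11), "/index.html"),
]

def summarize_by_date(rows):
    """Count hits for each (date, page) pair."""
    return Counter(rows)

def summarize_by_page(rows):
    """Count hits for each target page across all dates."""
    return Counter(page for _, page in rows)

by_date = summarize_by_date(hits)
by_page = summarize_by_page(hits)
```

Calling by_page.most_common() lists pages from most to least visited, which is the natural order for a target-page report.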
Where Do Visitors Come From?
Figure 4: REFERSUM identifies what sites are referring visitors to your Web site.
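A referring-site summary boils down to extracting the host from each referrer URL and counting. The Python sketch below illustrates the idea under two assumptions that are not from the article: that an empty or "-" referrer means the visitor arrived directly, and that www.example.com stands in for your own site, whose internal page-to-page referrals should be skipped.

```python
from collections import Counter
from urllib.parse import urlsplit

def referring_sites(referrers, own_host="www.example.com"):
    """Count hits per referring host, skipping blanks and our own site."""
    counts = Counter()
    for url in referrers:
        host = urlsplit(url).hostname  # None for "-" or an empty string
        if host and host != own_host:
            counts[host] += 1
    return counts

# Made-up referrer values for illustration.
sample_referrers = [
    "http://search.example.org/q?x=1",
    "http://Search.example.org/",
    "-",
    "http://www.example.com/a.html",
    "",
]
site_counts = referring_sites(sample_referrers)
```

Because urlsplit lowercases the host name, differently capitalized links from the same site are counted together.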
Figure 5: SOURCECNT2 shows visitor URL, host, pages visited, and total time spent at the site.
What Visitors Come to the Web Site?
SOURCECNT2 shows the source URL and host for each visitor and summary information on the number of pages visited and the total time spent at the site.
Figure 6: SOURCECNT2 shows visitors' source URL and host.
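The per-visitor totals can be approximated with a simple grouping step. In the Python sketch below, total time at the site is taken as the gap between a visitor's first and last hit; that is an assumption made for illustration, since the article does not say how SOURCECNT2 computes it, and the function name is hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def summarize_visitors(rows):
    """rows: iterable of (host, timestamp, page).

    Returns a dict mapping host -> (pages visited, seconds at site).
    """
    times = defaultdict(list)
    for host, when, _page in rows:
        times[host].append(when)
    return {
        host: (len(stamps), int((max(stamps) - min(stamps)).total_seconds()))
        for host, stamps in times.items()
    }

# Made-up sample rows for illustration.
visits = [
    ("a.example.net", datetime(2006, 10, 10, 13, 0, 0), "/index.html"),
    ("a.example.net", datetime(2006, 10, 10, 13, 2, 30), "/products.html"),
    ("a.example.net", datetime(2006, 10, 10, 13, 5, 0), "/contact.html"),
    ("b.example.net", datetime(2006, 10, 10, 14, 0, 0), "/index.html"),
]
visitor_summary = summarize_visitors(visits)
```

A visitor with a single hit shows zero seconds under this definition, because no later hit is available to mark the end of the visit.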
From here, you can right-click any line to drill down into the detail on which pages that visitor viewed. This drill-down view, called SOURCEURL, provides the results shown below. The Time and Target Page fields show which pages were viewed, the order in which they were viewed, and the time spent on each page.
Figure 7: SOURCEURL provides page-view details.
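Time spent on each page is not recorded in a Web log directly; one common way to estimate it is the gap between a hit and the visitor's next hit. The Python sketch below uses that estimate, which is an assumption rather than a description of how SOURCEURL works, and the page_trail name is hypothetical.

```python
from datetime import datetime

def page_trail(rows):
    """rows: (timestamp, page) pairs for one visitor.

    Returns a list of (timestamp, page, seconds on page) in viewing order.
    """
    ordered = sorted(rows)
    trail = []
    for i, (when, page) in enumerate(ordered):
        if i + 1 < len(ordered):
            seconds = int((ordered[i + 1][0] - when).total_seconds())
        else:
            seconds = None  # no later hit, so time on the last page is unknown
        trail.append((when, page, seconds))
    return trail

# Made-up hits for one visitor, deliberately out of order.
one_visitor = [
    (datetime(2006, 10, 10, 13, 2, 30), "/products.html"),
    (datetime(2006, 10, 10, 13, 0, 0), "/index.html"),
    (datetime(2006, 10, 10, 13, 5, 0), "/contact.html"),
]
trail = page_trail(one_visitor)
```

The last page in a visit gets None rather than a guessed duration, since the log holds no later request to measure against.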
These SEQUEL examples show only a small sample of the ways you can "slice and dice" Web log information. Once you create your own historical LOGPARSE file, there's no limit to the number of additional SEQUEL views, graphs, and reports you can create to analyze Web site activity.