Partner TechTip: Analyze Web Site Activity Through Graphs and Reports


Marketing wants Web analytics in a simple and modern view to validate campaigns.


Perhaps your ISP gives you canned reports on Web site activity that don't show you what you need to know or don't offer very good historical reporting. Or perhaps you'd like a better way to do more in-depth or ad hoc analysis. If your organization would like to improve how it tracks, reports, and analyzes Web site activity, consider Help/Systems' SEQUEL for the task.

About Web Logs

Web server log files typically contain one record for each file access. For example, when a visitor views a Web page containing three images, the server writes four records to the log: one for the HTML page plus three for the image files. Each Web log record is a variable-length character string that identifies the date, time, file name, DNS, referring DNS, browser type, and other information about the "hit." Each record holds a wealth of data, but the challenge is transforming the huge volume of records into useful information.
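To make the record layout concrete, here is a minimal Python sketch of parsing such a record. It assumes the widely used NCSA Combined Log Format, which the article does not actually name; the sample lines, addresses, and the `parse_line` helper are invented for illustration, and the real logs handled by SEQUEL may be laid out differently.

```python
import re
from datetime import datetime

# Assumption: NCSA Combined Log Format. The article does not say which
# format the Web server writes, so adjust the pattern to the real logs.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record

# Invented sample: one page view with three images produces four records.
PREFIX = '203.0.113.7 - - [12/Mar/2007:10:15:02 -0600] "GET '
SUFFIX = ' HTTP/1.1" 200 5120 "http://www.example.com/" "Mozilla/5.0"'
sample_log = [
    PREFIX + path + SUFFIX
    for path in ("/index.html", "/img/a.gif", "/img/b.gif", "/img/c.gif")
]

records = [parse_line(line) for line in sample_log]
for record in records:
    print(record["time"].date(), record["host"], record["path"])
```

Named groups keep each field addressable by name, which is roughly what a fixed-format file provides once the variable-length string has been split apart.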


That's where SEQUEL comes in.

Transforming Data into Information

ANZLOGFILE is a SEQUEL script that pulls log files from an ISP's server and converts the data into a physical file on the System i. ANZLOGFILE runs from a job scheduler at a time when the day's new Web activity logs are available. Specifically, it…

  • Uses a secure login and password to automatically initiate an FTP session with the UNIX, Windows/NT, or IBM Web server.
  • FTPs the log file to the System i.
  • Executes a view named ANZPARSE1 that parses and converts the variable-length records in the log file into a fixed-format physical file.
  • Deletes records from the new file that are not needed for analysis, including records for graphics files (jpg, gif, png…); Cascading Style Sheets (CSS); calls to CGI, PL, and other scripts; internal access; and known Web crawler or spider activity.
  • Performs a reverse-DNS lookup and updates the records with a DNS name, when it can be determined.
  • Executes a view named ANZUPDATE that adds records to a relational row-and-column physical file named LOGPARSE, which holds historical information used for analysis and reporting. The SEQUEL views, scripts, and tables described below use the LOGPARSE file.
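The filtering and reverse-DNS steps in the list above can be sketched in Python as follows. This is not the ANZLOGFILE script itself, which is written in SEQUEL; the extension set covers only the types the article names, and the crawler markers, internal address prefixes, and record field names are hypothetical stand-ins.

```python
import posixpath
import socket
from urllib.parse import urlsplit

# Only the file types the article names; a real list would be longer.
SKIP_EXTENSIONS = {".jpg", ".gif", ".png", ".css", ".cgi", ".pl"}
# Hypothetical values, chosen for illustration only.
CRAWLER_MARKERS = ("bot", "crawler", "spider")
INTERNAL_PREFIXES = ("10.", "192.168.")

def keep_for_analysis(record):
    """Return True if the record should survive the cleanup step."""
    path = urlsplit(record["path"]).path.lower()
    if posixpath.splitext(path)[1] in SKIP_EXTENSIONS:
        return False
    if record["host"].startswith(INTERNAL_PREFIXES):
        return False
    agent = record["agent"].lower()
    return not any(marker in agent for marker in CRAWLER_MARKERS)

def reverse_dns(ip_address):
    """Return the host name for an IP address, or None if it cannot be found."""
    try:
        return socket.gethostbyaddr(ip_address)[0]
    except OSError:
        return None

# Invented sample records using the assumed field names.
sample_records = [
    {"host": "203.0.113.7", "path": "/index.html", "agent": "Mozilla/5.0"},
    {"host": "203.0.113.7", "path": "/img/logo.gif?v=2", "agent": "Mozilla/5.0"},
    {"host": "192.168.1.20", "path": "/index.html", "agent": "Mozilla/5.0"},
    {"host": "198.51.100.4", "path": "/index.html", "agent": "ExampleBot/1.0"},
]

kept = [r for r in sample_records if keep_for_analysis(r)]
print(len(kept), "of", len(sample_records), "records kept")
```

`reverse_dns` is defined but not called here because it needs network access; in a real run it would be applied to each kept record's address, leaving the name empty when the lookup fails.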

How Much Activity Does Each Page Generate?

PAGESUM summarizes page hits by date for each file in the Web site. Web site developers and managers can quickly determine the effects of Web or email promotions based upon the hit statistics for each target page.



Figure 1: PAGESUM summarizes page hits by date.
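The kind of summary PAGESUM produces can be approximated outside SEQUEL with a simple grouped count. The sketch below is only an illustration in Python; the hit data is invented and the variable names are hypothetical, not fields of the LOGPARSE file.

```python
from collections import Counter
from datetime import date

# Invented (date, page) pairs standing in for rows of the history file.
hits = [
    (date(2007, 3, 12), "/index.html"),
    (date(2007, 3, 12), "/index.html"),
    (date(2007, 3, 12), "/promo.html"),
    (date(2007, 3, 13), "/promo.html"),
    (date(2007, 3, 13), "/promo.html"),
]

# Count hits for every (date, page) combination.
page_hits_by_date = Counter(hits)

for (day, page), count in sorted(page_hits_by_date.items()):
    print(day.isoformat(), page, count)
```

Comparing the counts for a promotion's target page before and after the campaign date is the sort of check the paragraph above describes.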


TARGETSUM delivers a total access count for each target page, sorted in descending order. From here, the Web site manager can run either of two drill-down options to explore more detail.



Figure 2: TARGETSUM delivers an access count for target pages.
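As a rough Python equivalent of the TARGETSUM idea (a total per target page, highest first), consider the sketch below. The page names and counts are made up, and this is not the SEQUEL view definition.

```python
from collections import Counter

# Invented list of target pages, one entry per recorded hit.
target_pages = [
    "/promo.html", "/index.html", "/promo.html",
    "/contact.html", "/promo.html", "/index.html",
]

# Total accesses per page; most_common() sorts in descending order of count.
target_totals = Counter(target_pages).most_common()

for page, total in target_totals:
    print(f"{total:5d}  {page}")
```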


From the TARGETSUM view, the manager can right-click a target page and either run the TARGETURL view, which shows each visitor's source IP address and host name (if available), or display daily access statistics for that target page.



Figure 3: TARGETURL shows the visitor's IP address and host name.


Where Do Visitors Come From?

REFERSUM identifies which sites are referring visitors to your Web site. This is especially useful for tracking marketing or promotional results from different sources. It also shows how much activity originates from each search engine or business partner.



Figure 4: REFERSUM identifies what sites are referring visitors to your Web site.
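A referrer summary of this kind boils down to extracting the host from each referring URL and counting. The Python sketch below illustrates that under assumptions: the referrer strings are invented, a lone "-" is assumed to mean that no referrer was logged, and `referring_site` is a hypothetical helper rather than part of REFERSUM.

```python
from collections import Counter
from urllib.parse import urlsplit

# Invented referrer values; "-" is assumed to mean "no referrer logged".
referrers = [
    "http://www.google.com/search?q=query+tools",
    "http://www.google.com/search?q=reporting",
    "http://partner.example.com/links.html",
    "-",
]

def referring_site(referrer):
    """Return the host part of a referring URL, or a placeholder if absent."""
    host = urlsplit(referrer).hostname
    return host if host else "(none)"

referral_counts = Counter(referring_site(r) for r in referrers)

for site, count in referral_counts.most_common():
    print(site, count)
```

Grouping by host rather than by full URL keeps every search from the same engine in one row, which matches the per-site view the section describes.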



Figure 5: SOURCECNT2 shows visitor URL, host, pages visited, and total time spent at the site.

Which Visitors Come to the Web Site?

SOURCECNT2 shows the source URL and host for each visitor and summary information on the number of pages visited and the total time spent at the site.



Figure 6: SOURCECNT2 shows visitors' source URL and host.
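The per-visitor summary can be sketched in Python as shown below. The article does not say how SOURCECNT2 computes time spent at the site, so this sketch assumes it is the span between a visitor's first and last recorded page view; the hosts, timestamps, and pages are invented.

```python
from collections import defaultdict
from datetime import datetime

# Invented (visitor host, timestamp, page) rows.
views = [
    ("203.0.113.7", datetime(2007, 3, 12, 10, 15, 0), "/index.html"),
    ("203.0.113.7", datetime(2007, 3, 12, 10, 16, 30), "/products.html"),
    ("203.0.113.7", datetime(2007, 3, 12, 10, 20, 0), "/contact.html"),
    ("198.51.100.4", datetime(2007, 3, 12, 11, 0, 0), "/index.html"),
]

# Collect every page-view timestamp under its visitor.
times_by_visitor = defaultdict(list)
for host, when, _page in views:
    times_by_visitor[host].append(when)

# Assumption: time at site = last view minus first view for that visitor.
visitor_summary = {
    host: (len(times), max(times) - min(times))
    for host, times in times_by_visitor.items()
}

for host, (pages, time_at_site) in visitor_summary.items():
    print(host, pages, time_at_site)
```

A visitor with a single page view shows zero time under this assumption, because the log records when a page was requested, not when the visitor left.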


From here, you can right-click any line to drill down into the detail on which pages that visitor viewed. This view, called SOURCEURL, provides the results shown below. The Time and Target Page fields show which pages were viewed, the order in which they were viewed, and the time spent on each page.



Figure 7: SOURCEURL provides page-view details.
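One plausible way to derive time spent on each page is to subtract each page view's timestamp from the next one for the same visitor. The Python sketch below assumes that approach, which the article does not confirm; `page_durations` and the sample visit are hypothetical.

```python
from datetime import datetime

def page_durations(visits):
    """List (time, page, seconds on page) for one visitor, in viewing order.

    Assumption: time on a page is the gap until the visitor's next request.
    The last page has no following request, so its duration is None.
    """
    ordered = sorted(visits)
    result = []
    for index, (when, page) in enumerate(ordered):
        if index + 1 < len(ordered):
            seconds = (ordered[index + 1][0] - when).total_seconds()
        else:
            seconds = None
        result.append((when.time().isoformat(), page, seconds))
    return result

# Invented page views for a single visitor, deliberately out of order.
one_visit = [
    (datetime(2007, 3, 12, 10, 16, 30), "/products.html"),
    (datetime(2007, 3, 12, 10, 15, 0), "/index.html"),
    (datetime(2007, 3, 12, 10, 20, 0), "/contact.html"),
]

detail = page_durations(one_visit)
for time_text, page, seconds in detail:
    print(time_text, page, seconds)
```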


These SEQUEL examples show a small portion of how you can "slice and dice" Web log information. Once you create your own historical LOGPARSE file, there's no limit to the number of additional SEQUEL views, graphs, and reports you can create to analyze Web site activity.