Web Developer's Virtual Library: Encyclopedia of Web Design Tutorials, Articles and Discussions
Storing the Data - Page 7

December 14, 2001

Now that we're successfully parsing the individual elements out of each line in the log file, what are we going to do with them? It's time to think about what we want to keep track of, and how to represent it in our data structure. One good thing to track is the time of the first and last access processed. Printed in our report, this shows what range of time the analyzed log file lines cover. Another obvious candidate is the number of raw hits in the log file. Similarly, we can track the total amount of data (in megabytes) sent out by the server, and the number of HTML page views. We'll begin implementing these features by adding the following to the top of the log_report.plx script, just before the start of the while loop that parses the log file lines:

my($begin_time, $end_time, $total_hits, $total_mb, $total_views);

This establishes a number of scalar variables that will be visible throughout the script, and will be used to store the various categories of information we're interested in tracking. Now, at the end of the while loop, we'll comment out that debugging print statement and add the new lines shown here in order to store those various pieces of data:

#  print join "\n", $host, $ident_user, $auth_user, $date, $time,
#    $time_zone, $method, $url, $protocol, $status,
#    $bytes, $referer, $agent, "\n";

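  # Record the timestamp of the first line only; later lines leave it alone.
  unless ($begin_time) {
    $begin_time = "$date:$time";
  }
  # Overwritten on every line, so it ends up holding the last access.
  $end_time = "$date:$time";

  ++$total_hits;
  $total_mb += ($bytes / (1024 * 1024));  # bytes to megabytes

  # Skip image requests; we don't care about these for visit-tracking purposes.
  next if $url =~ /\.(gif|jpg|jpeg|png|xbm)$/i;

  ++$total_views;
  &store_line($host, $date, $time, $url, $referer, $agent);
}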

We put the assignment to $begin_time inside an unless block that checks whether the variable has already been assigned, so it is assigned only when the first line of the log file is processed. The $end_time variable is simply overwritten with the current values of $date and $time for every line, so that we end up with the date and time of the last access when we're done parsing the log file. Adding one to $total_hits each time through the loop with the auto-increment operator (++) is easy enough to understand. $total_mb is updated using the += operator, which does what you would probably guess: it takes the number on the right and adds it to the contents of the variable on the left, storing the new sum in that variable. It is thus the equivalent of:

$total_mb = $total_mb + ($bytes / (1024 * 1024));

except it's a bit easier to write. Dividing $bytes by the product of 1024 * 1024 simply converts that number to megabytes. The next line uses that handy condensed form of an if statement: do something if something else. In this case, it says to bail out and go to the next cycle through the while loop (which in this case means going to the next line in the log file) if the contents of $url end in .gif, .jpg, .jpeg, .png, or .xbm. This reflects the fact that we're only interested in actual "page views" at this point, and don't care about the image files whose requests also end up in the log file. We could instead have used something like:

next unless $url =~ /\.html?$/;

which would skip to the next line from the log file unless the current line's $url ended in .htm or .html, but this would also skip requests for CGI scripts and for directories that return a default page such as index.html. It probably makes sense to count those requests in $total_views. Now that we've gotten rid of those extraneous log file entries, it's time to add one to $total_views. Finally, we invoke a subroutine called &store_line with the arguments $host, $date, $time, $url, $referer, and $agent. We'll use that subroutine to generate statistics on something more interesting: the activities of the individual visitors to our site.
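
To see the counters at work without a real log file, here is a minimal standalone sketch of the same bookkeeping. It is not part of log_report.plx: the @records list, its date format, and the $html_only_views counter are made up for illustration, and the call to &store_line is left out. It also assumes the bytes field may be a "-" (as Common Log Format writes when no content is sent) and treats that as zero, which the script above does not do.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, already-parsed log entries: [ date, time, url, bytes ].
# These values are invented for this sketch only.
my @records = (
    [ '14/Dec/2001', '10:00:01', '/index.html',         2048    ],
    [ '14/Dec/2001', '10:00:02', '/images/logo.gif',    1048576 ],
    [ '14/Dec/2001', '10:05:30', '/',                   4096    ],
    [ '14/Dec/2001', '10:07:12', '/cgi-bin/search.cgi', 512     ],
    [ '14/Dec/2001', '10:09:45', '/photos/cat.JPG',     524288  ],
    [ '14/Dec/2001', '10:11:03', '/about.htm',          '-'     ],
);

my ($begin_time, $end_time, $total_hits, $total_mb, $total_views);
my $html_only_views = 0;  # what the stricter /\.html?$/ rule would count

for my $record (@records) {
    my ($date, $time, $url, $bytes) = @$record;

    # Assumption: a "-" in the bytes field means nothing was sent.
    $bytes = 0 if $bytes eq '-';

    # Set once, on the first record; overwritten never again.
    $begin_time = "$date:$time" unless $begin_time;
    # Overwritten every time, so the last record wins.
    $end_time = "$date:$time";

    ++$total_hits;
    $total_mb += $bytes / (1024 * 1024);

    # The alternative rule: count only URLs ending in .htm or .html.
    ++$html_only_views if $url =~ /\.html?$/;

    # The rule used in the script: skip image requests, count the rest.
    next if $url =~ /\.(gif|jpg|jpeg|png|xbm)$/i;
    ++$total_views;
}

printf "Covers %s to %s\n", $begin_time, $end_time;
printf "Hits: %d  MB: %.2f  Views: %d  (html-only rule: %d)\n",
    $total_hits, $total_mb, $total_views, $html_only_views;
```

With this sample data, the image-skipping rule counts four page views (including the directory request and the CGI script), while the html-only rule counts just two, which is the difference described above.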

Perl for Web Site Management