Fred Morris Consulting

Apache Web Log Parser User's Guide

Version: 1.2

Copyright (c) 2001 by Fred Morris & Fred Morris Consulting, Seattle WA. All rights reserved except as specifically granted by the license. Mailing address: 2542 Westlake N, Seattle WA USA 98109 Telephone: 206.297.6344 E-mail: m3047@inwa.net


Java and JDK are registered trademarks of Sun Microsystems Inc.
CodeWarrior is a registered trademark of Metrowerks, a subsidiary of Motorola.
Macintosh Runtime for Java and MacOS are registered trademarks of Apple Computer.
Apache is the standard mark for the Apache web server, a testament to the collaborative power of the internet community.


revision date: 18-Oct-2001
revision by: Fred Morris


Overview

This applet parses standard Apache web logs into a tab-delimited format and writes them to a local file. The applet was written with CodeWarrior and targeted to JDK 1.1. There are some known issues with MRJ 2.2.5 on the Macintosh which are documented in the on-line help.
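
To give a feel for the conversion, here is a minimal sketch of turning one log line into a tab-delimited record. It is not the applet's own code. It assumes the Apache Common Log Format (host, ident, user, timestamp, request, status, bytes), it uses java.util.regex from a newer JDK than the 1.1 the applet targets, and the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the applet's source.
final class CommonLogLineSketch {
    // Assumed layout: host ident user [timestamp] "request" status bytes
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\S+)$");

    /** Returns the fields joined with tabs, or null if the line does not match. */
    static String toTabDelimited(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        StringBuilder out = new StringBuilder();
        for (int i = 1; i <= m.groupCount(); i++) {
            if (i > 1) {
                out.append('\t');
            }
            out.append(m.group(i));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "
            + "\"GET /apache_pb.gif HTTP/1.0\" 200 2326";
        System.out.println(toTabDelimited(line));
    }
}
```

The brackets around the timestamp and the quotes around the request are dropped in this sketch; whether the applet keeps them is not stated in this guide.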

Security Considerations

This is an unsigned applet, and it reads and writes local files. You wouldn't want just any applet to be able to read and write files on your local file system! You need to keep that in mind, and you will have to make some adjustments to your applet security settings in order to use it.

The basic change you will have to make is to allow (unsigned) applets read and write access to your local file system. Because of the inherent risks in this, one of two courses of action is recommended:

Using the Applet

The applet is simple to use.

  1. Select an input file or specify a URL from which to read the log file.
  2. Select an output file.
  3. Set processing options if desired.
  4. Turn on headers if desired.
  5. Click the Process button.

Main Window

The main window is the pane that you see when you load the applet. It has the following features:

Set Input File button

Clicking this button brings up an input file dialog, allowing you to select the input file. Clicking it also clears any URL that has been specified.

Input File directory and file name display

The currently selected input file is displayed here.

Input URL display

Instead of specifying an input file, you can type the URL from which to retrieve the log file in this text field. Typing anything in this field clears the input file directory and name fields.

Set Output File button

Clicking this button brings up an output file dialog, allowing you to specify an output file.

Output File directory and file name display

The currently targeted output file is displayed here.

Headers checkbox

When this checkbox is checked, headers are written as the first line of the output file.

Options button

Clicking this button brings up the options window.

Process button

Clicking this button causes the input file/URL to be processed and the output file to be written.

Help button

Clicking this button displays the on-line help window.

Message area

Processing messages are displayed here.

Options Window

Checkboxes allow you to specify which fields are included in the output. Text fields allow you to specify selection criteria based on the fields. A field doesn't have to be included in the output in order to select on it.

Selection Logic

Criteria entered into different fields are ANDed, except for query words.

Criteria are inclusive, meaning that if a criterion is entered for a field then it must be satisfied; if none is entered, then no selection is performed based on that field.

Some fields match a substring, and some support less than/greater than logic, as follows:

Substring matching fields:

Less than/greater than fields:

Substring matches are not case sensitive. The syntax for less than/greater than fields is:

<search expression> :=
    {<greater-than-term>} {<less-than-term>}

<greater-than-term> :=
    >{=} <term>

<less-than-term> :=
    <{=} <term>

Syntax is checked when you click the Process button; any errors are displayed in the message area.
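
For the curious, the grammar above can be sketched as a small parser. This is an illustration under assumptions, not the applet's implementation: it assumes that whitespace between the operator and the term is optional, that a term contains no whitespace, '<', '>' or '=' characters, and that the terms are integers when a value is tested. The class name and its methods are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the applet's source.
final class RangeExpressionSketch {
    // Optional ">" or ">=" term, followed by an optional "<" or "<=" term.
    // Assumption: a term contains no whitespace and none of '<', '>', '='.
    private static final Pattern EXPR = Pattern.compile(
        "^\\s*(?:>(=?)\\s*([^\\s<>=]+))?\\s*(?:<(=?)\\s*([^\\s<>=]+))?\\s*$");

    final String lower;           // null when no greater-than term was given
    final boolean lowerInclusive; // true for ">="
    final String upper;           // null when no less-than term was given
    final boolean upperInclusive; // true for "<="

    private RangeExpressionSketch(String lower, boolean lowerInclusive,
                                  String upper, boolean upperInclusive) {
        this.lower = lower;
        this.lowerInclusive = lowerInclusive;
        this.upper = upper;
        this.upperInclusive = upperInclusive;
    }

    /** Parses an expression such as ">=100 <500"; throws if the syntax is wrong. */
    static RangeExpressionSketch parse(String text) {
        Matcher m = EXPR.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Bad range expression: " + text);
        }
        return new RangeExpressionSketch(
            m.group(2), "=".equals(m.group(1)),
            m.group(4), "=".equals(m.group(3)));
    }

    /** Tests a numeric value, assuming both terms are integers. */
    boolean acceptsNumber(long value) {
        if (lower != null) {
            long lo = Long.parseLong(lower);
            if (lowerInclusive ? value < lo : value <= lo) {
                return false;
            }
        }
        if (upper != null) {
            long hi = Long.parseLong(upper);
            if (upperInclusive ? value > hi : value >= hi) {
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, RangeExpressionSketch.parse(">=100 <500") accepts 100 and 499 but rejects 99 and 500, and an expression with the terms in the wrong order is rejected as a syntax error.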

Dates in Selection Expressions

Dates must be specified in the same format as they appear in the Apache log:

dd/Mmm/yyyy

For instance:

30/Jul/1958

or

01/Jan/2000
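
A date in this format can be read with java.text.SimpleDateFormat, as in the sketch below. This shows one way to do it, not necessarily how the applet does it. The pattern string "dd/MMM/yyyy" and the use of java.util.Locale.US are our reading of the format described here, and the class and method names are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical illustration only; not the applet's source.
final class LogDateSketch {
    /** Parses a date such as 30/Jul/1958 using US month abbreviations. */
    static Date parseLogDate(String text) throws ParseException {
        SimpleDateFormat format = new SimpleDateFormat("dd/MMM/yyyy", Locale.US);
        format.setLenient(false); // reject impossible dates such as 31/Feb/2000
        return format.parse(text);
    }

    /** Formats a date back into the dd/Mmm/yyyy form used in the log. */
    static String formatLogDate(Date date) {
        return new SimpleDateFormat("dd/MMM/yyyy", Locale.US).format(date);
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(formatLogDate(parseLogDate("01/Jan/2000")));
    }
}
```

A new SimpleDateFormat is created on each call because instances are not safe to share between threads.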

Node fields

Node fields refer to the depth in the file spec. Node 1 refers to the top-level directory name, and so forth.
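
As an illustration of node numbering, the sketch below pulls node n out of a request path. It rests on assumptions that this guide does not spell out: the path is passed without any query string, empty segments from leading or doubled slashes are skipped, and the final segment (the file name) counts as a node like any directory. The class and method names are hypothetical.

```java
// Hypothetical illustration only; not the applet's source.
final class NodeFieldSketch {
    /** Returns node n (1-based) of a path, or "" when the path is shallower than n. */
    static String node(String path, int n) {
        int seen = 0;
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) {
                continue; // skip the empty piece before a leading slash
            }
            seen++;
            if (seen == n) {
                return segment;
            }
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(node("/docs/guides/index.html", 1));
    }
}
```

For the path /docs/guides/index.html, node 1 is docs and node 2 is guides under these assumptions.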

Query Word fields

Query words are ORed together (before being ANDed with any other fields). The order in which the query words are specified is irrelevant.
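
The way the pieces combine can be sketched as follows. This is only a sketch of the logic described in this section. It assumes, for the sake of the example, that the host is a substring-matching field and that query words are matched as case-insensitive substrings of the query string; the guide does not say either of these, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the applet's source.
final class SelectionLogicSketch {
    /** True when the criterion is empty or is a case-insensitive substring of the field. */
    static boolean substring(String field, String criterion) {
        return criterion.isEmpty()
            || field.toLowerCase(Locale.US).contains(criterion.toLowerCase(Locale.US));
    }

    /** True when no words were given, or the query contains at least one of them (OR). */
    static boolean anyQueryWord(String query, List<String> words) {
        if (words.isEmpty()) {
            return true; // no criteria entered: no selection on this field
        }
        String haystack = query.toLowerCase(Locale.US);
        for (String word : words) {
            if (haystack.contains(word.toLowerCase(Locale.US))) {
                return true;
            }
        }
        return false;
    }

    /** Combines the per-field results with AND. */
    static boolean selected(String host, String hostCriterion,
                            String query, List<String> words) {
        return substring(host, hostCriterion) && anyQueryWord(query, words);
    }

    public static void main(String[] args) {
        System.out.println(selected("www.example.com", "EXAMPLE",
            "q=apache+logs", Arrays.asList("perl", "logs")));
    }
}
```

A line is kept only if every field with a criterion is satisfied, while the query words as a group are satisfied as soon as any one of them is found.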

Miscellaneous Implementation and Operational Notes

Locale

The locale is coded as java.util.Locale.US. This affects the format in which dates and times can be specified in selection expressions, and is based on the assumption that Apache logs are not localized... if I'm wrong about this, somebody please tell me!

Anonymous only

This release only handles anonymous requests; lines that contain a username are flagged as errors. If you want a version which handles validated users, contact us.

A message from our sponsor...

Licensing Details

This is a copyrighted work; we make it available to you free of charge and for you to use at your own risk. You assume all risks and liability associated with its use and behavior, and agree to bear the costs of remedying any and all damages associated therewith; it is not guaranteed to be bug-free, and there may be security risks associated with its use.

Because of the inherent security risks, we allow you to copy it to your local hard drive directly from the web site.

This license to copy to mitigate known security issues is not a generalized license to copy. You must copy it in its entirety. You may not make any alterations, beyond replacing logo.gif with a picture of your dog, car, favorite flower, or significant other; you may not replace logo.gif with a graphic which conveys branding or ownership by other than Fred Morris Consulting. You may not copy it to your intranet or another web server. You may not distribute copies to others. You may under no circumstances reverse-engineer it or create derivative works.

This license may be revoked at any time, in which case you must cease use. This is not likely to happen, but I need to cover my butt. I may produce a future version or a derivative work which does cost something to license; if my ISP ever complains about the system load, I may need to find a way to collect something to cover their costs. On the other hand if it gets popular enough maybe I'll let one of the Apache support sites host it as a service to the community.

If you do not agree to the terms of this license, you should cease use immediately and destroy any and all copies in your possession. Violating this license agreement is de facto proof that you do not agree to the terms of this license.

Use of this applet in violation of this license makes you liable for damages under copyright law, and may additionally be construed as a dilution or disparagement of the lawful holder's rights in this work and of the reputation of Fred Morris and Fred Morris Consulting. We may seek to recover court costs at our discretion.

Under no circumstances shall the ISP hosting this applet, Innovative Access of Washington (INWA), be held liable for damages; INWA has no association with this work, beyond providing hosting for the account of Fred Morris and Fred Morris Consulting.

There is roughly a month's work in this little gizmo. If I was writing it for someone, I'd probably charge them $5000US. Keep that in mind. Postcards and friendly e-mails are always welcome! ;-)

Any legal actions pertaining to this license or to the use of the applet or in any other way predicated upon the existence or involvement of this applet will be brought in a court in the State of Washington USA or as geographically close thereto as possible. If any portion of this license is ruled invalid or unenforceable, the remaining provisions shall remain in effect. This license is a contract which will be construed in accordance with the laws of the State of Washington.

Want to do something with this applet which isn't allowed under the terms of this license? Contact us!

Yes, we do Java... among other things

Fred Morris Consulting has been in business since 1984. Areas of particular focus have included VAX/VMS; TCP/IP networking; warehousing, data acquisition and industrial control systems; biomathematics; printing. We've also worked with a variety of flavors of u*ix as well as MacOS and Microsoft operating systems. We've worked with pretty much every language and platform out there except for IBM mainframes. There is only one thing we dread, and that is MFC: ick ick ick!