Blame | Last modification | View Log | RSS feed
## Sample Webalizer configuration file# Copyright 1997-2000 by Bradford L. Barrett (brad@mrunix.net)## Distributed under the GNU General Public License. See the# files "Copyright" and "COPYING" provided with the webalizer# distribution for additional information.## This is a sample configuration file for the Webalizer (ver 2.01)# Lines starting with pound signs '#' are comment lines and are# ignored. Blank lines are skipped as well. Other lines are considered# as configuration lines, and have the form "ConfigOption Value" where# ConfigOption is a valid configuration keyword, and Value is the value# to assign that configuration option. Invalid keyword/values are# ignored, with appropriate warnings being displayed. There must be# at least one space or tab between the keyword and its value.## As of version 0.98, The Webalizer will look for a 'default' configuration# file named "webalizer.conf" in the current directory, and if not found# there, will look for "/etc/webalizer.conf".# LogFile defines the web server log file to use. If not specified# here or on on the command line, input will default to STDIN. If# the log filename ends in '.gz' (ie: a gzip compressed file), it will# be decompressed on the fly as it is being read.LogFile /var/log/httpd/access_log# LogType defines the log type being processed. Normally, the Webalizer# expects a CLF or Combined web server log as input. Using this option,# you can process ftp logs as well (xferlog as produced by wu-ftp and# others), or Squid native logs. Values can be 'clf', 'ftp' or 'squid',# with 'clf' the default.#LogType clf# OutputDir is where you want to put the output files. This should# should be a full path name, however relative ones might work as well.# If no output directory is specified, the current directory will be used.OutputDir /var/www/usage# HistoryName allows you to specify the name of the history file produced# by the Webalizer. The history file keeps the data for up to 12 months# worth of logs, used for generating the main HTML page (index.html).# The default is a file named "webalizer.hist", stored in the specified# output directory. If you specify just the filename (without a path),# it will be kept in the specified output directory. Otherwise, the path# is relative to the output directory, unless absolute (leading /).HistoryName /var/lib/webalizer/webalizer.hist# Incremental processing allows multiple partial log files to be used# instead of one huge one. Useful for large sites that have to rotate# their log files more than once a month. The Webalizer will save its# internal state before exiting, and restore it the next time run, in# order to continue processing where it left off. This mode also causes# The Webalizer to scan for and ignore duplicate records (records already# processed by a previous run). See the README file for additional# information. The value may be 'yes' or 'no', with a default of 'no'.# The file 'webalizer.current' is used to store the current state data,# and is located in the output directory of the program (unless changed# with the IncrementalName option below). Please read at least the section# on Incremental processing in the README file before you enable this option.Incremental yes# IncrementalName allows you to specify the filename for saving the# incremental data in. It is similar to the HistoryName option where the# name is relative to the specified output directory, unless an absolute# filename is specified. The default is a file named "webalizer.current"# kept in the normal output directory. If you don't specify "Incremental"# as 'yes' then this option has no meaning.IncrementalName /var/lib/webalizer/webalizer.current# ReportTitle is the text to display as the title. The hostname# (unless blank) is appended to the end of this string (seperated with# a space) to generate the final full title string.# Default is (for english) "Usage Statistics for".#ReportTitle Usage Statistics for# HostName defines the hostname for the report. This is used in# the title, and is prepended to the URL table items. This allows# clicking on URL's in the report to go to the proper location in# the event you are running the report on a 'virtual' web server,# or for a server different than the one the report resides on.# If not specified here, or on the command line, webalizer will# try to get the hostname via a uname system call. If that fails,# it will default to "localhost".#HostName localhost# HTMLExtension allows you to specify the filename extension to use# for generated HTML pages. Normally, this defaults to "html", but# can be changed for sites who need it (like for PHP embeded pages).#HTMLExtension html# PageType lets you tell the Webalizer what types of URL's you# consider a 'page'. Most people consider html and cgi documents# as pages, while not images and audio files. If no types are# specified, defaults will be used ('htm*', 'cgi' and HTMLExtension# if different for web logs, 'txt' for ftp logs).PageType htm*PageType cgiPageType phpPageType shtml#PageType phtml#PageType php3#PageType pl# UseHTTPS should be used if the analysis is being run on a# secure server, and links to urls should use 'https://' instead# of the default 'http://'. If you need this, set it to 'yes'.# Default is 'no'. This only changes the behaviour of the 'Top# URL's' table.#UseHTTPS no# DNSCache specifies the DNS cache filename to use for reverse DNS lookups.# This file must be specified if you wish to perform name lookups on any IP# addresses found in the log file. If an absolute path is not given as# part of the filename (ie: starts with a leading '/'), then the name is# relative to the default output directory. See the DNS.README file for# additional information.DNSCache /var/lib/webalizer/dns_cache.db# DNSChildren allows you to specify how many "children" processes are# run to perform DNS lookups to create or update the DNS cache file.# If a number is specified, the DNS cache file will be created/updated# each time the Webalizer is run, immediately prior to normal processing,# by running the specified number of "children" processes to perform# DNS lookups. If used, the DNS cache filename MUST be specified as# well. The default value is zero (0), which disables DNS cache file# creation/updates at run time. The number of children processes to# run may be anywhere from 1 to 100, however a large number may effect# normal system operations. Reasonable values should be between 5 and# 20. See the DNS.README file for additional information.DNSChildren 10# HTMLPre defines HTML code to insert at the very beginning of the# file. Default is the DOCTYPE line shown below. Max line length# is 80 characters, so use multiple HTMLPre lines if you need more.#HTMLPre <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"># HTMLHead defines HTML code to insert within the <HEAD></HEAD># block, immediately after the <TITLE> line. Maximum line length# is 80 characters, so use multiple lines if needed.#HTMLHead <META NAME="author" CONTENT="The Webalizer"># HTMLBody defined the HTML code to be inserted, starting with the# <BODY> tag. If not specified, the default is shown below. If# used, you MUST include your own <BODY> tag as the first line.# Maximum line length is 80 char, use multiple lines if needed.#HTMLBody <BODY BGCOLOR="#E8E8E8" TEXT="#000000" LINK="#0000FF" VLINK="#FF0000"># HTMLPost defines the HTML code to insert immediately before the# first <HR> on the document, which is just after the title and# "summary period"-"Generated on:" lines. If anything, this should# be used to clean up in case an image was inserted with HTMLBody.# As with HTMLHead, you can define as many of these as you want and# they will be inserted in the output stream in order of apperance.# Max string size is 80 characters. Use multiple lines if you need to.#HTMLPost <BR CLEAR="all"># HTMLTail defines the HTML code to insert at the bottom of each# HTML document, usually to include a link back to your home# page or insert a small graphic. It is inserted as a table# data element (ie: <TD> your code here </TD>) and is right# alligned with the page. Max string size is 80 characters.#HTMLTail <IMG SRC="msfree.png" ALT="100% Micro$oft free!"># HTMLEnd defines the HTML code to add at the very end of the# generated files. It defaults to what is shown below. If# used, you MUST specify the </BODY> and </HTML> closing tags# as the last lines. Max string length is 80 characters.#HTMLEnd </BODY></HTML># The Quiet option suppresses output messages... Useful when run# as a cron job to prevent bogus e-mails. Values can be either# "yes" or "no". Default is "no". Note: this does not suppress# warnings and errors (which are printed to stderr).Quiet yes# ReallyQuiet will supress all messages including errors and# warnings. Values can be 'yes' or 'no' with 'no' being the# default. If 'yes' is used here, it cannot be overriden from# the command line, so use with caution. A value of 'no' has# no effect.#ReallyQuiet no# TimeMe allows you to force the display of timing information# at the end of processing. A value of 'yes' will force the# timing information to be displayed. A value of 'no' has no# effect.#TimeMe no# GMTTime allows reports to show GMT (UTC) time instead of local# time. Default is to display the time the report was generated# in the timezone of the local machine, such as EDT or PST. This# keyword allows you to have times displayed in UTC instead. Use# only if you really have a good reason, since it will probably# screw up the reporting periods by however many hours your local# time zone is off of GMT.#GMTTime no# Debug prints additional information for error messages. This# will cause webalizer to dump bad records/fields instead of just# telling you it found a bad one. As usual, the value can be# either "yes" or "no". The default is "no". It shouldn't be# needed unless you start getting a lot of Warning or Error# messages and want to see why. (Note: warning and error messages# are printed to stderr, not stdout like normal messages).#Debug no# FoldSeqErr forces the Webalizer to ignore sequence errors.# The Apache HTTP server may generate out-of-sequence log entries# so this option is enabled.FoldSeqErr yes# VisitTimeout allows you to set the default timeout for a visit# (sometimes called a 'session'). The default is 30 minutes,# which should be fine for most sites.# Visits are determined by looking at the time of the current# request, and the time of the last request from the site. If# the time difference is greater than the VisitTimeout value, it# is considered a new visit, and visit totals are incremented.# Value is the number of seconds to timeout (default=1800=30min)#VisitTimeout 1800# IgnoreHist shouldn't be used in a config file, but it is here# just because it might be usefull in certain situations. If the# history file is ignored, the main "index.html" file will only# report on the current log files contents. Usefull only when you# want to reproduce the reports from scratch. USE WITH CAUTION!# Valid values are "yes" or "no". Default is "no".#IgnoreHist no# Country Graph allows the usage by country graph to be disabled.# Values can be 'yes' or 'no', default is 'yes'.#CountryGraph yes# DailyGraph and DailyStats allows the daily statistics graph# and statistics table to be disabled (not displayed). Values# may be "yes" or "no". Default is "yes".#DailyGraph yes#DailyStats yes# HourlyGraph and HourlyStats allows the hourly statistics graph# and statistics table to be disabled (not displayed). Values# may be "yes" or "no". Default is "yes".#HourlyGraph yes#HourlyStats yes# GraphLegend allows the color coded legends to be turned on or off# in the graphs. The default is for them to be displayed. This only# toggles the color coded legends, the other legends are not changed.# If you think they are hideous and ugly, say 'no' here :)#GraphLegend yes# GraphLines allows you to have index lines drawn behind the graphs.# I personally am not crazy about them, but a lot of people requested# them and they weren't a big deal to add. The number represents the# number of lines you want displayed. Default is 2, you can disable# the lines by using a value of zero ('0'). [max is 20]# Note, due to rounding errors, some values don't work quite right.# The lower the better, with 1,2,3,4,6 and 10 producing nice results.#GraphLines 2# The "Top" options below define the number of entries for each table.# Defaults are Sites=30, URL's=30, Referrers=30 and Agents=15, and# Countries=30. TopKSites and TopKURLs (by KByte tables) both default# to 10, as do the top entry/exit tables (TopEntry/TopExit). The top# search strings and usernames default to 20. Tables may be disabled# by using zero (0) for the value.#TopSites 30#TopKSites 10#TopURLs 30#TopKURLs 10#TopReferrers 30#TopAgents 15#TopCountries 30#TopEntry 10#TopExit 10#TopSearch 20#TopUsers 20# The All* keywords allow the display of all URL's, Sites, Referrers# User Agents, Search Strings and Usernames. If enabled, a seperate# HTML page will be created, and a link will be added to the bottom# of the appropriate "Top" table. There are a couple of conditions# for this to occur.. First, there must be more items than will fit# in the "Top" table (otherwise it would just be duplicating what is# already displayed). Second, the listing will only show those items# that are normally visable, which means it will not show any hidden# items. Grouped entries will be listed first, followed by individual# items. The value for these keywords can be either 'yes' or 'no',# with the default being 'no'. Please be aware that these pages can# be quite large in size, particularly the sites page, and seperate# pages are generated for each month, which can consume quite a lot# of disk space depending on the traffic to your site.#AllSites no#AllURLs no#AllReferrers no#AllAgents no#AllSearchStr no#AllUsers no# The Webalizer normally strips the string 'index.' off the end of# URL's in order to consolidate URL totals. For example, the URL# /somedir/index.html is turned into /somedir/ which is really the# same URL. This option allows you to specify additional strings# to treat in the same way. You don't need to specify 'index.' as# it is always scanned for by The Webalizer, this option is just to# specify _additional_ strings if needed. If you don't need any,# don't specify any as each string will be scanned for in EVERY# log record... A bunch of them will degrade performance. Also,# the string is scanned for anywhere in the URL, so a string of# 'home' would turn the URL /somedir/homepages/brad/home.html into# just /somedir/ which is probably not what was intended.#IndexAlias home.htm#IndexAlias homepage.htm# The Hide*, Group* and Ignore* and Include* keywords allow you to# change the way Sites, URL's, Referrers, User Agents and Usernames# are manipulated. The Ignore* keywords will cause The Webalizer to# completely ignore records as if they didn't exist (and thus not# counted in the main site totals). The Hide* keywords will prevent# things from being displayed in the 'Top' tables, but will still be# counted in the main totals. The Group* keywords allow grouping# similar objects as if they were one. Grouped records are displayed# in the 'Top' tables and can optionally be displayed in BOLD and/or# shaded. Groups cannot be hidden, and are not counted in the main# totals. The Group* options do not, by default, hide all the items# that it matches. If you want to hide the records that match (so just# the grouping record is displayed), follow with an identical Hide*# keyword with the same value. (see example below) In addition,# Group* keywords may have an optional label which will be displayed# instead of the keywords value. The label should be seperated from# the value by at least one 'white-space' character, such as a space# or tab.## The value can have either a leading or trailing '*' wildcard# character. If no wildcard is found, a match can occur anywhere# in the string. Given a string "www.yourmama.com", the values "your",# "*mama.com" and "www.your*" will all match.# Your own site should be hidden#HideSite *mrunix.net#HideSite localhost# Your own site gives most referrals#HideReferrer mrunix.net/# This one hides non-referrers ("-" Direct requests)#HideReferrer Direct Request# Usually you want to hide theseHideURL *.gifHideURL *.GIFHideURL *.jpgHideURL *.JPGHideURL *.pngHideURL *.PNGHideURL *.ra# Hiding agents is kind of futile#HideAgent RealPlayer# You can also hide based on authenticated username#HideUser root#HideUser admin# Grouping options#GroupURL /cgi-bin/* CGI Scripts#GroupURL /images/* Images#GroupSite *.aol.com#GroupSite *.compuserve.com#GroupReferrer yahoo.com/ Yahoo!#GroupReferrer excite.com/ Excite#GroupReferrer infoseek.com/ InfoSeek#GroupReferrer webcrawler.com/ WebCrawler#GroupUser root Admin users#GroupUser admin Admin users#GroupUser wheel Admin users# The following is a great way to get an overall total# for browsers, and not display all the detail records.# (You should use MangleAgent to refine further...)#GroupAgent MSIE Micro$oft Internet Exploder#HideAgent MSIE#GroupAgent Mozilla Netscape#HideAgent Mozilla#GroupAgent Lynx* Lynx#HideAgent Lynx*# HideAllSites allows forcing individual sites to be hidden in the# report. This is particularly useful when used in conjunction# with the "GroupDomain" feature, but could be useful in other# situations as well, such as when you only want to display grouped# sites (with the GroupSite keywords...). The value for this# keyword can be either 'yes' or 'no', with 'no' the default,# allowing individual sites to be displayed.#HideAllSites no# The GroupDomains keyword allows you to group individual hostnames# into their respective domains. The value specifies the level of# grouping to perform, and can be thought of as 'the number of dots'# that will be displayed. For example, if a visiting host is named# cust1.tnt.mia.uu.net, a domain grouping of 1 will result in just# "uu.net" being displayed, while a 2 will result in "mia.uu.net".# The default value of zero disable this feature. Domains will only# be grouped if they do not match any existing "GroupSite" records,# which allows overriding this feature with your own if desired.#GroupDomains 0# The GroupShading allows grouped rows to be shaded in the report.# Useful if you have lots of groups and individual records that# intermingle in the report, and you want to diferentiate the group# records a little more. Value can be 'yes' or 'no', with 'yes'# being the default.#GroupShading yes# GroupHighlight allows the group record to be displayed in BOLD.# Can be either 'yes' or 'no' with the default 'yes'.#GroupHighlight yes# The Ignore* keywords allow you to completely ignore log records based# on hostname, URL, user agent, referrer or username. I hessitated in# adding these, since the Webalizer was designed to generate _accurate_# statistics about a web servers performance. By choosing to ignore# records, the accuracy of reports become skewed, negating why I wrote# this program in the first place. However, due to popular demand, here# they are. Use the same as the Hide* keywords, where the value can have# a leading or trailing wildcard '*'. Use at your own risk ;)#IgnoreSite bad.site.net#IgnoreURL /test*#IgnoreReferrer file:/*#IgnoreAgent RealPlayer#IgnoreUser root# The Include* keywords allow you to force the inclusion of log records# based on hostname, URL, user agent, referrer or username. They take# precidence over the Ignore* keywords. Note: Using Ignore/Include# combinations to selectivly process parts of a web site is _extremely# inefficent_!!! Avoid doing so if possible (ie: grep the records to a# seperate file if you really want that kind of report).# Example: Only show stats on Joe User's pages...#IgnoreURL *#IncludeURL ~joeuser*# Or based on an authenticated username#IgnoreUser *#IncludeUser someuser# The MangleAgents allows you to specify how much, if any, The Webalizer# should mangle user agent names. This allows several levels of detail# to be produced when reporting user agent statistics. There are six# levels that can be specified, which define different levels of detail# supression. Level 5 shows only the browser name (MSIE or Mozilla)# and the major version number. Level 4 adds the minor version number# (single decimal place). Level 3 displays the minor version to two# decimal places. Level 2 will add any sub-level designation (such# as Mozilla/3.01Gold or MSIE 3.0b). Level 1 will attempt to also add# the system type if it is specified. The default Level 0 displays the# full user agent field without modification and produces the greatest# amount of detail. User agent names that can't be mangled will be# left unmodified.#MangleAgents 0# The SearchEngine keywords allow specification of search engines and# their query strings on the URL. These are used to locate and report# what search strings are used to find your site. The first word is# a substring to match in the referrer field that identifies the search# engine, and the second is the URL variable used by that search engine# to define it's search terms.SearchEngine yahoo.com p=SearchEngine altavista.com q=SearchEngine google.com q=SearchEngine eureka.com q=SearchEngine lycos.com query=SearchEngine hotbot.com MT=SearchEngine msn.com MT=SearchEngine infoseek.com qt=SearchEngine webcrawler searchText=SearchEngine excite search=SearchEngine netscape.com search=SearchEngine mamma.com query=SearchEngine alltheweb.com query=SearchEngine northernlight.com qr=# The Dump* keywords allow the dumping of Sites, URL's, Referrers# User Agents, Usernames and Search strings to seperate tab delimited# text files, suitable for import into most database or spreadsheet# programs.# DumpPath specifies the path to dump the files. If not specified,# it will default to the current output directory. Do not use a# trailing slash ('/').#DumpPath /var/log/httpd# The DumpHeader keyword specifies if a header record should be# written to the file. A header record is the first record of the# file, and contains the labels for each field written. Normally,# files that are intended to be imported into a database system# will not need a header record, while spreadsheets usually do.# Value can be either 'yes' or 'no', with 'no' being the default.#DumpHeader no# DumpExtension allow you to specify the dump filename extension# to use. The default is "tab", but some programs are pickey about# the filenames they use, so you may change it here (for example,# some people may prefer to use "csv").#DumpExtension tab# These control the dumping of each individual table. The value# can be either 'yes' or 'no'.. the default is 'no'.#DumpSites no#DumpURLs no#DumpReferrers no#DumpAgents no#DumpUsers no#DumpSearchStr no# End of configuration file... Have a nice day!