Essentially, GoAccess parses the well-known Apache access log file, collects data from the parsed entries, and displays it on the console or in an X terminal. The generated reports are presented to the user or sysadmin in a visual, interactive window. Reports include:
General Statistics, bandwidth etc.
Top Visitors
Requested files
Requested static files, images, swf, js, etc.
Referrer URLs
404 or Not Found
Operating Systems
Browsers and Spiders
Hosts, Reverse DNS, IP Location
HTTP Status Codes
Referring Sites
Keyphrases
Different Color Schemes
Unlimited log file size
Log format…
GoAccess can parse both of Apache's log formats, the Common Log Format (CLF) and the Combined Log Format (XLF/ELF), including virtual host. Nginx logs can be parsed as well, provided Nginx is configured to use the standard Apache log format.
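For reference, a Combined Log Format entry carries the host, timestamp, request line, status code, response size, referrer, and user agent. The following sketch pulls those fields out of one made-up line with awk. The address, URL, and agent string are illustrative assumptions, and this is not how GoAccess itself parses the log.

```shell
# A made-up request in Combined Log Format (IP, path and agent are illustrative only).
line='192.0.2.10 - - [05/Dec/2010:10:15:32 +0000] "GET /index.html HTTP/1.1" 200 2326 "http://example.com/start.html" "Mozilla/5.0"'

# Split on double quotes: field 2 is the request line, field 4 the referrer, field 6 the user agent.
request=$(printf '%s\n' "$line" | awk -F'"' '{print $2}')
referrer=$(printf '%s\n' "$line" | awk -F'"' '{print $4}')
agent=$(printf '%s\n' "$line" | awk -F'"' '{print $6}')

# Split on whitespace: field 1 is the host, field 9 the status code, field 10 the byte count.
host=$(printf '%s\n' "$line" | awk '{print $1}')
status=$(printf '%s\n' "$line" | awk '{print $9}')
bytes=$(printf '%s\n' "$line" | awk '{print $10}')

echo "$host requested '$request' (status $status, $bytes bytes)"
```

A Common Log Format line looks the same but ends after the byte count, without the referrer and user agent fields.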
Install GoAccess on Ubuntu
sudo apt-get install goaccess
Using GoAccess
Make sure Apache or Nginx is installed and configured to serve your websites.
Usage:
goaccess [ -b ][ -s ][ -e IP_ADDRESS ][ -a ] < -f log_file >
The following options can also be supplied to the command:
-f - Path to input log file.
-b - Enable total bandwidth consumption.
For faster parsing, don't enable this flag.
-s - Enable HTTP status codes report.
For faster parsing, don't enable this flag.
-a - Enable a List of User-Agents by host.
For faster parsing, don't enable this flag.
-e - Exclude an IP from being counted under the
HOST module. Disabled by default.
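As a rough illustration of how these options combine, the sketch below assembles a command line from a list of wanted reports and prints it without running anything. The function name goaccess_cmd and the report keywords (bandwidth, status, agents) are hypothetical names chosen for this example; only the -f, -b, -s and -a flags come from the option list above.

```shell
# Hypothetical helper: prints a goaccess command line, it does not run goaccess.
goaccess_cmd() {
    log_file=$1
    shift
    cmd="goaccess -f $log_file"
    for report in "$@"; do
        case $report in
            bandwidth) cmd="$cmd -b" ;;   # total bandwidth consumption
            status)    cmd="$cmd -s" ;;   # HTTP status codes report
            agents)    cmd="$cmd -a" ;;   # list of User-Agents by host
            *) echo "unknown report: $report" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "$cmd"
}

# Fastest run: no optional reports at all.
goaccess_cmd access.log

# Slower run with status codes and bandwidth enabled.
goaccess_cmd access.log status bandwidth
```

Since each optional report slows parsing down, starting from the bare -f form and adding flags only when needed keeps runs on large logs quick.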
GoAccess Examples
The simplest and fastest usage would be:
# goaccess -f access.log
That will generate an interactive text-only output.
To generate full statistics we can run GoAccess as:
# goaccess -f access.log -a -s -b
The -a flag indicates that we want to process an agent list for every host parsed. The -s flag tells
GoAccess to get every HTTP status code. The -b flag will process the total bandwidth consumption for
files, hosts, and dates.
For more flexibility, we can feed GoAccess through a series of pipes.
For instance, to process all access.log.*.gz files we can do:
# zcat access.log.*.gz | goaccess
OR
# zcat -f access.log* | goaccess
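Before piping into GoAccess, it can help to check what each glob actually feeds in. The sketch below builds a few throwaway log files in a scratch directory (file contents are made up for illustration) and counts the lines each zcat form produces. It assumes GNU gzip, where zcat -f passes uncompressed files through unchanged.

```shell
# Work in a scratch directory so the sketch does not touch real logs.
workdir=$(mktemp -d)

# Two rotated, compressed logs plus the live, uncompressed one (contents are made up).
printf 'older request 1\nolder request 2\n' | gzip > "$workdir/access.log.2.gz"
printf 'recent request\n' | gzip > "$workdir/access.log.1.gz"
printf 'current request\n' > "$workdir/access.log"

# Only the compressed files match this glob.
gz_lines=$(zcat "$workdir"/access.log.*.gz | wc -l)

# With -f, zcat copies the uncompressed access.log to the output as well.
all_lines=$(zcat -f "$workdir"/access.log* | wc -l)

echo "compressed only: $gz_lines, everything: $all_lines"
rm -rf "$workdir"
```

The first form skips the current access.log entirely, so the second form is the one to use when the live log should be included in the report.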
Another useful pipe filters a date range out of Apache's access log.
The following will get all HTTP requests starting on 05/Dec/2010 until the end of the file.
# sed -n '/05\/Dec\/2010/,$ p' access.log | goaccess -s -b
If we want to parse only a certain time-frame from DATE a to DATE b, we can do:
# sed -n '/05\/Nov\/2010/,/05\/Dec\/2010/ p' access.log | goaccess -s -b
Note that this can take longer to parse, depending on the speed of sed.
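To see how such a range behaves, the sketch below runs the same kind of sed filter against a small made-up log (the entries and addresses are assumptions for illustration; Apache writes the day of the month zero-padded). A sed range closes on the first line that matches the end pattern, so only the first request from the end date is kept. The second command shows one possible way to keep the whole end day, by cutting from the first line of the following day instead.

```shell
# Scratch log with made-up entries spanning the dates of interest.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [04/Nov/2010:23:59:59 +0000] "GET /a HTTP/1.1" 200 10
192.0.2.1 - - [05/Nov/2010:00:00:01 +0000] "GET /b HTTP/1.1" 200 10
192.0.2.2 - - [20/Nov/2010:12:00:00 +0000] "GET /c HTTP/1.1" 404 10
192.0.2.3 - - [05/Dec/2010:08:00:00 +0000] "GET /d HTTP/1.1" 200 10
192.0.2.3 - - [05/Dec/2010:09:00:00 +0000] "GET /e HTTP/1.1" 200 10
192.0.2.4 - - [06/Dec/2010:09:00:00 +0000] "GET /f HTTP/1.1" 200 10
EOF

# Plain range: starts at the first 05/Nov line and stops at the FIRST 05/Dec line.
first_match_only=$(sed -n '/05\/Nov\/2010/,/05\/Dec\/2010/ p' "$log" | wc -l)

# Variant: keep everything from 05/Nov on, then delete from the first 06/Dec line
# to the end, so every request logged on 05/Dec survives.
whole_end_day=$(sed -n '/05\/Nov\/2010/,$ p' "$log" | sed '/06\/Dec\/2010/,$ d' | wc -l)

echo "range: $first_match_only lines, whole end day: $whole_end_day lines"
rm -f "$log"
```

Either pipeline can be piped into goaccess -s -b in place of wc -l. Note also that if the start date never appears in the log, sed prints nothing at all.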
Also, it is worth pointing out that if we want to run GoAccess at lower priority, we can run it as:
# nice -n 19 goaccess -f access.log -s -a -b
And if you don't want to install it on your server, you can still run it from your local machine:
# ssh user@server 'cat /var/log/apache2/access.log' | goaccess -s -a -b