Over the years I have used several web statistics tools to find out who visits my website and what they access most frequently. I’ve used Google Analytics and Mint, among others, but some years ago I stopped using such tools.
The Apache server my hosting provider uses logs every access into nice log files, and that doesn’t need any added scripting on the delivered pages themselves. My provider even offers some statistics based on those log files, generated with Webalizer, but Webalizer’s output looks really horrible, so I didn’t use it very often.
Some days ago I got the idea that it should be possible to transform those log files into something better looking and more useful. I played around with the idea of developing something myself, but before investing time in a new project it is always worth researching existing solutions. In my research I found GoAccess, which does exactly what I had in mind.
GoAccess can read all kinds of log file formats. Strangely, none of the predefined formats matched the format my log files use, but that wasn’t a big issue, as the format is customizable and well documented.
GoAccess can run in the terminal or, among other output options, generate a static HTML page. I use the latter. All well and good: I could now look at nicely formatted statistics for my website, but one piece was still missing. The documentation stated that I could also get data about where in the world my website was accessed from. It took some time, but I finally realized that I hadn’t used the correct options when installing GoAccess via Homebrew. So I reinstalled it like this:
    brew uninstall goaccess
    brew install goaccess --with-libmaxminddb
Nice, now the section “GEO LOCATION” showed up in my GoAccess HTML page, but it was empty. Ah well. It turns out you still need to install a geolocation database. A free database is available from MaxMind. More accurate databases can be purchased from the same company, but for my purposes the free one is entirely sufficient, and I don’t know whether GoAccess would do more with a better database anyway.
There are different ways to use GoAccess. Mine is a very simple one that is probably not very efficient, but it works for me. I have a directory on my MacBook into which I copied the geolocation database and into which I copy the log files from the web server.
In the same directory I also have a file named config that looks like this:
    log-format %v %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
    date-format %d/%b/%Y
    time-format %H:%M:%S
    geoip-database GeoLite2-City.mmdb
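To make the log-format line a little more concrete, here is a rough sketch in bash that splits one log line into the fields the format names: %v (virtual host), %h (client host), %d and %t (date and time), %r (request), %s (status), %b (bytes), %R (referrer) and %u (user agent), with %^ skipping what lies in between. The function name parse_line and the sample line are made up for illustration, and the regular expression is only my approximation of the format, not how GoAccess itself parses logs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split one log line the way the log-format above
# describes it. This is an approximation for illustration only.
parse_line() {
  local line=$1
  # vhost host <skipped> [date:time <skipped>] "request" status bytes "referrer" "agent"
  local re='^([^ ]+) ([^ ]+) [^[]*\[([^:]+):([^ ]+) [^]]*\] "([^"]*)" ([0-9]{3}) ([^ ]+) "([^"]*)" "([^"]*)"$'
  if [[ $line =~ $re ]]; then
    # printf reuses the format for each name/value pair
    printf '%s=%s\n' \
      vhost    "${BASH_REMATCH[1]}" \
      host     "${BASH_REMATCH[2]}" \
      date     "${BASH_REMATCH[3]}" \
      time     "${BASH_REMATCH[4]}" \
      request  "${BASH_REMATCH[5]}" \
      status   "${BASH_REMATCH[6]}" \
      bytes    "${BASH_REMATCH[7]}" \
      referrer "${BASH_REMATCH[8]}" \
      agent    "${BASH_REMATCH[9]}"
  else
    echo "line does not match the assumed format" >&2
    return 1
  fi
}

# Invented sample line (documentation IP address, placeholder domain)
parse_line 'example.org 203.0.113.7 - - [12/Mar/2019:06:25:14 +0100] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"'
```

In this sample the date part has the shape that date-format %d/%b/%Y expects and the time part the shape that time-format %H:%M:%S expects. If a line from your own log does not split cleanly like this, the log-format probably needs adjusting.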
And there is also the following script that generates the html report:
    #!/usr/bin/env bash
    gunzip *.gz
    goaccess -f beeger.net-* -p config -o rep.html --real-os
The log files are gzipped when they come from the server, and their names start with the domain name of my website. So I decompress them before passing them to GoAccess. I use the --real-os flag to get the real names of the operating systems used by my visitors instead of some internal build names.
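The script is deliberately simple, and one rough edge is that gunzip complains when there are no new .gz files in the directory. A slightly more defensive variant could look like the sketch below. The function names decompress_logs and generate_report are my own invention, the goaccess call is the same one as in the script, and the -f flag on gunzip (overwrite an already decompressed file) is a choice you may not want.

```shell
#!/usr/bin/env bash
# Sketch of a more defensive report script; function names are made up.

# Decompress any gzipped logs in the given directory.
# Does nothing (and succeeds) when there are no .gz files.
decompress_logs() {
  local dir=$1 f
  for f in "$dir"/*.gz; do
    [ -e "$f" ] || continue      # the glob stays literal when nothing matches
    gunzip -f "$f" || return 1   # -f: overwrite an existing decompressed file
  done
  return 0
}

# Build the HTML report from the logs in the given directory (default: current).
generate_report() {
  local dir=${1:-.}
  if ! command -v goaccess >/dev/null 2>&1; then
    echo "goaccess not found" >&2
    return 1
  fi
  # Only a warning: the config references this file for the geolocation panel
  [ -f "$dir/GeoLite2-City.mmdb" ] || echo "warning: GeoLite2-City.mmdb is missing" >&2
  decompress_logs "$dir" || return 1
  # Same invocation as in the original script, run inside the log directory
  ( cd "$dir" && goaccess -f beeger.net-* -p config -o rep.html --real-os )
}
```

Saved as a script, the last line would simply call generate_report, optionally with the directory that holds the logs, the config file and the database.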