Playing with graphing; matplotlib
Posted Thu, 22 Nov 2007
webhits.data contains updates of this format:
    http.hit@1193875199000000:1
    http.hit@1193875200000000:1
    http.hit@1193875213000000:1
    http.hit@1193875214000000:5

The values are hits seen in a single second to this website. This particular data set includes only the past month's worth of data.
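To make the update format concrete, here is a minimal Python sketch that splits each update into its parts. It assumes the number after the `@` is microseconds since the Unix epoch, which is inferred from the sample values rather than stated by evtool.py; the helper name `parse_update` is made up for illustration.

```python
from datetime import datetime, timezone

def parse_update(token):
    """Split 'name@timestamp:value' into (name, timestamp_usec, value)."""
    name, rest = token.rsplit("@", 1)
    stamp, value = rest.split(":", 1)
    return name, int(stamp), int(value)

# The four sample updates shown above.
sample = (
    "http.hit@1193875199000000:1 http.hit@1193875200000000:1 "
    "http.hit@1193875213000000:1 http.hit@1193875214000000:5"
)
updates = [parse_update(token) for token in sample.split()]

for name, usec, value in updates:
    # Assumption: timestamps are microseconds since the epoch.
    when = datetime.fromtimestamp(usec // 1_000_000, tz=timezone.utc)
    print(name, when.isoformat(), value)
```

Read that way, the second sample timestamp lands exactly on midnight UTC, 1 Nov 2007, which fits a data set covering the past month.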
Let's graph "total hits per hour" over time.
    % ./evtool.py update /tmp/webhits.db - < webhits.data
    % ./evtool.py fetchsum /tmp/webhits.db $((60 * 60)) http.hit

60*60 is 3600, aka 1 hour.

[Graph: hits, 1 hour]

I also reran it with 60*60*24, aka 24-hour totals.

[Graph: hits, 1 day]
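The per-bucket summing that fetchsum is asked to do can be sketched in plain Python. This is not evtool.py's actual code: it assumes microsecond timestamps and buckets aligned to multiples of the bucket size since the epoch, and the names `sum_by_bucket` and `plot_totals` are hypothetical.

```python
from collections import defaultdict

USEC_PER_SEC = 1_000_000

def sum_by_bucket(updates, bucket_seconds):
    """Sum values into fixed-width buckets keyed by bucket start (epoch seconds)."""
    totals = defaultdict(int)
    for timestamp_usec, value in updates:
        seconds = timestamp_usec // USEC_PER_SEC
        # Assumption: buckets start at multiples of bucket_seconds since the epoch.
        totals[seconds - seconds % bucket_seconds] += value
    return dict(sorted(totals.items()))

def plot_totals(totals, title, outfile):
    """Draw bucket totals as a line graph with matplotlib and save it to outfile."""
    from datetime import datetime, timezone
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    times = [datetime.fromtimestamp(t, tz=timezone.utc) for t in totals]
    fig, ax = plt.subplots()
    ax.plot(times, list(totals.values()))
    ax.set_title(title)
    ax.set_xlabel("time (UTC)")
    ax.set_ylabel("hits")
    fig.autofmt_xdate()  # tilt the date labels so they do not overlap
    fig.savefig(outfile)

# (timestamp_usec, value) pairs from the sample updates above.
updates = [
    (1193875199000000, 1),
    (1193875200000000, 1),
    (1193875213000000, 1),
    (1193875214000000, 5),
]
hourly = sum_by_bucket(updates, 60 * 60)
daily = sum_by_bucket(updates, 60 * 60 * 24)
```

Something like `plot_totals(hourly, "hits, 1 hour", "hits-hour.png")` would then write the graph to a file; the output filename is only an example.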
The data aggregation may be incorrect; I'm not sure I really got 12K hits on each of the first few days this month. However, using fex+awk+sort on the logfiles themselves shows basically the same data:
    % cat access.* | ~/projects/fex/fex '[2 1:1' | countby 0 | sort -k2 | head -3
    11534 01/Nov/2007
    11488 02/Nov/2007
    11571 03/Nov/2007

Actually, looking at the logs shows 5K hits from a single IP on 01/Nov/2007, and it's the googlebot.
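The same cross-check can be sketched in Python instead of the shell pipeline. It assumes the access logs are in the common Apache style, with the client address as the first field and a `[dd/Mon/yyyy:HH:MM:SS zone]` timestamp; the log lines below are invented sample data, not taken from the real logs, and the function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: each line carries a timestamp like [01/Nov/2007:00:00:01 -0700].
DATE_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")

def hits_per_day(lines):
    """Count log lines per date string, e.g. '01/Nov/2007'."""
    counts = Counter()
    for line in lines:
        match = DATE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def hits_per_client(lines, day):
    """Count log lines per client address (first field) for one date."""
    counts = Counter()
    for line in lines:
        match = DATE_RE.search(line)
        if match and match.group(1) == day:
            counts[line.split(None, 1)[0]] += 1
    return counts

# Invented sample lines, for illustration only.
sample_log = [
    '192.0.2.1 - - [01/Nov/2007:00:00:01 -0700] "GET / HTTP/1.1" 200 512',
    '192.0.2.1 - - [01/Nov/2007:00:00:02 -0700] "GET /a HTTP/1.1" 200 128',
    '192.0.2.7 - - [01/Nov/2007:08:15:30 -0700] "GET / HTTP/1.1" 200 512',
    '192.0.2.7 - - [02/Nov/2007:09:00:00 -0700] "GET /b HTTP/1.1" 404 64',
]

by_day = hits_per_day(sample_log)
top_client = hits_per_client(sample_log, "01/Nov/2007").most_common(1)
```

Running `hits_per_client` for a single day is one way to spot a heavy hitter such as a crawler.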