Search this site





grok for apache log analysis

I recently made a small change to my rss and atom feeds. I add a tracker image in the content. It looks like this:
<img src="/images/spacer.gif?path_of_a_blog_entry">
Any time someone views a post in an RSS feed, that image is loaded and the client browser happily reports the referrer url and I get to track you! Wee.

This is in an effort to find out how many people actually read my blog. Now that I can track viewship of the rss/atom feeds, how do I go about analyzing it? grok to the rescue:

% grep 'spacer.gif\?' accesslog \
   | perl grok -m '%APACHELOG%' -r '%IP% %QUOTEDSTRING:REFERRER%' \
   | sort | uniq -c | sort -n
<IP addresses blotted out, only a few entries shown>
  1 XX.XX.XX.XX ""
  9 XX.XX.XX.XX ""
  10 XX.XX.XXX.XXX ""
  10 XX.XXX.XXX.XX ""
  27 XX.XXX.XXX.XX ""
Each line represents a unique viewer, and tells me what the reader was using to view the feed.

Yay grok.