RRDTool to graph log-originating data.

I need to relearn rrdtool, again, for this sysadmin time machine project. Today's efforts were spent testing for features I hoped were in RRDTool. So far, my feature needs are met :)

Take something simple, like webserver logs. Let's graph the hits.

Create the RRD:

rrdtool create webhits.rrd --start 1128626000 -s 60 \
   DS:hits:GAUGE:120:0:U RRA:AVERAGE:.5:5:600000 \
   RRA:AVERAGE:.5:30:602938 RRA:AVERAGE:.5:60:301469 \
   RRA:AVERAGE:.5:240:75367 RRA:AVERAGE:.5:1440:12561
My logs start *way* back in November of last year, so I create the RRD with a start date of sometime in November. The step is 60, so it expects data every minute. I then specify one data source, hits, which is a GAUGE (values are stored as-is; here that's hits per minute) with a 120-second heartbeat, ranging from 0 to infinity (U). The rest of the command is the RRAs, which define how data is consolidated and stored. The first one says take 5 samples, average them, and store at most 600,000 of these averaged samples.
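
Just as a sanity check on those RRA sizes, here's a quick bit of arithmetic showing roughly how much history each RRA can hold (the numbers below are copied straight from the create command above):

#!/usr/bin/perl
# Rough arithmetic only: steps-per-row and row counts come from the
# rrdtool create command above; the base step is 60 seconds.
use strict;
use warnings;

my $step = 60;
my @rras = ([5, 600000], [30, 602938], [60, 301469], [240, 75367], [1440, 12561]);
for my $rra (@rras) {
   my ($steps_per_row, $rows) = @$rra;
   my $days = $steps_per_row * $rows * $step / 86400;
   printf("%5d-minute resolution, %6d rows = about %.0f days of history\n",
          $steps_per_row, $rows, $days);
}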

Now that we have the database, we need a "hits-per-minute" data set. I wrote a short perl script, parsehttp, that reads from standard input, calculates hits-per-minute, and outputs rrdtool update statements. Capture this output and run it through sh:

./parsehttp < access.log | sh -x
Simple enough. This will calculate hits-per-minute for all times in the logs and store it in our RRD.
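
The original parsehttp script isn't shown here, but a minimal sketch of the idea looks something like this (assuming Apache common log format; the real script may differ):

#!/usr/bin/perl
# Sketch of a parsehttp-style script: read Apache common-log lines on stdin,
# count hits per minute, and emit "rrdtool update" statements for sh to run.
# This is an illustration, not the actual parsehttp script.
use strict;
use warnings;
use Time::Local;

my %monthnum = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
                Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);
my %hits;

while (my $line = <STDIN>) {
   # Timestamps look like [10/Oct/2005:13:55:36 -0400]; the offset is ignored here.
   next unless $line =~ m!\[(\d+)/(\w+)/(\d+):(\d+):(\d+):\d+ !;
   my ($mday, $mon, $year, $hour, $min) = ($1, $2, $3, $4, $5);
   my $time = timelocal(0, $min, $hour, $mday, $monthnum{$mon}, $year);
   $hits{$time}++;   # seconds were zeroed, so $time is already on the minute
}

# rrdtool wants updates in increasing time order.
print "rrdtool update webhits.rrd $_:$hits{$_}\n" for sort { $a <=> $b } keys %hits;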

Now that we have the data, we can graph it. However, since I want to view trends and compare time periods, I'll need to do something fancier than simple graphs.

RRDTool lets you graph multiple data sets on the same graph. So, I want to graph this week's hits and last week's hits. However, since the data sets are on different time intervals, I need to shift last week's set forward by one week. Here's the rrdtool command that graphs it for us, with last week's and this week's data on the same graph, displayed at the same time period:

rrdtool graph webhits.png -s "-1 week" \
   DEF:hits=webhits.rrd:hits:AVERAGE  \
   DEF:lastweek=webhits.rrd:hits:AVERAGE:start="-2 weeks":end="start + 1 week" \
   SHIFT:lastweek:604800 \
   LINE1:lastweek#00FF00:"last week" LINE1:hits#FF0000:"this week"
That'll look like line noise if you've never used RRDTool before. I define two data sets with DEF: hits and lastweek. Both read from the 'hits' data source in webhits.rrd. One starts at "-1 week" (one week ago, duh) and the other starts two weeks ago and ends one week ago. I then shift last week's data forward by 7 days (604800 seconds). Lastly, I draw two lines: one for last week's data (green), the other for this week's (red).

That graph looks like this:

That's not really useful, because there are so many data points that the graph becomes almost meaningless. This is due to my poor choice of RRAs. We can fix that by redoing the database, or by using the TREND feature. Change the graph statement to:

rrdtool graph webhits.png -s "-1 week" \
   DEF:hits=webhits.rrd:hits:AVERAGE  \
   DEF:lastweek=webhits.rrd:hits:AVERAGE:start="-2 weeks":end="start + 1 week" \
   SHIFT:lastweek:604800 \
   CDEF:t_hits=hits,86400,TREND CDEF:t_lastweek=lastweek,86400,TREND \
   LINE1:lastweek#CCFFCC:"last week" LINE1:hits#FFCCCC:"this week" \
   LINE1:t_lastweek#00FF00:"last week" LINE1:t_hits#FF0000:"this week"
I added only two CDEF statements. Each takes a data set and "trends" it over one day (86400 seconds), creating a sliding average across time. I store the results in new data sets called t_hits and t_lastweek and graph those as well.

The new graph looks like this:

You'll notice the trend lines are chopped off on the left; that's because there aren't enough data points at those time periods to compute the average. Also, including the raw data makes the graph scale as it did before, which makes viewing the trend difference awkward. So, let's fix that by not graphing the raw data: just cut out the LINE1:lastweek and LINE1:hits options.

To fix the sliding-average cutoff, and to add a title and a vertical label:

rrdtool graph webhits.png -s "-1 week" \
   -t "Web Server Hits - This week vs Last week" \
   -v "hits/minute" \
   DEF:hits=webhits.rrd:hits:AVERAGE:start="-8 days":end="start + 8 days"  \
   DEF:lastweek=webhits.rrd:hits:AVERAGE:start="-15 days":end="start + 8 days" \
   SHIFT:lastweek:604800 \
   CDEF:t_hits=hits,86400,TREND CDEF:t_lastweek=lastweek,86400,TREND \
   LINE1:t_lastweek#00FF00:"last week" LINE1:t_hits#FF0000:"this week"
The graph is still from one week ago until now, but the data sets used now extend beyond those boundaries, so the sliding averages can be calculated throughout. The new, final graph looks like this:

Now I can compare this week's hits against last week's, quickly, with a nice visual. This is what I'm looking for.

This would become truly useful if we had lots of time periods (days, weeks, whatever) to look at. Then we could calculate standard deviations, etc. A high outlier could be marked automatically with a label, giving an instant visual cue that something is potentially novel. It might be simple to create a sort of sliding "standard deviation" curve. I haven't tried that yet.

New event recording database prototype

I finally managed to find time today to work on my events database project. In the process, I found a few bugs in grok that needed fixing. Some of my regular expressions were being a bit greedy, so certain pattern expansion was breaking.

To summarize the event recording system: it is a webserver listening for event publish requests. It accepts the "when," "where," and "what" of an event, and stores it in a database.
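
In other words, publishing an event is just an HTTP GET with three query parameters. Hand-rolling one looks roughly like this (the sample values, and using LWP instead of fetch, are purely illustrative):

#!/usr/bin/perl
# Publish a single event to the prototype server by hand.
# The when/where/what parameter names match the grok reaction below;
# the sample values here are made up.
use strict;
use warnings;
use LWP::Simple qw(get);
use URI::Escape qw(uri_escape);

my $when  = time();                           # unix epoch timestamp
my $where = "nightfall/sshd";                 # host/program
my $what  = "Invalid user oracle from 10.0.0.5";

get("http://localhost:8080/?when=$when"
    . "&where=" . uri_escape($where)
    . "&what="  . uri_escape($what));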

To get my logs pushed into the database, I'll leverage the awesome power of Grok. Before doing that, I gathered all of the auth.log files and archives and concatenated them into their respective per-host files.

The grok.conf for this particular maneuver:

exec "cat ./logs/nightfall.auth.log ./logs/sparks.auth.log ./logs/whitefox.auth.log" {
   type "all syslog" {
      match_syslog = 1;
      reaction = 'fetch -qo - "http://localhost:8080/?when=%SYSLOGDATE|parsedate%&where=%HOST%/%PROG|urlescape|shdq%&what=%DATA:GLOB|urlescape|shdq%"';
   };
};
This is fairly simple. I added a new standard filter, 'urlescape', to grok because I needed it. It URL-escapes a captured data piece. Hurray!

Run grok, and it sends an event notification to the webserver for every syslog-matching line, using FreeBSD's command-line web client, fetch.

sqlite> select count(*) from events;
8085
Now, let's look for something meaningful. I want to know what happened on all sshd services between 1am and 4am this morning (today, May 3rd):
nightfall(~/projects/eventdb) % date -j 05030100 +%s
1146632400
nightfall(~/projects/eventdb) % date -j 05030400 +%s
1146643200
Now I know the Unix epoch times for May 3rd at 1am and 4am.
sqlite> select count(*) from events where time >= 1146632400 
   ...> and time <= 1146643200 and location like "%/sshd" 
   ...> and data like "Invalid user%";
2465
This query is instant, and much faster than doing 'grep -c' on N log files across M machines. I don't care how good your grep-fu is; you aren't going to be faster. This speed is only the beginning. Think in broader terms: nearly instantly zoom to any point in time to view "events" on a system or set of systems. Filter out particular events by keyword or pattern. Look for the last time a service was restarted. I could go on, but you probably get the idea. It's grep, but faster, and with more features.

As far as the protocol and implementation go, I'm not sure how well this web-based concept will hold up. At this point I am not interested in protocol or database efficiency; the prototype implementation is good enough. From what I've read about Splunk over the past few months (mostly in the form of advertisements), it seems I already have the main feature Splunk has: searching logs easily. Perhaps I should incorporate and sell my own, better-than-Splunk product? ;)

Bear in mind that I have no idea what Splunk actually does beyond what I've gleaned from advertisements for the product. I'm sure it's at least somewhat useful, or no one would invest.

Certainly, a pipelined HTTP client could do this much faster than making 10,000 individual HTTP requests. A step further would be having the web server accept any number of events per request. The big test will be seeing how well HTTP scales, but that can be played with later.
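
Something like this is what I have in mind for batching, though the /bulk endpoint and the input format below are entirely hypothetical; the current prototype only takes one event per request:

#!/usr/bin/perl
# Sketch of batching many events into a single request. The /bulk endpoint
# and the "when<TAB>where<TAB>what" input format are hypothetical.
use strict;
use warnings;
use LWP::UserAgent;

my $ua  = LWP::UserAgent->new(keep_alive => 1);   # reuse one connection
my $url = "http://localhost:8080/bulk";
my @batch;

sub flush_batch {
   return unless @batch;
   $ua->post($url, { events => join("\n", @batch) });
   @batch = ();
}

while (my $line = <STDIN>) {
   chomp $line;
   push @batch, $line;                 # one "when<TAB>where<TAB>what" per line
   flush_batch() if @batch >= 500;     # 500 events per request
}
flush_batch();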

At this point, we have come fairly close to the general idea of this project: Allowing you to zoom to particular locations in time and view system events.

The server code for this was very easy. I chose Python and started playing with CherryPy (a webserver framework). I had a working event receiver server in about 30 minutes; 29 minutes of that time was spent writing a threadsafe database class to front pysqlite. The CherryPy bits only amount to about 10 lines of code, out of 90ish.

The code to do the server can be found here: /scripts/cherrycollector.py

Finding multiplicative inverses using perl

perl -le 'while (++$x) { print $x and exit if (($ARGV[0] * $x) % $ARGV[1] == 1) }' -- -215 26
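
For the curious, here is the same brute-force search spelled out with comments. For the arguments above (-215 and 26) it should print 11, since -215 mod 26 is 19, and 19 * 11 = 209 = 8 * 26 + 1:

#!/usr/bin/perl
# Brute force the multiplicative inverse: find x where (n * x) mod m == 1.
# A spelled-out version of the one-liner above.
use strict;
use warnings;

my ($n, $m) = @ARGV;
for my $x (1 .. $m - 1) {
   # Perl's % takes the sign of the right operand, so a negative n works fine.
   if (($n * $x) % $m == 1) {
      print "$x\n";
      exit 0;
   }
}
die "$n has no inverse mod $m\n";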

precedence sorting with perl's sort()

This isn't terribly complicated, but it may be useful to you. Mostly I'm putting it here for my own future reference.
sort { $users{$b} <=> $users{$a} || $a cmp $b } keys(%users)
This will sort first numerically (descending) by the values stored in %users, and then alphabetically by the keys of %users. That is, the return value of sort is the keys ordered by their data values, falling back to alphabetical order when two data values are equal.
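
For example (names and counts made up):

my %users = (alice => 3, bob => 7, carol => 3);
my @sorted = sort { $users{$b} <=> $users{$a} || $a cmp $b } keys(%users);
# @sorted is ("bob", "alice", "carol"): bob has the highest count, and
# alice comes before carol because their counts are equal.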

Curious me web stats.

I got bored and wanted to see how my new site compares against other things. These statistics are by no means thorough. The benchmarker I used is a very simple program called webbench. It simply sends request after request at whatever you point it at. The statistics were done using 5 concurrent clients over 30 seconds (-c 5 -t 30).

I think the results are pretty interesting.

http://www.google.com/ 7126 pages/min
http://www.amazon.com/ 8340 pages/min
http://www.csh.rit.edu/ 3690 pages/min
http://www.csh.rit.edu/~psionic/new 7066 pages/min
http://www.csh.rit.edu/~psionic/new?nocache 710 pages/min

A factor-of-10 difference between caching and not caching. Sheesh. How easy is it to cache things in Mason? Let me show you.
<%init>
return if (!exists($m->request_args()->{'nocache'}) && 
		$m->cache_self(key => $m->comp("/lib/cachename"), expires_in => '30 minutes'));
</%init>
Put that in the <%init> section of any component and poof, Mason caches the output of that component for 30 minutes (as specified). According to this site, you can cache the return values of components too, which makes it a bit like Memoize in a way.
The /lib/cachename component is a little doohickey I wrote to automatically generate a key name to cache by:
<%perl>
	my ($k,$v);
	# Arguments that were passed to the calling component
	my $args = scalar($m->caller_args(1));
	# Start the key with the calling component's path
	my $key = scalar($m->callers(1))->{"path"};
	$key .= ".$k=$v" while (($k,$v) = each(%$args));
	$key =~ s!\s!_!g;	# replace whitespace with underscores
	return $key;
</%perl>
This component generates a key in the form of "compname.foo=bar.baz=fizz" or whatever. It takes the arguments passed to the calling component and makes a unique key out of them. This is so I can cache specific calls to each component (like Memoize!) with specific arguments, rather than caching only one output and reusing it. This is slick.

apache+mod_perl perl handler adventures

I started thinking about the redesign tonight, and Mason is clearly more than I need.

I still want to use mod_perl, but I don't want a whole mess of perl scripts driving the website. So, instead of going with either end of the spectrum, I've started writing my own perl handler (the PerlHandler option in apache). It's dead simple, lets me do the things I need to do with apache, and also allows me to have most of the site driven by XML and XSLT in some very cool ways.
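
For the curious, a bare-bones PerlHandler under mod_perl 1.x is only a few lines. This is just the boilerplate shape, not the actual handler I'm writing:

# httpd.conf:
#   PerlModule My::Handler
#   <Location />
#      SetHandler perl-script
#      PerlHandler My::Handler
#   </Location>

package My::Handler;
use strict;
use Apache::Constants qw(OK);

sub handler {
   my $r = shift;                  # the Apache request object
   $r->content_type("text/html");
   $r->send_http_header;
   $r->print("<html><body>hello from a PerlHandler</body></html>");
   return OK;
}

1;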

More news on this as it develops.

Terminals and meta characters

So it seems there are two ways your terminal can send "meta" key combinations.

One method is to send escape + the character. So hitting ALT+J will send an escape character followed by 'j'.
Alternatively, your terminal might simply take the character value and set its most significant bit high before sending that byte. What does that mean?
Let's say you hit ALT+J again. The letter 'j' is 106 in ASCII, which in binary is 01101010. Holding alt sets the left-most (most significant) bit high, making it 11101010.
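
You can watch the bit get flipped with a couple lines of perl:

#!/usr/bin/perl
# Show the high bit being set on 'j'.
my $plain = ord("j");              # 106
my $meta  = $plain | 0x80;         # the same character with the high bit set
printf("%08b -> %08b (0x%02x -> 0x%02x)\n", $plain, $meta, $plain, $meta);
# prints: 01101010 -> 11101010 (0x6a -> 0xea)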

Many terminals let you choose which method is used. XTerm, for instance, lets you pick from the menu accessed by control+left-clicking anywhere on the terminal. Select "Meta sends Escape" and alt-key combinations will send escape+character. If "Meta sends Escape" is unchecked, then your terminal will set the most significant bit high before sending the character.

Here's a perl script that'll let you see what's going on. The output format is: char / hex - binary, where hex is the hexadecimal value of what you entered and binary is the binary representation. Try alt-key combinations with "Meta sends Escape" toggled each way and watch how the bytes change.

#!/usr/bin/perl

# Put the terminal in raw mode so each byte arrives immediately and unmangled.
# (When you're done, you may need to run "stty sane" to restore your terminal.)
system("stty raw -echo");
while (sysread(STDIN,$foo,1)) {
        # Print the character (if printable), its hex value, and its bits.
        printf("%s / %2x - ", ($foo =~ /[[:print:]]/ ? $foo : " "), ord($foo));
        print unpack("B32", pack("n", ord($foo))) . "\r\n";
}

XML? Perhaps...

I've recently been considering moving away from the huge hack that is the code that makes this website go. Currently it is driven by mod_perl and HTML::Mason - hack upon hack, mostly due to how this website came to be.
I rewrote this website from php into perl as an excuse to learn HTML::Mason. Now I know Mason, and as a result the code is poorly written and shoddily glued together. In an effort to move toward newer technologies, I think I'm going to try rewriting the site in XML and using XSLT to render it. For the HTML it generates, I'm leaning towards some sort of css/javascript fanciness to make things pretty and useful.
I also want to trash the script I use to post things to this site because, frankly, it doesn't handle things the way it properly should, among other excuses to rewrite it to be morebettergood and use cooler technologies.
I'll keep posting updates as progress happens and I learn new things.

note takery thing

For the longest time I've found myself jotting notes in files tossed throughout my home directory. Finding these files later has become a chore and a half.
Some short while ago, the author of Ion, Tuomo Valkonen, wrote a little program called riot in Haskell. I got it to build *once* and only once; all future attempts to use it failed miserably. At any rate, riot is a useful tool for making notes of things. It stores notes in mbox format. All in all, a useful program, unless you can't get a Haskell compiler working on your OS!

I quickly became annoyed, realizing the folly of my ways: little jotted notes in various files that end up everywhere. So I created a solution in a project I call 'note.' It's a simple perl Curses::UI interface that lets me take notes and post followups in a thread-like manner. The config file goes in your home directory as ~/.noterc. The format of the file is:

dbuser = DATABASE_USER
dbpass = DATABASE_PASSWORD
dbname = DATABASE_NAME
dbhost = DATABASE_HOST

You'll need postgres and perl. Perl will need Curses::UI and File::Temp.
The SQL for the tables you'll need is in tables.sql.

Regular expressions and word boundaries

I needed a better way to find word boundaries for Term::Shelly, so I turned to perl's ridiculously powerful regular expressions. The previous implementation went something like this:

The following is passed to find_word_bound:
- $self ..... the Term::Shelly instance
- $line ..... the line of text that we're doing a bound search on
- $pos ...... the position in the line that we are in
- $opts ..... A bit field (bitwise-OR'd flag constants) indicating which way to search
- $regex .... An optional argument containing a regular expression on which to match word characters.

sub find_word_bound ($$$;$) {
   my $self = shift;
   my $line = shift;
   my $pos = shift;
   my $opts = shift || 0;
   my $regex = "[A-Za-z0-9]";
   my $bword;

   $regex = shift if ($opts & WORD_REGEX);

   # Mod? This is either -1 or +1 depending on if we're looking behind or
   # if we're looking ahead.
   my $mod = -1;
   $mod = 1 if ($opts & WORD_END);

   # What are we doing?
   # If we're in a word, go to the beginning of the word
   # If we're on a space, go to end of previous word.
   # If we're on a nonspace/nonword, go to beginning of nonword chars

   $bword = $pos;

   # Back up if we're past the end of the string (cursor at end of line)
   $bword-- if ($pos == $self->{"input_position"});

   # Ignore trailing whitespace.
   #$bword += $mod while (substr($line,$bword,1) =~ m/^\s/);

   # If we're not on an ALPHANUM, then we want to reverse the match.
   # that is, if we are:
   # "testing here hello .......there"
   #                           ^-- here
   # Then we want to delete (match) all the periods (nonalphanums)
   substr($regex, 1, 0) = "^" if (substr($line,$bword,1) !~ m/$regex/ &&
                                  $opts & WORD_ONLY);

   # Keep going while there's whitespace...
   $bword += $mod while (substr($line,$bword,1) =~ m/\s/ && $bword >= 0);

   # Back up until we hit the bound of our "word"
   $bword += $mod while (substr($line,$bword,1) =~ m/$regex/ && $bword >= 0);

   # Whoops, one too far...
   $bword -= $mod;

   return $bword;
}

Frankly, this is a terrible implementation of word boundary location, but it worked, and that's all I was really looking for at the time.

Enter perl regular expressions. The simple fact was that I was trying to do all kinds of bound searches throughout that function, and it got messy. There are only three kinds of bounds I actually care about: word beginning, word end, and beginning of the next word. I knew I could craft three regexes to match each of those cases, but I more or less shrugged the idea off because I had forgotten about backreferences.

The new version of find_word_bound is *much* shorter:

sub find_word_bound ($$$$;$) {
   my ($self, $line, $pos, $opts, $regex) = @_;
   $regex = '\w' if (!($opts & WORD_REGEX));
   my $notregex = qr/[^$regex]/;
   $regex = qr/[$regex]/;

   # Mod? This is either -1 or +1 depending on if we're looking behind or
   # if we're looking ahead.
   my $mod = ($opts & WORD_BEGINNING) ? -1 : 1;

   if ($opts & WORD_NEXT) {
      $regex = qr/^.{$pos}(.+?)(?<!$regex)$regex/;
   } elsif ($opts & WORD_BEGINNING) {
      $regex = qr/($regex+$notregex*)(?<=^.{$pos})/;
   } elsif ($opts & WORD_END) {
      $regex = qr/^.{$pos}(.+?)$regex(?:$notregex|$)/;
   }

   if ($line =~ $regex) {
      $pos += length($1) * $mod;
   } else {
      $pos = ($mod == 1 ? length($line) : 0);
   }

   return $pos;
}

You'll notice this version has far fewer lines of code and works much better (the previous implementation was buggy, not that I'm implying regular expressions fix bugs). A breakdown of the regexes, perhaps?

- Find the beginning of the next word

$regex = qr/^.{$pos}(.+?)(?<!$regex)$regex/;

This one matches the beginning of the next word after the cursor. It makes use of a negative look-behind. First it makes sure we are $pos characters into the string. Then it makes a non-greedy grab of characters past that, growing the match until it reaches a $regex match that is not preceded by another $regex match; that is, the first character of the next word.
A visual example may help explain:

# The cursor is at position 12, so $pos = 12
# The line is "The quick brown fox jumps over the lazy dog."
# For this example, let's have $regex = qr/\w/;

The quick brown fox jumps over the lazy dog
           ^                         <-- the cursor position
---------->^                         <-- .{$pos}
            ---->^                   <-- (.+?) (nongreedy match)
               ^<-                   <-- (?<!$regex)
               -^                    <-- $regex

$1, grabbed by the (.+?) here, is 'own ', thus length($1) + $pos = (4 + 12) = 16.
We now position ourselves 16 characters in, landing on the 'f' of 'fox'.

- Find the beginning of the current word

$regex = qr/($regex+[^$regex]*)(?<=^.{$pos})/;

This one finds the beginning of the current word. It looks for one or more $regex matches followed by any number of non-matches to $regex. It then uses a positive look-behind to ensure the match ends exactly at the cursor position, $pos characters from the start of the line.
Again, a visual example (it is difficult to briefly show exactly what happens with this particular kind of match):

# The cursor is at position 14, so $pos = 14
# The line is "The quick brown fox jumps over the lazy dog."
# For this example, let's have $regex = qr/\w/;

The quick brown fox jumps over the lazy dog
              ^                      <-- the cursor position
----------......                     <-- ($regex+[^$regex]*), the '..... '
                                         indicates the $1 group
---------------                      <-- positive look-behind succeeds
                                         because the match ends exactly
                                         $pos characters from the start
                                         of the line

- Find the end of the current word
$regex = qr/^.{$pos}(.+?)$regex(?:$notregex|$)/;
Let us go forth and search for the end of a word. This one is pretty straightforward and doesn't need much explanation. It looks for any characters past the cursor, up to the first $regex match that is followed by the end of the line or a $notregex character. Perhaps yet another visual example?

# The cursor is at position 9, so $pos = 9
# The line is "Cheese burgers are most tastey"
# For this example, let's have $regex = qr/\w/;

Cheese burgers are most tastey
        ^                         <-- cursor position
         ---|                     <-- (.+?) nongreedy match
             |                    <-- one $regex plz
              |                   <-- $notregex or EOL

There you have it. This article ended up being WAAAY longer than I intended, so I apologize :)