New event recording database prototype

I finally managed to find time today to work on my events database project. In the process of doing so, I found a few bugs in grok that needed fixing. Some of my regular expressions were being a bit greedy, so certain pattern expansions were breaking.

To summarize the event recording system: it is a webserver listening for event publish requests. It accepts the "when", "where", and "what" of an event and stores them in a database.

To have my logs pushed to the database, I'll leverage the awesome power of Grok. Before doing that, I gathered all of the auth.log files and archives and compiled them into their respective files.

The grok.conf for this particular maneuver:

exec "cat ./logs/nightfall.auth.log ./logs/sparks.auth.log ./logs/whitefox.auth.log" {
   type "all syslog" {
      match_syslog = 1;
      reaction = 'fetch -qo - "http://localhost:8080/?when=%SYSLOGDATE|parsedate%&where=%HOST%/%PROG|urlescape|shdq%&what=%DATA:GLOB|urlescape|shdq%"';
   }
}
This is fairly simple. I added a new standard filter, 'urlescape', to grok because I needed it. It will URL-escape a data piece. Hurray!

Run grok, and it sends event notifications to the webserver for every syslog-matching line, using FreeBSD's command-line web client, fetch.

sqlite> select count(*) from events;
Now, let's look for something meaningful. I want to know what happened on all sshd services between 1am and 4am this morning (today, May 3rd):
nightfall(~/projects/eventdb) % date -j 05030100 +%s
nightfall(~/projects/eventdb) % date -j 05030400 +%s
Now I know the Unix epoch times for May 3rd at 1am and 4am.
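The same conversion can be done in Python. This sketch assumes the box was on US Eastern daylight time (UTC-4), which is consistent with the epoch values used in the query:

```python
from datetime import datetime, timedelta, timezone

# Assuming US Eastern daylight time (UTC-4) for May 3rd, 2006.
edt = timezone(timedelta(hours=-4))
start = int(datetime(2006, 5, 3, 1, 0, tzinfo=edt).timestamp())
end = int(datetime(2006, 5, 3, 4, 0, tzinfo=edt).timestamp())
print(start, end)  # 1146632400 1146643200
```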
sqlite> select count(*) from events where time >= 1146632400 
   ...> and time <= 1146643200 and location like "%/sshd" 
   ...> and data like "Invalid user%";
This query is instant. Much faster than doing 'grep -c' on N log files across M machines. I don't care how good your grep-fu is, you aren't going to be faster.

This speed feature is only the beginning. Think in broader terms. Nearly instantly zoom to any point in time to view "events" on a system or set of systems. Filter out particular events by keyword or pattern. Look for the last time a service was restarted. I could go on, but you probably get the idea. It's grep, but faster, and with more features.

As far as the protocol and implementation goes, I'm not sure how well this web-based concept is going to prevail. At this point, I am not interested in protocol or database efficiency. The prototype implementation is good enough. From what I've read about Splunk in the past months in the form of advertisements and such, it seems I already have the main feature Splunk has: searching logs easily. Perhaps I should incorporate and sell my own, better-than-Splunk, product? ;)

Bear in mind that I have no idea what Splunk actually does beyond what I've gleaned from advertisements for the product. I'm sure it's at least somewhat useful, or no one would invest.

Certainly, a pipelined HTTP client could perform this much faster than doing 10000 individual HTTP requests. A step further would be having the web server accept any number of events per page request. The big test is going to be seeing how well HTTP scales, but that can be played with later.
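One possible shape for such a batched request, sketched in Python (this wire format and the `batch_events` name are my own invention, not the prototype's actual protocol, which takes one event per request):

```python
from urllib.parse import urlencode

def batch_events(events):
    # Pack many (when, where, what) tuples into a single request body,
    # one urlencoded event per line. Hypothetical wire format; the
    # prototype server actually takes one event per request.
    return "\n".join(
        urlencode({"when": when, "where": where, "what": what})
        for when, where, what in events
    )

body = batch_events([
    (1146632400, "nightfall/sshd", "Invalid user a"),
    (1146632460, "sparks/sshd", "Invalid user test"),
])
```

The server side would then split the body on newlines and insert one row per line, amortizing the HTTP overhead across the whole batch.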

At this point, we have come fairly close to the general idea of this project: Allowing you to zoom to particular locations in time and view system events.

The server code for doing this was very easy. I chose Python and started playing with CherryPy (a webserver framework). I had a working event receiver server in about 30 minutes. 29 minutes of that time was spent writing a threadsafe database class to front for pysqlite. The CherryPy bits only amount to about 10 lines of code, out of 90ish.
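The core of such a receiver fits in a few lines. This sketch uses only the standard library rather than CherryPy, and the `handle_request` helper and column names are my own; it just shows the essential step of turning a query string into a row:

```python
import sqlite3
from urllib.parse import parse_qs

# Column names (time, location, data) are inferred from the queries
# shown earlier in this post; the actual schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (time INTEGER, location TEXT, data TEXT)")

def handle_request(query_string):
    # What the receiver does per request: unpack when/where/what and
    # insert a row. (Hypothetical helper; the real server wraps the
    # equivalent logic in a CherryPy handler.)
    q = parse_qs(query_string)
    conn.execute("INSERT INTO events VALUES (?, ?, ?)",
                 (int(q["when"][0]), q["where"][0], q["what"][0]))
    conn.commit()

handle_request("when=1146632400&where=nightfall/sshd&what=Invalid+user+a")
```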

The code to do the server can be found here: /scripts/

Grok and other plans

As a sysadmin, I get the privilege of sifting through piles of logs to find anomalies. Logs are great. However, I don't have time to sift through every log looking for data worth reading. I'd much prefer having only the data I want to see displayed to me. Most of the time log audits are an all-or-nothing activity - either you look at all of the data, or you look at none of the data. Looking at all the data takes more time than it should, and ignoring data can be hazardous (especially when tracking problems down).

Some time ago, I began a very long process of taking the massive quantities of data and having a machine process them for me. Spend a bit of time up-front to determine what data is definitely meaningful and let the computer handle the rest. The computer needs to process the raw data and display the data to me in a meaningful and readable format. Such formats include trend graphs, log summaries, and anomaly detection. Trend graphs are simple to do, assuming you have numeric data. Log summaries are easy if you know how and what you want to summarize. Anomalies are easy to detect if you know what you're looking for, or declare "anything unknown is badwrong... or badong."

Grok is the first step in having raw data turned into something easily readable. The next step is writing some magic software piece that lets me store arbitrary data (log entries, counters, key->value pairs, etc), possibly by date. This way, you can take grok's parsing ability and turn it into stored content. Now that you'll have super megatastic parsed log data, you'll want to turn it into something more human-meaningful - graphs or summaries. That's the 3rd piece.

So anyway, I got bored and started playing with Visio (generously provided by RIT's CS department, of course), and I came up with a little diagram of what I want grok and its sister tools to do. The yellowish items are things I'll be writing. The rest aren't really software so much as stuff that happens. Here's a pretty diagram complete with a useful description of a "brick thing" -

In summary: data (logs, etc.) are extremely noisy. Use grok and other tools to turn raw data into useful data. This will keep you reading your logs as well as keeping you sane. See how happy the sysadmins are? The smiley faces indicate happiness... I promise.

logwatcher renamed to grok. Grok 1.0 release.

I've been working on a flurry of new features and fixes for the now-named Grok project. I'm almost done adding new features, and I figured it was time to make a release. Head over to the Grok page and play with it if you want.

It is definitely production-ready, as far as I can tell. If you use logwatch, swatch, logsentry, or any other log auditing software, then give grok a shot. I guarantee it is better. If it isn't, tell me what you like about the others and I'll add those features.

Logwatcher + lwrecord

It's time logwatcher took a step towards log summarization in addition to its already (hopefully) good match-and-react system. This step comes with a script I'm working on that will be distributed with logwatcher, called lwrecord. lwrecord will let you record arbitrary data in a tree-like structure for later retrieval.

My logwatch.conf is as follows:

file "/var/log/auth.log" {
   type "bad username" {
      match = "pam for invalid user %USERNAME% from %IP%";
      match = "Invalid user %USERNAME% from %IP%";
      reaction = "./lwrecord 'ssh-illegal-user/%IP%' '%USERNAME%'";
   }
}
lwrecord splits the first argument (the key) by the '/' token and ends up storing the username attempted into:
$db->{"ssh-illegal-user"}->{"%IP%"} = "%USERNAME%";
This gets stored on disk in a database file.
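In Python terms, the key-splitting behaves roughly like the sketch below (the function name and dict-based storage are my own illustration; the real lwrecord is Perl and persists to a database file on disk):

```python
import time

def lwrecord(db, key, value):
    # Split the key on '/', walk/create nested dicts along the way,
    # and append a timestamped entry at the leaf. Sketch of the idea,
    # not lwrecord's actual code.
    *path, leaf = key.split("/")
    node = db
    for part in path:
        node = node.setdefault(part, {})
    node.setdefault(leaf, []).append({"time": int(time.time()), "data": value})

db = {}
lwrecord(db, "ssh-illegal-user/10.0.0.1", "foo")
lwrecord(db, "ssh-illegal-user/10.0.0.1", "testing")
```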

After several ssh attempts, I can view the database and see that we are happily recording information:

nightfall(~/projects/logwatch) !130! % ./lwfetch
$VAR1 = {
          'ssh-illegal-user' => {
                                  '' => [
                                          { 'time' => 1140510435, 'data' => 'foo' },
                                          { 'time' => 1140510439, 'data' => 'testing' },
                                          { 'time' => 1140510444, 'data' => 'happytaps' },
                                          { 'time' => 1140510446, 'data' => 'happytaps' },
                                          { 'time' => 1140510447, 'data' => 'happytaps' }
                                        ]
                                }
        };
With lwrecord, we have the ability to store any arbitrary kind of data (hopefully), from simple things like who's ssh'ing in to entire log entries. Soon, I'll write lwsummarize, which will let you summarize database entries over a given time period for human parsing.

One step closer to simple log summarization. With lwrecord (and eventually lwsummarize) you will be able to easily show log summaries such as "Show me all of the IPs who attempted ssh logins with invalid users N times in the last day" - more on this as I work on it.

logwatcher update

Logwatcher got a few significant updates today.

You can now give subnames to patterns in the form %PATTERN:NAME%. It'll match against the named PATTERN but be referable as %PATTERN:NAME% instead of just %PATTERN%. This is extremely useful if you have two instances of the same pattern in one match. Perl exec works again too.
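A rough sketch of how %PATTERN:NAME% expansion can work (my own illustration in Python, not logwatcher's actual Perl code): each placeholder becomes a named capture group, so two uses of the same pattern stay distinguishable:

```python
import re

# Illustrative subset of the built-in patterns.
PATTERNS = {"IP": r"\d{1,3}(?:\.\d{1,3}){3}"}

def expand(match_spec):
    # Turn %PATTERN% / %PATTERN:NAME% placeholders into named capture
    # groups; the subname defaults to the pattern name itself.
    def repl(m):
        pat, name = m.group(1), m.group(2) or m.group(1)
        return "(?P<%s>%s)" % (name, PATTERNS[pat])
    return re.sub(r"%(\w+)(?::(\w+))?%", repl, match_spec)

rx = re.compile(expand("from %IP:SRC% to %IP:DST%"))
m = rx.search("from 10.0.0.1 to 10.0.0.2")
```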

I added debug levels too, so it won't output anything unless you want it to. Specify debug with -d; repeat for higher debug levels. You can also specify a nonstandard config file with -f path/to/file.

If you use logwatcher, let me know. I'm always up for adding new features.

Project page: projects/logwatch

logwatcher in action

Kenya, one of my machines, sends me security reports daily (thanks, FreeBSD!). Login failures:
Sep 15 11:15:24 kenya sshd[32882]: Failed password for illegal user a from port 2946 ssh2
Sep 15 11:58:55 kenya sshd[32986]: Failed password for illegal user root from port 40961 ssh2
Sep 15 21:59:03 kenya sshd[34537]: Failed password for illegal user test from port 3614 ssh2
Notice how there was only one root-user attempt and only 2 illegal-user attempts? My logwatcher is configured to instantly block any root login attempts as well as anyone who tries to log in with an invalid user more than once. Keeps the brute-force attempts out of my logs.

Doing this is certainly not a catch-all solution by any means, but it definitely keeps my security logs clear of idiots.

Anyone who gets blocked by being naughty on ssh goes into the whores table in pf. That table has been growing steadily for a few weeks now...

kenya(~) [1000] % sudo pfctl -t whores -T show | wc -l
Logwatcher most definitely isn't just for security, but the only thing I use it for is watching auth.log for brute-force bot activity. I'm hoping to have some spare time to spend on developing more neat features into logwatcher as time progresses. Right now, though, it's pretty slick. If you want more information about logwatcher, feel free to visit the logwatcher project page or find me online (aim or email) and bug me with questions or feature requests.

More updates to logwatch

Logwatch is gradually getting sexier as I throw more ideas and time into it. This recent batch of updates added, primarily, the following:

Users can define their own patterns in logwatch.conf:

patterns {
	clientandip = "\[client %IP%\]";
	hostname = "([\w\d-]+\.)*([\w-]+)";
}
This defines your own patterns in addition to the default ones. These new patterns are used the same way the default patterns are, as %clientandip% and %hostname%. A sample use of this would be:
file "/var/log/auth.log" {
	type "failed hostname lookup" {
		match = "reverse mapping checking getaddrinfo for %hostname% failed";
		threshold = 2;
		interval = 60;
		reaction = "echo 'Someone (%hostname%) is trying to ssh with broken dns... maybe they are being naughty?'";
	}
}

Patterns can be pretty much any regular expression perl will support. Also, as you can see with the clientandip pattern, you can embed existing patterns in new patterns, including your own. Recursion is limited to 10 levels by default; this may change in the future.
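Embedding patterns in patterns amounts to repeated substitution with a depth cap. A minimal sketch in Python (the `resolve` function is my own; the real implementation is Perl):

```python
import re

def resolve(patterns, name, depth=10):
    # Expand %name% references inside a pattern, recursing at most
    # `depth` levels deep; mirrors the 10-level default limit.
    if depth == 0:
        raise RecursionError("pattern nesting too deep")
    def repl(m):
        return resolve(patterns, m.group(1), depth - 1)
    return re.sub(r"%(\w+)%", repl, patterns[name])

patterns = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "clientandip": r"\[client %IP%\]",
}
expanded = resolve(patterns, "clientandip")
```

A self-referencing pattern would keep recursing until the depth cap trips, which is why the limit exists.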

log-watching expert system

I got bored and wrote an expert system for doing log analysis and reaction. Its original intention was to watch auth.log for brute-force login attempts and block them on the firewall. It has turned into a far more flexible system for doing generic log-based matching and reaction. Reactions are based on a threshold of hits over time. The 'reaction' section of the config file specifies what command is run (this could be a simple shell script you call, for example).
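The threshold logic is essentially a sliding-window count per key. A minimal sketch (the class and method names are mine, and the real tool invokes a shell command rather than a Python callable):

```python
import time

class Watcher:
    # Fire a reaction when a key is seen `threshold` times within
    # `interval` seconds. Sketch of the idea, not logwatch's code.
    def __init__(self, threshold, interval, reaction):
        self.threshold, self.interval, self.reaction = threshold, interval, reaction
        self.hits = {}

    def hit(self, key, now=None):
        now = time.time() if now is None else now
        # Keep only hits still inside the interval window, then add this one.
        window = [t for t in self.hits.get(key, []) if now - t < self.interval]
        window.append(now)
        self.hits[key] = window
        if len(window) >= self.threshold:
            self.reaction(key)  # e.g. run a shell script that blocks the IP

fired = []
w = Watcher(threshold=2, interval=60, reaction=fired.append)
w.hit("10.0.0.1", now=100)
w.hit("10.0.0.1", now=130)  # second hit within 60s -> reaction fires
```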

There are a few features I'll probably be adding soon such as multiple threshold/reactions per match type, but that's somewhat down the road for when I have more boredom to throw at the project. I also want to allow users to add their own meta globs (like %USERNAME%) into the config file so the program is even more flexible.

Currently it runs on my mirror server and blocks excess (brute force) ssh attempts; it seems to be going well. The development process took me into learning a very slick perl module called Parse::RecDescent, which parses documents based on a given grammar. I used this for the config file, and it was pleasantly easy to use. Check out logwatch and download it.

It requires the following perl modules:

  • File::Tail
  • Regexp::Common
  • Parse::RecDescent