Search this site


Metadata

Articles

Projects

Presentations

Grok beta 20081228 available

The new C version of grok is ready for beta testing.

The requirements are listed in the INSTALL file. There are piles of differences between the new C version and the old perl version, including a different config file syntax to let you more easily batch common input sets through the same set of matches. I'll publish a complete feature list when I get around to it, which isn't right now.

The tarball comes with a sample grok.conf that shows you a a few different things you can do with the new version.

To run it, once you've built it, you must have a 'grok.conf' in the same directory from which you are running the 'grok' binary.

Please send any questions you have to [email protected]

Download: grok-beta-20081228.tar.gz

Grok (pcre grok) nested predicates

I've spent the past few days refactoring and redesigning some of grok (the C version). Some of the methodology was using lazy test-driven design (writing tests in parallel, rather than before), which seemed to help me get the code working quicker.

We can now nest predicates, so you could ask to match an ip or host which has a word in it that matches 'google'. This example is a little silly, but it does show nested expressions.

% echo "www.cnn.com something google.com" \
  | ./main '%{IPORHOST=~/%{WORD=~,google,}/}' IPORHOST
google.com
I switched away from using tsearch(3) and over to using in-memory bdb; I've been happy ever since. Predicates can now live in an external library in preparation for allowing you to write predicates in a scripting language like Python or Ruby.

I'm using CUnit plus a few script hacks to do the testing. It's working pretty well. I have a few hacks (check svn for these), but the results look like this:

% make test
  Test: grok_capture.test.c:test_grok_capture_encode_and_decode ... passed
  Test: grok_capture.test.c:test_grok_capture_encode_and_decode_large ... passed
  Test: grok_capture.test.c:test_grok_capture_get_db ... passed
  Test: grok_capture.test.c:test_grok_capture_get_by_id ... passed
  Test: grok_capture.test.c:test_grok_capture_get_by_name ... passed
  Test: grok_pattern.test.c:test_grok_pattern_add_and_find_work ... passed
...

Revision 2000

I spent some time putting love into cgrok (uses libpcre) tonight.
  • Logging facility to help in debugging. Lets you choose what features you want logging (instead of lame warn/info/number log levels)
  • Added string and number comparison predicates
  • Wrote a few more tests which uncovered some bugs
I also broke 2000 revisions in subversion. Yay.
Sending        test/Makefile
Transmitting file data .
Committed revision 2001.

Python C++ Grok bindings

I've gotten quite a bit further tonight on making c++grok's functionality available in python.

Mostly tonight's efforts have been spent learning the python C api and learning how to add new objects and methods. I'm planning to have this ready for BarCampRochester3 in two weeks.

So far I can make new GrokRegex objects and call set_regex() and search() on them. Next time I'll be implementing GrokMatch objects (like in the C++ version) and a few other small things. Fun fun :)