Search this site


Metadata

Articles

Projects

Presentations

Grok (pcre grok) nested predicates

I've spent the past few days refactoring and redesigning some of grok (the C version). Some of the methodology was using lazy test-driven design (writing tests in parallel, rather than before), which seemed to help me get the code working quicker.

We can now nest predicates, so you could ask to match an ip or host which has a word in it that matches 'google'. This example is a little silly, but it does show nested expressions.

% echo "www.cnn.com something google.com" \
  | ./main '%{IPORHOST=~/%{WORD=~,google,}/}' IPORHOST
google.com
I switched away from using tsearch(3) and over to using in-memory bdb; I've been happy ever since. Predicates can now live in an external library in preparation for allowing you to write predicates in a scripting language like Python or Ruby.

I'm using CUnit plus a few script hacks to do the testing. It's working pretty well. I have a few hacks (check svn for these), but the results look like this:

% make test
  Test: grok_capture.test.c:test_grok_capture_encode_and_decode ... passed
  Test: grok_capture.test.c:test_grok_capture_encode_and_decode_large ... passed
  Test: grok_capture.test.c:test_grok_capture_get_db ... passed
  Test: grok_capture.test.c:test_grok_capture_get_by_id ... passed
  Test: grok_capture.test.c:test_grok_capture_get_by_name ... passed
  Test: grok_pattern.test.c:test_grok_pattern_add_and_find_work ... passed
...