Grok (pcre grok) nested predicates
Posted Thu, 09 Oct 2008
I've spent the past few days refactoring and redesigning some of grok (the C
version). Some of the methodology was using lazy test-driven design (writing
tests in parallel, rather than before), which seemed to help me get the code
working quicker.
We can now nest predicates, so you could ask to match an ip or host which has a word in it that matches 'google'. This example is a little silly, but it does show nested expressions.
% echo "www.cnn.com something google.com" \
| ./main '%{IPORHOST=~/%{WORD=~,google,}/}' IPORHOST
google.com
I switched away from using tsearch(3) and over to using in-memory bdb; I've
been happy ever since. Predicates can now live in an external library in
preparation for allowing you to write predicates in a scripting language like
Python or Ruby.
I'm using CUnit plus a few script hacks to do the testing. It's working pretty well. I have a few hacks (check svn for these), but the results look like this:
% make test Test: grok_capture.test.c:test_grok_capture_encode_and_decode ... passed Test: grok_capture.test.c:test_grok_capture_encode_and_decode_large ... passed Test: grok_capture.test.c:test_grok_capture_get_db ... passed Test: grok_capture.test.c:test_grok_capture_get_by_id ... passed Test: grok_capture.test.c:test_grok_capture_get_by_name ... passed Test: grok_pattern.test.c:test_grok_pattern_add_and_find_work ... passed ...