photo
Jordan Sissel
geek

Tue, 15 Jan 2008

Grok predicates - Perl vs C++

I just finished implementing predicates in c++grok (tentative name) and wanted to compare the performance against perl grok.

An input of 50000 lines of apache logfile amounting to 9.7megs of data.

I initially attempted this using the regex predicate %IP~/^129% but I realized that perl grok compiles the predicate regex every time it is executed, and wasn't a fair test. So I switched to %IP>=129% instead, which converts the match to an integer first (so 129.21.60.9 turns into 129, for example), which seems like more equal ground based on the implementations in both perl and C++.

# C++ Grok
% /usr/bin/time ./test_patterns "%IP>=129%" < /tmp/access.50klines > /dev/null
2.56user 0.14system 0:02.92elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+408minor)pagefaults 0swaps

# Perl Grok
% /usr/bin/time perl grok -m "%IP>=129/%" -r "%IP%" < /tmp/access.50klines > /dev/null
8.87user 1.24system 0:25.94elapsed 39%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+17721minor)pagefaults 0swaps
What still remains consistent is the trend that the more complexity I add in C++ equates to a greater margin of speed from the perl version.
  • Using strict %FOO% patterns with no predicates, the C++ version is 6 to 7 times faster than the perl equivalent in grok.
  • Using predicates shows the C++ version running 10 times faster.
I still need to write test cases for the C++ version in addition to porting the pattern discovery portion from perl.

Exciting :)

Comments: 0 (view comments)
Tags: , , , ,
Permalink: /geekery/grok-predicates-perl-vs-cplusplus
posted at: 02:31

Search this site

Navigation

Metadata

Home About Resume My Code (SVN)

Articles

ARP Security Dynamic DNS with DHCP OpenLDAP+Kerberos+SASL PPP over SSH SSH Security: /bin/false Week of Unix Tools Work Efficiency

Projects

fex firefox tabsearch firefox urledit grok keynav liboverride newpsm (FreeBSD) nis2ldap pam_captcha poor man's backup Solaris audio utility xboxproxy xdotool xmlpresenter xpathtool misc scripts

Presentations

Yahoo! Hack Day '08 Yahoo! Hack Day '06 Unix Essentials Vi/Vim Essentials SSH Tunneling (Video)

Tag Cloud

Calendar

< January 2008 >
SuMoTuWeThFrSa
   1 2 3 4 5
6 7 8 9101112
13141516171819
20212223242526
2728293031  

Friends

BarCamp Kent Brewster Tantek Çelik John Resig Wesley Shields Tyler Shields

Technorati