Jordan Sissel
geek

Fri, 28 Mar 2008

C++Grok bindings working in Python

% python example.py "%SYSLOGDATE%" < /var/log/messages | head -1
{'MONTH': 'Mar', '=LINE': 'Mar 23 06:47:03 snack syslogd 1.4.1#21ubuntu3: restart.', '=MATCH': 'Mar 23 06:47:03', 'TIME': '06:47:03', 'SYSLOGDATE': 'Mar 23 06:47:03', 'MONTHDAY': '23'}
That's right. I can now use C++Grok from Python.

After I saw it work, I immediately ran a time check against the perl version:

% seq 20000 > /tmp/x
% time python example.py "%NUMBER>5000%" < /tmp/x > /tmp/x.python
0.59s user 0.00s system 99% cpu 0.595 total
% time perl grok -m "%NUMBER>5000%" -r "%NUMBER%" < /tmp/x  > /tmp/x.perl
4.86s user 0.94s system 18% cpu 31.647 total
The same basic operation is about 50x faster by wall-clock time (0.6s versus 31.6s) in Python with the c++grok bindings than in the pure Perl version. Excellent. Sample Python code:
import pygrok
g = pygrok.GrokRegex()
g.add_patterns( <dictionary of patterns> )
g.set_regex("%NUMBER>5000%")
match = g.search("hello there 123 456 7890 pants")
if match:
  print match["NUMBER"]
# prints '7890'
I knew I wasn't doing reference counting properly, so to test that I ran the Python code against an input set of 1,000,000 lines and watched the memory usage, which clearly showed a leak. I quickly read up on reference counting in Python and which functions return new references and which return borrowed ones. A few keystrokes later my memory leaks were gone. After that I put Python in the test suite and am ready to push a new version of c++grok.

Download: cgrok-20080327.tar.gz

Python Build instructions:

% cd pygrok
% python setup.py install

# make sure it's working properly
% python -c 'import pygrok'
There is an example and some docs in the pygrok directory.

Let me know what you think :)

Permalink: /geekery/python-cppgrok-bindings
posted at: 01:31

Mon, 24 Mar 2008

Python C++ Grok bindings

I've gotten quite a bit further tonight on making c++grok's functionality available in python.

Tonight's efforts have mostly been spent learning the Python C API and how to add new objects and methods. I'm planning to have this ready for BarCampRochester3 in two weeks.

So far I can make new GrokRegex objects and call set_regex() and search() on them. Next time I'll be implementing GrokMatch objects (like in the C++ version) and a few other small things. Fun fun :)

Permalink: /geekery/python-cgrok-bindings-2
posted at: 13:36

Sat, 09 Feb 2008

Looks scary, actually simple: grammar parsing.

I hit a mental roadblock a few days ago; I was afraid to write a grammar parser for c++grok that supported the same basic format as the perl grok config format.

Perl grok's config grammar was super easy to write thanks to Parse::RecDescent. In the C++ version, I wanted similar ease. However, the tools I had available didn't appear to be expressive enough to support what I wanted. The config object in C++ was going to be a class, so you were free to have multiple config objects, which meant I couldn't have any global variables. Both Boost Xpressive and Boost Spirit support grammar parsing almost trivially, but they require awkward wrapping and are very hard to use when you want to update values in a class instance instead of some global variables.

Eventually, I gave up and wrote my own recursive descent bits using Xpressive to do the pattern matching and some trivial in-object state management to keep track of what was going on. It was really simple, despite my fears.

I'm not really sure what made me afraid of doing it, but the fear was totally unfounded.
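
To make the idea concrete, here is a minimal Python sketch of a recursive descent parser that keeps all of its state on the instance, so several parsers can coexist without globals. It is not the c++grok code: Python's `re` stands in for Xpressive, the class and method names are hypothetical, and the grammar is only a guess at a block format of the shape `kind "name" { key = "value"; ... };`.

```python
import re

# One token is a quoted string, a bare word, or a punctuation character.
# Characters that match none of these (such as whitespace) are skipped.
TOKEN_RE = re.compile(r'"((?:[^"\\]|\\.)*)"|(\w+)|([{}=;])')


class ConfigParser:
    """Recursive descent parser; token list and position live on the object."""

    def __init__(self, text):
        self.tokens = []
        for m in TOKEN_RE.finditer(text):
            string, word, punct = m.groups()
            if string is not None:
                self.tokens.append(("string", string))
            elif word is not None:
                self.tokens.append(("word", word))
            else:
                self.tokens.append(("punct", punct))
        self.pos = 0

    def _peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return (None, None)

    def _take(self, kind, value=None):
        tok_kind, tok_value = self._peek()
        if tok_kind != kind or (value is not None and tok_value != value):
            raise SyntaxError("expected %s, got %r" % (value or kind, tok_value))
        self.pos += 1
        return tok_value

    def parse(self):
        blocks = []
        while self.pos < len(self.tokens):
            blocks.append(self._block())
        return blocks

    def _block(self):
        # block := WORD STRING "{" (setting | block)* "}" ";"
        kind = self._take("word")
        name = self._take("string")
        self._take("punct", "{")
        settings, children = {}, []
        while self._peek() != ("punct", "}"):
            # A word followed by "=" is a setting; anything else is a nested block.
            if self.tokens[self.pos + 1:self.pos + 2] == [("punct", "=")]:
                key = self._take("word")
                self._take("punct", "=")
                settings[key] = self._take("string")
                self._take("punct", ";")
            else:
                children.append(self._block())
        self._take("punct", "}")
        self._take("punct", ";")
        return {"kind": kind, "name": name,
                "settings": settings, "children": children}


sample = 'exec "tail -1 /var/log/auth.log" { type "syslog" { match = ".*"; }; };'
print(ConfigParser(sample).parse())
```

Each grammar rule becomes one method, and the only bookkeeping is `self.pos`, which is roughly the "trivial in-object state management" the post describes.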

Permalink: /geekery/parsing-config-files
posted at: 17:08

C++ Grok has working filters and exec sections now.

I finished implementing exec and filters:
exec "tail -1 /var/log/auth.log" {
  type "syslog" {
    match = ".*";
    reaction = "echo %=MATCH|shellescape%";
  };
};
I've made a point of having perl-grok's config format work, because I think it was a reasonable format (you're free to disagree!). At any rate, filters are now working, and the result of the above code is:
Reaction: echo Feb  8 23:25:01 snack CRON\[21596\]: pam_unix\(cron:session\): session closed for user root
Checking for input: tail -1 /var/log/auth.log(0x74b100)
Reading from: tail -1 /var/log/auth.log

Feb 8 23:25:01 snack CRON[21596]: pam_unix(cron:session): session closed for user root
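
For readers who want to see what a filter like `shellescape` does to a match, here is a rough Python sketch. The set of escaped characters is an assumption chosen so that the sample above comes out the same way; the real filter's list may differ, and `expand_reaction` is a hypothetical helper that only understands the one `%=MATCH|shellescape%` expression.

```python
# Assumed set of shell metacharacters to backslash-escape.
SHELL_META = set("`$\"'\\()[]{}<>|&;*?!~#")


def shellescape(value):
    """Prefix each assumed shell metacharacter with a backslash."""
    return "".join("\\" + ch if ch in SHELL_META else ch for ch in value)


def expand_reaction(template, match):
    """Substitute the escaped match into a reaction template."""
    return template.replace("%=MATCH|shellescape%", shellescape(match))


line = ("Feb  8 23:25:01 snack CRON[21596]: "
        "pam_unix(cron:session): session closed for user root")
print(expand_reaction("echo %=MATCH|shellescape%", line))
```

Escaping before substitution matters because the expanded reaction is handed to a shell, where unescaped brackets and parentheses would be interpreted rather than echoed.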

Permalink: /geekery/c-grok-filters-working
posted at: 02:28

Mon, 14 Jan 2008

Vim function to make g++ errors readable.

If you've ever used templates in C++, you've probably gone blind trying to read the compiler errors.
grokmatch.hpp:7: error: 'typedef class std::map<std::basic_string<char,
std::char_traits<char>, std::allocator<char> >, std::basic_string<char,
std::char_traits<char>, std::allocator<char> >,
std::less<std::basic_string<char, std::char_traits<char>, std::allocator<char>
> >, std::allocator<std::pair<const std::basic_string<char,
std::char_traits<char>, std::allocator<char> >, std::basic_string<char,
std::char_traits<char>, std::allocator<char> > > > >
GrokMatch<boost::xpressive::basic_regex<__gnu_cxx::__normal_iterator<const
char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >
> >::match_map_type' is private
I'm supposed to read all that crap? Especially since 99% of the data isn't useful in most cases. The following vim script sanitizes this output:
function! GPPErrorFilter()
  silent! %s/->/ARROW/g
  while search("<", "wc")
    let l:line = getline(".")
    let l:col = col(".")
    let l:char = l:line[l:col - 1]
    if l:char == "<"
      normal d%
    else
      break
    endif
  endwhile
  silent! %s/ARROW/->/g
  silent %!awk '/: In/ { print "---------------"; print }; \!/: In/ {print }'
endfunction
If I dump the output of make to a file (including stderr), and run the function while in vim, using ':call GPPErrorFilter()', the output turns into this:
g++ -g -I/usr/local/include -c -o main.o main.cpp
---------------
grokmatch.hpp: In function 'int main(int, char**)':
grokmatch.hpp:7: error: 'typedef class std::map GrokMatch::match_map_type' is private
main.cpp:43: error: within this context
make: *** [main.o] Error 1
So much better... Now I know I'm clearly trying to access a private typedef. Sanity++
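
The same cleanup can be sketched outside of vim. The Python below is an approximation, not a port: it drops every balanced `<...>` span while leaving `->` alone, then prints a separator before each line containing ": In", as the awk step does. The function names are hypothetical, and it assumes every `<` in the input opens a template argument list, so a comparison like `a < b` in an error message would confuse it.

```python
def strip_template_args(text):
    """Drop every balanced <...> span, leaving '->' arrows outside them intact."""
    out = []
    depth = 0  # number of unclosed '<' characters we are currently inside
    i = 0
    while i < len(text):
        if text.startswith("->", i):
            # An arrow is not a closing bracket; keep it when outside a template.
            if depth == 0:
                out.append("->")
            i += 2
            continue
        ch = text[i]
        if ch == "<":
            depth += 1
        elif ch == ">" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)


def filter_gpp_errors(text):
    """Strip template arguments and put a separator before each ': In' line."""
    lines = []
    for line in strip_template_args(text).splitlines():
        if ": In" in line:
            lines.append("-" * 15)
        lines.append(line)
    return "\n".join(lines)


raw = ("main.cpp: In function 'int main(int, char**)':\n"
       "main.cpp:43: error: 'std::map<std::string, std::vector<int> > m' is private")
print(filter_gpp_errors(raw))
```

Counting nesting depth plays the role of the `d%` motion: everything between a `<` and its matching `>` disappears, however deeply nested.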

Permalink: /geekery/vim-function-to-make-errors-readable
posted at: 00:39

Wed, 11 Oct 2006

keynav being ported to windows.

I'm in the process of porting keynav to Windows. I've never programmed in Visual Studio or coded for this platform before, but I think it's going quite well.

The current total is 277 lines of code. I expect it to be about this number once I'm finished.

I'm writing it using Visual C++ Express, a free version of Visual Studio. Free (after free registration). From Microsoft. Very cool :)

So far I have screen splitting working correctly. My clip code is kinda borked. After I fix that, it should be completely trivial to add mouse movement calls. Since Windows doesn't typically use sloppy focus, I think I'll add extra code to figure out what window the mouse is over and give that window focus.

Permalink: /productivity/keynav-windows-port
posted at: 04:37
