Jordan Sissel
geek

Mon, 31 Dec 2007

Goodbye, 2007!

To make this year's review cooler than last year, I wrote a python script to generate a tag cloud and fed it only the list of tags mentioned in posts I've made this year.

This year was pretty sweet.

Basic life summary: Still loving it at Google. Got a house. Getting married soon.

This year started off with me using EC2 for a side project. Along with EC2, I had to think about scaling mysql and tomcat. This same side project made me rant about mysql's query cache.

I also spent many hours putting crazy features into grok. Unsatisfied with the original predicate implementation, I came up with this hack to run arbitrary pattern-matching code within a regular expression to affect the outcome of the match, and then implemented it in grok. A few months after that, I checked off another todo item by implementing pattern discovery.

I also started working on monitoring. I mentioned this idea in some detail last year, and had a really crappy prototype. This year, I experimented with Berkeley DB and Python to get simple key-value pair storage. All of the work so far is still very primitive, but I did have a working prototype.

I've put thousands of miles of travel in this year: Shmoocon (Washington, DC), MashupCamp Dublin (Dublin, Ireland), Defcon 15 (Las Vegas), Barcamp Block (Palo Alto, CA), SuperHappyDevHouse 18 (Hillsborough, CA).

As expected, a few projects stayed on the back burner. One of these is my FreeBSD work redoing the mouse driver system. Given that I've had commit access to FreeBSD for a year now and haven't done much with it, I'm hoping I can spend more time working on that project, as it is my favorite platform. The code has been ready to commit for a long time, and I just haven't gotten around to it :\

New projects: fex, firefox-tabsearch, firefox-urledit, liboverride, xdotool, and xpathtool.

Some of my favorite hacks this past year included pulling album covers from amazon, muting music when your screen is locked, fast log splitting, a mini-freebsd script, and shell shortcuts.

With that I bid farewell to 2007, and continue to eagerly look forward to the future. The only plans I have set this year are helping again run Hack or Halo at Shmoocon in addition to putting serious time into FreeBSD.

Comments: 0 (view comments)

Permalink: /geekery/year-in-review-2007
posted at: 20:44

ssh honeypot auditing

I've only gotten a few hits on my honeypot, and none of the bots seem to be doing much. I think it might be because the shell I have set up doesn't behave correctly. Here's the new one:
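#!/bin/bash
# Log each session to a timestamped file.
d="$(date "+%Y%m%d-%H%M%S")"
logfile="/var/log/traps/$d"
# Record the environment and the arguments sshd passed to this shell.
env > "$logfile"
echo "Args: $*" >> "$logfile"
export SHELL=/bin/bash
# Run the real shell under script(1), appending the session to the log.
script -c "$SHELL $*" -q -a "$logfile"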
This will log the env vars in addition to the arguments passed to the shell. Thus far, I've seen 2 patterns of environment variables.

This new version supports arguments, so that things like 'ssh user@host somecommand' work. The next step is probably to have a setuid program chown the logfile to root shortly after script(1) starts, so that you can't remove your own log. I'll only bother with that if it's necessary.

In addition to the shell change, I started looking into the audit facility in Linux. I want to log all command execution, in case my script(1) idea fails. To do this, I added these rules with auditctl:

auditctl -a exit,always -F uid=60000 -S open
auditctl -a exit,always -F uid=60000 -S execve
auditctl -a exit,always -F uid=60000 -S vfork
auditctl -a exit,always -F uid=60000 -S fork
auditctl -a exit,always -F uid=60000 -S clone
I'm not entirely sure if this will specifically catch the execs I'm looking for, but it does seem to work:
% ausearch -sc execve | grep EXECVE
type=EXECVE msg=audit(1199138086.041:3293): a0="/bin/bash" a1="-c" a2="uptime"-
type=EXECVE msg=audit(1199138086.056:3300): a0="uptime"-
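For going through more than a couple of these records, here is a minimal Python sketch that pulls the argument list out of each EXECVE line. The helper name `parse_execve` is made up for this example, and it assumes arguments always appear in the quoted `aN="..."` form shown above; arguments that auditd writes in another encoding are simply not picked up.

```python
import re

# Matches fields such as a0="/bin/bash" in an EXECVE audit record.
ARG_RE = re.compile(r'\ba(\d+)="([^"]*)"')


def parse_execve(line):
    """Return the argv list from one type=EXECVE line, or None for other records."""
    if "type=EXECVE" not in line:
        return None
    args = {int(index): value for index, value in ARG_RE.findall(line)}
    # Order by argument index rather than by position in the line.
    return [args[index] for index in sorted(args)]


sample = [
    'type=EXECVE msg=audit(1199138086.041:3293): a0="/bin/bash" a1="-c" a2="uptime"',
    'type=EXECVE msg=audit(1199138086.056:3300): a0="uptime"',
]

for line in sample:
    print(parse_execve(line))
```

Feeding it the output of `ausearch -sc execve` line by line would give one command per record, which is easier to skim than the raw audit format.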

Comments: 1 (view comments)
Permalink: /geekery/honeypot-auditing
posted at: 16:59

Vim indentation

More than a year ago, I expressed some frustration about cindent in vim. My main complaint was that it made bad decisions about indentation in some languages that are not strictly C-syntax (perl, python, javascript).

Tonight I decided that I wanted to automate indenting to the closest '(' as in:

if (foo() and bar()
    and baz):
    ^ Want to indent to here, somehow, on command.
The 'cindent' feature of vim lets you configure this to happen automatically, but in some cases it won't indent properly. A comment with a ( at the end of the line, for example, will screw it up.

I got tired of dealing with it, so I went back to autoindent, and I've been happier ever after. Fooling around tonight, I started working on a vim function to basically do exactly what I needed. An hour later, it was done. In the process, I wanted to confirm the default actions of ctrl+f in insert mode, which led me to the cinkeys docs, which clued me in that 'cindent' only autoindents on certain occasions.

All of my time was wasted, it seems, after I figured out setting this option:

set cinkeys=!^F
Now cindent only activates when I hit ctrl+f. If I have both autoindent and cindent enabled, with this cinkeys setting, the default indentation behavior is exactly autoindent, and I can invoke cindent at will.

The following is now set in my .vimrc:

set autoindent
set cindent                     " Use c-style indentation
set cinkeys=!^F                 " Only indent when requested
set cinoptions=(0t0c1           " :help cinoptions-values

If you're interested in the vim script I wrote, which I no longer need, you can download it here: paren_indent.vim

Comments: 0 (view comments)
Permalink: /geekery/vim-indentation-revisited
posted at: 05:50

Thu, 27 Dec 2007

ssh honeypot.

Using slight variations on the techniques mentioned in my previous post, I've got a vmware instance running Fedora 8 that permits any and all logins. These login sessions are logged with script(1).

Fedora 8 comes with selinux enabled by default. This means sshd was being denied permission to execute my special logging shell. The logs in /var/log/audit/ explained why, and audit2allow even tried to help make a new policy entry for me. However, I couldn't figure out (read: be bothered to search for more than 10 minutes) how to install this new policy. In searching, I found out about chcon(1). A simple command fixed my problems:

chcon --reference=/bin/sh /bin/sugarshell
The symptoms prior to this fix were that I could authenticate, but upon login I would get a '/bin/sugarshell: Permission Denied' that wasn't logged by sshd.

There are plenty of honeypot software tools out there, but I really wasn't in the mood for reading piles of probably-out-of-date documentation about how to use them. This hack (getpwnam + pam_permit + logging shell) took only a few minutes.

As a bonus, I found a feature in Fedora's yum tool that I also like in FreeBSD's packaging system: It's trivial to ask "Where did this file come from?" Doing so made me finally look into how to do it in Ubuntu.

FreeBSD: pkg_info -W /usr/local/bin/ssh
/usr/local/bin/ssh was installed by package openssh-portable-4.7.p1,1
Fedora: yum whatprovides /usr/bin/ssh
openssh-server.x86_64 : The OpenSSH server daemon
Ubuntu: dpkg -S /usr/bin/ssh
openssh-client: /usr/bin/ssh
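
The three lookups above can be wrapped in one small helper. This is only a sketch: the platform labels and the function names `owner_command` and `who_owns` are invented for the example, and the commands are exactly the ones listed above, with no attempt to detect which system you are on.

```python
import subprocess

# Commands taken from the list above; the keys are labels chosen for this sketch.
OWNER_QUERY = {
    "freebsd": ["pkg_info", "-W"],
    "fedora": ["yum", "whatprovides"],
    "ubuntu": ["dpkg", "-S"],
}


def owner_command(platform, path):
    """Build the 'which package owns this file?' command for a platform."""
    return OWNER_QUERY[platform] + [path]


def who_owns(platform, path):
    """Run the lookup and return whatever the package tool printed."""
    result = subprocess.run(
        owner_command(platform, path), capture_output=True, text=True
    )
    return result.stdout.strip()


print(owner_command("ubuntu", "/usr/bin/ssh"))
```

Calling `who_owns("ubuntu", "/usr/bin/ssh")` only makes sense on a machine that actually has the matching package tool installed.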

Let's see what I catch.

Comments: 0 (view comments)
Permalink: /geekery/ssh-honeypot-is-alive
posted at: 03:43

Sat, 22 Dec 2007

liboverride project page is up.

I finally got around to putting up a project page for liboverride.

Location: /projects/liboverride

Comments: 0 (view comments)
Permalink: /geekery/liboverride-project-page
posted at: 21:13

Tracking and Analyzing SSH Bots.

I've posted previously about what can be done about ssh bots. In this same context, I've just finished working on a new idea: Tracking the username/passwords used by the bots.

To track the login attempts, I wrote a new pam module: pam_logfailure. The goal of pam_logfailure is to log the passwords used by bots attempting to bruteforce logins. However, when I installed the module, I found that it wasn't working properly:

Dec 20 12:24:50 kenya2 pam_logfailure: host:125.243.206.194 user:john pass:^H ^M^?INCORRECT
I saw line after line of these, and couldn't figure out why the bots were using this as a password. Turns out they aren't. This password is what OpenSSH forces upon pam for users that do not exist. This is apparently by design:
auth-pam.c: static char badpw[] = "\b\n\r\177INCORRECT";
If you are an invalid user, or are trying to login as root while root login is disabled, the password you sent is replaced with 'badpw' above. This makes it kind of hard to track what passwords bots are using...

Thankfully, I was already one step ahead of myself when I wrote a function injection tool back in September (liboverride). So, all I had to do was inject my own 'getpwnam' function to spoof data when a user did not exist to trick OpenSSH into passing the password through.

After injecting my own getpwnam(), pam_logfailure started working just fine:

Dec 22 11:17:47 kenya2 pam_logfailure: host:218.1.65.233 user:admin pass:admins
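
Once real passwords show up in the log, tallying them is straightforward. The following Python sketch assumes the log lines look exactly like the two samples in this post; the names `parse` and `tally` are invented here, and the placeholder password is matched in the caret form that syslog printed above rather than as raw control characters.

```python
import re
from collections import Counter

# The OpenSSH placeholder password as it appears in the syslog line above.
BADPW_LOGGED = "^H ^M^?INCORRECT"

LINE_RE = re.compile(r"pam_logfailure: host:(\S+) user:(\S+) pass:(.*)$")


def parse(line):
    """Return (host, user, password) for a pam_logfailure line, else None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return match.groups()


def tally(lines):
    """Count (user, password) pairs, skipping the OpenSSH placeholder."""
    counts = Counter()
    for line in lines:
        record = parse(line)
        if record is None or record[2] == BADPW_LOGGED:
            continue
        counts[(record[1], record[2])] += 1
    return counts


sample = [
    "Dec 20 12:24:50 kenya2 pam_logfailure: host:125.243.206.194 user:john pass:^H ^M^?INCORRECT",
    "Dec 22 11:17:47 kenya2 pam_logfailure: host:218.1.65.233 user:admin pass:admins",
]

for (user, password), count in tally(sample).most_common():
    print(count, user, password)
```

Keeping the host in the parsed record leaves room for grouping attempts per bot later.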
So where will I go next with these ssh-bot games?
  • Reverse-hack. I picked 3 random ssh bot hosts from my logs, and all of them run sshd. It would be pretty trivial to take the password attempts used against my machine and try them on the host the bot is coming from. Seems likely that turning the bot's actions on itself will grant me access to the infected machine.
  • Redirect to a honeypot. We could detect when a bot is trying to login, and add a firewall rule that would put future ssh attempts from these hosts into a honeypot which accepts all logins to see what happens.
  • Fingerprint ssh bots by behavior.

The usage of getpwnam.over is like any other liboverride code. 'make getpwnam.so' and then use "LD_PRELOAD=/path/to/getpwnam.so ". In this case, I added this line to /usr/local/etc/rc.d/openssh (my sshd start script):

export LD_PRELOAD=/path/to/getpwnam.so

Here is the code:

Comments: 1 (view comments)
Permalink: /geekery/tracking-ssh-bots
posted at: 16:37

Thu, 20 Dec 2007

VMware Server 2.0 Beta

I upgraded my vmware machine from vmware 1.3 to vmware 2.0 beta. The install process was great by comparison to the last two releases. This install was much nicer than the previous one for the simple reasons that I didn't have to hack the perl script to make it behave, and I didn't have to mess around compiling or finding my own vmware kernel modules. Everything Just Worked during the install.

On the downside, vmware-server-console is deprecated. Vmware Server 2.0 uses Vmware Infrastructure, which appears to be tomcat+xmlrpc and other things. The New Order seems to be that you manage your vms with the webbrowser, which isn't a bad idea. However, we must remember that Good Ideas do not always translate into Good Implementations.

The web interface looks fancy, but the code looks like it's from 1998. The login window consists of layers and layers of nested tables and a pile of javascript all in the name of getting the login window centered in the browser. You can see the page align itself upon rendering even on my 2GHz workstation with Firefox. Horrible.

Once you log in, you're presented with a visually-useful-but-still-runs-like-shit interface. The interface itself appears useful and nice, but again fails to respond quickly presumably due to the piles of poorly written javascript involved.

Since VMware thought this was a fresh install, it didn't know about any of my old virtual machines. Adding them using the web interface causes vmware to crash. Oops. So, I found a vmware infrastructure client executable randomly in the package; "find ./ -name '*.exe'" will find it for you. Copied this to my windows box and installed it. I used this tool to re-add my old vmware machines.

Unfortunately, "raw disks" are disabled in this free version of vmware server. I'm not sure why. My Solaris VM uses raw disks for its zfs pool, so this was a problem. Luckily, this is purely a gui limitation and not a vmware limitation. To repair my Solaris VM, I created a new virtual machine with the same features and told it where its first disk lived (the first disk was a normal file-backed vmware disk image). After that, I looked at the old vm's .vmx file and copied in the lines detailing the raw drives to the new .vmx file:

scsi0:1.present = "true"
scsi0:1.filename = "zfs-sdb.vmdk"
scsi0:1.deviceType = "rawDisk"
scsi0:2.present = "true"
scsi0:2.filename = "zfs-sdc.vmdk"
scsi0:2.deviceType = "rawDisk"
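
Copying those lines by hand is fine for two disks. For a VM with more, a small Python sketch like the one below could pick them out of the old .vmx automatically. It assumes every setting is a simple `key = "value"` line as shown above; the function names are invented, and the `scsi0:0` entries in the sample are hypothetical stand-ins for the normal file-backed disk.

```python
import re

# One setting per line, in the form: key = "value"
VMX_LINE = re.compile(r'^\s*([^=\s]+)\s*=\s*"(.*)"\s*$')


def parse_vmx(text):
    """Parse .vmx text into a dict, keeping the original line order."""
    settings = {}
    for line in text.splitlines():
        match = VMX_LINE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def raw_disk_lines(settings):
    """Return the lines for every device whose deviceType is rawDisk."""
    devices = sorted(
        key.rsplit(".", 1)[0]
        for key, value in settings.items()
        if key.endswith(".deviceType") and value == "rawDisk"
    )
    lines = []
    for device in devices:
        for key, value in settings.items():
            if key.startswith(device + "."):
                lines.append('%s = "%s"' % (key, value))
    return lines


# The scsi0:0 lines are hypothetical; the rest come from the post.
old_vmx = '''
scsi0:0.present = "true"
scsi0:0.fileName = "solaris.vmdk"
scsi0:1.present = "true"
scsi0:1.filename = "zfs-sdb.vmdk"
scsi0:1.deviceType = "rawDisk"
scsi0:2.present = "true"
scsi0:2.filename = "zfs-sdc.vmdk"
scsi0:2.deviceType = "rawDisk"
'''

print("\n".join(raw_disk_lines(parse_vmx(old_vmx))))
```

The printed lines can then be appended to the new VM's .vmx file while the VM is powered off.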

Everything's back up and running sanely now in vmware. Hurray :)

Comments: 2 (view comments)
Permalink: /geekery/vmware-server-2.0-beta-1
posted at: 13:44

Fri, 07 Dec 2007

I gave in and got an iPhone.

I picked up an iphone tonight at the apple store in the mall. Yay new toy. The setup process was pretty simple. In fact, I feel comfortable stating that this cellphone purchase was absolutely the most pleasant cellular experience I've had.

  1. Go to Apple store.
  2. Say "I want to buy an iPhone"
  3. Someone hands you a box with an iPhone in it. You pay.
  4. Go home, plug iPhone into PC.
  5. Run iTunes. Follow the trivially simple activation steps(*)
  6. Rejoice now that your iPhone is activated without ever having to deal with morons at the AT&T retailers
(*) My desire to keep my current Cingular AT&T account was satisfied. I was presented with the option of transferring my current service to the new phone. It even set me up with a data plan. Pretty hassle-free.

Much love to Apple for making me not have to talk to anyone at the AT&T retail stores.

  • Uploading photos to flickr is the same as my previous phone: emailing photos to flickr
  • meebo.com (web2.0fancy instant messenger gateway) happily works on the iphone.
  • google reader works too
Comes standard with google maps and youtube apps.

Fancy. The first two project ideas that came to mind are both remote-control tools. One for a universal remote (basically, smashing buttons on a webpage tells an IR emitter to do things) and one to remotely control a PC (mouse, keyboard, etc). I'm guessing there will be a VNC client for the iphone out very shortly after the Apple SDK is released, so I'll think about the infrared remote control project.

Comments: 1 (view comments)
Permalink: /geekery/iphone-for-me
posted at: 03:27

Thu, 06 Dec 2007

C vs Python with Berkeley DB

I've got a stable, threaded version of this fancydb tool I've been working on. However, the performance of insertions is less than optimal.

Then again, how much should insert performance matter on a monitoring tool? For data that comes into it gradually, speed doesn't matter much. For bulk inserts, speed matters if you want to get your work done quickly. I haven't decided if bulk insertions are a necessary use case for this tool. Despite that, I'm still interested in what the limits are.

I have experimented with many different implementations of parallelism, buffering, caching, etc in the name of making insertion to a fancydb with 10 rules fast. The fastest I've gotten it was 10000/sec, but that was on an implementation that wasn't threadsafe (and used threads).

My most-recent implementation (which should be threadsafe) can do reads and writes at 30000/sec. With evaluation rules the write rate drops to about 10000/sec.

The next task was to figure out what I was doing wrong. For comparison, I wrote two vanilla bdb accessing programs. One in C and one in Python. The output of these two follows:

# The args for each program is: insertions page_size cache_size
% sh runtest.sh
Running: ./test 2000000 8192 10485760
  => 2000000 inserts + 1 fullread: 209205.020921/sec
Running: ./py-bsddb.py 2000000 8192 10485760
  => 2000000 inserts + 1 fullread: 123304.562269/sec
As expected, C clearly outperforms Python here, but the margin is pretty small (C is about 70% faster for this test). Given the 120000/sec rate from Python, the poor input rate of my tool seems to be my own fault. Is my additional code here really the reason that I can only write at 30000 per second? I may need to revisit how I'm implementing things in python. I'm not clear right now where I'm losing so much throughput.

So I used hotshot (a profiler in python's standard library) and found that most of the time is spent in my iterator method. This method is a generator method which uses yield and loops over a cursor.

It's important to note that my python bdb 'speed test' above did not use generators, it used a plain while loop over the cursor. So, I wrote another test that uses generators. First, let's try just inserts, no reading of data:

Running: ./test 1000000 8192 10485760
  => 1000000 inserts: 261096.605744/sec
Running: ./py-bsddb.py 1000000 8192 10485760
  => 1000000 inserts: 166389.351082/sec
Now let's try with 3 different python reading methods: while loop across a cursor, generator function (using yield), and an iterator class (implementing __iter__):
Running: ./py-bsddb.py 4000000 8192 10485760
  => 1 fullread of 4000000 entries: 8.660000
Running: ./py-bsddb_generator.py 4000000 8192 10485760
  => 1 fullread of 4000000 entries: 9.124000
Running: ./py-bsddb_iterable_class.py 4000000 8192 10485760
  => 1 fullread of 4000000 entries: 13.130000
I'm not sure why implementing an iterator is so much slower (in general) than a yield-generator is. Seems strange, perhaps my testing code is busted. Either way, I'm not really closer to finding the slowness.
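
To rule out Berkeley DB as a factor, the same three reading styles can be compared on a plain in-memory list. This is a sketch and not the test code from this post: the function and class names are invented, the data is a list of tuples standing in for a cursor, and it is written for Python 3, where the iterator method is `__next__`. One plausible explanation for the gap is that every step of an iterator class goes through a full Python-level method call, while a generator just resumes a suspended frame, but treat that as a hypothesis to measure rather than a conclusion.

```python
import time


def read_with_while(items):
    """Plain while loop, analogous to stepping a cursor by hand."""
    out = []
    i = 0
    n = len(items)
    while i < n:
        out.append(items[i])
        i += 1
    return out


def generate(items):
    """Generator function: yields one item per step."""
    i = 0
    n = len(items)
    while i < n:
        yield items[i]
        i += 1


class IterableReader:
    """Iterator class implementing __iter__ and __next__."""

    def __init__(self, items):
        self.items = items
        self.i = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= len(self.items):
            raise StopIteration
        value = self.items[self.i]
        self.i += 1
        return value


def timed(label, fn):
    """Run fn once, print the elapsed wall-clock time, return its result."""
    start = time.perf_counter()
    result = fn()
    print("%-16s %.3f sec" % (label, time.perf_counter() - start))
    return result


data = [(str(i), i) for i in range(200000)]
a = timed("while loop", lambda: read_with_while(data))
b = timed("generator", lambda: list(generate(data)))
c = timed("iterator class", lambda: list(IterableReader(data)))
```

If the ordering here matches the bdb numbers above, the slowdown is coming from the iteration style rather than from the database.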

get this code here

Comments: 0 (view comments)
Permalink: /geekery/c-vs-python-bdb
posted at: 03:59

Sat, 01 Dec 2007

Matplotlib makes me hate.

Let me caveat this rant with the fact that I've only been playing with matplotlib for approximately a week.

All the demos made matplotlib (a python module) look like a great tool that I should want to use to graph things. Then I started trying to actually write code, and it all went downhill.

Almost all of the functions operate on some mystical global scope, meaning they are by design not threadsafe. Probably not a big deal, I guess, but it certainly feels like an alien world especially given all the object oriented code in use in python.

If this culture shock wasn't bad enough, it went ahead and decided to use inches and ratios as the standard units of measure. You make a figure of a set width and height (in inches) and you can put stuff in that figure given ratio offsets. An offset of '.5' would put your left-bound in the middle. Weird and unexpected. Perhaps not bad. Still, I'm used to pixels, not inches.

Some of the arguments are just looney:

  fig.add_subplot(111)
The docs say this "subplot(211) # 2 rows, 1 column, first (upper) plot". Base10 flag system? What. the. F. I'm at a loss as to why this was ever a good idea. Let's make it hard to add plots? Looks like you can use 'subplot(rows, cols, plotnum)' which is the sensible solution, but all the demos use the integer syntax, and it makes me sad.
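
For anyone else squinting at the integer form, here is a tiny Python sketch of how the three digits map onto the sensible three-argument call. The function `decode_subplot` is made up for illustration and is not part of matplotlib; it only handles the three-digit case quoted from the docs.

```python
def decode_subplot(code):
    """Split a three-digit subplot code into (rows, cols, plotnum)."""
    if not 111 <= code <= 999:
        raise ValueError("expected a three-digit code, got %r" % (code,))
    rows, rest = divmod(code, 100)
    cols, plotnum = divmod(rest, 10)
    if cols == 0 or plotnum == 0 or plotnum > rows * cols:
        raise ValueError("invalid subplot code %r" % (code,))
    return rows, cols, plotnum


print(decode_subplot(211))  # 2 rows, 1 column, first plot
print(decode_subplot(111))  # a single plot filling the figure
```

So `subplot(211)` and `subplot(2, 1, 1)` describe the same slot; the second form is just easier to read.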

You can't easily put the legend outside the graph.

Setting the default font size means you have to set at least 6 things. Make sure you note the excessive use of different tokens for the same freaking setting: labelsize, titlesize, size, and fontsize.

rc("axes", labelsize=10, titlesize=10)
rc("xtick", labelsize=10)
rc("ytick", labelsize=10)
rc("font", size=10)
rc("legend", fontsize=10)

I have code that looks like this:

  fig = figure()
  p = subplot(111)
  line = p.plot_date(dates, values)
  line[0].set_label("foo")
  legend()
  fig.savefig('foo.png', format='png')
Notice my entertaining leaps between OOP and WTF. Other cute nuances are that the docs/examples are littered with:
  ax = subplot(111)
You might think that the name 'ax' means 'axis' and that subplot returns an axis. No. You might ask python with type() and it would say "<type 'instance'>". Helpful. If you just print ax you'll see it is matplotlib.axes.Subplot. I'm trying hard to not get hung up on semantics, but 'axis' to me is very different from a plot. Plot seems like a visual representation, and an axis is a single dimension of a graph (aka a plot).

After several days of playing with this tool, I am frustrated and disheartened. It has such powerful features like tick rules: You can trivially specify "Put one major tick every 3rd week". However, the api is half OO half globally-scoped-procedural. Maybe this is my fault. The docs constantly mix 'matplotlib' and 'pylab' methods. Perhaps you can use just the matplotlib functions by themselves and you don't need pylab? Pylab, by the way, is what provides these awkward global functions and in theory only exists as a pure wrapper on top of matplotlib.

Comments: 3 (view comments)
Permalink: /geekery/matplotlib-induces-self-loathing
posted at: 05:39
