To make this year's review cooler than last year's, I wrote a Python script to generate a tag cloud and fed it
only the list of tags mentioned in posts I've made this year.
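A tag cloud boils down to counting tags and mapping the counts onto font sizes. The following is a minimal sketch of that idea, not the actual script; the function name `tag_cloud_sizes`, the point-size range, and the sample tags are all made up for illustration.

```python
from collections import Counter


def tag_cloud_sizes(tags, min_pt=10, max_pt=32):
    """Map each tag to a font size scaled linearly by how often it appears."""
    counts = Counter(tags)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    sizes = {}
    for tag, n in counts.items():
        # If every tag appears equally often, give them all the smallest size.
        scale = (n - lo) / span if span else 0.0
        sizes[tag] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


if __name__ == "__main__":
    # Hypothetical input: one entry per tag mention across the year's posts.
    sample = ["ssh", "ssh", "ssh", "vim", "python", "python"]
    for tag, size in sorted(tag_cloud_sizes(sample).items()):
        print(f"{tag}: {size}pt")
```

Linear scaling is the simplest choice; a log scale might look better when one tag dominates.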
This year was pretty sweet.
Basic life summary: Still loving it at Google. Got a house. Getting married soon.
This year started off with me using EC2 for a side project. Along with EC2, I had to
think about scaling mysql and tomcat. This same side project made me rant about mysql's query cache.
I also spent many hours putting crazy features into grok. Unsatisfied with the original predicate
implementation, I came up with this hack to run arbitrary
pattern-matching code within a regular expression to affect the outcome of the
match, and then implemented it in grok.
A few months after that, I checked off another todo item by implementing pattern discovery.
I also started working on monitoring. I mentioned this idea in some detail last year, and had a really
crappy prototype. This year, I experimented with
Berkeley DB and Python to get simple key-value pair storage. All of the work so
far is still very primitive, but I did have a working prototype.
I've put thousands of miles of travel in this year:
Shmoocon (Washington, DC),
MashupCamp Dublin (Dublin, Ireland),
Defcon 15 (Las Vegas),
Barcamp Block (Palo Alto, CA),
SuperHappyDevHouse 18 (Hillsborough, CA).
As expected, a few projects stayed on the backburner. One of these is my
FreeBSD work redoing the mouse driver system. Given that I've had commit access
to FreeBSD for a year now and haven't done much with it, I'm hoping I can spend
more time working on that project, as it is my favorite platform. The code has
been ready to commit for a long time, and I just haven't gotten around to it :\
New projects:
fex,
firefox-tabsearch,
firefox-urledit,
liboverride,
xdotool,
and
xpathtool.
Some of my favorite hacks this past year included pulling album covers
from amazon, muting music
when your screen is locked, fast log splitting, a mini-freebsd script, and shell shortcuts.
With that I bid farewell to 2007, and continue to eagerly look forward to the
future. The only plans I have set this year are helping run Hack or Halo again
at Shmoocon and putting serious time into FreeBSD.
Comments: 0 (view comments)
Permalink: /geekery/year-in-review-2007
posted at: 20:44
I've only gotten a few hits on my honey pot, and none of the bots seem to be
doing much. I think it might be because the shell I have set up doesn't behave
correctly. Here's the new one:
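#!/bin/bash
# Log each session to a file named after the session start time.
d="$(date "+%Y%m%d-%H%M%S")"
logfile="/var/log/traps/$d"
# Record the environment and any arguments passed to the shell.
env > "$logfile"
echo "Args: $*" >> "$logfile"
export SHELL=/bin/bash
# Run the real shell under script(1), appending the session to the log.
script -c "$SHELL $*" -q -a "$logfile"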
This will log the env vars in addition to the arguments passed to the shell.
Thus far, I've seen 2 patterns of environment variables.
This new version supports arguments, so that things like 'ssh user@host
somecommand' work. The next step is probably to have a setuid program chown
the logfile to root shortly after script(1) starts, so that you can't remove
your own log. I'll only bother with that if it's necessary.
In addition to the shell change, I started looking into the audit facility in
Linux. I want to log all command execution, in case my script(1) idea fails. To do this, I added these rules with auditctl:
auditctl -a exit,always -F uid=60000 -S open
auditctl -a exit,always -F uid=60000 -S execve
auditctl -a exit,always -F uid=60000 -S vfork
auditctl -a exit,always -F uid=60000 -S fork
auditctl -a exit,always -F uid=60000 -S clone
I'm not entirely sure if this will specifically catch the execs I'm looking
for, but it does seem to work:
% ausearch -sc execve | grep EXECVE
type=EXECVE msg=audit(1199138086.041:3293): a0="/bin/bash" a1="-c" a2="uptime"-
type=EXECVE msg=audit(1199138086.056:3300): a0="uptime"-
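To turn ausearch output like the above into something easier to review, the EXECVE records can be split back into a timestamp and an argument list. This is a rough sketch, assuming records shaped like the two lines shown here; the names `parse_execve`, `HEADER_RE` and `ARG_RE` are made up, and it only handles quoted arguments like the ones above.

```python
import re

# Matches the "msg=audit(<epoch>:<serial>)" header of an audit record.
HEADER_RE = re.compile(r"msg=audit\((\d+\.\d+):(\d+)\)")
# Matches numbered, quoted argument fields such as a0="/bin/bash".
ARG_RE = re.compile(r'\ba(\d+)="([^"]*)"')


def parse_execve(line):
    """Return (timestamp, serial, argv) for an EXECVE record, or None."""
    if "type=EXECVE" not in line:
        return None
    header = HEADER_RE.search(line)
    if header is None:
        return None
    # Sort by argument index so argv comes back in order.
    args = sorted((int(i), value) for i, value in ARG_RE.findall(line))
    return float(header.group(1)), int(header.group(2)), [v for _, v in args]


if __name__ == "__main__":
    sample = (
        'type=EXECVE msg=audit(1199138086.041:3293): '
        'a0="/bin/bash" a1="-c" a2="uptime"'
    )
    print(parse_execve(sample))
```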
Comments: 1 (view comments)
Tags: ssh, honeypot, fedora, selinux
Permalink: /geekery/honeypot-auditing
posted at: 16:59
More than a year ago, I expressed
some frustration about cindent in vim. My main complaints about it were
that it made bad decisions about indentation on some languages that were not
strictly C-syntax (perl, python, javascript).
Tonight I decided that I wanted to automate indenting to the closest '(' as in:
if (foo() and bar()
    and baz):
    ^ Want to indent to here, somehow, on command.
The 'cindent' feature of vim lets you configure this to happen automatically,
but in some cases it won't indent properly: a comment with a ( at the end
of the line, for example, will screw it up.
I got tired of dealing with it, so I went back to autoindent, and I've been
happier ever after. Fooling around tonight, I started working on a vim function
to basically do exactly what I needed. An hour later, it was done. In the
process, I wanted to confirm the default actions of ctrl+f in insert mode,
which led me to the cinkeys docs, which clued me in that 'cindent' only
autoindents on certain occasions.
All of my time was wasted, it seems, after I figured out setting this option:
set cinkeys=!^F
Now cindent only activates when I hit ctrl+f. If I have both autoindent and
cindent enabled, with this cinkeys setting, the default indentation behavior is
exactly autoindent, and I can invoke cindent at will.
The following is now set in my .vimrc:
set autoindent
set cindent " Use c-style indentation
set cinkeys=!^F " Only indent when requested
set cinoptions=(0t0c1 " :help cinoptions-values
If you're interested in the vim script I wrote, which I no longer need, you can
download it here:
paren_indent.vim
Comments: 0 (view comments)
Tags: vim, indentation
Permalink: /geekery/vim-indentation-revisited
posted at: 05:50
Using slight variations on the techniques mentioned in my
previous post, I've got a vmware instance running Fedora 8 that permits any
and all logins. These login sessions are logged with script(1).
Fedora 8 comes with selinux enabled by default. This means sshd was being
denied permission to execute my special logging shell. The logs in /var/log/audit/ explained why, and audit2allow even tried to help make a new policy entry for me. However, I couldn't figure out (read: be bothered to search for more than 10 minutes) how to install this new policy. In searching, I found out about chcon(1). A simple command fixed my problems:
chcon --reference=/bin/sh /bin/sugarshell
The symptoms prior to this fix were that I could authenticate, but upon login I
would get a '/bin/sugarshell: Permission Denied' that wasn't logged by sshd.
There are plenty of honeypot software tools out there, but I really wasn't in the mood for reading piles of probably-out-of-date documentation about how to use them. This hack (getpwnam + pam_permit + logging shell) took only a few minutes.
As a bonus, I found a feature in Fedora's yum tool that I like about freebsd's packaging system: It's trivial to ask "Where did this file come from?" Doing so made me finally look into how to do it in Ubuntu.
- FreeBSD: pkg_info -W /usr/local/bin/ssh
  /usr/local/bin/ssh was installed by package openssh-portable-4.7.p1,1
- Fedora: yum whatprovides /usr/bin/ssh
  openssh-server.x86_64 : The OpenSSH server daemon
- Ubuntu: dpkg -S /usr/bin/ssh
  openssh-client: /usr/bin/ssh
Let's see what I catch.
Comments: 0 (view comments)
Tags: ssh, honeypot, vmware, liboverride, fedora, ubuntu, freebsd
Permalink: /geekery/ssh-honeypot-is-alive
posted at: 03:43
I've posted previously
about what can be done about ssh bots. In this same context, I've just finished
working on a new idea: Tracking the username/passwords used by the bots.
To track the login attempts, I wrote a new pam module: pam_logfailure. The goal
of pam_logfailure is to log the passwords used by bots attempting to bruteforce
logins. However, when I installed the module, I found that it wasn't working properly:
Dec 20 12:24:50 kenya2 pam_logfailure: host:125.243.206.194 user:john pass:^H ^M^?INCORRECT
I saw line after line of these, and couldn't figure out why the bots were using
this as a password. Turns out they aren't. This password is what OpenSSH forces
upon pam for users that do not exist. This is apparently by design:
auth-pam.c: static char badpw[] = "\b\n\r\177INCORRECT";
If you are an invalid user, or are trying to login as root while root login is
disabled, the password you sent is replaced with 'badpw' above. This makes it
kind of hard to track what passwords bots are using...
Thankfully, I was already one step ahead of myself when I wrote a function
injection tool back in September (liboverride).
So, all I had to do was inject my own 'getpwnam' function to spoof data when a
user did not exist to trick OpenSSH into passing the password through.
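The real override is C code loaded through liboverride. The Python below only illustrates the fallback logic: return the real entry when the user exists, and a made-up one otherwise. The name `getpwnam_or_fake` and every placeholder field value are assumptions for illustration, not what the override actually returns.

```python
import pwd


def getpwnam_or_fake(name, lookup=pwd.getpwnam):
    """Look up a user, returning a placeholder entry if the user is unknown."""
    try:
        return lookup(name)
    except KeyError:
        # Hypothetical placeholder values: locked password, "nobody"-style
        # ids, no usable home directory or shell.
        return pwd.struct_passwd(
            (name, "*", 65534, 65534, "spoofed user", "/nonexistent", "/bin/false")
        )


if __name__ == "__main__":
    entry = getpwnam_or_fake("surely-not-a-real-user")
    print(entry.pw_name, entry.pw_shell)
```

Since the caller always gets an entry back, it never sees the "no such user" case.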
After injecting my own getpwnam(), pam_logfailure started working just fine:
Dec 22 11:17:47 kenya2 pam_logfailure: host:218.1.65.233 user:admin pass:admins
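With lines like that in the log, counting which username/password pairs the bots favor is straightforward. A small sketch, assuming every pam_logfailure line follows the `host:... user:... pass:...` layout shown above and that usernames contain no spaces; `tally_attempts` and `LINE_RE` are names invented here.

```python
import re
from collections import Counter

# host and user are single tokens; the password runs to the end of the line.
LINE_RE = re.compile(r"pam_logfailure: host:(\S+) user:(\S+) pass:(.*)$")


def tally_attempts(lines):
    """Count (user, password) pairs seen in pam_logfailure log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            _host, user, password = match.groups()
            counts[(user, password)] += 1
    return counts


if __name__ == "__main__":
    log = [
        "Dec 22 11:17:47 kenya2 pam_logfailure: "
        "host:218.1.65.233 user:admin pass:admins",
    ]
    for (user, password), n in tally_attempts(log).most_common():
        print(n, user, password)
```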
So where will I go next with these ssh-bot games?
- Reverse-hack. I picked 3 random ssh bot hosts from my logs, and all of
them run sshd. It would be pretty trivial to take the password attempts used
against my machine and try them on the host the bot is coming from. Seems
likely that turning the bot's actions on itself will grant me access to the
infected machine.
- Redirect to a honeypot. We could detect when a bot is trying to login,
and add a firewall rule that would put future ssh attempts from these hosts
into a honeypot which accepts all logins to see what happens.
- Fingerprint ssh bots by behavior.
The usage of getpwnam.over is like any other liboverride code. 'make
getpwnam.so' and then use "LD_PRELOAD=/path/to/getpwnam.so ". In this case, I added this line to /usr/local/etc/rc.d/openssh (my sshd start script):
export LD_PRELOAD=/path/to/getpwnam.so
Here is the code:
Comments: 1 (view comments)
Tags: ssh, security, tracking, hacks, liboverride, pam_logfailure
Permalink: /geekery/tracking-ssh-bots
posted at: 16:37
I upgraded my vmware machine from vmware 1.3 to vmware 2.0 beta. The install
process was great by comparison to the last two releases. This install was much
nicer than the previous one for the simple reasons that I didn't have to hack the
perl script to not misbehave, and I didn't have to mess around compiling or
finding my own vmware kernel modules. Everything Just Worked during the
install.
On the downside, vmware-server-console is deprecated. Vmware Server 2.0 uses
Vmware Infrastructure, which appears to be tomcat+xmlrpc and other things. The
New Order seems to be that you manage your vms with the webbrowser, which isn't
a bad idea. However, we must remember that Good Ideas do not always translate
into Good Implementations.
The web interface looks fancy, but the code looks like it's from 1998. The
login window consists of layers and layers of nested tables and a pile of
javascript all in the name of getting the login window centered in the browser.
You can see the page align itself upon rendering even on my 2GHz workstation
with Firefox. Horrible.
Once you log in, you're presented with a
visually-useful-but-still-runs-like-shit interface. The interface itself
appears useful and nice, but again fails to respond quickly presumably due to
the piles of poorly written javascript involved.
Since VMware thought this was a fresh install, it didn't know about any of my
old virtual machines. Adding them using the web interface causes vmware to
crash. Oops. So, I found a vmware infrastructure client executable randomly in
the package; "find ./ -name '*.exe'" will find it for you. Copied this to my
windows box and installed it. I used this tool to re-add my old vmware machines.
Unfortunately, "raw disks" are disabled in this free version of vmware server.
I'm not sure why. My Solaris VM uses raw disks for its zfs pool, so this was a
problem. Luckily, this is purely a gui limitation and not a vmware limitation.
To repair my Solaris VM, I created a new virtual machine with the same features
and told it where its first disk lived (the first disk was a normal
file-backed vmware disk image). After that, I looked at the old vm's .vmx file
and copied in the lines detailing the raw drives to the new .vmx file:
scsi0:1.present = "true"
scsi0:1.filename = "zfs-sdb.vmdk"
scsi0:1.deviceType = "rawDisk"
scsi0:2.present = "true"
scsi0:2.filename = "zfs-sdc.vmdk"
scsi0:2.deviceType = "rawDisk"
Everything's back up and running sanely now in vmware. Hurray :)
Comments: 2 (view comments)
Tags: vmware, linux, virtualization, upgrades
Permalink: /geekery/vmware-server-2.0-beta-1
posted at: 13:44
I picked up an iphone tonight at the apple store in the mall. Yay new toy. The
setup process was pretty simple. In fact, I feel comfortable stating that this
cellphone purchase was absolutely the most pleasant cellular experience I've had.
- Go to Apple store.
- Say "I want to buy an iPhone"
- Someone hands you a box with an iPhone in it. You pay.
- Go home, plug iPhone into PC.
- Run iTunes. Follow the trivially simple activation steps(*)
- Rejoice now that your iPhone is activated without ever having to deal with morons at the AT&T retailers
(*) My desire to keep my current Cingular AT&T account was
satisfied. I was presented with the option of transferring my current service to
the new phone. It even set me up with a data plan. Pretty hassle-free.
Much love to Apple for making me not have to talk to anyone at the AT&T retail stores.
- Uploading photos to flickr is the same as my previous phone: emailing photos to flickr
- meebo.com (web2.0fancy instant messenger gateway) happily works on the
iphone.
- google reader works too
It comes standard with Google Maps and YouTube apps.
Fancy. The first two project ideas that came to mind are both remote-control
tools. One for a universal remote (basically, smashing buttons on a webpage tells
an IR emitter to do things) and one to remotely control a PC (mouse, keyboard,
etc). I'm guessing there will be a VNC client for the iphone out very shortly
after the Apple SDK is released, so I'll think about the infrared remote control project.
Comments: 1 (view comments)
Tags: iphone
Permalink: /geekery/iphone-for-me
posted at: 03:27
I've got a stable, threaded version of this fancydb tool I've been working on.
However, the performance of insertions is less than optimal.
Then again, how much should insert performance matter on a monitoring tool? For
data that comes into it gradually, speed doesn't matter much. For bulk inserts,
speed matters if you want to get your work done quickly. I haven't decided if
bulk insertions are a necessary use case for this tool. Despite that, I'm still
interested in what the limits are.
I have experimented with many different implementations of parallelism,
buffering, caching, etc in the name of making insertion to a fancydb with 10
rules fast. The fastest I've gotten it was 10000/sec, but that was on an
implementation that wasn't threadsafe (and used threads).
My most-recent implementation (which should be threadsafe) can do reads and
writes at 30000/sec. With evaluation rules the write rate drops to about
10000/sec.
The next task was to figure out what I was doing wrong. For comparison, I wrote
two vanilla bdb accessing programs. One in C and one in Python. The output of
these two follows:
# The args for each program are: insertions page_size cache_size
% sh runtest.sh
Running: ./test 2000000 8192 10485760
=> 2000000 inserts + 1 fullread: 209205.020921/sec
Running: ./py-bsddb.py 2000000 8192 10485760
=> 2000000 inserts + 1 fullread: 123304.562269/sec
As expected, C clearly outperforms Python here, but the margin is pretty small
(C is 69% faster for this test). Given the 120000/sec rate from Python, the
poor input rate of my tool seems to be my own fault. Is my additional code here
really the reason that I can only write at 30000 per second? I may need to
revisit how I'm implementing things in python. I'm not clear right now where
I'm losing so much throughput.
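For reference, the ratios behind those statements can be worked out directly from the numbers quoted above. This is just arithmetic on the figures already in this post; the comparison against the 30000/sec figure is loose, since the workloads are not identical.

```python
c_rate = 209205.020921   # C: inserts + fullread, per second
py_rate = 123304.562269  # Python: same test
tool_rate = 30000.0      # fancydb read/write rate quoted above

# How much faster C is than plain Python bsddb, as a percentage.
speedup_pct = (c_rate / py_rate - 1) * 100
# How many times faster plain Python bsddb is than fancydb.
slowdown = py_rate / tool_rate

print(f"C is {speedup_pct:.1f}% faster than plain Python bsddb")
print(f"plain Python bsddb is {slowdown:.1f}x faster than fancydb")
```

So the gap between C and Python is well under 2x, while the gap between plain Python and my tool is roughly 4x.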
So I used hotshot (a profiler in Python's standard library) and found that most of the time is
spent in my iterator method. This method is a generator method which uses yield
and loops over a cursor.
It's important to note that my python bdb 'speed test' above did not use
generators, it used a plain while loop over the cursor. So, I wrote another
test that uses generators. First, let's try just inserts, no reading of data:
Running: ./test 1000000 8192 10485760
=> 1000000 inserts: 261096.605744/sec
Running: ./py-bsddb.py 1000000 8192 10485760
=> 1000000 inserts: 166389.351082/sec
Now let's try with 3 different python reading methods: while loop across a cursor, generator function (using yield), and an iterator class (implementing __iter__):
Running: ./py-bsddb.py 4000000 8192 10485760
=> 1 fullread of 4000000 entries: 8.660000
Running: ./py-bsddb_generator.py 4000000 8192 10485760
=> 1 fullread of 4000000 entries: 9.124000
Running: ./py-bsddb_iterable_class.py 4000000 8192 10485760
=> 1 fullread of 4000000 entries: 13.130000
I'm not sure why implementing an iterator is so much slower (in general) than a
yield-generator is. Seems strange, perhaps my testing code is busted. Either
way, I'm not really closer to finding the slowness.
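To make the three reading styles concrete, here is a sketch of each one side by side. It is not the test code linked below: `FakeCursor` is a hypothetical in-memory stand-in rather than the bsddb cursor API, and it is written for current Python 3 (`__next__`) rather than the Python of the time. One plausible explanation for the gap is that the iterator class pays for a Python-level method call on every item, while a generator just resumes its frame.

```python
import time


class FakeCursor:
    """Hypothetical stand-in for a database cursor; not the bsddb API."""

    def __init__(self, records):
        self._records = records
        self._pos = 0

    def next(self):
        # Return the next (key, value) pair, or None when exhausted.
        if self._pos >= len(self._records):
            return None
        record = self._records[self._pos]
        self._pos += 1
        return record


def read_while(cursor):
    """Plain while loop over the cursor."""
    out = []
    record = cursor.next()
    while record is not None:
        out.append(record)
        record = cursor.next()
    return out


def read_generator(cursor):
    """Generator function using yield."""
    record = cursor.next()
    while record is not None:
        yield record
        record = cursor.next()


class CursorIterator:
    """Iterator class implementing __iter__ and __next__."""

    def __init__(self, cursor):
        self._cursor = cursor

    def __iter__(self):
        return self

    def __next__(self):
        record = self._cursor.next()
        if record is None:
            raise StopIteration
        return record


def timed(label, make_reader, records):
    """Time one full read of the records using the given reading style."""
    start = time.perf_counter()
    count = len(list(make_reader(FakeCursor(records))))
    print(f"{label}: {count} entries in {time.perf_counter() - start:.3f}s")


if __name__ == "__main__":
    data = [(i, i * 2) for i in range(200000)]
    timed("while loop", read_while, data)
    timed("generator", read_generator, data)
    timed("iterator class", CursorIterator, data)
```

Timings from this sketch say nothing about bdb itself; it only isolates the iteration overhead.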
get this code here
Comments: 0 (view comments)
Tags: c, python, bdb, performance
Permalink: /geekery/c-vs-python-bdb
posted at: 03:59
Let me caveat this rant with the fact that I've only been playing with
matplotlib for approximately a week.
All the demos made matplotlib (a python module) look like a great tool that I
should want to use to graph things. Then I started trying to actually write
code and it all went downhill.
Almost all of the functions operate on some mystical global scope, meaning they
are by design not threadsafe. Probably not a big deal, I guess, but it
certainly feels like an alien world especially given all the object oriented
code in use in python.
If this culture shock wasn't bad enough, it went ahead and decided to use
inches and ratios as the standard units of measure. You make a figure of a set
width and height (in inches) and you can put stuff in that figure given ratio
offsets. An offset of '.5' would put your left-bound in the middle. Weird and
unexpected. Perhaps not bad. Still, I'm used to pixels, not inches.
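For anyone else who thinks in pixels: matplotlib figures also carry a dots-per-inch setting, so the conversion is just a division. A tiny sketch of the arithmetic follows; the helper names and the default of 100 dpi are assumptions made here for illustration, not matplotlib functions.

```python
def pixels_to_figure(width_px, height_px, dpi=100):
    """Convert a pixel size to the (width, height) in inches for a given dpi."""
    return width_px / dpi, height_px / dpi


def fraction_to_pixels(fraction, size_px):
    """Convert a 0..1 offset within the figure to a pixel position."""
    return fraction * size_px


if __name__ == "__main__":
    # An 800x600 pixel image at the assumed 100 dpi.
    print(pixels_to_figure(800, 600))
    # An offset of .5 lands in the middle of an 800 pixel wide figure.
    print(fraction_to_pixels(0.5, 800))
```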
Some of the arguments are just looney:
fig.add_subplot(111)
The docs say this: "subplot(211) # 2 rows, 1 column, first (upper) plot".
Base10 flag system? What. the. F. I'm at a loss as to why this was ever a good
idea. Let's make it hard to add plots? Looks like you can use 'subplot(rows,
cols, plotnum)' which is the sensible solution, but all the demos use the
integer syntax, and it makes me sad.
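To spell out how the three-digit form maps onto the three-argument form, here is a sketch of the decoding. This illustrates the digit convention described in the docs quote above and is not matplotlib's own code; `decode_subplot_code` is a made-up name.

```python
def decode_subplot_code(code):
    """Split a three-digit subplot code into (rows, cols, plotnum)."""
    rows, rest = divmod(code, 100)
    cols, plotnum = divmod(rest, 10)
    # Each part must be a single nonzero digit, and the plot must fit the grid.
    if not (1 <= rows <= 9 and 1 <= cols <= 9 and 1 <= plotnum <= rows * cols):
        raise ValueError(f"not a valid three-digit subplot code: {code}")
    return rows, cols, plotnum


if __name__ == "__main__":
    print(decode_subplot_code(211))
    print(decode_subplot_code(111))
```

The single-digit limit is also why the integer form cannot describe grids with ten or more rows or columns.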
You can't easily put the legend outside the graph.
Setting the default font size means you have to set at least 6 things. Make
sure you note the excessive use of different tokens for the same freaking
setting: labelsize, titlesize, size, and fontsize.
rc("axes", labelsize=10, titlesize=10)
rc("xtick", labelsize=10)
rc("ytick", labelsize=10)
rc("font", size=10)
rc("legend", fontsize=10)
I have code that looks like this:
fig = figure()
p = subplot(111)
line = p.plot_date(dates, values)
line[0].set_label("foo")
legend()
fig.savefig('foo.png', format='png')
Notice my entertaining leaps between OOP and WTF. Other cute nuances are that
the docs/examples are littered with:
ax = subplot(111)
You might think that the name 'ax' means 'axis' and that subplot returns an
axis. No. You might ask python with type() and it would say
"<type 'instance'>". Helpful. If you just print ax you'll see it is
matplotlib.axes.Subplot. I'm trying hard to not get hung up on semantics, but
'axis' to me is very different from a plot. Plot seems like a visual
representation, and an axis is a single dimension of a graph (aka a plot).
After several days of playing with this tool, I am frustrated and disheartened.
It has such powerful features like tick rules: You can trivially specify "Put
one major tick every 3rd week". However, the api is half OO half
globally-scoped-procedural. Maybe this is my fault. The docs constantly mix
'matplotlib' and 'pylab' methods. Perhaps you can use just the matplotlib
functions by themselves and you don't need pylab? Pylab, by the way, is what
provides these awkward global functions and in theory only exists as a pure
wrapper on top of matplotlib.
Comments: 3 (view comments)
Tags: rants, matplotlib
Permalink: /geekery/matplotlib-induces-self-loathing
posted at: 05:39