Search this site


Metadata

Articles

Projects

Presentations

GNOME keyring daemon vs ssh-agent

A fellow sysadmin friend of mine has, from time to time, bitched about performance problems with gnome's ssh agent called "keyring." I don't use gnome, so I don't have his problems.

Yesterday, I wrote some ruby code that'll let you use your own ssh keys (and agent) to sign and verify arbitrary strings. See jordansissel/ruby-sshkeyauth on github.

Putting the above module to work, I can easily compare signing speeds of agents under various load conditions. The main complaint was that keyring falls apart under parallel signing demands - a very common situation for any sysadmin that sshes to more than one system at a time say, with a script, or capistrano, or another similar tool.

Under single-client signing requests (one at a time), gnome-keyring on ubuntu 10.04 signed 1000 "Hello world" strings in 12.76 seconds; ssh-agent cut that time by more than half achieving the same result in 5.05 seconds.

Under reasonable parallel load, ssh-agent's lead grew even further. On 4 cores, 5 'signing' processes, and 1000 signing requests each, the results are gnome-keyring signing all 1000 requests on each of 5 processes in parallel in 113.23 seconds, while ssh-agent signed the same in 30.61 seconds.

If you want to repeat my tests, you can use the 'samples/speed.rb' script from the above ruby-sshkeyauth project on github. Single-thread tests were done using ruby speed.rb "Hello world" while the 5-process test was done using seq 5 | xargs -P5 -n1 ruby speed.rb "Hello world".

So, if you're a regular user of ssh and ssh agents, you may want to stick with ssh-agent rather than gnome-keyring-daemon given the huge performance lead ssh-agent has.

Hack for quickly trimming invalid ssh keys

If you reimage a machine or change dns, you may get any of these messages when sshing in:
Offending key for IP in /home/jsissel/.ssh/known_hosts:239
Matching host key in /home/jsissel/.ssh/known_hosts:252
Offending key in /home/jsissel/.ssh/known_hosts:237
Seem familiar? Here's a very quick way to trim those.
#!/bin/sh

eval "value=\$$#"

if [ "$#" -lt 1 ] ; then
  echo "Invalid arguments."
  exit 1
fi

if ! echo "$value" | egrep -q '.*:[0-9]+$' ; then
  echo "Invalid file:lineno format: $value"
  exit 1
fi

echo "$value" | awk -F: '{print "sed -i -e "$2"d",$1}' | sh -x
  • Put this in ~/bin/clearssh.sh
  • chmod 755 ~/bin/clearssh.sh
  • ln -s ~/bin/clearssh.sh ~/bin/Matching
  • ln -s ~/bin/clearssh.sh ~/bin/Offending
Now the next time you see these messages and want to clear the offending key, just paste the log message, as a command, into your terminal:
jls(~) % Offending key for IP in /home/jsissel/.ssh/known_hosts:239
+ sed -i -e 239d /home/jsissel/.ssh/known_hosts
Makes for a quick fix if you hit these messages in your normal day.

I prefer this to using 'ssh-keygen -R' as the error message has exactly the information you need to clear the bad key.

rsync through an ssh server

I have a host (let's call it filer) that is only accessible through a bastion server. Here's how to rsync directly do that host:
rsync -e 'ssh bastion ssh' somefile.tar.gz filer:
I haven't posted much recently due to my wedding coming soon and being extremely busy with work. I've been learning a great deal of new tools recently, so expect some brain dumps about some of the following:
  • lvm
  • qemu
  • windows hackery

ssh honeypot auditing

I've only gotten a few hits on my honey pot, and none of the bots seem to be doing much. I think it might be because the shell I have setup doesn't behave correctly. Here's the new one:
#!/bin/bash
d="$(date "+%Y%m%d-%H%M%S")"
logfile="/var/log/traps/$d"
env > $logfile
echo "Args: $*" >> $logfile
export SHELL=/bin/bash
script -c "$SHELL $*" -q -a $logfile
This will log the env vars in addition to the arguments passed to the shell. Thus far, I've see 2 patterns of environment variables.

This new version supports arguments, so that things like 'ssh [email protected] somecommand' works. The next step is probably to have a setuid program chown the logfile to root shortly after script(1) starts, so that you can't remove your own log. I'll only bother with that if it's necessary.

In addition to the shell change, I started looking into the audit facility in Linux. I want to log all command execution, in case my script(1) idea fails. To do this, I added these rules with auditctl:

auditctl -a exit,always -F uid=60000 -S open
auditctl -a exit,always -F uid=60000 -S execve
auditctl -a exit,always -F uid=60000 -S vfork
auditctl -a exit,always -F uid=60000 -S fork
auditctl -a exit,always -F uid=60000 -S clone
I'm not entirely sure if this will specifically catch the execs I'm looking for, but it does seem to work:
% ausearch -sc execve | grep EXECVE
type=EXECVE msg=audit(1199138086.041:3293): a0="/bin/bash" a1="-c" a2="uptime"-
type=EXECVE msg=audit(1199138086.056:3300): a0="uptime"-

ssh honeypot.

Using slight variations on the techniques mentioned in my previous post, I've got a vmware instance running Fedora 8 that permits any and all logins. These login sessions are logged with script(1).

Fedora 8 comes with selinux enabled by default. This means sshd was being denied permission to execute my special logging shell. The logs in /var/log/audit/ explained why, and audit2allow even tried to help make a new policy entry for me. However, I couldn't figure out (read: be bothered to search for more than 10 minutes) how to install this new policy. In searching, I found out about chcon(1). A simple command fixed my problems:

chcon --reference=/bin/sh /bin/sugarshell
The symptoms prior to this fix were that I could authenticate, but upon login I would get a '/bin/sugarshell: Permission Denied' that wasn't logged by sshd.

There are plenty of honeypot software tools out there, but I really wasn't in the mood for reading piles of probably-out-of-date documentation about how to use them. This hack (getpwnam + pam_permit + logging shell) took only a few minutes.

As a bonus, I found a feature in Fedora's yum tool that I like about freebsd's packaging system: It's trivial to ask "Where did this file come from?" Doing so made me finally look into how to do it in Ubuntu.

FreeBSD: pkg_info -W /usr/local/bin/ssh
/usr/local/bin/ssh was installed by package openssh-portable-4.7.p1,1
Fedora: yum whatprovides /usr/bin/ssh
openssh-server.x86_64 : The OpenSSH server daemon
Ubuntu: dpkg -S /usr/bin/ssh
openssh-client: /usr/bin/ssh

Let's see what I catch.

Tracking and Analyzing SSH Bots.

I've posted previously about what can be done about ssh bots. In this same context, I've just finished working on a new idea: Tracking the username/passwords used by the bots.

To track the login attempts, I wrote a new pam module: pam_logfailure. The goal of pam_logfailure is to log the passwords used by bots attempting to bruteforce logins. However, when I installed the module, I found that it wasn't working properly:

Dec 20 12:24:50 kenya2 pam_logfailure: host:125.243.206.194 user:john pass:^H ^M^?INCORRECT
I saw line after line of these, and couldn't figure out why the bots were using this as a password. Turns out they aren't. This password is what OpenSSH forces upon pam for users that do not exist. This is apparently by design:
auth-pam.c: static char badpw[] = "\b\n\r\177INCORRECT";
If you are an invalid user, or are trying to login as root while root login is disabled, the password you sent is replaced with 'badpw' above. This makes it kind of hard to track what passwords bots are using...

Thankfully, I was already one step ahead of myself when I wrote a function injection tool back in September (liboverride). So, all I had to do was inject my own 'getpwnam' function to spoof data when a user did not exist to trick OpenSSH into passing the password through.

After injecting my own getpwnam(), pam_logfailure started working just fine:

Dec 22 11:17:47 kenya2 pam_logfailure: host:218.1.65.233 user:admin pass:admins
So where will I go next with these ssh-bot games?
  • Reverse-hack. I picked 3 random ssh bot hosts from my logs, and all of them run sshd. It would be pretty trivial to take the password attempts used against my machine and try them on the host the bot is coming from. Seems likely that turning the bot's actions on itself will grant me access to the infected machine.
  • Redirect to a honeypot. We could detect when a bot is trying to login, and add a firewall rule that would put future ssh attempts from these hosts into a honeypot which accepts all logins to see what happens.
  • Fingerprint ssh bots by behavior.

The usage of getpwnam.over is like any other liboverride code. 'make getpwnam.so' and then use "LD_PRELOAD=/path/to/getpwnam.so ". In this case, I added this line to /usr/local/etc/rc.d/openssh (my sshd start script):

export LD_PRELOAD=/path/to/getpwnam.so

Here is the code:

Distributed xargs

I like xargs. However, xargs becomes less useful when you want to run in parallel many cpu-intensive tasks with more parallelism than you have cpus cores local to your machine.

Enter dxargs. For now, dxargs is a simple python script that will distribute tasks in a similar way to xargs but will distribute them to remote hosts over ssh. Basically, it's a threadpool of ssh sessions. An idle worker will ask for something to do, letting you get the maximum throughput possible; meaning your faster servers will be given more tasks to execute than slower ones simply because they complete them sooner.

As an example, let's run 'hostname' in parallel across a few machines for 100 total calls.

% seq 100 | ./dxargs.py -P0 -n1 --hosts "snack scorn" hostname | sort | uniq -c
    14 scorn.csh.rit.edu
    86 snack.home

# Now use per-input-set output collating:
% seq 100 | ./dxargs.py -P0 -n1 --hosts "snack scorn" --output_dir=/tmp/t 'uname -a'
% ls /tmp/t | tail -5
535.95.0.snack.1191918835
535.96.0.snack.1191918835
535.97.0.snack.1191918835
535.98.0.snack.1191918835
535.99.0.snack.1191918835
% cat /tmp/t/535.99.0.snack.1191918835
Linux snack.home 2.6.20-15-generic #2 SMP Sun Apr 15 06:17:24 UTC 2007 x86_64 GNU/Linux
Design requirements:
  • Argument input must work the same way as xargs (-n<num>, etc) and come from stdin
  • Don't violate POLA where unnecessary - same flags as xargs.
Basically, I want dxargs to be a drop in replacement for xargs with respect to compatibility. I may intentionally break compatibility later where it makes sense, however. Also, don't consider this first version POLA-compliant.

Neat features so far:

  • Uses OpenSSH Protocol 2's "Control" sockets (-M and -S flags) to keep the session handshaking down to once per host.
  • Each worker competes for work with the goal of having zero idle workers.
  • Collatable output to a specified directory by input set, pid, number, host, and time
  • '0' (aka -P0) for parallelism means parallelize to the same size as the host list
  • Ability to specify multiplicity by machine with notation like 'snack*4' to indicate snack can run 4 tasks in parallel
  • 'stdout' writing is wrapped with a mutex, so tasks can't interfere with output midline (I see this often with xargs)
Desired features (not yet implemented):
  • Retrying of input sets when workers malfunction
  • Good handling of ssh problems (worker connect timeouts, etc
  • More xargs and xapply behaviors

Download dxargs.py

Video of my SSH Tunneling talk at Barcamp Stanford

I got A.D.D. tonight while working on Pimp and decided to Google myself and see what came up. I do this once very few months. Normally the results aren't terribly interesting. However, since I'd recently attended a few barcamps, I noticed that some folks had written about some of the talks I'd given.

I found a video of my 'firewall bypass/ssh tunnel/encrypt-your-traffic' session I held at BarCamp Stanford. Thanks to ValleyGeek for posting this. I'm assuming the author recorded it.

Towards the end of the talk I start rambling about nat traversal. Hopefully I got the point across that nat<->nat traversal isn't easy, and that tunneling is a way to do it.

http://video.google.com/videoplay?docid=-3694684117834367516

Along with that video, I found my urban-dictionary entry for 'Fo Shizzle Friday'. I can't remember why I created that.

Getting public-key auth working in Solaris 10

Once upon a time, there was a Solaris 10 box where I wasn't able to use ssh keys to login.

Thankfully, that time has now passed. The problem was because PAM was denying access with public keys.

Running sshd in debug mode (-ddd) I would see this:

Found matching DSA key: 80:aa:32:03:ef:51:9c:7b:0f:1d:ac:37:17:d5:fd:2b
debug1: restore_uid: 0/0
debug1: ssh_dss_verify: signature correct
debug2: Starting PAM service sshd-pubkey for method publickey
debug3: Trying to reverse map address 69.181.132.53.
debug2: userauth_pubkey: authenticated 0 pkalg ssh-dss
Failed publickey for psionic from 69.181.132.53 port 55957 ssh2
Clearly indicated here, is the fact that it accepted my ssh-dss key, but I failed for some other reason. Listed here, is: Starting PAM service sshd-pubkey for method publickey. Solaris 10's manpage for sshd shows that it uses different PAM service names for each type of authentication.

The solution involved adding a simple service entry in /etc/pam.conf:

sshd-pubkey    auth required           pam_unix_cred.so.1
It works now. This takes effect immediately as the pam config is invoked any time sshd uses pam, so you don't have to restart sshd.

New ssh vpn article, soon.

I keep an eye on my apache access logs to see what kind of traffic my site gets and why it gets here. It seems that a non-trivial number of searches are for 'vpn over ssh' and similar variants. These land at the ppp over ssh article.

New versions of openssh have built-in support for tunneling, and do not require ppp at all. Seeing as how I've never really used this new feature, and there's a nontrivial number of searches ending up on the ppp-over-ssh article, I think it's time to write a little article on how to use the new openssh built-in tunneling.

Stay tuned...