Search this site





Puppet Trick - Exported Resource Expiration

I've finally taken the plunge with puppet's exported resources.

"Exported resources" is a feature of puppet that allows your nodes to export resources to other nodes. You can read more about this feature on puppet's exported resources documentation. Covering how to setup exported resources or storeconfigs is out of scope, but if you need help read the docs and come to #puppet on freenode IRC.

Exported resources are pretty cool, but they lack one important feature - expiration/purging. The storeconfigs database has no idea about nodes that you have decommissioned or repurposed, so it's very possible to leave exported resources orphaned in your database.

I worked around this by making my resources such that I can expire them. This is done by making a custom define that has a 'timestamp' field that defaults to now, when registering each time. If a node has not checked in (and updated its resources) recently, I will consider that resource expired and will purge it.

I made a demo of this and put the code on github: jordansissel/puppet-examples/exported-expiration. More details (and example output of multiple runs with expiration) are available in the README.

The demo is runnable by itself (standalone, no puppet master), so you can test it without needing to mess with your own puppet installations.

xdotool 2.20100623.2949 released

The latest release of xdotool includes a few bug fixes, minor improvements, and one major feature addition. All of the tests pass in the test suite (156 tests, 1892 assertions, 126 seconds).

The major feature is the new command chaining. That is, you can use multiple commands from the same xdotool invocation. Any window querying (search, getactivewindow, getwindowfocus) will save the result for use with future commands on the same invocation. For a simple example, we can resize the current window by doing this:

% xdotool getactivewindow windowsize 600 400
You can also activate the first Firefox window found:
% xdotool search --class firefox windowactivate
Or move all gimp windows to a specific desktop and then activate (switch to) gimp:
% xdotool search --class gimp set_desktop_for_window %@ 2 windowactivate %@

# the "%@" means all windows found in the search. The default when you do not
# specify a window is always "%1" aka the first window found by the search.
This idea for this feature came primarily from detailed suggestions by Henning Bekel. Huge thanks for dumping ideas my way :)

In addition, libxdo is now documented with Doxygen and is available here. Remember, if you have a feature request, it can't hurt to ask (or send patches!). If I have time/energy to work on it, I'll do what I can.

Download: xdotool-2.20100623.2949.tar.gz

As usual, if you find problems or have feature requests, please file bugs or send an email to the list. Changelist from previously-announced release:

  - Added 'window stack' and 'command chaining' support. Basically lets you
    include multiple commands on a single xdotool invocation and saves the last
    window result from search, getactivewindow, and getwindowfocus. For example,
    the default window selector is "%1" meaning the first window. 'xdotool
    search --class xterm windowsize 500 500' will resize the first xterm found.
    All commands that take window arguments (as flags or otherwise) now default to
    "%1" if unspecified. See xdotool(1) manpage sections 'COMMAND CHAINING'
    and 'WINDOW STACK' for more details. This feature was suggested (with great
    detail) by Henning Bekel.
  - To simplify command chaining, all options now terminate at the first
    non-option argument. See getopt(3) near 'POSIXLY_CORRECT' for details.
  - Add --sync support to windowsize.
  - Update docs and output to indicate that 'search --title' is deprecated (but
    still functions). Use --name instead.
  - Fix mousemove --screen problem due to bug in XTEST. Reported by
    Philipp Specht,
  - Fix segfault when invoking xdotool with an invalid command.
    Reported by: Bruce Jerrick, Sven Lankes, Daniel Kahn Gillmor, and Thomas
  - Fix bug --clearmodifiers bug caused by an uninitialized value being
    interpreted as the 'modmask' causing us to try to clear invalid modifiers.
    Reported by Hong-Leong Ong on the mailing list.
  - Lots of internal refactoring and documentation improvements.
  - Testing additions for several commands in addition to command chaining.
  - Documented libxdo through xdo.h. Docs can be generated by 'make docs'
    from the xdotool release.
  - libxdo: xdo_window_translate_with_sizehint
  - libxdo: xdo_window_wait_for_size

Headless wrapper for ephemeral X servers

For various projects I'm doing right now, I need an easy way to automatically run code in an X server that may not necessarily be the active display. This code may even run on servers in production that don't have video cards or monitors attached.

For some history on this, check out this post on xvfb and firefox. You can solve the problem in that post by simply launching firefox with the tool below, and your X server will exit when firefox exits.

To solve my need for automated headless Xserver shenanigans, I wrote a script that will wrap execution with a temporary X server of your choosing. For xdotool tests I generally use Xephyr and Xvfb.

What's so special about this script? It picks an unused display number reliably and runs your command with that set as DISPLAY. This is a major win because you don't have to pre-allocate X servers (headless or otherwise) before you start your programs. will pick the first unused display, just as binding a socket to port 0 will pick an unused port. Finally, when your process exits, the xserver and windowmanagers are killed. Example usage:

% ./ firefox
Trying :0

Fatal server error:
Server is already active for display 0
        If this server is no longer running, remove /tmp/.X0-lock
        and start again.

Trying :1
Xvfb looks healthy. Moving on.
Using display: :1
Running: firefox
It does health checks to make sure the X server is actually up and running before continuing.

You can also specify a window manager to run:

% ./ -w openbox-session firefox
Trying :1
Xvfb looks healthy. Moving on.
Using display: :1
Starting window manager: openbox-session
Waiting for window manager 'openbox-session' to be healthy.
... startup messages from openbox ...
openbox-session looks healthy. Moving on.
Running: firefox
The default X server is Xvfb. You can use Xvnc, Xephyr, or any X server. Here we run Xephyr with some options:
% sh -x "Xephyr -ac -screen 1280x720x24" -w openbox-session firefox
Xephyr looks healthy. Moving on.
Using display: :1
Starting window manager: openbox-session
Waiting for window manager 'openbox-session' to be healthy.
openbox-session looks healthy. Moving on.
Running: firefox
Screenshot of what this looks like is here: xephyr-ephemeral-x-example.png

I use this script to run my xdotool tests. Additionally, parallelizing test execution can often lead to faster tests. Wrapping each test run with ensure that each test suite runs in a clean environment, untainted by any previous test, allowing me to run all test suites in parallel. Prior to writing this script, I would run each test suite in serial on the same X server instance.

I use a similar technique at work to start ephemeral X servers for running WebDriver tests in hadoop. Each mapper starts its own X server, safely, and kills it when it is completed. This is implemented in java, instead of shell, since the mappers all launch from java and mapreduce launch differently than a standard command would in your shell.

Code lives here:
xdotool tests: xdotool/t

Puppet to manage $HOME dotfiles

On SysAdvent Day 11 I covered the importance of having your own personal environment carry with you. This includes your rc files like .vimrc, etc.

If you manage users with puppet, here's how to manage it with puppet while falling back to a default skeleton (similar to /etc/skel) directory for users who don't have individual ones:

class user::people {
  # ...

  define localuser($uid, ...) {
    user {
        uid => $uid,

    file {
        ensure => directory,                                                      
        owner => $name,                                                           
        group => $gid,                                                            
        source => [ "puppet:///user/home/$name/",                                 
                    "puppet:///user/skel/" ];                                     

  localuser {
    "jls": uid => 10000;
Puppet will first search for 'user/home/$name' on it's file server, and sync it if it exists. If it does not exist, it falls back to 'user/skel'. The benefit here is that whenever you update the default 'skel' directory, all users who depend on it will be automatically updated by puppet.

Update: I've since stopped using this method to sync homedirectories. Puppet (at time of writing, 0.25.1) is not smart about how it scans for files to update when recursion is on - puppet walks all files in each homedir, even if they aren't part of the file list to be pushed by puppet. I use the script mentioned in sysadvent 2008 day 11 with a cron job. This cron job is deployed with puppet.

Find that lost screen session: Episode 3.

Previous posts about screen have shown a few new tools for searching your list of open screen sessions.o

Today, I finally sat down and worked on the next installment: Being able to query any screen window and the window list. The difference between the previous script is that we can now grep screen windows other than the 0th one. Additionally, we can now grep the screen window list (which, by the way, has some excellent information).

To that end, I present now two scripts:

You need both for this to work optimally, but they exist separately because the functionality is somewhat distinct.

The 'hardcopy' script takes a single argument, a screen session. It will hardcopy all windows in that screen session including the window list. If you specify OUTDIR in your environment, the screen hardcopies will be put in that directory; otherwise, the output directory is printed to stdout for consumption by another script.

The 'search' script runs the hardcopy script on all active screen sessions (in parallel, yay xargs). Once it has all of the copies, it will grep through the output for your query string (regular expression). It supports 3 flags:

  • -t - only search 'window titles' (ie; only window list output)
  • -w - only search window contents (ie; exclude window list output)
  • -l - only search the 'location' field of the window list
Now, with a single command, I can find out where that ssh session to 'foo' disappeared to. Here's an example screen window list capture (accessed with Ctrl+A " (doublequote))
Num Name                                                              Location Flags

  0 zsh                                                                        syn $
  1 zsh                                                                      scorn $
Now, I want to find all sessions open to 'scorn':
% -t 'scorn'
sty 18210.pts-8.snack window 1
sty 18556.pts-0.snack window 0
It found 2 sessions. I can attach to the first one with:
screen -x 18210.pts-8.snack -p 1
          ^ screen session     ^ window
caveat: I've been hacking on things all night, so the code may or may not be very readable. Apologies if you go blind while trying to read it ;)

Find that lost screen session, episode 2.

Like I said, I run screen in all of my xterms...

xterm sets an environment variable in child processes: WINDOWID. This is the X window id of the xterm window. Using this, we can extend upon my last post and come up with a much neater solution. Knowing what screen session you want to bring forward (assuming it's running in an xterm), we can run a command inside that session that grabs the $WINDOWID variable in the shell and uses xdotool to activate the window.

session=$(sh "#freebsdhelp")
STY=$session screen -X screen sh -c 'xdotool windowactivate $WINDOWID'
Running this causes my IRC window to be activated by xdotool, which means it is now active, focused, and on top.

This isn't entirely optimal, because it assumes the xterm attached to that screen session is the xterm that launched it. If you run 'xterm -e screen -RR' and close the xterm (don't log out of the shell), then rerun 'xterm -e screen -RR' it will attach to that previous screen session, but the WINDOWID will understandably not be correct any longer.

So what do we do? Using the screen session given, we create a new window in that session and set the title of that window to a generated string. We then use xdotool to search for that string and activate the window. Once we get there, we can kill that new screen window we created and we are left with the terminal holding our screen session sitting in front of us.

I wrote a script to do just that tonight: Example usage: 24072.pts-25.snack

This has a great benefit of supporting every terminal program that understands how to set the terminal window title when screen changes it's title. I have tested my .screenrc in Eterm, Konsole, gnome-terminal, and xterm - all know when screen changes it's title if you put this in your .screenrc:

hardstatus string "[%n] %h - %t"
termcapinfo xterm 'hs:ts=\E]2;:fs=\007:ds=\E]2;screen (not title yet)\007'

# Might need this:
termcapinfo  * '' 'hs:ts=\E_:fs=\E\\:ds=\E_\E\\'

Find that lost screen session

Scenario: I run lots of xterms. Each xterm runs a single screen session(*). At any given time, I can only see some of the xterm windows (the others are hidden).

(*) All my xterms run with: 'xterm -e screen -RR'. This causes them to attach to the first-found detached screen, and if none exist creates a new screen session. See for my pleasant, random-colored xterm script.

Problem: I forget where I put things. I can't find that terminal where I'm editing foo.c!

Possible Solutions:

  1. Bad: Kill the vim session that's editing the file, and rerun vim somewhere else.
  2. Good: Use xdotool to search window titles for 'foo.c'
  3. Great: Find the screen STY variable for the process 'vim foo.c'
  4. Great: Ask each open screen session about what it is on screen
Today, we'll cover the two 'great' solutions. I wrote both of these a while ago, but I totally forgot to post about them. Here you go :)

Find a screen by it's child processes

This tool takes a regexp pattern as the only argument and will output a list of screen sessions having child process commands that match that pattern. This is useful for finding what screen is running 'vim foo.c'

% ./ 'vim foo.c'
Find a screen by what is being displayed

This tool takes a regexp pattern as the only argument. It uses screen's hardcopy command to save the on-screen buffer and then applies the regexp given to the buffer. If it matches, the screen session is output. There is special behavior if only one screen session is found: If the screen session is currently attached, it will flash that screen session giving you a visual clue about where it is; if it is not attached, it will attach to it.

% ./ "keynav"
In case you still aren't clear, the two tools help you find your lost screen sessions. Maybe they aren't lost, but certainly it's easier to search for them by text than by eyeballs if you know what's in them.

A short summary: will search for commands running in a screen session and will search for literal text displayed in a screen session. Both are super useful.

Note: Currently, can only capture the contents of the 0th screen window (screen sessions can have multiple windows). I worked for a while on solving this, but for whatever reason I couldn't get it working properly.

Merging multiple svn repositories

Over the past several years, I've used mainly CVS. I tried switching to subversion, which has been slow-going. To speed that process, I merged all of my repositories together into one svn repo. I also used to convert everything in cvs to svn, which put everything into /trunk/ in the repository - not what I wanted. A simple script fixes that:
svn ls $repo/trunk | xargs -I@ -n1 svn mv $repo/trunk/@ $repo/@

I used svn poorly at first - one repository per project. To fix that, I needed to dump all of them (with svnadmin) and load them into a central repository:

# svnadmin dump all of my svn repositories
for i in $repodir/SVN/*; do 
  echo $i;
  svnadmin dump $i > $(basename $i).dump
# load all of my dumpped repositories into the new one
svnadmin create $repo
for i in *.dump; do 
  proj="$(echo $i | cut -d. -f1)";
  svn mkdir -m "mkdir $proj for import" file://$repo/$proj
  svnadmin load --parent-dir $proj $repo < $i

Splitting large logs

I've started using grok more. I don't have any regular log processing to do, so any usage is random and not related to previous runs. As such, I'd would really love it if grok could process faster. Rather than concentrating on a potentially-dead-end of making grok faster (complex regex patterns are cpu intensive), I think it'll be easier to make it parallelizable. The first step towards parallelization of grok is being able to split logs into small pieces, quickly.

Both FreeBSD's and Linux's (GNU, really) split(1) tool is fairly slow when you want to split by line count. Splitting by byte count is (should be) faster, but that will break logs because the byte boundary probably won't ever align on a log boundary..

I'd prefer both: the speed of byte size splitting and the sanity of line based splitting. To that end, I wrote a small tool to do just that. Yes, there's probably a tool already available that does exactly what I want, but this was only 100 lines of C, so it was quick to write. GNU split's --line-bytes option is mostly what I want, but it's still crap slow.

download fastsplit

Here's a comparison between my tool and gnu's split, run on the fastest workstation I have access to. My tool runs 4 times faster than gnu split for this task.

# Source file is 382 megs, which is tiny.
% du -hs access
382M    access

# (Fast) Split into approximately 20meg chunks while preserving lines.
% time ./fastsplit -b 20971520 -p ./ access 

real    0m1.260s
user    0m0.018s
sys     0m1.242s

# (GNU) Split into 87000-line chunks, no guarantee on size.
% time split -l 87000 access split.normal..

real    0m4.943s
user    0m0.395s
sys     0m2.440s

# (GNU) Split into 20mb (max) chunks, preserve lines.
% time split --line-bytes 20971520 access split.normal_bytes.

real    0m4.391s
user    0m0.001s
sys     0m1.779s
You can see that the actual 'system' time is somewhat close (mine wins by 0.4s), but 'real' time is much longer for Linux's split(1).. My solution is really good if you want quickly split logs for parallel processing and you don't really care how many lines there are so much as you get near N-sized chunks.

What's the output look like?

fast split gnu split -l gnu split --line-bytes
% wc -l*
  1654604 total
% wc -l split.normal.*
 87000 split.normal.aa
 87000 split.normal.ab
  1654604 total
% wc -l split.normal_bytes.*
 85973 split.normal_bytes.aa
 80791 split.normal_bytes.ab
  1654604 total
% du -hs*
% du -hs split.normal.*
21M     split.normal.aa
22M     split.normal.ab
% du -hs split.normal_bytes.*
21M     split.normal_bytes.aa
21M     split.normal_bytes.ab

Mini-FreeBSD script

I wrote a script a while ago to build a very tiny freebsd world. It's extremely fast and only builds a freebsd image in approximately 10 megs of space. It lets you quickly create new jail enviroments or system images for small embedded platforms.

If you look at the script itself, you'll get an idea of what it installs. I used a variant of this script to build the system I run on my Soekris net4501 which runs FreeBSD and is under 20 megs.

There are lots of "make a small freebsd system" scripts, but most of the ones I've found rely heavily on 'buildworld' and what not. This takes a live system and copies the binaries you need, then uses ldd(1) to track down required libraries.


Example usage:

kenya(~/t) % rm -rf ./soekris/
kenya(~/t) % time sudo ./
sudo ./   0.16s user 0.65s system 61% cpu 1.326 total
kenya(~/t) % sudo chroot ./soekris /bin/sh
# pwd
# exit
Simple jail config (rc.conf):
Put something simple in this jail's rc.conf (/home/jls/t/soekris/etc/rc.conf):
Let's test the jail now:
kenya(~/t) % sudo /etc/rc.d/jail start
Configuring jails:.
Starting jails: 
At this point, it's probably hung (assuming you enabled sshd). If you hit CTRL+T you'll see what command has the foreground and what it's doing.* This is because it's prompting you (output is directed to JAILROOT/var/log/console.log) for entropy for the ssh-keygen. Smash a few keys then hit enter. It'll finish eventually.
kenya(~/t) % sockstat -4 | grep 
root     sshd       2258  3  tcp4           *:*
Our sshd is running happily inside that jail we made. This whole process took about 5 minutes.

* FreeBSD's CTRL+T terminal handler feature has to be the best thing ever invented. I wish Linux had something like this. Here's what hitting CTRL+T when running cat looks like:

kenya(~) % cat
load: 0.45  cmd: cat 2324 [ttyin] 0.00u 0.00s 0% 600k
load: 0.42  cmd: cat 2324 [ttyin] 0.00u 0.00s 0% 600k
It clearly shows you the command name, the pid, and the syscall-type-thing it's doing. Clearly cat is waiting for input from the tty. <3 FreeBSD.