Jordan Sissel
geek

Sat, 17 Mar 2007

Less bullshit, more graph.

I've been working recently on dynamic, simple graphing. Systems like Cacti provide useful interfaces, but getting data into them is a pain in the ass.

You have 1500 machines you want in Cacti. How do you do it?

My take is that you shouldn't ever need to preregister data types or data sources. Have a system that you simply throw data at, and it stores it so you can get a graph of it later. All I need to do, to graph new data, is simply write a script that produces that data and sends it to the collector.

The collector is a Python CGI script that fronts rrdtool. It takes all CGI parameters and stores their values, with a few exceptions:

  • machine=XX - Spoof machine to store data for. If not given, defaults to REMOTE_ADDR. Useful if you need to proxy data through another machine, or are reporting data about another machine you are probing.
  • timestamp=XX - Override default timestamp ("now").

Everything else gets stored like this: /dataroot/<machine>/<variable>.rrd
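
A minimal sketch of how a collector could turn one request into rrd paths, assuming Python 3 and the standard library only. The function name `plan_updates`, the `now` argument, and the `basename()` sanitizing are illustrative assumptions, not the actual updater.py.

```python
import os
import time
from urllib.parse import parse_qsl

# Parameters the post says are handled specially instead of being stored.
RESERVED = {"machine", "timestamp"}

def plan_updates(query_string, remote_addr, dataroot="/dataroot", now=None):
    """Return (timestamp, [(rrd_path, value), ...]) for one update request."""
    params = dict(parse_qsl(query_string))
    # machine= overrides the address the request came from.
    machine = os.path.basename(params.get("machine", remote_addr))
    # timestamp= overrides "now".
    default_time = now if now is not None else time.time()
    timestamp = int(params.get("timestamp", default_time))
    updates = []
    for key, value in params.items():
        if key in RESERVED:
            continue
        # basename() keeps a key like "../x" inside the data root
        # (a precaution added in this sketch, not described in the post).
        name = os.path.basename(key)
        if name:
            path = os.path.join(dataroot, machine, name + ".rrd")
            updates.append((path, value))
    return timestamp, updates
```

Each returned pair would then be handed to something like `rrdtool update <path> <timestamp>:<value>`, creating the rrd first if it does not exist yet.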

Example:

kenya(/mnt/rrds/129.21.60.26) % ls
C_bytes_per_page.rrd                            C_pages_inactive.rrd
C_cpu_context_switches.rrd                      C_rfork_calls.rrd
... etc ...

All of those rrds are created by simply throwing data at the Python CGI script. The source of the data is a script that runs 'vmstat -s' and turns it into key-value pairs.

Why are the files prefixed with "C_"? The data I am feeding in comes from counters, and therefore should be stored as counter datatypes in rrdtool. The 'C_' prefix is a hint: if the variable needs an rrd created for it, the DS type should be COUNTER. Without the prefix, the default is GAUGE.
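
The prefix rule could look roughly like this. It is a sketch, not the real script: the data source name `value`, the 60-second step, the heartbeat of twice the step, and the single AVERAGE archive are all assumptions picked for illustration.

```python
def ds_type(variable):
    """Choose the rrdtool DS type from the variable name's prefix."""
    return "COUNTER" if variable.startswith("C_") else "GAUGE"

def create_command(rrd_path, variable, step=60):
    """Build an 'rrdtool create' command line for a variable seen for the first time."""
    heartbeat = step * 2  # assumed: tolerate one missed update before storing unknown
    return [
        "rrdtool", "create", rrd_path,
        "--step", str(step),
        # DS:<name>:<type>:<heartbeat>:<min>:<max>; "value" is a made-up DS name
        f"DS:value:{ds_type(variable)}:{heartbeat}:U:U",
        # keep 1440 one-step averages (one day at a 60 second step)
        "RRA:AVERAGE:0.5:1:1440",
    ]
```

The collector would run this (for example with `subprocess.run(cmd, check=True)`) only when the target .rrd file is missing, then proceed with the update as usual.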

Sample update http request:
http://somehost/updater.py?C_fork_calls=32522875&C_system_calls=235293874987

Feel free to view the vmstat -s poll script to get a better idea of what this does. I also have another script that scrapes 'netstat -s' on FreeBSD (it probably works on Linux too).

vmstat -s looks like this:

456846233 cpu context switches
3220655757 device interrupts
 17964606 software interrupts
  ... etc ...

It's trivial to turn this into key-value pairs. If this were Cacti (or a similar system) I would have to go through every line of vmstat -s and create a new data type/source/thing for each one, then create one per host. Screw that. Keep in mind my experience with Cacti is pretty limited - I saw I had to register data sources and graphs and such manually and left it alone after that.
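
To show how little work that is, here is a rough sketch of the conversion. The names `vmstat_to_pairs` and `update_url` are hypothetical and this is not the poll script mentioned above; it only assumes each line has the shape "<number> <description>".

```python
import re
from urllib.parse import urlencode

def vmstat_to_pairs(text, prefix="C_"):
    """Turn 'vmstat -s' output into {key: value}, e.g. C_cpu_context_switches."""
    pairs = {}
    for line in text.splitlines():
        fields = line.split(None, 1)  # number first, then the description
        if len(fields) != 2 or not fields[0].isdigit():
            continue  # skip anything that is not "<number> <description>"
        count, label = fields
        # Collapse spaces and punctuation into single underscores.
        key = prefix + re.sub(r"[^A-Za-z0-9]+", "_", label.strip()).strip("_")
        pairs[key] = count
    return pairs

def update_url(base, pairs):
    """Build the update request URL for the collector."""
    return base + "?" + urlencode(pairs)
```

Feeding it the output of `subprocess.check_output(["vmstat", "-s"], text=True)` and fetching the resulting URL with `urllib.request.urlopen` would complete the poller.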

Anyway, back to the problem. Now how do I graph it? The interface isn't the best, but we use a CGI script again:

Show me all the machines with 'C_system_calls' graphed over the past 15 minutes:
graph.py?machines=129.21.60.1,<...>,129.21.60.26&keys=C_system_calls&start=-15min
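
A sketch of how such a request might be translated into an rrdtool graph invocation. This is not the real graph.py: the function name, the color list, the output filename, the default start, and the assumption that every rrd holds one data source named `value` are all illustrative, as is treating `keys` as a comma-separated list.

```python
from urllib.parse import parse_qs

# Arbitrary line colors, reused in order when there are more lines than colors.
COLORS = ["#FF0000", "#00AA00", "#0000FF", "#AA00AA", "#FF8800"]

def graph_command(query_string, dataroot="/dataroot", output="graph.png"):
    """Build an 'rrdtool graph' command with one line per machine/key pair."""
    params = parse_qs(query_string)
    machines = params["machines"][0].split(",")
    keys = params["keys"][0].split(",")
    start = params.get("start", ["-1h"])[0]  # assumed default when start= is absent
    cmd = ["rrdtool", "graph", output, "--start", start]
    n = 0
    for machine in machines:
        for key in keys:
            vname = f"v{n}"  # rrdtool variable names must be simple identifiers
            path = f"{dataroot}/{machine}/{key}.rrd"
            color = COLORS[n % len(COLORS)]
            cmd.append(f"DEF:{vname}={path}:value:AVERAGE")
            cmd.append(f"LINE1:{vname}{color}:{machine} {key}")
            n += 1
    return cmd
```

Paths or legends containing a colon would need escaping before being passed to rrdtool; IPv4 addresses and the key names shown here do not.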

With this kind of system you never need to explicitly define data input variables or data input sources - all you need is to hack together a script that can pump out key-value pairs. No documentation to read. No time consumed registering 500 new servers in your graph system.

Permalink: /geekery/no-bullshit-graphs
posted at: 04:42

Merging multiple svn repositories

Over the past several years, I've mainly used CVS. I've been trying to switch to Subversion, which has been slow going. To speed that process, I merged all of my repositories together into one svn repo. I also used cvs2svn.py to convert everything in CVS to svn, which put everything into /trunk/ in the repository - not what I wanted. A simple script fixes that:
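repo=file:///path/to/repo
# -I@ already runs one svn mv per entry; -m supplies the log message a URL-to-URL move needs
svn ls "$repo/trunk" | xargs -I@ svn mv -m "move @ out of trunk" "$repo/trunk/@" "$repo/@"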

I used svn poorly at first - one repository per project. To fix that, I needed to dump all of them (with svnadmin) and load them into a central repository:

# svnadmin dump all of my svn repositories
repodir="/home/foo"
for i in "$repodir"/SVN/*; do
  echo "$i"
  svnadmin dump "$i" > "$(basename "$i").dump"
done

# load all of my dumped repositories into the new one
repo="/home/foo/NEWSVN"
svnadmin create "$repo"
for i in *.dump; do
  proj="${i%.dump}"  # strip the .dump suffix to recover the project name
  svn mkdir -m "mkdir $proj for import" "file://$repo/$proj"
  svnadmin load --parent-dir "$proj" "$repo" < "$i"
done

Permalink: /geekery/merging-multiple-svn-repos
posted at: 02:36
