Getting your python as rpms

I was working on a new python 2.6 rpm to push at work and started wondering about how to get python eggs to become rpms. Ruby has a gem package called gem2rpm that aids in generating rpms from ruby gems, but there's not really an egg2rpm project.

We're in luck, though. Python's distutils (and setuptools, which builds on it) can generate rpms out of the box, it seems. Those 'python setup.py' invocations you may be accustomed to can trivially generate rpms.

The secret sauce is the 'bdist_rpm' command given to setup.py:

% wget -q http://boto.googlecode.com/files/boto-1.8d.tar.gz
% tar -zxf boto-1.8d.tar.gz
% cd boto-1.8d
% python setup.py bdist_rpm
% find ./ -name '*.rpm'
./dist/boto-1.8d-1.noarch.rpm
./dist/boto-1.8d-1.src.rpm
Piece of cake. I've tried this on a handful of python packages (boto, simplejson, etc), and they all seem to produce happy rpms.

However, if you have multiple versions of python available, you'll want to explicitly hardcode the path to python:

% python setup.py bdist_rpm --python /usr/bin/python2.6
% rpm2cpio dist/boto-1.8d-1.noarch.rpm | cpio -it | grep lib | head -3
2745 blocks
./usr/lib/python2.6/site-packages/boto-1.8d-py2.6.egg-info
./usr/lib/python2.6/site-packages/boto/__init__.py
./usr/lib/python2.6/site-packages/boto/__init__.pyc
The default python on this system is python 2.4. Doing the above forces a build against python2.6 - excellent, but maybe we're not quite there yet. What if you need this package for both python 2.4 and 2.6? For this, you'll need separate package names. However, the bdist_rpm command doesn't have a way of setting the rpm package name. One way is to hack setup.py with the new name:
% grep name setup.py
setup(name = "boto",
% sed -re 's/name *= *"([^"]+)"/name = "python24-\1"/'  setup.py > setup24.py
% grep name setup24.py
setup(name = "python24-boto",

# Now build the new rpm with the new package name, python24-boto
% python setup24.py bdist_rpm --python /usr/bin/python2.4
For our boto package, this creates an rpm with a new name: python24-boto. Hacking the setup.py script this way is nice because the command to build the rpm stays basically the same. The alternative would be to run 'python setup.py bdist_rpm --spec-only', edit the generated spec file, and then craft whatever rpmbuild command is necessary. The method above is less effort and trivially automatable with no knowledge of rpmbuild or specfiles. :)
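
The sed one-liner can also be written in Python, which may be handy if the renaming ends up inside a larger build script. This is only a sketch: prefix_package_name is a hypothetical helper, it reuses the same pattern as the sed expression (so it assumes the name is written with double quotes), and it is written for a current Python 3 rather than the 2.4/2.6 interpreters discussed here.

```python
import re

# Same pattern as the sed expression: name = "something"
NAME_RE = re.compile(r'name *= *"([^"]+)"')


def prefix_package_name(setup_source, prefix):
    """Return setup.py source text with the package name prefixed.

    Hypothetical helper: only the first match is rewritten, and a setup.py
    that uses single quotes or computes its name is left untouched.
    """
    def rename(match):
        return 'name = "%s-%s"' % (prefix, match.group(1))

    return NAME_RE.sub(rename, setup_source, count=1)


if __name__ == "__main__":
    # Sample input taken from the grep output above.
    print(prefix_package_name('setup(name = "boto",', "python24"))
```

Like the sed version, this matches any 'name = "..."' text, so it is worth grepping the result before building.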

Repeat this process for python26, and now we have two boto rpms, one for each python.
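
Since the only things that change between the two builds are the name prefix, the setup script, and the interpreter path, the repetition is easy to describe as data. The sketch below only builds the command lines. build_plan and the setupNN.py naming are assumptions extrapolated from the python24 example, and nothing here actually runs a build.

```python
def build_plan(versions):
    """Describe one bdist_rpm build per python version such as "2.4"."""
    plan = []
    for version in versions:
        tag = version.replace(".", "")   # "2.4" -> "24"
        script = "setup%s.py" % tag      # assumed naming, as in setup24.py
        plan.append({
            "prefix": "python" + tag,    # package name prefix, e.g. python24
            "script": script,
            "command": ["python", script, "bdist_rpm",
                        "--python", "/usr/bin/python" + version],
        })
    return plan


if __name__ == "__main__":
    for build in build_plan(["2.4", "2.6"]):
        print(build["prefix"], " ".join(build["command"]))
```

Each command could be handed to subprocess once the renamed setup script for that version exists.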

% rpm -Uvh python2?-boto-*noarch.rpm
Preparing...                ########################################### [100%]
   1:python26-boto          ########################################### [ 50%]
   2:python24-boto          ########################################### [100%]

% python2.4 -c 'import boto; print True'
True
% python2.6 -c 'import boto; print True'
True
Excellent.

jps output not correct

A nagios alert checking for some java processes started firing because it couldn't find those processes. This check used 'jps' to look for those processes.
% sudo /usr/java/jdk1.6.0_04/bin/jps
15071 Jps
% ps -u root | grep -c java
15
I expected lots of output from jps, but there was only the jps process itself. Confusing. What does jps use to track java processes?
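
For context, a check like the one that fired might boil down to something like the following. The real nagios check is not shown in this post, so parse_jps, missing_processes, and the ExampleServer name are all hypothetical. The only thing taken from above is the 'pid name' format of jps output.

```python
def parse_jps(output):
    """Map pid -> name from jps output lines such as '15071 Jps'."""
    processes = {}
    for line in output.splitlines():
        fields = line.split(None, 1)
        if not fields or not fields[0].isdigit():
            continue  # skip blank or unexpected lines
        name = fields[1].strip() if len(fields) > 1 else ""
        processes[int(fields[0])] = name
    return processes


def missing_processes(output, expected_names):
    """Return the expected names that jps did not report, sorted."""
    seen = set(parse_jps(output).values())
    return sorted(set(expected_names) - seen)


if __name__ == "__main__":
    # With only jps itself listed, every expected process counts as missing.
    print(missing_processes("15071 Jps\n", ["ExampleServer"]))
```

With jps reporting only itself, a check of this shape has no choice but to alert, even though ps shows the java processes are alive.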

Your old friend strace (or truss, whatever) will help you here:

# Always use 'strace -f' on java processes as they spawn new processes/threads
% sudo strace -f /usr/java/jdk1.6.0_04/bin/jps |& grep -F '("/' \
  | fex '"2/{1:2}' | sort | uniq -c | sort -n | tail -5
      5 proc/self 
      5 proc/stat 
     12 usr 
     17 tmp/hsperfdata_root 
    283 usr/java 
It referenced /tmp/hsperfdata_root multiple times. Weird, checking it out:
% ls /tmp/hsperfdata_root | wc -l
0
This directory is empty. Looking further through the strace output, and confirming by looking at the classes jps invokes (sun.jvmstat.perfdata.monitor.protocol.local.MonitoredHostProvider), shows that /tmp/hsperfdata_<user> is used by each jvm instance. Each jvm stores a file there named after its pid.
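
To make the layout concrete, here is a rough Python sketch that lists pids the same way you would by hand: one hsperfdata_<user> directory per user, one file per pid. jvm_pids is a hypothetical helper, and jps itself does more than read file names, so treat this as an illustration of why an empty directory leaves jps with nothing to report.

```python
import os
import tempfile


def jvm_pids(tmp_root="/tmp"):
    """Map user -> sorted pids found under tmp_root/hsperfdata_<user>/."""
    prefix = "hsperfdata_"
    found = {}
    for name in os.listdir(tmp_root):
        path = os.path.join(tmp_root, name)
        if not name.startswith(prefix) or not os.path.isdir(path):
            continue
        try:
            entries = os.listdir(path)
        except OSError:
            continue  # unreadable directory: skip rather than guess
        found[name[len(prefix):]] = sorted(int(e) for e in entries if e.isdigit())
    return found


if __name__ == "__main__":
    # Demonstrate against a scratch directory instead of the real /tmp.
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, "hsperfdata_root"))
    open(os.path.join(root, "hsperfdata_root", "15071"), "w").close()
    print(jvm_pids(root))
```

Pointing jvm_pids at the real /tmp on an affected host would show the user with an empty pid list, matching the 'ls | wc -l' result above.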

So the question is, why is this directory empty?

Of the hosts I know of that run java, it seems that only long-running java instances are disappearing from jps, which makes me think we have a cron job removing files from /tmp. I found this while looking through cron jobs:

% cat /etc/cron.daily/tmpwatch 
/usr/sbin/tmpwatch -x /tmp/.X11-unix -x /tmp/.XIM-unix -x /tmp/.font-unix \
        -x /tmp/.ICE-unix -x /tmp/.Test-unix 240 /tmp
/usr/sbin/tmpwatch 720 /var/tmp
for d in /var/{cache/man,catman}/{cat?,X11R6/cat?,local/cat?}; do
    if [ -d "$d" ]; then
        /usr/sbin/tmpwatch -f 720 "$d"
    fi
done
This file comes from the tmpwatch rpm, which appears to be part of the base install on CentOS. This means that every file in /tmp (except those under the directories excluded with '-x dir') is deleted once it is older than 240 hours (10 days). As an FYI, the time value inspected by default is the file's atime, so if you mount with noatime, the access time is not reliable.
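
The age test is simple enough to sketch in Python. This is a simplification and not tmpwatch's actual logic: stale_files is a hypothetical helper that only looks at regular files directly under one directory and only compares atime against the cutoff.

```python
import os
import tempfile
import time


def stale_files(directory, max_age_hours, now=None):
    """Names of files directly under directory not accessed in max_age_hours."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    stale = []
    for entry in os.scandir(directory):
        info = entry.stat(follow_symlinks=False)
        if entry.is_file(follow_symlinks=False) and info.st_atime < cutoff:
            stale.append(entry.name)
    return sorted(stale)


if __name__ == "__main__":
    # Build a scratch directory with one old and one recent file.
    root = tempfile.mkdtemp()
    now = time.time()
    for name, age_hours in [("old.pid", 241), ("fresh.pid", 1)]:
        path = os.path.join(root, name)
        open(path, "w").close()
        stamp = now - age_hours * 3600
        os.utime(path, (stamp, stamp))  # set atime and mtime explicitly
    print(stale_files(root, 240, now=now))
```

A jvm's hsperfdata file that has not been accessed for ten days looks exactly like old.pid to a test of this kind, no matter that the process is still running.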

Ultimately, we need to add a new set of flags to the cronjob that excludes /tmp/hsperfdata_*. This should keep me from being paged when a java process lives for more than 10 days ;)
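
Because '-x' in the cron job above is always given a literal path, one way to get there is to expand /tmp/hsperfdata_* first and emit one '-x' per directory. The sketch below builds such a command line in Python. tmpwatch_command and hsperfdata_dirs are hypothetical helpers, and your tmpwatch version may offer a better way to exclude by pattern, so check its man page before copying this idea into a cron script.

```python
import glob
import os
import tempfile


def tmpwatch_command(hours, directory, excludes):
    """Build a tmpwatch argument list with one '-x path' per exclusion."""
    command = ["/usr/sbin/tmpwatch"]
    for path in excludes:
        command += ["-x", path]
    return command + [str(hours), directory]


def hsperfdata_dirs(tmp_root="/tmp"):
    """Existing hsperfdata_<user> directories under tmp_root, sorted."""
    pattern = os.path.join(tmp_root, "hsperfdata_*")
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


if __name__ == "__main__":
    # Use a scratch directory so the demo does not depend on the real /tmp.
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, "hsperfdata_root"))
    print(" ".join(tmpwatch_command(240, root, hsperfdata_dirs(root))))
```

The expansion happens each time the job runs, so a directory that appears later is picked up on the next daily run, long before any file in it reaches the 240 hour cutoff.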

Additionally, it makes me think that the people who use CentOS don't use Java or don't monitor their java processes with jps.