Got Firefox with too many tabs open?

My work environment has changed drastically as I become more dependent on Firefox to feed me information from the web. As such, I often have more and more tabs open as the day progresses. Closing a tab is annoying, because in 15 minutes I'll probably need that page again. Since I don't close tabs often, I end up with a huge tab list where the only thing I can see in each tab is the favicon.

Obviously, searching for a tab is extremely difficult when all you have for identification is horizontal location and a tiny icon. I had previously resorted to a Monte Carlo-style method of finding tabs, which is just a fancy term for "keep clicking random tabs until you find the right one." Again, this has the obvious side effect of wasting time and annoying me.

I set out yesterday to write a Firefox extension to solve this issue. I've got a prototype working pretty well, but a friend recently pointed me at Reveal. Reveal is a Firefox extension that lets me search the titles and URLs of open tabs for a page I'd like to view. It can also search page text, but I imagine that is quite CPU-intensive, so I don't bother. I installed it, and while I don't care for the thumbnails, the searching is fast and useful. However, I have doubts about its speed on slower machines (this particular desktop is 3.4 GHz).

With about 15-20 tabs open, it seems to perform pretty well. I am uncertain as to how well it scales. Being able to turn thumbnailing off entirely might be an option worth trying. For now, Reveal solves the problem well enough.

I have lots of ideas about how to search pages quickly without having to search the entire text content - like only looking at "important" tags such as the H1-H6 series, etc. I haven't needed page text searching yet, though, so I'll wait on spending time on that particular problem.

Parallelization with /bin/sh

I have 89 log files. The average file size is 100ish megs. I want to parse all of the logs into something else useful. Processing 9.1 gigs of logs is not my idea of a good time, nor is it a good application for a single CPU to handle. Let's parallelize it.

I abuse /bin/sh's ability to background processes and wait for children to finish. I have a script that takes a pool of available computers and sends tasks to them. These tasks are just "process this apache log" - but the speed increase of parallel execution over a single process is incredible, and it is very simple to do in the shell.

The script that performs this parallelization is here: parallelize.sh

I define a list of hosts to use in the script and pass a list of logs to process on the command line. The host list is repeated until it is longer than the number of logs. I then pick a log and send it off to a server to process using ssh, which calls a script that outputs to stdout. Output is captured to a file named after the hostname and the pid.
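The core of the approach looks something like the sketch below. This is not the real parallelize.sh; the host names, the remote process-log.sh script, and the output file naming are stand-ins for illustration.

#!/bin/sh
# Sketch: fan log files out to a pool of hosts over ssh, background each
# remote job, then wait for all of them to finish.
HOSTS="host1 host2 host3 host4"
NHOSTS=4

i=0
for log in "$@"; do
    # Round-robin over the host pool.
    i=$(( (i % NHOSTS) + 1 ))
    host=$(echo "$HOSTS" | cut -d' ' -f"$i")

    # Stream the log to the remote processing script and capture its stdout
    # locally, naming the output after the host and the job counter.
    ssh "$host" process-log.sh < "$log" > "$log.$host.$i.out" &
done

# Block until every backgrounded ssh job has exited.
wait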

I didn't run the single-process version in full to compare running times; however, parallel execution gets *much* farther in 10 minutes than a single process does. Sweet :)

Some of the log files are *enormous* - taking up 1 gig alone. I'm experimenting with split(1) to break these files into pieces of 100,000 lines each. The problem right now is that all of the tasks finish except for the 4 processes handling the four 1-gig log files. Splitting will make the individual jobs smaller, letting us process them faster because we have a more even workload across processes.
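For the splitting itself, split(1) does all the work; something along these lines chops a big log into 100,000-line pieces:

# Produces huge.log.aa, huge.log.ab, ... each 100,000 lines long.
split -l 100000 huge.log huge.log.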

So, a simple problem that benefits from parallelization is solved with simple, standard tools. Sexy.

Music yields productivity?

I almost always listen to music. I've mentioned this before, but I was reminded tonight of how music seems to make projects happen quicker, especially school projects. Tonight I was working on my Algorithms class project (spanning tree stuff) and found that I was working better than normal. This is due to my finally having found some music to put in my laptop and listen to.

Rochester radio is the absolute definition of excessive repetition. Of the 11 stations I have programmed into my receiver, none of them were playing a refreshing list of songs I hadn't heard in a while. So I put in a Nightwish cd and listened to that instead. Productivity increased.

Assuming the following theories:
Music = Productivity
Xterms = Productivity

If you increase xterms and the quality of music, how much can we increase productivity by? ;)

Music may work well for me because I tend to get distracted or bored easily when doing required tasks such as school work. If there's music playing I like, then I'll end up distracting myself momentarily with the music instead of other more time-consuming tasks such as reading news/blogs. Either way, music works.

It makes me wonder how many other people do the same thing.

statistic deltas using awk

Here's a short shell script I call 'delta' - it is useful for grokking 'vmstat -s' output (and possibly other commands), showing time-based deltas of each counter.
#!/bin/sh
# Run the given command once a second, strip leading whitespace, and print
# how much each counter changed since the previous sample.
while :; do
   "$@" | sed -e 's/^ *//';
   sleep 1;
done | awk '
{
   # Everything after the leading number is the counter description.
   line = substr($0, length($1)+1);

   # Print the delta once we have a previous sample for this counter.
   if (line in foo) {
      printf("%10d %s\n", $1 - foo[line], line);
   }
   foo[line] = $1;
   fflush();
}'
Example usage:
delta vmstat -s | grep -E 'system calls|fork'

       792  system calls
         3   fork() calls
         0  vfork() calls
       120  pages affected by  fork()
         0  pages affected by vfork()
       680  system calls
         3   fork() calls
         0  vfork() calls
       120  pages affected by  fork()
         0  pages affected by vfork()
      1150  system calls
         3   fork() calls
         0  vfork() calls
       120  pages affected by  fork()
         0  pages affected by vfork()

Watching yourself work

I got bored and wanted to see how many lines of code I had added or otherwise altered since winter break started.

The actual numbers are:

  • Added: 1617
  • Deleted: 1964
I've deleted a lot more than I have added. This is reasonable considering I've been doing quite a bit of code refactoring and I always end up with smaller code that does more.

I'm sure the method I used wasn't the cleanest or best way to go about finding these numbers, but I wasn't looking to spend more than 5 minutes figuring it out. Here's how I looked those numbers up:

% p4 diff -ds `find ./ | sed -e 's,$,@2005/12/20,'` > changes
% awk 'BEGIN { add = 0; del = 0; chg = 0 }; 
/^add/ { add += $4 }; 
/^deleted/ { del += $4 }; 
/^changed/ { chg += $4 }; 
END { printf "Add/Del/Chg: %d/%d/%d", add, del, chg}' changes

Add/Del/Chg: 326/10/139

% svn diff -r {2005-12-20}:head > diffs
% awk 'BEGIN {add = 0; del = 0}
/^-[^-]/ { add += 1};
/^\+[^+]/ { del += 1 };
END { print add" "del}' diffs

1152 1815

migrating from nis to ldap, round 1

We at CSH need to move from nis and the many other user-information datastores we use to LDAP instead. To that end, I have started working on merging our user data. The first step is importing NIS (passwd/group) information into ldap.

I wrote a script, passwd2ldif, that takes NIS passwd information and converts it into LDIF for loading into ldap.
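A rough sketch of what such a conversion could look like is below. The base DN, objectClasses, and attribute layout here are assumptions for illustration; the real passwd2ldif may differ.

#!/bin/sh
# Read passwd(5)-format lines on stdin, write one LDIF entry per user on stdout.
awk -F: '{
   printf("dn: uid=%s,ou=Users,dc=csh,dc=rit,dc=edu\n", $1);
   printf("objectClass: account\n");
   printf("objectClass: posixAccount\n");
   printf("uid: %s\n", $1);
   printf("cn: %s\n", ($5 != "") ? $5 : $1);
   printf("uidNumber: %s\n", $3);
   printf("gidNumber: %s\n", $4);
   printf("homeDirectory: %s\n", $6);
   printf("loginShell: %s\n", $7);
   printf("\n");
}'

Feeding it the NIS map and loading the result: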

ypcat passwd | ./passwd2ldif > cshusers.ldif
ldapadd -D "cn=happyrootuserthinghere,dc=csh,dc=rit,dc=edu" -f cshusers.ldif
Wait a while, and all users from NIS show up in ldap. I have my laptop looking at ldap for user information using nss_ldap:
nightfall(~) [690] % finger -m psionic
Login: psionic                          Name: Jordan Sissel
Directory: /u9/psionic                  Shell: /usr/bin/tcsh
Never logged in.
No Mail.
No Plan.
Pretty simple stuff, so far. The next step is going to involve creating a new schema to support all of the information we currently store in "member profiles." Member profiles are a huge mess: a single mysql table with lots of columns such as "rit_phone," "csh_year," "aol_im," and others. All of that can go into ldap. I'll post more on this later when I figure out what kind of schema we want.

Poor man's todo list.

I've often had a yearning for any kind of todo list that meets the following requirements:
  • Simple to use
  • Easy to maintain
  • Quick to start using
  • Highly mobile
  • Requires low effort
I have tried many kinds of "todo" lists. The first one is the ungeeky kind, Ye Olde Paper. Paper is great; unfortunately, it doesn't replicate easily and is easily lost when made portable. Post-It notes fall under this category - easily lost, not mobile without a high chance of loss.

Next, I tried online "todo" lists such as tadalist.com. Such organizational tools are great and meet all but one of my requirements: requiring low effort. I am a creature of habit, and learning new habits is difficult. This "new habit" would be continually visiting the online todo list. What actually happens is I update the todo list once, then promptly forget about it. That means I need some sort of periodic reminder.

There exist many kinds of virtual "postit" programs. One such program is called xpostitPlus. It wastes valuable screen real estate and is ugly. Furthermore, it is not mobile unless I use X forwarding or replicate the notes database. GNOME has a similar program called 'stickynotes-applet' or something like that. I don't use GNOME, and I imagine the stickynotes applet suffers from the same problems as xpostitPlus.

So I got to thinking about how to best solve this problem. I immediately thought about writing my own python-gtk app for it, but I quickly realized it would suffer from the same problems as the virtual postit programs. Furthermore, that's overengineering a solution to a simple problem that can have a simple solution (pen and paper, remember). Then I remembered that zsh has a 'periodic' feature that lets you schedule a job to run every N seconds. "Every N seconds" isn't quite true, and that turns out to be beneficial: it actually schedules execution for N seconds after the last run, but doesn't execute until you reach a prompt. My solution is very simple, portable, and easy for me to use.

In my .zshrc:

# Periodic Reminder!
PERIOD=3600                       # Every hour, call periodic()
function periodic() {
        [ -f ~/.plan ] || return

        echo
        echo "= Todo List"
        sed -e 's/^/   /' ~/.plan
        echo "= End"
}
The periodic function runs as soon as the shell starts, so I see my '.plan' file as soon as I open an xterm or otherwise log in. Every hour after that I get a reminder. I may change this to once a day or something, but for the most part my solution is complete and meets my requirements.

My '~/.plan' file:

* Register for classes
* Pimp
* newpsm/newmoused 
* rum
This solution is stupid simple and effective:
  • Simple to use: integrated into my shell
  • Easy to maintain: with vi
  • Quick to start using: 10 lines of shell and vi... done
  • Highly mobile: via ssh, local .plan replica, etc.
  • Requires low effort: reminders are automatic
I may improve this later using xsltproc(1) so I can set priority levels and other things, but for now this will definitely suffice.

Poor Man's Backup: rsync + management

I got bored and made some useful adaptations to a backup script I wrote for class. It turned into a simple backup/recovery script that supports multiple-host backups and very easy recovery.

Read more about the project on the project page: projects/pmbackup

There are a few caveats with the way I currently do it. The first is that file ownership is not preserved. Preserving ownership is only possible if rsync runs as root on the backup server while doing backups, or as root on the client when doing recovery. I'm going to set up a "backup jail" on my machine with only rsync and sshd in it, so an ssh key can let me log in to that jail as root and backups can preserve file ownership. There may be a better way to do this, but jailing seems the simplest and most secure.
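For reference, the rsync invocation involved is roughly of this shape (the hostname and paths here are made up; the real script is on the project page):

# Pull the client's home directories over ssh into a per-host directory.
# -a asks for permissions and ownership, but ownership only sticks when the
# receiving rsync runs as root - hence the jail idea above.
rsync -az --delete -e ssh backupuser@client.example.com:/home/ /backups/client/home/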

programming sanity

A project I'm working on at work contains piles of some of the worst-written code I've seen in a long time. Top that with three tons of commented-out source code and you've got me angry.

I use vim (NOT gvim). I don't use syntax highlighting. The bright colors available in the terminal are too contrasting to be useful for syntax highlighting, so I have it off at all times. However, this new project has so many commented-out lines that it becomes extremely difficult to read the code that is actually live.

Enter vim's syntax highlighting. I hate coloring of most things, but having comments a particular color is acceptable. I would prefer these comments be a dark-but-visible-on-a-black-background shade of grey, so I try this in vim:

hi Comment ctermfg=grey
This makes comments appear grey. However, the grey is not quite dark enough, so I'll need to fix that. The ANSI color number for grey is 7, so in my ~/.Xresources file I need to redefine what this color really looks like, specifically making it darker. I used xcolors(1) to show me a list of colors with names that X would recognize, and I saw that 'grey' was a decent shade of, you guessed it, grey. So I put this in my .Xresources file:
XTerm*color7: grey
Now run xrdb -merge ~/.Xresources and new xterms will use this color. Opening up vim, the comments look just fine.

One special mention here is the difference between my CRT monitor and the LCD on my laptop. The default grey color (7) in X on my laptop is perfectly visible. However, on my CRT I need to darken this color to make it visible to me. Your mileage may vary - the CRT I'm currently using is a pretty crappy 19" trinitron with some contrast issues, so using another monitor may clear this up.

log-watching expert system

I got bored and wrote an expert system for doing log analysis and reaction. Its original intention was to watch auth.log for brute-force login attempts and block them on the firewall. It has turned into a far more flexible system for doing generic log-based matching and reaction. Reactions are triggered by a threshold of hits over time. The 'reaction' section of the config file specifies what command is run (this could be a simple shell script you call, for example, like the one sketched below).
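For example, a reaction command could be as small as a shell script like this one. The script is hypothetical; it assumes pf with a 'badhosts' table that a block rule already references.

#!/bin/sh
# Hypothetical reaction script: add the offending address to a pf table
# so the firewall starts dropping its traffic.
# Usage: block-host.sh <ip-address>
pfctl -t badhosts -T add "$1"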

There are a few features I'll probably be adding soon such as multiple threshold/reactions per match type, but that's somewhat down the road for when I have more boredom to throw at the project. I also want to allow users to add their own meta globs (like %USERNAME%) into the config file so the program is even more flexible.

Currently it runs on my mirror server and blocks excessive (brute force) ssh attempts; it seems to be going well. The development process led me to a very slick perl module called Parse::RecDescent, which parses documents based on a given grammar. I used it for the config file, and it was pleasantly easy to use. Check out logwatch and download it.

It requires the following perl modules:

  • File::Tail
  • Regexp::Common
  • Parse::RecDescent