All posts by Huw Lynes

Jobs and Prizes

Apparently we’ve made the shortlist of the British Computer Society 2008 IT Industry Awards. This is in the environmental category for our new compute cluster install.

In slightly less esoteric news, work is advertising two new posts at the Advanced Research Computing Division of Cardiff University. We are looking for:

Both these positions are co-funded by Bull Information Systems and will involve a significant amount of collaboration with them. On a personal note, I’ve found the Bull R&D team in France and the support team in the UK to be an absolute pleasure to work with. I would happily apply for one of these jobs except that: a) I hate people and b) I despise Fortran.

Dealing With Stupid Programs That Think They Need X

The new compute cluster is beginning to feel like a production system. I’m currently run off my feet installing software for the stream of new users. Mostly this is fine, but occasionally I run into software that makes me want to bang my head repeatedly on my desk until the pain goes away; or, more accurately, makes me want to bang the programmer’s head on the desk.

Just today we received a Linux port of a code that has been running on the Windows Condor pool for a while now. Everything seemed fine except for its stubborn refusal to run if it couldn’t find a windowing system. Bear in mind that it doesn’t actually produce any graphical output; it just dies if it can’t connect to X. After a bit of futzing around we discovered that the people who normally run this code do something like:

Xvfb :1 -screen 0 1024x1024x8 &
export DISPLAY=:1
./stupid_code_that_wants_X

Xvfb is the X virtual framebuffer. It gives you a running X server without needing any graphics hardware or a real display.

Which works just great locally but if you want to launch that as a script in the job scheduling system (we use PBSpro) then you need to be a bit more careful. What happens if two of these jobs try to launch on the same machine? Obviously one of them will fail because display 1 is already allocated. What I really needed was a script that will try to launch Xvfb and increment DISPLAY on failure until it finds a display that is free. For your edification here it is:

get_xvfb_pid () {
	XVFB_PID=`ps -efww | grep -v grep | grep Xvfb |\
       grep $USERNAME | tail -n 1 | awk '{print $2}'`
	}

create_xvfb () {
	USERNAME=`whoami`
	DISPLAYNO=1
	xvfb_success=""
	while [ -z "$xvfb_success" ]
		do
		get_xvfb_pid
		old_XVFB_PID=$XVFB_PID
		XVFB_PID=""
		Xvfb :${DISPLAYNO} -screen 0 1024x1024x8 >& /dev/null &
		sleep 1
		get_xvfb_pid
		# Success means an Xvfb of ours is running that was not there before
		# (this also covers the case where no Xvfb was running beforehand).
		if [ -n "$XVFB_PID" ] && [ "$XVFB_PID" != "$old_XVFB_PID" ]
			then
			echo "Started XVFB on display $DISPLAYNO process $XVFB_PID"
			xvfb_success=1
		else
			# Display already taken: try the next one
			DISPLAYNO=$(($DISPLAYNO + 1))
			XVFB_PID=""
		fi
 		done
	export XVFB_PID
	export DISPLAY=:${DISPLAYNO}
	}

kill_xvfb () {
	kill $XVFB_PID
	}

Which you can call from a script like this:

[arccacluster8 ~]$ . ./xvfb_helper
[arccacluster8 ~]$ create_xvfb
Started XVFB on display 1 process 9563
[arccacluster8 ~]$ echo $DISPLAY
:1
[arccacluster8 ~]$ echo $XVFB_PID
9563
[arccacluster8 ~]$ ps -efw | grep Xvfb
username    9563  9498  0 19:31 pts/8    00:00:00 Xvfb :1 -screen 0 1024x1024x8
[arccacluster8 ~]$ kill_xvfb
[arccacluster8 ~]$ ps -efw | grep Xvfb
[arccacluster8 ~]$

I submit that this is a disgraceful hack, but it might come in handy to someone else.
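A possible refinement, offered as a sketch rather than something we run in production: X servers, Xvfb included, normally drop a lock file named /tmp/.X<n>-lock for display n, so you can skip displays that are obviously taken before trying to launch anything. The function name find_free_display and its two optional arguments are made up for illustration and are not part of xvfb_helper.

```shell
# Sketch: print the lowest display number, starting from a given value,
# that has no X lock file. find_free_display is a hypothetical helper.
find_free_display () {
	local lock_dir n
	lock_dir=${1:-/tmp}   # directory holding the .X<n>-lock files
	n=${2:-1}             # first display number to try
	while [ -e "${lock_dir}/.X${n}-lock" ]
		do
		n=$((n + 1))
		done
	echo "$n"
	}
```

You would use it as DISPLAYNO=$(find_free_display) in place of the hard-coded starting value. It only narrows the search: another job can still grab the display between the check and the launch, and a stale lock file left by a crashed server will make it skip a number that is actually free, so the retry loop is still needed.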

Silly Shell History Meme

Yes I’m a sheep, I admit it.


[huw@w1199 ~]$ history|awk '{a[$2]++ } END{for(i in a){print a[i] " " i}}'|sort -rn|head
223 ./condor_accounting.py
101 rm
81 ls
73 ssh
61 ./condor_usage.py
58 python
42 cd
40 pylint
38 sudo
28 ./condor_status_logger.py

My desktop at work. No prizes for guessing what I’ve been working on recently.
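For anyone wondering what the one-liner does: awk tallies the second field of each history line (the command name, the first field being the history number), then sort and head keep the most frequent ones. Here is the same thing as a function that reads from stdin, as a rough sketch; the name top_commands and its optional count argument are my own invention.

```shell
# Sketch: count how often each command (second field) appears in
# history-style input on stdin and print the most frequent ones.
# top_commands is a made-up name; $1 is how many lines to show (default 10).
top_commands () {
	awk '{ count[$2]++ } END { for (cmd in count) print count[cmd] " " cmd }' |
		sort -rn | head -n "${1:-10}"
	}
```

Call it as history | top_commands 5. It assumes the plain history format; if HISTTIMEFORMAT is set, the second field is a timestamp rather than the command and the counts will be meaningless.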

Peapod 0.7

A long weekend is always a good time for a new release. So without further ado, I give you Peapod 0.7.

Notable bug-fixes include improved syncing with iPods and some clean-up of the verbose output so that it makes more sense.

For those of you who don’t know: Peapod is a command-line podcast downloader written in Python.

VMware Server Borkage on Fedora 8 with Kernel 2.6.24

Another kernel update, so I go into my usual routine of

vmware-config.pl
..rebuild modules etc

when it dies on its arse.

/tmp/vmware-config0/vmmon-only/./include/vm_basic_types.h:170: error: conflicting types for ‘uintptr_t’
include/linux/types.h:40: error: previous declaration of ‘uintptr_t’ was here
In file included from /tmp/vmware-config0/vmmon-only/./include/x86.h:23,
from /tmp/vmware-config0/vmmon-only/linux/driver.h:15,
from /tmp/vmware-config0/vmmon-only/linux/driver.c:53:
/tmp/vmware-config0/vmmon-only/./include/x86cpuid.h:381:1: warning: "BIT_MASK" redefined
In file included from include/linux/kernel.h:15,
from /tmp/vmware-config0/vmmon-only/linux/driver.c:15:
include/linux/bitops.h:7:1: warning: this is the location of the previous definition
In file included from /tmp/vmware-config0/vmmon-only/./include/vmci_kernel_defs.h:26,
from /tmp/vmware-config0/vmmon-only/./common/vmciContext.h:19,
from /tmp/vmware-config0/vmmon-only/linux/driver.h:21,
from /tmp/vmware-config0/vmmon-only/linux/driver.c:53:
/tmp/vmware-config0/vmmon-only/./include/compat_wait.h:37:5: warning: "VMW_HAVE_EPOLL" is not defined
/tmp/vmware-config0/vmmon-only/./include/compat_wait.h:43:5: warning: "VMW_HAVE_EPOLL" is not defined

Fortunately someone in the community has already fixed it. All the magic you need is contained in vmware-any-any-116.tar.gz

Workshop Kit

I’m in the middle of drawing up an equipment list for the new workshop and machine room. So far I’ve come up with the following:

  • Decent workbench
  • Steps (to work on the machines at the tops of racks)
  • Lots of storage bins and shelves
  • Technical vacuum cleaner
  • Work lights/torches
  • Toolkits
  • Lifting gear (possibly a scissor lift of some kind?). Are there affordable lifts that will get heavy machines up to the top of a normal rack?
  • A trolley or truck

Any advice on items I’m missing or good suppliers for this stuff in the UK is gratefully received.

Dead Mobile

My aged Nokia 6600 died the day before going on holiday. Generous souls might want to text me their numbers.

Anyone mentioning backups will be met with a rant about broken SyncML implementations.

I wish FIC would hurry up and release their consumer phone.