All posts by Huw Lynes

Small Victories

I’ve moaned about the state of the SGI cluster here at WeSC before. I’m now happy to report that it’s almost back to full fitness.

After playing about with the external CD-ROM a bit more it became apparent that it would only behave itself if you followed a careful procedure: power down the Origin it was attached to, power-cycle the CD-ROM and then power the Origin back on.

At this point the internal disk of the dead node wasn’t even showing up on the SCSI bus, so it was definitely knackered. Fortunately the UK is home to a fine purveyor of second-hand SGI parts. We ordered a replacement drive (complete with SGI firmware) and it arrived the next day. Ian Mapleson (for it is he that runs the SGI depot) has also written some excellent articles on Irix administration, one of which details an easy way to clone a root disk. With this info I was able to clone one of the other nodes in the cluster. A quick edit of /etc/sys_id so that it won’t wake up thinking it’s the wrong machine and we are ready to go.

The drive goes into the dead machine, we power it up and hey presto! one working Irix box.

I am jubilant until I realise that all the nodes of the cluster share the same CXFS volume and that this node no longer has a valid CXFS license. And of course we have no backups. (Actually this isn’t quite true: it later transpired that there were copies of the license file on another machine, but a backup that isn’t documented isn’t a very useful backup.)

I put in a support call to SGI (remembering that this machine has no maintenance contract) without much hope. The very next day SGI email me the license file! SGI, you may be at death’s door but you are lovely people.

At this point all that’s left to do is to debug the Condor install, which doesn’t seem to be working properly. But that is somebody else’s problem.

Open Science

Bioinformatician Pedro Beltrao has posted what is probably the most thought-provoking article I’ve read all year.

His basic tenet is that the current mechanisms by which scientific research is produced have become counter-productive. Rather than just whingeing about it he proposes an alternative mechanism.

During my ill-fated career as a PhD student the constant feeling of being in competition with everyone else was something I hated. That was probably partly due to my state-of-mind at the time but I always got the impression that talking to other scientists was a bad thing.

In my current job I deal with researchers in several different fields. Many of them are doing really cool research that I would love to talk about but I don’t.

Pedro’s suggestion may turn out to be unworkable but I applaud the sentiment.

Post LugRadio Live

Right, I’ve now had a week to recover from LRL and to internalise some of it.

Good Points:

A plethora of informative and entertaining talks. Too damn many in fact. I missed quite a few I wanted to see due to scheduling conflicts.

My talk went quite well, although being scheduled against Bruno and Simon Phipps was a bit of a bugger. I can’t wait for the video of Bruno’s talk to be released.

Talked to a researcher from Coventry who was interested in the whole concept of Grid data repositories so something useful from a work stand-point may come out of it.

Dinner at some Wetherspoons pub with a random bunch of (very nice) geeks I bumped into on Saturday night.

The very nice beer called Titanic that they were serving in the afore-mentioned pub.

Bad Points:

Completely failing to talk to any of the #lugradio regulars.

It was too damn hot.

Spending the whole of Saturday in a complete haze due to talk anxiety.

Onwards..

Met a chap from Southampton who was familiar with the OMII project and of course completely failed to get his email address. In the unlikely event that this is you, could you send me an email please?

After seeing Christian Schaller’s GStreamer talk I’ve spent all day looking at the Python GStreamer bindings and have gotten as far as writing a noddy media player that prints out any metadata it finds in the media. Now I just have to work out how to re-write the metadata and I should be able to completely replace the tagging infrastructure in Peapod with a better version using GStreamer.

Here are the slides for my talk.

LugRadio Live: Pre Game

Having just had a look at the LugRadio Schedule (due to excellent work by popey) two things occur to me.

1) Bruno Bord is a being of unimaginable evil. He kicks Andrex puppies for fun. And anyway how edifying is a talk about swearing going to be?

2) Do you really want to listen to an hour of Simon Phipps getting heckled about why Sun haven’t open-sourced Java yet?

Obviously you should come and listen to my talk about Grid Computing. I’m not evil, Java isn’t my fault and you’ll learn something.

<fx=”tumbleweed rolling past”>

Oh Dear.

I’d better get my stuff packed.

How Many Ways Can We Fail Today?

The SGI cluster here at WeSC is beginning to get me down. One of its nodes has been down since I started work here and I’ve finally gotten around to looking at it.

Step 1 was to try and get access to the serial console. After hunting around for a cable I then had to fight with minicom to get it running, a process that would have been significantly easier if the terminals weren’t all running at a non-standard baud rate. Anyway, a quick re-boot of the machine showed that it was finding its internal disk but failing to find its OS. Given the number of times the power has failed recently I wouldn’t be at all surprised if the partition table is corrupted. So it seemed like a re-install was worth trying before getting a replacement disk.

After finding a set of Irix install instructions that I could actually understand I hooked up the ancient external SCSI CD drive and put in the disk that contains the install tools. It took a couple of attempts to convince the drive that it should close but after that it made all the right kinds of whirring noises and I was quietly hopeful.

So boot to the command monitor and:

boot -f cdrom(1,1,7)sash64

And the monitor helpfully responds with a ‘no media found’ message. After trying several other CDs I realized that I couldn’t even ls them, never mind run them. The conclusion? Knackered CD drive. Arse.

For my next trick: installing over bootp using an SGI Fuel workstation as a server.

Annoying Gnome-Blog Bug

So after installing Dapper on my new laptop (more on this in a later post) I decided to give gnome-blog a whirl. I like the idea of in-line spell checking and the panel applet looks quite nice too.

Anyway when pointing it at my blog it insists on trying to connect to http://site.domain/wordpress/xmlrpc.php and there seems to be no way in the configuration GUI to persuade it that my blog lives at http://site.domain/xmlrpc.php

Looking around launchpad and bugzilla.gnome brings up the following bugs.

https://launchpad.net/distros/ubuntu/+source/gnome-blog/+bug/44867

http://bugzilla.gnome.org/show_bug.cgi?id=167499

It seems fairly sensible to me to just change xmlrpc_url in the code so that it doesn’t make any assumptions about the location of xmlrpc.php. But it’s probably not possible to do without breaking at least some people’s currently working gnome-blogs.

Anyway for the moment you can always hack .gconf/apps/gnome-blog/%gconf.xml by hand to make it work.
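
To show the sort of change I have in mind, here is a minimal Python sketch. It is not gnome-blog's actual code: the function name resolve_endpoint and the default_path parameter are invented for the example. The idea is that a configured URL which already points at a PHP script is used untouched, and anything else gets a default script name appended with no hard-coded /wordpress/ prefix.

```python
from urllib.parse import urlsplit, urlunsplit


def resolve_endpoint(configured_url, default_path="/xmlrpc.php"):
    """Work out the XML-RPC endpoint from whatever URL the user configured.

    Hypothetical helper for illustration only, not taken from gnome-blog.
    """
    parts = urlsplit(configured_url)
    # If the user already named a script, trust them and change nothing.
    if parts.path.endswith(".php"):
        return configured_url
    # Otherwise append the default script to the path they gave us.
    path = parts.path.rstrip("/") + default_path
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    for url in ("http://site.domain",
                "http://site.domain/blog/",
                "http://site.domain/wordpress/xmlrpc.php"):
        print(url, "->", resolve_endpoint(url))
```

Anyone whose blog really does live under /wordpress/ would have to put that in the configured URL, which is exactly the backwards-compatibility worry.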

A Productive Week – Cfengine and SSH-agent

Now that lack-of-sleep madness has passed I’ve managed to actually get some work done. In particular I’ve been chipping away at some of the tedious manual labour that comes with administering multiple machines.

To start off with I finally knuckled down to working out how to use ssh-agent. This nice article from SecurityFocus helped me get started. The most difficult bit was getting ssh-agent to run from fluxbox on start up. To fix that I added the following lines into .fluxbox/apps

[startup] {eval `ssh-agent -s`}
[startup] {ssh-add < /dev/null}

which pops up a dialog box for my passphrase on login.
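
In case the eval looks like magic: ssh-agent -s prints a few lines of Bourne-shell code that set SSH_AUTH_SOCK and SSH_AGENT_PID, and eval runs them so that ssh-add can find the agent. Here is a rough Python sketch of the same idea. The sample text follows the format OpenSSH prints, but the socket path and PIDs are made up, and the helper names are my own.

```python
import os
import re
import subprocess

# Example of what `ssh-agent -s` prints (path and PIDs are invented).
SAMPLE_OUTPUT = """\
SSH_AUTH_SOCK=/tmp/ssh-XXXXabcd/agent.1234; export SSH_AUTH_SOCK;
SSH_AGENT_PID=1235; export SSH_AGENT_PID;
echo Agent pid 1235;
"""


def parse_agent_output(text):
    """Pull the SSH_* variable assignments out of ssh-agent's shell output."""
    env = {}
    for match in re.finditer(r"^(SSH_[A-Z_]+)=([^;]+);", text, re.MULTILINE):
        env[match.group(1)] = match.group(2)
    return env


def start_agent():
    """Start a real agent and export its variables into this process.

    Assumes ssh-agent is on the PATH; not called by the demo below.
    """
    result = subprocess.run(["ssh-agent", "-s"], capture_output=True,
                            text=True, check=True)
    os.environ.update(parse_agent_output(result.stdout))


if __name__ == "__main__":
    print(parse_agent_output(SAMPLE_OUTPUT))
```

The point is that anything started afterwards only sees the agent if it inherits those two variables.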

I also started to get down to sorting out configuration management using cfengine. One of the things that I've never been able to work out was how to make rules depend on one another. So if you have a rule that adds a line into the iptables config, how do you then tell cfengine that iptables needs to be restarted? After hunting around on the web I found an example that does almost exactly what I need. I hacked up a quick example that would sort my root alias and then run the sendmail newaliases command.

editfiles:
    # Make sure root's mail alias points at the admin address
    { /etc/aliases
        BeginGroupIfNoSuchLine "root:           wescroot@wesc.ac.uk"
            DeleteLinesStarting "#root"
            Append "root:           wescroot@wesc.ac.uk"
        EndGroup
        DefineClasses "aliaseschanged"
    }

shellcommands:
    # Only runs when the edit above defined the class
    aliaseschanged::
        '/usr/bin/newaliases'
        useshell=false

Basically aliaseschanged is only set if the editfiles rule needs to be executed. So newaliases is only run if we actually update the aliases file. I have a more complicated set of rules that does the same thing for iptables. Next week globus4.
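
If the class mechanism is still a bit opaque, here is a toy Python model of the logic. It has nothing to do with how cfengine is implemented: the function names are invented, the file is just a list of lines, and the command runner is passed in so nothing is actually executed.

```python
WANTED = "root:           wescroot@wesc.ac.uk"


def ensure_root_alias(lines, wanted=WANTED, stale_prefix="#root"):
    """Mimic the editfiles rule: edit only if the wanted line is missing.

    Returns the (possibly new) lines and the set of classes defined.
    """
    if wanted in lines:
        # Nothing to do, so no class gets defined.
        return lines, set()
    kept = [line for line in lines if not line.startswith(stale_prefix)]
    kept.append(wanted)
    return kept, {"aliaseschanged"}


def run_shellcommands(classes, runner):
    """Mimic the shellcommands rule: run only when the class is defined."""
    if "aliaseschanged" in classes:
        runner(["/usr/bin/newaliases"])


if __name__ == "__main__":
    aliases = ["#root: marc", "postmaster: root"]
    for attempt in (1, 2):
        aliases, classes = ensure_root_alias(aliases)
        run_shellcommands(classes, lambda cmd: print("would run", cmd))
        print("pass", attempt, "classes:", classes)
```

The second pass changes nothing and so triggers nothing, which is the behaviour I want from the real rules.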

Red Hat in Bad Mouthing Fedora Shock

Read this unhappy tale.

If that really was a Red Hat sales person they need to be found and sacked. Also it would seem prudent to tone down the wording of this page about Fedora.

So the chap on the Fedora Forums has valiantly managed to get a Fedora-based project off the ground in what sounds like a fairly Windows-centric environment, only to have the wheels come off because one of his customers has read the Red Hat page about Fedora being “impractical for use in commercial environments…” And then some “Red Hat Sales Rep” gives him a list of the usual canards about open-source software being unsafe because anyone can contribute.

News for Red Hat: This isn’t a win for RHEL it’s a loss for Linux.

And to think I was already in a bad mood. Here’s hoping the sales rep turns out not to work for Red Hat.