smew problems

Wednesday, February 13th, 2008

So a while ago I set up buildbot for Adium. Briefly, buildbot provides continuous integration (i.e., building the source tree after every checkin) and runs our unit tests automatically. Cool stuff, and hats off to the buildbot team. Things seemed to be running fine for a while, no problems. However, recently we’ve started to get some odd errors on the machine we use for running builds, a Mac Mini named smew[1].

Subversion began failing to resolve DNS names. I could only reproduce the problem when buildbot was running svn. If I logged in, I could run the exact same commands myself without any trouble. And even more curiously, telling buildbot to run nslookup svn.adiumx.com worked completely fine.

I “solved” this by having the buildbot master (on a Linux machine) do the lookup and then tell the client to check out svn://<ip here>. If the IP of the subversion server changes, we just need to do a clean build and it’ll pick up the change. It’s not a great solution, but definitely workable.
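For illustration only, here is a minimal Python sketch of the idea behind that workaround: resolve the hostname on the master and build the checkout URL from the resulting IP. This is not the actual buildbot configuration. The function name `pinned_svn_url`, the `resolve` parameter, and the repository path in the usage note are all hypothetical.

```python
import socket


def pinned_svn_url(host, path="", resolve=socket.gethostbyname):
    """Return an svn:// URL that uses the host's IP instead of its name.

    `resolve` is injectable so the lookup can be stubbed out; by default it
    is the standard library's socket.gethostbyname, run on the machine that
    calls this function (the master, in the setup described above).
    """
    ip = resolve(host)
    # Strip any leading slash so the URL never contains "//" after the IP.
    return "svn://%s/%s" % (ip, path.lstrip("/"))
```

Something like `pinned_svn_url("svn.adiumx.com", "adium/trunk")` would then be evaluated on the master and the result handed to the slave as its checkout URL (the `adium/trunk` path is made up for the example). Because the IP is baked into the URL, a change of server address is only picked up once the URL is recomputed, which is why a clean build is needed.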

This worked either briefly or perhaps not at all, I don’t recall, because our automated tests began failing like so:

/Developer/Tools/RunUnitTests:298: note: Started tests for architectures 'ppc i386'
/Developer/Tools/RunUnitTests:301: note: Running tests for architecture 'ppc'
Wed Feb 13 02:07:40 smew.adiumx.com otest[41048] <Error>: kCGErrorRangeCheck : On-demand launch of the Window Server is allowed for root user only.
Wed Feb 13 02:07:40 smew.adiumx.com otest[41048] <Error>: kCGErrorRangeCheck : Set a breakpoint at CGErrorBreakpoint() to catch errors as they are returned
2008-02-13 02:07 otest[41048] (CarbonCore.framework) FSEventStreamStart: ERROR: FSEvents_connect() => (ipc/send) invalid destination port (268435459)
FAILED TO GET ASN FROM CORESERVICES so aborting.
/Developer/Tools/RunUnitTests: line 301: 41048 Abort trap              arch -arch "${TEST_ARCH}" "${TEST_RIG}" "${TEST_BUNDLE_PATH}"
/Developer/Tools/RunUnitTests:314: error: Test rig '/Developer/Tools/otest' exited abnormally with code 134 (it may have crashed).
** BUILD FAILED **

This is particularly strange because again, I can run these tests manually and get proper results. Same user that buildbot is running as (and that user is logged in to the machine and has a window server connection), same checkout, same everything, near as I can tell.

This may be superstition or just a coincidence, but sometimes these tests do run, and it seems that things work OK while I am logged in to the machine via ssh. After I log out, things go screwy again. Whatever it is, it’s specific to that particular machine: I had the buildbot slave running on a Mac mini while I was at Mozilla and it worked just fine.

I’ve run a permissions repair. It fixed some things. Still no dice. Buildbot is using the python installed by Leopard. The machine is fully updated, but none of the updates fixed the problems (not even the Leopard graphics update). This machine is located at a colo somewhere inaccessible (Mars?), so while doing an archive and install would normally be my next step, I don’t have easy access to the machine.

I’ve done everything I can think of that I can do easily. Help me blogosphere, you’re my only hope.


  1. The Smew (Mergellus albellus) is a small duck which is intermediate between the mergansers and the goldeneyes, and has interbred with the Common Goldeneye. It is the only member of the genus Mergellus. (Wikipedia)

Comments

  1. rhelmer replied on February 13th, 2008:

    Are you starting Buildbot from the desktop or from the shell? I’ve had the same kind of problem on Mac servers when trying to get Buildbot to e.g. start Firefox, or anything that needs a security context that you get by logging in via the desktop.

  2. Colin replied on February 14th, 2008:

    @rhelmer I started it from ssh, but this most recent time I started it from an actual terminal window on the machine, and then closed it. Appears to be failing still.

    Should I leave the terminal window open? heh.

  3. cheesy replied on February 14th, 2008:

    Either your comments are moderated, or this stupid blog ate my long one.

    Summary:

    Try logging into the desktop as the buildbot user. Lots of good info here:

    http://developer.apple.com/technotes/tn2005/tn2083.html

  4. Ben Hearsum replied on February 14th, 2008:

    Hey Colin,

    You definitely need to start Buildbot from the Desktop and leave the Terminal window open.