Bill Lovett

Problems with NetBSD

Posted on October 17th, 2005

Yesterday I dropped off the Power Mac 7200 at an NYC Recycling event at Union Square. I kept the memory, hard drive, and CD-ROM drive.

I wanted to see whether the memory from the 7200 would work in the 9600, so I shut the machine down and filled the remaining 4 memory slots with the 7200's memory chips. I also attached an external SCSI drive to the 9600, just to see whether it would be recognized.

The 9600 would not initially boot with the new memory. I tracked this down to one of the DIMMs from the 7200: when I removed it, NetBSD booted fine. It looks as if this DIMM might have been the culprit behind the problems I was having with the 7200. No distribution except Debian seemed to be able to run its installer on the 7200, and even when I could get Debian's installer to run, it would sporadically crap out while installing the base packages.

When the external SCSI drive was turned on, NetBSD recognized it as sd0 and the internal drive (the original 1G drive from the 7200) as sd1. This screwed up the boot process, because the internal drive had previously been seen as sd0 and therefore /etc/fstab was now pointing to the wrong device. This problem went away when I modified /etc/fstab (by booting without the external drive, making the edit, then rebooting with it on). However, if I ever try to boot the 9600 without the external drive, I'll find myself with the reverse problem.
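For illustration, the /etc/fstab edit described above amounts to swapping sd0 for sd1 on every line. Here is a minimal sketch that does this on a throwaway sample file; the two fstab lines are made up for the example and are not the actual contents of the 9600's /etc/fstab.

```shell
# Illustrative only: a made-up fstab, not the actual file from the 9600.
fstab=$(mktemp)
cat > "$fstab" <<'EOF'
/dev/sd0a / ffs rw 1 1
/dev/sd0b none swap sw 0 0
EOF

# Point every sd0 partition at sd1 instead, writing the result to a new file
# so the original stays around for booting without the external drive.
sed 's|^/dev/sd0\([a-p]\)|/dev/sd1\1|' "$fstab" > "$fstab.external"

cat "$fstab.external"
```

Keeping both versions around would at least make it quick to swap the right one into place, depending on whether the external drive is powered on.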

I followed the instructions in the NetBSD Guide to get the new disk up and running. It was weird and not at all like how things work in Linux. I had to do this:

# disklabel sd0 > tempfile
# vi tempfile
# disklabel -R -r sd0 tempfile
# newfs /dev/sd0c

From vi, I had to change the fstype for the c: partition to 4.2BSD. The Guide wasn't very clear on this; its content read like bits and pieces of some larger document that had since been pared down a lot.
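To show what that fstype change looks like, here is a rough sketch that makes the same edit with sed on a sample partition line. The size and offset values are invented for the example, and the real disklabel output has more lines than this.

```shell
# Illustrative only: a made-up partition table excerpt, not the real label.
label=$(mktemp)
cat > "$label" <<'EOF'
#        size    offset     fstype [fsize bsize cpg/sgs]
 c:   4194304         0     unused      0     0
EOF

# Change the fstype column of the c: partition from "unused" to "4.2BSD".
sed 's|^\( *c:.*\)unused|\14.2BSD|' "$label" > "$label.new"

cat "$label.new"
```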

Several hours later, while trying to perform a Subversion commit, I got a message from TortoiseSVN about a "malformed header". In the course of trying to identify the problem I got a few other strange errors, such as the SSH key not being right. Then I lost all connectivity to the machine.

I plugged the monitor in and found that the console had dropped into a debugger. I restarted the machine and was forced to run fsck_ffs manually because I had evidently suffered some disk corruption (I guess svnserve had died; I later noticed there was a core dump from svnserve in root's home directory). In the course of manually running fsck with the -y option I ended up inadvertently deleting some files in my Subversion repository (although who knows whether they were "really" even still there by that point). But the machine did eventually come back and everything seemed fine.

This morning it seems the 9600 broke down again, evidently another case of filesystem corruption. During lunch I hooked the monitor up again, found myself in the debugger again, and rebooted. This time fsck fixed things up on its own and the machine was fine. When I got back to my desk I could ssh into the machine, but just as I was about to copy some files over from the old 1G drive to the external SCSI drive, my connection dropped out. I suspect that when I get back in front of the machine I'll find myself looking at the debugger again.

It's probably time to retire the 1G drive, although that means I'll have to do a fresh install of NetBSD. I wonder whether these problems are disk related or somehow caused by the extra memory I transferred over from the 7200?
