Date: Fri, 24 Apr 1998 01:05:23 -0500
From: Patrick Hartling <mystify@friley63.res.iastate.edu>
To: scsi@FreeBSD.ORG
Subject: CAM == CAM Ate my Machine (and severly corrupted file systems too)
Message-ID: <199804240605.BAA02352@friley63.res.iastate.edu>

The intention of this message is to warn people of the possibility of
serious disk corruption when using CAM + SMP + ccd.  Other factors could
be involved, but I've never had anything like this happen to me before
in 2 years of running FreeBSD.

This morning when I got back from class, I discovered that my machine
had apparently gotten hungry and had eaten itself.  It had been very
stable for 10 days running an SMP kernel with the CAM patches (built
April 13, 1998), but then this happened.  Unfortunately, I don't know
what caused this, but it certainly caused me a lot of stress this
morning.

My current disk configuration is three UW SCSI disks (two Quantum
Vikings and one WD Enterprise) with one Viking and the Enterprise on a
BusLogic BT-958 and the other Viking on the onboard Adaptec 2940UW.  I
have a mirrored ccd across the two Viking disks.  Besides that, I'd say
that everything concerning partitions/slices is fairly typical.  (I also
have a Jaz disk and a CD-ROM drive plugged into the BusLogic
controller.)

At any rate, my /var was completely trashed.  fsck core dumped on it
repeatedly.  /usr was pretty well hosed too.  Lots of files (mostly
shared libraries) were removed by fsck.  These were easy to replace
since my /usr/src and /usr/obj partitions were fully intact.  'make
install' saved the day here--once I got ld.so and libc.so.3.1 restored.

However, the real horror story was the complete loss of my home
directory.  BUT I have /home on the mirrored ccd, and the second
partition in the ccd was fully intact by some miracle.  :)  The first
partition was thoroughly trashed.  Everything that was in my base
directory ended up in lost+found, so I could have gotten it back if I
had spent the time to go through each file and directory and rename
everything.

Once I found that the second partition was fine, I tried to do:

    dd if=/dev/rda2s1e of=/dev/rda1s1e bs=64k

but it kept saying that rda1s1e was a read-only filesystem.  I could be
wrong, but that seems kind of odd.
This was after going through the appropriate steps to split the mirrored
ccd up so that I could get to each partition individually.  Using the
block device instead worked fine, and I was able to get the whole ccd
back in operation.

Since getting everything more or less back to normal, I have crashed my
machine again today by accidentally doing:

    disklabel -r sd4c

I'm still not fully used to the da stuff, but now that I have discovered
that mixing up the sd and da device names can be fatal to stability,
I'll remember to be more careful.  :)

So, unless someone can tell me what mistakes I've made to cause all
this, I would recommend that people be extra careful with using the
current CAM code (even though I'm really impressed with it overall).
Personally, I'm feeling pretty edgy now, but I'll keep on using it and
be sure to make frequent backups of important data.

 -Patrick

Patrick L. Hartling                | Research Assistant, ICEMT
mystify@friley63.res.iastate.edu   | SE Lab - 1117 Black Engineering
http://www.public.iastate.edu/~oz  | http://www.icemt.iastate.edu

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message
