Date: Tue, 30 Nov 1999 11:42:45 -0500
From: "J. Maynard Gelinas" <mgelinas@bbn.com>
To: freebsd-questions@freebsd.org
Cc: mgelinas@bbn.com
Subject: Problems writing a disklabel/filesystems to Chaparral RAID
Message-ID: <199911301642.LAA03784@bbn.com>

Hi folks,

I've been having a problem with repeated SCSI timeouts while attempting to write a partition table and filesystems to a Chaparral-based RAID array. The hardware specifics are as follows:

HOST: PIII/500/256MB RAM with PCI Adaptec 2940-U2W controller, 789x based. Boot disk attached to one channel, the RAID attached to the external channel.

Nov 30 11:52:19 ocean /kernel: ahc0: <Adaptec 2940 Ultra2 SCSI adapter> rev 0x00 int a irq 10 on pci0.18.0
Nov 30 11:52:19 ocean /kernel: ahc0: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs

RAID: External SCSI RAID manufactured by SAG using a Chaparral RAID controller. Eight IBM DRHS36D 36GB drives organized into a RAID 5 array.

Here's what the kernel detects:

Nov 30 11:52:20 ocean /kernel: Waiting 15 seconds for SCSI devices to settle
Nov 30 11:52:20 ocean /kernel: pass2 at ahc0 bus 0 target 1 lun 7
Nov 30 11:52:20 ocean /kernel: pass2: <SAG ELEC G5312 G2.2> Fixed Processor SCSI-2 device
Nov 30 11:52:20 ocean /kernel: pass2: 80.000MB/s transfers (40.000MHz, offset 127, 16bit), Tagged Queueing Enabled
Nov 30 11:52:20 ocean /kernel: pass3 at ahc0 bus 0 target 5 lun 0
Nov 30 11:52:20 ocean /kernel: pass3: <JMR ELEC FORTRA SERIES. 1.00> Fixed Processor SCSI-2 device
Nov 30 11:52:20 ocean /kernel: pass3: 3.300MB/s transfers
Nov 30 11:52:20 ocean /kernel: da1 at ahc0 bus 0 target 1 lun 0
Nov 30 11:52:20 ocean /kernel: da1: <SAG ELEC G5312 G2.2> Fixed Direct Access SCSI-2 device
Nov 30 11:52:20 ocean /kernel: da1: 80.000MB/s transfers (40.000MHz, offset 127, 16bit), Tagged Queueing Enabled
Nov 30 11:52:20 ocean /kernel: da1: 209323MB (428693760 512 byte sectors: 255H 63S/T 26684C)
Nov 30 11:52:20 ocean /kernel: changing root device to da0s1a
Nov 30 11:52:20 ocean /kernel: da0 at ahc0 bus 0 target 0 lun 0
Nov 30 11:52:20 ocean /kernel: da0: <IBM DNES-309170Y SA30> Fixed Direct Access SCSI-3 device
Nov 30 11:52:20 ocean /kernel: da0: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
Nov 30 11:52:20 ocean /kernel: da0: 8748MB (17916240 512 byte sectors: 255H 63S/T 1115C)

da0 is the boot disk, da1 is the RAID. And here's what happens when I attempt to read the disklabel:

ocean# disklabel -r da1
disklabel: /dev/rda1c: Input/output error
ocean#

At this point the machine hangs. Though:

ocean# fdisk /dev/da1
******* Working on device /dev/da1 *******
parameters extracted from in-core disklabel are:
cylinders=26684 heads=255 sectors/track=63 (16065 blks/cyl)

Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=26684 heads=255 sectors/track=63 (16065 blks/cyl)

Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165,(FreeBSD/NetBSD/386BSD)
    start 63, size 428678397 (209315 Meg), flag 80 (active)
        beg: cyl 0/ sector 1/ head 1;
        end: cyl 1023/ sector 63/ head 254
The data for partition 2 is:
<UNUSED>
The data for partition 3 is:
<UNUSED>
The data for partition 4 is:
<UNUSED>
ocean#

Works just fine. Also, I was able to use the /stand/sysinstall utility to write a label (though the machine crashed in the attempt).
I was then able to build filesystems, which worked, though I don't feel terribly confident that this will be stable over the long haul.

My vendor suggested the crazy idea that we needed to set the RAID to LUN 1 at its current ID because "UNIX doesn't like RAID's to be at LUN 0." I've never seen any commercial UNIX complain about SCSI devices living at LUN 0 (other than tape devices which may need multiple LUN support)... so I think these guys are just plain wrong. I've tried this with Solaris/x86, and while I can write a disklabel, I get many SCSI timeout errors when reading/writing to the array.

The vendor is claiming that they tested the unit with NT and that it works just fine (I've already sent it back once), and they suggest that I run NT if I want to continue getting support for the hardware. This just seems crazy... it's a SCSI disk as far as the OS is concerned; it just shouldn't matter what OS I use.

The only possible issue I can see is if the disk array has more cyls, heads, sects than are supported by FreeBSD or Solaris/x86, but I can't imagine that FreeBSD would have a problem writing disklabels while Solaris/x86 would have a problem reading/writing to the array and that *this* is caused by a disk array being too large.

When I sent the unit back I explicitly asked my vendor to double check all firmware revisions on each disk and the RAID controller, which they claim are set properly.

I'm at a loss. Has anyone else seen these kinds of problems with similar hardware? And can you recommend a good RAID vendor who will support UNIX/BSD based solutions instead of telling me to run NT?

Thanks,
--Maynard

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message
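
The "array too large" question raised in the message can at least be checked arithmetically. The sketch below is not from the original post. It only recomputes the figures printed by the kernel and fdisk and compares them against two thresholds chosen for illustration: the 1024-cylinder BIOS CHS limit, and a signed 32-bit sector number. Whether either threshold is the one that matters to FreeBSD or Solaris/x86 here is an assumption, and the function name `geometry_report` is hypothetical.

```python
# Figures copied from the kernel and fdisk output quoted in the message.
SECTOR_BYTES = 512
TOTAL_SECTORS = 428693760                 # da1, as reported by the kernel
CYLINDERS, HEADS, SECTORS_PER_TRACK = 26684, 255, 63
SLICE_START, SLICE_SIZE = 63, 428678397   # fdisk partition 1


def geometry_report():
    """Recompute the reported geometry and compare it to two limits.

    The limits (1024 BIOS cylinders, signed 32-bit sector numbers) are
    illustrative assumptions, not limits confirmed by the message.
    """
    blocks_per_cyl = HEADS * SECTORS_PER_TRACK
    chs_sectors = CYLINDERS * blocks_per_cyl
    return {
        "blocks_per_cyl": blocks_per_cyl,
        "chs_sectors": chs_sectors,
        # Whole megabytes (2**20 bytes), rounded down.
        "size_mb": TOTAL_SECTORS * SECTOR_BYTES // 2**20,
        # Does slice 1 run from sector 63 to the end of the last cylinder?
        "slice_fills_chs": SLICE_START + SLICE_SIZE == chs_sectors,
        # Sectors past the last whole cylinder.
        "sectors_past_last_cyl": TOTAL_SECTORS - chs_sectors,
        "fits_signed_32bit": TOTAL_SECTORS < 2**31,
        "beyond_bios_cyl_limit": CYLINDERS > 1024,
    }


if __name__ == "__main__":
    for name, value in geometry_report().items():
        print(f"{name}: {value}")
```

Under these assumptions the reported numbers are self-consistent (16065 blocks per cylinder, 209323 MB), the slice covers every whole cylinder, and the sector count is far below 2**31. The array does exceed the 1024-cylinder BIOS range, which matches fdisk clamping the end cylinder to 1023, but by itself that says nothing about the timeouts.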