From owner-freebsd-scsi Sun Aug 3 06:00:57 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id GAA22959 for freebsd-scsi-outgoing; Sun, 3 Aug 1997 06:00:57 -0700 (PDT)
Received: from godzilla.zeta.org.au (godzilla.zeta.org.au [203.2.228.19]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id GAA22953 for ; Sun, 3 Aug 1997 06:00:52 -0700 (PDT)
Received: (from bde@localhost) by godzilla.zeta.org.au (8.8.5/8.6.9) id WAA16206; Sun, 3 Aug 1997 22:56:57 +1000
Date: Sun, 3 Aug 1997 22:56:57 +1000
From: Bruce Evans
Message-Id: <199708031256.WAA16206@godzilla.zeta.org.au>
To: FreeBSD-SCSI@FreeBSD.ORG, Shimon@i-Connect.Net
Subject: Re: Phoenix... SCSI Devices Minor Mapping
Sender: owner-freebsd-scsi@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

>I am dealing with systems with very large databases, some systems will
>have up to 8 DPT controllers, 4 busses per controller, 15 SCSI hubs per
>bus, 7 devices per hub. This gives us well over 3,000 drives. We need
>...
>Proposal A:
>
>Divide the minor number as follows:
>
>  f   e   d   c   b   a   9   8   7   6   5   4   3   2   1   0
>+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
>| c | c | c | c | b | b | b | b | l | l | l | l | t | t | t | t |
>+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
>
>Where t = Target ID
>      l = Logical Unit number
>      b = Bus number
>      c = Controller number
>...
>Similar to Proposal A but with complete renaming and the following bitmap:
>
>  f   e   d   c   b   a   9   8   7   6   5   4   3   2   1   0
>+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
>| b | b | b | b | t | t | t | t | t | t | t | t | l | l | l | l |
>+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
>
> 1f  1e  1d  1c  1b  1a  19  18  17  16  15  14  13  12  11  10
>+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
>| - | - | - | - | - | - | - | - | - | - | - | - | c | c | c | c |
>+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
>
>Where t = Target ID
>      l = Logical Unit number
>      b = Bus number
>      c = Controller number
>      - = Reserved

Minor numbers are 24 bits (discontiguous), and we already have too many
not-quite-right numbering schemes. The two uniform && used ones are:

Boot-style device number (see ):

  f   e   d   c   b   a   9   8   7   6   5   4   3   2   1   0
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| p | p | p | p | p | p | p | p | M | M | M | M | M | M | M | M |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

 1f  1e  1d  1c  1b  1a  19  18  17  16  15  14  13  12  11  10
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| g | g | g | g | a | a | a | a | c | c | c | c | u | u | u | u |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Where M = Major number
      p = Partition number (should also contain Slice number)
      u = Unit number (map this to `t'?)
      c = Controller number
      a = Adaptor number ("uba, mba, etc") (map this to `b'?)
      g = Magic number (4 bits - very wasteful; map this to `l'?)

There are not enough minor bits, since another bit or two is required
to select the encoding scheme.
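
The boot-style layout above can be unpacked with plain shifts and masks. What follows is only a minimal sketch, not kernel code: the names `bootdev_fields`, `bootdev_decode` and `bootdev_encode` are hypothetical, and the field positions are simply read off the diagram (M in bits 0-7, p in bits 8-15, u in bits 16-19, c in bits 20-23, a in bits 24-27, g in bits 28-31).

```c
#include <stdint.h>

/* Hypothetical container for the six fields in the diagram. */
struct bootdev_fields {
    unsigned major;      /* M: bits 0-7   */
    unsigned partition;  /* p: bits 8-15  */
    unsigned unit;       /* u: bits 16-19 */
    unsigned controller; /* c: bits 20-23 */
    unsigned adaptor;    /* a: bits 24-27 */
    unsigned magic;      /* g: bits 28-31 */
};

/* Split a 32-bit boot-style device number into its fields. */
static struct bootdev_fields bootdev_decode(uint32_t dev)
{
    struct bootdev_fields f;

    f.major      = (dev >>  0) & 0xff;
    f.partition  = (dev >>  8) & 0xff;
    f.unit       = (dev >> 16) & 0x0f;
    f.controller = (dev >> 20) & 0x0f;
    f.adaptor    = (dev >> 24) & 0x0f;
    f.magic      = (dev >> 28) & 0x0f;
    return f;
}

/* Reassemble the device number; out-of-range field values are masked off. */
static uint32_t bootdev_encode(struct bootdev_fields f)
{
    return ((uint32_t)(f.magic      & 0x0f) << 28) |
           ((uint32_t)(f.adaptor    & 0x0f) << 24) |
           ((uint32_t)(f.controller & 0x0f) << 20) |
           ((uint32_t)(f.unit       & 0x0f) << 16) |
           ((uint32_t)(f.partition  & 0xff) <<  8) |
           ((uint32_t)(f.major      & 0xff) <<  0);
}
```

Under this reading of the diagram, the value 0xA1234504 splits into magic 0xA, adaptor 1, controller 2, unit 3, partition 0x45 and major 4. Unit, controller and adaptor together give only 12 bits of addressing.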
Normal disk device number (see ):

  f   e   d   c   b   a   9   8   7   6   5   4   3   2   1   0
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| M | M | M | M | M | M | M | M | u | u | u | u | u | p | p | p |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

 1f  1e  1d  1c  1b  1a  19  18  17  16  15  14  13  12  11  10
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| t | t | t | t | t | t | t | u | u | u | u | s | s | s | s | s |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Where p = Partition number
      u = Unit number
      M = Major number
      s = Slice number
      t = Type (driver-specific, but not suitable for drive/controller
          select. Used mainly by SCSI driver to select encoding of
          other bits. Used by my version of the floppy driver for the
          floppy type).

There are not nearly enough unit bits for you. This is easy enough to
handle inside the kernel by making the encoding depend on the driver,
but not so easy to handle in config(8) or MAKEDEV.

Bruce

From owner-freebsd-scsi Sun Aug 3 11:11:54 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id LAA05945 for freebsd-scsi-outgoing; Sun, 3 Aug 1997 11:11:54 -0700 (PDT)
Received: from earth.mat.net (root@earth.mat.net [206.246.122.2]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id LAA05937 for ; Sun, 3 Aug 1997 11:11:52 -0700 (PDT)
Received: from Journey2.mat.net (journey2.mat.net [206.246.122.116]) by earth.mat.net (8.6.12/8.6.12) with SMTP id OAA19164 for ; Sun, 3 Aug 1997 14:11:50 -0400
Date: Sun, 3 Aug 1997 14:12:06 -0400 (EDT)
From: Chuck Robey
X-Sender: chuckr@Journey2.mat.net
To: FreeBSD-SCSI@freebsd.org
Subject: new kernel
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-scsi@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

I have an NCR 825 card, and I was just trying out a new kernel, but it
wouldn't complete the boot. It gets to the line:

ncr0: waiting for scsi devices to settle

and just hangs.
I waited 5 minutes, then rebooted. Any idea if I caused
this myself?

----------------------------+-----------------------------------------------
Chuck Robey                 | Interests include any kind of voice or data
chuckr@eng.umd.edu          | communications topic, C programming, and Unix.
213 Lakeside Drive Apt T-1  |
Greenbelt, MD 20770         | I run Journey2 and picnic, both FreeBSD
(301) 220-2114              | version 3.0 current -- and great FUN!
----------------------------+-----------------------------------------------

From owner-freebsd-scsi Sun Aug 3 14:08:45 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id OAA17702 for freebsd-scsi-outgoing; Sun, 3 Aug 1997 14:08:45 -0700 (PDT)
Received: from Octopussy.MI.Uni-Koeln.DE (Octopussy.MI.Uni-Koeln.DE [134.95.166.20]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id OAA17692 for ; Sun, 3 Aug 1997 14:08:40 -0700 (PDT)
Received: from x14.mi.uni-koeln.de (annexr2-47.slip.Uni-Koeln.DE) by Octopussy.MI.Uni-Koeln.DE with SMTP id AA13867 (5.67b/IDA-1.5 for ); Sun, 3 Aug 1997 23:08:37 +0200
Received: (from se@localhost) by x14.mi.uni-koeln.de (8.8.6/8.6.9) id XAA00522; Sun, 3 Aug 1997 23:08:35 +0200 (CEST)
X-Face: "
Date: Sun, 3 Aug 1997 23:08:34 +0200
From: Stefan Esser
To: Chuck Robey
Cc: FreeBSD-SCSI@FreeBSD.ORG
Subject: Re: new kernel
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.74
In-Reply-To: ; from Chuck Robey on Sun, Aug 03, 1997 at 02:12:06PM -0400
Sender: owner-freebsd-scsi@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

On Aug 3, Chuck Robey wrote:
> I have an NCR 825 card, and I was just trying out a new kernel, but it
> wouldn't complete the boot. It gets to the line:
>
> ncr0: waiting for scsi devices to settle
>
> and just hangs. I waited 5 minutes, then rebooted. Any idea if I caused
> this myself?

I don't think so, but I did not receive
any other NCR bug report, recently, and
thus there may be something special with
your controller or system.
Please answer a few questions: Is this with the latest -current kernel ? Did a kernel built from sources more recent than July 28th work ? Assuming you got a local CVS repository, could you please check whether rev. 1.100 or perhaps even 1.102 still worked, or whether you have to go back to 1.99 ? Your problem could still be caused by changes to other parts of the kernel, but since there were a number of commits to the driver over the last few days, it may be a driver bug, which I'm most interested to find and fix. Regards, STefan From owner-freebsd-scsi Sun Aug 3 15:05:07 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id PAA21212 for freebsd-scsi-outgoing; Sun, 3 Aug 1997 15:05:07 -0700 (PDT) Received: from earth.mat.net (root@earth.mat.net [206.246.122.2]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id PAA21190; Sun, 3 Aug 1997 15:04:53 -0700 (PDT) Received: from Journey2.mat.net (journey2.mat.net [206.246.122.116]) by earth.mat.net (8.6.12/8.6.12) with SMTP id SAA26433; Sun, 3 Aug 1997 18:04:49 -0400 Date: Sun, 3 Aug 1997 18:05:02 -0400 (EDT) From: Chuck Robey X-Sender: chuckr@Journey2.mat.net To: Stefan Esser cc: FreeBSD-SCSI@freebsd.org Subject: Re: new kernel In-Reply-To: <19970803230834.52396@mi.uni-koeln.de> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk On Sun, 3 Aug 1997, Stefan Esser wrote: > On Aug 3, Chuck Robey wrote: > > I have an NCR 825 card, and I was just trying out a new kernel, but it > > wouldn't complete the boot. It gets to the line: > > > > ncr0: waiting for scsi devices to settle > > > > and just hangs. I waited 5 minutes, then rebooted. Any idea if I couased > > this myself? > > I don't think so, but I did not receive > any other NCR bug report, recently, and > thus there may be something special with > your controller or system. 
> > Please answer a few questions: > > Is this with the latest -current kernel ? Yes, from last night's cvs/ctm. Only modification is addition of Amancio's sound code. I don't _think_ that would affect this. > > Did a kernel built from sources more recent > than July 28th work ? My previous kernel was from 7/18. > Assuming you got a local CVS repository, > could you please check whether rev. 1.100 > or perhaps even 1.102 still worked, or > whether you have to go back to 1.99 ? Rev of what file? I'd be happy to check, you're helping me! I could just check out sys as of some date, if you want. > Your problem could still be caused by > changes to other parts of the kernel, but > since there were a number of commits to > the driver over the last few days, it may > be a driver bug, which I'm most interested > to find and fix. If you'll work with me, I'll certainly help all I can. Be a little more precise in what you want me to check out of my local cvs. I just did a make world, but the only lkms I use are dos and cd9660. ----------------------------+----------------------------------------------- Chuck Robey | Interests include any kind of voice or data chuckr@eng.umd.edu | communications topic, C programming, and Unix. 213 Lakeside Drive Apt T-1 | Greenbelt, MD 20770 | I run Journey2 and picnic, both FreeBSD (301) 220-2114 | version 3.0 current -- and great FUN! 
----------------------------+----------------------------------------------- From owner-freebsd-scsi Mon Aug 4 15:58:28 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id PAA03720 for freebsd-scsi-outgoing; Mon, 4 Aug 1997 15:58:28 -0700 (PDT) Received: from fly.HiWAAY.net (root@fly.HiWAAY.net [208.147.154.56]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id PAA03714 for ; Mon, 4 Aug 1997 15:58:17 -0700 (PDT) Received: from nexgen.hiwaay.net (tnt2-165.HiWAAY.net [208.147.148.165]) by fly.HiWAAY.net (8.8.6/8.8.6) with ESMTP id RAA09229 for ; Mon, 4 Aug 1997 17:57:43 -0500 (CDT) Received: from nexgen (localhost [127.0.0.1]) by nexgen.hiwaay.net (8.8.6/8.8.4) with ESMTP id RAA12906 for ; Mon, 4 Aug 1997 17:57:41 -0500 (CDT) Message-Id: <199708042257.RAA12906@nexgen.hiwaay.net> X-Mailer: exmh version 2.0zeta 7/24/97 To: scsi@freebsd.org Subject: Maximum Tape Block Size? From: dkelly@HiWAAY.net Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Mon, 04 Aug 1997 17:57:39 -0500 Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk A year or so ago I recall Jordan had an SGI DAT tape with 256k blocksize which wouldn't read on FreeBSD. Am wondering how this problem was eventually solved? Checking the archives I found mention of MAXBSIZE which appears to be defined in /usr/src/sys/sys/param.h (grep the whole /usr/src tree and you'll find a couple). The comments in param.h suggest this value can be increased without harm to existing filesystems but suggests any new filesystems created may require this value in the future to function. If I increase MAXBSIZE in param.h will it do what I think; allow reading those darned 256k-blocked tapes? Applications of interest: gtar, pax, dd, tcopy. If I can make this work I can replace a couple of Suns (which can't read the tapes either) with FreeBSD systems. 
-- David Kelly N4HHE, dkelly@hiwaay.net ===================================================================== The human mind ordinarily operates at only ten percent of its capacity -- the rest is overhead for the operating system. From owner-freebsd-scsi Mon Aug 4 17:05:17 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id RAA07153 for freebsd-scsi-outgoing; Mon, 4 Aug 1997 17:05:17 -0700 (PDT) Received: from iworks.InterWorks.org (deischen@iworks.interworks.org [128.255.18.10]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id RAA07111 for ; Mon, 4 Aug 1997 17:04:58 -0700 (PDT) Received: (from deischen@localhost) by iworks.InterWorks.org (8.7.5/) id TAA02942; Mon, 4 Aug 1997 19:08:43 -0500 (CDT) Message-Id: <199708050008.TAA02942@iworks.InterWorks.org> Date: Mon, 4 Aug 1997 19:08:43 -0500 (CDT) From: "Daniel M. Eischen" To: dkelly@HiWAAY.net, scsi@FreeBSD.ORG Subject: Re: Maximum Tape Block Size? Sender: owner-freebsd-scsi@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk > A year or so ago I recall Jordan had an SGI DAT tape with 256k blocksize > which wouldn't read on FreeBSD. Am wondering how this problem was > eventually solved? I recall (was it julian?) suggesting he use dd with some appropriate values for ibs or bs? 
Dan Eischen
deischen@iworks.InterWorks.org

From owner-freebsd-scsi Mon Aug 4 18:34:12 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id SAA11238 for freebsd-scsi-outgoing; Mon, 4 Aug 1997 18:34:12 -0700 (PDT)
Received: from sendero-ppp.i-connect.net (sendero-ppp.i-Connect.Net [206.190.143.100]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id SAA11230 for ; Mon, 4 Aug 1997 18:34:01 -0700 (PDT)
Received: (qmail 21031 invoked by uid 1000); 5 Aug 1997 01:33:59 -0000
Message-ID:
X-Mailer: XFMail 1.2-alpha [p0] on FreeBSD
Content-Type: text/plain; charset=iso-8859-8
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <199708050016.RAA28126@ns2.yahoo.com>
Date: Mon, 04 Aug 1997 18:33:59 -0700 (PDT)
Organization: Atlas Telecom
From: Simon Shapiro
To: filo@yahoo.com, freebsd-scsi@freebsd.org
Subject: RE: strange difference between DPT and Adaptec
Sender: owner-freebsd-scsi@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

[ I hope you do not mind me forwarding this to the FreeBSD SCSI list ]

Hi David Filo; On 05-Aug-97 you wrote:
> This problem is a bit strange. I've got a disk (fujitsu narrow,
> running 2.2 with 1.1.10 DPT code) that seems to work just fine with an
> Adaptec 2940. If I connect the drive (just by itself) to the DPT card
> it comes up fine - but there is one particular directory that if I try
> to "ls" in, the machine panics with "bad dir ino at offset 0: mangled
> entry". I found this because fsck would fail on bus error (on this
> particular directory) under DPT but work fine with the 2940.

Fsck should NOT fail with bus error regardless of diet. Its purpose in
life is to deal with broken file systems. Right?

> I've tried this in two different motherboards both of which have
> worked fine otherwise with the DPT card. And in both cases DPT fails,
> 2940 works fine.
>
> I dd'ed the entire drive onto another and that copy does not exhibit
> this behavior.

You probably did the dd under the DPT. Right?
> So to me it seems like maybe some marginal hardware (possibly the
> drive). Although this behavior is 100% consistent which makes me
> wonder. We won't be using that drive so it's not a big deal, but
> thought you might like to know. If you have any clue what the problem
> might be or tests you'd like me to try, let me know.

Do not throw the drive away yet. See below:

Now that you bought a DPT, let me tell you something :-) - Don't you
hate this statement? Happened to me on several cars I bought.

Joking aside, the DPT firmware sometimes reserves one block at the
start of the disk for its own purposes, and slips all LBAs by one.

The theory is that it does it on ALL RAID devices and on the lowest
target on the lowest bus on the first controller. The reality is that
it is hard (for me) to tell when and why the firmware does that.

When you dd from device to device, you basically skip the problem.

Simon

BTW, this sector is what allows the DPT firmware to know what RAID arrays
are where and how. If you are brave, disconnect a large array, shuffle
the drives (after changing target IDs) and try to boot. Run the dptmgr
and look at the confusion.

At one time, I had a RAID-1, composed of targets 8,9 on bus 0. I
re-jumpered the drives to 0,1 and put them on bus zero, expecting the
RAID-1 array to disappear. When booting, the BIOS reported them as
b0t0u0, but FreeBSD saw them as b0t8u0. Took me a few moments to figure
out why ALL the kernels panicked with ``cannot mount root fs''.

Moral: When moving disks from HBA to HBA, low-level format them.
From owner-freebsd-scsi Mon Aug 4 18:34:43 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id SAA11278 for freebsd-scsi-outgoing; Mon, 4 Aug 1997 18:34:43 -0700 (PDT) Received: from earth.mat.net (root@earth.mat.net [206.246.122.2]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id SAA11262; Mon, 4 Aug 1997 18:34:35 -0700 (PDT) Received: from Journey2.mat.net (journey2.mat.net [206.246.122.116]) by earth.mat.net (8.6.12/8.6.12) with SMTP id VAA04533; Mon, 4 Aug 1997 21:34:27 -0400 Date: Mon, 4 Aug 1997 21:34:43 -0400 (EDT) From: Chuck Robey X-Sender: chuckr@Journey2.mat.net To: Stefan Esser cc: FreeBSD-SCSI@freebsd.org Subject: Re: new kernel In-Reply-To: <19970803230834.52396@mi.uni-koeln.de> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk On Sun, 3 Aug 1997, Stefan Esser wrote: > On Aug 3, Chuck Robey wrote: > > I have an NCR 825 card, and I was just trying out a new kernel, but it > > wouldn't complete the boot. It gets to the line: > > > > ncr0: waiting for scsi devices to settle > > > > and just hangs. I waited 5 minutes, then rebooted. Any idea if I couased > > this myself? > > I don't think so, but I did not receive > any other NCR bug report, recently, and > thus there may be something special with > your controller or system. > > Please answer a few questions: > > Is this with the latest -current kernel ? > > Did a kernel built from sources more recent > than July 28th work ? > > Assuming you got a local CVS repository, > could you please check whether rev. 1.100 > or perhaps even 1.102 still worked, or > whether you have to go back to 1.99 ? Thanks, Stefan, I checked ncr.c by going backwards from my present version 1.103, one step at a time. 1.101 failed, I am presently using the kernel with 1.100, so the change is from 1.100->1.101. 
If you could make use of it, I rebooted verbosely, and I would send you that if you wanted, or make any other test you'd suggest. ----------------------------+----------------------------------------------- Chuck Robey | Interests include any kind of voice or data chuckr@eng.umd.edu | communications topic, C programming, and Unix. 213 Lakeside Drive Apt T-1 | Greenbelt, MD 20770 | I run Journey2 and picnic, both FreeBSD (301) 220-2114 | version 3.0 current -- and great FUN! ----------------------------+----------------------------------------------- From owner-freebsd-scsi Mon Aug 4 19:39:25 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id TAA13679 for freebsd-scsi-outgoing; Mon, 4 Aug 1997 19:39:25 -0700 (PDT) Received: from ns2.yahoo.com (ns2.yahoo.com [205.216.162.20]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id TAA13674 for ; Mon, 4 Aug 1997 19:39:24 -0700 (PDT) Received: (from filo@localhost) by ns2.yahoo.com (8.8.5/8.6.12) id TAA28428; Mon, 4 Aug 1997 19:38:22 -0700 (PDT) Date: Mon, 4 Aug 1997 19:38:22 -0700 (PDT) Message-Id: <199708050238.TAA28428@ns2.yahoo.com> From: David Filo To: Shimon@i-Connect.Net CC: freebsd-scsi@freebsd.org In-reply-to: (message from Simon Shapiro on Mon, 04 Aug 1997 18:33:59 -0700 (PDT)) Subject: Re: strange difference between DPT and Adaptec Reply-To: filo@yahoo.com Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > > > This problem is a bit strange. I've got a disk (fujitsu narrow, > > running 2.2 with 1.1.10 DPT code) that seems to work just fine with an > > Adaptec 2940. If I connect the drive (just by itself) to the DPT card > > it comes up fine - but there is one particular directory that if I try > > to "ls" in, the machine panics with "bad dir ino at offfset 0: mangled > > entry". I found this because fsck would fail on bus error (on this > > paricular directory) under DPT but work fine with the 2940. 
> > > > > I dd'ed the entire drive onto another and that copy does not exhibit > > this behavior. > > You probably did the dd undr the DPT. Right? > Actually the dd was using the Adaptec, which I guess was kind of stupid. I did the same using the DPT and the dd'ed copy had the same problem - mangled directory causing a panic. And the new copy also fails with the 2940, which makes sense. So on the original disk there is some data that the 2940 can read but the DPT cannot. > > Joking aside, the DPT firmware, sometimes reserves one block at the > start of the disk for its own purposes, and slips all LBAs by one. > > The theory is that it does it on ALL RAID devices and on the lowest > target on the lowest bus on the first controller. The drive in question was not in a raid array, but was target 0 on bus 0 (only one controller is used). BTW, the problem persists if original drive is moved to target 1 bus 0, with a boot disk at target 0. > > BTW, this sector is what allows the DPT firmware to know what RAID arrays > are where and how. If you are brave, disconnect a large array, shuffle > the drive (after changing target IDs and try to boot. run the dptmgr > and look at the confusion. > I've noticed this when trying to build arrays from scratch using drives that used to be in arrays. I learned that you need to delete the arrays with dptmgr before rearranging things. > > Moral: When moving disks form HBA to HBA, low-level format them. > Okay, but i still don't understand what's going on. When we buy drives we don't low-level format them. Just plop them in and run fdisk, disklabel, and newfs. No problems to date. Will low-level format reserve space in the DPT case? Does extra space need to be reserved with fdisk for the DPT? Does this mean you can't clone a machine by simply dd'ing the disk if the destination adapter (say DPT) was not used to build the original disk (say 2940)? 
From owner-freebsd-scsi Mon Aug 4 20:03:24 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id UAA14568 for freebsd-scsi-outgoing; Mon, 4 Aug 1997 20:03:24 -0700 (PDT)
Received: from silvia.HIP.Berkeley.EDU (ala-ca8-36.ix.netcom.com [207.93.141.164]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id UAA14549 for ; Mon, 4 Aug 1997 20:03:19 -0700 (PDT)
Received: (from asami@localhost) by silvia.HIP.Berkeley.EDU (8.8.6/8.6.9) id UAA21425; Mon, 4 Aug 1997 20:02:56 -0700 (PDT)
Date: Mon, 4 Aug 1997 20:02:56 -0700 (PDT)
Message-Id: <199708050302.UAA21425@silvia.HIP.Berkeley.EDU>
To: dufault@hda.com
CC: scsi@freebsd.org
In-reply-to: <199708011056.GAA06673@hda.hda.com> (message from Peter Dufault on Fri, 1 Aug 1997 06:56:52 -0400 (EDT))
Subject: Re: NOT READY
From: asami@cs.berkeley.edu (Satoshi Asami)
Sender: owner-freebsd-scsi@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

 * Something like that - I think you want to return SCSIRET_DO_RETRY and
 * not SCSIRET_CONTINUE, and you may need to delay in there somehow because
 * it is going to take a while to spin up.

Ok, here's another one.

===
Index: sd.c
===================================================================
RCS file: /usr/cvs/src/sys/scsi/sd.c,v
retrieving revision 1.95.2.2
diff -u -r1.95.2.2 sd.c
--- sd.c	1997/02/05 19:02:22	1.95.2.2
+++ sd.c	1997/08/04 22:26:13
@@ -855,7 +855,17 @@

 	/* Retry all disk errors.
 	 */
+
 	scsi_sense_print(xs);
+	/* Try to restart if drive says not ready.
+	 */
+	if ((sense->error_code & SSD_ERRCODE_VALID) == 0x2) {
+		scsi_start_unit(xs->sc_link, SCSI_ERR_OK | SCSI_SILENT);
+		DELAY(5000000);
+		printf(", sent start unit command\n");
+		return SCSIRET_DO_RETRY;
+	}
+
 	if (xs->retries)
 		printf(", retries:%d\n", xs->retries);
 	else
===

I've been running the machines with the above patch since this
afternoon. I'll let you know if I see my message in the log.
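
For reference, the "not ready" condition that the patch wants to catch is reported through the sense key of the returned sense data. The following is a minimal sketch, assuming fixed-format sense data as described in the SCSI-2 specification (response code 0x70 or 0x71 in the low seven bits of byte 0, sense key in the low four bits of byte 2, NOT READY = 0x2). The helper names `sense_key_fixed` and `drive_not_ready` are hypothetical and it works on a raw byte buffer rather than the kernel's sense structure. If `SSD_ERRCODE_VALID` is the 0x80 "valid" bit that its name suggests (an assumption, not checked against the headers), the masked value in the patch can only be 0 or 0x80, so the comparison with 0x2 may never match.

```c
#include <stddef.h>
#include <stdint.h>

#define SENSE_KEY_NOT_READY 0x02 /* sense key value defined by SCSI-2 */

/*
 * Return the sense key from fixed-format sense data, or -1 if the
 * buffer is too short or does not carry response code 0x70/0x71.
 */
static int sense_key_fixed(const uint8_t *sense, size_t len)
{
    uint8_t code;

    if (sense == NULL || len < 3)
        return -1;
    code = sense[0] & 0x7f;          /* strip the "valid" bit (bit 7) */
    if (code != 0x70 && code != 0x71)
        return -1;
    return sense[2] & 0x0f;          /* sense key: low nibble of byte 2 */
}

/* Non-zero when the sense data reports the NOT READY sense key. */
static int drive_not_ready(const uint8_t *sense, size_t len)
{
    return sense_key_fixed(sense, len) == SENSE_KEY_NOT_READY;
}
```

A buffer starting with the bytes 0xF0 0x00 0x02 is recognised as NOT READY by this sketch, while 0x70 0x00 0x06 (UNIT ATTENTION) is not.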
Satoshi

From owner-freebsd-scsi Mon Aug 4 22:20:25 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id WAA20748 for freebsd-scsi-outgoing; Mon, 4 Aug 1997 22:20:25 -0700 (PDT)
Received: from implode.root.com (implode.root.com [198.145.90.17]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id WAA20740 for ; Mon, 4 Aug 1997 22:20:21 -0700 (PDT)
Received: from implode.root.com (localhost [127.0.0.1]) by implode.root.com (8.8.5/8.8.5) with ESMTP id WAA20205; Mon, 4 Aug 1997 22:21:04 -0700 (PDT)
Message-Id: <199708050521.WAA20205@implode.root.com>
To: asami@cs.berkeley.edu (Satoshi Asami)
cc: dufault@hda.com, scsi@FreeBSD.ORG
Subject: Re: NOT READY
In-reply-to: Your message of "Mon, 04 Aug 1997 20:02:56 PDT." <199708050302.UAA21425@silvia.HIP.Berkeley.EDU>
From: David Greenman
Reply-To: dg@root.com
Date: Mon, 04 Aug 1997 22:21:04 -0700
Sender: owner-freebsd-scsi@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

>+	if ((sense->error_code & SSD_ERRCODE_VALID) == 0x2) {
>+		scsi_start_unit(xs->sc_link, SCSI_ERR_OK | SCSI_SILENT);
>+		DELAY(5000000);
>+		printf(", sent start unit command\n");
>+		return SCSIRET_DO_RETRY;
>+	}

Uh, delaying for 5 seconds is not a good idea - the system will be
totally dead to the world during the entire time. The right way to do
this would be to schedule a 5 second timeout after sending the
start-unit.
-DG David Greenman Core-team/Principal Architect, The FreeBSD Project From owner-freebsd-scsi Mon Aug 4 23:07:32 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id XAA22227 for freebsd-scsi-outgoing; Mon, 4 Aug 1997 23:07:32 -0700 (PDT) Received: from pluto.plutotech.com (root@mail.plutotech.com [206.168.67.137]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id XAA22222 for ; Mon, 4 Aug 1997 23:07:30 -0700 (PDT) Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130]) by pluto.plutotech.com (8.8.5/8.8.5) with ESMTP id AAA27673; Tue, 5 Aug 1997 00:07:23 -0600 (MDT) Message-Id: <199708050607.AAA27673@pluto.plutotech.com> To: asami@cs.berkeley.edu (Satoshi Asami) cc: scsi@FreeBSD.ORG Subject: Re: NOT READY In-reply-to: Your message of "Fri, 01 Aug 1997 04:06:17 PDT." <199708011106.EAA27735@silvia.HIP.Berkeley.EDU> Date: Tue, 05 Aug 1997 00:07:23 -0600 From: "Justin T. Gibbs" Sender: owner-freebsd-scsi@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk >Something like this? Unfortunately, I don't think it will work unless you poll for completion since the error will be returned in an interrupt context. -- Justin T. 
Gibbs =========================================== FreeBSD: Turning PCs into workstations =========================================== From owner-freebsd-scsi Mon Aug 4 23:25:04 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id XAA22887 for freebsd-scsi-outgoing; Mon, 4 Aug 1997 23:25:04 -0700 (PDT) Received: from silvia.HIP.Berkeley.EDU (ala-ca8-36.ix.netcom.com [207.93.141.164]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id XAA22878 for ; Mon, 4 Aug 1997 23:25:00 -0700 (PDT) Received: (from asami@localhost) by silvia.HIP.Berkeley.EDU (8.8.6/8.6.9) id XAA15599; Mon, 4 Aug 1997 23:24:54 -0700 (PDT) Date: Mon, 4 Aug 1997 23:24:54 -0700 (PDT) Message-Id: <199708050624.XAA15599@silvia.HIP.Berkeley.EDU> To: gibbs@plutotech.com CC: scsi@FreeBSD.ORG In-reply-to: <199708050607.AAA27673@pluto.plutotech.com> (gibbs@plutotech.com) Subject: Re: NOT READY From: asami@cs.berkeley.edu (Satoshi Asami) Sender: owner-freebsd-scsi@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk * Unfortunately, I don't think it will work unless you poll * for completion since the error will be returned in an * interrupt context. Oh. I thought I did what you said, but I have no clue as to what to do. I just searched for "sense_handler" and "start_unit" in sd.c and tried to paste them together. Can someone tell me some more about how to go about this? And as David pointed out, it's going to lock up the system for 5 seconds (which is probably not bad for our case, though...). 
Satoshi From owner-freebsd-scsi Mon Aug 4 23:55:54 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id XAA24880 for freebsd-scsi-outgoing; Mon, 4 Aug 1997 23:55:54 -0700 (PDT) Received: from sax.sax.de (sax.sax.de [193.175.26.33]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id XAA24875 for ; Mon, 4 Aug 1997 23:55:50 -0700 (PDT) Received: (from uucp@localhost) by sax.sax.de (8.6.12/8.6.12-s1) with UUCP id IAA27169; Tue, 5 Aug 1997 08:55:49 +0200 Received: (from j@localhost) by uriah.heep.sax.de (8.8.5/8.8.5) id IAA15949; Tue, 5 Aug 1997 08:35:04 +0200 (MET DST) Message-ID: <19970805083504.NT06658@uriah.heep.sax.de> Date: Tue, 5 Aug 1997 08:35:04 +0200 From: j@uriah.heep.sax.de (J Wunsch) To: scsi@FreeBSD.ORG Cc: dkelly@hiwaay.net Subject: Re: Maximum Tape Block Size? References: <199708042257.RAA12906@nexgen.hiwaay.net> X-Mailer: Mutt 0.60_p2-3,5,8-9 Mime-Version: 1.0 X-Phone: +49-351-2012 669 X-PGP-Fingerprint: DC 47 E6 E4 FF A6 E9 8F 93 21 E0 7D F9 12 D6 4E Reply-To: joerg_wunsch@uriah.heep.sax.de (Joerg Wunsch) In-Reply-To: <199708042257.RAA12906@nexgen.hiwaay.net>; from dkelly@hiwaay.net on Aug 4, 1997 17:57:39 -0500 Sender: owner-freebsd-scsi@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk As dkelly@hiwaay.net wrote: > A year or so ago I recall Jordan had an SGI DAT tape with 256k blocksize > which wouldn't read on FreeBSD. Am wondering how this problem was > eventually solved? No, it hasn't. The problem is the way requests are being `sliced' in physio(9). Right now, physio() uses a fixed 64 KB slicing, which is the smallest common denominator for the SCSI controller with the smallest scatter-gather list (AHA1540 with 16 segments per 4 KB in worst case, one page each). At least, that's how i understood it. It should be made more flexible so that controllers with larger scatter-gather lists could be handled efficiently. 
Those with a small list could perhaps optionally use a large physically contiguous region as a bounce buffer, but that's quite wasteful. Feel free to implement a better scheme. ;-) -- cheers, J"org joerg_wunsch@uriah.heep.sax.de -- http://www.sax.de/~joerg/ -- NIC: JW11-RIPE Never trust an operating system you don't have sources for. ;-) From owner-freebsd-scsi Tue Aug 5 01:14:55 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id BAA28730 for freebsd-scsi-outgoing; Tue, 5 Aug 1997 01:14:55 -0700 (PDT) Received: from pluto.plutotech.com (root@mail.plutotech.com [206.168.67.137]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id BAA28719 for ; Tue, 5 Aug 1997 01:14:52 -0700 (PDT) Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130]) by pluto.plutotech.com (8.8.5/8.8.5) with ESMTP id CAA00234; Tue, 5 Aug 1997 02:14:49 -0600 (MDT) Message-Id: <199708050814.CAA00234@pluto.plutotech.com> To: asami@cs.berkeley.edu (Satoshi Asami) cc: gibbs@plutotech.com, scsi@FreeBSD.ORG Subject: Re: NOT READY In-reply-to: Your message of "Mon, 04 Aug 1997 23:24:54 PDT." <199708050624.XAA15599@silvia.HIP.Berkeley.EDU> Date: Tue, 05 Aug 1997 02:14:49 -0600 From: "Justin T. Gibbs" Sender: owner-freebsd-scsi@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk > * Unfortunately, I don't think it will work unless you poll > * for completion since the error will be returned in an > * interrupt context. > >Oh. I thought I did what you said, but I have no clue as to what to >do. I just searched for "sense_handler" and "start_unit" in sd.c and >tried to paste them together. Can someone tell me some more about >how to go about this? Well, it isn't really obvious that what you did won't work. 8-) The problem is that making it work in the current architecture is quite tough. 
I think that the best approach to try first, before attempting something fancier, is to perform a start unit with the immediate bit set (you'll have to modify the sd_start routine and check the SCSI spec for which bit to set - bit 0 in the second byte, I think), then simply poll for completion (it should be short, since you'll be telling the drive to return as soon as it starts the operation instead of holding off until the spin-up completes). Then you can resubmit the original transaction, which will probably wind up in a loop hitting the "unit in process of becoming ready" code until the spin-up completes. Any way you slice it, though, you will end up "hanging" the machine until the spin-up completes. -- Justin T. Gibbs =========================================== FreeBSD: Turning PCs into workstations =========================================== From owner-freebsd-scsi Tue Aug 5 05:35:32 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id FAA10408 for freebsd-scsi-outgoing; Tue, 5 Aug 1997 05:35:32 -0700 (PDT) Received: from hda.hda.com (hda-bicnet.bicnet.net [208.220.66.37]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id FAA10397 for ; Tue, 5 Aug 1997 05:35:28 -0700 (PDT) Received: (from dufault@localhost) by hda.hda.com (8.8.5/8.8.5) id HAA18841; Tue, 5 Aug 1997 07:49:17 -0400 (EDT) From: Peter Dufault Message-Id: <199708051149.HAA18841@hda.hda.com> Subject: Re: NOT READY In-Reply-To: <199708050302.UAA21425@silvia.HIP.Berkeley.EDU> from Satoshi Asami at "Aug 4, 97 08:02:56 pm" To: asami@cs.berkeley.edu (Satoshi Asami) Date: Tue, 5 Aug 1997 07:49:16 -0400 (EDT) Cc: scsi@freebsd.org X-Mailer: ELM [version 2.4ME+ PL25 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > + scsi_start_unit(xs->sc_link, SCSI_ERR_OK | SCSI_SILENT); > + DELAY(5000000); This is a five-second busy wait and will hang the system.
You're in a done interrupt at this point, so you can't sleep. I think a "SCSIRET_TRANSACTION_CONTINUES" up in scsi_base could be added so that this interrupt thread would go away without signalling any completion and then finish this transaction up when the spliced-in command completes. It would take some work to do properly and test. Try returning without the DELAY and hope the code that retries forever when something "is becoming ready" kicks in. It will still poll, but at least it will stop as soon as the unit is ready. -- Peter Dufault (dufault@hda.com) Realtime development, Machine control, HD Associates, Inc. Safety critical systems, Agency approval From owner-freebsd-scsi Thu Aug 7 10:01:14 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id KAA17302 for freebsd-scsi-outgoing; Thu, 7 Aug 1997 10:01:14 -0700 (PDT) Received: from gatekeeper.barcode.co.il (gatekeeper.barcode.co.il [192.116.93.17]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id KAA17296 for ; Thu, 7 Aug 1997 10:01:09 -0700 (PDT) Received: (from nadav@localhost) by gatekeeper.barcode.co.il (8.8.5/8.6.12) id UAA26573; Thu, 7 Aug 1997 20:01:28 +0300 (IDT) Date: Thu, 7 Aug 1997 20:01:28 +0300 (IDT) From: Nadav Eiron To: scsi@freebsd.org Subject: NCR 810 fatal errors during install. (fwd) Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk I've tried this on -questions and didn't get any replies, so I'm trying out scsi (though I know it's not really the right place). I've also tried to leave just one disk on the SCSI chain, and I get the same results. Any help will be greatly appreciated. ---------- Forwarded message ---------- Date: Wed, 6 Aug 1997 19:08:10 +0300 (IDT) From: Nadav Eiron To: questions@freebsd.org Subject: NCR 810 fatal errors during install. Hi people, We have here a DECpc XL 590 (it used to be a 466, but got upgraded recently).
This machine has a Neptune chipset, built-in NCR 53C810, 32MB RAM and a DE435 ethernet card. On the SCSI bus I have: ID 0 - Quantum LPS340S (340MB disk). ID 1 - HP C3323-300 (1GB disk). ID 5 - Toshiba XM-4101TA (2x CD). All are recognized correctly during boot (both disks work at 10MB/sec). Both 2.2.2-RELEASE and 2.2-090801-RELENG installs give the following shortly after starting to write to the disks (either during newfs-ing them or while copying): ncr0:0: ERROR (20:0) (8-28-0) (8/13) @ (e18:18000140). script cmd = 88030000 reg: da 10 80 13 47 08 00 1f 01 08 80 28 00 00 08 00. ncr0: have to clear fifos. ncr0: restart (fatal error). sd0(ncr0:0:0): COMMAND FAILED (9 ff) @f0663c00. ncr0: timeout ccb=f0663c00 (skip) and then it just hangs. This machine used to work fine (without the HP disk though) when it had a 486 CPU under Win95 and NT, BTW. Can anyone make sense of that? TIA Nadav From owner-freebsd-scsi Fri Aug 8 18:51:23 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id SAA00260 for freebsd-scsi-outgoing; Fri, 8 Aug 1997 18:51:23 -0700 (PDT) Received: from mail.ican.net ([204.92.55.5]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id SAA00251 for ; Fri, 8 Aug 1997 18:51:17 -0700 (PDT) Received: from oddjob.ican.net (oddjob.ican.net [204.92.55.7]) by mail.ican.net (8.8.6/8.8.6) with ESMTP id VAA00369; Fri, 8 Aug 1997 21:50:36 -0400 (EDT) Received: (from josh@localhost) by oddjob.ican.net (8.8.6/8.8.6) id VAA19917; Fri, 8 Aug 1997 21:51:05 -0400 (EDT) Message-ID: <19970808215105.63789@ican.net> Date: Fri, 8 Aug 1997 21:51:05 -0400 From: Josh Tiefenbach To: Simon Shapiro Cc: scsi@freebsd.org Subject: `problem' with dpt driver v1.2.0 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 0.74 Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Shimon, We've been noticing some disturbing behavior with both v1.1.10 and v1.2.0 of the dpt driver.
In a nutshell, we can't have unattended reboots. I think the best way of describing the problem is with dmesg outputs. [shutdown box] syncing disks... 2 done dpt0: Shutting down (mode 0) HBA. Please wait...dpt0: Controller was warned of shutdown and is now disabled Rebooting... [snip] DPT: RAID Manager driver, Version 1.0.0 Probing for devices on PCI bus 0: chip0 rev 2 on pci0:0 DPT: PCI SCSI HBA Driver, version 1.2.0 dpt0 rev 2 int a irq 11 on pci0:13 dpt0: DPT type 3, model PM3334UW firmware 07L0, Protocol 0 on port 1090 with 458753MB Write-Back cache dpt0: Enabled Options: Use Software Interrupts Precisely Track State Transitions Collect Metrics dpt0 waiting for scsi devices to settle dpt0 ERROR: Command "Test Unit Ready [7.24]" recieved for b0t0u0 but controller is shutdown. Aborting... dpt0 ERROR: Command "Inquiry [7.5]" recieved for b0t0u0 but controller is shutdown. Aborting... dpt0 ERROR: Command "Test Unit Ready [7.24]" recieved for b0t1u0 but controller is shutdown. Aborting... [repeated *many* times, followed by eventually] panic: cannot mount root This is, of course, a problem :) The only way I've been able to ameliorate the situation is to boot the machine with an older version of the dpt driver (from floppy no less), wait for it to come up, and reboot with the new driver. Any suggestions?
josh -- Josh Tiefenbach - Systems Engineer - ACC TelEnterprises - josh@ican.net From owner-freebsd-scsi Fri Aug 8 21:28:04 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id VAA06066 for freebsd-scsi-outgoing; Fri, 8 Aug 1997 21:28:04 -0700 (PDT) Received: from sendero-ppp.i-connect.net (sendero-ppp.i-Connect.Net [206.190.143.100]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id VAA06005 for ; Fri, 8 Aug 1997 21:27:42 -0700 (PDT) Received: (qmail 478 invoked by uid 1000); 9 Aug 1997 04:21:17 -0000 Message-ID: X-Mailer: XFMail 1.2-alpha [p0] on FreeBSD Content-Type: text/plain; charset=iso-8859-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <19970808215105.63789@ican.net> Date: Fri, 08 Aug 1997 21:21:17 -0700 (PDT) Organization: Atlas Telecom From: Simon Shapiro To: Josh Tiefenbach Subject: RE: `problem' with dpt driver v1.2.0 Cc: scsi@freebsd.org Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi Josh Tiefenbach; On 09-Aug-97 you wrote: > Shimon, > > We've been noticing some disturbing behavior with both v1.1.10 and v1.2.0 > of > the dpt driver. In a nutshell, we cant have unattended reboots. I think > the > best way of describing the problem, is with dmesg outputs. > > [shutdown box] > > syncing disks... 2 done > dpt0: Shutting down (mode 0) HBA. Please wait...dpt0: Controller was > warned of shutdown and is now disabled > Rebooting... > > [snip] > > DPT: RAID Manager driver, Version 1.0.0 > Probing for devices on PCI bus 0: > chip0 rev 2 on pci0:0 > DPT: PCI SCSI HBA Driver, version 1.2.0 > dpt0 rev 2 int a irq 11 on pci0:13 > dpt0: DPT type 3, model PM3334UW firmware 07L0, Protocol 0 > on port 1090 with 458753MB Write-Back cache > dpt0: Enabled Options: > Use Software Interrupts > Precisely Track State Transitions > Collect Metrics > dpt0 waiting for scsi devices to settle > dpt0 ERROR: Command "Test Unit Ready [7.24]" recieved for b0t0u0 > but controller is shutdown. Aborting... 
> dpt0 ERROR: Command "Inquiry [7.5]" recieved for b0t0u0 > but controller is shutdown. Aborting... > dpt0 ERROR: Command "Test Unit Ready [7.24]" recieved for b0t1u0 > but controller is shutdown. Aborting... > > [repeated *many* times, followed by eventually] > > panic: cannot mount root > > > This is, of course, a problem :) The only way I've been able to > ameliorate the > situation is to boot the machine with an older version of the dpt driver > (from > floppy no less), wait for it to come up. and reboot with the new driver. > > Any suggestions? > > josh > > -- > Josh Tiefenbach - Systems Engineer - ACC TelEnterprises - josh@ican.net From owner-freebsd-scsi Sat Aug 9 02:43:48 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id CAA16531 for freebsd-scsi-outgoing; Sat, 9 Aug 1997 02:43:48 -0700 (PDT) Received: from sendero-ppp.i-connect.net (sendero-ppp.i-Connect.Net [206.190.143.100]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id CAA16512 for ; Sat, 9 Aug 1997 02:43:40 -0700 (PDT) Received: (qmail 17350 invoked by uid 1000); 9 Aug 1997 09:43:52 -0000 Message-ID: X-Mailer: XFMail 1.2-alpha [p0] on FreeBSD Content-Type: text/plain; charset=iso-8859-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <19970808215105.63789@ican.net> Date: Sat, 09 Aug 1997 02:43:51 -0700 (PDT) Organization: Atlas Telecom From: Simon Shapiro To: Josh Tiefenbach Subject: RE: `problem' with dpt driver v1.2.0 Cc: scsi@freebsd.org Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi Josh Tiefenbach; On 09-Aug-97 you wrote: > Shimon, Simon will do just as well :-) > We've been noticing some disturbing behavior with both v1.1.10 and v1.2.0 > of > the dpt driver. In a nutshell, we cant have unattended reboots. I think > the > best way of describing the problem, is with dmesg outputs. Version 1.2 is not published. See the copyright message. Actually, version 1.2 does not exist yet. > [shutdown box] > > syncing disks... 
2 done > dpt0: Shutting down (mode 0) HBA. Please wait...dpt0: Controller was > warned of shutdown and is now disabled > Rebooting... > > [snip] > > DPT: RAID Manager driver, Version 1.0.0 > Probing for devices on PCI bus 0: > chip0 rev 2 on pci0:0 > DPT: PCI SCSI HBA Driver, version 1.2.0 > dpt0 rev 2 int a irq 11 on pci0:13 > dpt0: DPT type 3, model PM3334UW firmware 07L0, Protocol 0 > on port 1090 with 458753MB Write-Back cache > dpt0: Enabled Options: > Use Software Interrupts > Precisely Track State Transitions > Collect Metrics > dpt0 waiting for scsi devices to settle > dpt0 ERROR: Command "Test Unit Ready [7.24]" recieved for b0t0u0 > but controller is shutdown. Aborting... > dpt0 ERROR: Command "Inquiry [7.5]" recieved for b0t0u0 > but controller is shutdown. Aborting... > dpt0 ERROR: Command "Test Unit Ready [7.24]" recieved for b0t1u0 > but controller is shutdown. Aborting... > > [repeated *many* times, followed by eventually] > > panic: cannot mount root > > > This is, of course, a problem :) The only way I've been able to > ameliorate the > situation is to boot the machine with an older version of the dpt driver > (from > floppy no less), wait for it to come up. and reboot with the new driver. > > Any suggestions? Yes. Tell Simon to: a) take a Programming in C introductory class and to stop working until 0300 four times a week :-) A bit more to the point: when we shut down the controller, we set a bit in a state variable in a structure which controls the DPT behaviour (dpt_softc_t *dpt....). Amazingly enough, on your system (we never saw this problem here on any machine, honest), when it reboots on the same kernel configuration, your system's DRAM survives the reset, malloc allocates the same virtual address, the vm assigns the same physical address, etc., and you see the bit still set.
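A minimal user-space sketch of that failure mode, under loudly stated assumptions: `struct softc_sketch`, `DPT_STATE_SHUTDOWN`, `sketch_attach` and `sketch_is_shutdown` are hypothetical names invented for illustration and are not taken from the dpt driver, and `memset` merely stands in for whatever clearing the real attach routine performs.

```c
#include <assert.h>
#include <string.h>

#define DPT_STATE_SHUTDOWN 0x01u   /* hypothetical flag bit, illustration only */

/* Hypothetical stand-in for the driver's per-controller state. */
struct softc_sketch {
    unsigned int state;    /* state bits, including the shutdown flag */
    int          queued;   /* some other field that may hold stale data */
};

/* Attach-time initialisation: clear the whole structure before use,
 * so a flag left in memory by a previous boot cannot be observed. */
void sketch_attach(struct softc_sketch *sc)
{
    assert(sc != NULL);
    memset(sc, 0, sizeof(*sc));
}

/* Nonzero if commands would be refused because the shutdown bit is set. */
int sketch_is_shutdown(const struct softc_sketch *sc)
{
    return (sc->state & DPT_STATE_SHUTDOWN) != 0;
}
```

If the structure is handed back with the old contents intact and nothing clears it, `sketch_is_shutdown()` keeps reporting a shutdown that belongs to the previous boot; clearing it at attach time removes that dependence on what happens to be in memory.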
Please go to sys/pci/dpt_pci.c and in the function dpt_pci_attach, after the long comment, BEFORE the four TAILQ_INIT calls, add: bzero(dpt, sizeof(dpt_softc_t)); If that does not fix the problem, let me know. When I release 1.2, the problem will be fixed. Simon From owner-freebsd-scsi Sat Aug 9 02:43:50 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id CAA16540 for freebsd-scsi-outgoing; Sat, 9 Aug 1997 02:43:50 -0700 (PDT) Received: from sendero-ppp.i-connect.net (sendero-ppp.i-Connect.Net [206.190.143.100]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id CAA16514 for ; Sat, 9 Aug 1997 02:43:40 -0700 (PDT) Received: (qmail 17372 invoked by uid 1000); 9 Aug 1997 09:43:52 -0000 Message-ID: X-Mailer: XFMail 1.2-alpha [p0] on FreeBSD Content-Type: text/plain; charset=iso-8859-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <199708050238.TAA28428@ns2.yahoo.com> Date: Sat, 09 Aug 1997 02:43:52 -0700 (PDT) Organization: Atlas Telecom From: Simon Shapiro To: filo@yahoo.com Subject: Re: strange difference between DPT and Adaptec Cc: freebsd-scsi@freebsd.org Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Sorry for the delay. Was very, very busy. I respond to mail on demand but sometimes forget to read the list. Sorry. Hi David Filo; On 05-Aug-97 you wrote: ... > The drive in question was not in a raid array, but was target 0 on bus > 0 (only one controller is used). BTW, the problem persists if > original drive is moved to target 1 bus 0, with a boot disk at target > 0. I really do not remember the details, but b0t0 is definitely a special case. If the DPT firmware thinks your drive is a new addition, it ignores all these special meanings. If not, it will confuse things. Report it as a bug to DPT support. ... > I've noticed this when trying to build arrays from scratch using > drives that used to be in arrays. I learned that you need to delete > the arrays with dptmgr before rearranging things.
Yup. Only sometimes this is not an option. I always keep an Adaptec on hand. For one thing, it knows how to low-level format the drives without booting an O/S. I may be able to release a DPT firmware that does that. I need to certify it and get permission to post it. ... > Okay, but i still don't understand what's going on. When we buy > drives we don't low-level format them. Just plop them in and run > fdisk, disklabel, and newfs. No problems to date. I have never seen a problem with new drives either. Just taking a drive that has filesystems, partitions, etc. seems to be a good way to ruin a weekend. > Will low-level format reserve space in the DPT case? Does extra space > need to be reserved with fdisk for the DPT? I do not know. No. > Does this mean you can't clone a machine by simply dd'ing the disk if > the destination adapter (say DPT) was not used to build the original > disk (say 2940)? Again, I hesitate to answer this one. I clone systems with DPT for DPT, Adaptec for Adaptec. Well, I do not clone systems, Manufacturing does... I can ask. Simon