From owner-freebsd-scsi Sun Apr 20 08:41:02 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id IAA27063 for freebsd-scsi-outgoing; Sun, 20 Apr 1997 08:41:02 -0700 (PDT) Received: from tor-adm1.nbc.netcom.ca (taob@tor-adm1.nbc.netcom.ca [207.181.89.5]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id IAA27043; Sun, 20 Apr 1997 08:40:56 -0700 (PDT) Received: from localhost (taob@localhost) by tor-adm1.nbc.netcom.ca (8.8.5/8.8.5) with SMTP id LAA29484; Sun, 20 Apr 1997 11:40:00 -0400 (EDT) Date: Sun, 20 Apr 1997 11:40:00 -0400 (EDT) From: Brian Tao To: freebsd-scsi@freebsd.org cc: freebsd-current@freebsd.org Subject: Re: "Data overrun" with 3.0-SNAP, 2940UW controllers In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk On Sun, 20 Apr 1997, Brian Tao wrote: > > Apr 19 23:22:51 nfs /kernel: sd5: data overrun of 484 bytes detected. Forcing a retry. *sigh*... I should have dug more deeply into the mailing list archives before posting... 
-- Brian Tao (BT300, taob@netcom.ca) "Though this be madness, yet there is method in't" From owner-freebsd-scsi Sun Apr 20 09:21:24 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA28790 for freebsd-scsi-outgoing; Sun, 20 Apr 1997 09:21:24 -0700 (PDT) Received: from pluto.plutotech.com (root@pluto100.plutotech.com [206.168.67.137]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id JAA28768; Sun, 20 Apr 1997 09:21:18 -0700 (PDT) Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130]) by pluto.plutotech.com (8.8.5/8.8.3) with ESMTP id KAA24724; Sun, 20 Apr 1997 10:21:08 -0600 (MDT) Message-Id: <199704201621.KAA24724@pluto.plutotech.com> X-Mailer: exmh version 2.0beta 12/23/96 To: Brian Tao cc: freebsd-scsi@freebsd.org, freebsd-current@freebsd.org Subject: Re: "Data overrun" with 3.0-SNAP, 2940UW controllers In-reply-to: Your message of "Sun, 20 Apr 1997 02:04:50 EDT." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Sun, 20 Apr 1997 10:19:43 -0600 From: "Justin T. Gibbs" Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > I'm stress testing a new NFS server with 2.0-970209-SNAP to see >how it deals with having a couple of Adaptec 2940UW controllers. >About half an hour into the tests, the machine appears to have crashed >(no response to pings), and I don't have physical access to the >machine right now. :( Old, old, old, old bug. You need to be running a newer version of the aic7xxx driver. Try the latest 2.2SNAP available from admin1.calweb.com. -- Justin T. 
Gibbs =========================================== FreeBSD: Turning PCs into workstations =========================================== From owner-freebsd-scsi Sun Apr 20 11:28:14 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id LAA04619 for freebsd-scsi-outgoing; Sun, 20 Apr 1997 11:28:14 -0700 (PDT) Received: from tor-adm1.nbc.netcom.ca (taob@tor-adm1.nbc.netcom.ca [207.181.89.5]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id LAA04599; Sun, 20 Apr 1997 11:28:08 -0700 (PDT) Received: from localhost (taob@localhost) by tor-adm1.nbc.netcom.ca (8.8.5/8.8.5) with SMTP id OAA27592; Sun, 20 Apr 1997 14:24:07 -0400 (EDT) Date: Sun, 20 Apr 1997 14:24:07 -0400 (EDT) From: Brian Tao To: "Justin T. Gibbs" cc: freebsd-scsi@freebsd.org, freebsd-current@freebsd.org Subject: Re: "Data overrun" with 3.0-SNAP, 2940UW controllers In-Reply-To: <199704201621.KAA24724@pluto.plutotech.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk On Sun, 20 Apr 1997, Justin T. Gibbs wrote: > > Old, old, old, old bug. You need to be running a newer version of > the aic7xxx driver. Try the latest 2.2SNAP available from > admin1.calweb.com. Cool, thanks. 
-- Brian Tao (BT300, taob@netcom.ca) "Though this be madness, yet there is method in't" From owner-freebsd-scsi Sun Apr 20 17:34:01 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id RAA00221 for freebsd-scsi-outgoing; Sun, 20 Apr 1997 17:34:01 -0700 (PDT) Received: from mexico.brainstorm.eu.org (root@mexico.brainstorm.fr [193.56.58.253]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id RAA00192 for ; Sun, 20 Apr 1997 17:33:55 -0700 (PDT) Received: from brasil.brainstorm.eu.org (brasil.brainstorm.fr [193.56.58.33]) by mexico.brainstorm.eu.org (8.8.4/8.8.4) with ESMTP id CAA11759 for ; Mon, 21 Apr 1997 02:33:46 +0200 Received: (from uucp@localhost) by brasil.brainstorm.eu.org (8.8.4/8.6.12) with UUCP id CAA07800 for freebsd-scsi@FreeBSD.ORG; Mon, 21 Apr 1997 02:33:54 +0200 Received: (from roberto@localhost) by keltia.freenix.fr (8.8.5/keltia-uucp-2.9) id BAA00820; Mon, 21 Apr 1997 01:30:49 +0200 (CEST) Message-ID: <19970421013048.25762@keltia.freenix.fr> Date: Mon, 21 Apr 1997 01:30:48 +0200 From: Ollivier Robert To: "FreeBSD SCSI Users' list" Subject: NCR-810a & old Micropolis MP1624 problems Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Mailer: Mutt 0.67 X-Operating-System: FreeBSD 3.0-CURRENT ctm#3195 Sender: owner-freebsd-scsi@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk Hi, A few days ago a friend and I upgraded our 486s to a brand new P/I P55T2P4 MB + P133 (overclocked at 2x 83 MHz, works great!) and ASUS SC-200 (the model which supports Ultra). We both have old Micropolis MP1624 drives (not really SCSI2 but CCS). Mine is working perfectly, whereas his makes his NCR puke with an error message. Once the offending disk is turned off, the system boots w/o problems. 
I don't have the _exact_ message because the machine is too fast (and I don't have access to it) but it is something like this: ncr0: phase change (6-7) 8@000959c resid=6 aborting jobs error 90 and cycling with the same message a few times till it decides to refuse the drive. The main problem is that all targets after this one won't be recognized and the system won't boot. BIG difference: he's using 2.1.6 (I run CURRENT). Are there significant differences between the 2.1.* driver and the 2.2/3.0 one? He can boot the 2.2.1 boot floppy w/o problem. Is 2.2.1 the only answer to his problem? My messages: Copyright (c) 1992-1997 FreeBSD Inc. Copyright (c) 1982, 1986, 1989, 1991, 1993 The Regents of the University of California. All rights reserved. FreeBSD 3.0-CURRENT #0: Sat Apr 19 02:47:00 CEST 1997 roberto@keltia.freenix.fr:/src/src/sys/compile/NKELTIA CPU: Pentium (167.05-MHz 586-class CPU) Origin = "GenuineIntel" Id = 0x52c Stepping=12 Features=0x1bf real memory = 67108864 (65536K bytes) avail memory = 63684608 (62192K bytes) bdevsw_add_generic: adding D_DISK flag for device 15 Probing for devices on PCI bus 0: chip0 rev 3 on pci0:0:0 chip1 rev 1 on pci0:7:0 chip2 rev 0 on pci0:7:1 vga0 rev 1 int a irq 9 on pci0:10:0 ncr0 rev 18 int a irq 12 on pci0:11:0 scbus0 at ncr0 bus 0 sd0 at scbus0 target 0 lun 0 sd0: type 0 fixed SCSI 2 sd0: Direct-Access sd0: 10.0 MB/s (100 ns, offset 8) 2063MB (4226725 512 byte sectors) scbus0 target 2 lun 0: phase change 2-3 10@00086bd8 resid=4. 
sd2 at scbus0 target 2 lun 0 sd2: type 0 fixed SCSI 2 sd2: Direct-Access sd2: 10.0 MB/s (100 ns, offset 8) 1030MB (2110812 512 byte sectors) st1 at scbus0 target 4 lun 0 st1: type 1 removable SCSI 2 st1: Sequential-Access density code 0x0, drive empty st0 at scbus0 target 5 lun 0 st0: type 1 removable SCSI 2 st0: Sequential-Access st0: 5.0 MB/s (200 ns, offset 8) density code 0x13, drive empty ncr1 rev 18 int a irq 11 on pci0:12:0 scbus1 at ncr1 bus 0 sd11 at scbus1 target 1 lun 0 sd11: type 0 fixed SCSI 2 sd11: Direct-Access sd11: 10.0 MB/s (100 ns, offset 8) 642MB (1316751 512 byte sectors) sd12 at scbus1 target 2 lun 0 sd12: type 0 fixed SCSI 2 sd12: Direct-Access sd12: 10.0 MB/s (100 ns, offset 8) 1006MB (2061108 512 byte sectors) cd1 at scbus1 target 6 lun 0 cd1: type 5 removable SCSI 2 cd1: CD-ROM cd1: asynchronous. cd1: M_REJECT sent for 1-3-1-76-8. can't get the size -- Ollivier ROBERT -=- FreeBSD: There are no limits -=- roberto@keltia.freenix.fr FreeBSD keltia.freenix.fr 3.0-CURRENT #41: Sun Mar 23 23:01:22 CET 1997 From owner-freebsd-scsi Mon Apr 21 03:52:30 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id DAA05465 for freebsd-scsi-outgoing; Mon, 21 Apr 1997 03:52:30 -0700 (PDT) Received: from ipro.de ([195.88.13.146]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id DAA05423; Mon, 21 Apr 1997 03:52:21 -0700 (PDT) Received: from jochenkn.intern.ipro.de (jochenkn.intern.ipro.de [172.16.1.2]) by ipro.de (8.8.5/8.8.5) with SMTP id MAA05502; Mon, 21 Apr 1997 12:52:13 +0200 (MET DST) Message-Id: <199704211052.MAA05502@ipro.de> Comments: Authenticated sender is From: "Jochen Knuth" Organization: IPRO GmbH, Leonberg To: scsi@freebsd.org Date: Mon, 21 Apr 1997 12:52:13 +0200 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII Content-transfer-encoding: 7BIT Subject: Problem with ahc solved: switched to ncr Reply-to: J.Knuth@ipro.de CC: stable@freebsd.org Priority: normal X-mailer: Pegasus Mail for Win32 
(v2.53DE/R1) Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, after upgrading to 2.2-STABLE I got several errors with the ahc driver (mostly when dumping large filesystems to an HP DAT). These errors sometimes caused a reboot and sometimes froze the machine. They are the same as some of you have also reported (timeouts etc). But last Thursday something strange happened: after some days with successful backups I noticed that the machine was rebooting while doing a dump. But it couldn't boot because the _disk geometry_ had changed to 2millions/1/1. After some problems with the restore procedure I switched to an NCR SCSI controller. No problems anymore. Just FYI, Jochen ---------------------------------------------------------- Jochen Knuth WebMaster http://www.ipro.de IPRO GmbH Fon ++49-7152-93330 Steinbeisstr. 6 Fax ++49-7152-933340 71229 Leonberg EMail: J.Knuth@ipro.de From owner-freebsd-scsi Mon Apr 21 05:17:58 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id FAA09750 for freebsd-scsi-outgoing; Mon, 21 Apr 1997 05:17:58 -0700 (PDT) Received: from indigo.ie (aoife.indigo.ie [194.125.133.9]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id FAA09742 for ; Mon, 21 Apr 1997 05:17:55 -0700 (PDT) Received: from indigo.ie (localhost [127.0.0.1]) by indigo.ie (8.8.5/8.8.5/INDIGO-HUB) with ESMTP id NAA20868 for ; Mon, 21 Apr 1997 13:17:52 +0100 (BST) Message-Id: <199704211217.NAA20868@indigo.ie> To: freebsd-scsi@freebsd.org Subject: Vendor specific ASCQ SCSI errors in 2.2-STABLE From: Alan Judge Date: Mon, 21 Apr 1997 13:17:52 +0100 Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk [I'm not on the freebsd-scsi list, so please reply directly if you can shed any light.] I'm running 2.2-STABLE on a PPro machine with an Adaptec 3940UW and a bunch of Quantum Atlas II disks. Tagging and SCB paging enabled. 
I sometimes get errors of the form: sd0(ahc0:0:0): ABORTED COMMAND asc:41,86 Vendor Specific ASCQ, retries:4 only under high load and on multiple different disks (as far as I can reproduce the problem). This sometimes results in errors like: vnode_pager_putpages: I/O error 5 vnode_pager_putpages: residual I/O 4096 at 5816 and various sorts of crashes. Can anyone tell me how to decode these? I've got some docs from Quantum and they indicate that this error is some sort of data path error, whatever that is. I have a bunch of identical disks in a 2.2-GAMMA machine and have never seen a similar error. Any ideas? From owner-freebsd-scsi Mon Apr 21 08:20:50 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id IAA21879 for freebsd-scsi-outgoing; Mon, 21 Apr 1997 08:20:50 -0700 (PDT) Received: from pluto.plutotech.com (root@pluto100.plutotech.com [206.168.67.137]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id IAA21868 for ; Mon, 21 Apr 1997 08:20:46 -0700 (PDT) Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130]) by pluto.plutotech.com (8.8.5/8.8.3) with ESMTP id JAA09950; Mon, 21 Apr 1997 09:20:41 -0600 (MDT) Message-Id: <199704211520.JAA09950@pluto.plutotech.com> X-Mailer: exmh version 2.0beta 12/23/96 To: Alan Judge cc: freebsd-scsi@freebsd.org Subject: Re: Vendor specific ASCQ SCSI errors in 2.2-STABLE In-reply-to: Your message of "Mon, 21 Apr 1997 13:17:52 BST." <199704211217.NAA20868@indigo.ie> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Mon, 21 Apr 1997 09:19:08 -0600 From: "Justin T. Gibbs" Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk >I sometimes get errors of the form: > sd0(ahc0:0:0): ABORTED COMMAND asc:41,86 Vendor Specific ASCQ, retries >:4 >only under high load and on multiple different disks (as far as I can reproduc >e >the problem). 
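For readers who want to pick such a message apart mechanically, here is a minimal sketch in Python, not anything taken from the driver. It assumes the line has the shape quoted above and that the two numbers after "asc:" are hexadecimal ASC and ASCQ values; both are assumptions about the kernel's print format rather than facts stated in this thread. The pattern `LINE_RE` and the function `decode` are hypothetical names made up for this illustration. The sense key numbers (0x4 for HARDWARE ERROR, 0xB for ABORTED COMMAND) and the rule that ASCQ values of 0x80 and above are vendor specific follow the SCSI-2 sense data tables.

```python
import re

# Sense keys mentioned in this thread, numbered as in the SCSI-2 sense key table.
SENSE_KEYS = {"HARDWARE ERROR": 0x4, "ABORTED COMMAND": 0xB}

# Hypothetical pattern for the quoted log line. The exact kernel format
# string is an assumption, as is reading asc/ascq as hexadecimal.
LINE_RE = re.compile(
    r"(?P<dev>\w+)\((?P<ctlr>\w+):(?P<target>\d+):(?P<lun>\d+)\):\s+"
    r"(?P<key>[A-Z ]+?)\s+asc:(?P<asc>[0-9a-fA-F]+),(?P<ascq>[0-9a-fA-F]+)"
)


def decode(line):
    """Split one sense message into its fields, or return None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    ascq = int(m.group("ascq"), 16)
    return {
        "device": m.group("dev"),
        "controller": m.group("ctlr"),
        "target": int(m.group("target")),
        "lun": int(m.group("lun")),
        "sense_key": m.group("key"),
        "sense_key_code": SENSE_KEYS.get(m.group("key")),
        "asc": int(m.group("asc"), 16),
        "ascq": ascq,
        # SCSI-2 reserves ASCQ 0x80-0xFF for vendor-specific qualifiers.
        "vendor_specific_ascq": ascq >= 0x80,
    }


if __name__ == "__main__":
    msg = ("sd0(ahc0:0:0): ABORTED COMMAND asc:41,86 "
           "Vendor Specific ASCQ, retries:4")
    print(decode(msg))
```

Under those assumptions the message reads as sense key ABORTED COMMAND with ASC 0x41 and a vendor-specific ASCQ of 0x86, which is why the drive vendor's reference is needed to interpret the qualifier.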
From a quick look at the Atlas II tech ref I have here, this is: DDMA overrun or REQ/ACK overrun/underrun error. BUT, the manual says that it will return HARDWARE ERROR not ABORTED COMMAND in this case. I'll have to go look at how we print out the information to make sure this is correct. If this is indeed the case, I would suspect a cabling or termination problem that rears its ugly head only under heavy load. >I have a bunch of identical disks in a 2.2-GAMMA machine and have never seen >a similar error. The 2.2-GAMMA driver can't generate the same kinds of load as the current driver. -- Justin T. Gibbs =========================================== FreeBSD: Turning PCs into workstations =========================================== From owner-freebsd-scsi Mon Apr 21 08:40:48 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id IAA23914 for freebsd-scsi-outgoing; Mon, 21 Apr 1997 08:40:48 -0700 (PDT) Received: from indigo.ie (aoife.indigo.ie [194.125.133.9]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id IAA23901 for ; Mon, 21 Apr 1997 08:40:43 -0700 (PDT) Received: from indigo.ie (localhost [127.0.0.1]) by indigo.ie (8.8.5/8.8.5/INDIGO-HUB) with ESMTP id QAA03447; Mon, 21 Apr 1997 16:40:20 +0100 (BST) Message-Id: <199704211540.QAA03447@indigo.ie> To: "Justin T. Gibbs" Cc: freebsd-scsi@freebsd.org, Alan.Judge@indigo.ie Subject: Re: Vendor specific ASCQ SCSI errors in 2.2-STABLE In-reply-to: Message from "Justin T. Gibbs" dated today at 09:19. From: Alan Judge Date: Mon, 21 Apr 1997 16:40:20 +0100 Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Justin> DDMA overrun or REQ/ACK overrun/underrun error. Justin> BUT, the manual says that it will return HARDWARE ERROR not Justin> ABORTED COMMAND in this case. I'll have to go look at how we Justin> print out the information to make sure this is correct. OK, thanks. I was just looking at the 41,00 entry under ABORTED COMMAND as the nearest match. 
Maybe things aren't getting printed correctly. If not, maybe I should try to get an explanation of the code from Quantum. Justin> If this is indeed the case, I would suspect a cabling or Justin> termination problem that rears its ugly head only under heavy Justin> load. The setup is fairly simple. Eight disks, all internal. Four each on channels of a 3940UW. A single ribbon cable connects each set, and termination is enabled on the last drive in each chain. The only unusual feature is that all the disks are in hot-swap canisters, so there's some internal cabling there. At the moment, I only get the error on sd0 and even there it only happens once a day or so. Any ideas on where to start narrowing down the problem? >> I have a bunch of identical disks in a 2.2-GAMMA machine and have never seen >> a similar error. Justin> The 2.2-GAMMA driver can't generate the same kinds of load as Justin> the current driver. OK. I was wondering whether the problem I'm having might be related to the other ahc problems. -- Alan From owner-freebsd-scsi Mon Apr 21 08:48:45 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id IAA24667 for freebsd-scsi-outgoing; Mon, 21 Apr 1997 08:48:45 -0700 (PDT) Received: from pluto.plutotech.com (root@pluto100.plutotech.com [206.168.67.137]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id IAA24657 for ; Mon, 21 Apr 1997 08:48:42 -0700 (PDT) Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130]) by pluto.plutotech.com (8.8.5/8.8.3) with ESMTP id JAA10492; Mon, 21 Apr 1997 09:48:39 -0600 (MDT) Message-Id: <199704211548.JAA10492@pluto.plutotech.com> X-Mailer: exmh version 2.0beta 12/23/96 To: Alan Judge cc: "Justin T. Gibbs" , freebsd-scsi@freebsd.org Subject: Re: Vendor specific ASCQ SCSI errors in 2.2-STABLE In-reply-to: Your message of "Mon, 21 Apr 1997 16:40:20 BST." 
<199704211540.QAA03447@indigo.ie> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Mon, 21 Apr 1997 09:47:05 -0600 From: "Justin T. Gibbs" Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk >If not, maybe I should try to get an explanation of the code from >Quantum. It may be hard to get. We just received the latest tech ref from them, so you'll have to get above one of their tech support drones in order to get a real response. >Justin> If this is indeed the case, I would suspect a cabling or >Justin> termination problem that rears its ugly head only under heavy >Justin> load. > >The setup is fairly simple. Eight disks, all internal. Four each on >channels of a 3940UW. A single ribbon cable connects each set, and >termination is enabled on the last drive in each chain. The only >unusual feature is that all the disks are in hot-swap canisters, so >there's some internal cabling there. Your setup appears to be sound. Using a single ribbon cable for all of the ultra disks is one of the best ways, according to an Adaptec study, to ensure correct Ultra SCSI operation. Hmm. >At the moment, I only get the error on sd0 and even there it only >happens once a day or so. Any ideas on where to start narrowing down >the problem? My first hunch would be a bogus connector somewhere in the system which is putting a large capacitive load on the bus. I would try changing out a canister at a time with a known good one. If you can afford to steal a canister from the other bus, the one that seems to work, that should allow you to be "scientific" about it. -- Justin T. 
Gibbs =========================================== FreeBSD: Turning PCs into workstations =========================================== From owner-freebsd-scsi Mon Apr 21 09:00:26 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA26036 for freebsd-scsi-outgoing; Mon, 21 Apr 1997 09:00:26 -0700 (PDT) Received: from indigo.ie (aoife.indigo.ie [194.125.133.9]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id JAA26021 for ; Mon, 21 Apr 1997 09:00:21 -0700 (PDT) Received: from indigo.ie (localhost [127.0.0.1]) by indigo.ie (8.8.5/8.8.5/INDIGO-HUB) with ESMTP id RAA12615; Mon, 21 Apr 1997 17:00:10 +0100 (BST) Message-Id: <199704211600.RAA12615@indigo.ie> To: "Justin T. Gibbs" Cc: freebsd-scsi@freebsd.org Subject: Re: Vendor specific ASCQ SCSI errors in 2.2-STABLE In-reply-to: Message from "Justin T. Gibbs" dated today at 09:47. From: Alan Judge Date: Mon, 21 Apr 1997 17:00:09 +0100 Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk [Thanks for your help.] Justin> Your setup appears to be sound. Using a single ribbon cable Justin> for all of the ultra disks is one of the best ways, according Justin> to an Adaptec study, to ensure correct Ultra SCSI operation. Justin> Hmm. I just had a close look at the system and noticed something I hadn't seen before. One of the ribbon cables (the one on the bus with problems) doesn't end at the fourth connector. Instead the cable goes on an extra 3-4cm and is taped back (not even electrical tape, I think). (This was probably done by our local supplier, who have a tendency to cut corners. I guess they shortened a longer cable.) Could this be the problem? I'm sure the cable stub is not doing the electrical behaviour of the bus any good at all. I'm tempted to remove the cable and trim it to exact length. >> At the moment, I only get the error on sd0 and even there it only >> happens once a day or so. Any ideas on where to start narrowing down >> the problem. 
Justin> My first hunch would be a bogus connector somewhere in the Justin> system which is putting a large capacitive load on the bus. I Justin> would try changing out a canister at a time with a known good Justin> one. If you can afford to steal a canister from the other Justin> bus, the one that seems to work, that should allow you to be Justin> "scientific" about it. I have a spare canister, so I can do some tests. Of course, this is a production news server, so shutting down regularly won't make me popular. Given the MTTF though, each test will probably have to take days. Of course, the bogus cable could be the problem. I'll check all the connectors and sockets carefully as well. -- Alan From owner-freebsd-scsi Mon Apr 21 09:11:57 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA27325 for freebsd-scsi-outgoing; Mon, 21 Apr 1997 09:11:57 -0700 (PDT) Received: from pluto.plutotech.com (root@pluto100.plutotech.com [206.168.67.137]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id JAA27315 for ; Mon, 21 Apr 1997 09:11:54 -0700 (PDT) Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130]) by pluto.plutotech.com (8.8.5/8.8.3) with ESMTP id KAA10985; Mon, 21 Apr 1997 10:11:51 -0600 (MDT) Message-Id: <199704211611.KAA10985@pluto.plutotech.com> X-Mailer: exmh version 2.0beta 12/23/96 To: Alan Judge cc: "Justin T. Gibbs" , freebsd-scsi@freebsd.org Subject: Re: Vendor specific ASCQ SCSI errors in 2.2-STABLE In-reply-to: Your message of "Mon, 21 Apr 1997 17:00:09 BST." <199704211600.RAA12615@indigo.ie> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Mon, 21 Apr 1997 10:10:17 -0600 From: "Justin T. Gibbs" Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk >I just had a close look at the system and noticed something I hadn't >seen before. One of the ribbon cables (the one on the bus with >problems) doesn't end at the fourth connector. 
Instead the cable goes >on an extra 3-4cm and is taped back (not even electrical tape, I >think). (This was probably done by our local supplier, who have a >tendency to cut corners. I guess they shortened a longer cable.) > >Could this be the problem? I'm sure the cable stub is not doing the >electrical behaviour of the bus any good at all. I'm tempted to >remove the cable and trim it to exact length. Definitely trim that cable. Just be sure to clean the wires after you make the cut as most are braided and tend to fray/short the other lines after you cut them. -- Justin T. Gibbs =========================================== FreeBSD: Turning PCs into workstations =========================================== From owner-freebsd-scsi Mon Apr 21 13:00:18 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id NAA10414 for freebsd-scsi-outgoing; Mon, 21 Apr 1997 13:00:18 -0700 (PDT) Received: from rs3.rrz.Uni-Koeln.DE (519@rs3.rrz.Uni-Koeln.DE [134.95.100.214]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id NAA10408 for ; Mon, 21 Apr 1997 13:00:14 -0700 (PDT) Received: (from afr04@localhost) by rs3.rrz.Uni-Koeln.DE (8.8.5/8.8.4) id WAA122659; Mon, 21 Apr 1997 22:00:05 +0200 Date: Mon, 21 Apr 1997 22:00:05 +0200 (MST) From: Ralf Luettgen To: freebsd-scsi@freebsd.org Subject: RAID-5 Controller for FreeBSD Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi folks, which RAID-5 controller can I use with FreeBSD 2.2.1? 
Thanks for help Ralf From owner-freebsd-scsi Mon Apr 21 14:12:05 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id OAA14438 for freebsd-scsi-outgoing; Mon, 21 Apr 1997 14:12:05 -0700 (PDT) Received: from iafnl.es.iaf.nl (uucp@iafnl.es.iaf.nl [195.108.17.20]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id OAA14381 for ; Mon, 21 Apr 1997 14:11:59 -0700 (PDT) Received: by iafnl.es.iaf.nl with UUCP id AA09017 (5.67b/IDA-1.5 for freebsd-scsi@freebsd.org); Mon, 21 Apr 1997 23:12:02 +0200 Received: (from wilko@localhost) by yedi.iaf.nl (8.7.5/8.6.12) id UAA00722; Mon, 21 Apr 1997 20:36:35 +0200 (MET DST) From: Wilko Bulte Message-Id: <199704211836.UAA00722@yedi.iaf.nl> Subject: Re: Vendor specific ASCQ SCSI errors in 2.2-STABLE To: gibbs@plutotech.com (Justin T. Gibbs) Date: Mon, 21 Apr 1997 20:36:35 +0200 (MET DST) Cc: Alan.Judge@indigo.ie, gibbs@plutotech.com, freebsd-scsi@freebsd.org In-Reply-To: <199704211548.JAA10492@pluto.plutotech.com> from "Justin T. Gibbs" at Apr 21, 97 09:47:05 am X-Mailer: ELM [version 2.4 PL24 ME8a] Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk As Justin T. Gibbs wrote... > >At the moment, I only get the error on sd0 and even there it only > >happens once a day or so. Any ideas on where to start narrowing down > >the problem. > > My first hunch would be a bogus connector somewhere in the system which > is putting a large capacitive load on the bus. I would try changing out > a canister at a time with a known good one. If you can afford to steal > a canister from the other bus, the one that seems to work, that should > allow you to be "scientific" about it. How much stublength does each canister introduce? Are these canisters designed with Ultra SCSI speeds in mind? 
Wilko _ ____________________________________________________________________ | / o / / _ Bulte email: wilko@yedi.iaf.nl - Arnhem, The Netherlands |/|/ / / /( (_) Do, or do not. There is no 'try' - Yoda -------------------------------------------------------------------------- From owner-freebsd-scsi Mon Apr 21 14:12:20 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id OAA14465 for freebsd-scsi-outgoing; Mon, 21 Apr 1997 14:12:20 -0700 (PDT) Received: from iafnl.es.iaf.nl (uucp@iafnl.es.iaf.nl [195.108.17.20]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id OAA14454 for ; Mon, 21 Apr 1997 14:12:14 -0700 (PDT) Received: by iafnl.es.iaf.nl with UUCP id AA09029 (5.67b/IDA-1.5 for freebsd-scsi@freebsd.org); Mon, 21 Apr 1997 23:12:26 +0200 Received: (from wilko@localhost) by yedi.iaf.nl (8.7.5/8.6.12) id UAA00759; Mon, 21 Apr 1997 20:40:54 +0200 (MET DST) From: Wilko Bulte Message-Id: <199704211840.UAA00759@yedi.iaf.nl> Subject: Re: Vendor specific ASCQ SCSI errors in 2.2-STABLE To: gibbs@plutotech.com (Justin T. Gibbs) Date: Mon, 21 Apr 1997 20:40:54 +0200 (MET DST) Cc: Alan.Judge@indigo.ie, freebsd-scsi@freebsd.org In-Reply-To: <199704211520.JAA09950@pluto.plutotech.com> from "Justin T. Gibbs" at Apr 21, 97 09:19:08 am X-Mailer: ELM [version 2.4 PL24 ME8a] Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk As Justin T. Gibbs wrote... > >I sometimes get errors of the form: > > sd0(ahc0:0:0): ABORTED COMMAND asc:41,86 Vendor Specific ASCQ, retries > >:4 > >only under high load and on multiple different disks (as far as I can reproduc > >e > >the problem). > > >From a quick look at the Atlas II tech ref I have here, this is: > > DDMA overrun or REQ/ACK overrun/underrun error. This could well be a glitch on the REQ/ACK lines. Having bus stubs (inside the canisters) makes the drives vulnerable to this problem. 
Any chance you can try things without the canisters? (I know, a horrible lot of work). An active terminator on the end of the bus instead of a drive-internal terminator in a canister is also worthwhile. Wilko _ ____________________________________________________________________ | / o / / _ Bulte email: wilko@yedi.iaf.nl - Arnhem, The Netherlands |/|/ / / /( (_) Do, or do not. There is no 'try' - Yoda -------------------------------------------------------------------------- From owner-freebsd-scsi Tue Apr 22 06:14:16 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id GAA14136 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 06:14:16 -0700 (PDT) Received: from weenix.guru.org (kmitch@phantasma.bevc.blacksburg.va.us [198.82.200.65]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id GAA14129 for ; Tue, 22 Apr 1997 06:14:13 -0700 (PDT) Received: (from kmitch@localhost) by weenix.guru.org (8.8.5/8.8.5) id JAA22747 for scsi@freebsd.org; Tue, 22 Apr 1997 09:14:07 -0400 (EDT) From: Keith Mitchell NIS Message-Id: <199704221314.JAA22747@weenix.guru.org> Subject: Freezes/Reboots with -current ahc driver To: scsi@freebsd.org Date: Tue, 22 Apr 1997 09:14:06 -0400 (EDT) X-Mailer: ELM [version 2.4ME+ PL30 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk I just upgraded my 2.2-STABLE system (from around April 1) to 3.0-current (as of April 21) to gain use of the SMP stuff. Now, I am experiencing system freezes and reboots during the nightly backup. No error messages ever appear from what I can tell. The little SCSI light doesn't even stay solidly lit when the system freezes. I didn't have any problems with the 2.2-STABLE system. I am only using SCB paging. Tagged queuing has never worked with my Micropolis 4221W. I just get all kinds of timeouts with that option on. Has anyone else seen anything similar? 
-- Keith Mitchell Head Administrator: acm.vt.edu Email: kmitch@weenix.guru.org PGP key available upon request http://weenix.guru.org/~kmitch Address and URL (c) 1997 Keith Mitchell - All Rights Reserved Unauthorized use or duplication prohibited From owner-freebsd-scsi Tue Apr 22 08:20:29 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id IAA19828 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 08:20:29 -0700 (PDT) Received: from news1.gtn.com (news1.gtn.com [194.77.0.15]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id IAA19822 for ; Tue, 22 Apr 1997 08:20:22 -0700 (PDT) Received: (from uucp@localhost) by news1.gtn.com (8.7.2/8.7.2) with UUCP id RAA02960 for scsi@FreeBSD.ORG; Tue, 22 Apr 1997 17:00:47 +0200 (MET DST) Received: (from andreas@localhost) by klemm.gtn.com (8.8.5/8.8.2) id PAA08699; Tue, 22 Apr 1997 15:53:15 +0200 (CEST) Message-ID: <19970422155315.41408@klemm.gtn.com> Date: Tue, 22 Apr 1997 15:53:15 +0200 From: Andreas Klemm To: scsi@FreeBSD.ORG Subject: bonnie results with latest ahc driver relatively low on writing... Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 0.69 X-Disclaimer: A free society is one where it is safe to be unpopular X-Operating-System: FreeBSD 2.2-STABLE Sender: owner-freebsd-scsi@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk Hi! Machine: PPro 200 MHz. OS = FreeBSD 2.2-STABLE as of yesterday. I just ran some bonnie benchmarks on a fresh 2 GB filesystem. The hard disk is an IBM DORS; the controller is an AHA 2940. 
ahc0: target 0 Tagged Queuing Device
(ahc0:0:0): "IBM DORS-32160 WA6A" type 0 fixed SCSI 2
sd0(ahc0:0:0): Direct-Access 2063MB (4226725 512 byte sectors)
sd0(ahc0:0:0): with 6703 cyls, 5 heads, and an average 126 sectors/track

     -------Sequential Output--------    ---Sequential Input--   --Random--
     -Per Char-  --Block---  -Rewrite--  -Per Char-- ---Block--- --Seeks---
     K/sec  %CPU K/sec  %CPU K/sec  %CPU K/sec  %CPU K/sec  %CPU  /sec  %CPU
  25  5305  45.6  4179  10.2  5722  19.2 10723 100.0 71963 100.0 794.9  10.3
  50  4173  35.5  4256  10.6  2760   9.9 10724  99.8 71274 100.0 899.1  11.9
 100  4148  35.4  4207  10.6  2370   7.7  6825  64.3  5482   8.2 146.8   2.5
      ^^^^  ^^^^
 150  4140  35.5  4203  10.7  2201   7.1  6274  59.1  5436   7.9 100.0   1.8
 300  4002  34.0  4036  10.1  2023   6.2  5725  53.9  5356   7.7  74.4   1.4
 600  3919  33.4  3937  10.0  1805   5.7  5314  50.1  5148   7.5  62.1   1.2
1200  3700  31.5  3707   9.3  1708   5.4  4919  46.1  4847   7.2  53.5   1.1

If I remember right, I had about 5400 K/sec write performance with
former versions of the ahc driver. Now only the read performance has
the old speed of about 5.5 MB/sec.

Has anybody else seen similar results?

Here is my kernel config file, just for reference:

machine "i386"
cpu "I686_CPU"
ident BISDN
maxusers 64

# Debugging
#options DDB
#options KTRACE #kernel tracing

# Networking
options INET #InterNETworking
options IPFIREWALL #firewall
options IPFIREWALL_VERBOSE #print information about dropped packets
options "IPFIREWALL_VERBOSE_LIMIT=100" #limit verbosity

# filesystems
options FFS #Berkeley Fast Filesystem
options PROCFS #Process filesystem
options MFS #Memory File System
options NSWAPDEV=3 #Allow this many swap-devices.

# misc options
options "COMPAT_43" #Compatible with BSD 4.3 [KEEP THIS!]
options UCONSOLE #Allow users to grab the console
options SYSVSHM,SYSVSEM,SYSVMSG #shared memory (X11)
options "MD5"
options COMPAT_LINUX # Linux Binary compatibility

config kernel root on sd1

# ISA and PCI BUS support
controller isa0
controller pci0

# Floppy Disk Controller
controller fdc0 at isa? port "IO_FD1" bio irq 6 drq 2 vector fdintr
disk fd0 at fdc0 drive 0

# AHA 2940 PCI Controller
controller ahc0

# SCSI Devices
controller scbus0
device sd0 # Harddisk 0 - DOS/FreeBSD SMP
device sd1 # Harddisk 1 - FreeBSD Boot
device sd2 # Harddisk 2 - FreeBSD local
device st0 # TDC 4222
device cd0 # TOSHIBA XM-5701TA 3136
options AHC_TAGENABLE # tagged command queueing
options AHC_ALLOW_MEMIO
options AHC_SCBPAGING_ENABLE
options SCSI_REPORT_GEOMETRY

# SCO compatible system console
device sc0 at isa? port "IO_KBD" tty irq 1 vector scintr
options MAXCONS=4 # number of virtual consoles

# floating point unit
device npx0 at isa? port "IO_NPX" flags 0x1 irq 13 vector npxintr

# serial devices on mainboard
device sio0 at isa? port "IO_COM1" tty irq 4 vector siointr
device sio1 at isa? port "IO_COM2" tty irq 3 vector siointr

# parallel device on mainboard
device lpt0 at isa? port? tty irq 7 vector lptintr

# PS/2 mouse on mainboard
device psm0 at isa? port "IO_KBD" conflicts tty irq 12 vector psmintr
options "PSM_ACCEL=1" # PS/2 mouse acceleration

# Joystick
device joy0 at isa? port "IO_GAME"

# Network 3COM PCI
device vx0

# Soundblaster 16
# SoundBlaster DSP driver - for SB, SB Pro, SB16, PAS(emulating SB)
# SoundBlaster 16 DSP driver - for SB16 - requires sb0 device
# SoundBlaster 16 MIDI - for SB16 - requires sb0 device
# Yamaha OPL-2/OPL-3 FM - for SB, SB Pro, SB16, PAS
controller snd0
device sb0 at isa? port 0x220 irq 5 conflicts drq 1 vector sbintr
device sbxvi0 at isa? drq 5
device sbmidi0 at isa? port 0x330
device opl0 at isa? port 0x388

# Pseudo devices
pseudo-device loop
pseudo-device ether
pseudo-device log #Kernel syslog interface (/dev/klog)
pseudo-device vn 1 #Vnode driver (turns a file into a dev.)
pseudo-device tun 1 #user mode ppp
pseudo-device bpfilter 2 #Berkeley packet filter
pseudo-device pty 16
pseudo-device gzip # Exec gzipped a.out's

# BISDN
options IPI_VJ # Van Jacobsen header compression support
#options "IPI_DIPA=3" # send ip accounting packets every 3 seconds
options TELES_HAS_MEMCPYB # bisdn 0.97

# Teles S0/16.3 ###################################################### IRQ 9 ##
controller tel0 at isa? port 0xd80 net irq 9 vector telintr
pseudo-device disdn
pseudo-device isdn
pseudo-device ipi 1
pseudo-device ispy 1
#pseudo-device itel 1

--
andreas@klemm.gtn.com /\/\___ Wiechers & Partner Datentechnik GmbH
Andreas Klemm ___/\/\/ Support Unix -- andreas.klemm@wup.de
pgp p-key http://www-swiss.ai.mit.edu/~bal/pks-toplev.html >>> powered by <<<
ftp://sunsite.unc.edu/pub/Linux/system/Printing/aps-491.tgz >>> FreeBSD <<<

From owner-freebsd-scsi Tue Apr 22 09:45:45 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id JAA23908 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 09:45:45 -0700 (PDT)
Received: from pluto.plutotech.com (root@pluto100.plutotech.com [206.168.67.137]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id JAA23901 for ; Tue, 22 Apr 1997 09:45:42 -0700 (PDT)
Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130]) by pluto.plutotech.com (8.8.5/8.8.3) with ESMTP id KAA05060; Tue, 22 Apr 1997 10:45:38 -0600 (MDT)
Message-Id: <199704221645.KAA05060@pluto.plutotech.com>
X-Mailer: exmh version 2.0beta 12/23/96
To: Keith Mitchell NIS
cc: scsi@FreeBSD.ORG
Subject: Re: Freezes/Reboots with -current ahc driver
In-reply-to: Your message of "Tue, 22 Apr 1997 09:14:06 EDT." <199704221314.JAA22747@weenix.guru.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Tue, 22 Apr 1997 10:44:06 -0600
From: "Justin T.
Gibbs" Sender: owner-freebsd-scsi@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk >I just upgraded my 2.2-STABLE system (from around April 1) to 3.0-current >(as of April 21) to gain use of the SMP stuff. > >Now, I am experiancing system freezes and reboots during the >nightly backup. No error messages ever appear from what I can tell. >The little SCSI light doesn;t even stay solidly lit when the system >freezes. I didn't have any problems with the 2.2-STABLE system. The drivers are identical in the two branches, so it may not be a driver problem at all. Hmmm. -- Justin T. Gibbs =========================================== FreeBSD: Turning PCs into workstations =========================================== From owner-freebsd-scsi Tue Apr 22 10:28:12 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id KAA26789 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 10:28:12 -0700 (PDT) Received: from Ilsa.StevesCafe.com (Ilsa.StevesCafe.com [205.168.119.129]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id KAA26593; Tue, 22 Apr 1997 10:25:22 -0700 (PDT) Received: from Ilsa.StevesCafe.com (localhost [127.0.0.1]) by Ilsa.StevesCafe.com (8.8.5/8.8.5) with ESMTP id LAA16301; Tue, 22 Apr 1997 11:25:04 -0600 (MDT) Message-Id: <199704221725.LAA16301@Ilsa.StevesCafe.com> X-Mailer: exmh version 2.0gamma 1/27/96 From: Steve Passe To: Keith Mitchell NIS cc: scsi@freebsd.org, smp@freebsd.org Subject: Re: Freezes/Reboots with -current ahc driver In-reply-to: Your message of "Tue, 22 Apr 1997 09:14:06 EDT." <199704221314.JAA22747@weenix.guru.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 22 Apr 1997 11:25:03 -0600 Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, > I just upgraded my 2.2-STABLE system (from around April 1) to 3.0-current > (as of April 21) to gain use of the SMP stuff. > > Now, I am experiancing system freezes and reboots during the > nightly backup. 
No error messages ever appear from what I can tell. > The little SCSI light doesn;t even stay solidly lit when the system > freezes. I didn't have any problems with the 2.2-STABLE system. > > I am only using SCB paging. Tagged queuing has never worked with my > Micropolis 4221W. I just get all kinds of timeouts with tht option > on. > > Has anyone else seen anything similar?? details please. what backup tool are you using when it crashes? are any NFS mounted partitions involved? are you using APIC_IO? is ddb installed in your kernel, if not please add. -- Steve Passe | powered by smp@csn.net | Symmetric MultiProcessor FreeBSD From owner-freebsd-scsi Tue Apr 22 12:07:00 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id MAA03831 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 12:07:00 -0700 (PDT) Received: from weenix.guru.org (kmitch@phantasma.bevc.blacksburg.va.us [198.82.200.65]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id MAA03795; Tue, 22 Apr 1997 12:06:03 -0700 (PDT) Received: (from kmitch@localhost) by weenix.guru.org (8.8.5/8.8.5) id PAA02398; Tue, 22 Apr 1997 15:05:51 -0400 (EDT) From: Keith Mitchell Message-Id: <199704221905.PAA02398@weenix.guru.org> Subject: Re: Freezes/Reboots with -current ahc driver In-Reply-To: <199704221725.LAA16301@Ilsa.StevesCafe.com> from Steve Passe at "Apr 22, 97 11:25:03 am" To: smp@csn.net (Steve Passe) Date: Tue, 22 Apr 1997 15:05:51 -0400 (EDT) Cc: gibbs@freebsd.org, smp@freebsd.org, scsi@freebsd.org X-Mailer: ELM [version 2.4ME+ PL30 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > what backup tool are you using when it crashes? I am using amanda 2.3.0.4 as the backup program which in turn uses dump. None of the drives are mounted via NFS. I do have APIC_IO turned on. 
After I turned on the comconsole and ddb stuff I found out that it was panicing (it looks like in the SMP stuff). The controller card is a 3940UW. I did have toi patch the mp_machdep.c file for the PCI-PCI bridge. The panic message is: apicIPI is stuck panic(cpu#0): boot() called on cpu#0 What should I look for in DDB?? -- Keith Mitchell Head Administrator: acm.vt.edu Email: kmitch@weenix.guru.org PGP key available upon request http://weenix.guru.org/~kmitch Address and URL (c) 1997 Keith Mitchell - All Rights Reserved Unauthorized use or duplication prohibited From owner-freebsd-scsi Tue Apr 22 12:37:57 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id MAA11737 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 12:37:57 -0700 (PDT) Received: from Ilsa.StevesCafe.com (Ilsa.StevesCafe.com [205.168.119.129]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id MAA11694; Tue, 22 Apr 1997 12:37:49 -0700 (PDT) Received: from Ilsa.StevesCafe.com (localhost [127.0.0.1]) by Ilsa.StevesCafe.com (8.8.5/8.8.5) with ESMTP id NAA17103; Tue, 22 Apr 1997 13:36:25 -0600 (MDT) Message-Id: <199704221936.NAA17103@Ilsa.StevesCafe.com> X-Mailer: exmh version 2.0gamma 1/27/96 From: Steve Passe To: Keith Mitchell cc: gibbs@freebsd.org, smp@freebsd.org, scsi@freebsd.org Subject: Re: Freezes/Reboots with -current ahc driver In-reply-to: Your message of "Tue, 22 Apr 1997 15:05:51 EDT." <199704221905.PAA02398@weenix.guru.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 22 Apr 1997 13:36:25 -0600 Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, > > what backup tool are you using when it crashes? > > I am using amanda 2.3.0.4 as the backup program which in turn uses > dump. None of the drives are mounted via NFS. > > I do have APIC_IO turned on. > > After I turned on the comconsole and ddb stuff I found out that it was > panicing (it looks like in the SMP stuff). The controller card is a > 3940UW. 
I did have toi patch the mp_machdep.c file for the PCI-PCI > bridge. > > The panic message is: > > apicIPI is stuck > panic(cpu#0): > > boot() called on cpu#0 > > What should I look for in DDB?? you are hitting the known "APIC_IO/heavy IO deadlock" problem. I'll think more about a solution. In the meantime you could try a non_APIC_IO kernel, that should solve the problem for now... -- Steve Passe | powered by smp@csn.net | Symmetric MultiProcessor FreeBSD From owner-freebsd-scsi Tue Apr 22 13:15:54 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id NAA29970 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 13:15:54 -0700 (PDT) Received: from sendero.i-connect.net (sendero-ppp.i-Connect.Net [206.190.143.100]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id NAA29963 for ; Tue, 22 Apr 1997 13:15:51 -0700 (PDT) Received: (from shimon@localhost) by sendero.i-connect.net (8.8.5/8.8.5) id NAA06442 for freebsd-scsi@freebsd.org; Tue, 22 Apr 1997 13:15:48 -0700 (PDT) Message-ID: X-Mailer: XFMail 1.1-alpha [p0] on FreeBSD Content-Type: text/plain; charset=iso-8859-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 Date: Tue, 22 Apr 1997 13:01:59 -0700 (PDT) Organization: iConnect Corp. From: Simon Shapiro To: freebsd-scsi@freebsd.org Subject: Ahc problems, yet again... Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk I start annoying and boring myself too... :-) An Iomega Jaz (I knwo, i know :-) hooked up to an AHa-2940UW. The drive spins down after some minutes. Several hours later we are running dump on all filesystms. We get to the Jaz and: sd0 Not ready Logical unit not ready ILLEGAL REQUEST Could not mode sense. Using ficticious geometry All this is pretty normal. 
What immediately follows is NOT:

Fatal trap 9: General protection violation while in kernel mode

IP: 0xf01de2ff
SP: 0xefbfff64
FP: 0xefbfff84

at _generic_bzero + 0x0f repe stol %es(%edi)

Tracing provided:

_end _end _end
_Xfastintr7 at _Xfastintr7 + 0x17

All this beauty with kernel as of 21-Apr-97 (RELENG_2_2).

With the BETA_A kernel, I got pretty much the same ahc complaints, but
it panicked with:

ahc0:a:0: Target did not send an IDENTIFY message. SAVE_TCL=0

If you make sure the Jaz is spinning while booting, the 2.2-BETA_A is
fine. It still complains about the ILLEGAL REQUEST, but does not panic.

Is this a good one or what? :-)

Simon

wo, i know :-) hooked up to an AHa-2940UW.
The drive spins down after some minutes.
Several hours later we are running dump on all filesystms.
We get to the Jaz and:

sd0 Not ready
sd0 Logical unit not ready
sd0 ILLEGAL REQUEST

From owner-freebsd-scsi Tue Apr 22 14:37:10 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id OAA07580 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 14:37:10 -0700 (PDT)
Received: from agora.rdrop.com (root@agora.rdrop.com [199.2.210.241]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id OAA07565 for ; Tue, 22 Apr 1997 14:37:05 -0700 (PDT)
Received: from mexico.brainstorm.eu.org by agora.rdrop.com with smtp (Smail3.1.29.1 #17) id m0wJnFD-0009AnC; Tue, 22 Apr 97 14:36 PDT
Received: from brasil.brainstorm.eu.org (brasil.brainstorm.fr [193.56.58.33]) by mexico.brainstorm.eu.org (8.8.4/8.8.4) with ESMTP id XAA21242 for ; Tue, 22 Apr 1997 23:32:14 +0200
Received: (from uucp@localhost) by brasil.brainstorm.eu.org (8.8.4/8.6.12) with UUCP id XAA12281 for freebsd-scsi@FreeBSD.org; Tue, 22 Apr 1997 23:31:52 +0200
Received: (from roberto@localhost) by keltia.freenix.fr (8.8.5/keltia-uucp-2.9) id TAA02675; Tue, 22 Apr 1997 19:09:37 +0200 (CEST)
Message-ID: <19970422190933.49028@keltia.freenix.fr>
Date: Tue, 22 Apr 1997 19:09:33 +0200
From: Ollivier Robert
To: "FreeBSD SCSI Users' list"
Subject: Re: NCR-810a & old Micropolis MP1624 problems References: <19970421013048.25762@keltia.freenix.fr> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Mailer: Mutt 0.67 In-Reply-To: <19970421013048.25762@keltia.freenix.fr>; from Ollivier Robert on Mon, Apr 21, 1997 at 01:30:48AM +0200 X-Operating-System: FreeBSD 3.0-CURRENT ctm#3233 Sender: owner-freebsd-scsi@FreeBSD.org X-Loop: FreeBSD.org Precedence: bulk According to Ollivier Robert: > BIG difference: he's using 2.1.6 (I run CURRENT). Are there signifiant > différences between the 2.1.* driver and the 2.2/3.0 one ? He can boot the > 2.2.1 boot floppy w/o problem. Is 2.2.1 the only answer to his problem ? We have found that adding FAILSAFE as a kernel option does fix the problem (which exists with 2.2.1 too). What I have trouble to understand is that I don't need it at all for the same drive... -- Ollivier ROBERT -=- FreeBSD: There are no limits -=- roberto@keltia.freenix.fr FreeBSD keltia.freenix.fr 3.0-CURRENT #1: Mon Apr 21 01:37:27 CEST 1997 From owner-freebsd-scsi Tue Apr 22 14:52:34 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id OAA08628 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 14:52:34 -0700 (PDT) Received: from pluto.plutotech.com (root@pluto100.plutotech.com [206.168.67.137]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id OAA08621 for ; Tue, 22 Apr 1997 14:52:32 -0700 (PDT) Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130]) by pluto.plutotech.com (8.8.5/8.8.3) with ESMTP id PAA16847; Tue, 22 Apr 1997 15:52:29 -0600 (MDT) Message-Id: <199704222152.PAA16847@pluto.plutotech.com> X-Mailer: exmh version 2.0beta 12/23/96 To: Simon Shapiro cc: freebsd-scsi@freebsd.org Subject: Re: Ahc problems, yet again... In-reply-to: Your message of "Tue, 22 Apr 1997 13:01:59 PDT." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 22 Apr 1997 15:50:56 -0600 From: "Justin T. 
Gibbs"
Sender: owner-freebsd-scsi@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

>I start annoying and boring myself too... :-)
>
>An Iomega Jaz (I knwo, i know :-) hooked up to an AHa-2940UW.
>The drive spins down after some minutes.
>Several hours later we are running dump on all filesystms.
>We get to the Jaz and:
>
>sd0 Not ready
>    Logical unit not ready
>    ILLEGAL REQUEST
>    Could not mode sense. Using ficticious geometry
>
>All this is pretty normal. What immediately follows is NOT:
>
>Fatal trap 9: General protection violation while in kernel mode
>
>IP: 0xf01de2ff
>SP: 0xefbfff64
>FP: 0xefbfff84
>
>at _generic_bzero + 0x0f repe stol %es(%edi)

This doesn't sound like an AHC problem, but a bug somewhere in the sd
driver. Hard to say since the stack trace is so bogus. I'll look into
this when I get some time.

Why is it that all of your posts have garbage at the end of them,
usually some amount of repeat from the original message???

>wo, i know :-) hooked up to an AHa-2940UW.
>The drive spins down after some minutes.
>Several hours later we are running dump on all filesystms.
>We get to the Jaz and:
>
>sd0 Not ready
>sd0 Logical unit not ready
>sd0 ILLEGAL REQUEST
>

--
Justin T. Gibbs
===========================================
  FreeBSD: Turning PCs into workstations
===========================================

From owner-freebsd-scsi Tue Apr 22 15:23:41 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id PAA11144 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 15:23:41 -0700 (PDT)
Received: from sag.space.lockheed.com (sag.space.lockheed.com [192.68.162.134]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id PAA11134 for ; Tue, 22 Apr 1997 15:23:38 -0700 (PDT)
Received: from localhost by sag.space.lockheed.com; (5.65v3.2/1.1.8.2/21Nov95-0423PM) id AA11327; Tue, 22 Apr 1997 15:23:40 -0700
Date: Tue, 22 Apr 1997 15:23:40 -0700 (PDT)
From: "Brian N.
Handy"
To: freebsd-scsi@freebsd.org
Subject: AHA-1542 woes
Message-Id:
X-Files: The truth is out there
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-scsi@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

I'm having problems with a SCSI tape drive on a 486. First off:

FreeBSD flare.physics.montana.edu 2.2-STABLE FreeBSD 2.2-STABLE #0: Tue Apr 22 13:16:15 MDT 1997 handy@flare.physics.montana.edu:/usr/src/sys/compile/FLARE i386

...It's a newly CVSupped machine, and I did a "make world" last night
and recompiled the kernel this morning. What this machine is largely
used for is to pull data off Exabyte tapes and then archive it onto
CDrom. Prior to a few days ago, the machine ran some odd 2.2-SNAP from
middle of last year or so.

The relevant probe messages:

[...]
aha0 at 0x330-0x333 irq 11 drq 5 on isa
aha0 waiting for scsi devices to settle
(aha0:0:0): "Quantum XP34300 L912" type 0 fixed SCSI 2
sd0(aha0:0:0): Direct-Access 4101MB (8399520 512 byte sectors)
(aha0:2:0): "PHILIPS CDD2600 1.07" type 5 removable SCSI 2
worm0(aha0:2:0): Write-Once
(aha0:3:0): "EXABYTE EXB-85058SQANXR1 07J0" type 1 removable SCSI 2
st0(aha0:3:0): Sequential-Access density code 0x0, drive empty
scd0 not found at 0x230
[...]

The CD-writer works great, and the disk drive works great. This machine
used to be an IDE machine; I went to SCSI so I could use the tape drive
and CD writer.

What happens is while I'm reading tar stuff from the tape drive, the
SCSI bus seems to freeze up. The relevant messages:

Apr 22 15:02:55 flare /kernel: sd0(aha0:0:0): timed out
Apr 22 15:02:55 flare /kernel: adapter not taking commands.. frozen?!
Apr 22 15:02:55 flare /kernel:
Apr 22 15:02:59 flare /kernel: sd0(aha0:0:0): timed out
Apr 22 15:02:59 flare /kernel: adapter not taking commands.. frozen?!
Apr 22 15:02:59 flare /kernel: AGAIN
Apr 22 15:02:59 flare /kernel: aha0: MBO 02 and not 00 (free)
Apr 22 15:03:09 flare /kernel: sd0(aha0:0:0): timed out
Apr 22 15:03:09 flare /kernel: adapter not taking commands.. frozen?!
[....]

This repeats over and over for a while, and the machine is effectively
locked up now. Later, when I try to do, say, a "du" on the SCSI disk I
continue getting this sort of stuff -- which I think means the SCSI
adapter is hosed here on out. I have to do a hard reboot to get the
system back.

Nothing else seems to jam the system up solidly like this. The CD
writer and disk drive work fine all the time. Any suggestions for what
I can try to fix this?

Thanks,

Brian

From owner-freebsd-scsi Tue Apr 22 15:52:20 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id PAA13102 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 15:52:20 -0700 (PDT)
Received: from sax.sax.de (sax.sax.de [193.175.26.33]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id PAA13091 for ; Tue, 22 Apr 1997 15:52:14 -0700 (PDT)
Received: (from uucp@localhost) by sax.sax.de (8.6.12/8.6.12-s1) with UUCP id AAA19289; Wed, 23 Apr 1997 00:52:12 +0200
Received: (from j@localhost) by uriah.heep.sax.de (8.8.5/8.8.5) id AAA29573; Wed, 23 Apr 1997 00:39:52 +0200 (MET DST)
Message-ID: <19970423003951.ZU05379@uriah.heep.sax.de>
Date: Wed, 23 Apr 1997 00:39:51 +0200
From: j@uriah.heep.sax.de (J Wunsch)
To: freebsd-scsi@freeBSD.org (FreeBSD SCSI Users' list)
Cc: roberto@keltia.freenix.fr (Ollivier Robert)
Subject: Re: NCR-810a & old Micropolis MP1624 problems
References: <19970421013048.25762@keltia.freenix.fr> <19970422190933.49028@keltia.freenix.fr>
X-Mailer: Mutt 0.60_p2-3,5,8-9
Mime-Version: 1.0
X-Phone: +49-351-2012 669
X-PGP-Fingerprint: DC 47 E6 E4 FF A6 E9 8F 93 21 E0 7D F9 12 D6 4E
Reply-To: joerg_wunsch@uriah.heep.sax.de (Joerg Wunsch)
In-Reply-To: <19970422190933.49028@keltia.freenix.fr>; from Ollivier Robert on Apr 22, 1997 19:09:33 +0200
Sender:
owner-freebsd-scsi@freeBSD.org X-Loop: FreeBSD.org Precedence: bulk As Ollivier Robert wrote: > We have found that adding FAILSAFE as a kernel option does fix the problem > (which exists with 2.2.1 too). This would mean the drive claims to grok tagged command queuing, but actually doesn't. -- cheers, J"org joerg_wunsch@uriah.heep.sax.de -- http://www.sax.de/~joerg/ -- NIC: JW11-RIPE Never trust an operating system you don't have sources for. ;-) From owner-freebsd-scsi Tue Apr 22 19:01:43 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id TAA25063 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 19:01:43 -0700 (PDT) Received: from mail.calweb.com (mail.calweb.com [208.131.56.11]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id TAA25052; Tue, 22 Apr 1997 19:01:28 -0700 (PDT) Received: by mail.calweb.com (8.8.5/8.8.5) with ESMTP id TAA05074; Tue, 22 Apr 1997 19:01:13 -0700 (PDT) X-SMTP: hello web1.calweb.com from cslye@calweb.com server cslye@web1.calweb.com ip 208.131.56.51 Received: (from cslye@localhost) by web1.calweb.com (8.8.5/8.8.5) id TAA10660; Tue, 22 Apr 1997 19:01:19 -0700 (PDT) Message-Id: <199704230201.TAA10660@web1.calweb.com> Subject: Kernel panic's To: freebsd-scsi@freebsd.org, freebsd-hackers@freebsd.org Date: Tue, 22 Apr 1997 19:01:19 -0700 (PDT) From: "Cameron Slye" X-Mailer: ELM [version 2.4 PL25 ME8b] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Working on a machine here that I have problems with for over a month and a half now. Short rundown of machine, p133 on a ASUS p55t2p4 m/b with 128mb, 2 2940UW's and a SMC dual port Digital 21040 card. One 2940 has a Quantum XP31070W (root drive) and the other 2940 has a 9gig IBM oem. I am using the RELENG_2_2 from 04/21/97 for the panic's I am quoting today. 
This box is a news feeder system running innd1.5.1 (with mmap at the
moment, about to recompile without mmap).

Anyways, any ideas would be great. I have the dump files, and can get
you any other info you need.

Anyways the ddb trace etc.. The first one I don't have trace info from.

dev = 0x400, block = 4505, fs = /
panic: freeing free block
debugger("panic")
stopped at _Debugger+0x25: movb $0_in_Debugger,110

---

dev = 0x2040c, block = 19976, fs = /news/spool
panic: blkfree freeing free block
debugger: panic

trace:
_Debugger(f0117b98) at _Debugger+0x35
_panic(f019edd1,f019edae,2040c,4e08,f41f58d4) at _panic+0x5a
_fffs_blkfree(f4930e00,4e08,2000,359da0,0) at ffs_blkfree+0x19b
_ffs_indirtrunc(f4930e00,fffffff4,359da0,ffffffff,0,efbffd24) at _ffs_indirtrunc+0x222
_ffs_truncate(efbffdfc,f42b06a0,f4c08b80,efbffe70,0) at _ffs_truncate+0x83c
_ufs_inactive(efbffe28,f42b0680,f01e34f4,f42b0680,efbffe48) at _ufs_inactive+0xb1
_vrele(f42b0680,0,f4c08b80,efbffe70,efbffe54) at _vrele+0xe7
_vnode_pager_dealloc(f4c08b80,efbffe78,f01b24d0,f4c08b80,f4c08b80) at _vnode_pager_dealloc+0x95
_vm_pager_deallocate(f4c08b80) at _vm_pager_deallocate+0x16
_vm_object_terminate(f4c08b80,f42b0680,0,f43cd400,efbffea8) at _vm_object_terminate+0x154
_vm_object_deallocate(f4c08b80,f42b0680,f43cd400,1ef0,efbffec0) at _vm_object_deallocate+0x19f
_vrele(f42b0680,f4930e00,f01e3578,f42b0680,efbffedc) at _vrele+0x30
_vput(f42b0680) at _vput+0x2f
_ufs_remove(efbffef4,f01e4500,f41f4a00,0,f01e3354) at _ufs_remove+0x70
_unlink(f41f4a00,efbfff94,efbfff84,55a8,efbfddf8) at _unlink+0xb1
_syscall(27,27,efbfde0e,efbfddf8,efbfdce8) at _syscall+0x183
_Xsyscall() at _Xsyscall+0x35
--- syscall 0xa,eip = 0x8073ff1, esp = 0xefbfdc78, ebp = 0xefbfdce8

From owner-freebsd-scsi Tue Apr 22 19:29:23 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id TAA26799 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 19:29:23 -0700 (PDT)
Received: from main.gbdata.com (USR2-1.detnet.com [207.113.12.44]) by
hub.freebsd.org (8.8.5/8.8.5) with ESMTP id TAA26042; Tue, 22 Apr 1997 19:17:24 -0700 (PDT) Received: (from gclarkii@localhost) by main.gbdata.com (8.8.5/8.8.5) id VAA02617; Tue, 22 Apr 1997 21:17:10 -0500 (CDT) From: Gary Clark II Message-Id: <199704230217.VAA02617@main.gbdata.com> Subject: Re: Freezes/Reboots with -current ahc driver To: smp@csn.net (Steve Passe) Date: Tue, 22 Apr 1997 21:17:10 -0500 (CDT) Cc: kmitch@weenix.guru.org, scsi@freebsd.org, smp@freebsd.org In-Reply-To: <199704221725.LAA16301@Ilsa.StevesCafe.com> from Steve Passe at "Apr 22, 97 11:25:03 am" X-Mailer: ELM [version 2.4ME+ PL22 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Steve Passe wrote: > Hi, > > > I just upgraded my 2.2-STABLE system (from around April 1) to 3.0-current > > (as of April 21) to gain use of the SMP stuff. > > > > Now, I am experiancing system freezes and reboots during the > > nightly backup. No error messages ever appear from what I can tell. > > The little SCSI light doesn;t even stay solidly lit when the system > > freezes. I didn't have any problems with the 2.2-STABLE system. > > > > I am only using SCB paging. Tagged queuing has never worked with my > > Micropolis 4221W. I just get all kinds of timeouts with tht option > > on. > > > > Has anyone else seen anything similar?? > I don't belive that this is a SMP problem. I'm running the same current here and am seeing random reboots (not panics) 1-4 times a day. I'm running on a 486DX4-100, 24MB and NO SMP. > details please. > > what backup tool are you using when it crashes? > > are any NFS mounted partitions involved? > > are you using APIC_IO? > > is ddb installed in your kernel, if not please add. 
> > > -- > Steve Passe | powered by > smp@csn.net | Symmetric MultiProcessor FreeBSD > Gary -- Gary Clark II (N5VMF) | I speak only for myself and "maybe" my company gclarkii@GBData.COM | Member of the FreeBSD Doc Team Providing Internet and ISP startups - http://WWW.GBData.com for information FreeBSD FAQ at ftp://ftp.FreeBSD.ORG/pub/FreeBSD/docs/FAQ.latin1 From owner-freebsd-scsi Tue Apr 22 19:44:17 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id TAA27886 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 19:44:17 -0700 (PDT) Received: from Ilsa.StevesCafe.com (Ilsa.StevesCafe.com [205.168.119.129]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id TAA27079; Tue, 22 Apr 1997 19:32:36 -0700 (PDT) Received: from Ilsa.StevesCafe.com (localhost [127.0.0.1]) by Ilsa.StevesCafe.com (8.8.5/8.8.5) with ESMTP id UAA20177; Tue, 22 Apr 1997 20:32:24 -0600 (MDT) Message-Id: <199704230232.UAA20177@Ilsa.StevesCafe.com> X-Mailer: exmh version 2.0gamma 1/27/96 From: Steve Passe To: Gary Clark II cc: kmitch@weenix.guru.org, scsi@freebsd.org, smp@freebsd.org Subject: Re: Freezes/Reboots with -current ahc driver In-reply-to: Your message of "Tue, 22 Apr 1997 21:17:10 CDT." <199704230217.VAA02617@main.gbdata.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 22 Apr 1997 20:32:24 -0600 Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, > > > I just upgraded my 2.2-STABLE system (from around April 1) to 3.0-current > > > (as of April 21) to gain use of the SMP stuff. > > > > > > Now, I am experiancing system freezes and reboots during the > > > nightly backup. No error messages ever appear from what I can tell. > > > ... > > I don't belive that this is a SMP problem. I'm running the same current > here and am seeing random reboots (not panics) 1-4 times a day. > I'm running on a 486DX4-100, 24MB and NO SMP. 
For the record, after DDB was added to the SMP kernel we demonstrated that the problem in this case is SMP, specifically the apicIPI() related deadlock. So this proves that my cheap bandaid doesn't solve the problem... I've got my thinking cap on, but for now I've suggested the user switch to a NON APIC_IO kernel. -- Steve Passe | powered by smp@csn.net | Symmetric MultiProcessor FreeBSD From owner-freebsd-scsi Tue Apr 22 19:59:14 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id TAA28852 for freebsd-scsi-outgoing; Tue, 22 Apr 1997 19:59:14 -0700 (PDT) Received: from main.gbdata.com (USR2-1.detnet.com [207.113.12.44]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id TAA28041; Tue, 22 Apr 1997 19:48:00 -0700 (PDT) Received: (from gclarkii@localhost) by main.gbdata.com (8.8.5/8.8.5) id VAA02748; Tue, 22 Apr 1997 21:47:54 -0500 (CDT) From: Gary Clark II Message-Id: <199704230247.VAA02748@main.gbdata.com> Subject: Re: Freezes/Reboots with -current ahc driver To: smp@csn.net (Steve Passe) Date: Tue, 22 Apr 1997 21:47:54 -0500 (CDT) Cc: kmitch@weenix.guru.org, scsi@freebsd.org, smp@freebsd.org In-Reply-To: <199704230232.UAA20177@Ilsa.StevesCafe.com> from Steve Passe at "Apr 22, 97 08:32:24 pm" X-Mailer: ELM [version 2.4ME+ PL22 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Steve Passe wrote: > Hi, > > > > > I just upgraded my 2.2-STABLE system (from around April 1) to 3.0-current > > > > (as of April 21) to gain use of the SMP stuff. > > > > > > > > Now, I am experiancing system freezes and reboots during the > > > > nightly backup. No error messages ever appear from what I can tell. > > > > ... > > > > I don't belive that this is a SMP problem. I'm running the same current > > here and am seeing random reboots (not panics) 1-4 times a day. > > I'm running on a 486DX4-100, 24MB and NO SMP. > Ohhhh..... 
Then there are other problems in the latest current. I may have to downgrade to what I was running then (pre-lite snap). I can't run DDB here due to the fact that I have a SUN 19" monitor and no way to see regular VGA output...:( > For the record, after DDB was added to the SMP kernel we demonstrated that the > problem in this case is SMP, specifically the apicIPI() related deadlock. > So this proves that my cheap bandaid doesn't solve the problem... > I've got my thinking cap on, but for now I've suggested the user switch to > a NON APIC_IO kernel. > > -- > Steve Passe | powered by > smp@csn.net | Symmetric MultiProcessor FreeBSD > Gary -- Gary Clark II (N5VMF) | I speak only for myself and "maybe" my company gclarkii@GBData.COM | Member of the FreeBSD Doc Team Providing Internet and ISP startups - http://WWW.GBData.com for information FreeBSD FAQ at ftp://ftp.FreeBSD.ORG/pub/FreeBSD/docs/FAQ.latin1 From owner-freebsd-scsi Wed Apr 23 01:23:15 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id BAA15893 for freebsd-scsi-outgoing; Wed, 23 Apr 1997 01:23:15 -0700 (PDT) Received: from hcshh.hcs.de (hcshh.hcs.de [194.49.17.1]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id BAA15888 for ; Wed, 23 Apr 1997 01:23:10 -0700 (PDT) Received: from hcswork.hcs.de(really [192.76.124.5]) by hcshh.hcs.de via sendmail with smtp id for ; Wed, 23 Apr 1997 10:22:57 +0200 (METDST) (Smail-3.2.0.91 1997-Jan-14 #3 built 1997-Apr-8) Received: by hcswork.hcs.de (Smail3.1.29.0 #12) id m0wJxKF-00000UC; Wed, 23 Apr 97 10:22 METDST Message-Id: From: hm@hcs.de (Hellmuth Michaelis) Subject: Re: NCR-810a & old Micropolis MP1624 problems To: freebsd-scsi@freebsd.org Date: Wed, 23 Apr 1997 10:22:51 +0200 (METDST) In-Reply-To: <19970423003951.ZU05379@uriah.heep.sax.de> from "J Wunsch" at "Apr 23, 97 00:39:51 am" Reply-To: hm@hcs.de Organization: HCS Hanseatischer Computerservice GmbH X-Mailer: ELM [version 2.4ME+ PL15 (25)] Content-Type: text Sender: 
owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk From the keyboard of J Wunsch: > > We have found that adding FAILSAFE as a kernel option does fix the problem > > (which exists with 2.2.1 too). > > This would mean the drive claims to grok tagged command queuing, but > actually doesn't. This seems to be a common "feature" of some drives (or a bug in the driver) since i made the very same experience under 2.1.7 and a recent HP disk. hellmuth -- Hellmuth Michaelis Tel +49 40 559747-70 HCS Hanseatischer Computerservice GmbH Fax +49 40 559740-77 Oldesloer Strasse 97-99 Mail hm@hcs.de 22457 Hamburg WWW http://www.hcs.de From owner-freebsd-scsi Wed Apr 23 09:08:14 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id JAA06968 for freebsd-scsi-outgoing; Wed, 23 Apr 1997 09:08:14 -0700 (PDT) Received: from tacitus.globecomm.net (tacitus.globecomm.net [207.51.48.7]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id JAA06963; Wed, 23 Apr 1997 09:08:10 -0700 (PDT) Received: from w3.starnets.ro ([193.226.124.34]) by tacitus.globecomm.net (8.8.5/8.8.0) with SMTP id MAA25122; Wed, 23 Apr 1997 12:07:57 -0400 (EDT) Message-ID: <335E32E0.2D4@earthling.net> Date: Wed, 23 Apr 1997 19:03:44 +0300 From: Penisoara Adrian Reply-To: ady@earthling.net X-Mailer: Mozilla 3.01Gold (Win95; I) MIME-Version: 1.0 To: freebsd-current@FreeBSD.org, freebsd-scsi@FreeBSD.org Subject: AHA 2940AU (aic7xxx) safe options References: <335CFA80.70F1@earthling.net> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-scsi@FreeBSD.org X-Loop: FreeBSD.org Precedence: bulk Hi I need to know which of the AHC_* options are safe in the kernel config file; I tried one kernel and I got big problems (wrong netstat output, core dumps and finally a nice cold reboot). I'm not sure whether the system doesn't support the AHC options or whether my partitions have simply been left in an inconsistent state.
I have a Tyan Tomcat III with 2 Pentiums, 64Mb RAM @60ns, AHA2940AU PCI with a Quantum VP32170. Any hints are appreciated. The SCSI system looks like this from the view of a normal kernel: ---------------------- FreeBSD 3.0-SMP #0: Tue Apr 15 23:04:42 EEST 1997 root@warp2.starnets.ro:/usr/src/sys-MP/compile/ADYSMP FreeBSD/SMP: Multiprocessor motherboard cpu0 (BSP): apic id: 0, version: 0x00030010 cpu1 (AP): apic id: 1, version: 0x00030010 io0 (APIC): apic id: 2, version: 0x00170011 Calibrating clock(s) relative to mc146818A clock ... i8254 clock: 1193086 Hz CPU: Pentium (586-class CPU) Origin = "GenuineIntel" Id = 0x526 Stepping=6 Features=0x3bf real memory = 67108864 (65536K bytes) avail memory = 62566400 (61100K bytes) Probing for devices on PCI bus 0: chip0 rev 3 on pci0:0:0 chip1 rev 1 on pci0:7:0 chip2 rev 0 on pci0:7:1 vga0 rev 0 int a irq 17 on pci0:19:0 Freeing (NOT implimented) irq 10 for ISA cards. ahc0 rev 1 int a irq 16 on pci0:20:0 Freeing (NOT implimented) irq 11 for ISA cards. ahc0: aic7860 Single Channel, SCSI Id=7, 3 SCBs ahc0: waiting for scsi devices to settle scbus0 at ahc0 bus 0 sd0 at scbus0 target 6 lun 0 sd0: type 0 fixed SCSI 2 sd0: Direct-Access 2069MB (4238640 512 byte sectors) [.........] ------------------------------- and the bogus kernel (with all three options activated): ------------------------------- FreeBSD 3.0-SMP #0: Tue Apr 22 20:24:56 EEST 1997 root@warp2.starnets.ro:/usr/src/sys-MP/compile/ADYSMP FreeBSD/SMP: Multiprocessor motherboard cpu0 (BSP): apic id: 0, version: 0x00030010 cpu1 (AP): apic id: 1, version: 0x00030010 io0 (APIC): apic id: 2, version: 0x00170011 CPU: Pentium (586-class CPU) Origin = "GenuineIntel" Id = 0x526 Stepping=6 Features=0x3bf real memory = 67108864 (65536K bytes) avail memory = 63188992 (61708K bytes) Probing for devices on PCI bus 0: chip0 rev 3 on pci0:0:0 chip1 rev 1 on pci0:7:0 chip2 rev 0 on pci0:7:1 vga0 rev 0 int a irq 17 on pci0:19:0 Freeing (NOT implimented) irq 10 for ISA cards.
ahc0 rev 1 int a irq 16 on pci0:20:0 Freeing (NOT implimented) irq 11 for ISA cards. ahc0: aic7860 Single Channel, SCSI Id=7, 3/8 SCBs ahc0: waiting for scsi devices to settle scbus0 at ahc0 bus 0 ahc0: target 6 Tagged Queuing Device sd0 at scbus0 target 6 lun 0 sd0: type 0 fixed SCSI 2 sd0: Direct-Access 2069MB (4238640 512 byte sectors) -------------------------- Please note that Steve Passe already assured me that there should be nothing related to SMP code that might cause problems. Thanks. Ady (@earthling.net) From owner-freebsd-scsi Wed Apr 23 09:50:50 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id JAA09436 for freebsd-scsi-outgoing; Wed, 23 Apr 1997 09:50:50 -0700 (PDT) Received: from dg-rtp.dg.com (dg-rtp.rtp.dg.com [128.222.1.2]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id JAA09401 for ; Wed, 23 Apr 1997 09:50:40 -0700 (PDT) Received: by dg-rtp.dg.com (5.4R3.10/dg-rtp-v02) id AA26920; Wed, 23 Apr 1997 12:50:07 -0400 Received: from ponds by dg-rtp.dg.com.rtp.dg.com; Wed, 23 Apr 1997 12:50 EDT Received: from lakes.water.net (lakes [10.0.0.3]) by ponds.water.net (8.8.3/8.7.3) with ESMTP id HAA00385; Wed, 23 Apr 1997 07:39:51 -0400 (EDT) Received: (from rivers@localhost) by lakes.water.net (8.8.3/8.6.9) id HAA06260; Wed, 23 Apr 1997 07:10:36 -0400 (EDT) Date: Wed, 23 Apr 1997 07:10:36 -0400 (EDT) From: Thomas David Rivers Message-Id: <199704231110.HAA06260@lakes.water.net> To: ponds!calweb.com!cslye, ponds!freebsd.org!freebsd-hackers, ponds!freebsd.org!freebsd-scsi Subject: Re: Kernel panic's Content-Type: text Sender: owner-freebsd-scsi@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk > > Working on a machine here that I have problems with for over a month and a > half now. Short rundown of machine, p133 on a ASUS p55t2p4 m/b with 128mb, > 2 2940UW's and a SMC dual port Digital 21040 card. One 2940 has a Quantum > XP31070W (root drive) and the other 2940 has a 9gig IBM oem.
I am using the > RELENG_2_2 from 04/21/97 for the panic's I am quoting today. This box is a > news feeder system running innd1.5.1 (with mmap at the moment, about to > recompile without mmap) Anyways, any ideas would be great. I have the dump > files, and can get you any other info you need. Anyways the ddb trace etc.. This looks very much like my "dup alloc" and "freeing free inode" panics I have been working on for many months (years?)... You can find more information in the mail archives - look for "daily panics" and "dup alloc" panics. - Dave Rivers - > > The first one I dont have trace info from. > > dev = 0x400, block = 4505, fs = / > panic: freeing free block > debugger("panic") > stopped at _Debugger+0x25: movb $0_in_Debugger,110 > > --- > > dev = 0x2040c, block = 19976, fs = /news/spool > panic: blkfree freeing free block > debugger: panic > > trace: > > _Debugger(f0117b98) at _Debugger+0x35 > _panic(f019edd1,f019edae,2040c,4e08,f41f58d4) at _panic+0x5a > _fffs_blkfree(f4930e00,4e08,2000,359da0,0) at ffs_blkfree+0x19b > _ffs_indirtrunc(f4930e00,fffffff4,359da0,ffffffff,0,efbffd24) at > _ffs_indirtrunc+0x222 > _ffs_truncate(efbffdfc,f42b06a0,f4c08b80,efbffe70,0) at > _ffs_truncate+0x83c > _ufs_inactive(efbffe28,f42b0680,f01e34f4,f42b0680,efbffe48) at > _ufs_inactive+0xb1 > _vrele(f42b0680,0,f4c08b80,efbffe70,efbffe54) at _vrele+0xe7 > _vnode_pager_dealloc(f4c08b80,efbffe78,f01b24d0,f4c08b80,f4c08b80) at > _vnode_pager_dealloc+0x95 > _vm_pager_deallocate(f4c08b80) at _vm_pager_deallocate+0x16 > _vm_object_terminate(f4c08b80,f42b0680,0,f43cd400,efbffea8) at > _vm_object_terminate+0x154 > _vm_object_deallocate(f4c08b80,f42b0680,f43cd400,1ef0,efbffec0) at > _vm_object_deallocate+0x19f > _vrele(f42b0680,f4930e00,f01e3578,f42b0680,efbffedc) at _vrele+0x30 > _vput(f42b0680) at _vput+0x2f > _ufs_remove(efbffef4,f01e4500,f41f4a00,0,f01e3354) at _ufs_remove+0x70 > _unlink(f41f4a00,efbfff94,efbfff84,55a8,efbfddf8) at _unlink+0xb1 > 
_syscall(27,27,efbfde0e,efbfddf8,efbfdce8) at _syscall+0x183 > _Xsyscall() at _Xsyscall+0x35 > --- syscall 0xa,eip = 0x8073ff1, esp = 0xefbfdc78, ebp = 0xefbfdce8 > From owner-freebsd-scsi Wed Apr 23 10:12:39 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id KAA10739 for freebsd-scsi-outgoing; Wed, 23 Apr 1997 10:12:39 -0700 (PDT) Received: from dg-rtp.dg.com (dg-rtp.rtp.dg.com [128.222.1.2]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id KAA10718 for ; Wed, 23 Apr 1997 10:12:29 -0700 (PDT) Received: by dg-rtp.dg.com (5.4R3.10/dg-rtp-v02) id AA26884; Wed, 23 Apr 1997 12:50:02 -0400 Received: from ponds by dg-rtp.dg.com.rtp.dg.com; Wed, 23 Apr 1997 12:50 EDT Received: from lakes.water.net (lakes [10.0.0.3]) by ponds.water.net (8.8.3/8.7.3) with ESMTP id HAA00315; Wed, 23 Apr 1997 07:28:38 -0400 (EDT) Received: (from rivers@localhost) by lakes.water.net (8.8.3/8.6.9) id HAA06374; Wed, 23 Apr 1997 07:35:08 -0400 (EDT) Date: Wed, 23 Apr 1997 07:35:08 -0400 (EDT) From: Thomas David Rivers Message-Id: <199704231135.HAA06374@lakes.water.net> To: ponds!calweb.com!cslye, ponds!freebsd.org!freebsd-hackers, ponds!freebsd.org!freebsd-scsi Subject: Re: Kernel panic's Content-Type: text Sender: owner-freebsd-scsi@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk > > Working on a machine here that I have problems with for over a month and a > half now. Short rundown of machine, p133 on a ASUS p55t2p4 m/b with 128mb, > 2 2940UW's and a SMC dual port Digital 21040 card. One 2940 has a Quantum > XP31070W (root drive) and the other 2940 has a 9gig IBM oem. I am using the > RELENG_2_2 from 04/21/97 for the panic's I am quoting today. This box is a > news feeder system running innd1.5.1 (with mmap at the moment, about to > recompile without mmap) Anyways, any ideas would be great. I have the dump > files, and can get you any other info you need. Anyways the ddb trace etc.. > > The first one I dont have trace info from. 
> > dev = 0x400, block = 4505, fs = / > panic: freeing free block > debugger("panic") > stopped at _Debugger+0x25: movb $0_in_Debugger,110 > > --- > > dev = 0x2040c, block = 19976, fs = /news/spool > panic: blkfree freeing free block > debugger: panic > Just to show how similar this is to the panics I get on my (2.1.7.1) news server: ponds# gdb -k kernel.29 vmcore.29 GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.13 (i386-unknown-freebsd), Copyright 1994 Free Software Foundation, Inc...(no debugging symbols found)... IdlePTD 21c000 current pcb at 20ded8 panic: ifree: freeing free inode #0 0xf01a26ff in boot () (kgdb) where #0 0xf01a26ff in boot () #1 0xf0114413 in panic () #2 0xf0183327 in ffs_vfree () #3 0xf01885b2 in ufs_inactive () #4 0xf012a589 in vrele () #5 0xf012a4eb in vput () #6 0xf018bae4 in ufs_remove () #7 0xf012c58e in unlink () #8 0xf01aaa76 in syscall () #9 0xf019febb in Xsyscall () #10 0x2d9a in ?? () #11 0x2b2a in ?? () #12 0x2507 in ?? () #13 0x19b9 in ?? () #14 0x10d3 in ?? () (kgdb) By the way, this particular machine is an IDE machine - so I'm not sure SCSI has too much to do with the problem. 
I have a reproduction of what is potentially the problem on a small machine; I've been trying to nail this for a _long_ time :-) - Dave Rivers - From owner-freebsd-scsi Wed Apr 23 11:21:36 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id LAA14704 for freebsd-scsi-outgoing; Wed, 23 Apr 1997 11:21:36 -0700 (PDT) Received: from sax.sax.de (sax.sax.de [193.175.26.33]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id LAA14683 for ; Wed, 23 Apr 1997 11:21:29 -0700 (PDT) Received: (from uucp@localhost) by sax.sax.de (8.6.12/8.6.12-s1) with UUCP id UAA00860 for freebsd-scsi@freebsd.org; Wed, 23 Apr 1997 20:21:27 +0200 Received: (from j@localhost) by uriah.heep.sax.de (8.8.5/8.8.5) id UAA03499; Wed, 23 Apr 1997 20:18:17 +0200 (MET DST) Message-ID: <19970423201817.QT17503@uriah.heep.sax.de> Date: Wed, 23 Apr 1997 20:18:17 +0200 From: j@uriah.heep.sax.de (J Wunsch) To: freebsd-scsi@freebsd.org Subject: Re: NCR-810a & old Micropolis MP1624 problems References: <19970423003951.ZU05379@uriah.heep.sax.de> X-Mailer: Mutt 0.60_p2-3,5,8-9 Mime-Version: 1.0 X-Phone: +49-351-2012 669 X-PGP-Fingerprint: DC 47 E6 E4 FF A6 E9 8F 93 21 E0 7D F9 12 D6 4E Reply-To: joerg_wunsch@uriah.heep.sax.de (Joerg Wunsch) In-Reply-To: ; from Hellmuth Michaelis on Apr 23, 1997 10:22:51 +0200 Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk As Hellmuth Michaelis wrote: > This seems to be a common "feature" of some drives (or a bug in the driver) > since i made the very same experience under 2.1.7 and a recent HP disk. One of the HP drives (C2524?) is a known rogue already. HP claims that they would work with tagged commands, but Stefan Esser once reported to me that they also fail on other, non-FreeBSD systems. It's probably time to implement the SD_Q_NO_TAGS quirk. It's already documented in sd(9). 
:) -- cheers, J"org joerg_wunsch@uriah.heep.sax.de -- http://www.sax.de/~joerg/ -- NIC: JW11-RIPE Never trust an operating system you don't have sources for. ;-) From owner-freebsd-scsi Wed Apr 23 19:13:56 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id TAA09106 for freebsd-scsi-outgoing; Wed, 23 Apr 1997 19:13:56 -0700 (PDT) Received: from dilbert.iagnet.net (root@dilbert.iagnet.net [207.206.8.155]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id TAA09101; Wed, 23 Apr 1997 19:13:54 -0700 (PDT) Received: (from jamie@localhost) by dilbert.iagnet.net (8.8.5/8.8.5) id WAA15041; Wed, 23 Apr 1997 22:13:53 -0400 (EDT) Message-Id: <199704240213.WAA15041@dilbert.iagnet.net> Subject: Repeated Crashes, news server - SCSI Probs? To: freebsd-questions@freebsd.org, freebsd-isp@freebsd.org, freebsd-scsi@freebsd.org Date: Wed, 23 Apr 1997 22:13:53 -0400 (EDT) RFC_Violation: You saw it here first! From: jamie@dilbert.iagnet.net (Jamie Rishaw) Reply-To: jamie@dilbert.iagnet.net Organization: Internet Access Group X-No-Archive: yes X-Face: >:-p X-Mailer: ELM [version 2.4ME+ PL22 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, I've been having a problem with one of the FreeBSD machines on our network. It's the news server.. Sorry for the crosspost, btw. Up until today it was a Cyrix 6x86 with 128M RAM, yada yada.. today's config is: - P-Pro 200, TYAN mb, 192Mb RAM - FreeBSD 2.1.7-RELEASE - Adaptec AHA-2940 Ultra/Ultra W BIOS v1.21 - 2x9GB ST410800W (News spool) - Generic 500Mb, Generic 600Mb quantum HD (IDE) (System) - 2x3COM 3C590 Etherlink III PCI rev 0 int a It seems that the server doesn't like to stay online for more than a day or so before rebooting. This has been the symptom since the day I rescued the poor server from a life of servitude and slavery running an evil Micros*** product. 
No worry, I formatted the disks thrice. :-) I'm looking at the messages file (I have them locally and across the network) and this is the last thing I see before a reboot: Apr 21 17:11:00 iagnews /kernel: sd1(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0 Apr 21 17:11:00 iagnews /kernel: SEQADDR == 0x8 (reboot) Apr 23 13:21:07 iagnews /kernel: sd1(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0 Apr 23 13:21:07 iagnews /kernel: SEQADDR == 0xd (reboot) Apr 23 17:12:13 iagnews /kernel: sd1(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0 Apr 23 17:12:13 iagnews /kernel: SEQADDR == 0xc (reboot) Apr 23 19:32:56 iagnews /kernel: sd1(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0 Apr 23 19:32:56 iagnews /kernel: SEQADDR == 0xc (reboot) Here's a dmesg: -- snip -- FreeBSD 2.1.7-RELEASE #14: Tue Apr 22 21:12:19 EDT 1997 jamie@iagnews.iagnet.net:/usr/src/sys/compile/GAV CPU: 199-MHz unknown (Pentium-class CPU) Origin = "GenuineIntel" Id = 0x619 Stepping=9 Features=0xf9ff,MTRR,PGE,MCA,CMOV> real memory = 201326592 (196608K bytes) Physical memory hole(s): avail memory = 192729088 (188212K bytes) Probing for devices on PCI bus 0: chip0 rev 2 on pci0:0 chip1 rev 1 on pci0:7:0 chip2 rev 0 on pci0:7:1 vx0 <3COM 3C590 Etherlink III PCI> rev 0 int a irq 9 on pci0:11 utp[*utp*] address 00:a0:24:de:6e:23 vx1 <3COM 3C590 Etherlink III PCI> rev 0 int a irq 10 on pci0:12 utp[*utp*]: disable 'auto select' with DOS util! 
address 00:a0:24:de:92:78 ahc0 rev 0 int a irq 11 on pci0:13 ahc0: aic7880 Wide Channel, SCSI Id=7, 16 SCBs (ahc0:0:0): "SEAGATE ST410800W 0006" type 0 fixed SCSI 2 sd0(ahc0:0:0): Direct-Access 8669MB (17755614 512 byte sectors) (ahc0:2:0): "SEAGATE ST410800W 0003" type 0 fixed SCSI 2 sd1(ahc0:2:0): Direct-Access 8669MB (17755614 512 byte sectors) Probing for devices on the ISA bus: sc0 at 0x60-0x6f irq 1 on motherboard sc0: VGA color <16 virtual consoles, flags=0x0> sio0 at 0x3f8-0x3ff irq 4 on isa sio0: type 16550A sio1 at 0x2f8-0x2ff irq 3 on isa sio1: type 16550A wdc0 at 0x1f0-0x1f7 irq 14 on isa wdc0: unit 0 (wd0): wd0: 601MB (1232784 sectors), 1223 cyls, 16 heads, 63 S/T, 512 B/S wdc0: unit 1 (wd1): wd1: 516MB (1057280 sectors), 1120 cyls, 16 heads, 59 S/T, 512 B/S bt0 not found at 0x330 npx0 on motherboard npx0: INT 16 interface WARNING: / was not properly dismounted. -- snip -- bt0's there because I was going to replace the adaptec with a buslogic, but then realized that the cabling was wrong.. :/ Anyhow.. I've seen a lot of talk on this, but no real answers.. Are adaptec 29xx's Satan-Spawn? TIA all, -jamie -- jamie g.k. rishaw Internet Access Group Chance favors the prepared mind. 
__ [http://www.iagnet.net] DID:216.902.5455 FAX:216.623.3566 \/ 800:800.637.4IAGx5455 From owner-freebsd-scsi Wed Apr 23 21:42:38 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id VAA15132 for freebsd-scsi-outgoing; Wed, 23 Apr 1997 21:42:38 -0700 (PDT) Received: from milehigh.denver.net (jdc@milehigh.denver.net [204.144.180.2]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id VAA15093; Wed, 23 Apr 1997 21:42:30 -0700 (PDT) Received: (from jdc@localhost) by milehigh.denver.net (8.6.12/8.6.12) id WAA05508; Wed, 23 Apr 1997 22:45:05 -0600 Date: Wed, 23 Apr 1997 22:45:05 -0600 (MDT) From: John-David Childs To: Jamie Rishaw cc: freebsd-questions@freebsd.org, freebsd-isp@freebsd.org, freebsd-scsi@freebsd.org Subject: Re: Repeated Crashes, news server - SCSI Probs? In-Reply-To: <199704240213.WAA15041@dilbert.iagnet.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk I had the same problem with the Adaptec 2940 UW Controller. Jordan and others suggested upgrading to 2.2-REL-ENGINEERING (however that's actually spelled ;). It does seem to have helped SIGNIFICANTLY! I ended up booting from a 2.2-RELEASE-ENG floppy and starting all over, but it was worth it to get rid of the random reboots. -- On Wed, 23 Apr 1997, Jamie Rishaw wrote: > Hi, > > I've been having a problem with one of the FreeBSD machines on our > network. It's the news server.. > > Sorry for the crosspost, btw. > > Up until today it was a Cyrix 6x86 with 128M RAM, yada yada.. today's config > is: > > - P-Pro 200, TYAN mb, 192Mb RAM > - FreeBSD 2.1.7-RELEASE > - Adaptec AHA-2940 Ultra/Ultra W BIOS v1.21 > - 2x9GB ST410800W (News spool) > - Generic 500Mb, Generic 600Mb quantum HD (IDE) (System) > - 2x3COM 3C590 Etherlink III PCI rev 0 int a > > It seems that the server doesn't like to stay online for more than a day > or so before rebooting. 
This has been the symptom since the day I rescued the > poor server from a life of servitude and slavery running an evil Micros*** > product. No worry, I formatted the disks thrice. :-) > > I'm looking at the messages file (I have them locally and across the network) > and this is the last thing I see before a reboot: > > Apr 21 17:11:00 iagnews /kernel: sd1(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0 > Apr 21 17:11:00 iagnews /kernel: SEQADDR == 0x8 > (reboot) > Apr 23 13:21:07 iagnews /kernel: sd1(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0 > Apr 23 13:21:07 iagnews /kernel: SEQADDR == 0xd > (reboot) > Apr 23 17:12:13 iagnews /kernel: sd1(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0 > Apr 23 17:12:13 iagnews /kernel: SEQADDR == 0xc > (reboot) > Apr 23 19:32:56 iagnews /kernel: sd1(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0 > Apr 23 19:32:56 iagnews /kernel: SEQADDR == 0xc > John-David Childs (JC612) http://www.denver.net System Administrator jdc@denver.net & Network Engineer Think, Listen, Look, then ACT! "A verbal contract isn't worth the paper it's written on" - Louis B Mayer From owner-freebsd-scsi Wed Apr 23 22:37:14 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id WAA17180 for freebsd-scsi-outgoing; Wed, 23 Apr 1997 22:37:14 -0700 (PDT) Received: from sendero.i-connect.net (sendero-ppp.i-Connect.Net [206.190.143.100]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id WAA17147; Wed, 23 Apr 1997 22:36:58 -0700 (PDT) Received: (from shimon@localhost) by sendero.i-connect.net (8.8.5/8.8.5) id WAA03496; Wed, 23 Apr 1997 22:36:12 -0700 (PDT) Message-ID: X-Mailer: XFMail 1.1-alpha [p0] on FreeBSD X-PRIORITY: 2 (High) Priority: urgent Content-Type: text/plain; charset=iso-8859-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 Date: Wed, 23 Apr 1997 22:25:18 -0700 (PDT) Organization: iConnect Corp. 
From: Simon Shapiro To: freebsd-scsi@freebsd.org, freebsd-bugs@freebsd.org Subject: Panic in sys/scsi/scsiconf.c - Please Help... Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk I think I posted this before, but now I am stuck, and with no answers so far. So here it is again: I am calling scsi_attachdevs() from a device driver and getting PANIC: extend_set: entry 1 already has storage panic: scsi-attachdevs: malloc. Upon close examination one sees the lines: if(scbus == 0 || scbus->sc_link == 0 || extend_set(scbusses, scsibus, scbus) == 0) { panic("scsi_attachdevs: malloc"); ... When one examines the extend_set error message one sees quickly that it returns zero (NULL) when it discovers that the storage being extended is already extended. I am a bit confused about this: if storage is already allocated, why would extend_set try to extend it before checking? Also, why would it return ZERO if there IS storage? BTW, this happens only on the 176th device on the SCSI bus, so it is a bit difficult to see on most systems. I have disabled the return 0 in extend_set for now, but really need someone who understands this code to tell me which is the proper way of handling it. Thanx, Simon us->sc_link == 0 || extend_set(scbusses, scsibus, scbus) == 0) { panic("scsi_attachdevs: malloc"); ... When one examines the extend_set error message one sees quickly that it returns zero (NULL) when
When one examines the extend_set error message one sees quickly that it returns zero (NULL) when From owner-freebsd-scsi Thu Apr 24 02:31:00 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id CAA26765 for freebsd-scsi-outgoing; Thu, 24 Apr 1997 02:31:00 -0700 (PDT) Received: from hda.hda.com (hda-bicnet.bicnet.net [207.198.1.121]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id CAA26755 for ; Thu, 24 Apr 1997 02:30:54 -0700 (PDT) Received: (from dufault@localhost) by hda.hda.com (8.8.5/8.8.5) id FAA02943; Thu, 24 Apr 1997 05:17:48 -0400 (EDT) From: Peter Dufault Message-Id: <199704240917.FAA02943@hda.hda.com> Subject: Re: Panic in sys/scsi/scsiconf.c - Please Help... In-Reply-To: from Simon Shapiro at "Apr 23, 97 10:25:18 pm" To: Shimon@i-Connect.Net (Simon Shapiro) Date: Thu, 24 Apr 1997 05:17:48 -0400 (EDT) Cc: scsi@freebsd.org X-Mailer: ELM [version 2.4ME+ PL25 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > I am calling scsi_attachdevs() from a device driver and getting PANIC: > > extend_set: entry 1 already has storage > panic: scsi-attachdevs: malloc. > > Upon close examination one sees the lines: > > if(scbus == 0 || scbus->sc_link == 0 > || extend_set(scbusses, scsibus, scbus) == 0) { > panic("scsi_attachdevs: malloc"); > ... > (That first "scbus == 0" isn't doing anything since it was just dereferenced.) Extend_set is allocating pointers in chunks, and reallocating and moving them when it runs out of pointer space. Then the array can be indexed at run time instead of walking a data structure. extend_set is panicking because it requires: 1. That the index (scsibus) not have already been set; 2. That newly allocated storage be zeroed. We're probably violating rule 1. I'm surprised you're on your 176th disk and just now getting to bus 1.
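To make the two rules concrete, here is a minimal user-space sketch of that kind of chunk-grown pointer table. It is only an illustration under stated assumptions, not the actual extend_set() code in sys/scsi: the names ptr_table, table_grow, table_set, table_get and the CHUNK size are all hypothetical.

```c
#include <stdlib.h>

#define CHUNK 8  /* hypothetical growth increment, not the kernel's value */

/* Hypothetical stand-in for the kernel's extensible array. */
struct ptr_table {
    void **slots;   /* stored pointers; NULL means "not set" */
    size_t nslots;  /* current capacity */
};

/* Grow the table in CHUNK-sized steps until `index` fits.
 * Rule 2: every newly allocated slot is cleared to NULL. */
int table_grow(struct ptr_table *t, size_t index)
{
    size_t newn = ((index / CHUNK) + 1) * CHUNK;
    size_t i;
    void **p;

    if (newn <= t->nslots)
        return 1;                       /* already large enough */
    p = realloc(t->slots, newn * sizeof(*p));
    if (p == NULL)
        return 0;                       /* out of memory */
    for (i = t->nslots; i < newn; i++)
        p[i] = NULL;
    t->slots = p;
    t->nslots = newn;
    return 1;
}

/* Store `value` at `index`. Returns `value` on success, or NULL if the
 * table could not grow or (rule 1) the index has already been set. */
void *table_set(struct ptr_table *t, size_t index, void *value)
{
    if (!table_grow(t, index))
        return NULL;
    if (t->slots[index] != NULL)
        return NULL;                    /* "entry already has storage" */
    t->slots[index] = value;
    return value;
}

/* Direct indexed lookup; no data structure to walk. */
void *table_get(const struct ptr_table *t, size_t index)
{
    return index < t->nslots ? t->slots[index] : NULL;
}
```

In this sketch a second table_set() on an occupied index returns NULL, so a caller that treats NULL as an allocation failure would report a misleading "malloc" error, much like the panic string quoted above.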
Is there a chance you have two devices wired down to bus 1 and nothing complained? Try boot verbose, and next time include your config output. > us->sc_link == 0 > || extend_set(scbusses, scsibus, scbus) == 0) { > panic("scsi_attachdevs: malloc"); And as Justin asked, why do you always get repeated text at the end of your messages? Peter -- Peter Dufault (dufault@hda.com) Realtime Machine Control and Simulation HD Associates, Inc. Voice: 508 433 6936 From owner-freebsd-scsi Thu Apr 24 04:47:08 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id EAA02333 for freebsd-scsi-outgoing; Thu, 24 Apr 1997 04:47:08 -0700 (PDT) Received: from outland.cyberwar.com (root@outland.cyberwar.com [206.88.128.42]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id EAA02328 for ; Thu, 24 Apr 1997 04:47:04 -0700 (PDT) Received: from skippy.grunfelder.com (nj004z-193.cybernex.net [207.198.208.193]) by outland.cyberwar.com (8.8.5/8.8.5) with SMTP id HAA17019 for ; Thu, 24 Apr 1997 07:47:01 -0400 (EDT) Message-Id: <3.0.1.32.19970424074701.006a85e4@pop.cyberwar.com> X-Sender: wjgrun@pop.cyberwar.com X-Mailer: Windows Eudora Pro Version 3.0.1 (32) Date: Thu, 24 Apr 1997 07:47:01 -0400 To: freebsd-scsi@freebsd.org From: Bill Grunfelder Subject: Re: Repeated Crashes, news server - SCSI Probs? In-Reply-To: References: <199704240213.WAA15041@dilbert.iagnet.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk >I had the same problem with the Adaptec 2940 UW Controller. Jordan and >others suggested upgrading to 2.2-REL-ENGINEERING (however that's >actually spelled ;). It does seem to have helped SIGNIFICANTLY! >I ended up booting from a 2.2-RELEASE-ENG floppy and starting all over, >but it was worth it to get rid of the random reboots. Is there a simpler way for us to update the adaptec driver? 
If possible, I would rather just update the driver, rather than upgrade/redo the whole machine. Bill ....................................................................... Bill Grunfelder System Administrator wjgrun@cyberwar.com Cyber Warrior, Inc. http://www.cyberwar.com/~wjgrun/ (201) 703-1517 -The above does not necessarily coincide with the views of my employer- From owner-freebsd-scsi Thu Apr 24 08:11:46 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id IAA10742 for freebsd-scsi-outgoing; Thu, 24 Apr 1997 08:11:46 -0700 (PDT) Received: from indigo.ie (aoife.indigo.ie [194.125.133.9]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id IAA10737 for ; Thu, 24 Apr 1997 08:11:42 -0700 (PDT) Received: from indigo.ie (localhost [127.0.0.1]) by indigo.ie (8.8.5/8.8.5/INDIGO-HUB) with ESMTP id QAA05458; Thu, 24 Apr 1997 16:11:25 +0100 (BST) Message-Id: <199704241511.QAA05458@indigo.ie> To: "Justin T. Gibbs" Cc: freebsd-scsi@freebsd.org, judgea@indigo.ie, wilko@yedi.iaf.nl Subject: Re: Vendor specific ASCQ SCSI errors in 2.2-STABLE In-reply-to: Message from "Justin T. Gibbs" dated Monday at 10:10. From: Alan Judge Date: Thu, 24 Apr 1997 16:11:25 +0100 Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk [Replying to a few messages at once.] Justin> Definitely trim that cable. Just be sure to clean the wires Justin> after you make the cut as most are braided and tend to Justin> fray/short the other lines after you cut them. I delayed replying until I saw what shortening the cable did. The system made nearly two days before producing that odd error. I guess I'll have to try canister swapping. >>>>> Wilko Bulte writes: Wilko> Any chance you can try things without the canisters? A lot of work alright, and I'm not sure I want to do that on a production machine. Bear in mind that I've never seen the same problem on the other bus (same config --- 4 drives in canisters). Both busses are pretty busy (two CCD stripes). 
I guess it might be a broken canister or a dodgy ribbon cable. I'll have to do some swapping, since I can't think of any better way to narrow down which canister. What's the likelihood that the disk producing the error is in the faulty canister? Wilko> An active terminator on the end of the bus instead of a Wilko> drive-internal terminator in a canister is also worthwhile. Couldn't easily get ribbon cables with the correct number of connectors for this. In any case, the Quantum Atlas II claims to do active termination (controlled by a single jumper rather than a resistor pack). Wilko> How much stublength does each canister introduce? Quite a bit. Maybe 20cm round trip. Wilko> Are these canisters designed with Ultra SCSI speeds in mind? The booklet doesn't mention Ultra, so I'd guess no. -- Alan Judge Phone: +353-1-6046901 Indigo Internet Services Fax: +353-1-6046948 From owner-freebsd-scsi Thu Apr 24 08:26:34 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id IAA11479 for freebsd-scsi-outgoing; Thu, 24 Apr 1997 08:26:34 -0700 (PDT) Received: from pluto.plutotech.com (root@pluto100.plutotech.com [206.168.67.137]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id IAA11472; Thu, 24 Apr 1997 08:26:32 -0700 (PDT) Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130]) by pluto.plutotech.com (8.8.5/8.8.3) with ESMTP id JAA01891; Thu, 24 Apr 1997 09:26:17 -0600 (MDT) Message-Id: <199704241526.JAA01891@pluto.plutotech.com> X-Mailer: exmh version 2.0beta 12/23/96 To: jamie@dilbert.iagnet.net cc: freebsd-questions@freebsd.org, freebsd-isp@freebsd.org, freebsd-scsi@freebsd.org Subject: Re: Repeated Crashes, news server - SCSI Probs? In-reply-to: Your message of "Wed, 23 Apr 1997 22:13:53 EDT." <199704240213.WAA15041@dilbert.iagnet.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 24 Apr 1997 10:24:44 -0600 From: "Justin T.
Gibbs" Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk >I'm looking at the messages file (I have them locally and across the network) >and this is the last thing I see before a reboot: > >Apr 21 17:11:00 iagnews /kernel: sd1(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0 This is an old bug. You should run either 2.1-stable or 2.2-stable on this machine which has all of the latest bug fixes for the adaptec driver. >Anyhow.. I've seen a lot of talk on this, but no real answers.. Are adaptec >29xx's Satan-Spawn? I must have mentioned 10 times that people who see this problem need to get the latest driver, but perhaps you're not on those lists. >TIA all, > >-jamie >-- >jamie g.k. rishaw Internet Access Group >Chance favors the prepared mind. __ [http://www.iagnet.net] >DID:216.902.5455 FAX:216.623.3566 \/ 800:800.637.4IAGx5455 -- Justin T. Gibbs =========================================== FreeBSD: Turning PCs into workstations =========================================== From owner-freebsd-scsi Thu Apr 24 08:42:47 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id IAA12347 for freebsd-scsi-outgoing; Thu, 24 Apr 1997 08:42:47 -0700 (PDT) Received: from indigo.ie (aoife.indigo.ie [194.125.133.9]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id IAA12342 for ; Thu, 24 Apr 1997 08:42:44 -0700 (PDT) Received: from indigo.ie (localhost [127.0.0.1]) by indigo.ie (8.8.5/8.8.5/INDIGO-HUB) with ESMTP id QAA11526 for ; Thu, 24 Apr 1997 16:42:39 +0100 (BST) Message-Id: <199704241542.QAA11526@indigo.ie> To: freebsd-scsi@freebsd.org Subject: Target busy and other errors From: Alan Judge Date: Thu, 24 Apr 1997 16:42:39 +0100 Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk [Fun day. This is a different machine.] I upgraded another machine here to 2.2-STABLE this morning. It was running 2.2-GAMMA and was giving the odd SCSI error.
The machine only stayed up a few hours before hanging completely. Symptoms were a burst of "sd0(ahc0:0:0): Target Busy" errors then a burst of other stuff about SCBs and so on, before a panic on freeing a free inode and a hang (with more SCB stuff) after the syncing disk message. (Sorry I don't have more info, but the machine was important and was rebooted before I got a chance to write things down.) In the past under 2.2-GAMMA, I've seen other errors, like overlapped command errors (but usually after a target busy), and timeouts of the sort that 2.2-STABLE is supposed to fix. The errors are always on sd0, so I'm wondering if it's specific to the disk or controller. sd0 is an internal narrow Barracuda on its own on a motherboard controller in an HP Vectra XU. External terminator. FreeBSD identifies: ahc0 rev 0 int a irq 9 on pci0:2 ahc0: Using left over BIOS settings aic7880 Single Channel, SCSI Id=7, 16/255 SCBs There are also two 2940UWs and a bunch of external Atlas II-UWs that give no problems that I've seen. But sd0 probably sees heavier bursts of activity. Backups appear to be a particular cause of lockups. I've moved back to 2.2-GAMMA for the moment, as it's never locked up in the same way, though the errors sometimes cause paging errors and process crashes.
-- Alan Judge Phone: +353-1-6046901 Indigo Internet Services Fax: +353-1-6046948 From owner-freebsd-scsi Thu Apr 24 09:10:26 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id JAA13719 for freebsd-scsi-outgoing; Thu, 24 Apr 1997 09:10:26 -0700 (PDT) Received: from pluto.plutotech.com (root@pluto100.plutotech.com [206.168.67.137]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id JAA13714 for ; Thu, 24 Apr 1997 09:10:24 -0700 (PDT) Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130]) by pluto.plutotech.com (8.8.5/8.8.3) with ESMTP id KAA02656; Thu, 24 Apr 1997 10:10:05 -0600 (MDT) Message-Id: <199704241610.KAA02656@pluto.plutotech.com> X-Mailer: exmh version 2.0beta 12/23/96 To: Bill Grunfelder cc: freebsd-scsi@freebsd.org Subject: Re: Repeated Crashes, news server - SCSI Probs? In-reply-to: Your message of "Thu, 24 Apr 1997 07:47:01 EDT." <3.0.1.32.19970424074701.006a85e4@pop.cyberwar.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 24 Apr 1997 11:08:32 -0600 From: "Justin T. Gibbs" Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk >>I had the same problem with the Adaptec 2940 UW Controller. Jordan and >>others suggested upgrading to 2.2-REL-ENGINEERING (however that's >>actually spelled ;). It does seem to have helped SIGNIFICANTLY! >>I ended up booting from a 2.2-RELEASE-ENG floppy and starting all over, >>but it was worth it to get rid of the random reboots. > >Is there a simpler way for us to update the adaptec driver? If possible, >I would rather just update the driver, rather than upgrade/redo the whole >machine. Use CVSup or CTM to update your kernel sources, rebuild the kernel, and reboot. >Bill >....................................................................... >Bill Grunfelder System Administrator >wjgrun@cyberwar.com Cyber Warrior, Inc. 
>http://www.cyberwar.com/~wjgrun/ (201) 703-1517 >-The above does not necessarily coincide with the views of my employer- > -- Justin T. Gibbs =========================================== FreeBSD: Turning PCs into workstations =========================================== From owner-freebsd-scsi Thu Apr 24 09:57:21 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id JAA16821 for freebsd-scsi-outgoing; Thu, 24 Apr 1997 09:57:21 -0700 (PDT) Received: from pluto.plutotech.com (root@pluto100.plutotech.com [206.168.67.137]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id JAA16815 for ; Thu, 24 Apr 1997 09:57:18 -0700 (PDT) Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130]) by pluto.plutotech.com (8.8.5/8.8.3) with ESMTP id KAA03503; Thu, 24 Apr 1997 10:57:15 -0600 (MDT) Message-Id: <199704241657.KAA03503@pluto.plutotech.com> X-Mailer: exmh version 2.0beta 12/23/96 To: Alan Judge cc: freebsd-scsi@FreeBSD.ORG Subject: Re: Target busy and other errors In-reply-to: Your message of "Thu, 24 Apr 1997 16:42:39 BST." <199704241542.QAA11526@indigo.ie> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 24 Apr 1997 11:55:48 -0600 From: "Justin T. Gibbs" Sender: owner-freebsd-scsi@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk >[Fun day. This is a different machine.] > >I upgraded another machine here to 2.2-STABLE this morning. It was >running 2.2-GAMMA and was giving the odd SCSI error. The machine only >stayed up a few hours before hanging completely. Symptoms were a >burst of "sd0(ahc0:0:0): Target Busy" errors then a burst of other >stuff about SCBs and so on, before a panic on freeing a free inode and >a hang (with more SCB stuff) after the syncing disk message. (Sorry I >don't have more info, but the machine was important and was rebooted >before I got a chance to write things down.) I just committed one bug fix that might affect you since you are using SCB paging. 
I'm also working with JDP on what I believe is a very similar problem to what you're seeing with the target busy errors. I have a hunch on it, and once I have a solution, you'll see the commit. -- Justin T. Gibbs =========================================== FreeBSD: Turning PCs into workstations =========================================== From owner-freebsd-scsi Thu Apr 24 10:07:34 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id KAA17679 for freebsd-scsi-outgoing; Thu, 24 Apr 1997 10:07:34 -0700 (PDT) Received: from mail.id.net (mail.id.net [199.125.1.6]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id KAA17635; Thu, 24 Apr 1997 10:07:26 -0700 (PDT) Received: from server.id.net (server.id.net [199.125.2.20]) by mail.id.net (8.8.5/8.8.5) with ESMTP id NAA00590; Thu, 24 Apr 1997 13:07:23 -0400 (EDT) From: Robert Shady Received: (from rls@localhost) by server.id.net (8.8.5/8.7.3) id NAA00300; Thu, 24 Apr 1997 13:07:22 -0400 (EDT) Message-Id: <199704241707.NAA00300@server.id.net> Subject: Re: Repeated Crashes, news server - SCSI Probs? 
In-Reply-To: from John-David Childs at "Apr 23, 97 10:45:05 pm" To: jdc@denver.net (John-David Childs) Date: Thu, 24 Apr 1997 13:07:22 -0400 (EDT) Cc: jamie@dilbert.iagnet.net, freebsd-questions@freebsd.org, freebsd-isp@freebsd.org, freebsd-scsi@freebsd.org X-Mailer: ELM [version 2.4ME+ PL25 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > > > > Apr 21 17:11:00 iagnews /kernel: sd1(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0 > > Apr 21 17:11:00 iagnews /kernel: SEQADDR == 0x8 > > (reboot) > > Apr 23 13:21:07 iagnews /kernel: sd1(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0 > > Apr 23 13:21:07 iagnews /kernel: SEQADDR == 0xd > > (reboot) > > Apr 23 17:12:13 iagnews /kernel: sd1(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0 > > Apr 23 17:12:13 iagnews /kernel: SEQADDR == 0xc > > (reboot) > > Apr 23 19:32:56 iagnews /kernel: sd1(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0 > > Apr 23 19:32:56 iagnews /kernel: SEQADDR == 0xc Hmmm.. I've had the same problem when I run my tape backup drive on my 2940UW.. I'll have to check into that. It's a new tape drive, so I was just figuring it was probably me.. 
:) -- Rob === _/_/_/_/_/ _/_/_/_/ _/_/ _/ _/_/_/_/_/ _/_/_/_/_/ _/ _/ _/ _/_/_/ _/ _/ _/ _/_/_/_/ _/ _/_/_/_/_/ _/_/_/_/ _/ _/ _/_/_/_/_/ _/ Innovative Data Services Serving South-Eastern Michigan Internet Service Provider / Hardware Sales / Consulting Services Voice: (810)855-0404 / Fax: (810)855-3268 / Web: http://www.id.net From owner-freebsd-scsi Thu Apr 24 10:43:17 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id KAA20145 for freebsd-scsi-outgoing; Thu, 24 Apr 1997 10:43:17 -0700 (PDT) Received: from mail.id.net (mail.id.net [199.125.1.6]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id KAA20104; Thu, 24 Apr 1997 10:43:04 -0700 (PDT) Received: from server.id.net (server.id.net [199.125.2.20]) by mail.id.net (8.8.5/8.8.5) with ESMTP id NAA01267; Thu, 24 Apr 1997 13:43:03 -0400 (EDT) From: Robert Shady Received: (from rls@localhost) by server.id.net (8.8.5/8.7.3) id NAA00664; Thu, 24 Apr 1997 13:43:03 -0400 (EDT) Message-Id: <199704241743.NAA00664@server.id.net> Subject: Re: Repeated Crashes, news server - SCSI Probs? In-Reply-To: <199704241526.JAA01891@pluto.plutotech.com> from "Justin T. Gibbs" at "Apr 24, 97 10:24:44 am" To: gibbs@plutotech.com (Justin T. Gibbs) Date: Thu, 24 Apr 1997 13:43:03 -0400 (EDT) Cc: jamie@dilbert.iagnet.net, freebsd-questions@FreeBSD.ORG, freebsd-isp@FreeBSD.ORG, freebsd-scsi@FreeBSD.ORG X-Mailer: ELM [version 2.4ME+ PL25 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-scsi@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk > >I'm looking at the messages file (I have them locally and across the network) > >and this is the last thing I see before a reboot: > > > >Apr 21 17:11:00 iagnews /kernel: sd1(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0 > > This is an old bug. You should run either 2.1-stable or 2.2-stable on this > machine which has all of the latest bug fixes for the adaptec driver.
> > >Anyhow.. I've seen a lot of talk on this, but no real answers.. Are adaptec > >29xx's Satan-Spawn? > > I must have mentioned 10 times that people who see this problem need to get > the latest driver, but perhaps you're not on those lists. I saw your message, but I'm running 3.0-970209-SNAP, I figured it would be in there... :( -- Rob === _/_/_/_/_/ _/_/_/_/ _/_/ _/ _/_/_/_/_/ _/_/_/_/_/ _/ _/ _/ _/_/_/ _/ _/ _/ _/_/_/_/ _/ _/_/_/_/_/ _/_/_/_/ _/ _/ _/_/_/_/_/ _/ Innovative Data Services Serving South-Eastern Michigan Internet Service Provider / Hardware Sales / Consulting Services Voice: (810)855-0404 / Fax: (810)855-3268 / Web: http://www.id.net From owner-freebsd-scsi Thu Apr 24 13:22:01 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id NAA28632 for freebsd-scsi-outgoing; Thu, 24 Apr 1997 13:22:01 -0700 (PDT) Received: from pluto.plutotech.com (root@pluto100.plutotech.com [206.168.67.137]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id NAA28627; Thu, 24 Apr 1997 13:21:57 -0700 (PDT) Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130]) by pluto.plutotech.com (8.8.5/8.8.3) with ESMTP id OAA07141; Thu, 24 Apr 1997 14:21:45 -0600 (MDT) Message-Id: <199704242021.OAA07141@pluto.plutotech.com> X-Mailer: exmh version 2.0beta 12/23/96 To: Robert Shady cc: gibbs@plutotech.com (Justin T. Gibbs), jamie@dilbert.iagnet.net, freebsd-questions@FreeBSD.ORG, freebsd-isp@FreeBSD.ORG, freebsd-scsi@FreeBSD.ORG Subject: Re: Repeated Crashes, news server - SCSI Probs? In-reply-to: Your message of "Thu, 24 Apr 1997 13:43:03 EDT." <199704241743.NAA00664@server.id.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 24 Apr 1997 15:20:18 -0600 From: "Justin T. Gibbs" Sender: owner-freebsd-scsi@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk >> I must have mentioned 10 times that people who see this problem need to get >> the latest driver, but perhaps you're not on those lists. 
> >I saw your message, but I'm running 3.0-970209-SNAP, I figured it would be >in there... :( In something more than 2 months old? > -- Rob >=== > _/_/_/_/_/ _/_/_/_/ _/_/ _/ _/_/_/_/_/ _/_/_/_/_/ > _/ _/ _/ _/_/_/ _/ _/ _/ _/_/_/_/ _/ > _/_/_/_/_/ _/_/_/_/ _/ _/ _/_/_/_/_/ _/ > > Innovative Data Services > Serving South-Eastern Michigan > Internet Service Provider / Hardware Sales / Consulting Services > Voice: (810)855-0404 / Fax: (810)855-3268 / Web: http://www.id.net > -- Justin T. Gibbs =========================================== FreeBSD: Turning PCs into workstations =========================================== From owner-freebsd-scsi Thu Apr 24 15:39:29 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id PAA06402 for freebsd-scsi-outgoing; Thu, 24 Apr 1997 15:39:29 -0700 (PDT) Received: from iafnl.es.iaf.nl (root@iafnl.es.iaf.nl [195.108.17.20]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id PAA06394 for ; Thu, 24 Apr 1997 15:39:20 -0700 (PDT) Received: by iafnl.es.iaf.nl with UUCP id AA19238 (5.67b/IDA-1.5 for freebsd-scsi@freebsd.org); Thu, 24 Apr 1997 23:48:02 +0200 Received: (from wilko@localhost) by yedi.iaf.nl (8.7.5/8.6.12) id XAA01106; Thu, 24 Apr 1997 23:04:23 +0200 (MET DST) From: Wilko Bulte Message-Id: <199704242104.XAA01106@yedi.iaf.nl> Subject: Re: Vendor specific ASCQ SCSI errors in 2.2-STABLE To: Alan.Judge@indigo.ie (Alan Judge) Date: Thu, 24 Apr 1997 23:04:23 +0200 (MET DST) Cc: gibbs@plutotech.com, freebsd-scsi@freebsd.org, judgea@indigo.ie In-Reply-To: <199704241511.QAA05458@indigo.ie> from "Alan Judge" at Apr 24, 97 04:11:25 pm X-Mailer: ELM [version 2.4 PL24 ME8a] Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-scsi@freebsd.org X-Loop: FreeBSD.org Precedence: bulk As Alan Judge wrote... > > >>>>> Wilko Bulte writes: > Wilko> Any chance you can try things without the canisters? 
> > A lot of work alright, and I'm not sure I want to do that on a > production machine. Bear in mind that I've never seen the same > problem on the other bus (same config --- 4 drives in canisters). Same drives? > I'll have to do some swapping, since I can't think of any better way > to narrow down which canister. What's the likelihood that the > disk producing the error is in the faulty canister? Hmm. I would not know. > Wilko> How much stublength does each canister introduce? > > Quite a bit. Maybe 20cm round trip. Hmm. Long.. > Wilko> Are these canisters designed with Ultra SCSI speeds in mind? > > The booklet doesn't mention Ultra, so I'd guess no. Again: Hmmm. Ultra *is* sensitive to marginal busses. Wilko _ ____________________________________________________________________ | / o / / _ Bulte email: wilko@yedi.iaf.nl - Arnhem, The Netherlands |/|/ / / /( (_) Do, or do not. There is no 'try' - Yoda --------------------------------------------------------------------------