From owner-freebsd-hackers Tue Apr 1 06:28:36 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id GAA03807 for hackers-outgoing; Tue, 1 Apr 1997 06:28:36 -0800 (PST)
Received: from nevis.oss.uswest.net (nevis.oss.uswest.net [204.147.85.3]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id GAA03800 for ; Tue, 1 Apr 1997 06:28:33 -0800 (PST)
Received: (from greg@localhost) by nevis.oss.uswest.net (8.8.2/8.8.2) id IAA08940; Tue, 1 Apr 1997 08:28:28 -0600 (CST)
From: "Greg Rowe"
Message-Id: <9704010828.ZM8938@nevis.oss.uswest.net>
Date: Tue, 1 Apr 1997 08:28:28 -0600
In-Reply-To: Thomas David Rivers "aha2940 problems on 2.1.7.1." (Mar 31, 8:44pm)
References: <199704010144.UAA00322@lakes.water.net>
X-Mailer: Z-Mail (3.2.1 10oct95)
To: Thomas David Rivers
Subject: Re: aha2940 problems on 2.1.7.1.
Cc: hackers@freebsd.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: owner-hackers@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

You're not alone with this problem.... Although I'm seeing it when Amanda
first goes out and does its planner stuff with dump(8). The tape drive is
located on another system, so it's not drive related. I've also increased
the timeout values. My system lockup symptoms are also the same. Justin is
working on the problem.

Greg

On Mar 31, 8:44pm, Thomas David Rivers wrote:
> Subject: aha2940 problems on 2.1.7.1.
>
> Just to let everyone know what I've tried, in an attempt to diagnose
> the aha2940 problems in 2.1.7.1.
>
> I increased (to 20 minutes) all of the timeout parms in the scsi_scsi_cmd()
> calls in st.c; thinking that perhaps the write at the end of the tape
> was timing out too soon; causing my problems. [Recall, my problem is that
> I can write to a Wangtek 5150ES (QIC-150) and fill the tape up; which locks
> down my 2.1.7.1 system completely - this is a 2.1.6.1 system with a 2.1.7.1
> kernel.]
>
> However, this didn't help the problem, when the write that fills the tape
> up completes I get (on the console):
>
> sd0(ahc0:0:0): SCB 0x2 - timed out in command phase, SCSISIGI == 0x84
> SEQADDR == 0x42
> st0(ahc0:2:0): abort message in message buffer
>
> Note that ahc0:0:0 is my primary disk drive; apparently filling the
> tape up caused a SCSI bus reset which didn't do much for my primary disk
> drive. Also notice that I didn't get any I/O about the abort having
> completed... everything except ping'ing the machine has "gone south"
> at this point.
>
> Also, after a press-the-reset-button reboot (not a complete shutdown)
> I got as far as starting login and xdm when suddenly, I got:
>
>
> sd0(ahc0:0:0) SCB 0x0 - timed out in message out phase, SCSISIGI == 0xa4
> SEQADDR == 0x99
> sd0(ahc0:0:0): abort message in message buffer
> sd0(ahc0:0:0): SCB 0 - Abort Completed.
> panic: Couldn't find next SCB
>
> A turn-off-the-machine/cold reboot seems to have gotten me working
> again... (phew)
>
> However; this means two things:
>
> 1) 2.1.7.1's AHA2940 support has a problem.
>
> 2) My idea about needing longer timeouts in st.c has nothing
> to do with the problem.
>
>
> I really need to back stuff up before an upgrade to 2.2.1 - does anyone
> have any suggestions?
>
>
> - Dave Rivers -
>
> p.s. Here's the pertinent dmesg from the last boot; to give everyone
> an idea about the devices I have:
>
> FreeBSD 2.1.7.1-RELEASE #1: Mon Mar 31 19:38:48 EST 1997
> rivers@lakes.water.net:/usr/src/sys-2.1.7.1/compile/LAKES
> CPU: 133-MHz Pentium 735\90 or 815\100 (Pentium-class CPU)
> Origin = "GenuineIntel" Id = 0x52c Stepping=12
> Features=0x1bf
> real memory = 33554432 (32768K bytes)
> avail memory = 30457856 (29744K bytes)
> Probing for devices on PCI bus 0:
> chip0 rev 1 on pci0:0
> chip1 rev 1 on pci0:7:0
> chip2 rev 0 on pci0:7:1
> ahc0 rev 0 int a irq 15 on pci0:17
> ahc0: aic7880 Wide Channel, SCSI Id=7, 16 SCBs
> ahc0 waiting for scsi devices to settle
> (ahc0:0:0): "HP C3323-300 4242" type 0 fixed SCSI 2
> sd0(ahc0:0:0): Direct-Access 1003MB (2056008 512 byte sectors)
> (ahc0:1:0): "MICROP 1548-15MZ1077802 HZ2P" type 0 fixed SCSI 1
> sd1(ahc0:1:0): Direct-Access 1635MB (3349512 512 byte sectors)
> (ahc0:2:0): "WANGTEK 5150ES SCSI FA23 08" type 1 removable SCSI 1
> st0(ahc0:2:0): Sequential-Access drive offline
> (ahc0:3:0): "NEC CD-ROM DRIVE:400 1.0" type 5 removable SCSI 2
> cd0(ahc0:3:0): CD-ROM cd present.[217422 x 2048 byte records]
>-- End of excerpt from Thomas David Rivers

--
Greg Rowe | U S West - Interact Services | INTERNET greg@uswest.net
111 Washington Ave. South | Fax: (612) 672-8537
Minneapolis, MN USA 55401 | Voice: (612) 672-8535
Never trust an operating system you don't have source for....