Date: Mon, 13 Jul 1998 13:44:54 +0500 (PKT)
From: "Saad M. Waraich" <Saad.Waraich@Pakistan.NCR.COM>
To: stable@FreeBSD.ORG
Cc: leo@talcom.net, se@FreeBSD.ORG
Subject: Re: NCR 875 and tagged queing. Broken?
Message-ID: <199807130844.NAA11954@isb.ncr.com.pk>
In-Reply-To: <19980712103316.07090@mi.uni-koeln.de> from Stefan Esser at "Jul 12, 98 10:33:16 am"

The problem is a combination of the NCR driver and the Atlas III drive.
I have an 875 based card (Tekram 390F) and a 2 gig. Atlas III drive and
I've seen this problem a lot. Upgrading the drive's firmware didn't help
either.

Is it worth talking to Quantum about this problem? They could easily
shrug it off, saying that it is a problem in the driver.

--
Saad

Stefan Esser wrote:
> On 1998-06-27 21:43 -0400, Leo Papandreou <leo@talcom.net> wrote:
> >
> > 2.2-STABLE (cvsupped and built June 26)
> >
> > Twin channel NCR 875 adapter, Quantum Atlas III, FAILSAFE commented
> > out in kernel's configuration file.
> >
> > cp -RP dir1 dir2 (dir1 and dir2 on different partitions, same drive.)
> > produces lots of these messages:
> >
> > Jun 26 17:42:47 abou /kernel: assertion "cp" failed: file "../../pci/ncr.c", line 6191
> > Jun 26 17:42:48 abou /kernel: sd0(ncr0:6:0): COMMAND FAILED (4 28) @f14a1800.
>
> This is a result of too many simultaneous outstanding commands.
>
> The drive returns QUEUE_FULL status if it is asked to accept
> another (tagged) command, and the upper layer SCSI code will
> initiate several retries of that command.
>
> > I've seen recent reports of an identical problem. I'm not sure if it's
> > the hardware; the fact that these other reports are very recent makes
> > me suspect the hard drive is not at fault. I wish I had a spare AHA
> > around to test this suspicion but I do not. Also, although I realize
> > older Quantums cannot reliably do tagged queuing, this is an 18.2 Gig
> > Atlas III bought not 2 days ago. (Please let it not be the hardware.)
>
> It might be the firmware. Atlas drives have been known to show
> that effect for quite some time: They accept a huge number of
> tagged commands during normal operation, but suddenly decide to
> support only a few (during short intervals of resource exhaustion?)
>
> The generic SCSI code in FreeBSD 2.2.x and -current pre-dates use
> of tags in drivers, and can't really deal with QUEUE_FULL.
> The new CAM code (a new snapshot has been announced by Justin Gibbs
> recently) will understand QUEUE_FULL status to mean "throttle down".
> It will reduce the number of simultaneous commands sent to a drive,
> and will try to slowly raise that value again after things seem
> normal again.
>
> > This does not happen if the directories involved are small. This does
> > not happen when FAILSAFE is present. The problem certainly has something
> > to do with tagged queuing as has already been pointed out in a previous
> > msg. Without FAILSAFE, SCSI_NCR_DFLT_TAGS defaults to 4 but I've seen
> > at least 1 msg on this list where someone had set SCSI_NCR_DFLT_TAGS=8.
>
> You can use any number of tags between 0 and 16, but in my tests
> with several drives I found 8 tags to give best performance and
> 4 tags to give nearly identical performance with less system load.
> Justin Gibbs reported throughput improvements with much higher
> numbers of tags, but I could not reproduce them, either because I
> could not produce the same kind of load, or because the NCR driver
> uses linear lists in a few cases, which does not matter if there
> are a few entries in the list, but may do, if the list grows to
> tens or hundreds of entries.
>
> > Can anyone confirm or deny that the problem is related to recent (Jun 2?)
> > changes in the kernel?
>
> No, there have been none in that area, sorry.
>
> > Jun 26 18:01:15 abou /kernel: (ncr0:6:0): "QUANTUM QM318000TD-SW N1B0" type 0 fixed SCSI 2
>
> I do not know whether there is a problem with tags in that firmware
> release (N1B0). The problem existed in both the Atlas and Atlas II,
> but I do not know much about the Atlas III ...
>
> There should not be any data loss because of that situation. You may
> want to test the next snapshot release of Justin Gibbs' CAM code. It
> is much better tested with Adaptec cards, but I've been using a CAM
> system for several months with my NCR card and an old Quantum Atlas
> with no problems.
> (But the highest load is an occasional "make world"
> every one or two weeks :)
>
> Regards, STefan
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-stable" in the body of the message
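
The "throttle down" behaviour described in the quoted message for the new CAM code can be sketched roughly as follows. This is only an illustration of the idea (lower the per-device limit when the drive reports QUEUE_FULL, then raise it slowly after a run of clean completions). It is not the CAM implementation and not code from ncr.c; the structure, the function names and the raise_after parameter are all hypothetical.

```c
#include <assert.h>

/* Hypothetical per-device state; not a structure from CAM or ncr.c. */
struct tag_throttle {
	int min_tags;     /* never go below this many outstanding commands */
	int max_tags;     /* configured ceiling, e.g. the default tag count */
	int cur_tags;     /* current limit on outstanding tagged commands */
	int ok_count;     /* clean completions since the limit last changed */
	int raise_after;  /* clean completions required before raising by one */
};

void
throttle_init(struct tag_throttle *t, int min_tags, int max_tags,
    int raise_after)
{
	assert(min_tags >= 1 && max_tags >= min_tags && raise_after >= 1);
	t->min_tags = min_tags;
	t->max_tags = max_tags;
	t->cur_tags = max_tags;
	t->ok_count = 0;
	t->raise_after = raise_after;
}

/* May another command be sent while `outstanding` are already in flight? */
int
throttle_may_send(const struct tag_throttle *t, int outstanding)
{
	return (outstanding < t->cur_tags);
}

/*
 * The drive refused a command with QUEUE_FULL while it was holding
 * `accepted` commands: treat that count as the new limit, but never
 * raise the limit here and never drop below min_tags.
 */
void
throttle_queue_full(struct tag_throttle *t, int accepted)
{
	int limit = accepted;

	if (limit > t->cur_tags)
		limit = t->cur_tags;
	if (limit < t->min_tags)
		limit = t->min_tags;
	t->cur_tags = limit;
	t->ok_count = 0;
}

/* A command completed normally: after enough of these, raise by one. */
void
throttle_cmd_done(struct tag_throttle *t)
{
	if (t->cur_tags >= t->max_tags)
		return;
	if (++t->ok_count >= t->raise_after) {
		t->cur_tags++;
		t->ok_count = 0;
	}
}
```

With max_tags set to 8 and raise_after set to 4, a QUEUE_FULL seen while the drive holds 3 commands drops the limit to 3, and four clean completions later it becomes 4. Raising by one step at a time is a deliberate choice in this sketch, so that a drive which only briefly ran out of resources is not flooded again at once.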