From owner-freebsd-questions Tue Apr 8 19:14:39 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id TAA19379 for questions-outgoing; Tue, 8 Apr 1997 19:14:39 -0700 (PDT)
Received: from vicor-nb.com (ftp.vicor-nb.com [208.206.78.223]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id TAA19373 for ; Tue, 8 Apr 1997 19:14:36 -0700 (PDT)
Received: from dns.vicor-nb.com (madden.vicor-nb.com [208.206.78.38]) by vicor-nb.com (8.7.5/8.7.3) with SMTP id SAA19932; Tue, 8 Apr 1997 18:04:57 GMT
Message-ID: <334AFA8A.3010@vicor-nb.com>
Date: Tue, 08 Apr 1997 19:10:18 -0700
From: Cayford Burrell
Reply-To: cayford@vicor-nb.com
Organization: Vicor, Inc
X-Mailer: Mozilla 3.01 (Win95; I)
MIME-Version: 1.0
To: questions@freebsd.org
CC: julian@whistle.com, phk@tfs.com
Subject: SCSI problems
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-questions@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi there. We're having trouble with SCSI and need some help.

The console shows "timeout while idle", along with a bunch of variables like SCSI_SIGI, LASTPHASE, and so on. Very often the CPU freezes; sometimes the system recovers. I can easily reproduce this.

I heard from Poul-Henning that 2.2 had problems like this and that they were solved in 2.2.1. Well, I still have problems. 2.2.1 is better, but under load (6 processes reading a series of 200k files) it will crash after 5-10 minutes, or sometimes after 2-3 hours.

Specifics: P5/133, Adaptec 2940UW, Seek RAID disk configured as one SCSI target with 39 GB, AHC_TAGENABLE and AHC_MEMIO. NOT AHC_PAGE_SCB (or whatever turns on SCB paging). FreeBSD 2.2.1. This disk can routinely move data at 15 MB/sec; at least, it has in the past, with 2.1.5.

It seems that either the RAID disk is dropping an SCB or the kernel is confused. Reading the SCSI driver code, I see that there should only be 4 SCBs for this disk, even though the 2940 supports 16.
The disk's serial port shows that on reads there are frequently up to 16 tagged requests pending. The SCSI analyzer shows that the disk is rejecting requests with the SCSI response "queue full", which is supposed to trigger the driver to reduce the number of openings. The disk supports 16 SCBs. It seems to me that the kernel is sending more SCBs than it should and is not honoring the limit of 4 SCBs. Worse, when the disk reports back that it can't accept more requests, the CPU just keeps hammering away.

What should I do next?

--
Cayford Burrell
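
For reference, the behavior I would expect on "queue full" can be sketched roughly as below. This is only an illustration of the idea, not the ahc driver's actual code: the struct target_queue type and the can_issue(), issue() and complete() helpers are names made up for this example, and the starting limit of 4 is simply the figure mentioned above. The only values taken from the SCSI standard are the status bytes (0x00 for GOOD, 0x28 for QUEUE FULL).

```c
#include <assert.h>

#define SCSI_STATUS_GOOD       0x00
#define SCSI_STATUS_QUEUE_FULL 0x28

/* Hypothetical per-target bookkeeping; not the driver's real structures. */
struct target_queue {
    int openings;  /* most commands the host will keep outstanding */
    int active;    /* commands currently outstanding at the target */
};

/* A new tagged command may go out only while we are under the limit. */
static int can_issue(const struct target_queue *q)
{
    return q->active < q->openings;
}

/* Account for a command that has just been sent to the target. */
static void issue(struct target_queue *q)
{
    assert(can_issue(q));
    q->active++;
}

/*
 * Account for a completed command. Returns 1 if the command was
 * rejected with QUEUE FULL and has to be requeued, 0 otherwise.
 */
static int complete(struct target_queue *q, int status)
{
    assert(q->active > 0);
    q->active--;

    if (status == SCSI_STATUS_QUEUE_FULL) {
        /*
         * The target is holding q->active commands and refused one
         * more, so clamp the limit to that depth (never below 1).
         */
        int limit = q->active > 0 ? q->active : 1;
        if (limit < q->openings)
            q->openings = limit;
        return 1;
    }
    return 0;
}
```

With openings starting at 4, a QUEUE FULL on the fourth outstanding command would drop the limit to 3, and nothing new would be sent until one of the remaining commands completes. What the analyzer shows is the opposite: more requests keep arriving after the rejection.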