Date: Fri, 26 Oct 2001 17:04:35 -0400 (EDT)
From: Jeff Fellin
Message-Id: <200110262104.RAA00212@aura.research.bell-labs.com>
To: freebsd-questions@FreeBSD.ORG
Subject: system hung with runnable processes

I didn't see anything like this in the archives, so I'm sending this to the questions list and hackers list for assistance.

I am running FreeBSD 4.3 on a L440GX+ motherboard with dual PCI buses (32/33 and 32/66) and dual Pentium III processors @ 700MHz with 256KB L2 cache. The system is running in uniprocessor mode. Running the same tests on FreeBSD 4.1 has not caused the problem.

My problem: I have an application that reads from a SCSI bus and forwards the SCSI CDBs to another system over TCP. When running a large load, the system gets SCSI bus device resets; the application acknowledges each one and clears an error bit. After a period of time, in this example about 2.5 hours, the system stops processing any SCSI CDBs. In DDB the ps output shows 11 runnable processes (p_wchan == 0), and curproc points to one of them.
However, when checking the run queues via gdb, none of the runnable processes is in a run queue. According to rtqueuebits, queuebits, and idqueuebits, only queue[12] has any runnable processes. Examining the proc structures for the runnable processes, their priority is 6, so they should be in queue[6]. I cannot see anything obviously wrong in the process scheduling code, but something is happening.

I am attaching the system dmesg output from boot to taking the system dump, the ddb output on the serial console, and the gdb output of the processes' stack traces and proc structures. If anyone needs more information, just ask and I'll try to get it for you.

Does anyone believe upgrading to FreeBSD 4.4 would resolve the problem?
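
For anyone repeating the run-queue check described above, here is a minimal C sketch of decoding one of the bitmasks by hand. It assumes each of rtqueuebits, queuebits, and idqueuebits is a 32-bit mask with one bit per queue, which is not spelled out in this message. The helper names queue_marked and dump_marked and the constant NQS are hypothetical, not kernel symbols, and any mask value passed in is illustrative, since the real values are only in the attached gdb output.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NQS 32  /* assumption: one 32-bit mask covers 32 run queues */

/* Return 1 if the mask marks queue[idx] as non-empty, else 0. */
static int queue_marked(uint32_t mask, unsigned idx)
{
    /* Range check first so we never shift by 32 or more. */
    return idx < NQS && ((mask >> idx) & 1u);
}

/* Print every queue index the mask marks as non-empty; return how many. */
static int dump_marked(const char *name, uint32_t mask)
{
    int n = 0;

    assert(name != NULL);
    for (unsigned i = 0; i < NQS; i++) {
        if (queue_marked(mask, i)) {
            printf("%s: queue[%u] marked non-empty\n", name, i);
            n++;
        }
    }
    return n;
}
```

Calling dump_marked("queuebits", mask) with the value read in gdb lists the queues the scheduler believes are occupied. That list can then be compared with the queue index expected from each runnable process's priority.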