Date: Fri, 2 Nov 2001 08:18:36 -0500 (EST)
From: Jeff Fellin
Message-Id: <200111021318.IAA29475@aura.research.bell-labs.com>
To: freebsd-questions@FreeBSD.ORG
Subject: system hung with runnable processes

I didn't see anything like this in the archives, so I'm sending this to the questions list and hackers list for assistance.

I am running FreeBSD 4.3 on an L440GX+ motherboard with dual PCI buses (32/33 and 32/66) and dual Pentium III processors @ 700MHz with 256KB L2 cache. The system is running in uniprocessor mode. Running the same tests on FreeBSD 4.1 has not caused the problem.

My problem: I have an application that reads from a SCSI bus and forwards the SCSI CDBs to another system over TCP. Under a large load the system gets SCSI bus device resets; the application acknowledges these and clears an error bit. After a period of time, in this example about 2.5 hours, the system stops processing any SCSI CDBs. In DDB the ps output shows 11 runnable processes with p_wchan == 0, and curproc points to one of those processes.
However, when I check the run queues via gdb, none of the runnable processes is in a run queue. According to rtqueuebits, queuebits, and idqueuebits, only queue[12] has any runnable processes. Examining the proc structures for the runnable processes, their priority is 6, so they should be in queue[6]. I cannot find anything obviously wrong in the process scheduling code, but something is going wrong.

I am attaching the system dmesg output from boot up to taking the system dump, the ddb output from the serial console, and the gdb output showing the processes' stack traces and proc structures. If anyone needs more information, just ask and I'll try to get it for you.

Does anyone believe upgrading to FreeBSD 4.4 would resolve the problem?
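
The check described above (comparing the bits set in a queue bitmask with what is actually linked on each queue, and looking for a given process on any queue) can be sketched as a small user-space model. This is only an illustration under stated assumptions: fake_proc, fake_runq and the fake_* functions are hypothetical names, not the FreeBSD 4.x kernel structures, and the layout of 32 queues tracked by one 32-bit mask is assumed for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NQS 32  /* assumed: 32 run queues, one bit each in a 32-bit mask */

/* Hypothetical, simplified stand-in for a kernel proc entry. */
struct fake_proc {
    int pid;
    struct fake_proc *next;     /* next process on the same run queue */
};

/* Simplified model: bit i of bits should be set iff heads[i] is non-empty. */
struct fake_runq {
    uint32_t bits;
    struct fake_proc *heads[NQS];
};

/* Link p onto queue idx and mark that queue as non-empty.
 * (Inserts at the head for brevity; ordering does not matter here.) */
void fake_enqueue(struct fake_runq *rq, struct fake_proc *p, unsigned idx)
{
    assert(idx < NQS);
    p->next = rq->heads[idx];
    rq->heads[idx] = p;
    rq->bits |= (uint32_t)1 << idx;
}

/* Return the index of the queue that holds p, or -1 if p is on no queue. */
int fake_find(const struct fake_runq *rq, const struct fake_proc *p)
{
    for (unsigned i = 0; i < NQS; i++) {
        const struct fake_proc *q;
        for (q = rq->heads[i]; q != NULL; q = q->next) {
            if (q == p)
                return (int)i;
        }
    }
    return -1;
}

/* Return a mask of queue indices where the bit and the list disagree. */
uint32_t fake_mismatch(const struct fake_runq *rq)
{
    uint32_t bad = 0;
    for (unsigned i = 0; i < NQS; i++) {
        int bit_set = (int)((rq->bits >> i) & 1u);
        int nonempty = (rq->heads[i] != NULL);
        if (bit_set != nonempty)
            bad |= (uint32_t)1 << i;
    }
    return bad;
}
```

In this model, a process that is marked runnable but for which fake_find() returns -1 corresponds to the situation reported above, and a non-zero result from fake_mismatch() would point at a queue whose bit and list have gone out of sync.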