From owner-freebsd-current@FreeBSD.ORG Mon Aug 16 14:42:40 2004
Message-ID: <4120C7DA.1000007@bowtie.nl>
Date: Mon, 16 Aug 2004 16:42:34 +0200
From: Marc van Kempen
To: Robert Watson
cc: freebsd-current@freebsd.org
cc: Søren Schmidt
Subject: Re: ATA write-dma interrupt was seen but timeout fired LBA=53346288
List-Id: Discussions about the use of FreeBSD-current

Robert Watson wrote:

>On Mon, 16 Aug 2004, Marc van Kempen wrote:
>
>>>Something is holding on to one of the taskqueues blocking the request
>>>processing. I'm not sure what caused it but backstepping ATA doesn't
>>>make things work at least, suggesting to look somewhere else...
>>>
>>>For me at least network seems to be troublesome, if I exclude as much
>>>network related code as possible in the config, I can resume just fine.
>>>
>>>YMMV as usual...
>>>
>>I have taken out all the network related options/devices that I could
>>live without and it resumes fine now.
>>
>>Below is my current config file.
>>
>
>There are a number of variables we could look at here, so it would be
>useful if we could sort through them some by trying a few cases -- this
>may take a bit!
>
>Before we start -- are you running with debug.mpsafenet=1, or any more
>experimental or custom network features? I.e., changes in default network
>allocation, etc?
>
>Could you let me know if the interrupt (and ithread) used by the network
>device are being shared by any other devices? On notebooks, interrupts
>are frequently shared (and also Dell desktop systems). This might point
>at a problem with ithread registration, suspension, restart, and related
>races.
>
>Could you try compiling in support for all the network foo, but leaving
>the network entirely unused (disable DHCP, don't configure an IP address,
>don't raise the interface, etc. Maybe even do this in single-user). This
>might help us determine if it's related specifically to network activity
>or not -- i.e., an active ithread.
>
>If this doesn't help, could you try compiling out all the network service
>options (inet, NFS, etc), but leaving in the network interfaces? This
>will help determine if we may be looking at a bug in a service, or perhaps
>a problem with network interrupt registration or a small set of devices.
>
>If you compile out the network interface, but do configure the services,
>does it still hang? If you compile out the network interface, but
>configure the services and use them actively over the localhost interface
>when you suspend, does it hang then? This would also help point at a
>possible problem in the protocol implementation.
>
>Thanks,
>

Hi Robert,

I will try to do this tonight. By the way, how can I see if an
interrupt is being shared?

Cheers,
Marc.

-- 
http://zuidafrika.vankempen.com