From: Robert Watson
Date: Mon, 16 Aug 2004 10:27:38 -0400 (EDT)
To: Marc van Kempen
cc: freebsd-current@freebsd.org
cc: Søren Schmidt
Subject: Re: ATA write-dma interrupt was seen but timeout fired LBA=53346288
In-Reply-To: <41209DDA.3090504@bowtie.nl>

On Mon, 16 Aug 2004, Marc van Kempen wrote:

> > Something is holding on to one of the taskqueues blocking the request
> > processing. I'm not sure what caused it but backstepping ATA doesn't
> > make things work at least, suggesting to look somewhere else...
> >
> > For me at least network seems to be troublesome, if I exclude as much
> > network related code as possible in the config, I can resume just fine.
> >
> > YMMV as usual...
>
> I have taken out all the network related options/devices that I could
> live without and it resumes fine now.
>
> Below is my current config file.

There are a number of variables we could look at here, so it would be
useful if we could sort through them by trying a few cases -- this may
take a bit!

Before we start -- are you running with debug.mpsafenet=1, or any more
experimental or custom network features? I.e., changes in default network
allocation, etc?

Could you let me know if the interrupt (and ithread) used by the network
device is being shared by any other devices? Interrupts are frequently
shared on notebooks (and also on Dell desktop systems). This might point
at a problem with ithread registration, suspension, restart, and related
races.

Could you try compiling in support for all the network foo, but leaving
the network entirely unused (disable DHCP, don't configure an IP address,
don't raise the interface, etc.)? Maybe even do this in single-user mode.
This might help us determine whether it's related specifically to network
activity -- i.e., an active ithread -- or not.

If this doesn't help, could you try compiling out all the network service
options (inet, NFS, etc), but leaving in the network interfaces? This
will help determine whether we are looking at a bug in a service, or
perhaps a problem with network interrupt registration or a small set of
devices.

If you compile out the network interface, but do configure the services,
does it still hang? If you compile out the network interface, but
configure the services and use them actively over the localhost interface
when you suspend, does it hang then? This would also help point at a
possible problem in the protocol implementation.

Thanks,

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research