From: Rick Macklem
To: Andrew Duane
Cc: FreeBSD Hackers
Date: Thu, 11 Aug 2011 16:10:54 -0400 (EDT)
Subject: Re: Dumping core over NFS
Message-ID: <1748900458.38612.1313093454775.JavaMail.root@erie.cs.uoguelph.ca>

Andrew Duane wrote:
> We have a strange problem in 6.2 that we're wondering if anyone else
> has seen.
> If a process is dumping core to an NFS-mounted directory,
> sending SIGINT, SIGTERM, or SIGKILL to that process causes NFS to
> wedge. The nfs_asyncio starts complaining that 20 iods are already
> processing the mount, but nothing makes any forward progress.
>
> Sending SIGUSR1, SIGUSR2, or SIGABRT seem to work fine, as does any
> signal if the core dump is going to a local filesystem.
>
> Before I dig into this apparent deadlock, just wondering if it's been
> seen before.
>
The only thing I can tell you is that SIGINT and SIGTERM are signals that
are handled differently by mounts with the "intr" option set. For this
case, the client tries to make the syscall in progress fail with EINTR
when one of these signals is posted. I have no idea what effect this
might have on a core dump in progress, or whether you are using "intr"
mounts.

There was an issue in FreeBSD8.[01] (for the "intr" case) where the
termination signal could get the krpc code into a loop when trying to
re-establish a TCP connection: an msleep() would always return EINTR
right away without waiting for the connection attempt to complete, and
the code outside it would just try again and again and...
This bug was fixed for FreeBSD8.2.

Obviously it's not the same bug, since FreeBSD6 didn't have a krpc
subsystem, but you might look for something similar (i.e. a
sleep(...PCATCH...) and then a caller that just tries again when it
returns EINTR).

If you use "intr", you might also try without "intr" and see if that
has any effect.

Good luck with it, rick

> ...................................
>
> Andrew Duane
> Juniper Networks
> o +1 978 589 0551
> m +1 603-770-7088
> aduane@juniper.net
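
For anyone looking for the pattern described above, here is a rough userland sketch in C of the shape to grep for. It is only a model, not the FreeBSD 6 NFS code or the 8.x krpc code: `struct fake_thread`, `fake_interruptible_sleep()`, `reconnect_retry_forever()` and `reconnect_stop_on_eintr()` are all made-up names, and the `max_attempts` cap exists only so the model terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-thread signal state. */
struct fake_thread {
	bool signal_pending;	/* a termination signal has been posted */
	int  sleeps_attempted;	/* how many times the sleep was entered */
};

/*
 * Model of an interruptible sleep (the PCATCH case): with a signal
 * already pending it returns EINTR at once, without waiting.
 * Otherwise it pretends the wait completed and returns 0.
 */
static int
fake_interruptible_sleep(struct fake_thread *td)
{
	td->sleeps_attempted++;
	if (td->signal_pending)
		return (EINTR);
	return (0);
}

/*
 * Suspect shape: EINTR is treated like a transient failure and retried.
 * The signal stays pending, so every retry fails the same way.
 * Returns 0 on success, or EINTR once the (model-only) cap is reached.
 */
static int
reconnect_retry_forever(struct fake_thread *td, int max_attempts)
{
	assert(max_attempts > 0);
	for (int i = 0; i < max_attempts; i++) {
		if (fake_interruptible_sleep(td) == 0)
			return (0);
		/* EINTR: go around again, making no forward progress. */
	}
	return (EINTR);
}

/*
 * Safer shape: stop as soon as the sleep reports EINTR and hand the
 * error back to the caller instead of retrying.
 */
static int
reconnect_stop_on_eintr(struct fake_thread *td, int max_attempts)
{
	int error = 0;

	assert(max_attempts > 0);
	for (int i = 0; i < max_attempts; i++) {
		error = fake_interruptible_sleep(td);
		if (error == 0 || error == EINTR)
			break;
	}
	return (error);
}
```

With `signal_pending` set, the first caller enters the sleep on every iteration until the cap is hit, while the second enters it once and returns EINTR. In real kernel code there would be no cap, so the first shape would spin for as long as the signal stays pending.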