From: Gavin Atkinson
To: Norbert Papke
Cc: freebsd-stable@FreeBSD.org
Subject: Re: Possible UDP related deadlock in 7.1-PRERELEASE
Date: Mon, 15 Sep 2008 10:37:11 +0100
Message-Id: <1221471431.49328.5.camel@buffy.york.ac.uk>
In-Reply-To: <200809141219.24943.fbsd-ml@scrapper.ca>

On Sun, 2008-09-14 at 12:19 -0700, Norbert Papke wrote:
> Symptoms:
>
> * I can trigger this lockup reliably by starting ktorrent. After a short
>   while (one to two minutes), it locks up. Other commands, e.g., netstat, also
>   lock up.
> * The console generates "nfe0: watchdog timeout" error messages.
> * The system becomes unusable and must be rebooted.

> Attempted Diagnosis:
>
> If I break into DDB, the 'ps' output shows a number of processes that seem to
> be locked related to udp.
>
> [irq18:dc0] L *udp
> ktorrent L *udpinp
> hald L *udp
> ntpd L *udp
>
> Unfortunately, I am rapidly getting out of my depth here. I have no idea how
> to go about further analyzing this problem and would appreciate help.

Can you add:

options WITNESS
options WITNESS_SKIPSPIN

to your kernel configuration, recompile, and wait for the problem to happen
again? When it does, issue "sh alllocks" from the debugger and make a note of
the output. This will probably show that two locks are held, "Giant" and
"udp", along with the thread that holds each of them. Take the ID of the
thread that holds the "udp" lock and enter "tr 100150" (where 100150 is the
thread ID). This should hopefully provide enough info to figure out what is
happening.

Thanks,

Gavin