From: Arnaud Lacombe
To: Attilio Rao
Cc: freebsd-stable
Date: Wed, 29 Feb 2012 13:32:56 -0500
Subject: Re: Complete hang on 9.0-RELEASE

Hi,

On Wed, Feb 29, 2012 at 12:59 PM, Arnaud Lacombe wrote:
> Hi,
>
> On Mon, Feb 27, 2012 at 12:48 PM, Arnaud Lacombe wrote:
>> Hi,
>>
>> On Mon, Feb 27, 2012 at 10:36 AM, Attilio Rao wrote:
>>> 2012/2/27, Arnaud Lacombe:
>>>> Hi,
>>>>
>>>> On Tue, Feb 14, 2012 at 11:41 AM, Arnaud Lacombe wrote:
>>>>> Hi folks,
>>>>>
>>>>> For the record, I was running some tests yesterday on top of a
>>>>> 9.0-RELEASE, amd64, kernel when the box hung. At the time of the
>>>>> hang, the box was running a process with about 2800 threads with heavy
>>>>> IPC between 1400 writers and 1400 readers. The box was in single user
>>>>> mode (/bin/sh coming from FreeBSD 7.4-STABLE). Here is the beginning
>>>>> of the dmesg:
>>>>>
>>>> This happened a second time, now with FreeBSD 8.2-RELEASE. Complete
>>>> machine hang. The machine was running about 4000 threads in a single
>>>> process; all other conditions are the same.
>>>
>>> Arnaud,
>>> can you please break into your kernel via KDB, collect the following
>>> information from the DDB prompt:
>>> - ps
>>> - alltrace
>>> - show allpcpu
>>> - possibly get a coredump with 'call doadump'
>>>
>> Will do, but I'll need to rebuild a kernel to include DDB.
>>
>>> and in the end provide all those along with kernel binary and possibly
>>> sources somewhere?
>>>
>> I'll be testing a bare `release/8.2.0' with the following patch:
>>
>> diff --git a/sys/amd64/conf/GENERIC b/sys/amd64/conf/GENERIC
>> index c3e0095..7bd997f 100644
>> --- a/sys/amd64/conf/GENERIC
>> +++ b/sys/amd64/conf/GENERIC
>> @@ -79,6 +79,10 @@ options      INCLUDE_CONFIG_FILE     # Include this file in kernel
>>
>>  options        KDB           # Kernel debugger related code
>>  options        KDB_TRACE     # Print a stack trace for a panic
>> +options        DDB
>> +options        BREAK_TO_DEBUGGER
>> +options        ALT_BREAK_TO_DEBUGGER
>>
>>  # Make an SMP-capable kernel by default
>>  options        SMP           # Symmetric MultiProcessor Kernel
>>
> OK, it happened again after 2 days; the process was running about 3200
> threads. I'm trying to break into DDB and will let you know, but I'm not
> having much success so far...
>
No luck. Neither BREAK nor ALT_BREAK is responding. I will not touch
the system for the next few hours, in case you want me to test something
on it.

If 8.2-RELEASE or 9.0-RELEASE is not meant to work reliably on top of a
7.4-RELEASE userland, I will set the test up again on a clean
9.0-RELEASE system and retry.

 - Arnaud