From: Rick Macklem
To: Adrian Chadd
Cc: Tim Kientzle, freebsd-current
Date: Mon, 10 Dec 2012 13:38:21 -0500 (EST)
Subject: Re: r244036 kernel hangs under load.
Message-ID: <735026206.1290394.1355164701856.JavaMail.root@erie.cs.uoguelph.ca>

Adrian Chadd wrote:
> .. what was the previous kernel version?
>
Hopefully Tim has it narrowed down more, but I don't see the hangs on a Sept.
7 kernel from head and I do see them on a Dec. 3 kernel from head.
(Don't know the exact rNNNNNN.)
It seems to predate my commit (r244008), which was my first concern.

I use old single core i386 hardware and can fairly reliably reproduce
it by doing a kernel build and a "svn checkout" concurrently. No NFS
activity. These are running on a local disk (UFS/FFS). (The kernel I
reproduce it on is built via GENERIC for i386. If you want me to start
a "binary search" for which rNNNNNN, I can do that, but it will take a
while. :-)

I can get out into DDB, but I'll admit I don't know enough about it
to know where to look ;-)
Here's some lines from "db> ps", in case they give someone useful
information. (I can leave this box sitting in DDB for the rest of to-day,
in case someone can suggest what I should look for on it.)

Just snippets...
  Ss   pause      adjkerntz
  DL   sdflush    [sofdepflush]
  RL              [syncer]
  DL   vlruwt     [vnlru]
  DL   psleep     [bufdaemon]
  RL              [pagezero]
  DL   psleep     [vmdaemon]
  DL   psleep     [pagedaemon]
  DL   ccb_scan   [xpt_thrd]
  DL   waiting_   [sctp_iterator]
  DL   ctl_work   [ctl_thrd]
  DL   cooling    [acpi_cooling0]
  DL   tzpoll     [acpi_thermal]
  DL   (threaded) [usb]
  ...
  DL   -          [yarrow]
  DL   (threaded) [geom]
  D    -          [g_down]
  D    -          [g_up]
  D    -          [g_event]
  RL   (threaded) [intr]
  I               [irq15: ata1]
  ...
  Run  CPU0       [swi6: Giant taskq]
       --> does this one indicate the CPU is actually running this?
           (after a db> cont, wait a while, db> ps, it is still the same)
  I               [swi4: clock]
  I               [swi1: netisr 0]
  I               [swi3: vm]
  RL              [idle: cpu0]
  SLs  wait       [init]
  DL   audit_wo   [audit]
  DLs  (threaded) [kernel]
  D    -          [deadlkres]
  ...
  D    sched      [swapper]

I have no idea if this "ps" output helps, unless it indicates that it
is looping on the Giant taskq?

As I said, I can leave it in "db" for to-day, if anyone wants me to do
anything in the debugger, and I can probably reproduce it if someone
wants stuff tried later.

rick

>
> adrian
>
>
> On 9 December 2012 22:08, Tim Kientzle wrote:
> > I haven't found any useful clues yet, but thought I'd ask if anyone else
> > was seeing hangs in a recent kernel.
> >
> > I just upgraded to r244036 using a straight GENERIC i386 kernel.
> > (Straight buildworld/buildkernel, no local changes, /etc/src.conf doesn't
> > exist, /etc/make.conf just has PERL_VERSION defined.)
> >
> > When I try to cross build an ARM world on the resulting system,
> > the entire system hangs hard after about 30 minutes: No network,
> > no keyboard response, no nothing.
> >
> > Don't know if it's relevant, but the system is using NFS pretty
> > heavily (Parallels VM mounting NFS from Mac OS 10.7 host.)
> >
> > I'll try to get some more details ...
> >
> > Tim
> >
> > _______________________________________________
> > freebsd-current@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-current
> > To unsubscribe, send any mail to
> > "freebsd-current-unsubscribe@freebsd.org"