From owner-freebsd-stable@FreeBSD.ORG Thu Dec 25 09:29:09 2014
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by hub.freebsd.org (Postfix) with ESMTPS id 222B6C0F;
	Thu, 25 Dec 2014 09:29:09 +0000 (UTC)
Received: from mail.turbocat.net (mail.turbocat.net [IPv6:2a01:4f8:d16:4514::2])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client did not present a certificate)
	by mx1.freebsd.org (Postfix) with ESMTPS id C295328FB;
	Thu, 25 Dec 2014 09:29:08 +0000 (UTC)
Received: from laptop015.home.selasky.org (31.89-11-148.nextgentel.com [89.11.148.31])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by mail.turbocat.net (Postfix) with ESMTPSA id 88AA01FE022;
	Thu, 25 Dec 2014 10:29:06 +0100 (CET)
Message-ID: <549BD90B.2050000@selasky.org>
Date: Thu, 25 Dec 2014 10:29:47 +0100
From: Hans Petter Selasky
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0
MIME-Version: 1.0
To: Shane Ambler, John Baldwin, freebsd-stable@freebsd.org
Subject: Re: Help debugging stable/10
References: <5488F58D.7060708@ShaneWare.Biz> <201412161129.57704.jhb@freebsd.org> <549BC924.3050402@ShaneWare.Biz>
In-Reply-To: <549BC924.3050402@ShaneWare.Biz>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Alexander Motin, mjg@freebsd.org
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Thu, 25 Dec 2014 09:29:09 -0000

On 12/25/14 09:21, Shane Ambler wrote:
> On 17/12/2014 02:59, John Baldwin wrote:
>> On Wednesday, December 10, 2014 8:38:21 pm Shane Ambler wrote:
>>> Since upgrading to 10.1 (RC2) I have had
trouble getting uptimes greater
>>> than 1 day. I have little experience debugging the OS so could use some
>>> help.
>>>
>
>> It looks like your processes are hanging on a global lock used by
>> sysctl(3).
>> That would explain hangs in top/ps/procstat as they all use sysctls. For
>> example:
>>
>
> Hi again,
>
> I just had a usb failure while still able to start processes so have a
> bit more info. It took approximately 45 minutes before other things
> locked up and I was unable to create new processes, I couldn't gather
> any more info at that time.
>
> http://shaneware.biz/freebsddebugdata/procstat-2014-12-25-15-52
> http://shaneware.biz/freebsddebugdata/kgdb.output-2014-12-25-15-52
>
> FreeBSD leader.local 10.1-STABLE FreeBSD 10.1-STABLE #1 r276101: Tue
> Dec 23 16:19:53 ACDT 2014 root@leader.local:/usr/obj/usr/src/sys/GENERIC
> amd64
>
> Shortly before the time of the procstat and kgdb output I inserted a
> usb memstick and it failed to create the device entries, I removed and
> reinserted the stick and the detach event showed in log/messages but
> not the re-insertion. I tried inserting two different branded memsticks
> in two physical slots and both failed to show insert events. I was
> running poudriere at the time but stopped that before logging the data.
>
> I see 3 entries in procstat that have _sx_xlock_hard -
>
> 6 100058 g_journal switcher g_journal switch mi_switch+0xe1
> sleepq_wait+0x3a _sx_xlock_hard+0x48a _sx_xlock+0x5d
> g_journal_switcher+0x1ca fork_exit+0x9a fork_trampoline+0xe
>
> 13 100015 geom g_event mi_switch+0xe1
> sleepq_wait+0x3a _sx_xlock_hard+0x48a g_run_events+0x82 fork_exit+0x9a
> fork_trampoline+0xe
>
> 16 100073 pagedaemon - mi_switch+0xe1
> sleepq_wait+0x3a _sx_xlock_hard+0x48a _sx_xlock+0x5d
> g_journal_lowmem+0xa3 vm_pageout+0x2bf fork_exit+0x9a fork_trampoline+0xe
>

Hi,

The cam_sim_free() is stuck, blocking the rest of that controller from
enumerating. It might look like a non-USB stack issue.

MAV: Do you have some ideas where to start looking, now we have a dump?
Any refcounts to check in particular?

--HPS

> Thread 254 (Thread 100039):
> #0 sched_switch (td=0xfffff8000669e000, newtd=, flags=)
>     at /usr/src/sys/kern/sched_ule.c:1945
> #1 0xffffffff809350b1 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:493
> #2 0xffffffff80972a2a in sleepq_wait (wchan=0x0, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:617
> #3 0xffffffff80934ad7 in _sleep (ident=, lock=,
>     priority=, wmesg=, sbt=, pr=,
>     flags=) at /usr/src/sys/kern/kern_synch.c:255
> #4 0xffffffff802df838 in cam_sim_free (sim=0xfffff801f5ee8900, free_devq=1) at /usr/src/sys/cam/cam_sim.c:109
> #5 0xffffffff807a5549 in umass_detach (dev=) at /usr/src/sys/dev/usb/storage/umass.c:2139
> #6 0xffffffff8095d042 in device_detach (dev=0xfffff8018b028900) at device_if.h:214
> #7 0xffffffff807b23b1 in usb_detach_device (udev=0xfffff801ea710000, iface_index=, flag=0 '\0')
>     at /usr/src/sys/dev/usb/usb_device.c:1148
> #8 0xffffffff807b14d6 in usb_unconfigure (udev=0xfffff801ea710000, flag=2 '\002')
>     at /usr/src/sys/dev/usb/usb_device.c:537
> #9 0xffffffff807b4466 in usb_free_device (udev=0xfffff801ea710000, flag=)
>     at /usr/src/sys/dev/usb/usb_device.c:2175
> #10 0xffffffff807bda6f in uhub_explore (udev=0xfffff8000d6c4000) at /usr/src/sys/dev/usb/usb_hub.c:647
> #11 0xffffffff807be0d9 in uhub_explore (udev=0xfffff8000d07d000) at /usr/src/sys/dev/usb/usb_hub.c:574
> #12 0xffffffff807a42c0 in usb_bus_explore (pm=)
>     at /usr/src/sys/dev/usb/controller/usb_controller.c:406
> #13 0xffffffff807c05af in usb_process (arg=0xfffffe0000ac4db0) at /usr/src/sys/dev/usb/usb_process.c:177
> #14 0xffffffff808fc66a in fork_exit (callout=0xffffffff807c0490 , arg=0xfffffe0000ac4db0,
>     frame=0xfffffe0212f7aac0) at /usr/src/sys/kern/kern_fork.c:996
> #15 0xffffffff80d10eee in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611
> #16 0x0000000000000000 in ?? ()
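
For anyone who wants to repeat the check Shane did by eye on the procstat
output (finding threads sleeping in _sx_xlock_hard), here is a minimal sketch
in Python. It assumes the saved output has one thread per line in the
"pid tid comm tdname frame+0xoff ..." layout shown in the quoted entries; the
function name blocked_threads and the SAMPLE constant are made up for this
illustration and are not part of any FreeBSD tool. The sample lines are the
three entries quoted above, with the mail wrapping undone.

```python
import re

# A stack frame in the quoted procstat output looks like "symbol+0xoffset".
FRAME_RE = re.compile(r"[A-Za-z_]\w*\+0x[0-9a-f]+")


def blocked_threads(procstat_text, symbol="_sx_xlock_hard"):
    """Return (pid, tid, frame_names) for every thread whose stack has `symbol`.

    Assumption: one thread per line, starting with numeric PID and TID.
    Header lines and anything else that does not fit are skipped.
    """
    results = []
    for line in procstat_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        if not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        # Command and thread names may contain spaces, so pick out frames
        # by shape instead of by column position.
        names = [f.split("+")[0] for f in fields[2:] if FRAME_RE.fullmatch(f)]
        if symbol in names:
            results.append((int(fields[0]), int(fields[1]), names))
    return results


# The three entries quoted in this thread, one per line.
SAMPLE = """\
    6 100058 g_journal switcher g_journal switch mi_switch+0xe1 sleepq_wait+0x3a _sx_xlock_hard+0x48a _sx_xlock+0x5d g_journal_switcher+0x1ca fork_exit+0x9a fork_trampoline+0xe
   13 100015 geom g_event mi_switch+0xe1 sleepq_wait+0x3a _sx_xlock_hard+0x48a g_run_events+0x82 fork_exit+0x9a fork_trampoline+0xe
   16 100073 pagedaemon - mi_switch+0xe1 sleepq_wait+0x3a _sx_xlock_hard+0x48a _sx_xlock+0x5d g_journal_lowmem+0xa3 vm_pageout+0x2bf fork_exit+0x9a fork_trampoline+0xe
"""

if __name__ == "__main__":
    for pid, tid, names in blocked_threads(SAMPLE):
        # Show the frames from the lock call outward, i.e. who asked for it.
        idx = names.index("_sx_xlock_hard")
        print(pid, tid, " <- ".join(names[idx:]))
```

Passing a different symbol, for example blocked_threads(text, "cam_sim_free"),
would look for threads stuck in the detach path instead. That only helps if the
procstat file contains that thread; in this report it was seen in the kgdb
output.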