From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 08:20:33 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B79B41065670; Thu, 7 Jul 2011 08:20:33 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 4B1798FC1A; Thu, 7 Jul 2011 08:20:32 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id p678KROA030405 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 7 Jul 2011 11:20:28 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id p678KRMr041132; Thu, 7 Jul 2011 11:20:27 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id p678KRUj041131; Thu, 7 Jul 2011 11:20:27 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 7 Jul 2011 11:20:27 +0300 From: Kostik Belousov To: Andriy Gapon Message-ID: <20110707082027.GX48734@deviant.kiev.zoral.com.ua> References: <4E154A63.90600@sentex.net> <4E15620A.9030608@FreeBSD.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="g0LfaOww9mCf9g0e" Content-Disposition: inline In-Reply-To: <4E15620A.9030608@FreeBSD.org> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-3.3 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: FreeBSD-STABLE Mailing List Subject: Re: panic: spin lock held too long (RELENG_8 from today) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2011 08:20:33 -0000 --g0LfaOww9mCf9g0e Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Jul 07, 2011 at 10:36:42AM +0300, Andriy Gapon wrote: > on 07/07/2011 08:55 Mike Tancsa said the following: > > I did a buildworld on this box to bring it up to RELENG_8 for the BIND > > fixes. Unfortunately, the formerly solid box (April 13th kernel) > > panic'd tonight with > >=20 > > Unread portion of the kernel message buffer: > > spin lock 0xc0b1d200 (sched lock 1) held by 0xc5dac8a0 (tid 100107) too= long > > panic: spin lock held too long > > cpuid =3D 0 > > Uptime: 13h30m4s > > Physical memory: 2035 MB > >=20 > >=20 > > Its a somewhat busy box taking in mail as well as backups for a few > > servers over nfs. At the time, it would have been getting about 250Mb/s > > inbound on its gigabit interface. Full core.txt file at > >=20 > > http://www.tancsa.com/core-jul8-2011.txt >=20 > I thought that this was supposed to contain output of 'thread apply all b= t' in > kgdb. Anyway, I think that stacktrace for tid 100107 may have some useful > information. >=20 > > #0 doadump () at pcpu.h:231 > > 231 pcpu.h: No such file or directory. > > in pcpu.h > > (kgdb) #0 doadump () at pcpu.h:231 > > #1 0xc06fd6d3 in boot (howto=3D260) at /usr/src/sys/kern/kern_shutdown= .c:429 > > #2 0xc06fd937 in panic (fmt=3DVariable "fmt" is not available. > > ) at /usr/src/sys/kern/kern_shutdown.c:602 > > #3 0xc06ed95f in _mtx_lock_spin_failed (m=3D0x0) > > at /usr/src/sys/kern/kern_mutex.c:490 > > #4 0xc06ed9e5 in _mtx_lock_spin (m=3D0xc0b1d200, tid=3D3312388992, opt= s=3D0, > > file=3D0x0, line=3D0) at /usr/src/sys/kern/kern_mutex.c:526 > > #5 0xc0720254 in sched_add (td=3D0xc5dac5c0, flags=3D0) > > at /usr/src/sys/kern/sched_ule.c:1119 > > #6 0xc07203f9 in sched_wakeup (td=3D0xc5dac5c0) > > at /usr/src/sys/kern/sched_ule.c:1950 > > #7 0xc07061f8 in setrunnable (td=3D0xc5dac5c0) > > at /usr/src/sys/kern/kern_synch.c:499 > > #8 0xc07362af in sleepq_resume_thread (sq=3D0xca0da300, td=3D0xc5dac5c= 0, > > pri=3DVariable "pri" is not available. > > ) > > at /usr/src/sys/kern/subr_sleepqueue.c:751 > > #9 0xc0736e18 in sleepq_signal (wchan=3D0xc5fafe50, flags=3D1, pri=3D0= , queue=3D0) > > at /usr/src/sys/kern/subr_sleepqueue.c:825 > > #10 0xc06b6764 in cv_signal (cvp=3D0xc5fafe50) > > at /usr/src/sys/kern/kern_condvar.c:422 > > #11 0xc08eaa0d in xprt_assignthread (xprt=3DVariable "xprt" is not avai= lable. > > ) at /usr/src/sys/rpc/svc.c:342 > > #12 0xc08ec502 in xprt_active (xprt=3D0xc95d9600) at > > /usr/src/sys/rpc/svc.c:378 > > #13 0xc08ee051 in svc_vc_soupcall (so=3D0xc6372ce0, arg=3D0xc95d9600, > > waitflag=3D1) > > at /usr/src/sys/rpc/svc_vc.c:747 > > #14 0xc075bbb1 in sowakeup (so=3D0xc6372ce0, sb=3D0xc6372d34) > > at /usr/src/sys/kern/uipc_sockbuf.c:191 > > #15 0xc08447bc in tcp_do_segment (m=3D0xcaa8d200, th=3D0xca6aa824, > > so=3D0xc6372ce0, > > tp=3D0xc63b4d20, drop_hdrlen=3D52, tlen=3D1448, iptos=3D0 '\0', ti_= locked=3D2) > > at /usr/src/sys/netinet/tcp_input.c:1775 > > #16 0xc0847930 in tcp_input (m=3D0xcaa8d200, off0=3D20) > > at /usr/src/sys/netinet/tcp_input.c:1329 > > #17 0xc07ddaf7 in ip_input (m=3D0xcaa8d200) > > at /usr/src/sys/netinet/ip_input.c:787 > > #18 0xc07b8859 in netisr_dispatch_src (proto=3D1, source=3D0, m=3D0xcaa= 8d200) > > at /usr/src/sys/net/netisr.c:859 > > #19 0xc07b8af0 in netisr_dispatch (proto=3D1, m=3D0xcaa8d200) > > at /usr/src/sys/net/netisr.c:946 > > #20 0xc07ae5e1 in ether_demux (ifp=3D0xc56ed800, m=3D0xcaa8d200) > > at /usr/src/sys/net/if_ethersubr.c:894 > > #21 0xc07aeb5f in ether_input (ifp=3D0xc56ed800, m=3D0xcaa8d200) > > at /usr/src/sys/net/if_ethersubr.c:753 > > #22 0xc09977b2 in nfe_int_task (arg=3D0xc56ff000, pending=3D1) > > at /usr/src/sys/dev/nfe/if_nfe.c:2187 > > #23 0xc07387ca in taskqueue_run_locked (queue=3D0xc5702440) > > at /usr/src/sys/kern/subr_taskqueue.c:248 > > #24 0xc073895c in taskqueue_thread_loop (arg=3D0xc56ff130) > > at /usr/src/sys/kern/subr_taskqueue.c:385 > > #25 0xc06d1027 in fork_exit (callout=3D0xc07388a0 , > > arg=3D0xc56ff130, frame=3D0xc538ed28) at /usr/src/sys/kern/kern_for= k.c:861 > > #26 0xc09a5c24 in fork_trampoline () at > > /usr/src/sys/i386/i386/exception.s:275 > > (kgdb) > >=20 BTW, we had a similar panic, "spinlock held too long", the spinlock is the sched lock N, on busy 8-core box recently upgraded to the stable/8. Unfortunately, machine hung dumping core, so the stack trace for the owner thread was not available. I was unable to make any conclusion from the data that was present. If the situation is reproducable, you coulld try to revert r221937. This is pure speculation, though. --g0LfaOww9mCf9g0e Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (FreeBSD) iEYEARECAAYFAk4VbEsACgkQC3+MBN1Mb4iU9gCgxnDJw+3nI7TIfBHBKi2QCTev DwIAn2Zpb3dOwCkYNf03tBahoyOVYIfB =jWgx -----END PGP SIGNATURE----- --g0LfaOww9mCf9g0e--