Date: Tue, 16 Jul 2013 09:05:55 +0300
From: Konstantin Belousov
To: Patrick Lamaiziere
Cc: Mateusz Guzik, freebsd-stable@freebsd.org
Subject: Re: (9.2) panic under disk load (gam_server / knlist_remove_kq)
Message-ID: <20130716060555.GF91021@kib.kiev.ua>
In-Reply-To: <20130715185009.052d7614@davenulle.org>
List-Id: Production branch of FreeBSD source code

On Mon, Jul 15, 2013 at 06:50:09PM +0200, Patrick Lamaiziere wrote:
> On Mon, 15 Jul 2013 16:26:47 +0200,
> Mateusz Guzik wrote:
>
> Hello,
>
> > > > I'm seeing a panic while trying to build a poudriere repository.
> > > >
> > > > As far as I can see, it always happens when gam_server is started
> > > > (i.e. xfce is running) and under disk load (poudriere bulk build).
> > > > (That is something new; the box was pretty stable.)
> > > >
> > > > The complete crash dump (core.0.txt) is here:
> > > > http://user.lamaiziere.net/patrick/panic_gam_server.txt
> > >
> > > With WITNESS and ASSERTION on, I see a warning that looks related:
> > >
> > > Jul 14 16:23:29 roxette kernel: WARNING: destroying knlist w/ knotes on it!
> > >
> > > and the box panics just after this.
> > >
> >
> > can you switch that printf to a panic and paste backtrace?
>
> Yes, the full core.txt:
> http://user.lamaiziere.net/patrick/panic_knlist_wknotes.txt
>
> panic: WARNING: destroying knlist w/ knotes on it!
>
> Unread portion of the kernel message buffer:
> lock order reversal:
>  1st 0xfffffe00b678c098 ufs (ufs) @ /usr/src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:620
>  2nd 0xffffffff813ebda0 allproc (allproc) @ /usr/src/sys/kern/kern_descrip.c:2822
> KDB: stack backtrace:
> #0 0xffffffff8094bc26 at kdb_backtrace+0x66
> #1 0xffffffff809603ae at _witness_debugger+0x2e
> #2 0xffffffff80961a85 at witness_checkorder+0x865
> #3 0xffffffff8091b1ea at _sx_slock+0x5a
> #4 0xffffffff808d30ff at mountcheckdirs+0x3f
> #5 0xffffffff809a890f at dounmount+0x2df
> #6 0xffffffff809a913e at sys_unmount+0x3ce
> #7 0xffffffff80cec429 at amd64_syscall+0x2f9
> #8 0xffffffff80cd6d47 at Xfast_syscall+0xf7
> panic: WARNING: destroying knlist w/ knotes on it!
>
> cpuid = 3
> KDB: stack backtrace:
> #0 0xffffffff8094bc26 at kdb_backtrace+0x66
> #1 0xffffffff80912da8 at panic+0x1d8
> #2 0xffffffff808db269 at knlist_destroy+0x39
> #3 0xffffffff809afd7e at destroy_vpollinfo+0x1e
> #4 0xffffffff809b13ef at vdropl+0x18f
> #5 0xffffffff809b404c at vputx+0xac
> #6 0xffffffff8299ce13 at null_reclaim+0x103
> #7 0xffffffff80d912eb at VOP_RECLAIM_APV+0xdb
> #8 0xffffffff809b20a2 at vgonel+0x112
> #9 0xffffffff809b4cd9 at vflush+0x2b9
> #10 0xffffffff8299bbb3 at nullfs_unmount+0x43
> #11 0xffffffff809a8982 at dounmount+0x352
> #12 0xffffffff809a913e at sys_unmount+0x3ce
> #13 0xffffffff80cec429 at amd64_syscall+0x2f9
> #14 0xffffffff80cd6d47 at Xfast_syscall+0xf7
> Uptime: 4m47s
> Dumping 915 out of 3544 MB:..2%..11%..21%..32%..41%..51%..62%..72%..81%..91%
>
> #0  doadump (textdump=) at pcpu.h:234
> 234	pcpu.h: No such file or directory.
> 	in pcpu.h
> (kgdb) #0  doadump (textdump=) at pcpu.h:234
> #1  0xffffffff80913354 in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:449
> #2  0xffffffff80912d79 in panic (fmt=0x1 )
>     at /usr/src/sys/kern/kern_shutdown.c:637
> #3  0xffffffff808db269 in knlist_destroy (knl=)
>     at /usr/src/sys/kern/kern_event.c:1961
> #4  0xffffffff809afd7e in destroy_vpollinfo (vi=0xfffffe007ffec690)
>     at /usr/src/sys/kern/vfs_subr.c:3583
> #5  0xffffffff809b13ef in vdropl (vp=0xfffffe00b678c000)
>     at /usr/src/sys/kern/vfs_subr.c:2530
> #6  0xffffffff809b404c in vputx (vp=0xfffffe00b678c000, func=2)
>     at /usr/src/sys/kern/vfs_subr.c:2358
> #7  0xffffffff8299ce13 in ?? ()
> #8  0xffffffff8299d510 in ?? ()
> #9  0xfffffe00000002ec in ?? ()
> #10 0xffffff81090b8750 in ?? ()
> #11 0x0000000000000246 in ?? ()
> #12 0xfffffe002af55000 in ?? ()
> #13 0xffffffff81576950 in w_locklistdata ()
> #14 0xffffffff81322ce0 in pmc___lock_failed ()
> #15 0xffffffff8299d8a0 in ?? ()
> #16 0xffffff81090b87b0 in ?? ()
> #17 0x0000000000000000 in ?? ()
> (kgdb)

Hm, try this (mostly naive) patch. If the kernel does not panic for you
anymore, check that gam_server is still operational. If not, I have
another thing to try.

diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index e64f379..e2c2813 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -3455,6 +3455,8 @@ vfs_msync(struct mount *mp, int flags)
 static void
 destroy_vpollinfo(struct vpollinfo *vi)
 {
+
+	knlist_clear(&vi->vpi_selinfo.si_note, 1);
 	seldrain(&vi->vpi_selinfo);
 	knlist_destroy(&vi->vpi_selinfo.si_note);
 	mtx_destroy(&vi->vpi_lock);
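
The ordering the patch relies on (empty the note list first, then destroy it) can be sketched in user space. This is only a toy model under stated assumptions, not kernel code: `toy_knote`, `toy_knlist`, `toy_attach`, `toy_clear` and `toy_destroy` are hypothetical names made up for illustration and are not the kern_event.c API, and no locking is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins; these are NOT the kernel's struct knote / struct knlist. */
struct toy_knote {
	struct toy_knote *next;
};

struct toy_knlist {
	struct toy_knote *head;
};

/* Attach one note to the list; returns false if allocation fails. */
static bool
toy_attach(struct toy_knlist *knl)
{
	struct toy_knote *kn = malloc(sizeof(*kn));

	if (kn == NULL)
		return (false);
	kn->next = knl->head;
	knl->head = kn;
	return (true);
}

/* Detach and free every note, leaving the list empty. */
static void
toy_clear(struct toy_knlist *knl)
{
	struct toy_knote *kn;

	while ((kn = knl->head) != NULL) {
		knl->head = kn->next;
		free(kn);
	}
	assert(knl->head == NULL);
}

/*
 * Destroying is only clean on an empty list. Returns true when the list
 * was empty, and false in the situation the warning complains about
 * (notes still attached at destroy time).
 */
static bool
toy_destroy(struct toy_knlist *knl)
{
	return (knl->head == NULL);
}
```

In this model, calling `toy_destroy()` on a list that still has a note attached reports failure, while calling `toy_clear()` first makes the same `toy_destroy()` call succeed. Whether clearing the notes at this point in the real kernel is safe for gam_server is exactly what the test request above is meant to find out.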