Date:      Mon, 22 Aug 2011 20:40:54 +0200
From:      Tom Vijlbrief <tom.vijlbrief@xs4all.nl>
To:        Sergey Kandaurov <pluknet@gmail.com>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>, freebsd-current@freebsd.org
Subject:   Re: BETA1 IPv6 crash
Message-ID:  <CAOQrpVdk9n7eXto4W2RWg_8Xqr9qQ6JzYN=g3DArtLP%2B%2Bup=sg@mail.gmail.com>
In-Reply-To: <CAE-mSO%2Bd2JSYhiNG2pRMReHDNYDBDba8h6vX4w7C-kQu3WYrdw@mail.gmail.com>
References:  <CAOQrpVdkqYm22jjgdOiu9f7GrALfrjCuY-VfYBibvF4Lb9m-=Q@mail.gmail.com> <CAE-mSOK3ZD92NG6DqzmZ_pdK0KoUZoN_s9c-iToa1XQ4vaBfFQ@mail.gmail.com> <CAOQrpVf7bATyWmWRF0Cnwk_KWPygbSFXCSP6ZtGtkSL0czRGag@mail.gmail.com> <CAE-mSO%2Bd2JSYhiNG2pRMReHDNYDBDba8h6vX4w7C-kQu3WYrdw@mail.gmail.com>

2011/8/22 Sergey Kandaurov <pluknet@gmail.com>:
> On 8 August 2011 22:06, Tom Vijlbrief <tom.vijlbrief@xs4all.nl> wrote:
>> 2011/8/7 Sergey Kandaurov <pluknet@gmail.com>:
>>> On 7 August 2011 17:11, Tom Vijlbrief <tom.vijlbrief@xs4all.nl> wrote:
>>>> I installed BETA1 in a fresh ubuntu 11.04 KVM virtual machine with the
>>>> new installer.
>>>>
>>>> Major issue I noticed was the missing /home.
>>>>
>>>> It took me quite some time to get IPv6 working in the guest (a Linux
>>>> configuration issue), but now that it works
>>>> BETA1 panics in about 50% of the boot attempts:
>>>>
>>>> testbsd dumped core - see /var/crash/vmcore.0
>>>>
>>>> Sun Aug  7 08:25:28 CEST 2011
>>>>
>>>> FreeBSD testbsd 9.0-BETA1 FreeBSD 9.0-BETA1 #0: Thu Jul 28 16:34:16
>>>> UTC 2011     root@obrian.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC
>>>> i386
>>>>
>>>> panic: _mtx_lock_sleep: recursed on non-recursive mutex if_addr_mtx @
>>>> /usr/src/sys/netinet6/mld6.c:1676
>>>>
>>>> GNU gdb 6.1.1 [FreeBSD]
>>>> Copyright 2004 Free Software Foundation, Inc.
>>>> GDB is free software, covered by the GNU General Public License, and you are
>>>> welcome to change it and/or distribute copies of it under certain conditions.
>>>> Type "show copying" to see the conditions.
>>>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>>>> This GDB was configured as "i386-marcel-freebsd"...
>>> [..]
>>>> panic: _mtx_lock_sleep: recursed on non-recursive mutex if_addr_mtx @
>>>> /usr/src/sys/netinet6/mld6.c:1676
>>>>
>>>> cpuid = 0
>>>> KDB: enter: panic
>>>> Uptime: 28s
>>>> Physical memory: 491 MB
>>>> Dumping 45 MB: 30 14
>>>>
>>>> #0  doadump (textdump=1) at pcpu.h:244
>>>> 244     pcpu.h: No such file or directory.
>>>>         in pcpu.h
>>>> (kgdb) #0  doadump (textdump=1) at pcpu.h:244
>>>> #1  0xc0a04965 in kern_reboot (howto=260)
>>>>     at /usr/src/sys/kern/kern_shutdown.c:430
>>>> #2  0xc0a04291 in panic (fmt=Variable "fmt" is not available.
>>>> ) at /usr/src/sys/kern/kern_shutdown.c:595
>>>> #3  0xc09f4a4a in _mtx_lock_sleep (m=0xc35f3a28, tid=3278693824, opts=0,
>>>>     file=0xc0f1ab65 "/usr/src/sys/netinet6/mld6.c", line=1676)
>>>>     at /usr/src/sys/kern/kern_mutex.c:341
>>>> #4  0xc09f4c67 in _mtx_lock_flags (m=0xc35f3a28, opts=0,
>>>>     file=0xc0f1ab65 "/usr/src/sys/netinet6/mld6.c", line=1676)
>>>>     at /usr/src/sys/kern/kern_mutex.c:203
>>>> #5  0xc0bbf007 in mld_set_version (mli=0xc3589a00, version=Variable
>>>> "version" is not available.
>>>> )
>>>>     at /usr/src/sys/netinet6/mld6.c:1676
>>>> #6  0xc0bc0c00 in mld_input (m=0xc3951e00, off=48, icmp6len=24)
>>>>     at /usr/src/sys/netinet6/mld6.c:690
>>>> #7  0xc0ba5696 in icmp6_input (mp=0xc3313a54, offp=0xc3313a68, proto=58)
>>>>     at /usr/src/sys/netinet6/icmp6.c:654
>>>> #8  0xc0bba23a in ip6_input (m=0xc3951e00)
>>>>     at /usr/src/sys/netinet6/ip6_input.c:964
>>>> #9  0xc0ac9b1c in netisr_dispatch_src (proto=10, source=0, m=0xc3951e00)
>>>>     at /usr/src/sys/net/netisr.c:1013
>>>> #10 0xc0ac9da0 in netisr_dispatch (proto=10, m=0xc3951e00)
>>>>     at /usr/src/sys/net/netisr.c:1104
>>>> #11 0xc0abecf1 in ether_demux (ifp=0xc35f3800, m=0xc3951e00)
>>>>     at /usr/src/sys/net/if_ethersubr.c:936
>>>> #12 0xc0abf1b3 in ether_nh_input (m=0xc3951e00)
>>>>     at /usr/src/sys/net/if_ethersubr.c:755
>>>> #13 0xc0ac9b1c in netisr_dispatch_src (proto=9, source=0, m=0xc3951e00)
>>>>     at /usr/src/sys/net/netisr.c:1013
>>>> #14 0xc0ac9da0 in netisr_dispatch (proto=9, m=0xc3951e00)
>>>>     at /usr/src/sys/net/netisr.c:1104
>>>> #15 0xc0abe7f5 in ether_input (ifp=0xc35f3800, m=0xc3951e00)
>>>>     at /usr/src/sys/net/if_ethersubr.c:796
>>>> #16 0xc0672bc9 in lem_handle_rxtx (context=0xc3732000, pending=1)
>>>>     at /usr/src/sys/dev/e1000/if_lem.c:3554
>>>> #17 0xc0a468ab in taskqueue_run_locked (queue=0xc359ca80)
>>>>     at /usr/src/sys/kern/subr_taskqueue.c:306
>>>> #18 0xc0a47307 in taskqueue_thread_loop (arg=0xc37365ec)
>>>>     at /usr/src/sys/kern/subr_taskqueue.c:495
>>>> #19 0xc09d7af8 in fork_exit (callout=0xc0a472a0 <taskqueue_thread_loop>,
>>>>     arg=0xc37365ec, frame=0xc3313d28) at /usr/src/sys/kern/kern_fork.c:941
>>>> #20 0xc0d1d714 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:275
>>>> (kgdb)
>>>>
>>>
>>> This is the same as in PR kern/158426.
>>> Can you try the patch from PR followup and report us whether it helps?
>>> Full link to PR with patch:
>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/158426
>>>
>>
>> I applied the patch and tried about 15 reboots and all went fine....
>>
>
> Hi, Tom.
> A better fix for this problem has been developed since then. Would you
> please try it as well? For doing that, you need to revert a previous
> patch and apply this one.
> Please report if this change also fixes the panic for you, so it has
> better chances to get into 9.0 release.
>
>
> Index: sys/netinet6/mld6.c
> ===================================================================
> --- sys/netinet6/mld6.c (revision 224471)
> +++ sys/netinet6/mld6.c (working copy)
> @@ -680,7 +680,6 @@ mld_v1_input_query(struct ifnet *ifp, const struct
>
>        IN6_MULTI_LOCK();
>        MLD_LOCK();
> -       IF_ADDR_LOCK(ifp);
>
>        /*
>         * Switch to MLDv1 host compatibility mode.
> @@ -693,6 +692,7 @@ mld_v1_input_query(struct ifnet *ifp, const struct
>        if (timer == 0)
>                timer = 1;
>
> +       IF_ADDR_LOCK(ifp);
>        if (is_general_query) {
>                /*
>                 * For each reporting group joined on this
> @@ -888,7 +888,6 @@ mld_v2_input_query(struct ifnet *ifp, const struct
>
>        IN6_MULTI_LOCK();
>
>        MLD_LOCK();
> -       IF_ADDR_LOCK(ifp);
>
>        mli = MLD_IFINFO(ifp);
>        KASSERT(mli != NULL, ("%s: no mld_ifinfo for ifp %p", __func__, ifp));
> @@ -936,14 +935,18 @@ mld_v2_input_query(struct ifnet *ifp, const struct
>                 * Queries for groups we are not a member of on this
>                 * link are simply ignored.
>                 */
> +               IF_ADDR_LOCK(ifp);
>                inm = in6m_lookup_locked(ifp, &mld->mld_addr);
> -               if (inm == NULL)
> +               if (inm == NULL) {
> +                       IF_ADDR_UNLOCK(ifp);
>                        goto out_locked;
> +               }
>                if (nsrc > 0) {
>                        if (!ratecheck(&inm->in6m_lastgsrtv,
>                            &V_mld_gsrdelay)) {
>                                CTR1(KTR_MLD, "%s: GS query throttled.",
>                                    __func__);
> +                               IF_ADDR_UNLOCK(ifp);
>                                goto out_locked;
>                        }
>                }
> @@ -961,10 +964,10 @@ mld_v2_input_query(struct ifnet *ifp, const struct
>
>                /* XXX Clear embedded scope ID as userland won't expect it. */
>                in6_clearscope(&mld->mld_addr);
> +               IF_ADDR_UNLOCK(ifp);
>        }
>
>  out_locked:
> -       IF_ADDR_UNLOCK(ifp);
>        MLD_UNLOCK();
>        IN6_MULTI_UNLOCK();
>
>
> --
> wbr,
> pluknet
>

Applied your patch and rebooted about 10 times without problems, so
the new patch works ok for me!


