From owner-freebsd-current@FreeBSD.ORG Wed Jan 19 16:20:50 2011 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1BF31106566C; Wed, 19 Jan 2011 16:20:50 +0000 (UTC) (envelope-from pluknet@gmail.com) Received: from mail-qy0-f175.google.com (mail-qy0-f175.google.com [209.85.216.175]) by mx1.freebsd.org (Postfix) with ESMTP id 969A28FC2F; Wed, 19 Jan 2011 16:20:49 +0000 (UTC) Received: by qyk8 with SMTP id 8so716764qyk.13 for ; Wed, 19 Jan 2011 08:20:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=LSAAbbJ3GadMhqbQqNtXs5pVMDe0hHA1BvBtKzegRMs=; b=taByjHgPYNprg0sGBOJjPpQTP+LP52AVmgJ6b/dmSeHPi6u2XR2enrQa/bcRdSgk7k fM2T6G8MEkLSSeRuUokJpJpVdT/wmq1/EAPb0DIY4i9jXFyB8ZrZd9rlJLyLP/eQd0Vd oJuhbz9fl1ZbwLo/agaTdr+oFketJR5yzFZuo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=k+4OawmsDZO1ENfJHpQxF6abvPX6aoos7WpMTqW50H7wLxaRGODCky+zHLY/Yu9c1H XQgcNIhH3mtCBtu4R5wI3pyIZngq17Cj6r5y9f8H2BtzhYm8QkUYRBnQnEdHRmmkEgf+ iGzth4C7p1HFd64xqtI/Y+j32aSyF00ED3J/M= MIME-Version: 1.0 Received: by 10.229.246.209 with SMTP id lz17mr749633qcb.271.1295454048726; Wed, 19 Jan 2011 08:20:48 -0800 (PST) Received: by 10.229.102.87 with HTTP; Wed, 19 Jan 2011 08:20:48 -0800 (PST) In-Reply-To: <201101191017.12144.jhb@freebsd.org> References: <201101181404.26768.jhb@freebsd.org> <201101191017.12144.jhb@freebsd.org> Date: Wed, 19 Jan 2011 19:20:48 +0300 Message-ID: From: Sergey Kandaurov To: John Baldwin Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-current@freebsd.org, Robert Watson Subject: Re: uma_zalloc_arg: zone "256" with non-sleepable exclusive rw ifnet_rw @ /usr/src/sys/net/if.c:414 X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Jan 2011 16:20:50 -0000 On 19 January 2011 18:17, John Baldwin wrote: > On Wednesday, January 19, 2011 9:51:59 am Sergey Kandaurov wrote: >> On 18 January 2011 22:04, John Baldwin wrote: >> > On Tuesday, January 18, 2011 1:22:24 pm Sergey Kandaurov wrote: >> >> On 18 January 2011 17:54, John Baldwin wrote: >> >> > On Monday, January 17, 2011 12:55:26 pm Sergey Kandaurov wrote: >> >> >> Hi, >> >> >> >> >> >> I see this "malloc with non-sleepable" on current during boot. >> >> >> It's strange that I don't see it if I boot via pxe/nfs. >> >> >> >> >> >> if_alloc() calls ifindex_alloc_locked() under IFNET_WLOCK() which >> >> >> might call if_grow(). >> >> >> Looks like a regression from r196553. >> >> > >> >> > I'm guessing that ifindex_alloc() should drop the lock and retry th= e >> >> > allocation after calling if_grow()? =A0This compiles, but I haven't > booted >> > it >> >> > yet: >> >> >> >> vnet_if_init() calls if_grow() without lock. >> > >> > So it does. :( =A0I've added locking to the sysinit to handle this: >> >> Seems a bit more work there. The vnet_if_init() sysinit cannot use ifnet >> locking since it runs before another sysinit initialized those ifnet loc= ks. >> On the other side it's safe to call if_grow() from vnet_if_init() w/o > locking: >> >> db> bt >> Tracing pid 0 tid 100000 td 0xffffffff80ccffc0 >> _sx_xlock_hard() at _sx_xlock_hard+0xa0 >> _sx_xlock() at _sx_xlock+0xbb >> vnet_if_init() at vnet_if_init+0x4a >> mi_startup() at mi_startup+0x77 >> btext() at btext+0x2c >> >> After said certain locking changes I came to the next problem. >> A modified ifindex_alloc_locked() function doesn't update/increment > V_if_index >> in successful if_grow() case, so it ends up in quick memory exhaustion: >> >> bce0: mem >> 0x92000000-0x93ffffff irq 28 at device 0.0 on pci4 >> panic: kmem_malloc(-2147483648): kmem_map too small: 1110716416 total > allocated >> cpuid =3D 0 >> KDB: enter: panic >> [ thread pid 0 tid 100000 ] >> Stopped at =A0 =A0 =A0kdb_enter+0x3d: movq =A0 =A0$0,0x700140(%rip) >> db> bt >> Tracing pid 0 tid 100000 td 0xffffffff80ccffc0 >> kdb_enter() at kdb_enter+0x3d >> panic() at panic+0x180 >> kmem_malloc() at kmem_malloc+0x25d >> uma_large_malloc() at uma_large_malloc+0x4a >> malloc() at malloc+0x15d >> if_grow() at if_grow+0x98 >> if_alloc() at if_alloc+0xd8 >> bce_attach() at bce_attach+0x18de >> [...] >> > show malloc >> =A0 =A0ifnet =A0 =A0 =A0 =A0 =A0 =A02 =A0 =A0 =A01048578K =A0 =A0 =A0 = =A0 =A0 25 >> >> So I added V_if_index increment into if_grow() itself. >> At least it boots now :) > > Ah, the increment belongs in ifindex_alloc_locked() not if_grow(), but th= at > was a bug certainly. :) =A0The issue is that we should only grow if the n= ew > index would exceed V_if_indexlim, not V_if_index and that should fix it. = =A0For > the locking, I just swapped the order of the SYSINIT's so that the locks = are > initialized before the various data structures. > > Index: if.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- if.c =A0 =A0 =A0 =A0(revision 217544) > +++ if.c =A0 =A0 =A0 =A0(working copy) > @@ -266,6 +266,7 @@ ifindex_alloc_locked(u_short *idxp) > > =A0 =A0 =A0 =A0IFNET_WLOCK_ASSERT(); > > +retry: > =A0 =A0 =A0 =A0/* > =A0 =A0 =A0 =A0 * Try to find an empty slot below V_if_index. =A0If we fa= il, take the > =A0 =A0 =A0 =A0 * next slot. > @@ -278,10 +279,12 @@ ifindex_alloc_locked(u_short *idxp) > =A0 =A0 =A0 =A0/* Catch if_index overflow. */ > =A0 =A0 =A0 =A0if (idx < 1) > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0return (ENOSPC); > + =A0 =A0 =A0 if (idx >=3D V_if_indexlim) { > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 if_grow(); > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto retry; > + =A0 =A0 =A0 } > =A0 =A0 =A0 =A0if (idx > V_if_index) > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0V_if_index =3D idx; > - =A0 =A0 =A0 if (V_if_index >=3D V_if_indexlim) > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 if_grow(); > =A0 =A0 =A0 =A0*idxp =3D idx; > =A0 =A0 =A0 =A0return (0); > =A0} > @@ -351,10 +354,12 @@ vnet_if_init(const void *unused __unused) > > =A0 =A0 =A0 =A0TAILQ_INIT(&V_ifnet); > =A0 =A0 =A0 =A0TAILQ_INIT(&V_ifg_head); > + =A0 =A0 =A0 IFNET_WLOCK(); > =A0 =A0 =A0 =A0if_grow(); =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0/* create initial table */ > + =A0 =A0 =A0 IFNET_WUNLOCK(); > =A0 =A0 =A0 =A0vnet_if_clone_init(); > =A0} > -VNET_SYSINIT(vnet_if_init, SI_SUB_INIT_IF, SI_ORDER_FIRST, vnet_if_init, > +VNET_SYSINIT(vnet_if_init, SI_SUB_INIT_IF, SI_ORDER_SECOND, vnet_if_init= , > =A0 =A0 NULL); > > =A0/* ARGSUSED*/ > @@ -365,7 +370,7 @@ if_init(void *dummy __unused) > =A0 =A0 =A0 =A0IFNET_LOCK_INIT(); > =A0 =A0 =A0 =A0if_clone_init(); > =A0} > -SYSINIT(interfaces, SI_SUB_INIT_IF, SI_ORDER_SECOND, if_init, NULL); > +SYSINIT(interfaces, SI_SUB_INIT_IF, SI_ORDER_FIRST, if_init, NULL); > > > =A0#ifdef VIMAGE > @@ -385,16 +390,25 @@ VNET_SYSUNINIT(vnet_if_uninit, SI_SUB_INIT_IF, SI_ > =A0static void > =A0if_grow(void) > =A0{ > + =A0 =A0 =A0 int oldlim; > =A0 =A0 =A0 =A0u_int n; > =A0 =A0 =A0 =A0struct ifindex_entry *e; > > - =A0 =A0 =A0 V_if_indexlim <<=3D 1; > - =A0 =A0 =A0 n =3D V_if_indexlim * sizeof(*e); > + =A0 =A0 =A0 IFNET_WLOCK_ASSERT(); > + =A0 =A0 =A0 oldlim =3D V_if_indexlim; > + =A0 =A0 =A0 IFNET_WUNLOCK(); > + =A0 =A0 =A0 n =3D (oldlim << 1) * sizeof(*e); > =A0 =A0 =A0 =A0e =3D malloc(n, M_IFNET, M_WAITOK | M_ZERO); > + =A0 =A0 =A0 IFNET_WLOCK(); > + =A0 =A0 =A0 if (V_if_indexlim !=3D oldlim) { > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 free(e, M_IFNET); > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 return; > + =A0 =A0 =A0 } > =A0 =A0 =A0 =A0if (V_ifindex_table !=3D NULL) { > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0memcpy((caddr_t)e, (caddr_t)V_ifindex_tabl= e, n/2); > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0free((caddr_t)V_ifindex_table, M_IFNET); > =A0 =A0 =A0 =A0} > + =A0 =A0 =A0 V_if_indexlim <<=3D 1; > =A0 =A0 =A0 =A0V_ifindex_table =3D e; > =A0} This patch works for me. Thank you! --=20 wbr, pluknet