From owner-freebsd-current@FreeBSD.ORG Wed Dec 10 19:23:49 2008
Return-Path:
Delivered-To: current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id BE1611065676;
    Wed, 10 Dec 2008 19:23:49 +0000 (UTC)
    (envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
    by mx1.freebsd.org (Postfix) with ESMTP id 9BF698FC18;
    Wed, 10 Dec 2008 19:23:49 +0000 (UTC)
    (envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
    by cyrus.watson.org (Postfix) with ESMTP id 38AC146B53;
    Wed, 10 Dec 2008 14:23:49 -0500 (EST)
Date: Wed, 10 Dec 2008 19:23:49 +0000 (GMT)
From: Robert Watson
X-X-Sender: robert@fledge.watson.org
To: Roman Divacky
In-Reply-To: <20081210164345.GA32188@freebsd.org>
Message-ID:
References: <20081210164345.GA32188@freebsd.org>
User-Agent: Alpine 1.10 (BSF 962 2008-03-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: current@freebsd.org
Subject: Re: [PANIC]: rw_lock panic in in_pcballoc() in r185864
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 10 Dec 2008 19:23:49 -0000

On Wed, 10 Dec 2008, Roman Divacky wrote:

> #9  0xc06bd28b in calltrap () at ../../../i386/i386/exception.s:165
> #10 0xc0528cc9 in _rw_wlock_hard (rw=0xc45a00a4, tid=3293569600, file=0x0,
>     line=0) at ../../../kern/kern_rwlock.c:616
> #11 0xc05eae42 in in_pcballoc (so=0xc459e000, pcbinfo=0xc0794b40)
>     at ../../../netinet/in_pcb.c:238
> #12 0xc060b403 in udp_attach (so=0xc459e000, proto=0, td=0xc44fe240)
>     at ../../../netinet/udp_usrreq.c:1131
> #13 0xc0586df5 in socreate (dom=2, aso=0xc3e77c6c, type=2, proto=0,
> #14 0xc058d974 in socket (td=0xc44fe240, uap=0xc3e77cf8)
> ---Type to continue, or q to quit---Dec 10
> 17:29:23 witten login: ROOT LOGIN (root) ON ttyv1
>
>     at ../../../kern/uipc_syscalls.c:178
> #15 0xc06d8010 in syscall (frame=0xc3e77d38) at ../../../i386/i386/trap.c:1076
> #16 0xc06bd320 in Xint0x80_syscall () at ../../../i386/i386/exception.s:261
> #17 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
>
> (kgdb) p *pcbinfo
> $2 = {ipi_listhead = 0xc0794b24, ipi_count = 1, ipi_hashbase = 0xc42fe000,
>   ipi_hashmask = 127, ipi_porthashbase = 0xc42fce00, ipi_porthashmask = 127,
>   ipi_lastport = 0, ipi_lastlow = 0, ipi_lasthi = 0, ipi_zone = 0xc1471360,
>   ipi_gencnt = 0, ipi_lock = {lock_object = {lo_name = 0xc0713b87 "udp",
>       lo_flags = 69926928, lo_data = 0, lo_witness = 0x0},
>     rw_lock = 3293569600}, ipi_pspare = {
> (kgdb) p *pcbinfo->ipi_zone
> $4 = {uz_name = 0xc0716712 "udpcb", uz_lock = 0xc147ed88,
>   uz_keg = 0xc147ed80, uz_link = {le_next = 0x0, le_prev = 0xc147eda8},
>   uz_full_bucket = {lh_first = 0x0}, uz_free_bucket = {lh_first = 0x0},
>   uz_ctor = 0, uz_dtor = 0, uz_init = 0, uz_fini = 0, uz_allocs = 0,
>   uz_frees = 0, uz_fails = 0, uz_fills = 0, uz_count = 23, uz_cpu = {{
>       uc_freebucket = 0x0, uc_allocbucket = 0x0, uc_allocs = 0,
>       uc_frees = 0}}}
>
> the code tries to rw_rwlock() the inp->inp_lock, the inp is allocated from
> an UMA zone which has no constructor and in the in_pcballoc() the rwlock is
> never initialized. I believe that's why it crashes
>
> can someone confirm/fix that?

Hmm. I disagree with the diagnosis, although clearly there's a problem here
of some sort since panicking is, well, bad. Each protocol using the inpcb
framework provides its own UMA zone and is responsible for initializing the
lock on the inpcb when memory is first pulled into the cache. For UDP, that
occurs in udp_inpcb_init(), the init function passed into uma_zcreate() in
udp_init().
Notice that when in_pcballoc() zeroes the inpcb, it stops at inp_zero_size,
which is before inp_lock, and since the inpcb zone for UDP is set to be a
no-free zone, there's no INP lock destroy.

So, given that this code has worked for quite a long time for many people,
this really raises two questions: (1) how reproducible is this and at what
point does it kick in during the boot/runtime, and (2) when did this start
happening in terms of updating your source?

Robert N M Watson
Computer Laboratory
University of Cambridge
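
For readers following the thread, the allocation pattern described in the reply above can be sketched in user-space C. This is only a toy model under stated assumptions, not the FreeBSD code: every toy_* name, the one-slot cache, and the struct layout are hypothetical stand-ins for the UMA zone, udp_inpcb_init(), in_pcballoc() and inp_zero_size. It shows the two points the reply relies on: the zone's init function runs once, when memory first enters the cache, and the per-allocation zeroing stops just before the lock so the initialized lock survives reuse.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a kernel rwlock: only records initialization. */
struct toy_lock {
    int initialized;
};

/* Hypothetical, heavily simplified inpcb. Fields placed before inp_lock are
 * reset on every allocation; inp_lock itself is not. */
struct toy_inpcb {
    int             inp_flags;
    unsigned short  inp_lport;
    struct toy_lock inp_lock;   /* must persist across free/alloc cycles */
};

/* Assumed analogue of inp_zero_size: zero everything up to the lock. */
#define toy_zero_size offsetof(struct toy_inpcb, inp_lock)

/* Toy "zone" with a one-slot cache and an init hook, loosely modelled on a
 * zone that has an init function but no constructor. */
struct toy_zone {
    int (*init)(void *mem);     /* runs once per item, on first import */
    struct toy_inpcb *cached;   /* single cached item, never freed */
    int init_calls;             /* how many times init has run */
};

/* Analogue of the per-protocol init function: set up the lock once. */
static int
toy_inpcb_init(void *mem)
{
    struct toy_inpcb *inp = mem;

    inp->inp_lock.initialized = 1;
    return 0;
}

/* Hand out a cached item if there is one; otherwise import fresh memory and
 * run the init hook on it. */
static struct toy_inpcb *
toy_zone_alloc(struct toy_zone *z)
{
    struct toy_inpcb *inp = z->cached;

    if (inp != NULL) {
        z->cached = NULL;       /* reuse: init is NOT run again */
        return inp;
    }
    inp = malloc(sizeof(*inp));
    if (inp == NULL)
        return NULL;
    if (z->init(inp) != 0) {
        free(inp);
        return NULL;
    }
    z->init_calls++;
    return inp;
}

/* Return an item to the cache. Memory is kept, so the lock is never
 * destroyed. One slot is enough for this sketch. */
static void
toy_zone_free(struct toy_zone *z, struct toy_inpcb *inp)
{
    assert(z->cached == NULL);
    z->cached = inp;
}

/* Analogue of the allocation path: zero only the region before the lock. */
static struct toy_inpcb *
toy_pcballoc(struct toy_zone *z)
{
    struct toy_inpcb *inp = toy_zone_alloc(z);

    if (inp == NULL)
        return NULL;
    memset(inp, 0, toy_zero_size);
    return inp;
}
```

In this model, allocating an item, dirtying inp_flags, freeing it and allocating again hands back the same memory with inp_flags reset to zero and the lock still marked initialized, while the init hook has run only once. If the init hook were never invoked, the lock would be left in whatever state malloc() returned, which is the failure mode the original report suspects.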