Date: Tue, 18 Oct 2011 15:26:46 +1100
From: Peter Jeremy
To: Marius Strobl
Message-ID: <20111018042646.GA18863@server.vk2pj.dyndns.org>
In-Reply-To: <20111013184224.GG39118@alchemy.franken.de>
Cc: freebsd-sparc64@freebsd.org
Subject: Re: 'make -j16 universe' gives SIReset

On 2011-Oct-13 20:42:25 +0200, Marius Strobl wrote:
>On Thu, Oct 13, 2011 at 02:56:48PM +1100, Peter Jeremy wrote:
>> Unfortunately, I can't get a crashdump because dumpon(8) doesn't like
>> my Solaris swap partitions:
>> GEOM_PART: Partition 'da0b' not suitable for kernel dumps (wrong type?)
>> GEOM_PART: Partition 'da6b' not suitable for kernel dumps (wrong type?)
>> No suitable dump device was found.
>>
>> I did write a patch for that but took it out during some earlier
>> testing to get back to stock code. It looks like I didn't PR it
>> either so I will do that when I get some time.

I've resurrected that patch (and will send-pr it later).

>Hrm, this backtrace seems impossible as vmtotal() explicitly locks
>the object before calling vm_object_clear_flag(). A crash dump of
>this panic really would be interesting.
I've reproduced the same panic and got a crashdump (2 hours for the dump
and another hour for the savecore):

VNASSERT failed
panic: mutex vm object not owned at /usr/src/sys/vm/vm_object.c:281
cpuid = 7

#10 0x00000000c04ffbf4 in panic (fmt=0xc0a906d0 "mutex %s not owned at %s:%d") at /usr/src/sys/kern/kern_shutdown.c:599
#11 0x00000000c04eb1b8 in _mtx_assert (m=0xfffff8b29d750ca8, what=0x4, file=0xc0ac6c00 "/usr/src/sys/vm/vm_object.c", line=0x119) at /usr/src/sys/kern/kern_mutex.c:706
#12 0x00000000c07f4b0c in vm_object_clear_flag (object=0xfffff8b29d750ca8, bits=0x4) at /usr/src/sys/vm/vm_object.c:281
#13 0x00000000c07f1dac in vmtotal (oidp=0xc0ba9be8, arg1=0x0, arg2=0x30, req=0xef8a54e0) at /usr/src/sys/vm/vm_meter.c:121
#14 0x00000000c050c13c in sysctl_root (oidp=Variable "oidp" is not available.
) at /usr/src/sys/kern/kern_sysctl.c:1509
#15 0x00000000c050c434 in userland_sysctl (td=0x0, name=0xef8a5628, namelen=0x2, old=0x0, oldlenp=Variable "oldlenp" is not available.) at /usr/src/sys/kern/kern_sysctl.c:1619
#16 0x00000000c050c858 in sys___sysctl (td=0xfffff8a2e3ef48c0, uap=0xef8a5768) at /usr/src/sys/kern/kern_sysctl.c:1545
#17 0x00000000c086ba00 in syscall (tf=Variable "tf" is not available.) at subr_syscall.c:131
#18 0x00000000c0098e60 in tl0_intr ()

(kgdb) p *object
$1 = {
  mtx = {
    lock_object = {
      lo_name = 0xc0a9a308 "vm object",
      lo_flags = 0x1430000,
      lo_data = 0x0,
      lo_witness = 0xfff85180
    },
    mtx_lock = 0xfffff8a0112d75e0
  },
...
}
(kgdb) p *object->mtx->lock_object->lo_witness
$3 = {
  w_name = "standard object", '\0' ,
  w_index = 0xa3,
  w_class = 0xc0b82e88,
  w_list = {
    stqe_next = 0xfff85100
  },
  w_typelist = {
    stqe_next = 0xfff85100
  },
  w_hash_next = 0x0,
  w_file = 0xc0ac6388 "/usr/src/sys/vm/vm_meter.c",
  w_line = 0x71,
  w_refcount = 0x53718,
  w_num_ancestors = 0xe,
  w_num_descendants = 0xe,
  w_ddb_level = 0x0,
  w_displayed = 0x1,
  w_reversed = 0x0
}
(kgdb) p vm_object_list_mtx
$4 = {
  lock_object = {
    lo_name = 0xc0ac6e30 "vm object_list",
    lo_flags = 0x1030000,
    lo_data = 0x0,
    lo_witness = 0xfff81d80
  },
  mtx_lock = 0xfffff8a2e3ef48c2
}
(kgdb) p *vm_object_list_mtx.lock_object.lo_witness
$6 = {
  w_name = "vm object_list", '\0' ,
  w_index = 0x3b,
  w_class = 0xc0b82e88,
  w_list = {
    stqe_next = 0xfff81d00
  },
  w_typelist = {
    stqe_next = 0xfff81d00
  },
  w_hash_next = 0x0,
  w_file = 0xc0ac6388 "/usr/src/sys/vm/vm_meter.c",
  w_line = 0x6f,
  w_refcount = 0x1,
  w_num_ancestors = 0xf,
  w_num_descendants = 0x0,
  w_ddb_level = 0x0,
  w_displayed = 0x1,
  w_reversed = 0x0
}

The witness information looks correct but I notice that
vm_object_list_mtx is owned by a different thread to the vm_object
that triggers the panic. The panic says it occurred on CPU 7:

(kgdb) p cpuid_to_pcpu[7]->pc_curthread
$21 = (struct thread *) 0xfffff8a2e3ef48c0

which matches the vm_object_list_mtx.
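
The ownership comparison above can be checked mechanically: mtx_lock
holds the owning thread pointer with a few low-order flag bits OR'd in,
so the raw value has to be masked before it is compared with
pc_curthread. The sketch below is not part of the original message. It
assumes the three low bits are the MTX_RECURSED, MTX_CONTESTED and
MTX_UNOWNED flags, as in sys/mutex.h of that era (worth checking against
the header in the tree being debugged). mtx_owner and mtx_flags are
hypothetical helper names, and the constants are copied from the kgdb
output above.

```python
# Assumption: low three bits of mtx_lock are flags, the rest is the
# owning struct thread pointer (layout taken from sys/mutex.h of this era).
MTX_RECURSED = 0x1
MTX_CONTESTED = 0x2
MTX_UNOWNED = 0x4
MTX_FLAGMASK = MTX_RECURSED | MTX_CONTESTED | MTX_UNOWNED

# Values copied from the kgdb session above.
VM_OBJECT_MTX_LOCK = 0xfffff8a0112d75e0       # object->mtx.mtx_lock
VM_OBJECT_LIST_MTX_LOCK = 0xfffff8a2e3ef48c2  # vm_object_list_mtx.mtx_lock
CPU7_CURTHREAD = 0xfffff8a2e3ef48c0           # cpuid_to_pcpu[7]->pc_curthread


def mtx_owner(mtx_lock):
    """Return the thread pointer encoded in a mtx_lock word."""
    return mtx_lock & ~MTX_FLAGMASK


def mtx_flags(mtx_lock):
    """Return the names of the flag bits set in a mtx_lock word."""
    names = [
        (MTX_RECURSED, "MTX_RECURSED"),
        (MTX_CONTESTED, "MTX_CONTESTED"),
        (MTX_UNOWNED, "MTX_UNOWNED"),
    ]
    return [name for bit, name in names if mtx_lock & bit]


for label, word in [
    ("vm_object_list_mtx", VM_OBJECT_LIST_MTX_LOCK),
    ("vm object", VM_OBJECT_MTX_LOCK),
]:
    owner = mtx_owner(word)
    held_by_cpu7 = owner == CPU7_CURTHREAD
    print(label, hex(owner), mtx_flags(word), "held by CPU 7 thread:", held_by_cpu7)
```

With the flag bits masked off, vm_object_list_mtx resolves to the CPU 7
thread, while the vm object's lock word carries no flag bits and names a
different thread.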

My initial thought was a locking glitch but, looking through
cpuid_to_pcpu[], the vm_object's lock doesn't match any running thread:

(kgdb) p cpuid_to_pcpu[0]->pc_curthread
$14 = (struct thread *) 0xfffff8a2e3008000
(kgdb) p cpuid_to_pcpu[1]->pc_curthread
$15 = (struct thread *) 0xfffff8a2aae7c8c0
(kgdb) p cpuid_to_pcpu[2]->pc_curthread
$16 = (struct thread *) 0xfffff8a0112acd20
(kgdb) p cpuid_to_pcpu[3]->pc_curthread
$17 = (struct thread *) 0xfffff8a0112ac8c0
(kgdb) p cpuid_to_pcpu[4]->pc_curthread
$18 = (struct thread *) 0xfffff8a2aae7da40
(kgdb) p cpuid_to_pcpu[5]->pc_curthread
$19 = (struct thread *) 0xfffff8a2aa2a6460
(kgdb) p cpuid_to_pcpu[6]->pc_curthread
$20 = (struct thread *) 0xfffff8a2e3148d20
(kgdb) p cpuid_to_pcpu[7]->pc_curthread
$21 = (struct thread *) 0xfffff8a2e3ef48c0
(kgdb) p cpuid_to_pcpu[8]->pc_curthread
$22 = (struct thread *) 0xfffff8d32cfa0460
(kgdb) p cpuid_to_pcpu[9]->pc_curthread
$23 = (struct thread *) 0xfffff8a0112b3a40
(kgdb) p cpuid_to_pcpu[10]->pc_curthread
$24 = (struct thread *) 0xfffff8a2a8f77180
(kgdb) p cpuid_to_pcpu[11]->pc_curthread
$25 = (struct thread *) 0xfffff8a2e3ef1a40
(kgdb) p cpuid_to_pcpu[12]->pc_curthread
$26 = (struct thread *) 0xfffff8a2e319e8c0
(kgdb) p cpuid_to_pcpu[13]->pc_curthread
$27 = (struct thread *) 0xfffff8a2e3c30d20
(kgdb) p cpuid_to_pcpu[14]->pc_curthread
$28 = (struct thread *) 0xfffff8a0112b2460
(kgdb) p cpuid_to_pcpu[15]->pc_curthread
$29 = (struct thread *) 0xfffff8c1f78cb180

Some rummaging around says that the object is locked by yarrow:

(kgdb) p ((struct thread *) 0xfffff8a0112d75e0)->td_proc.p_comm
$35 = "yarrow", '\0'

At this stage, I'm not sure where to go next.

-- 
Peter Jeremy