Date: Sat, 26 Nov 2005 18:22:45 -0500
From: Kris Kennaway <kris@obsecurity.org>
To: Kris Kennaway <kris@obsecurity.org>
Cc: amd64@freeBSD.org
Subject: Re: spin lock smp rendezvous held by 0xffffff01250a7980 for > 5 seconds
Message-ID: <20051126232244.GA83432@xor.obsecurity.org>
In-Reply-To: <20051124232616.GA32023@xor.obsecurity.org>
References: <20051124232616.GA32023@xor.obsecurity.org>

On Thu, Nov 24, 2005 at 06:26:16PM -0500, Kris Kennaway wrote:
> I got this on a quad amd64 machine running 6.0-STABLE.  At the time it
> was running 21 simultaneous tar extractions onto a sync-mounted md.
>
> panic() at panic+0x1e6
> _mtx_lock_spin() at _mtx_lock_spin+0xad
> pmap_invalidate_range() at pmap_invalidate_range+0xb3
> pmap_qremove() at pmap_qremove+0x53
> vfs_vmio_release() at vfs_vmio_release+0x1e0
> getnewbuf() at getnewbuf+0x368
> getblk() at getblk+0x3d9
> ffs_balloc_ufs1() at ffs_balloc_ufs1+0x662
> ffs_write() at ffs_write+0x31b
> VOP_WRITE_APV() at VOP_WRITE_APV+0xed
> vn_write() at vn_write+0x228
> dofilewrite() at dofilewrite+0x90
> kern_writev() at kern_writev+0x54
> write() at write+0x4b
>
> Unfortunately I can't dump on this machine (and no debugging is
> currently enabled), but I can try to reproduce it.

I tried for 24 hours with witness enabled but couldn't reproduce.  The
same panic happened in the same way when witness was disabled, although
the failure mode was a bit different:

Fatal double fault
cpuid = 3; apic id = 03
panic: double fault
cpuid = 3
KDB: enter: panic
[...]
_mtx_lock_spin() at _mtx_lock_spin+0x6b
getit() at getit+0x6f
DELAY() at DELAY+0x44
_mtx_lock_spin() at _mtx_lock_spin+0x6b
getit() at getit+0x6f
DELAY() at DELAY+0x44
_mtx_lock_spin() at _mtx_lock_spin+0x6b
getit() at getit+0x6f
DELAY() at DELAY+0x44
_mtx_lock_spin() at _mtx_lock_spin+0x6b
getit() at getit+0x6f
DELAY() at DELAY+0x44
_mtx_lock_spin() at _mtx_lock_spin+0x6b
pmap_invalidate_range() at pmap_invalidate_range+0xb3
pmap_qremove() at pmap_qremove+0x53
vfs_vmio_release() at vfs_vmio_release+0x1e0
getnewbuf() at getnewbuf+0x368
getblk() at getblk+0x3d9
ffs_balloc_ufs1() at ffs_balloc_ufs1+0x662
ffs_write() at ffs_write+0x31b
VOP_WRITE_APV() at VOP_WRITE_APV+0xed
vn_write() at vn_write+0x228
dofilewrite() at dofilewrite+0x90
kern_writev() at kern_writev+0x54
write() at write+0x4b
syscall() at syscall+0x404
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (4, FreeBSD ELF64, write), rip = 0x80070ea6c, rsp = 0x7fffffffe6a8, rbp = 0x52ae00 ---

i.e. the first _mtx_lock_spin() tried to acquire the ipi lock and
spun, which called DELAY and getit, which tried to acquire the clock
lock:

    mtx_lock_spin(&clock_lock);

which *also* spun, and called DELAY...and at that point things went to
hell and it recursed until it blew out the stack.

I guess the next step is to try INVARIANTS alone in case that catches
something.

Kris
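
For anyone trying to picture the recursion in that trace, here is a toy
userspace model in C. It is only a sketch and not the FreeBSD code: the
toy_* names, the depth counter and STACK_LIMIT are made up for
illustration. The only thing taken from the trace is the call order
(_mtx_lock_spin -> DELAY -> getit -> mtx_lock_spin(&clock_lock)). Each
function returns the deepest call depth reached, and a contended lock
does a single backoff round instead of really spinning.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel stack limit, counted in frames. */
enum { STACK_LIMIT = 30 };

/* Toy lock: "held" means some other CPU currently owns it. */
struct toy_lock {
    bool held;
};

int toy_lock_spin(struct toy_lock *m, struct toy_lock *clock, int depth);
int toy_getit(struct toy_lock *clock, int depth);
int toy_delay(struct toy_lock *clock, int depth);

/* Models getit(): reading the timer needs the clock lock first. */
int toy_getit(struct toy_lock *clock, int depth)
{
    return toy_lock_spin(clock, clock, depth + 1);
}

/* Models DELAY(): the busy-wait reads the timer through getit(). */
int toy_delay(struct toy_lock *clock, int depth)
{
    return toy_getit(clock, depth + 1);
}

/*
 * Models the contended path of _mtx_lock_spin(): if the lock is held,
 * back off via DELAY(). If the clock lock is held too, the backoff
 * re-enters this function and the depth keeps growing.
 */
int toy_lock_spin(struct toy_lock *m, struct toy_lock *clock, int depth)
{
    assert(depth >= 0);
    if (depth >= STACK_LIMIT)
        return depth;            /* stack exhausted: the "double fault" */
    if (!m->held)
        return depth;            /* uncontended: acquired, no recursion */
    return toy_delay(clock, depth + 1);
}
```

With a free lock the call returns at depth 0. With the first lock held
and the clock lock free it goes three frames deep and stops. With both
held it never bottoms out on its own and runs into STACK_LIMIT, which is
the shape of the repeated _mtx_lock_spin/getit/DELAY frames above.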