From owner-freebsd-alpha  Thu Jun  7 14:52:32 2001
Message-ID: <XFMail.010607145147.jhb@FreeBSD.org>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <15135.50535.462934.648630@grasshopper.cs.duke.edu>
Date: Thu, 07 Jun 2001 14:51:47 -0700 (PDT)
From: John Baldwin <jhb@FreeBSD.org>
To: Andrew Gallatin <gallatin@cs.duke.edu>
Subject: Re: Possible VM patch..
Cc: freebsd-alpha@FreeBSD.org


On 07-Jun-01 Andrew Gallatin wrote:
> 
> John Baldwin writes:
>  > 
>  > 
>  > I just got a panic on my UP machine.  This one is a very weird panic that is
>  > only triggered when the witness code gets its internal per-process lock lists
>  > out of sync.  I wish it was easier to trigger.  It may be a witness bug or it
>  > may be some sort of data corruption.  :(  The NULL vm_object panic in
>  > vm_fault1() that I'm getting on the dual Rawhide seems fairly reproducible, so
>  > I've added in a bunch of KTR tracepoints to see if I can narrow it down.
> 
> Here's my latest panic:
> 
> login: panic: mutex vm not owned at ../../vm/vm_page.c:1017
> cpuid = 0; panic
> Stopped at      Debugger+0x34:  zapnot  v0,#0xf,a0      <v0=0x0,a0=0x6>
> db> tr
> Debugger() at Debugger+0x34
> panic() at panic+0x178
> _mtx_assert() at _mtx_assert+0x64
> vm_page_free_toq() at vm_page_free_toq+0x38
> vm_page_alloc() at vm_page_alloc+0x270
> vm_fault1() at vm_fault1+0x648
> vm_fault() at vm_fault+0x204
> trap() at trap+0xe20
> XentMM() at XentMM+0x2c
> --- memory management fault (from ipl 0) ---
> --- user mode ---
> 
> [I'm running an updated db_trace.c merged from NetBSD with Ross
> Harvey's all-singing / all dancing stack trace code, but it isn't
> too terribly interesting for this particular case.]
> 
> db> show locks
> exclusive (sleep mutex) vm (0xfffffc0000815430) locked @ ../../vm/vm_fault.c:301
> exclusive (sleep mutex) Giant (0xfffffc00008160c8) locked @ ../../vm/vm_fault.c:213
> 
> db> show pcpu
> cpuid     = 0
> ipis      = 0
> next ASN  = 240
> curproc   = 0xfffffe00068a1480: pid 41334 "ld"
> curpcb    = 0x7a96000
> fpcurproc = none
> idleproc  = 0xfffffe0005887600: pid 10 "idle: cpu0"
> spin locks held:
> db> x vm_mtx,10
> vm_mtx: 760c20          fffffc00        730700          fffffc00        30000
> vm_mtx+0x14:    0               806528          fffffc00        7b7910
> vm_mtx+0x24:    fffffc00        4               0               0
                                  ^^^^^^^^^^^^^^^^^
> vm_mtx+0x34:    0               0               0

The marked word is 0x4 == MTX_UNOWNED, which means no one owns this mutex.
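
To make the reading of that dump concrete, here is a minimal sketch of how the
lock word can be decoded by hand.  It assumes the marked pair of 32-bit words
("4 0") forms one little-endian 64-bit lock word, and that the two low bits
are the recursed/contested flags; the flag values are written from memory of
sys/mutex.h, and the helper names (join_words, lock_word_unowned,
lock_word_owned_by) are hypothetical, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as recalled from sys/mutex.h of this era; treat as assumptions. */
#define MTX_RECURSED  0x1u
#define MTX_CONTESTED 0x2u
#define MTX_UNOWNED   0x4u
/* Mask that strips the two low flag bits, leaving the owner field. */
#define MTX_FLAGMASK  (~(uint64_t)(MTX_RECURSED | MTX_CONTESTED))

/* Join two 32-bit words from a little-endian ddb dump into one 64-bit value. */
static uint64_t join_words(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* True if the lock word says nobody holds the mutex. */
static bool lock_word_unowned(uint64_t lock)
{
    return (lock & MTX_FLAGMASK) == MTX_UNOWNED;
}

/* True if the owner field of the lock word matches the given process pointer. */
static bool lock_word_owned_by(uint64_t lock, uint64_t proc)
{
    assert(proc != 0);
    return (lock & MTX_FLAGMASK) == proc;
}
```

With the values from the dump, lock_word_unowned(join_words(0x4, 0x0)) is
true, and lock_word_owned_by() with curproc (0xfffffe00068a1480) is false,
which is the condition the assertion in vm_page_free_toq() tripped over.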

> I think this is the same "can't happen" thing you were talking about
> earlier.  We've got a panic because vm isn't owned, but wait!  It is!

No, it's not owned, though I don't know why.  :(  I'll look at the code in a
bit.  The weird mtx_owned() problem I had was on an SMP system, too.

> Drew

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
