From owner-freebsd-hackers  Fri Feb 21 11: 9:25 2003
Date: Fri, 21 Feb 2003 11:09:13 -0800 (PST)
From: Ed Alley <alley1@llnl.gov>
Message-Id: <200302211909.h1LJ9Doc036666@jordan.llnl.gov>
To: freebsd-hackers@freebsd.org
Subject: Some advice in analyzing vfs crashes


I am looking for advice:

	I am developing a file system (it can exist as a module or compiled in).

	A problem has developed that I do not know how to debug, and I would
	like some advice.

	The symptoms:
		I can mount my FS, read from it, write to it, and unmount
	it. However, it is subject to intermittent page faults. These can happen
	while the FS is mounted, or even a while after it has been unmounted.
	The page faults occur at random and do not depend on whether I reference
	the FS or not. When I look at the core dump, I see that the fault occurs in
	the VFS, usually after an open, say ufs_open(), and often in vn_isdisk().
	When I look at the arguments to these calls, I see *vp = 0 or a bad address.

	Clearly my FS has corrupted a vnode, or the buffer cache.

	(Maybe some vnodes are still locked, referenced, or corrupted? Or some
	 buffer blocks are corrupted or were never released?)

	So I'm thinking to do the following:

		1. Mount my FS, cd into it and do the following:

			cp /etc/passwd .
			rm *
			cd /
			umount /mnt

		2. Panic the system to get a core dump.
			This could be done, say, by putting a panic(9) call
			in my VFS unmount() function just before the return.
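
	A minimal sketch of step 2, under stated assumptions: the names
	myfs_unmount_tail and myfs_panic_on_unmount are hypothetical and do not
	come from the actual FS code, and the panic() stub exists only so the
	fragment builds outside the kernel. Inside the module the call would be
	the real panic(9), placed just before the return of the unmount routine.

```c
#ifndef _KERNEL
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Userland stand-in for panic(9), only so this sketch compiles. */
static void
panic(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);
	abort();
}
#endif

/* Hypothetical switch: set nonzero to force a dump after a clean unmount. */
static int myfs_panic_on_unmount = 0;

/*
 * Hypothetical tail of the unmount routine. 'error' is whatever the
 * real teardown code returned; the panic fires only on a clean unmount
 * and only when the switch above is set.
 */
static int
myfs_unmount_tail(int error)
{
	if (error == 0 && myfs_panic_on_unmount)
		panic("myfs: forced panic after unmount");
	return (error);
}
```

	Keeping the panic behind a switch means the same build can be used for
	ordinary testing, with the switch flipped only for the run that should
	produce the dump.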

	My questions:

	   Is there a way to analyze a core dump with gdb(1) and detect:

		a) corrupted vnodes?
		b) locked/referenced vnodes (that should have been released)?
		c) corrupted buffer cache?
		d) locked buffer blocks (that should have been released)?

	   In other words, I want to evaluate the state of the VFS after my
	   FS has done its damage, but before the inevitable panic occurs.

	   Any further hints on what I should look at would be greatly appreciated.

			Thank you,
			Ed Alley
			
