From: John Baldwin <jhb@freebsd.org>
To: rank1seeker@gmail.com
Subject: Re: UFS related panic (daily <-> find)
Date: Fri, 26 Jul 2013 16:08:11 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; )
References: <20130719.174511.786.3@DOMY-PC>
 <201307261116.58638.jhb@freebsd.org> <20130726.190033.811.1@DOMY-PC>
In-Reply-To: <20130726.190033.811.1@DOMY-PC>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-15"
Content-Transfer-Encoding: 7bit
Message-Id: <201307261608.11556.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (bigwig.baldwin.cx); Fri, 26 Jul 2013 16:20:32 -0400 (EDT)
Cc: hackers@freebsd.org

On Friday, July 26, 2013 3:00:33 pm rank1seeker@gmail.com wrote:
> > > > > I had 2 panics: (Both occurred at 3 AM, so had to be daily task)
> > > > > 
> > > > > First (Jul  2 03:06:50 2013):
> > > > > --
> > > > > Fatal trap 12: page fault while in kernel mode
> > > > > fault virtual address   = 0x19
> > > > > fault code              = supervisor read, page not present
> > > > > instruction pointer     = 0x20:0xc06caf34
> > > > > stack pointer           = 0x28:0xe76248fc
> > > > > frame pointer           = 0x28:0xe7624930
> > > > > code segment            = base 0x0, limit 0xfffff, type 0x1b
> > > > >                         = DPL 0, pres 1, def32 1, gran 1
> > > > > processor eflags        = interrupt enabled, resume, IOPL = 0
> > > > > current process         = 76562 (find)
> > > > > trap number             = 12
> > > > > panic: page fault
> > > > > Uptime: 23h0m41s
> > > > > Physical memory: 1014 MB
> > > > > Dumping 186 MB: 171 155 139 123 107 91 75 59 43 27 11
> > > > > 
> > > > > #7  0xc06caf34 in cache_lookup_times (dvp=0xc784a990, vpp=0xe7624ae8,
> > > > >     cnp=0xe7624afc, tsp=0x0, ticksp=0x0) at /usr/src/sys/kern/vfs_cache.c:547
> > > > 
> > > > Can you go up to this frame and do 'l'?
> > > > 
> > > > -- 
> > > > John Baldwin
> > > 
> > > 
> > > Sure,
> > > 
> > > ---------
> > > (kgdb) up 7
> > > #7  0xc06caf34 in cache_lookup_times (dvp=0xc784a990, vpp=0xe7624ae8, cnp=0xe7624afc, tsp=0x0, ticksp=0x0) at /usr/src/sys/kern/vfs_cache.c:547
> > > 547                     numchecks++;
> > > ---------
> > > (kgdb) l
> > > 542             }
> > > 543
> > > 544             hash = fnv_32_buf(cnp->cn_nameptr, cnp->cn_namelen, FNV1_32_INIT);
> > > 545             hash = fnv_32_buf(&dvp, sizeof(dvp), hash);
> > > 546             LIST_FOREACH(ncp, (NCHHASH(hash)), nc_hash) {
> > > 547                     numchecks++;
> > > 548                     if (ncp->nc_dvp == dvp && ncp->nc_nlen == cnp->cn_namelen &&
> > > 549                         !bcmp(nc_get_name(ncp), cnp->cn_nameptr, ncp->nc_nlen))
> > > 550                             break;
> > > 551             }
> > > ---------
> > 
> > Hmm, 'p ncp' and 'p *ncp' at that frame perhaps?
> > 
> 
> (kgdb) p ncp
> $1 = (struct namecache *) 0x1
> (kgdb) p *ncp
> Cannot access memory at address 0x1

Interesting.  Maybe look at NCHHASH(hash) (you'll have to expand the macro manually)
and see whether the head node is corrupted, or walk the list to find the corrupted node.
Given that ncp is 0x1 rather than NULL, which is a single-bit error, there is a chance
this is a RAM problem.  If the bad value is in the hash table head entry, then I think
it would always be at the same physical address for the same kernel.
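
To make the two checks above concrete, here is a rough userland sketch, not
kernel code.  It counts how many bits separate the pointer kgdb printed from
the NULL you would expect at the end of a chain, and it walks a chain while
refusing to dereference a pointer that cannot be valid.  struct node and the
names bits_differing, pointer_plausible and find_corrupt_link are made up for
illustration; the real structure is struct namecache and the real link is
nc_hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct namecache; only the chain link matters. */
struct node {
	struct node *next;	/* plays the role of nc_hash.le_next */
	int id;
};

/* Count the bits that differ between the value seen and the value expected. */
static int
bits_differing(uintptr_t seen, uintptr_t expected)
{
	uintptr_t x = seen ^ expected;
	int n = 0;

	while (x != 0) {
		x &= x - 1;	/* clear the lowest set bit */
		n++;
	}
	return (n);
}

/*
 * A chain pointer is plausible if it is NULL or pointer-aligned.
 * Assumption: pointer alignment equals pointer size (true on i386/amd64).
 */
static int
pointer_plausible(const struct node *p)
{
	return (p == NULL || (uintptr_t)p % sizeof(void *) == 0);
}

/*
 * Walk the chain starting at head without dereferencing a bad pointer.
 * Returns -1 if the chain ends cleanly in NULL, -2 if more than max_nodes
 * nodes were seen (possible cycle), otherwise the number of nodes visited
 * before the implausible pointer: 0 means the head itself is bad, k > 0
 * means the k-th node's next link is bad.
 */
static int
find_corrupt_link(const struct node *head, int max_nodes)
{
	const struct node *p = head;
	int visited = 0;

	assert(max_nodes > 0);
	while (visited <= max_nodes) {
		if (!pointer_plausible(p))
			return (visited);
		if (p == NULL)
			return (-1);
		p = p->next;
		visited++;
	}
	return (-2);
}
```

With ncp == 0x1, bits_differing(0x1, 0x0) is 1, which is what makes a flipped
bit in RAM a candidate.  A return of 0 from find_corrupt_link() corresponds to
the corrupted head entry case, anything greater to a corrupted node further
down the chain.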

-- 
John Baldwin