FreeBSD Mail Archives

Date:      Wed, 7 May 2003 13:47:34 -0400
From:      Ali Bahar <alih@internetDog.org>
To:        freebsd-hackers@freebsd.org
Subject:   Re: cache_purge > cache_zap segmentation fault
Message-ID:  <20030507134734.A12455@internetDog.org>
In-Reply-To: <20030504113221.A27756@internetDog.org>; from alih@internetDog.org on Sun, May 04, 2003 at 11:32:21AM -0400
References:  <20030504113221.A27756@internetDog.org>

On Sun, May 04, 2003 at 11:32:21AM -0400, Ali wrote:

> this post may be of interest to people familiar with the filesystem code. 

>   syscall2 > open > vn_open > namei > lookup > ufs_vnoperate > 
>     vfs_cache_lookup > ufs_vnoperate > ufs_lookup > ffs_vget > getnewvnode >
>       cache_purge > cache_zap

The name cache is corrupted. 

Most of the threads involve getnewvnode, so a new file is being
opened. The only thread observed not to include getnewvnode used
cache_enter, so a new cache entry was being created.

I consider it a corruption because a namecache node has a junk value
for nc_src.le_next (0x117 in the dump below). This is then dereferenced
as the next namecache node, causing the segmentation fault.

   (gdb) p ncp
   $4 = (struct namecache *) 0xc0d62b40
   (gdb) p *ncp
   $5 = {
     nc_hash = {
       le_next = 0x0, 
       le_prev = 0xc0cd2ae4
     }, 
     nc_src = {
       le_next = 0x117, 
       le_prev = 0xc0002a48
     }, 
     nc_dst = {
       tqe_next = 0x0, 
       tqe_prev = 0xc61f9940
     }, 
     nc_dvp = 0xc61f33c0, 
     nc_vp = 0xc61f98c0, 
     nc_flag = 0 '\0', 
     nc_nlen = 7 '\a', 
     nc_name = 0xc0d62b62 "time.el<FB>\t\b[<FB>\t\bX<FB>\t\bM<FB>\t\bJ<FB>\t\b;<FB>\t\b"
   }
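
To illustrate why a value like 0x117 in le_next is fatal, here is a
minimal user-space sketch in C of a list walk that checks each link
before following it. It is only a model, not kernel code: struct
nc_model, link_plausible(), walk_checked() and the lo/hi address bounds
are names made up for this example, and the real struct namecache has
more fields and its own allocator.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one LIST_ENTRY link of a namecache node. */
struct nc_model {
    struct nc_model *le_next;   /* next entry, or NULL at end of list */
    struct nc_model **le_prev;  /* address of the previous le_next */
    const char *name;
};

/*
 * Hypothetical plausibility test: a link is accepted if it is NULL, or
 * if it is pointer-aligned and lies inside [lo, hi), the region the
 * caller says nodes are allocated from.
 */
int
link_plausible(const struct nc_model *p, uintptr_t lo, uintptr_t hi)
{
    uintptr_t a = (uintptr_t)p;

    if (p == NULL)
        return 1;
    return a >= lo && a < hi && (a % sizeof(void *)) == 0;
}

/*
 * Walk the list, checking every pointer before it is dereferenced.
 * Returns the number of nodes visited, or -1 at the first junk link.
 */
int
walk_checked(const struct nc_model *head, uintptr_t lo, uintptr_t hi)
{
    const struct nc_model *p;
    int n = 0;

    for (p = head; p != NULL; p = p->le_next) {
        if (!link_plausible(p, lo, hi))
            return -1;          /* would have faulted on p->le_next */
        n++;
    }
    return n;
}
```

An unchecked walk has no such test, so it loads from address 0x117 as
if it were a node, and that load is where the fault happens.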

As 'cache_purge > cache_zap' is involved, it may be that namecache
node deletions have left a deleted node dangling.
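
By "dangling" I mean something like the following user-space sketch in
C: a node whose memory has been released while some list still links to
it. This is a guess at the failure mode, not a diagnosis. struct lnode,
struct lhead, the freed flag and the three helpers are hypothetical
names for this model, and the unlink logic only imitates the
le_next/le_prev idiom seen in the dump.

```c
#include <stddef.h>

struct lnode {
    struct lnode *le_next;    /* next entry, or NULL */
    struct lnode **le_prev;   /* address of the previous le_next */
    int freed;                /* model only: memory was given back */
};

struct lhead {
    struct lnode *lh_first;
};

/* Insert n at the head of the list. */
void
insert_head(struct lhead *h, struct lnode *n)
{
    n->le_next = h->lh_first;
    if (n->le_next != NULL)
        n->le_next->le_prev = &n->le_next;
    h->lh_first = n;
    n->le_prev = &h->lh_first;
    n->freed = 0;
}

/* Unlink n from whatever list it is on. */
void
remove_node(struct lnode *n)
{
    if (n->le_next != NULL)
        n->le_next->le_prev = n->le_prev;
    *n->le_prev = n->le_next;
}

/* Count nodes that are still reachable although marked freed. */
int
count_dangling(const struct lhead *h)
{
    const struct lnode *p;
    int bad = 0;

    for (p = h->lh_first; p != NULL; p = p->le_next)
        if (p->freed)
            bad++;
    return bad;
}
```

If a node is unlinked with remove_node() before it is marked freed,
count_dangling() stays at 0. If it is marked freed while still linked,
the count is nonzero, and in a real kernel the reused memory could then
hold anything, including a junk le_next.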

What I do not know is whether there is a single system-wide name cache
or a per-directory cache linked list (LL). Neither the beastie book
(McKusick et al.) nor the FreeBSD Developers' Handbook seems to cover
this. Knowing the answer would help me determine what the LLs are
supposed to look like, and thereby help diagnose when the LL begins
to go wrong.

> P.P.S. It's been occurring intermittently, and increasingly,
> recently. (Due to its increased prevalence, I even suspected that the
> frequency of kernel crashes might have corrupted the filesystem in a
> way ignorable/imperceptible by fsck/me!)

I no longer think so. 
Certainly a 'typical' filesystem corruption would lead to all sorts of
random faults, not the consistent execution threads noted above. This
is closer to a 'bug' than to a 'corruption'. Nonetheless, it may still
be (somehow!) caused by me, rather than being a bug in the generic kernel.

regards,
ali
-- 
             Jesus was an Arab.

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20030507134734.A12455>
