Date: Wed, 7 May 2003 13:47:34 -0400
From: Ali Bahar <alih@internetDog.org>
To: freebsd-hackers@freebsd.org
Subject: Re: cache_purge > cache_zap segmentation fault
Message-ID: <20030507134734.A12455@internetDog.org>
In-Reply-To: <20030504113221.A27756@internetDog.org>; from alih@internetDog.org on Sun, May 04, 2003 at 11:32:21AM -0400
References: <20030504113221.A27756@internetDog.org>

On Sun, May 04, 2003 at 11:32:21AM -0400, Ali wrote:
> this post may be of interest to people familiar with the filesystem code.

> syscall2 > open > vn_open > namei > lookup > ufs_vnoperate >
> vfs_cache_lookup > ufs_vnoperate > ufs_lookup > ffs_vget > getnewvnode >
> cache_purge > cache_zap

The name cache is corrupted.

Most of the threads involve getnewvnode, so a new file is being opened.
The only thread observed not to include getnewvnode used cache_enter,
so in that case a new cache entry is being created.

I consider it a corruption because a namecache node has a junk value
for nc_src.le_next. This is then dereferenced as the next namecache
node, which causes the segmentation fault.

(gdb) p ncp
$4 = (struct namecache *) 0xc0d62b40
(gdb) p *ncp
$5 = {
  nc_hash = {
    le_next = 0x0,
    le_prev = 0xc0cd2ae4
  },
  nc_src = {
    le_next = 0x117,
    le_prev = 0xc0002a48
  },
  nc_dst = {
    tqe_next = 0x0,
    tqe_prev = 0xc61f9940
  },
  nc_dvp = 0xc61f33c0,
  nc_vp = 0xc61f98c0,
  nc_flag = 0 '\0',
  nc_nlen = 7 '\a',
  nc_name = 0xc0d62b62 "time.el<FB>\t\b[<FB>\t\bX<FB>\t\bM<FB>\t\bJ<FB>\t\b;<FB>\t\b"
}

As 'cache_purge > cache_zap' is involved, it may be that namecache
node deletions have left a deleted node dangling.

What I do not know is whether there is a single system-wide name
cache, or a per-directory cache linked list (LL). Neither the beastie
book (McKusick et al) nor the FreeBSD Developers' Handbook seems to
cover this. Knowing the answer would help me determine what the LLs
are supposed to look like, and thereby help diagnose when the LL
begins to go wrong.

> P.P.S. It's been occurring intermittently, and increasingly,
> recently. (Due to its increased prevalence, I even suspected that the
> frequency of kernel crashes, might have corrupted the filesystem in a
> way ignorable/imperceptible by fsck/me!)

I no longer think so. Certainly a 'typical' filesystem corruption
would lead to all sorts of random faults, not the consistent execution
threads noted above. This is closer to a 'bug' than to a 'corruption'.
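
To make the "what should the LL look like" question concrete, here is a minimal userland sketch of the invariant that a le_next/le_prev list is expected to satisfy: each entry's le_prev holds the address of the pointer that points at that entry. Everything in it is an assumption for illustration. struct nc, pool, in_pool, nc_insert_head and nc_list_check are hypothetical names, not kernel code, and the fixed pool merely stands in for "addresses that could plausibly hold a namecache entry". It only models the nc_src linkage seen in the gdb dump.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for struct namecache: only the
 * nc_src-style linkage (le_next / le_prev) from the dump is modelled. */
struct nc {
    struct nc  *le_next;   /* next entry on the list, or NULL at the end */
    struct nc **le_prev;   /* address of the pointer that points at us   */
    const char *name;
};

#define POOL_SIZE 4
static struct nc pool[POOL_SIZE];  /* assumed arena holding every valid entry */

/* A pointer is plausible only if it addresses a whole slot of the pool. */
static int in_pool(const struct nc *p)
{
    uintptr_t a  = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)&pool[0];
    uintptr_t hi = lo + sizeof(pool);

    return a >= lo && a < hi && (a - lo) % sizeof(struct nc) == 0;
}

/* Insert at the head, maintaining both links (same shape as the
 * LIST_INSERT_HEAD macro from sys/queue.h). */
static void nc_insert_head(struct nc **head, struct nc *elm)
{
    if ((elm->le_next = *head) != NULL)
        (*head)->le_prev = &elm->le_next;
    *head = elm;
    elm->le_prev = head;
}

/* Walk the list without dereferencing an implausible pointer.
 * Returns the number of entries, or -1 at the first inconsistency. */
static int nc_list_check(struct nc **head)
{
    struct nc **prevp = head;
    int n = 0;

    for (struct nc *p = *head; p != NULL; p = p->le_next) {
        if (!in_pool(p))            /* junk such as 0x117 is caught here */
            return -1;
        if (p->le_prev != prevp)    /* back link does not match */
            return -1;
        if (n >= POOL_SIZE)         /* more entries than slots: a cycle */
            return -1;
        prevp = &p->le_next;
        n++;
    }
    return n;
}
```

Inserting two pool entries with nc_insert_head and calling nc_list_check on the head should report two entries; overwriting the last entry's le_next with (struct nc *)(uintptr_t)0x117 should make the check fail before the junk value is followed, which is the step where the dump above shows things going wrong.
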
Nonetheless, it may still be (somehow!) caused by me, rather than
being a bug in the generic kernel.

regards,
ali

-- 
Jesus was an Arab.