From owner-freebsd-bugs@FreeBSD.ORG Mon Jun 16 02:10:12 2003
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 69B7937B401
    for ; Mon, 16 Jun 2003 02:10:12 -0700 (PDT)
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
    by mx1.FreeBSD.org (Postfix) with ESMTP id E615643FAF
    for ; Mon, 16 Jun 2003 02:10:11 -0700 (PDT)
    (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
    by freefall.freebsd.org (8.12.9/8.12.9) with ESMTP id h5G9A9Up086924
    for ; Mon, 16 Jun 2003 02:10:09 -0700 (PDT)
    (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.12.9/8.12.9/Submit) id h5G9A97u086923;
    Mon, 16 Jun 2003 02:10:09 -0700 (PDT)
Date: Mon, 16 Jun 2003 02:10:09 -0700 (PDT)
Message-Id: <200306160910.h5G9A97u086923@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Tom Alsberg
Subject: Re: kern/53004: union_lookup returning . (0xbc332e90) not same as startdir (0xc1fa8a40)
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
Reply-To: Tom Alsberg
List-Id: Bug reports
X-List-Received-Date: Mon, 16 Jun 2003 09:10:12 -0000

The following reply was made to PR kern/53004; it has been noted by GNATS.

From: Tom Alsberg
To: freebsd-gnats-submit@FreeBSD.org, scrappy@hub.org
Cc:
Subject: Re: kern/53004: union_lookup returning . (0xbc332e90) not same as startdir (0xc1fa8a40)
Date: Mon, 16 Jun 2003 12:01:47 +0300

I noticed this a few days ago too, and sent a message to the
FreeBSD-hackers list.  David Schultz asked me to repost it to gnats
as a followup to this PR.  The message follows, including a simple
(and, as far as I have noticed, "foolproof") way to reproduce the panic:

From: Tom Alsberg
To: FreeBSD Hackers List
Subject: (bug?) panic in union filesystem - file/.

Hi there.

I recently stumbled upon a crash in the union filesystem.  It seems
that trying to stat "<file>/.", where <file> is a regular
(non-directory) file in a union-mounted filesystem, makes the system
panic.

I first noticed this as a side effect of tab completion in zsh (the
Z shell).  As I found when I checked, zsh tries to lstat "<file>/."
when there are no other completions and the file exists, to see
whether it is a directory containing other files that it should try
to complete (I do not know why they chose to do it this way).

It seems like a bug in the union filesystem to me.  I can reproduce it
on both 4.8-STABLE and 5.1-CURRENT.

The simplest way I have found to reproduce it:

    # Create two directories somewhere:
    cd /var/tmp
    mkdir foo
    mkdir bar

    # union-mount one on top of the other:
    mount -t union bar foo

    # enter the mounted directory, create a regular file there, and read
    # <file>/.:
    cd foo
    touch meow
    cat meow/.

Everywhere I checked, there is a panic at that point:

    panic: union_lookup returning . (0xc8d83edc) not same as startdir (0xc8cb2e00)

Relevant part of a backtrace (from gdb -k on saved core files of a
4.8-STABLE kernel compiled with debugging):

#0  dumpsys () at /r+d/4.8/src/sys/kern/kern_shutdown.c:487
#1  0xc022b067 in boot (howto=256)
    at /r+d/4.8/src/sys/kern/kern_shutdown.c:316
#2  0xc022b4a5 in panic (
    fmt=0xc0420e80 "union_lookup returning .
(%p) not same as startdir (%p)") at /r+d/4.8/src/sys/kern/kern_shutdown.c:595
#3  0xc02674b8 in union_lookup (ap=0xc8d83d70)
    at /r+d/4.8/src/sys/miscfs/union/union_vnops.c:615
#4  0xc02577fd in lookup (ndp=0xc8d83ec8) at vnode_if.h:52
#5  0xc02572f8 in namei (ndp=0xc8d83ec8)
    at /r+d/4.8/src/sys/kern/vfs_lookup.c:153
#6  0xc025fd43 in vn_open (ndp=0xc8d83ec8, fmode=1, cmode=0)
    at /r+d/4.8/src/sys/kern/vfs_vnops.c:138
#7  0xc025be78 in open (p=0xc8d74ac0, uap=0xc8d83f80)
    at /r+d/4.8/src/sys/kern/vfs_syscalls.c:1029
#8  0xc03c5a45 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
      tf_edi = 134564005, tf_esi = -1077939303, tf_ebp = -1077939744,
      tf_isp = -925351980, tf_ebx = -1077939304, tf_edx = 134578912,
      tf_ecx = 1, tf_eax = 5, tf_trapno = 12, tf_err = 2,
      tf_eip = 134531788, tf_cs = 31, tf_eflags = 663,
      tf_esp = -1077939788, tf_ss = 47})
    at /r+d/4.8/src/sys/i386/i386/trap.c:1175
#9  0xc03b5995 in Xint0x80_syscall ()

I looked a bit at the code of the union filesystem.  As far as I can
tell so far, the cause is union_allocvp() putting NULL in
(*ap->a_vpp) at this call (src/sys/miscfs/union/union_vnops.c,
union_lookup(...), about line 543):

    error = union_allocvp(ap->a_vpp, dvp->v_mount, dvp, upperdvp, cnp,
                          uppervp, lowervp, 1);

which later triggers this check (src/sys/miscfs/union/union_vnops.c,
union_lookup(...), about line 573):

    #ifdef DIAGNOSTIC
        if (cnp->cn_namelen == 1 &&
            cnp->cn_nameptr[0] == '.' &&
            *ap->a_vpp != dvp) {
            panic("union_lookup returning . (%p) not same as startdir (%p)", ap->a_vpp, dvp);
        }
    #endif

However, I am not sure what exactly goes wrong in or before
union_allocvp(), and I do not yet understand what is going on in the
code there.  In particular, I am not sure what the DIAGNOSTIC-marked
check is for, or why this specific case is special.  I do see that
without it union_lookup() would just fail (and not panic), so
removing it is perhaps a workaround.

Can someone with more experience/understanding of the union
filesystem take a look at this?

Thanks,
-- Tom

-- 
  Tom Alsberg - hacker (being the best description fitting this space)
  Web page: http://www.cs.huji.ac.il/~alsbergt/
DISCLAIMER:  The above message does not even necessarily represent what
my fingers have typed on the keyboard, save anything further.