From owner-freebsd-hackers@FreeBSD.ORG Thu Jan 3 13:25:57 2008
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id E156716A419;
    Thu, 3 Jan 2008 13:25:57 +0000 (UTC)
    (envelope-from danny@cs.huji.ac.il)
Received: from cs1.cs.huji.ac.il (cs1.cs.huji.ac.il [132.65.16.10])
    by mx1.freebsd.org (Postfix) with ESMTP id 9B17C13C45A;
    Thu, 3 Jan 2008 13:25:57 +0000 (UTC)
    (envelope-from danny@cs.huji.ac.il)
Received: from pampa.cs.huji.ac.il ([132.65.80.32])
    by cs1.cs.huji.ac.il with esmtp id 1JAQ55-000MJe-VY;
    Thu, 03 Jan 2008 15:25:56 +0200
X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.2
To: Eric Anderson
In-reply-to: Your message of Thu, 03 Jan 2008 06:46:34 -0600.
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 03 Jan 2008 15:25:55 +0200
From: Danny Braniss
Cc: freebsd-hackers@freebsd.org
Subject: Re: nfs v2/v3 and diskless boot problem
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
X-List-Received-Date: Thu, 03 Jan 2008 13:25:58 -0000

> Danny Braniss wrote:
> >> Danny Braniss wrote:
> >>>> Danny Braniss wrote:
> >>>>> there is an undocumented option:
> >>>>>     boot-nfsroot-options
> >>>>> that the diskless boot can use. I tried
> >>>>>     boot-nfsroot-options = "nfsv3"
> >>>>> since the pxeboot does the initial mount via nfsv2, and this has at least
> >>>>> one problem: removing a file from the readonly / will hang the system.
> >>>>>
> >>>>> so, the remount to v3 works in the case that the root is served by a FreeBSD
> >>>>> nfs server, but fails if it's NetApp. The reason is that the v2 filehandle
> >>>>> is 32 bytes, and when switching to v3 it becomes 28 bytes - sizeof(fhandle_t).
> >>>>> This is not liked by the NetApp, which correctly gives error 10001: BADHANDLE
> >>>>> :-)
> >>>>>
> >>>>> While I'm trying to come up with a solution, I am wondering if someone
> >>>>> can shed some light:
> >>>>> - is sizeof(fhandle_t) == 28 bytes mystical, or will changing it to
> >>>>>   32 bytes start WW3?
> >>>> NFSv3 file handles (by spec) can be up to 64 bytes.
> >>> true, but in FreeBSD, look at sys/nfs/nfsproto.h
> >>>     #define NFSX_V2FH 32
> >>>     #define NFSX_V3FH (sizeof (fhandle_t))
> >>>     #define NFSX_V4FH 128
> >>>
> >>> so for v3 it's 28 bytes. (fhandle_t is defined in sys/mount.h)
> >>>
> >>>
> >>>> I'm not 100% sure what is happening, but it sounds like the file handle
> >>>> for the mount point or maybe one of the directories is not getting reset
> >>>> on remount.
> >>>>
> >>>> When do you get the BADHANDLE error? Can you capture a
> >>>> tshark/wireshark/tcpdump of the remount and error?
> >>> I did, and if you look in sys/nfsclient/nfs_vfsops.c, nfs_convert_diskless is responsible
> >>> for chopping off the 4 extra bytes. BTW, I tried to change the bcopy count to NFSX_V2FH/32, and
> >>> it panics the kernel :-(
> >>>
> >>> danny
> >>
> >> oh - looks like this says it all:
> >> http://fxr.googlebit.com/source/sys/nfsclient/nfsdiskless.h?v=8-CURRENT#L51
> >>
> > that's where the boot-nfsroot-options comes from :-)
> > if you notice, the filehandle for v3 is 64 bytes, but
> > only 28 are used.
> >
> > but as I mentioned initially, this ONLY works when the server is FreeBSD, and
> > breaks for other servers, i.e. NetApp. AND the initial question stands:
> > what's in a filehandle, or can it be > 28 bytes?
> >
> >
> Yea, FreeBSD is making the assumption that all NFS servers will use the
> same size FH for NFSv3. That is just wrong.
>
careful, I think this is the case only if FreeBSD is the server; as a client it
will 'probably' accept filehandles of other sizes from other servers.

> The FH is a server created opaque handle that it can create however it
> wishes.
> Most servers use information like inode, generation, fsid, etc.
> to create it, but it's something that you can't necessarily decode.
>
yes, but the FH has information that the server can/must use to figure out
which local filesystem it refers to - remember that v2/v3 are stateless.

> I've created a patch that might fix this, but I'm still testing and QEMU
> (which I use for my testing) keeps making my system either panic or lock
> up, so hopefully I should have something for you to try tonight.
>
> Also - can you tell me the exact 'mount' command you tried to do the
> remount/update?
>
it's only in the diskless boot, where setting
    boot-nfsroot-options = "nfsv3"
in /boot/loader.conf will do the remount.

cheers,
    danny
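
PS: for anyone following along, here is a rough userland sketch of the size
mismatch discussed above. It is NOT the kernel code and not a patch for
nfs_convert_diskless; the struct and function names are made up for
illustration, and V3_FH_FIXED just stands in for the sizeof(fhandle_t) == 28
quoted earlier. It only shows why a server that checks the whole opaque handle
would reject one that lost its last 4 bytes, and that carrying an explicit
length avoids the truncation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sizes quoted in the thread. V3_FH_FIXED stands in for sizeof(fhandle_t),
 * reported as 28 bytes; the real struct is not reproduced here. */
#define V2_FH_SIZE   32
#define V3_FH_FIXED  28
#define V3_FH_MAX    64   /* NFSv3 upper bound mentioned in the thread */

/* Hypothetical variable-length handle: opaque bytes plus an explicit length. */
struct fh_sketch {
    size_t        len;
    unsigned char data[V3_FH_MAX];
};

/* Mimics the truncating conversion described in the thread: only the first
 * V3_FH_FIXED bytes of the 32-byte v2 handle are kept. */
static void
convert_truncating(const unsigned char v2fh[V2_FH_SIZE], struct fh_sketch *out)
{
    memset(out, 0, sizeof(*out));
    out->len = V3_FH_FIXED;
    memcpy(out->data, v2fh, V3_FH_FIXED);
}

/* Alternative for comparison: carry the handle over unchanged, whatever its
 * length, as long as it fits the NFSv3 maximum. */
static void
convert_preserving(const unsigned char *fh, size_t len, struct fh_sketch *out)
{
    assert(len <= V3_FH_MAX);
    memset(out, 0, sizeof(*out));
    out->len = len;
    memcpy(out->data, fh, len);
}

/* Returns 1 if the converted handle is byte-for-byte what the server issued,
 * which is the check a strict server effectively performs. */
static int
same_handle(const unsigned char *orig, size_t orig_len,
    const struct fh_sketch *fh)
{
    return fh->len == orig_len && memcmp(fh->data, orig, orig_len) == 0;
}
```

Feeding a 32-byte handle through convert_truncating() gives a 28-byte handle
that no longer matches the original, while convert_preserving() keeps all 32
bytes. Whether the real client can be changed this way without the panic
mentioned above is a separate question.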