From owner-freebsd-current@FreeBSD.ORG Fri Aug 19 14:42:27 2011
Date: Fri, 19 Aug 2011 10:32:47 -0400 (EDT)
From: Rick Macklem
To: Hiroki Sato
Message-ID: <1623060518.69434.1313764367817.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20110819.224310.740411147168584392.hrs@allbsd.org>
Cc: pjd@FreeBSD.org, current@FreeBSD.org
Subject: Re: fsid change of ZFS?

Hiroki Sato wrote:
> Hiroki Sato wrote
>   in <20110819.002046.908756241495481148.hrs@allbsd.org>:
>
> hr> Hi,
> hr>
> hr> I have experienced "Stale NFS file handle" issue when switching
> hr> between oldnfs and newnfs on a CURRENT box (NFS server exporting ZFS
> hr> mountpoints). The cause was that fsid was changed in the following
> hr> conditions and not in the NFS subsystem itself, but I am wondering if
> hr> these are expected behavior...
> hr>
> hr> First, I tried the following configurations of NFS and ZFS, and saw
> hr> if fsid of the same mountpoint (a mounted ZFS dataset) changed or
> hr> not by using statfs(2):
> hr>
> hr> compile opts           kld module      fsid[0:1]          kld loaded by
> hr> ----------------------------------------------------------------------
> hr> NFSSERVER+NFSCLIENT    zfs             865798fa:8346ef02  loader
> hr>
> hr> NFSSERVER+NFSCLIENT    zfs             865798fa:8346ef07  kldload(8)
> hr>
> hr> NFSSERVER+NFSCLIENT+
> hr>  NFSD+NFSCL            zfs             865798fa:8346ef03  loader
> hr>
> hr> NFSSERVER+NFSCLIENT+
> hr>  NFSD+NFSCL            zfs             865798fa:8346ef08  kldload(8)
> hr>
> hr> NFSSERVER+NFSCLIENT    nfsd+nfscl+zfs  865798fa:8346ef08  loader
> hr> ----------------------------------------------------------------------
>
> Ah, I found why this happened:
>
>         /*
>          * The fsid is 64 bits, composed of an 8-bit fs type, which
>          * separates our fsid from any other filesystem types, and a
>          * 56-bit objset unique ID. The objset unique ID is unique to
>          * all objsets open on this system, provided by unique_create().
>          * The 8-bit fs type must be put in the low bits of fsid[1]
>          * because that's where other Solaris filesystems put it.
>          */
>         fsid_guid = dmu_objset_fsid_guid(zfsvfs->z_os);
>         ASSERT((fsid_guid & ~((1ULL<<56)-1)) == 0);
>         vfsp->vfs_fsid.val[0] = fsid_guid;
>         vfsp->vfs_fsid.val[1] = ((fsid_guid>>32) << 8) |
>             vfsp->mnt_vfc->vfc_typenum & 0xFF;
>
> Since the vfc_typenum variable is incremented every time a new vfs is
> installed, loading order of modules that call vfs_register() affects
> ZFS's fsid.
>
> Anyway, possibility of fsid change is troublesome especially for an
> NFS server with a lot of clients running. Can zeroing or setting a
> fixed value to the lowest 8-bit of vfs_fsid.val[1] be harmful?
>
> -- Hiroki

Oh, and I think other fs types would suffer the same fate, except that
they usually avoid it: they are compiled into the kernel, so the
assignment of vfc_typenum happens in the same order and yields the same
value.

My (B) suggestion would avoid this for all file system types in the
fixed table.

rick