Date: Sat, 25 Jun 2011 07:58:23 -0400
From: John Baldwin <jhb@freebsd.org>
To: freebsd-fs@freebsd.org
Cc: shadow@gmail.com, Robert Watson <rwatson@freebsd.org>, Garance A Drosehn <gad@freebsd.org>
Subject: Re: [rfc] 64-bit inode numbers
Message-ID: <201106250758.23935.jhb@freebsd.org>
In-Reply-To: <alpine.GSO.1.10.1106242244170.6818@multics.mit.edu>
References: <1656190156.1051008.1308953344203.JavaMail.root@erie.cs.uoguelph.ca> <alpine.GSO.1.10.1106242244170.6818@multics.mit.edu>

On Friday, June 24, 2011 11:38:35 pm Benjamin Kaduk wrote:
> > point. fts(3) and friends will assume that it is a mount point
> > crossing when st_dev changes. It will then expect that the funny
> > rule that the d_ino in dirent will not be the same as st_ino.
> >
> > What I do for NFSv4 is synthesize the mnt_stat.f_fsid value and
> > return that as st_dev for the mounted volume until I see the fsid
> > returned by the server change. Below that point, I return the fsid
> > from the server as st_dev so long as it isn't the same as the
>
> I think I'm confused. You're ... walking a directory hierarchy, and
> return a fake st_dev value but hold onto the fsid value from the server,
> then when the fsid from the server changes (due to a ... different NFS
> mount?), start reporting that new fsid and throw away the fake st_dev
> value? Can you point me at the code that is doing this?

I think he's saying that VOP_GETATTR() for different vnodes in a single
NFSv4 "mount" (as in 'struct mount *') can return different st_dev values
to userland where the st_dev value for a given vnode depends on the remote
fsid of the file on the NFSv4 server. That is, for NFSv4 it seems that all
files on a mount do not use the same value of st_dev (as they would for a
local filesystem), but instead only files from the same logical volume on
the server share an st_dev. In other words, st_dev is per-vnode rather than
just copied from the mount.

This is done by storing va_fsid in the NFS attribute cache for each vnode:

int
nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
    void *stuff, int writeattr, int dontshrink)
{
	...
	/*
	 * For NFSv4, if the node's fsid is not equal to the mount point's
	 * fsid, return the low order 32bits of the node's fsid. This
	 * allows getcwd(3) to work. There is a chance that the fsid might
	 * be the same as a local fs, but since this is in an NFS mount
	 * point, I don't think that will cause any problems?
	 */
	if (NFSHASNFSV4(nmp) && NFSHASHASSETFSID(nmp) &&
	    (nmp->nm_fsid[0] != np->n_vattr.na_filesid[0] ||
	     nmp->nm_fsid[1] != np->n_vattr.na_filesid[1])) {
		/*
		 * va_fsid needs to be set to some value derived from
		 * np->n_vattr.na_filesid that is not equal
		 * vp->v_mount->mnt_stat.f_fsid[0], so that it changes
		 * from the value used for the top level server volume
		 * in the mounted subtree.
		 */
		if (vp->v_mount->mnt_stat.f_fsid.val[0] !=
		    (uint32_t)np->n_vattr.na_filesid[0])
			vap->va_fsid = (uint32_t)np->n_vattr.na_filesid[0];
		else
			vap->va_fsid = (uint32_t)hash32_buf(
			    np->n_vattr.na_filesid, 2 * sizeof(uint64_t), 0);
	} else
		vap->va_fsid = vp->v_mount->mnt_stat.f_fsid.val[0];
	...
}

Then VOP_GETATTR() returns the va_fsid saved in the attribute cache ('vap')
as the vnode's va_fsid, which is used to compute st_dev in vn_stat().

I think the effect here is that 'mount' still only shows a single
mountpoint for NFSv4, but applications that check for 'st_dev' changing to
see if they are crossing a mountpoint (e.g. find -x) will treat the volumes
as different mountpoints.

-- 
John Baldwin
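
For anyone who wants to poke at the selection rule outside the kernel, here is a rough userland sketch of it. It is only a model under stated assumptions, not the NFS client code: struct fsid_model, model_hash32(), model_va_fsid() and crosses_volume() are hypothetical names made up for illustration, and the FNV-1a hash is merely a stand-in for hash32_buf(), so the hashed values will not match what the kernel produces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userland model of the per-mount state; not kernel code. */
struct fsid_model {
	uint32_t mount_fsid0;	/* models vp->v_mount->mnt_stat.f_fsid.val[0] */
	uint64_t mount_srv[2];	/* models nmp->nm_fsid[] (server fsid of the root) */
	bool	 is_nfsv4;	/* models NFSHASNFSV4() && NFSHASHASSETFSID() */
};

/* Stand-in for hash32_buf(): 32-bit FNV-1a over the 16 bytes, low byte first. */
static uint32_t
model_hash32(const uint64_t v[2])
{
	uint32_t h = 2166136261u;

	for (int i = 0; i < 2; i++) {
		for (int b = 0; b < 8; b++) {
			h ^= (uint32_t)((v[i] >> (8 * b)) & 0xff);
			h *= 16777619u;
		}
	}
	return (h);
}

/* Pick the value reported as va_fsid (and hence st_dev) for one file. */
static uint32_t
model_va_fsid(const struct fsid_model *m, const uint64_t file_srv[2])
{
	assert(m != NULL && file_srv != NULL);

	if (m->is_nfsv4 &&
	    (m->mount_srv[0] != file_srv[0] || m->mount_srv[1] != file_srv[1])) {
		/* File lives on a different server volume than the mount root. */
		if (m->mount_fsid0 != (uint32_t)file_srv[0])
			return ((uint32_t)file_srv[0]);
		/* Low 32 bits collide with the mount's own value: hash instead. */
		return (model_hash32(file_srv));
	}
	/* Same server volume as the root (or not NFSv4): use the mount's value. */
	return (m->mount_fsid0);
}

/* The check a "find -x" style walker effectively applies between parent and child. */
static bool
crosses_volume(uint32_t parent_dev, uint32_t child_dev)
{
	return (parent_dev != child_dev);
}
```

With a model mount whose mount_fsid0 is 0x1000 and whose server fsid is {0x10, 0x20}, a file with server fsid {0x10, 0x20} gets 0x1000, while a file with server fsid {0x30, 0x40} gets 0x30, so crosses_volume() reports a crossing between the two even though both sit under one 'struct mount'.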