Date: Mon, 19 Dec 2016 21:59:59 +0000 From: Rick Macklem <rmacklem@uoguelph.ca> To: Colin Percival <cperciva@tarsnap.com> Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org> Subject: Re: ESTALE after cwd deleted by same NFS client Message-ID: <YTXPR01MB0189B6B6869327FD70210018DD910@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM> In-Reply-To: <0100015915a5ee96-49de6100-5050-4a0a-a3c9-1bd4215bc6a0-000000@email.amazonses.com> References: <01000158f023675b-41b35a73-4428-4937-853b-62db4fb9b984-000000@email.amazonses.com> <20161212054233.GU8460@kduck.kaduk.org> <01000158f1abc081-d4eddc58-3b4b-41dd-aa1e-0157d2fad812-000000@email.amazonses.com> <20161212163603.GV8460@kduck.kaduk.org> <YQBPR01MB018054EE62DEFDC73784AD9BDD9B0@YQBPR01MB0180.CANPRD01.PROD.OUTLOOK.COM> <01000158fc3da2c5-c13da088-e7b9-4ac0-ac01-ec49a275dd24-000000@email.amazonses.com> <YTXPR01MB0189ACD940B7D399A6855CB8DD9A0@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM> <YTXPR01MB01891CB02D5D7BA46579F1BFDD9D0@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM> <010001590945e9b3-015a4d05-2646-44ba-9db9-415e8b9119dd-000000@email.amazonses.com>, <0100015915a5ee96-49de6100-5050-4a0a-a3c9-1bd4215bc6a0-000000@email.amazonses.com>
next in thread | previous in thread | raw e-mail | index | archive | help
Colin Percival wrote: >On 12/16/16 12:14, Colin Percival wrote: >> making this change in nfs_lookup >>> --- sys/fs/nfsclient/nfs_clvnops.c (revision 310132) >>> +++ sys/fs/nfsclient/nfs_clvnops.c (working copy) >>> @@ -1144,7 +1144,7 @@ >>> *vpp =3D NULLVP; >>> } >>> >>> - if (error !=3D ENOENT) { >>> + if (error !=3D ENOENT && error !=3D ESTALE) { >>> if (NFS_ISV4(dvp)) >>> error =3D nfscl_maperr(td, error, (uid_= t)0, >>> (gid_t)0); >> >> fixes the case I described above (for some definition of "fixes" -- I'm = not >> sure if this is the correct way of handling ESTALE here?) but I'm still = seeing >> ESTALEs from buildworld's cleandir so I think there must be some other p= lace >> where something odd is happening. > >Further information: In addition to the "lookup relative to a directory wh= ich >has been deleted out from underneath us" case which causes ESTALE to land = in >nfs_lookup, the cleandir step of buildworld results in ESTALE being return= ed >by nfsrpc_getattr into nfs_getattr (landing ultimately in getcwd), and EST= ALE >being returned by nfsrpc_accessrpc into nfs34_access_otw (landing ultimate= ly >in stat and lstat). > >In UFS there are checks for effnlink =3D=3D 0 which result in e.g. ufs_loo= kup >returning ENOENT; would it make sense to add NREMOVED to struct nfsnode.n_= flag >and check this in the appropriate nfs_* calls? To be honest, I can't think of a reason why userland would ever want to see= ESTALE? The function you see above "nfscl_maperr()" could easily map all ESTALEs to= ENOENTs? - The question is: "Would returning ENOENT for stat(2) and access(2) actual= ly make the buildworld happy? if Yes - then mapping ESTALE->ENOENT makes sense for most/all VOP_xxx() = calls. else - I don't see any point in returning a different error and there= might be some code out there somewhere that depends on seeing ESTALE (I doub= t it, but???). The real problem here is that the directory has a reference count on it whe= n it is rmdir'd. POSIX file systems keep the data until the reference count goes to= 0, but NFS isn't POSIX. --> The cheat for regular files is "sillyrename". This could be done for di= rectories, but there are multiple comments in the code (not put there by me) tha= t say "no sillyrename for directories". #1 Does this imply something breaks when you do sillyrename for dirs? OR #2 Does it mean no one has bothered to implement it? Since implementing it would have been pretty easy, I have to suspect #1, wh= ich means I would be reluctant to do it, at least by default. --> Maybe I'll send you a patch that does sillyrename for dirs which you ca= n test. If it fixes buildworld, then it could be considered for head as a non= -default option? rick
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?YTXPR01MB0189B6B6869327FD70210018DD910>