Date: Thu, 8 Mar 2018 22:54:19 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: NAGY Andreas <Andreas.Nagy@frequentis.com>, "'freebsd-stable@freebsd.org'" <freebsd-stable@freebsd.org>
Subject: Re: NFS 4.1 RECLAIM_COMPLETE FS failed error in combination with ESXi client
Message-ID: <YQBPR0101MB1042D788A9B3DBF769052244DDDF0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <D890568E1D8DD044AA846C56245166780124AFCFF8@vie196nt>
References: <c5c624de-42bb-45cf-8cf0-b25be56e5f58@frequentis.com> <YQBPR0101MB1042DEF0825996764CBCA829DDC40@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>, <D890568E1D8DD044AA846C56245166780124AFB90E@vie196nt> <YQBPR0101MB1042479407CAA253674BBAEBDDDB0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM> <D890568E1D8DD044AA846C56245166780124AFBD21@vie196nt>, <D890568E1D8DD044AA846C56245166780124AFBD91@vie196nt> <YQBPR0101MB104225B6884FEC70A03C61CCDDDA0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>, <D890568E1D8DD044AA846C56245166780124AFC0E2@vie196nt>, <YQBPR0101MB1042040D2BFB3681E940D271DDDA0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>, <2feda1e2-16d5-43b5-98eb-dcc71cc67c6f@frequentis.com> <YQBPR0101MB10427C97161C74A5C441D1DCDDD80@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>, <D890568E1D8DD044AA846C56245166780124AFCABC@vie196nt> <YQBPR0101MB1042B17763E2605A7CE72EF5DDDF0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>, <D890568E1D8DD044AA846C56245166780124AFCFF8@vie196nt>

NAGY Andreas wrote:
>Thank you, really great how fast you adapt the source/make patches for this. Saw so many
>posts where people did not get NFS41 working with ESXi and FreeBSD and now I have it already
>running with your changes.
>
>I have now compiled the kernel with all 4 patches, and it works now.
Ok. Sounds like we are making progress. It also takes someone willing to test
patches, so thanks for doing so.

>Some problems are still left:
>
>- the "Server returned improper reason for no delegation: 2" warnings are still in the
>vmkernel.log.
> 2018-03-08T11:41:20.290Z cpu0:68011 opID=488969b0)WARNING: NFS41:
>NFS41ValidateDelegation:608: Server returned improper reason for no delegation: 2
I'll take another look and see if I can guess why it doesn't like "2" as a reason for
not issuing a delegation. (As noted before, I don't think this is serious, but???)

>- can't delete a folder with the VMware host client datastore browser:
> 2018-03-08T11:34:00.349Z cpu1:67981 opID=f5159ce3)WARNING: NFS41:
>NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4cb158: Transient
>file system condition, suggest retry
[more of these snipped]
> 2018-03-08T11:34:00.352Z cpu1:67981 opID=f5159ce3)WARNING: UserFile: 2155:
>hostd-worker: Directory changing too often to perform readdir operation (11 retries),
>returning busy
This one is a mystery to me. It seems to be upset that the directory is changing (I assume
either the Change or ModifyTime attributes). However, if entries are being deleted, the
directory is changing and, as far as I know, the Change and ModifyTime attributes are
supposed to change.
I might try posting on nfsv4@ietf.org in case somebody involved with this client reads
that list and can explain what this is.

>- after a reboot of the FreeBSD machine the ESXi does not restore the NFS datastore again
>with the following warning (just disconnecting the links is fine)
> 2018-03-08T12:39:44.602Z cpu23:66484)WARNING: NFS41: NFS41_Bug:2361: BUG -
>Invalid BIND_CONN_TO_SESSION error: NFS4ERR_NOTSUPP
Hmm. Normally after a server reboot, the clients will try some RPC that starts with a
Sequence (the session op) and the server will reply NFS4ERR_BAD_SESSION.
This triggers recovery in the client.
The BindConnectiontoSession operation is done in an RPC by itself, so there is no
Sequence op to trigger NFS4ERR_BAD_SESSION.
Maybe this client expects to see NFS4ERR_BAD_SESSION for the BindConnectiontoSession.
I'll post a patch that modifies the BindConnectiontoSession to do that.

>Actually I have only made some quick benchmarks with ATTO in a Windows VM which has a
>vmdk on the NFS41 datastore which is mounted over two 1GB links in different subnets.
>Read is nearly double that of a single connection and write is just a bit faster. Don't know if
>write speed could be improved, actually the share is UFS on a HW raid controller which has
>local write speeds about 500MB/s.
Yes, before I posted that I didn't understand why multiple TCP links would be faster.
I didn't notice at the time that you mentioned using different subnets and, as such,
links couldn't be trunked below TCP. In your case trunking above TCP makes sense.

Getting slower write rates than read rates from NFS is normal.
Did you try "sysctl vfs.nfsd.async=1"?

The other thing that might help for UFS is increasing the size of the buffer cache.
(If this server is mainly an NFS server you could probably make the buffer cache
greater than half of the machine's ram.
Note to others, since ZFS doesn't use the buffer cache, the opposite is true for ZFS.)

rick
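
[Editorial sketch, not part of the original message.] The recovery behaviour discussed above for BindConnectiontoSession can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyServer`, its `patched` flag and the `compound` method are hypothetical names invented for illustration, not FreeBSD nfsd code, and the "unpatched" branch simply mirrors the NFS4ERR_NOTSUPP seen in the vmkernel.log line rather than any known server logic. The numeric error values are the ones defined for NFSv4.1 in RFC 5661, where the error is spelled NFS4ERR_BADSESSION.

```python
# Toy model only. Names below are hypothetical; error numbers are from RFC 5661.
NFS4_OK = 0
NFS4ERR_NOTSUPP = 10004
NFS4ERR_BADSESSION = 10052


class ToyServer:
    """Hypothetical stand-in for an NFSv4.1 server's session table."""

    def __init__(self, patched):
        # patched=True models the proposed change: an unknown session in
        # BIND_CONN_TO_SESSION is answered with NFS4ERR_BADSESSION.
        self.patched = patched
        self.sessions = set()

    def create_session(self, session_id):
        self.sessions.add(session_id)

    def reboot(self):
        # A server reboot loses all session state.
        self.sessions.clear()

    def compound(self, ops, session_id):
        """Return the status for a compound RPC given as a list of op names."""
        first = ops[0]
        known = session_id in self.sessions
        if first == "SEQUENCE":
            return NFS4_OK if known else NFS4ERR_BADSESSION
        if first == "BIND_CONN_TO_SESSION":
            if known:
                return NFS4_OK
            # Assumption: the unpatched reply is modelled on the logged error.
            return NFS4ERR_BADSESSION if self.patched else NFS4ERR_NOTSUPP
        return NFS4_OK


def client_starts_recovery(status):
    # Per the message above, a bad-session reply is what triggers recovery.
    return status == NFS4ERR_BADSESSION


if __name__ == "__main__":
    for patched in (False, True):
        server = ToyServer(patched)
        server.create_session("s1")
        server.reboot()
        status = server.compound(["BIND_CONN_TO_SESSION"], "s1")
        print(patched, status, client_starts_recovery(status))
```

In this model a client whose first RPC after the reboot is a lone BIND_CONN_TO_SESSION only enters recovery when the server is "patched"; a client that starts with SEQUENCE enters recovery either way.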