Date: Wed, 7 Mar 2018 17:32:29 +0000 From: NAGY Andreas <Andreas.Nagy@frequentis.com> To: Rick Macklem <rmacklem@uoguelph.ca>, "'freebsd-stable@freebsd.org'" <freebsd-stable@freebsd.org> Subject: =?iso-8859-1?Q?RE:_NFS_4.1_RECLAIM=5FCOMPLETE_FS=A0failed_error_in_combin?= =?iso-8859-1?Q?ation_with_ESXi_client?= Message-ID: <D890568E1D8DD044AA846C56245166780124AFCABC@vie196nt> In-Reply-To: <YQBPR0101MB10427C97161C74A5C441D1DCDDD80@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM> References: <c5c624de-42bb-45cf-8cf0-b25be56e5f58@frequentis.com> <YQBPR0101MB1042DEF0825996764CBCA829DDC40@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>, <D890568E1D8DD044AA846C56245166780124AFB90E@vie196nt> <YQBPR0101MB1042479407CAA253674BBAEBDDDB0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM> <D890568E1D8DD044AA846C56245166780124AFBD21@vie196nt>, <D890568E1D8DD044AA846C56245166780124AFBD91@vie196nt> <YQBPR0101MB104225B6884FEC70A03C61CCDDDA0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>, <D890568E1D8DD044AA846C56245166780124AFC0E2@vie196nt>, <YQBPR0101MB1042040D2BFB3681E940D271DDDA0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>, <2feda1e2-16d5-43b5-98eb-dcc71cc67c6f@frequentis.com> <YQBPR0101MB10427C97161C74A5C441D1DCDDD80@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>
next in thread | previous in thread | raw e-mail | index | archive | help
[-- Attachment #1 --]
Hi,
attached the trace. If I see it correct it uses FORE_OR_BOTH. (bctsa_dir: CDFC4_FORE_OR_BOTH (0x00000003))
The trace is only with the first patch, have not compiled the wantdeleg patches so far.
I think this is related to the BIND_CONN_TO_SESSION; after a disconnect the ESXi cannot connect to the NFS also with this warning:
2018-03-07T16:55:11.227Z cpu21:66484)WARNING: NFS41: NFS41_Bug:2361: BUG - Invalid BIND_CONN_TO_SESSION error: NFS4ERR_NOTSUPP
Another thing I noticed today is that it is not possible to delete a folder with the ESXi datastorebrowser on the NFS mount. Maybe it is a VMWare bug, but with NFS3 it works.
Here the vmkernel.log with only one connection contains mounting, trying to delete a folder and disconnect:
2018-03-07T16:46:04.543Z cpu12:68008 opID=55bea165)World: 12235: VC opID c55dbe59 maps to vmkernel opID 55bea165
2018-03-07T16:46:04.543Z cpu12:68008 opID=55bea165)NFS41: NFS41_VSIMountSet:423: Mount server: 10.0.0.225, port: 2049, path: /, label: nfsds1, security: 1 user: , options: <none>
2018-03-07T16:46:04.543Z cpu12:68008 opID=55bea165)StorageApdHandler: 977: APD Handle Created with lock[StorageApd-0x43046e4c6d70]
2018-03-07T16:46:04.544Z cpu11:66486)NFS41: NFS41ProcessClusterProbeResult:3873: Reclaiming state, cluster 0x43046e4c7ee0 [7]
2018-03-07T16:46:04.545Z cpu12:68008 opID=55bea165)NFS41: NFS41FSCompleteMount:3791: Lease time: 120
2018-03-07T16:46:04.545Z cpu12:68008 opID=55bea165)NFS41: NFS41FSCompleteMount:3792: Max read xfer size: 0x20000
2018-03-07T16:46:04.545Z cpu12:68008 opID=55bea165)NFS41: NFS41FSCompleteMount:3793: Max write xfer size: 0x20000
2018-03-07T16:46:04.545Z cpu12:68008 opID=55bea165)NFS41: NFS41FSCompleteMount:3794: Max file size: 0x800000000000
2018-03-07T16:46:04.545Z cpu12:68008 opID=55bea165)NFS41: NFS41FSCompleteMount:3795: Max file name: 255
2018-03-07T16:46:04.545Z cpu12:68008 opID=55bea165)WARNING: NFS41: NFS41FSCompleteMount:3800: The max file name size (255) of file system is larger than that of FSS (128)
2018-03-07T16:46:04.546Z cpu12:68008 opID=55bea165)NFS41: NFS41FSAPDNotify:5960: Restored connection to the server 10.0.0.225 mount point nfsds1, mounted as 1a7893c8-eec764a7-0000-000000000000 ("/")
2018-03-07T16:46:04.546Z cpu12:68008 opID=55bea165)NFS41: NFS41_VSIMountSet:435: nfsds1 mounted successfully
2018-03-07T16:47:19.869Z cpu21:67981 opID=e47706ec)World: 12235: VC opID c55dbe91 maps to vmkernel opID e47706ec
2018-03-07T16:47:19.869Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.870Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.870Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.870Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.870Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.871Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.871Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.871Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.871Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.872Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.872Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.872Z cpu21:67981 opID=e47706ec)WARNING: UserFile: 2155: hostd-worker: Directory changing too often to perform readdir operation (11 retries), returning busy
2018-03-07T16:47:19.874Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.875Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.875Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.875Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.875Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.876Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.876Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.876Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.876Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.877Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.877Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.877Z cpu21:67981 opID=e47706ec)WARNING: UserFile: 2155: hostd-worker: Directory changing too often to perform readdir operation (11 retries), returning busy
2018-03-07T16:47:19.879Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.879Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.879Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.880Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.880Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.880Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.880Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.881Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.881Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.881Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.881Z cpu21:67981 opID=e47706ec)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4c6398: Transient file system condition, suggest retry
2018-03-07T16:47:19.881Z cpu21:67981 opID=e47706ec)WARNING: UserFile: 2155: hostd-worker: Directory changing too often to perform readdir operation (11 retries), returning busy
2018-03-07T16:49:41.154Z cpu23:85446)WARNING: SunRPC: 3987: fail all pending calls for client 0x4303cbe221f0 IP 10.0.0.225.8.1 (socket half closed)
2018-03-07T16:49:41.155Z cpu11:66486)WARNING: NFS41: NFS41FSAPDNotify:5908: Lost connection to the server 10.0.0.225 mount point nfsds1, mounted as 1a7893c8-eec764a7-0000-000000000000 ("/")
-----Original Message-----
From: Rick Macklem [mailto:rmacklem@uoguelph.ca]
Sent: Mittwoch, 7. Mrz 2018 16:07
To: NAGY Andreas <Andreas.Nagy@frequentis.com>; 'freebsd-stable@freebsd.org' <freebsd-stable@freebsd.org>
Subject: Re: NFS 4.1 RECLAIM_COMPLETE FSfailed error in combination with ESXi client
NAGY Andreas wrote:
>Okay, that was the main reason for using NFS 4.1.
>Is it planned to implement it, or is the focus on pNFS?
I took a quick look and implementing this for some cases will be pretty easy. Binding a FORE channel is implied, so for that case all the server does is reply OK to the BIND_CONN_TO_SESSION.
To know if the ESXi client case is a simple one, I need to see what the BIND_CONN_TO_SESSION arguments look like.
If you can capture packets for when this second connection is done and email it to me as an attachment, I can look at what the BIND_CONN_TO_SESSION args are.
# tcpdump -s 0 -w <file.pcap> host <client-host-for-this-connection> run on the FreeBSD server should get the <file.pcap> I need.
Alternately, if you have wireshark handy, you can just use it to look for the BIND_CONN_TO_SESSION request and see if it specifies (FORE, BACK, FORE_OR_BOTH or BACK_OR_BOTH) in it.
FORE or FORE_OR_BOTH means it is easy to do and I can probably have a patch for testing in a day or two.
rick
[-- Attachment #2 --]
ò ZZ@S J J &U7 PVk[ E <@ @5E
u4~ <