Date:      Fri, 1 Mar 2024 09:00:20 +0100 (CET)
From:      Ronald Klop <ronald-lists@klop.ws>
To:        Rick Macklem <rick.macklem@gmail.com>
Cc:        Garrett Wollman <wollman@bimajority.org>, stable@freebsd.org, rmacklem@freebsd.org
Subject:   Re: 13-stable NFS server hang
Message-ID:  <1020651467.1592.1709280020993@localhost>
In-Reply-To: <CAM5tNy6v3N-uiULGA0vb_2s0GK1atRh6TYNDGfYMK0PkP46BbQ@mail.gmail.com>

Interesting read.

Would it be possible to separate the locking for admin actions, like a client mounting a filesystem, from the traffic flowing for file operations?

For example, ongoing file operations could work from a read-only view/copy of the mount table; only new operations would have to wait. The mount, in turn, would never need to wait for ongoing operations before locking the structure.

Just a thought in the morning

Regards,
Ronald.

From:    Rick Macklem <rick.macklem@gmail.com>
Date:    1 March 2024 00:31
To:      Garrett Wollman <wollman@bimajority.org>
CC:      stable@freebsd.org, rmacklem@freebsd.org
Subject: Re: 13-stable NFS server hang

> 
> 
> On Wed, Feb 28, 2024 at 4:04PM Rick Macklem <rick.macklem@gmail.com> wrote:
> >
> > On Tue, Feb 27, 2024 at 9:30PM Garrett Wollman <wollman@bimajority.org> wrote:
> > >
> > > Hi, all,
> > >
> > > We've had some complaints of NFS hanging at unpredictable intervals.
> > > Our NFS servers are running a 13-stable from last December, and
> > > tonight I sat in front of the monitor watching `nfsstat -dW`.  I was
> > > able to clearly see that there were periods when NFS activity would
> > > drop *instantly* from 30,000 ops/s to flat zero, which would last
> > > for about 25 seconds before resuming exactly as it was before.
> > >
> > > I wrote a little awk script to watch for this happening and run
> > > `procstat -k` on the nfsd process, and I saw that all but two of the
> > > service threads were idle.  The three nfsd threads that had non-idle
> > > kstacks were:
> > >
> > >   PID    TID COMM                TDNAME              KSTACK
> > >   997 108481 nfsd                nfsd: master        mi_switch sleepq_timedwait _sleep nfsv4_lock nfsrvd_dorpc nfssvc_program svc_run_internal svc_run nfsrvd_nfsd nfssvc_nfsd sys_nfssvc amd64_syscall fast_syscall_common
> > >   997 960918 nfsd                nfsd: service       mi_switch sleepq_timedwait _sleep nfsv4_lock nfsrv_setclient nfsrvd_exchangeid nfsrvd_dorpc nfssvc_program svc_run_internal svc_thread_start fork_exit fork_trampoline
> > >   997 962232 nfsd                nfsd: service       mi_switch _cv_wait txg_wait_synced_impl txg_wait_synced dmu_offset_next zfs_holey zfs_freebsd_ioctl vn_generic_copy_file_range vop_stdcopy_file_range VOP_COPY_FILE_RANGE vn_copy_file_range nfsrvd_copy_file_range nfsrvd_dorpc nfssvc_program svc_run_internal svc_thread_start fork_exit fork_trampoline
> > >
> > > I'm suspicious of two things: first, the copy_file_range RPC; second,
> > > the "master" nfsd thread is actually servicing an RPC which requires
> > > obtaining a lock.  The "master" getting stuck while performing client
> > > RPCs is, I believe, the reason NFS service grinds to a halt when a
> > > client tries to write into a near-full filesystem, so this problem
> > > would be more evidence that the dispatching function should not be
> > > mixed with actual operations.  I don't know what the clients are
> > > doing, but is it possible that nfsrvd_copy_file_range is holding a
> > > lock that is needed by one or both of the other two threads?
> > >
> > > Near-term I could change nfsrvd_copy_file_range to just
> > > unconditionally return NFSERR_NOTSUP and force the clients to fall
> > > back, but I figured I would ask if anyone else has seen this.
> > I have attached a little patch that should limit the server's Copy size
> > to vfs.nfsd.maxcopyrange (default of 10Mbytes).
> > Hopefully this makes sure that the Copy does not take too long.
> >
> > You could try this instead of disabling Copy. It would be nice to know if
> > this is sufficient. (If not, I'll probably add a sysctl to disable Copy.)
> I did a quick test without/with this patch, where I copied a 1Gbyte file.
> 
> Without this patch, the Copy RPCs mostly replied in just under 1sec
> (which is what the flag requests), but took over 4sec for one of the Copy
> operations. This implies that one Read/Write of 1Mbyte on the server
> took over 3 seconds.
> I noticed the first Copy did over 600Mbytes, but the rest did about 100Mbytes
> each and it was one of these 100Mbyte Copy operations that took over 4sec.
> 
> With the patch, there were a lot more Copy RPCs (as expected) of 10Mbytes
> each and they took a consistent 0.25-0.3sec to reply. (This is a test of a local
> mount on an old laptop, so nowhere near a server hardware config.)
> 
> So, the patch might be sufficient?
> 
> It would be nice to avoid disabling Copy, since it avoids reading the data
> into the client and then writing it back to the server.
> 
> I will probably commit both patches (10Mbyte clip of Copy size and
> disabling Copy) to main soon, since I cannot say if clipping the size
> of the Copy will always be sufficient.
> 
> Please let us know how trying these patches goes, rick
> 
> >
> > rick
> >
> > >
> > > -GAWollman
> > >
> > >
> 
> 
> 
> 
> 
