Date:      Sat, 2 Mar 2024 06:52:58 -0800
From:      Rick Macklem <rick.macklem@gmail.com>
To:        Konstantin Belousov <kib@freebsd.org>
Cc:        Ronald Klop <ronald-lists@klop.ws>, Garrett Wollman <wollman@bimajority.org>, stable@freebsd.org,  rmacklem@freebsd.org
Subject:   Re: 13-stable NFS server hang
Message-ID:  <CAM5tNy5E4xpPnDpon-y2XoTNiY7=TDkdVBpE1BBk9fYFsd7%2BcA@mail.gmail.com>
In-Reply-To: <ZeMz2MoM-7LMQGsX@kib.kiev.ua>
References:  <CAM5tNy6v3N-uiULGA0vb_2s0GK1atRh6TYNDGfYMK0PkP46BbQ@mail.gmail.com> <1020651467.1592.1709280020993@localhost> <CAM5tNy4ras1NN%2BLC7=gpyFqEefHpWCrSV-_aSyn-D6Kt8Fvw6Q@mail.gmail.com> <ZeLMaphyKOJzQkcu@kib.kiev.ua> <CAM5tNy7yp9-h1B8kXhbzToDJn%2B3b2xQScarME8xV%2Bpbana-5VQ@mail.gmail.com> <ZeMz2MoM-7LMQGsX@kib.kiev.ua>

On Sat, Mar 2, 2024 at 6:13 AM Konstantin Belousov <kib@freebsd.org> wrote:
>
> On Sat, Mar 02, 2024 at 05:40:08AM -0800, Rick Macklem wrote:
> > On Fri, Mar 1, 2024 at 10:51 PM Konstantin Belousov <kib@freebsd.org> wrote:
> > >
> > > On Fri, Mar 01, 2024 at 06:23:56AM -0800, Rick Macklem wrote:
> > > > On Fri, Mar 1, 2024 at 12:00 AM Ronald Klop <ronald-lists@klop.ws> wrote:
> > > > >
> > > > > Interesting read.
> > > > >
> > > > > Would it be possible to separate locking for admin actions like
> > > > > a client mounting an fs from traffic flowing for file operations?
> > > > Well, the NFS server does not really have any concept of a mount.
> > > > What I am referring to is the ClientID maintained for NFSv4 mounts,
> > > > which all the open/lock/session/layout state hangs off of.
> > > >
> > > > For most cases, this state information can safely be accessed/modified
> > > > via a mutex, but there are three exceptions:
> > > > - creating a new ClientID (which is done by the ExchangeID operation
> > > >   and typically happens when an NFS client does a mount).
> > > > - delegation Recall (which only happens when delegations are enabled).
> > > >   This is one of the reasons delegations are not enabled by default
> > > >   on the FreeBSD server.
> > > > - the DestroyClientID, which is typically done by an NFS client during
> > > >   dismount.
> > > > For these cases, it is just too difficult to do them without sleeping.
> > > > As such, there is a sleep lock which the nfsd threads normally acquire
> > > > shared when doing NFSv4 operations, but for the above cases the lock is
> > > > acquired exclusive.
> > > > - I had to give the exclusive lock priority over shared lock
> > > >   acquisition (it is a custom locking mechanism with assorted
> > > >   weirdnesses) because without that someone reported that new mounts
> > > >   took up to 1/2hr to occur.
> > > >   (The exclusive locker waited for 30min before all the other nfsd
> > > >    threads were not busy.)
> > > >   Because of this priority, once a nfsd thread requests the exclusive
> > > >   lock, all other nfsd threads executing NFSv4 RPCs block after
> > > >   releasing their shared lock, until the exclusive locker releases
> > > >   the exclusive lock.
> > > Normal lockmgr locks + the TDP_DEADLKTREAT private thread flag provide
> > > the property of preferring exclusive waiters in the presence of shared
> > > waiters.
> > > I think this is what you described above.
> > It also has some weird properties, like if there are multiple requestors
> > for the exclusive lock, once one thread gets it (the threads are nfsd
> > worker threads and indistinct), the others that requested an exclusive
> > lock are unblocked without the lock being issued to them.
> This sounds to me like the LK_SLEEPFAIL feature of lockmgr.
> Do not underestimate the amount of weird features in it.
Yep, sounds like it. I should take a look to see if lockmgr will work
instead of the "rolled my own".

I should also take another look at new client creation, to see if there is a
way to do it that doesn't require the exclusive lock (a lot of that code is
20 years old now).

rick

>
> > They then check if the exclusive lock is still needed (usually not, since
> > the other thread has dealt with the case where it was needed) and
> > then they can acquire a shared lock.
> > Without this, there were cases where several threads would acquire
> > the exclusive lock and then discover that the lock was not needed and
> > just release it again.
> >
> > It also uses an assortment of weird flags/call args.
> >
> > rick
> >
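
The locking behaviour described in the quoted text above can be illustrated
with a small userland model. This is only a sketch under stated assumptions:
it is plain Python built on threading.Condition, not FreeBSD kernel code and
not the lockmgr(9) API, and the names ExclPriorityLock, acquire_exclusive and
demo are hypothetical. It models two properties from the thread: shared
requests block as soon as an exclusive request is pending, and an exclusive
requestor that slept while another thread held and released the exclusive
lock is unblocked without being given the lock (the behaviour compared to
LK_SLEEPFAIL above).

```python
import threading
import time


class ExclPriorityLock:
    """Hypothetical model of a shared/exclusive sleep lock (not kernel code)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._shared = 0         # current number of shared holders
        self._excl_held = False  # True while one thread holds it exclusive
        self._excl_waiters = 0   # pending exclusive requests
        self._excl_gen = 0       # bumped on every exclusive release

    def acquire_shared(self):
        with self._cond:
            # Exclusive priority: new shared requests wait while an
            # exclusive request is pending, not only while it is held.
            while self._excl_held or self._excl_waiters > 0:
                self._cond.wait()
            self._shared += 1

    def release_shared(self):
        with self._cond:
            self._shared -= 1
            if self._shared == 0:
                self._cond.notify_all()

    def acquire_exclusive(self):
        """Return True if the lock was taken, False if another thread held
        and released it exclusive while this caller was asleep."""
        with self._cond:
            gen = self._excl_gen
            self._excl_waiters += 1
            try:
                while self._excl_held or self._shared > 0:
                    self._cond.wait()
                    if self._excl_gen != gen:
                        return False  # caller should re-check if still needed
                self._excl_held = True
                return True
            finally:
                self._excl_waiters -= 1
                self._cond.notify_all()

    def release_exclusive(self):
        with self._cond:
            self._excl_held = False
            self._excl_gen += 1
            self._cond.notify_all()


def demo():
    """Two threads request the exclusive lock while a shared lock is held."""
    lock = ExclPriorityLock()
    results = []

    def worker():
        got = lock.acquire_exclusive()
        if got:
            lock.release_exclusive()
        results.append(got)

    lock.acquire_shared()
    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    while lock._excl_waiters < 2:  # wait until both requests are pending
        time.sleep(0.001)
    lock.release_shared()
    for t in threads:
        t.join()
    return results
```

In demo(), one worker obtains the exclusive lock and the other is woken
without it; a real caller in that position would re-check whether the
exclusive lock is still needed and otherwise fall back to acquire_shared().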


