Date:      Thu, 12 Mar 2015 14:04:18 +0000
From:      Tim Borgeaud <timothy.borgeaud@framestore.com>
To:        freebsd-net@freebsd.org, Mark Hills <mark.hills@framestore.com>
Subject:   A defensive NFS server (sbwait, flow control)
Message-ID:  <CADqOPxsAeViRBJ5a6z2LodikKx1EqE_Na7QsUF43tXX8K3PCFQ@mail.gmail.com>

Hi again FreeBSD folks,

A short while ago I sent a couple of emails about the idea of 'fair
share' NFS scheduling. Garrett Wollman, among others, replied, and also
sent a related email, "Implementing backpressure in the NFS server". The
common theme is providing a level of service in situations where requests
from some clients might tie up a lot of resources.

There are various issues to consider. Broadly, we're looking at
'defensive' functionality, and we've made some experimental progress to:

  round-robin requests from different users (see the sketch after this list)

  provide some limits (flow control) on the number of requests read
  from a single transport
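
To give a feel for the round-robin part, here is a minimal, self-contained
C sketch of the idea (not the actual kernel code; the structures and the
rr_enqueue/rr_dequeue names are made up for illustration, and locking is
omitted). Requests are grouped into per-user FIFO queues and the
dispatcher hands out one request per user in turn, so a single heavy user
cannot monopolize the queue:

#include <stddef.h>

struct request {
    struct request *next;           /* FIFO link within one user's queue */
    /* ... the decoded RPC request would live here ... */
};

struct user_queue {
    struct user_queue *rr_next;     /* link in the list of active users */
    struct request *head, *tail;
    int on_rr;                      /* already on the active list? */
};

static struct user_queue *rr_head, *rr_tail;    /* active users, FIFO order */

static void
rr_link_user(struct user_queue *uq)
{
    uq->rr_next = NULL;
    if (rr_tail != NULL)
        rr_tail->rr_next = uq;
    else
        rr_head = uq;
    rr_tail = uq;
    uq->on_rr = 1;
}

/* Append a request to its user's queue; activate the user if it was idle. */
void
rr_enqueue(struct user_queue *uq, struct request *r)
{
    r->next = NULL;
    if (uq->tail != NULL)
        uq->tail->next = r;
    else
        uq->head = r;
    uq->tail = r;
    if (!uq->on_rr)
        rr_link_user(uq);
}

/* Hand out the next request, taking one user at a time. */
struct request *
rr_dequeue(void)
{
    struct user_queue *uq;
    struct request *r;

    if ((uq = rr_head) == NULL)
        return (NULL);
    rr_head = uq->rr_next;
    if (rr_head == NULL)
        rr_tail = NULL;
    uq->on_rr = 0;

    r = uq->head;
    uq->head = r->next;
    if (uq->head == NULL)
        uq->tail = NULL;
    else
        rr_link_user(uq);           /* still has work: back of the line */
    return (r);
}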

Testing has highlighted the issue that Garrett mentioned. A client can make
a set of concurrent requests and then tie up nfsd threads as they attempt
to send replies.

To be more specific, we seem to be hitting the situation in which an nfsd
thread sits in the "sbwait" state, waiting for a reply to be drained (by a
suspended client). Other threads subsequently pick up requests for the same
transport and then queue up waiting for a lock on the same socket.

I'm not sure of the exact situation in which the sbwait arises. It's easily
repeatable only where there are multiple concurrent requests from the same
transport.
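
To make the blocking behaviour concrete, a trivial userland analogue (not
the nfsd code, just an illustration) is to write to a connected socket
whose peer never reads. Once the socket buffers fill, write(2) blocks,
and on FreeBSD the process should show the same "sbwait" wait channel
that we see the nfsd threads in:

#include <sys/types.h>
#include <sys/socket.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
    int sv[2];
    char buf[4096];
    ssize_t n;
    size_t total = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) {
        perror("socketpair");
        return (1);
    }
    memset(buf, 0, sizeof(buf));

    /*
     * sv[1] is never read from, so the socket buffers eventually fill
     * and the write() below blocks for good.
     */
    for (;;) {
        n = write(sv[0], buf, sizeof(buf));
        if (n == -1) {
            perror("write");
            return (1);
        }
        total += (size_t)n;
        printf("wrote %zu bytes so far\n", total);
    }
}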

Our testing is fairly synthetic, but this looks like a situation that has
been noticed before. Having a pool of spare threads doesn't seem like a
very robust solution: if more threads than the configured server minimum
get tied up, and load then falls so that the thread count drops back
toward that minimum, we can end up with _all_ remaining threads blocked
(no more NFS service).

How to address this particular issue is not obvious to me. I assume there
are options including:

 - prevent thread blocking when sending replies

 - timeouts for sending replies (NFS or RPC level?)

 - serialize the sending of NFS/RPC replies to avoid multiple
   nfsd threads waiting on the same transport (a rough sketch follows
   below).
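
As a rough illustration of the serialization option (all names and
structures are hypothetical, and userland pthreads stand in for whatever
primitives the kernel code would actually use): each transport keeps a
queue of outgoing replies, and whichever thread finds no sender active
drains the queue itself, so a stalled client can block at most one thread:

#include <pthread.h>
#include <stddef.h>

struct reply {
    struct reply *next;
    /* ... the encoded RPC reply would live here ... */
};

struct transport {
    pthread_mutex_t lock;
    struct reply *head, *tail;      /* replies waiting to be sent */
    int sender_active;              /* a thread is already draining them */
};

/* Stand-in for the real (possibly blocking) socket send. */
static void
send_one_reply(struct transport *xprt, struct reply *r)
{
    (void)xprt;
    (void)r;
    /* the real thing would write r to xprt's socket and may sleep */
}

/*
 * Queue a reply for this transport.  Whichever thread finds no active
 * sender drains the queue itself; every other thread just appends its
 * reply and returns to servicing requests.
 */
void
queue_reply(struct transport *xprt, struct reply *r)
{
    pthread_mutex_lock(&xprt->lock);
    r->next = NULL;
    if (xprt->tail != NULL)
        xprt->tail->next = r;
    else
        xprt->head = r;
    xprt->tail = r;
    if (xprt->sender_active) {
        pthread_mutex_unlock(&xprt->lock);
        return;
    }
    xprt->sender_active = 1;
    while ((r = xprt->head) != NULL) {
        xprt->head = r->next;
        if (xprt->head == NULL)
            xprt->tail = NULL;
        pthread_mutex_unlock(&xprt->lock);
        send_one_reply(xprt, r);    /* only this thread can block here */
        pthread_mutex_lock(&xprt->lock);
    }
    xprt->sender_active = 0;
    pthread_mutex_unlock(&xprt->lock);
}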

Does anyone have any thoughts about this, either on this particular issue
or on the more general direction of a 'defensive' server?

--
Tim Borgeaud
Systems Developer


