Date: Thu, 12 Mar 2015 14:04:18 +0000
From: Tim Borgeaud <timothy.borgeaud@framestore.com>
To: freebsd-net@freebsd.org, Mark Hills <mark.hills@framestore.com>
Subject: A defensive NFS server (sbwait, flow control)
Message-ID: <CADqOPxsAeViRBJ5a6z2LodikKx1EqE_Na7QsUF43tXX8K3PCFQ@mail.gmail.com>

Hi again FreeBSD folks,

A short while ago I sent a couple of emails regarding the idea of 'fair share' NFS scheduling. Amongst others, Garrett Wollman replied, and also sent a related email "Implementing backpressure in the NFS server".

The common theme: to provide a level of service in situations where requests from some clients might tie up a lot of resources.

There are various issues to consider. We might say that we're looking at 'defensive' functionality, and we've made some experimental progress to:

- round robin requests from different users
- provide some limits (flow control) on the number of requests read from a single transport

Testing has highlighted the issue that Garrett mentioned. A client can make a set of concurrent requests and then tie up nfsd threads as they attempt to send replies.

To be more specific, we seem to be hitting the situation in which an nfsd thread sits in the "sbwait" state, waiting for a reply to be drained (by a suspended client). Other threads subsequently pick up requests for the same transport and then queue up waiting for a lock on the same socket.

I'm not sure of the exact situation in which the sbwait arises. It's easily repeatable only where there are multiple concurrent requests from the same transport.

Our testing is fairly synthetic, but it looks like this is a situation that has been noticed before. Having a pool of spare threads doesn't seem like a very robust solution. In fact, if more than the server minimum get tied up, then, if load and threads fall, we end up with _all_ remaining threads blocked (no more nfs service).

How to address this particular issue is not obvious to me. I assume there are options including:

- prevent thread blocking when sending replies
- timeouts for sending replies (NFS or RPC level?)
- serialize the sending of nfs/rpc replies to avoid multiple nfsd threads waiting on the same transport.

Does anyone have any thoughts about this? Either this particular issue or more general direction for a 'defensive' server?

--
Tim Borgeaud
Systems Developer
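
As a rough illustration of the second option listed above (a timeout on sending replies), here is a minimal userspace sketch in Python. It is not how nfsd or the kernel RPC code works; it only shows the general idea that a sender can give up on a peer that has stopped draining its socket instead of waiting indefinitely. The function name send_reply_with_timeout and the timeout value are hypothetical, chosen for the example.

```python
import socket


def send_reply_with_timeout(sock, payload, timeout_s):
    """Send payload on sock, giving up once a send waits longer than timeout_s.

    Returns (bytes_sent, completed). Hypothetical helper for illustration
    only; a real server would also need to decide what to do with the
    transport after a partial reply (most likely close it).
    """
    sock.settimeout(timeout_s)  # applies to each blocking send() call
    view = memoryview(payload)
    sent = 0
    try:
        while sent < len(view):
            # send() may write only part of the data; loop until done.
            sent += sock.send(view[sent:])
    except socket.timeout:
        # The peer did not drain its receive buffer in time.
        return sent, False
    return sent, True


if __name__ == "__main__":
    # The "client" end never reads, standing in for a suspended client.
    server_end, client_end = socket.socketpair()
    reply = b"x" * (8 * 1024 * 1024)  # far larger than the socket buffers
    sent, completed = send_reply_with_timeout(server_end, reply, 0.2)
    print("sent", sent, "of", len(reply), "bytes; completed:", completed)
    server_end.close()
    client_end.close()
```

With a timeout like this, a worker blocked on one stalled client returns after a bounded wait and can serve other transports, at the cost of abandoning the partially sent reply.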