Date: Thu, 9 Jul 2015 16:12:11 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Garrett Wollman <wollman@csail.mit.edu>
Cc: freebsd-fs@freebsd.org, rmacklem@freebsd.org
Subject: Re: How does NFS respond when a VFS operation gives ERESTART?
Message-ID: <689709398.6876771.1436472731160.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <21918.48686.157217.979707@khavrinen.csail.mit.edu>
References: <21918.48686.157217.979707@khavrinen.csail.mit.edu>
Garrett Wollman wrote:
> When networked filesystems are not involved, the special error code
> [ERESTART] can be returned by the implementation of any system call,
> with the effect of causing the system call to be restarted when
> execution hits the kernel-user boundary, rather than returning to
> userland.  This is used to allow certain system calls to be restarted
> after being interrupted by a signal.  However, this normally only
> applies to system calls which might potentially sleep for a long time
> -- such as write() to a socket or a tty -- and not to disk I/O, which
> is normally uninterruptible.
>
> In investigating an issue reported by our users, it appears to me from
> an inspection of the code that ZFS can sometimes give an [ERESTART]
> condition, specifically when writing to a dataset that has reached its
> quota, AND there are pending block free operations that would reduce
> usage below the quota.  But I don't see any code in the NFS (or kernel
> RPC) implementation that would actually handle this case, and of
> course the NFS server doesn't normally hit the user-kernel boundary at
> all.  So does anyone have a theory about what actually happens in this
> case, and what *should* happen?  It doesn't seem useful to just spin
> on the one operation over and over again until the blocks are freed
> (which I think might take a full ZFS transaction sync interval).
>
Well, I'll admit I'm not sure I really understand the situation, but...
My best guess would be to have the NFS server reply NFSERR_DELAY to the
client. (NFSERR_DELAY doesn't exist for NFSv2, but I suspect you don't
care about NFSv2?)

NFSERR_DELAY - Tells the client to wait a while (the RFCs don't define
               how long) and then try the RPC again.

Does this sound like it would work?
If it sounds reasonable, I think patching the server to do this shouldn't
be too hard.

rick

> The actual symptom which I'm investigating is that sometimes --
> despite my fixes to the throttling code -- the server is still getting
> throttled, with thousands of requests enqueued for the same file.
> (The FHA code does a nice job of directing them all to the appropriate
> set of service threads, but that doesn't help the other clients get
> anything done because of the global throttle.)  These seem not to make
> any progress for a long time, but the condition ultimately clears by
> itself -- what I'm trying to figure out is why so many requests get
> queued and don't make progress, and so far this seems to be related to
> hitting the quota on the filesystem.  So [ERESTART] may be a total red
> herring, but it was something that stuck out at me when I was
> reviewing the code paths that could set [EDQUOT].
>
> -GAWollman
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
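
For illustration only, the NFSERR_DELAY suggestion in the reply above could be sketched roughly as follows. This is not the actual FreeBSD server code or a proposed patch. The helper name and the SKETCH_* constants are hypothetical. The numeric values are assumptions based on FreeBSD's kernel-only ERESTART (-1) and on the value 10008 that RFC 1813 (NFS3ERR_JUKEBOX) and RFC 7530 (NFS4ERR_DELAY) assign to the "try again later" status. The NFSv2 fallback to an I/O error is also an assumption, since the thread leaves that case open.

```c
#include <errno.h>

/*
 * Assumed values, labeled SKETCH_ because they are redefined here for a
 * userland illustration rather than taken from kernel headers.
 */
#define SKETCH_ERESTART     (-1)    /* FreeBSD kernel-internal ERESTART */
#define SKETCH_NFSERR_DELAY 10008   /* NFS3ERR_JUKEBOX / NFS4ERR_DELAY */

/*
 * Hypothetical helper: translate the error returned by a VFS operation
 * into the status the NFS server would put in its reply.
 */
static int
sketch_map_vfs_error(int error, int nfs_version)
{
	/* Everything except ERESTART is passed through untouched in this sketch. */
	if (error != SKETCH_ERESTART)
		return (error);

	/* NFSv3 and NFSv4: ask the client to wait and retry the RPC. */
	if (nfs_version >= 3)
		return (SKETCH_NFSERR_DELAY);

	/* NFSv2 has no DELAY status; EIO is only a placeholder assumption. */
	return (EIO);
}
```

With this mapping, a write that hits the quota while frees are still pending would be retried by the client after a delay of its own choosing, instead of occupying a server thread that spins on the same operation.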