Date: Thu, 9 Jul 2015 14:32:14 -0400
From: Garrett Wollman <wollman@csail.mit.edu>
To: freebsd-fs@freebsd.org
Cc: rmacklem@freebsd.org
Subject: How does NFS respond when a VFS operation gives ERESTART?
Message-ID: <21918.48686.157217.979707@khavrinen.csail.mit.edu>

When networked filesystems are not involved, the special error code [ERESTART] can be returned by the implementation of any system call, with the effect of causing the system call to be restarted when execution hits the kernel-user boundary, rather than returning to userland. This is used to allow certain system calls to be restarted after being interrupted by a signal. However, this normally only applies to system calls which might potentially sleep for a long time -- such as write() to a socket or a tty -- and not to disk I/O, which is normally uninterruptible.

In investigating an issue reported by our users, it appears to me from an inspection of the code that ZFS can sometimes give an [ERESTART] condition, specifically when writing to a dataset that has reached its quota, AND there are pending block free operations that would reduce usage below the quota. But I don't see any code in the NFS (or kernel RPC) implementation that would actually handle this case, and of course the NFS server doesn't normally hit the user-kernel boundary at all.

So does anyone have a theory about what actually happens in this case, and what *should* happen? It doesn't seem useful to just spin on the one operation over and over again until the blocks are freed (which I think might take a full ZFS transaction sync interval).

The actual symptom which I'm investigating is that sometimes -- despite my fixes to the throttling code -- the server is still getting throttled, with thousands of requests enqueued for the same file. (The FHA code does a nice job of directing them all to the appropriate set of service threads, but that doesn't help the other clients get anything done because of the global throttle.) These seem not to make any progress for a long time, but the condition ultimately clears by itself -- what I'm trying to figure out is why so many requests get queued and don't make progress, and so far this seems to be related to hitting the quota on the filesystem.

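To make the restart behavior described at the top of this message concrete, here is a minimal userland sketch in C. It is a toy model only, not the FreeBSD syscall return path and not ZFS or NFS code: every name in it (SIM_ERESTART, SIM_EDQUOT, sim_syscall_dispatch, sim_quota_write, struct sim_dataset) is hypothetical, the constant values are arbitrary, and the rule that one pending free completes per attempt is an assumption made purely so the loop terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's error codes; values are arbitrary. */
#define SIM_ERESTART (-1)
#define SIM_EDQUOT   69

typedef int (*sim_syscall_t)(void *arg);

/*
 * Toy model of the kernel-user boundary: when the handler reports
 * SIM_ERESTART, the call is re-entered instead of returning to the caller.
 * max_restarts exists only to bound the simulation.
 */
int
sim_syscall_dispatch(sim_syscall_t handler, void *arg, int max_restarts,
    int *restarts_out)
{
	int error;
	int restarts = 0;

	assert(handler != NULL);
	for (;;) {
		error = handler(arg);
		if (error != SIM_ERESTART || restarts >= max_restarts)
			break;
		restarts++;
	}
	if (restarts_out != NULL)
		*restarts_out = restarts;
	return (error);
}

/* Toy dataset: block counts only. */
struct sim_dataset {
	int used;          /* blocks currently charged to the dataset */
	int quota;         /* quota limit in blocks */
	int pending_free;  /* blocks queued to be freed */
};

/*
 * Toy write handler following the condition described above: at quota,
 * ask for a restart only if the pending frees would bring usage below
 * the quota; otherwise fail with SIM_EDQUOT.
 */
int
sim_quota_write(void *arg)
{
	struct sim_dataset *ds = arg;

	if (ds->used < ds->quota)
		return (0);
	if (ds->used - ds->pending_free < ds->quota) {
		/* Assumption: one pending free completes per attempt. */
		ds->used--;
		ds->pending_free--;
		return (SIM_ERESTART);
	}
	return (SIM_EDQUOT);
}
```

Calling sim_syscall_dispatch() with max_restarts set to 0 models a caller that never crosses a boundary where the restart would be interpreted: the raw SIM_ERESTART value comes straight back to it, which is the situation the question about the NFS server is asking about.
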
So [ERESTART] may be a total red herring, but it was something that stuck out at me when I was reviewing the code paths that could set [EDQUOT].

-GAWollman