From: Rick Macklem <rmacklem@uoguelph.ca>
To: Garrett Wollman
Cc: freebsd-fs@freebsd.org, rmacklem@freebsd.org
Date: Thu, 9 Jul 2015 16:12:11 -0400 (EDT)
Message-ID: <689709398.6876771.1436472731160.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <21918.48686.157217.979707@khavrinen.csail.mit.edu>
References: <21918.48686.157217.979707@khavrinen.csail.mit.edu>
Subject: Re: How does NFS respond when a VFS operation gives ERESTART?
List-Id: Filesystems

Garrett Wollman wrote:
> When networked filesystems are not involved, the special error code
> [ERESTART] can be returned by the implementation of any system call,
> with the effect of causing the system call to be restarted when
> execution hits the kernel-user boundary, rather than returning to
> userland. This is used to allow certain system calls to be restarted
> after being interrupted by a signal. However, this normally only
> applies to system calls which might potentially sleep for a long time
> -- such as write() to a socket or a tty -- and not to disk I/O, which
> is normally uninterruptible.
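The mechanism described above follows the classic interruptible-sleep
pattern. Here is a minimal sketch of that pattern, not the actual
FreeBSD implementation; resource_ready(), do_the_write(), and
example_chan are invented names for illustration:

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/proc.h>

    int
    example_syscall(struct thread *td, void *uap)
    {
            int error;

            while (!resource_ready()) {
                    /*
                     * Interruptible sleep: with PCATCH, tsleep()
                     * returns EINTR or ERESTART when a signal arrives,
                     * depending on whether the signal's disposition
                     * permits restarting the system call.
                     */
                    error = tsleep(&example_chan, PCATCH, "exwait", 0);
                    if (error != 0) {
                            /*
                             * On ERESTART, the syscall-return path
                             * backs up the program counter so the
                             * syscall is re-executed, instead of
                             * failing back to userland with EINTR.
                             */
                            return (error);
                    }
            }
            return (do_the_write(td, uap));
    }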
> In investigating an issue reported by our users, it appears to me
> from an inspection of the code that ZFS can sometimes give an
> [ERESTART] condition, specifically when writing to a dataset that has
> reached its quota, AND there are pending block free operations that
> would reduce usage below the quota. But I don't see any code in the
> NFS (or kernel RPC) implementation that would actually handle this
> case, and of course the NFS server doesn't normally hit the
> user-kernel boundary at all. So does anyone have a theory about what
> actually happens in this case, and what *should* happen? It doesn't
> seem useful to just spin on the one operation over and over again
> until the blocks are freed (which I think might take a full ZFS
> transaction sync interval).
>
Well, I'll admit I'm not sure I really understand the situation, but...

My best guess would be to have the NFS server reply NFSERR_DELAY to the
client. (NFSERR_DELAY doesn't exist for NFSv2, but I suspect you don't
care about NFSv2?)

NFSERR_DELAY - Tells the client to wait a while (the RFCs don't define
how long) and then try the RPC again.

Does this sound like it would work? If it sounds reasonable, I think
patching the server to do this shouldn't be too hard. (A rough sketch
of what such a mapping might look like is appended at the end of this
message.)

rick

> The actual symptom which I'm investigating is that sometimes --
> despite my fixes to the throttling code -- the server is still
> getting throttled, with thousands of requests enqueued for the same
> file. (The FHA code does a nice job of directing them all to the
> appropriate set of service threads, but that doesn't help the other
> clients get anything done because of the global throttle.) These seem
> not to make any progress for a long time, but the condition
> ultimately clears by itself -- what I'm trying to figure out is why
> so many requests get queued and don't make progress, and so far this
> seems to be related to hitting the quota on the filesystem. So
> [ERESTART] may be a total red herring, but it was something that
> stuck out at me when I was reviewing the code paths that could set
> [EDQUOT].
>
> -GAWollman
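For concreteness, the mapping suggested above might look something like
the following sketch. The helper name and its call site are assumptions
(the actual FreeBSD NFS server error path isn't shown in this thread);
the idea is simply to translate an ERESTART leaking up from VFS/ZFS
into NFSERR_DELAY (NFS3ERR_JUKEBOX for v3, NFS4ERR_DELAY for v4) before
the reply is built:

    /*
     * Hypothetical helper, not actual FreeBSD code: convert ERESTART
     * from the underlying file system into a "try again later" reply.
     */
    static int
    nfsrv_maperestart(int error, int nfsvers)
    {
            if (error != ERESTART)
                    return (error);
            if (nfsvers == NFS_VER2)
                    return (EIO);      /* NFSv2 has no delay/jukebox error */
            return (NFSERR_DELAY);     /* client waits, then retries the RPC */
    }

A client that receives NFSERR_DELAY re-sends the same RPC after a
client-chosen interval, so the server never has to spin on the quota
condition itself.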