From: Ben Kelly <ben@wanderview.com>
To: Attilio Rao
Cc: Adam McDougall, freebsd-current@freebsd.org, Artem Belevich
Date: Tue, 19 May 2009 07:14:20 -0400
Subject: Re: [patch] zfs livelock and thread priorities
In-Reply-To: <3bbf2fe10905190240g3e6dd267nf1621ca4e54d7a85@mail.gmail.com>
List-Id: Discussions about the use of FreeBSD-current <freebsd-current@freebsd.org>

On May 19, 2009, at 5:40 AM, Attilio Rao wrote:

> 2009/5/19 Ben Kelly <ben@wanderview.com>:
>> On May 18, 2009, at 1:38 PM, Attilio Rao wrote:
>>> OMG.
>>> This still doesn't explain priorities like 49 or such seen in the
>>> first report, as long as we don't set priorities by hand,
>>
>> I'm trying to understand why this particular priority value is so
>> concerning, but I'm a little bit confused. Can you elaborate on why
>> you think it's a problem? From previous off-list e-mails I get the
>> impression that you are concerned that it does not fall on an RQ_PPQ
>> boundary. Is this the case? Again, I may be completely confused, but
>> ULE does not seem to consider RQ_PPQ when it assigns priorities for
>> interactive threads. Here is how I came to this conclusion:
>
> I'm concerned because the first starvation I saw in this thread was
> caused by the priority being lowered inappropriately (it was 49 vs. 45,
> IIRC). 49 means that the thread will never be chosen while the 45s are
> still in the runqueue. I'm not concerned about RQ_PPQ boundaries.

Ah, ok. Sorry for my confusion. The condition seemed somewhat reasonable
to me because the behavior of the 45s probably looks very interactive to
the scheduler. The user threads wake up, see that there is no space in
the ARC, signal the txg threads, then sleep. The txg threads then wake
up, see that the spa_zio threads are not done, signal all the user
threads, then sleep. They bounce back and forth like this very quickly
while waiting for data to be flushed to the disk. (On my system this can
take a while, since my backup pool is on a set of encrypted external USB
drives.) It seems likely that their runtime and sleeptime values are
balanced such that the scheduler marks them as high-priority interactive
threads.

So to me the interprocess communication within zfs appears somewhat
brain-damaged under low-memory conditions, but I do not think it points
to a problem in the scheduler.
It seems that no matter what algorithm the scheduler uses to determine
interactivity, an application will be able to devise a perverse workload
that gets misclassified. Anyway, that was my rough guesstimate of what
was happening. If you have time to do a more thorough analysis of the
KTR dump, that would be great. Thanks again for your help!

- Ben