From: Chris Fitch
To: freebsd-hackers@freebsd.org
Date: Fri, 19 Jun 2015 09:12:05 -0400
Subject: Realtime process CPU starvation

Hello all,

I have a problem that appears to be scheduler/filesystem related and could use some expert advice.

I have a process running at realtime priority 0 under FreeBSD 10.0. The main thread needs to run every 10 ms, and when it has completed its work, it yields the processor via a nanosleep() call. The sleep time is computed to be the remainder of the 10 ms period that is not needed. This mostly works, but there are occurrences where the thread doesn't run again for an extended period of time. The thread is monitoring how far behind the realtime target it is and reporting whenever it falls more than 250 ms behind. I have seen it report anywhere from 270 ms to almost 2 seconds behind. This was confirmed using an off-cpu dtrace script with the following results when the thread reported that it fell 500 ms behind:

Off cpu times (milliseconds):

  process
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |                                         18
               1 |                                         1
               2 |                                         5
               4 |                                         12
               8 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 42229
              16 |                                         7
              32 |                                         7
              64 |                                         0
             128 |                                         3
             256 |                                         1
             512 |                                         0

How can a thread with the highest realtime priority not run for such a lengthy period? I dove into the ULE scheduler, and although I was in a bit over my head, it seems to me that the only way this is possible is for kernel priority threads to be starving it of CPU. I tried decreasing kern.sched.slice to 3 hoping that smaller scheduler timeslices would decrease the wait time if several kernel threads were ahead of it in the run queue, but it didn't help.

I was suspicious of the ZFS filesystem, as the delayed running of the thread seemed to coincide with short periods of moderate disk activity. I obtained some evidence to confirm my suspicions using KTR_SCHED and schedgraph.py.
The trace shows that 'solthread 0xfffffff' was running for almost 500 ms when the problem occurred. During this 500 ms, two of the four CPUs were idle. Based on the graph, it appears that an OpenSolaris thread can starve a realtime process of CPU when idle CPUs are available.

Any thoughts?

Thanks,
Chris
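
For anyone who wants to reproduce the timing pattern, the loop structure described above (do the work, sleep for the unused remainder of the 10 ms period, report when more than 250 ms behind) can be sketched roughly as follows. This is an illustration under assumptions, not the actual application code: the names ts_to_ns, remaining_ns, lag_ns and run_periodic are hypothetical, the use of CLOCK_MONOTONIC and of an accumulating absolute deadline is a guess at one reasonable way to do it, and setting the realtime priority is left out entirely.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define PERIOD_NS    10000000LL    /* 10 ms period, as in the description */
#define LAG_LIMIT_NS 250000000LL   /* report when more than 250 ms behind */

/* Convert a timespec to a single nanosecond count. */
static int64_t ts_to_ns(struct timespec ts)
{
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Time left until the deadline, or 0 if the deadline has already passed. */
static int64_t remaining_ns(int64_t now_ns, int64_t deadline_ns)
{
    return deadline_ns > now_ns ? deadline_ns - now_ns : 0;
}

/* How far past the deadline we are, or 0 if we are still on time. */
static int64_t lag_ns(int64_t now_ns, int64_t deadline_ns)
{
    return now_ns > deadline_ns ? now_ns - deadline_ns : 0;
}

/*
 * Run `iterations` periods. Each period calls work() (if non-NULL), then
 * sleeps for whatever is left of the 10 ms slot. Returns how many times
 * the loop found itself more than LAG_LIMIT_NS behind its deadline.
 */
static int run_periodic(int iterations, void (*work)(void))
{
    struct timespec now, nap;
    int late_reports = 0;

    clock_gettime(CLOCK_MONOTONIC, &now);
    int64_t deadline = ts_to_ns(now);

    for (int i = 0; i < iterations; i++) {
        if (work != NULL)
            work();

        /* The deadline advances by a fixed step, so stalls accumulate as lag. */
        deadline += PERIOD_NS;

        clock_gettime(CLOCK_MONOTONIC, &now);
        int64_t t = ts_to_ns(now);

        int64_t behind = lag_ns(t, deadline);
        if (behind > LAG_LIMIT_NS) {
            fprintf(stderr, "fell %lld ms behind\n",
                    (long long)(behind / 1000000LL));
            late_reports++;
        }

        /* Yield the CPU for the unused remainder of this period. */
        int64_t rest = remaining_ns(t, deadline);
        if (rest > 0) {
            nap.tv_sec = (time_t)(rest / 1000000000LL);
            nap.tv_nsec = (long)(rest % 1000000000LL);
            nanosleep(&nap, NULL);
        }
    }
    return late_reports;
}
```

A caller would invoke something like run_periodic(1000, do_work), where do_work is a placeholder for the per-period processing. Because the deadline is advanced by a fixed 10 ms instead of being recomputed after each wakeup, a long stall in nanosleep() shows up as lag on the following iterations, which is the kind of measurement the 250 ms report relies on.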