From: Andriy Gapon
To: freebsd-bugs@FreeBSD.org
Date: Sun, 30 Sep 2012 13:50:03 GMT
Message-Id: <201209301350.q8UDo357044566@freefall.freebsd.org>
Subject: Re: kern/172166: Deadlock in the networking code, possible due to a bug in the SCHED_ULE

The following reply was made to PR kern/172166; it has been noted by GNATS.

From: Andriy Gapon
To: bug-followup@FreeBSD.org, eugen@eg.sd.rdtc.ru
Subject: Re: kern/172166: Deadlock in the networking code, possible due to a bug in the SCHED_ULE
Date: Sun, 30 Sep 2012 16:42:53 +0300

on 30/09/2012 14:54 Andriy Gapon said the following:
>
> It looks like CPUs 0 - 4 are idle, but CPU 5 has load of three.
> One of those threads is the syslogd thread that holds the lock, but the
> currently running thread is 'ipmi0: kcs' thread with tid 100118.
> It would be interesting to examine what it is doing.
>

Looks like the kcs thread busy-loops here:
kcs_loop -> kcs_read_byte -> kcs_wait_for_obf.

Since this is a 6-CPU machine, the steal threshold is set to 3, so the other
CPUs don't try to take any work from CPU 5. Not sure if this is actually smart.
Maybe it would make sense to have a lower threshold, or to allow stealing of
real-time threads at a lower threshold.

Since the kcs thread is a kernel thread with real-time priority (68), it
doesn't allow any lower-priority thread to run as long as it is not sleeping.

Also, it looks like rwlock does not take care to propagate waiters' priorities
in all cases. Maybe priority propagation could have helped here, but not sure...

-- 
Andriy Gapon
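
To illustrate the starvation described above, here is a toy model of a single
CPU with a strict-priority run queue. It is only a sketch and not SCHED_ULE
code: the Thread class and run_cpu function are hypothetical names, and the
priority 120 used for syslogd is an assumed value (only the 68 for the kcs
thread comes from the report). Lower numbers mean higher priority.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    priority: int   # lower number = higher priority
    sleeps: bool    # True if the thread blocks after running for one tick
    ran: int = 0    # number of ticks this thread actually got


def run_cpu(threads, ticks):
    """Run a strict-priority queue on one CPU for the given number of ticks.

    On every tick the runnable thread with the best (lowest) priority runs.
    A thread that sleeps leaves the run queue; one that never sleeps stays
    on it and keeps winning the selection.
    """
    runnable = list(threads)
    for _ in range(ticks):
        if not runnable:
            break
        current = min(runnable, key=lambda t: t.priority)
        current.ran += 1
        if current.sleeps:
            runnable.remove(current)
    return {t.name: t.ran for t in threads}


if __name__ == "__main__":
    # Busy-looping kcs thread: syslogd (assumed priority 120) never runs.
    print(run_cpu([Thread("ipmi0: kcs", 68, sleeps=False),
                   Thread("syslogd", 120, sleeps=False)], ticks=100))
    # If the kcs thread slept instead, syslogd would get the CPU.
    print(run_cpu([Thread("ipmi0: kcs", 68, sleeps=True),
                   Thread("syslogd", 120, sleeps=False)], ticks=100))
```

In this model the lock holder only makes progress if the higher-priority
thread sleeps, or if another CPU takes the lock holder off this run queue,
which is what the steal threshold prevents in the case reported here.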