Date: Sun, 30 Sep 2012 13:50:07 GMT
Message-Id: <201209301350.q8UDo7nY045000@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Andriy Gapon
Subject: Re: kern/172166: Deadlock in the networking code, possible due to a bug in the SCHED_ULE

The following reply was made to PR kern/172166; it has been noted by GNATS.

From: Andriy Gapon
To: bug-followup@FreeBSD.org, eugen@eg.sd.rdtc.ru
Subject: Re: kern/172166: Deadlock in the networking code, possible due to a bug in the SCHED_ULE
Date: Sun, 30 Sep 2012 16:44:09 +0300

on 30/09/2012 16:42 Andriy Gapon said the following:
> on 30/09/2012 14:54 Andriy Gapon said the following:
>>
>> It looks like CPUs 0 - 4 are idle, but CPU 5 has load of three.
>> One of those threads is the syslogd thread that holds the lock, but the
>> currently running thread is 'ipmi0: kcs' thread with tid 100118.
>> It would be interesting to examine what it is doing.
>>
>
> Looks like the kcs thread busy-loops in here: kcs_loop -> kcs_read_byte ->
> kcs_wait_for_obf.
> Since this is a 6-CPU machine, the steal threshold is set to 3, so other CPUs
> don't try to take any work from CPU 5. Not sure if this is smart actually.
> Maybe it would make sense to have a lower threshold or to allow stealing of
> real-time threads at a lower threshold.
>
> Since the kcs thread is a kernel thread with real-time priority (68), it
> doesn't allow any other lower-priority thread to run while it's not sleeping.
>
> Also, it looks like rwlock does not take care to propagate waiters' priorities
> in all cases. Maybe priority propagation could have helped here, but not
> sure...
>

In any case, the original trigger for this problem seems to be something in IPMI
that keeps that thread running.

-- 
Andriy Gapon
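
For readers following the PR, the starvation described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the SCHED_ULE code: the `Thread` class, `pick_next`, `should_steal`, the priority value 120 for syslogd, and the third thread named "other" are all hypothetical. In particular, the steal comparison is chosen only so that a queue of three with a threshold of three is left alone, as the message reports; the real ULE condition may differ.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    priority: int          # lower number = more urgent, as in FreeBSD
    runnable: bool = True  # a busy-looping thread never clears this


def pick_next(run_queue):
    """Strict-priority choice: the runnable thread with the lowest number wins."""
    candidates = [t for t in run_queue if t.runnable]
    return min(candidates, key=lambda t: t.priority) if candidates else None


def should_steal(victim_load, steal_thresh):
    """Toy steal rule (an assumption): steal only when load exceeds the threshold."""
    return victim_load > steal_thresh


def simulate(ticks, run_queue):
    """Return the name of the thread chosen at each scheduling tick."""
    return [pick_next(run_queue).name for _ in range(ticks)]


# CPU 5 as described in the PR: three threads, one of them spinning at priority 68.
cpu5 = [
    Thread("ipmi0: kcs", 68),   # priority taken from the message
    Thread("syslogd", 120),     # hypothetical priority for the lock holder
    Thread("other", 120),       # hypothetical third thread
]

print(simulate(5, cpu5))             # the kcs thread is chosen on every tick
print(should_steal(len(cpu5), 3))    # idle CPUs leave the queue alone
print(should_steal(len(cpu5), 2))    # a lower threshold would allow stealing
```

In this model syslogd never gets to run and release its lock while the kcs thread keeps spinning, and lowering the threshold is what lets an idle CPU pick up the waiting threads.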