From: Ivan Klymenko
To: Hans Petter Selasky
Cc: FreeBSD Current
Date: Thu, 1 Jan 2015 00:56:13 +0200
Subject: Re: [RFC] kern/kern_timeout.c rewrite in progress
Message-ID: <20150101005613.4f788b0c@nonamehost.local>
In-Reply-To: <54A1B38C.1000709@selasky.org>

On Mon, 29 Dec 2014 21:03:24 +0100
Hans Petter Selasky wrote:

> Hi,
>
> I recently came across a class of errors which led me to investigate
> "kern/kern_timeout.c" and its subsystem. From what I can see, new
> features like SMP awareness have been "added" instead of fully
> "integrated". When going into the corner cases I've uncovered that
> the internal callout state machine can sometimes report wrong values
> via its callout_active() and callout_pending() bits to its clients,
> which in turn can make the clients behave badly. I further
> investigated how the safety of callout migration between CPUs is
> maintained. When I looked into the code and found stuff like
> "volatile" and "while()" loops to figure out which CPU a callout
> belongs to, I understood that such logic completely undermines the
> cleverness found in the turnstiles of mutexes, and decided to go
> through all of the logic inside "kern_timeout.c". Also, static code
> analysis is harder when we don't use the basic mutexes and condition
> variables available in the kernel.
>
> First of all we need to make some driving rules for everyone:
>
> 1) A new feature called direct callbacks, which executes the timer
> callbacks from the fast interrupt handler, was added. All these
> callbacks _must_ be associated with a regular spinlock, to maintain
> a safe callout_drain(). Else they should only be executed on CPU0.
>
> 2) All Giant-locked callbacks should only execute on CPU0 to avoid
> congestion.
>
> 3) Callbacks using read-only locks should also only execute on CPU0,
> to avoid multiple instances pending for completion on multiple CPUs,
> because read-only locks can be entered multiple times. From what I
> can see, there are currently no consumers of this feature in the
> kernel.
>

...

panic: spin lock held too long
http://paste.org.ru/?acf7io
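
Read literally, the three rules quoted above amount to a small CPU-selection function. The following is only a user-space sketch of that reading, not code from kern_timeout.c or from the proposed rewrite: the names `lock_kind`, `LOCK_*` and `pick_callout_cpu` are hypothetical and were made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical lock kinds for illustration; not kernel identifiers. */
enum lock_kind {
	LOCK_NONE,
	LOCK_SPIN,
	LOCK_MUTEX,
	LOCK_GIANT,
	LOCK_READ_ONLY
};

/*
 * Return the CPU a callout would be allowed to run on under the three
 * proposed rules. "direct" means the callback runs from the fast
 * interrupt handler; "requested_cpu" is the CPU the client asked for.
 */
int
pick_callout_cpu(enum lock_kind lock, bool direct, int requested_cpu)
{
	/* Rule 1: a direct callback without a spin lock stays on CPU0. */
	if (direct && lock != LOCK_SPIN)
		return (0);
	/* Rule 2: Giant-locked callbacks stay on CPU0. */
	if (lock == LOCK_GIANT)
		return (0);
	/* Rule 3: callbacks using read-only locks stay on CPU0. */
	if (lock == LOCK_READ_ONLY)
		return (0);
	/* Everything else may run where the client requested. */
	return (requested_cpu);
}
```

Under this reading, a direct callback protected by a spin lock keeps its requested CPU, while a direct callback protected by an ordinary mutex, or any Giant-locked callback, is forced onto CPU0.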