From owner-freebsd-current@FreeBSD.ORG Tue May 28 21:29:55 2013 Return-Path: Delivered-To: freebsd-current@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 8DCFF4C7; Tue, 28 May 2013 21:29:55 +0000 (UTC) (envelope-from ian@FreeBSD.org) Received: from mho-02-ewr.mailhop.org (mho-02-ewr.mailhop.org [204.13.248.72]) by mx1.freebsd.org (Postfix) with ESMTP id 68CC6AE2; Tue, 28 May 2013 21:29:55 +0000 (UTC) Received: from c-24-8-230-52.hsd1.co.comcast.net ([24.8.230.52] helo=damnhippie.dyndns.org) by mho-02-ewr.mailhop.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.72) (envelope-from ) id 1UhRSn-0004Cc-7F; Tue, 28 May 2013 21:29:49 +0000 Received: from [172.22.42.240] (revolution.hippie.lan [172.22.42.240]) by damnhippie.dyndns.org (8.14.3/8.14.3) with ESMTP id r4SLTk5U006631; Tue, 28 May 2013 15:29:46 -0600 (MDT) (envelope-from ian@FreeBSD.org) X-Mail-Handler: Dyn Standard SMTP by Dyn X-Originating-IP: 24.8.230.52 X-Report-Abuse-To: abuse@dyndns.com (see http://www.dyndns.com/services/sendlabs/outbound_abuse.html for abuse reporting information) X-MHO-User: U2FsdGVkX1/9NfFc/mEmTUj6ii1F/XwJ Subject: Re: Incorrect comparison of ticks in deadlkres From: Ian Lepore To: attilio@FreeBSD.org In-Reply-To: References: Content-Type: text/plain; charset="us-ascii" Date: Tue, 28 May 2013 15:29:46 -0600 Message-ID: <1369776586.1258.19.camel@revolution.hippie.lan> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit Cc: FreeBSD Current , Ryan Stone , arao@freebsd.og X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 May 2013 21:29:55 -0000 On Sun, 2013-05-26 at 02:53 +0200, Attilio Rao wrote: > On Sat, May 25, 2013 at 11:55 PM, Ryan Stone wrote: > > Currently deadlkres performs the following comparison when trying to check > > for threads that have been blocked on a mutex or sleeping on an sx lock: > > > > if (TD_ON_LOCK(td) && ticks < td->td_blktick) { > > /* check for deadlock...*/ > > Yes the check looks indeed inverted. > > > > > > > The test against ticks is incorrect. It results in deadlkres only > > signaling a deadlock after ticks has rolled over; at 1000 hz this will take > > up to 49 days. From looking at the history of the code this test appears > > to be a attempt to deal with ticks rollover. However this is necessary; > > later on the code calculates the amount of time that has passed with: > > tticks = ticks - td->td_blktick; > > > > ticks was designed to exploit integer underflow in the case of rollover to > > guarantee that subtraction produces correct results in all cases (other > > than a double rollover, of course). I am going to remove the two incorrect > > tests unless somebody can point out a overflow/underflow case that I > > haven't considered. > > I'm not sure I follow what are you saying. > Assume that when thread td goes to sleep, ticks is very close to the > 32 bits limit. Then thread td goes to sleep and td->td_blktick is set > to a value very close to 32 bits limits. > After a while deadlkres thread kicks in and in the while "ticks" > counter overflowed, rolling back to a very low value. How are you > supposed to compute a valid value from this situation? > I think that you need to still guard about overflow of ticks for such cases. > ticks is defined as a signed integer but conceptually it is unsigned -- it increments from 0 to UINT_MAX (not INT_MAX) then rolls over. If td->td_blktick is captured while ticks = UINT_MAX and later ticks has rolled over and counted back up to 15, then ticks - td->td_blktick gives an elapsed time of 16, as it should be. Whether exploiting this property of signed overflow is elegant or ugly is in the eye of the beholder. :) If the intent of the "ticks < td->td_blktick" is to avoid the deadlock check until "after enough time has passed," then I guess it should probably be something more like "(ticks - td->blktick) > SOME_THRESHOLD" so that it also uses the signed overflow trick. -- Ian