Date: Sun, 26 May 2013 02:53:01 +0200
From: Attilio Rao <attilio@freebsd.org>
To: Ryan Stone <rysto32@gmail.com>
Cc: FreeBSD Current <freebsd-current@freebsd.org>, arao@freebsd.og
Subject: Re: Incorrect comparison of ticks in deadlkres
Message-ID: <CAJ-FndARggoG_scOWxzPNhJQA3foc_dW7-wtcm9b4_AG3OsVqg@mail.gmail.com>
In-Reply-To: <CAFMmRNyQCs-yOB7gm4TRq3xcMp50PEJc0YNQLAjMs3q8iE-ZUw@mail.gmail.com>
References: <CAFMmRNyQCs-yOB7gm4TRq3xcMp50PEJc0YNQLAjMs3q8iE-ZUw@mail.gmail.com>

On Sat, May 25, 2013 at 11:55 PM, Ryan Stone <rysto32@gmail.com> wrote:
> Currently deadlkres performs the following comparison when trying to check
> for threads that have been blocked on a mutex or sleeping on an sx lock:
>
> if (TD_ON_LOCK(td) && ticks < td->td_blktick) {
>         /* check for deadlock...*/

Yes, the check does indeed look inverted.

> The test against ticks is incorrect. It results in deadlkres only
> signaling a deadlock after ticks has rolled over; at 1000 Hz this will take
> up to 49 days. From looking at the history of the code this test appears
> to be an attempt to deal with ticks rollover. However this is unnecessary;
> later on the code calculates the amount of time that has passed with:
>
> tticks = ticks - td->td_blktick;
>
> ticks was designed to exploit integer underflow in the case of rollover to
> guarantee that subtraction produces correct results in all cases (other
> than a double rollover, of course). I am going to remove the two incorrect
> tests unless somebody can point out an overflow/underflow case that I
> haven't considered.

I'm not sure I follow what you are saying.
Assume that when thread td goes to sleep, ticks is very close to the
32-bit limit, so td->td_blktick is set to a value very close to that
limit. After a while the deadlkres thread kicks in, and in the meantime
the ticks counter has overflowed, rolling back to a very low value.
How are you supposed to compute a valid value from this situation?
I think that you still need to guard against overflow of ticks for such
cases.

Additionally, if you really want to improve deadlkres, you should bring
into the logic a fix for adaptive spinning. Think about the schematic
LOR (lock order reversal) case. Because of adaptive spinning, 2 threads
deadlocking on 2 different locks will just end up spinning.
I think you should import some sort of check like the one spin mutexes
do, but with a much higher time threshold.
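
For reference, the rollover scenario discussed above can be worked
through with concrete numbers. What follows is only a minimal sketch,
not the deadlkres code: elapsed_ticks() and rollover_example() are
hypothetical names, and uint32_t is an assumption chosen so that the
wraparound is well defined in standard C (the kernel's own declaration
of ticks may differ). Under those assumptions, a single rollover cancels
out in the subtraction; a double rollover would still give a wrong
answer, as the quoted text notes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function: number of ticks elapsed
 * since 'blktick', assuming a 32-bit unsigned counter.
 */
static uint32_t
elapsed_ticks(uint32_t now, uint32_t blktick)
{
	/* Unsigned subtraction is modulo 2^32, so one wrap cancels out. */
	return (now - blktick);
}

/*
 * The scenario from the message: the thread blocks just before the
 * 32-bit limit and the check runs after the counter has wrapped.
 */
static uint32_t
rollover_example(void)
{
	uint32_t blktick = UINT32_MAX - 99;	/* 100 ticks before wrapping to 0 */
	uint32_t now = blktick + 300;		/* wraps around to 200 */

	/* After the wrap, the current value is below the recorded one. */
	assert(now < blktick);
	return (elapsed_ticks(now, blktick));
}
```

In this sketch rollover_example() yields 300, the number of ticks that
actually passed, even though the current counter value is numerically
smaller than the recorded one.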

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein