From owner-freebsd-arch@FreeBSD.ORG Thu Nov 13 10:14:28 2014
Message-ID: <54648483.5060107@freebsd.org>
Date: Thu, 13 Nov 2014 02:14:27 -0800
From: Alfred Perlstein
Organization: FreeBSD
To: Adrian Chadd
Cc: Hans Petter Selasky, "freebsd-arch@freebsd.org"
Subject: Re: Questions about locking; turnstiles and sleeping threads
References: <20141112212613.21037929@kan> <546472DA.3080006@freebsd.org>
 <5464764E.9080308@freebsd.org> <54647D1E.9010904@freebsd.org>
List-Id: Discussion related to FreeBSD architecture

Would need more context to help on this.

I can't tell based on your description which thread is holding which lock.

If A is waiting for callout C to stop AND there exists a thread B that
is contending against C for a lock, you should be fine so long as there
is no lock cycle against A.

Would be best if you pointed at some code and gave descriptions.

-Alfred

On 11/13/14, 1:52 AM, Adrian Chadd wrote:
> Hm, the more I dig into this, the more I realise it's not a 1:45am
> question to ask.
>
> Specifically, callout_stop_safe() takes 'safe', which says "are we
> waiting around for this callout to finish if it started". Ie,
> callout_drain() is callout_stop_safe(c, 1) ; callout_stop() is
> callout_stop_safe(c, 0).
>
> If safe is 1, then it'll potentially put the current thread to sleep
> in order to wait for it to synchronise with the callout that's
> running. It's sleeping with cc_lock which is the per-callwheel lock
> and it's doing that with whatever other locks are held. That's the
> situation which is tripping things up.
>
> The manpage says that no locks should be held that the callout may
> block on, which isn't the case here at all - I'm trying to grab a lock
> in another thread that the caller _into_ the callout subsystem holds.
> The manpage doesn't mention anything about this. Sniffle.
>
>
>
> -adrian
>
> On 13 November 2014 01:42, Alfred Perlstein wrote:
>> OK that makes more sense.
>>
>> I've cc'd Hans for the usb issue.
>>
>>
>> On 11/13/14, 1:38 AM, Adrian Chadd wrote:
>>> It looks like the initial firings are because the check I put in
>>> didn't check to see if it's MPSAFE.
>>>
>>> eg:
>>>
>>> ip6_input -> tcp6_input -> tcp_input -> tcp_do_segment ->
>>> tcp_timer_active -> callout_stop_safe; called with tcpinp held.
>>> closefp() -> closef() -> fdrop -> soclose() -> sofree() ->
>>> tcp_usr_detach() -> tcp_discardcb() -> callout_stop_safe() with the
>>> tcpinp and tcp locks held.
>>> ioctl -> sys_ioctl -> devfs_ioctl_f -> acpi_ackSleepState ->
>>> callout_stop_safe; with ACPI global lock held;
>>> suspend path -> hdaa_suspend -> callout_stop_safe() with HDA driver mutex
>>> held
>>>
>>> So we can't just put the simple witness check from _sleep() in
>>> _callout_stop_safe(), it looks like it's mis-firing on MPSAFE callouts
>>> (which the tcp timers all are) and that won't go via the sleepq.
>>> It looks like the acpi callout is also mpsafe, as well as the HDA jack
>>> callout.
>>>
>>> However, I did pick up this:
>>>
>>> detach path -> usbd_transfer_drain() -> usbd_transfer_stop() ->
>>> ehci_device_intr_close() -> usbd_transfer_done() ->
>>> callout_stop_safe() with USB HUB mutex held
>>>
>>> The usbd_transfer_done() callout is initialised with a mutex, but the
>>> witness code should've detected it wasn't callout->c_lock and thus
>>> warned.
>>>
>>>
>>>
>>> -adrian
>>>
>>> On 13 November 2014 01:13, Alfred Perlstein wrote:
>>>> On 11/13/14, 1:09 AM, Adrian Chadd wrote:
>>>>> On 13 November 2014 00:59, Alfred Perlstein wrote:
>>>>>> On 11/12/14, 11:25 PM, Adrian Chadd wrote:
>>>>>>> On 12 November 2014 19:48, Adrian Chadd wrote:
>>>>>>>> kan pointed out that we should likely do a WITNESS_WARN() in the
>>>>>>>> relevant spots in the callout code so we catch these before it
>>>>>>>> happens.
>>>>>>>>
>>>>>>>> I'll see what we can add to -HEAD to be pro-active about it.
>>>>>>> Amusingly, I tried adding it and it made my laptop turn to soup very
>>>>>>> quickly - among other things, the TCP timer callouts are all setup
>>>>>>> with non sleep locks held.
>>>>>>>
>>>>>> Howso? You only have to worry about callout_drain(), most other
>>>>>> callout functions should be safe-ish....
>>>>> yeah, except for all the places where they drain the timer once
>>>>> something happens so it doesn't fire.
>>>>>
>>>>> :)
>>>>
>>>> What is the backtrace...?
>>>>
>>>>
>>> _______________________________________________
>>> freebsd-arch@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
>>> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
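
For readers following the thread, here is a minimal userspace sketch of
the stop-versus-drain distinction quoted above, written with POSIX
threads rather than kernel code. Everything in it (struct model_callout,
model_callout_stop_safe(), run_demo()) is a hypothetical stand-in and
not the FreeBSD implementation. It only models "safe == 1 waits for a
running handler, safe == 0 does not", and it illustrates the manpage
rule mentioned above: do not drain while holding a lock the handler may
block on. It does not model the second case Adrian raises, where a third
thread wants a lock held by the draining caller.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model of a callout; NOT the kernel's struct callout. */
struct model_callout {
    pthread_mutex_t  state_lock;    /* protects 'running' (loosely, the role of cc_lock) */
    pthread_cond_t   done_cv;       /* broadcast when the handler returns */
    bool             running;       /* true while the handler is in progress */
    pthread_mutex_t *handler_lock;  /* lock the handler needs, e.g. a driver mutex */
    int              fired;         /* completed handler runs, guarded by handler_lock */
};

/* The "callout handler": it must take handler_lock to do its work. */
static void *model_handler_thread(void *arg)
{
    struct model_callout *c = arg;

    pthread_mutex_lock(c->handler_lock);
    c->fired++;
    pthread_mutex_unlock(c->handler_lock);

    pthread_mutex_lock(&c->state_lock);
    c->running = false;
    pthread_cond_broadcast(&c->done_cv);
    pthread_mutex_unlock(&c->state_lock);
    return NULL;
}

/*
 * Model of the 'safe' flag: with safe != 0 the caller sleeps until the
 * handler has finished (drain); with safe == 0 it returns at once (stop).
 * Returns true if the handler is not running when we return.
 */
static bool model_callout_stop_safe(struct model_callout *c, int safe)
{
    bool stopped;

    pthread_mutex_lock(&c->state_lock);
    while (safe && c->running)
        pthread_cond_wait(&c->done_cv, &c->state_lock);
    stopped = !c->running;
    pthread_mutex_unlock(&c->state_lock);
    return stopped;
}

/* Returns 0 if the model behaved as described, nonzero otherwise. */
static int run_demo(void)
{
    pthread_mutex_t driver_lock;
    struct model_callout c;
    pthread_t t;
    bool stopped_nowait, stopped_drain;

    pthread_mutex_init(&driver_lock, NULL);
    pthread_mutex_init(&c.state_lock, NULL);
    pthread_cond_init(&c.done_cv, NULL);
    c.running = true;               /* handler is considered in progress */
    c.handler_lock = &driver_lock;
    c.fired = 0;

    /* The caller holds the very lock the handler wants. */
    pthread_mutex_lock(&driver_lock);
    if (pthread_create(&t, NULL, model_handler_thread, &c) != 0)
        return -1;

    /* safe == 0 returns immediately; the handler is still blocked. */
    stopped_nowait = model_callout_stop_safe(&c, 0);

    /*
     * Calling with safe == 1 here, while still holding driver_lock,
     * would deadlock: we would wait for a handler that waits for us.
     * Drop the lock first, then drain.
     */
    pthread_mutex_unlock(&driver_lock);
    stopped_drain = model_callout_stop_safe(&c, 1);

    pthread_join(t, NULL);
    return (!stopped_nowait && stopped_drain && c.fired == 1) ? 0 : 1;
}
```

Calling run_demo() from a test program (built with -pthread) should
return 0: the non-waiting stop reports the handler as still running, and
the drain completes only after the caller has released the lock.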