From: Attilio Rao
To: John Baldwin
Cc: freebsd-stable@freebsd.org
Reply-To: attilio@FreeBSD.org
Date: Wed, 1 Aug 2012 15:47:13 +0100
Subject: Re: [stable 9] panic on reboot: ipmi_wd_event()
List-Id: Production branch of FreeBSD source code

On 8/1/12, Attilio Rao wrote:
> On 8/1/12, John Baldwin wrote:
>> On Tuesday, July 31, 2012 4:51:19 pm Attilio Rao wrote:
>>> On 7/31/12, John Baldwin wrote:
>>> > On Thursday, July 19, 2012 7:58:14 pm Sean Bruno wrote:
>>> >> Working on the Dell R420 today, got most of it working, even the
>>> >> broadcom ethernet cards! However, I get the following when I
>>> >> reboot the system:
>>> >>
>>> >> Syncing disks, vnodes remaining...4
>>> >> Sleeping thread (tid 100107, pid 9) owns a non-sleepable lock
>>> >> KDB: stack backtrace of thread 100107:
>>> >> sched_switch() at sched_switch+0x19f
>>> >> mi_switch() at mi_switch+0x208
>>> >> sleepq_switch() at sleepq_switch+0xfc
>>> >> sleepq_wait() at sleepq_wait+0x4d
>>> >> _sleep() at _sleep+0x3f6
>>> >> ipmi_submit_driver_request() at ipmi_submit_driver_request+0x97
>>> >> ipmi_set_watchdog() at ipmi_set_watchdog+0xb1
>>> >> ipmi_wd_event() at ipmi_wd_event+0x8f
>>> >> kern_do_pat() at kern_do_pat+0x10f
>>> >> sched_sync() at sched_sync+0x1ea
>>> >> fork_exit() at fork_exit+0x135
>>> >> fork_trampoline() at fork_trampoline+0xe
>>> >
>>> > Hmmm, the watchdog pat should probably happen without holding locks if
>>> > possible. This is related to the IPMI watchdog being special and
>>> > wanting to schedule a thread to work.
>>>
>>> The watchdog pat without the locks is not easy to do because we
>>> register the watchdog callbacks in eventhandlers, which are indeed
>>> locked (and you may also end up racing against watchdog detach, if you
>>> don't use any lock at all).
>>
>> No, eventhandlers go through several hoops to not hold any locks while
>> the eventhandler functions are running. It seems in this case that a
>> lock is held in a higher layer (sched_sync()) and that is what I was
>> talking about. Yes, it is the 'sync_mtx' that is held.
>> Something like this
>
> No, EVENTHANDLER_INVOKE() acquires eventhandler internal locks.
> Look at eventhandler_find_list() for details.

Oh, but I guess you misunderstood me -- I didn't mean to say that
eventhandler callbacks run with the eventhandler lock held, I meant to
say that it would be nice if EVENTHANDLER_INVOKE() could run lockless.
This would have avoided some issues in special contexts (I recall I had
some issues at work years ago, but they could have been predating the
STOP_SCHEDULER() patch and in DDB).

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
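
For readers following the thread, the two statements above (callbacks run without the eventhandler lock held, yet the invoke path still takes internal locks) can both be true at once. Below is a minimal single-threaded userland sketch of that pattern. It is a model only, not the kernel code: every name in it (model_lock, model_list, model_invoke, model_probe_handler) is hypothetical, and the "lock" is just a flag so the behaviour can be checked with assertions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a mutex: a flag that records ownership. */
struct model_lock { int held; };

static void model_lock_acquire(struct model_lock *l)
{
    assert(!l->held);           /* no recursion in this model */
    l->held = 1;
}

static void model_lock_release(struct model_lock *l)
{
    assert(l->held);
    l->held = 0;
}

typedef void (*model_handler_fn)(void *arg);

struct model_entry {
    model_handler_fn fn;
    void *arg;
    struct model_entry *next;
};

struct model_list {
    struct model_lock lock;     /* protects head and busy */
    struct model_entry *head;
    int busy;                   /* invocations in progress; would defer removal */
};

/*
 * Invoke every handler. The list lock is taken to walk the list, but it
 * is dropped around each callback, so handlers never run with it held.
 * The invoke path itself is therefore not lockless.
 */
static void model_invoke(struct model_list *list)
{
    struct model_entry *e;

    model_lock_acquire(&list->lock);
    list->busy++;
    for (e = list->head; e != NULL; e = e->next) {
        model_lock_release(&list->lock);
        e->fn(e->arg);
        model_lock_acquire(&list->lock);
    }
    list->busy--;
    model_lock_release(&list->lock);
}

/* Hypothetical handler that records whether it saw the list lock held. */
struct model_probe {
    struct model_list *list;
    int calls;
    int saw_lock_held;
};

static void model_probe_handler(void *arg)
{
    struct model_probe *p = arg;

    p->calls++;
    if (p->list->lock.held)
        p->saw_lock_held = 1;
}
```

Note that this model says nothing about locks held by the caller of the invoke path. In the backtrace above the problem is of that second kind: a lock taken in a higher layer is still owned when the handler goes to sleep.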