From: Attilio Rao
Reply-To: attilio@FreeBSD.org
To: John Baldwin
Cc: freebsd-stable@freebsd.org
Date: Tue, 31 Jul 2012 21:51:19 +0100
Subject: Re: [stable 9] panic on reboot: ipmi_wd_event()
In-Reply-To: <201207311634.24169.jhb@freebsd.org>
References: <1342742294.2656.24.camel@powernoodle.corp.yahoo.com> <201207311634.24169.jhb@freebsd.org>
List-Id: Production branch of FreeBSD source code

On 7/31/12, John Baldwin wrote:
> On Thursday, July 19, 2012 7:58:14 pm Sean Bruno wrote:
>> Working on the Dell R420 today, got most of it working, even the
>> broadcom ethernet cards! However, I get the following when I reboot the
>> system:
>>
>> Syncing disks, vnodes remaining...4 Sleeping thread (tid 100107, pid 9)
>> owns a non-sleepable lock
>> KDB: stack backtrace of thread 100107:
>> sched_switch() at sched_switch+0x19f
>> mi_switch() at mi_switch+0x208
>> sleepq_switch() at sleepq_switch+0xfc
>> sleepq_wait() at sleepq_wait+0x4d
>> _sleep() at _sleep+0x3f6
>> ipmi_submit_driver_request() at ipmi_submit_driver_request+0x97
>> ipmi_set_watchdog() at ipmi_set_watchdog+0xb1
>> ipmi_wd_event() at ipmi_wd_event+0x8f
>> kern_do_pat() at kern_do_pat+0x10f
>> sched_sync() at sched_sync+0x1ea
>> fork_exit() at fork_exit+0x135
>> fork_trampoline() at fork_trampoline+0xe
>
> Hmmm, the watchdog pat should probably happen without holding locks if
> possible. This is related to the IPMI watchdog being special and wanting
> to schedule a thread to work.

Patting the watchdog without the locks is not easy to do, because we
register the watchdog callbacks in eventhandlers, which are indeed
locked (and you may also end up racing against watchdog detach if you
don't use any lock at all).
There is a similar issue when you enter DDB or coredump, for example,
but this is somewhat collateral due to the "after-panic" nature of the
situation.

We should seriously look into the requirements for watchdog patting and
possibly DDB entering situations, outline the correct semantics to follow,
and refactor the code to follow them.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein