From: Ryan McIntosh
To: Jeremy Chadwick
Cc: freebsd-stable@freebsd.org, Xin LI
Date: Mon, 1 Apr 2013 16:23:21 -0400
Subject: Re: 9.1-REL Supermicro H8DCL-iF kernel panic
In-Reply-To: <20130401172501.GA12934@icarus.home.lan>
References: <515937BF.9010805@delphij.net> <51593BB8.4020403@delphij.net>
 <20130401122550.GA7367@icarus.home.lan> <5159B5FA.1080005@delphij.net>
 <20130401172501.GA12934@icarus.home.lan>
List-Id: Production branch of FreeBSD source code

I had to get some sleep lol.

Yes Jeremy, I do completely understand that and likewise FreeBSD was
unusable without any type of semi-hack fix, let alone fixing it properly,
as without msix the system was pretty slow. If you'd like access or are up
for trying to fix the driver I'm all for being a guinea pig. Let me know.

Ryan

On Mon, Apr 1, 2013 at 1:25 PM, Jeremy Chadwick wrote:

> On Mon, Apr 01, 2013 at 09:29:46AM -0700, Xin Li wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA256
> >
> > On 4/1/13 5:25 AM, Jeremy Chadwick wrote:
> > > On Mon, Apr 01, 2013 at 05:45:48AM -0400, Ryan McIntosh wrote:
> > >> I can confirm that works as intended. I appreciate the prompt
> > >> response and it looks like there's a real fix.
> > >>
> > >> For google reference for anyone else searching..
> > >>
> > >> Motherboard: Supermicro H8DCL-iF
> > >> OS: FreeBSD 9.1-RELEASE
> > >>
> > >> Boot message:
> > >> panic: m_getzone: m_getjcl: invalid cluster type
> > >> cpuid = 0
> > >> KDB: stack backtrace:
> > >> #0 0xffffffff809208a6 at kdb_backtrace+0x66
> > >> #1 0xffffffff808ea8be at panic+0x1ce
> > >> #2 0xffffffff804ad5a7 at em_refresh_mbufs+0x207
> > >> #3 0xffffffff804adb7f at em_rxeof+0x47f
> > >> #4 0xffffffff804adca4 at em_msix_rx+0x24
> > >> #5 0xffffffff808be8d4 at intr_event_execute_handlers+0x104
> > >> #6 0xffffffff808c0076 at ithread_loop+0xa6
> > >> #7 0xffffffff808bb9ef at fork_exit+0x11f
> > >> #8 0xffffffff80bc368e at fork_trampoline+0xe
> > >>
> > >> Panic image from H8DCL-iF:
> > >> http://nitemail.net/img/crash91-h8dcl-if.png
> > >>
> > >> Original image from X8DTU-6+:
> > >> http://www.grosbein.net/img/crash-91rc.png
> > >>
> > >> As per Xin Li, which seems to work:
> > >> http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_em.c?r1=238214&r2=239304&view=patch
> > >>
> > >> References:
> > >> http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063958.html
> > >> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/172113
> > >>
> > >> Thanks again,
> > >>
> > >> Ryan McIntosh
> > >> e: rmcintosh@nitemare.net
> > >>
> > >> On Mon, Apr 1, 2013 at 3:48 AM, Xin Li wrote:
> > >>
> > >>> -----BEGIN PGP SIGNED MESSAGE-----
> > >>> Hash: SHA256
> > >>>
> > >>> On 4/1/13 12:34 AM, Ryan McIntosh wrote:
> > >>>> I could try that patch, however that was intended for
> > >>>> if_igb.c. On my system the panics are almost identical,
> > >>>> except that they come from if_em, so I'd have to apply that
> > >>>> fix to if_em.c and I haven't looked at the source just yet.
> > >>>> If you can give me a patch I'll apply and test it shortly
> > >>>> though.
> > >>>
> > >>> Try this:
> > >>>
> > >>> http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_em.c?r1=238214&r2=239304&view=patch
> > >
> > > Jack Vogel has stated it's not a "real fix" (your words) but rather
> > > a "bandaid", for both igb(4) and em(4). The commit messages (for
> > > r238214 and r239304) contain details:
> > >
> > > http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_em.c#rev238214
> > > http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_em.c#rev239304
> >
> > Hm, why is 238214 related, or did you mean the change between 238214
> > and 239304?
>
> Correct (the latter). :-) The "bandaid" in 239304 **wasn't** to fix a
> bug introduced in 238214, it was an overall "bandaid".
>
> I've gotten in the habit of always examining two commits (fix + previous
> commit) to see what got introduced where.
>
> > Yes, this is a bandaid and the right fix should be to refactor the code
> > a little bit to make sure that no interrupt handler is installed before
> > the driver has done other initializations, but I don't have hardware
> > that can reproduce this issue handy to validate changes like that.
>
> Yes exactly. I just want to make sure Ryan understands that this is
> simply a workaround for said spurious interrupt scenario, while the
> actual root cause needs to be dealt with as you describe.
>
> --
> | Jeremy Chadwick                                   jdc@koitsu.org |
> | UNIX Systems Administrator                http://jdc.koitsu.org/ |
> | Mountain View, CA, US                                            |
> | Making life hard for others since 1977.             PGP 4BD6C0CB |
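
For anyone following along, the ordering problem discussed in this thread (an RX interrupt arriving before the driver has finished initializing) can be illustrated with a small user-space model. This is only a sketch under assumptions: it is not the em(4) code and not the r239304 patch. The names fake_adapter, fake_refresh_bufs, fake_rx_intr_unguarded, fake_rx_intr_guarded and fake_init are all hypothetical, and the list of accepted buffer sizes is illustrative.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-adapter driver state; NOT the real em(4) softc. */
struct fake_adapter {
    bool   running;      /* set only once initialization has finished */
    size_t rx_buf_size;  /* 0 until init picks a valid buffer size */
    int    refills;      /* number of successful RX ring refills */
};

enum rx_result { RX_OK, RX_IGNORED, RX_WOULD_PANIC };

/*
 * Models a buffer refill that only accepts a fixed set of sizes.
 * The sizes listed here are illustrative assumptions.
 */
static enum rx_result fake_refresh_bufs(struct fake_adapter *a)
{
    switch (a->rx_buf_size) {
    case 2048:
    case 4096:
    case 9216:
    case 16384:
        a->refills++;
        return RX_OK;
    default:
        /* In this model, this is where a kernel would panic. */
        return RX_WOULD_PANIC;
    }
}

/* Handler with no guard: an early interrupt reaches the refill path. */
static enum rx_result fake_rx_intr_unguarded(struct fake_adapter *a)
{
    return fake_refresh_bufs(a);
}

/* Workaround-style handler: ignore interrupts until init has completed. */
static enum rx_result fake_rx_intr_guarded(struct fake_adapter *a)
{
    if (!a->running)
        return RX_IGNORED;
    return fake_refresh_bufs(a);
}

/* Initialization: choose a valid buffer size, then mark the adapter running. */
static void fake_init(struct fake_adapter *a)
{
    a->rx_buf_size = 2048;
    a->running = true;
}
```

With a zero-initialized fake_adapter, the unguarded handler reaches the refill path with an invalid size, while the guarded one returns early. That mirrors the distinction drawn above: a guard only hides the early interrupt, whereas installing the handler after initialization would keep it from being delivered to half-initialized state in the first place.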