From: John Baldwin
To: freebsd-current@freebsd.org
Cc: scottl@freebsd.org, Matt Reimer
Subject: Re: arcmsr crash
Date: Fri, 13 Jul 2007 15:28:50 -0400
Message-Id: <200707131528.51396.jhb@freebsd.org>

On Tuesday 05 June 2007 05:22:38 pm Matt Reimer wrote:
> Once a week or so we're seeing a panic with a -current kernel built
> just before the gcc 4.2 import (maybe three weeks ago). The box has a
> Supermicro X7DBE/X7DBE+ motherboard with two Xeon 5160s, 16G RAM, and
> an Areca 1220 controller with eight 500G disks connected.
>
> Does this indicate that the arcmsr driver is at fault:
>
> Tracing command irq16: arcmsr0 pid 26 tid 100018 td 0xffffff040fc5b000
> cpustop_handler() at cpustop_handler+0x35
> ipi_nmi_handler() at ipi_nmi_handler+0x2e
> trap() at trap+0x365
> nmi_calltrap() at nmi_calltrap+0x8
> --- trap 0x13, rip = 0xffffffff8041ab11, rsp = 0xffffffffab59eff0, rbp
> = 0xffffffffac0a37d0 ---
> siocnclose() at siocnclose+0x21
> sio_cnputc() at sio_cnputc+0x89
> cnputc() at cnputc+0x6a
> putchar() at putchar+0x5f
> kvprintf() at kvprintf+0xd45
> printf() at printf+0xe1
> panic() at panic+0x145
> xpt_done() at xpt_done+0x14a
> arcmsr_interrupt() at arcmsr_interrupt+0x2df
> ithread_loop() at ithread_loop+0x108
> fork_exit() at fork_exit+0xaa
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffffffac0a3d30, rbp = 0 ---

Looks like it has panic'd here:

	switch (done_ccb->ccb_h.path->periph->type) {
	case CAM_PERIPH_BIO:
		mtx_lock(&cam_bioq_lock);
		TAILQ_INSERT_TAIL(&cam_bioq, &done_ccb->ccb_h,
		    sim_links.tqe);
		done_ccb->ccb_h.pinfo.index = CAM_DONEQ_INDEX;
		mtx_unlock(&cam_bioq_lock);
		swi_sched(cambio_ih, 0);
		break;
	default:
		panic("unknown periph type %d",
		    done_ccb->ccb_h.path->periph->type);
	}

which would seem to indicate that, yes, it is a driver bug.

-- 
John Baldwin
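
The reasoning above can be illustrated with a small user-space model of the dispatch in xpt_done(). This is only a sketch: the struct and enum names below are hypothetical stand-ins, not the kernel's CAM definitions, and the thread does not establish what arcmsr actually handed to xpt_done(). The assumption being illustrated is that the default branch is reached only when the completed CCB's path/periph data holds an unexpected type value, for example because the driver completed a stale or corrupted CCB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical, simplified stand-ins for the CAM structures that
 * xpt_done() walks. These are NOT the kernel definitions.
 */
enum periph_type { PERIPH_BIO = 1 };

struct periph  { int type; };
struct path    { struct periph *periph; };
struct ccb_hdr { struct path *path; };

/*
 * Models the switch quoted in the message. Returns 0 where the real
 * code would queue the completion, and -1 where it would call panic().
 */
static int
model_xpt_done(const struct ccb_hdr *h)
{
	/* The real code dereferences these pointers unconditionally. */
	assert(h != NULL && h->path != NULL && h->path->periph != NULL);

	switch (h->path->periph->type) {
	case PERIPH_BIO:
		/* Real code: insert on cam_bioq and schedule the SWI. */
		return (0);
	default:
		/* Real code: panic("unknown periph type %d", ...). */
		fprintf(stderr, "unknown periph type %d\n",
		    h->path->periph->type);
		return (-1);
	}
}
```

A CCB whose periph type is PERIPH_BIO takes the normal branch; any other value (say, leftover bytes in a reused structure) falls through to the panic branch. Since the transport layer only ever sets up the one known type in this model, an unknown value points at whoever passed the CCB in, which is the driver's interrupt handler in the backtrace.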