From: Pyun YongHyeon
Date: Thu, 29 Oct 2009 09:49:09 -0700
To: Mark Atkinson
Cc: freebsd-net@freebsd.org, bug-followup@FreeBSD.org
Subject: Re: kern/124127: [msk] watchdog timeout (missed Tx interrupts) -- recovering
Message-ID: <20091029164909.GA13275@michelle.cdnetworks.com>
References: <200910290010.n9T0A3cV083541@freefall.freebsd.org>
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD

On Thu, Oct 29, 2009 at 06:52:34AM -0700, Mark Atkinson wrote:
> Wow, not sure what to blame for that charset nightmare. Apologies.
> Here's the original message:
>
> On the unpatched -current kernel, built
>
> FreeBSD hellfire.filament.org 9.0-CURRENT FreeBSD 9.0-CURRENT #14: Mon
> Oct 19 09:12:03 PDT 2009
>
> I received the following panic today related to this:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0xdeadc10a
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0xc0987410
> stack pointer           = 0x28:0xd533dac0
> frame pointer           = 0x28:0xd533dae8
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 0 (mskc0 taskq)
> Physical memory: 495 MB
> Dumping 132 MB: 117 101 85 69 53 37 21 5
>
> Reading symbols from /boot/kernel/linux.ko...Reading symbols from
> /boot/kernel/linux.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/linux.ko
> #0 0xc08907a9 in doadump () at /usr/src/sys/kern/kern_shutdown.c:254
> 254 }
> (kgdb) bt
> #0 0xc08907a9 in doadump () at /usr/src/sys/kern/kern_shutdown.c:254
> #1 0xc04f7e37 in db_fncall (dummy1=-1067299898, dummy2=0,
> dummy3=-718022488,
>     dummy4=0xd533d898 "\200%t?") at /usr/src/sys/ddb/db_command.c:548
> #2 0xc04f8214 in db_command (last_cmdp=0xc0da059c, cmd_table=0x0,
> dopager=1)
>     at /usr/src/sys/ddb/db_command.c:445
> #3 0xc04f8352 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
> #4 0xc04fa05e in db_trap (type=12, code=0) at
> /usr/src/sys/ddb/db_main.c:229
> #5 0xc08bf2d2 in kdb_reenter () at /usr/src/sys/kern/subr_kdb.c:398
> #6 0xc0ba9b62 in trap_fatal (frame=0x1, eva=3735929098)
>     at /usr/src/sys/i386/i386/trap.c:938
> #7 0xc0baa483 in trap (frame=0xd533da80) at
> /usr/src/sys/i386/i386/trap.c:339
> #8 0xc0b8e4ab in Xlcall_syscall () at
> /usr/src/sys/i386/i386/exception.s:241
> #9 0xc0987410 in in_lltable_lookup (llt=0xc39e1000, flags=Variable "flags" is not available.
> )
>     at /usr/src/sys/netinet/in.c:1380
> #10 0xc0982470 in arpintr (m=0xc3baeb00) at
> /usr/src/sys/netinet/if_ether.c:642
> #11 0xc094227a in netisr_dispatch_src (proto=7, source=0, m=0xc0de)
>     at /usr/src/sys/net/netisr.c:932
> #12 0xc09424dd in netisr_unregister (nhp=0xc0de)
>     at /usr/src/sys/net/netisr.c:583
> #13 0xc093ac69 in ether_demux (ifp=0x0, m=0xc3baeb00)
>     at /usr/src/sys/net/if_ethersubr.c:911
> #14 0xc093b1ce in ether_output (ifp=0xc36ad400, m=0xc3baeb00,
> dst=0xc0c55c27,
>     ro=0x301010a) at /usr/src/sys/net/if_ethersubr.c:181
> ---Type <return> to continue, or q <return> to quit---
> #15 0xc070b032 in msk_handle_events (sc=0xc3686c00)
>     at /usr/src/sys/dev/msk/if_msk.c:3048
> #16 0xc070b828 in msk_int_task (arg=0xc3686c00, pending=1)
>     at /usr/src/sys/dev/msk/if_msk.c:3625
> #17 0xc08cac8c in taskqueue_run (queue=0xc36bf380)
>     at /usr/src/sys/kern/subr_taskqueue.c:72
> #18 0xc08cadcc in taskqueue_thread_loop (arg=0xc3686c8c)
>     at /usr/src/sys/kern/subr_taskqueue.c:90
> #19 0xc0869271 in fork_exit (callout=0xc08cad67 ,
>     arg=0xc3686c8c, frame=0xd533dd38) at /usr/src/sys/kern/kern_fork.c:854
> #20 0xc0b8e520 in Xatpic_intr0 () at atpic_vector.s:62
> #21 0x00000000 in ?? ()
>

I don't think this is a bug in msk(4). Qin Li fixed the bug in the
ARP code; see r198301.

For the watchdog timeout issues on the 88E8053 controller, did you
ever try disabling MSI? msk(4) has changed a lot since 7.0-RELEASE
to support newer controllers, and several workarounds were added to
address silicon bugs, so don't blindly apply experimental patches to
your controller. The 88E8053 also has a couple of hardware bugs, but
I believe msk(4) already incorporates the required workarounds. If
you can reliably reproduce the watchdog timeouts, please let me know.
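
For reference, a minimal sketch of how MSI could be disabled for
msk(4) at boot. It assumes the hw.msk.msi_disable loader tunable
documented in the msk(4) manual page is present on the release in
use; check the manual page on your system before relying on it.

```
# /boot/loader.conf
# Assumption: the hw.msk.msi_disable tunable exists on this release (see msk(4)).
# Setting it to 1 makes msk(4) fall back to legacy interrupts instead of MSI.
hw.msk.msi_disable="1"
```

Loader tunables are read at boot, so the setting takes effect only
after a reboot.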