From: Pyun YongHyeon
Date: Sun, 29 Aug 2010 17:57:14 -0700
To: Philipp Wuensche
Cc: freebsd-stable@freebsd.org, Jack Vogel, Jeremy Chadwick
Subject: Re: Crashes on X7SPE-HF with em
Message-ID: <20100830005714.GB1330@michelle.cdnetworks.com>
In-Reply-To: <4C791ADB.9040505@h3q.com>
Reply-To: pyunyh@gmail.com
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="d6Gm4EdcadzBjdND"
Content-Disposition: inline

--d6Gm4EdcadzBjdND
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sat, Aug 28, 2010 at 04:19:07PM +0200, Philipp Wuensche wrote:
> Philipp Wuensche wrote:
> >
> > It just now started running the kernel without IPSEC and ALTQ.
>
> Here we go again, this time it crashed with IPSEC and ALTQ disabled,
> crashdump looks different this time though.
>
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0xffff80400bc58038
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0xffffffff808a41ae
> stack pointer           = 0x28:0xffffff80000e69a0
> frame pointer           = 0x28:0xffffff80000e69b0
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 0 (em1 taskq)
> trap number             = 12
> panic: page fault
> cpuid = 0
> Uptime: 23h30m3s
> Physical memory: 4079 MB
> Dumping 1907 MB: 1892 1876em1: Watchdog timeout -- resetting
> 1860 1844 1828 1812 1796 1780 1764 1748 1732 1716 1700 1684 1668 1652
> 1636 1620 1604 1588 1572 1556 1540 1524 1508 1492 1476 1460 1444 1428
> 1412 1396 1380 1364 1348 1332 1316 1300 1284 1268 1252 1236 1220 1204
> 1188 1172 1156 1140 1124 1108 1092 1076 1060 1044 1028 1012 996 980 964
> 948 932 916 900 884 868 852 836 820 804 788 772 756 740 724 708 692 676
> 660 644 628 612 596 580 564 548 532 516 500 484 468 452 436 420 404 388
> 372 356 340 324 308 292 276 260 244 228 212 196 180 164 148 132 116 100
> 84 68 52 36 20 4
>
> Reading symbols from /boot/kernel/zfs.ko...Reading symbols from
> /boot/kernel/zfs.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/zfs.ko
> Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from
> /boot/kernel/opensolaris.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/opensolaris.ko
> Reading symbols from /boot/kernel/geom_stripe.ko...Reading symbols from
> /boot/kernel/geom_stripe.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/geom_stripe.ko
> Reading symbols from /boot/kernel/coretemp.ko...Reading symbols from
> /boot/kernel/coretemp.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/coretemp.ko
> Reading symbols from /boot/kernel/ahci.ko...Reading symbols from
> /boot/kernel/ahci.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/ahci.ko
> Reading symbols from /boot/kernel/ipmi.ko...Reading symbols from
> /boot/kernel/ipmi.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/ipmi.ko
> Reading symbols from /boot/kernel/smbus.ko...Reading symbols from
> /boot/kernel/smbus.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/smbus.ko
> Reading symbols from /boot/kernel/pflog.ko...Reading symbols from
> /boot/kernel/pflog.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/pflog.ko
> Reading symbols from /boot/kernel/pf.ko...Reading symbols from
> /boot/kernel/pf.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/pf.ko
> #0  doadump () at pcpu.h:224
> 224		__asm("movq %%gs:0,%0" : "=r" (td));
> (kgdb) list *0xffffffff808a41ae
> 0xffffffff808a41ae is in pmap_kextract
> (/usr/src/sys/amd64/amd64/pmap.c:1172).
> 1167		vm_paddr_t pa;
> 1168
> 1169		if (va >= DMAP_MIN_ADDRESS && va < DMAP_MAX_ADDRESS) {
> 1170			pa = DMAP_TO_PHYS(va);
> 1171		} else {
> 1172			pde = *vtopde(va);
> 1173			if (pde & PG_PS) {
> 1174				pa = (pde & PG_PS_FRAME) | (va & PDRMASK);
> 1175			} else {
> 1176				/*
> (kgdb) backtrace
> #0  doadump () at pcpu.h:224
> #1  0xffffffff805b2b5e in boot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:416
> #2  0xffffffff805b2f6c in panic (fmt=0x0)
>     at /usr/src/sys/kern/kern_shutdown.c:590
> #3  0xffffffff808ac70d in trap_fatal (frame=0xffffffff80c5cc60,
> eva=Variable "eva" is not available.
> )
>     at /usr/src/sys/amd64/amd64/trap.c:777
> #4  0xffffffff808acacf in trap_pfault (frame=0xffffff80000e68f0, usermode=0)
>     at /usr/src/sys/amd64/amd64/trap.c:693
> #5  0xffffffff808ad2e2 in trap (frame=0xffffff80000e68f0)
>     at /usr/src/sys/amd64/amd64/trap.c:451
> #6  0xffffffff808923b4 in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:224
> #7  0xffffffff808a41ae in pmap_kextract (va=51771551252551)
>     at /usr/src/sys/amd64/amd64/pmap.c:1172
> #8  0xffffffff80890f83 in bus_dmamap_load_mbuf_sg (dmat=0xffffff0002727c00,
>     map=0xffffffff80c99d40, m0=Variable "m0" is not available.
> )
>     at /usr/src/sys/amd64/amd64/busdma_machdep.c:659
> #9  0xffffffff8032f8fc in em_refresh_mbufs (rxr=0xffffff0002712600,
> limit=975)
>     at /usr/src/sys/dev/e1000/if_em.c:3691
> #10 0xffffffff8032ff3c in em_rxeof (rxr=0xffffff0002712600, count=100,
>     done=0x0) at /usr/src/sys/dev/e1000/if_em.c:4210
> #11 0xffffffff80330788 in em_handle_que (context=Variable "context" is
> not available.
> )
>     at /usr/src/sys/dev/e1000/if_em.c:1451
> #12 0xffffffff805efc94 in taskqueue_run (queue=0xffffff0002727b80)
>     at /usr/src/sys/kern/subr_taskqueue.c:239
> #13 0xffffffff805eff06 in taskqueue_thread_loop (arg=Variable "arg" is
> not available.
> )
>     at /usr/src/sys/kern/subr_taskqueue.c:360
> #14 0xffffffff80589998 in fork_exit (
>     callout=0xffffffff805efec0 ,
>     arg=0xffffff80003c2740, frame=0xffffff80000e6c80)
>     at /usr/src/sys/kern/kern_fork.c:844
> #15 0xffffffff8089288e in fork_trampoline ()
>     at /usr/src/sys/amd64/amd64/exception.S:566
> #16 0x0000000000000000 in ?? ()
> #17 0x0000000000000000 in ?? ()
> #18 0x0000000000000000 in ?? ()
> #19 0x0000000000000000 in ?? ()
> #20 0x0000000000000000 in ?? ()
> #21 0x0000000000000000 in ?? ()
> #22 0x0000000000000000 in ?? ()
> #23 0x0000000000000000 in ?? ()
> #24 0x0000000000000000 in ?? ()
> #25 0x0000000000000000 in ?? ()
> #26 0x0000000000000000 in ?? ()
> #27 0x0000000000000000 in ?? ()
> #28 0x0000000000000000 in ?? ()
> #29 0x0000000000000000 in ?? ()
> #30 0x0000000000000000 in ?? ()
> #31 0x0000000000000000 in ?? ()
> #32 0x0000000000000000 in ?? ()
> #33 0x0000000000000000 in ?? ()
> #34 0x0000000000000000 in ?? ()
> #35 0x0000000000000000 in ?? ()
> #36 0x0000000000000000 in ?? ()
> #37 0x0000000000000000 in ?? ()
> #38 0x0000000000000000 in ?? ()
> #39 0x0000000000000000 in ?? ()
> #40 0x000000000109b000 in ?? ()
> #41 0x0000000000000000 in ?? ()
> #42 0x0000000000000000 in ?? ()
> #43 0xffffffff80c823e0 in sleepq_chains ()
> #44 0xffffff00025087c0 in ?? ()
> #45 0xffffff80000e6b20 in ?? ()
> #46 0xffffff80000e6ad8 in ?? ()
> #47 0xffffff000267f7c0 in ?? ()
> #48 0xffffffff805d6a8a in sched_switch (td=0xffffff80003c2740,
>     newtd=0xffffffff805efec0, flags=Variable "flags" is not available.
> ) at /usr/src/sys/kern/sched_ule.c:1844
> Previous frame inner to this frame (corrupt stack?)

OK, thanks for the backtrace. This one points to a suspicious code
path in em(4). Would you try the attached patch and let me know
whether it makes any difference on your box?
Note that the patch has not been extensively tested, so make sure to
test it first before applying it to a production box. The patch was
generated against HEAD, but I guess it can be applied to stable/8
as well.
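
For anyone reading the trace: kgdb prints the va argument in frame #7 in decimal. The small userland sketch below converts it to hex and checks whether it lies in the upper canonical half of the amd64 address space, which is where the other kernel pointers in the trace (0xffffff...) live. The function names are made up for illustration and are not kernel APIs, and the value kgdb shows for an optimized-out argument may not be reliable, so treat the result only as a hint.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* va from frame #7 as printed by kgdb (decimal), and the same value in hex. */
#define FRAME7_VA	51771551252551ULL
#define FRAME7_VA_HEX	0x00002f1600ec0047ULL

/*
 * Hypothetical helper: returns 1 when bits 63..47 of addr are all set,
 * i.e. the address is in the upper canonical half used by the kernel.
 */
int
in_upper_canonical_half(uint64_t addr)
{
	return ((addr >> 47) == 0x1ffffULL);
}

/* Hypothetical helper: print an address in hex with its classification. */
void
describe_va(uint64_t va)
{
	/* Sanity check that the decimal and hex spellings above agree. */
	assert(FRAME7_VA == FRAME7_VA_HEX);
	printf("va = 0x%016" PRIx64 ", upper canonical half: %s\n", va,
	    in_upper_canonical_half(va) ? "yes" : "no");
}
```

Calling describe_va(FRAME7_VA) reports a lower-half address, while the instruction pointer 0xffffffff808a41ae classifies as upper half. That would be consistent with bus_dmamap_load_mbuf_sg() having been handed an mbuf whose data pointer was no longer valid, but it is only one reading of the dump.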
--d6Gm4EdcadzBjdND
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="em.rxdma.patch"

Index: sys/dev/e1000/if_em.c
===================================================================
--- sys/dev/e1000/if_em.c	(revision 211984)
+++ sys/dev/e1000/if_em.c	(working copy)
@@ -244,7 +244,7 @@ static void em_set_promisc(struct adapter *);
 static void	em_disable_promisc(struct adapter *);
 static void	em_set_multi(struct adapter *);
 static void	em_update_link_status(struct adapter *);
-static void	em_refresh_mbufs(struct rx_ring *, int);
+static int	em_refresh_mbufs(struct rx_ring *, int);
 static void	em_register_vlan(void *, struct ifnet *, u16);
 static void	em_unregister_vlan(void *, struct ifnet *, u16);
 static void	em_setup_vlan_hw_support(struct adapter *);
@@ -3675,68 +3675,51 @@ em_txeof(struct tx_ring *txr)
  *  Refresh RX descriptor mbufs from system mbuf buffer pool.
  *
  **********************************************************************/
-static void
-em_refresh_mbufs(struct rx_ring *rxr, int limit)
+static int
+em_refresh_mbufs(struct rx_ring *rxr, int i)
 {
 	struct adapter		*adapter = rxr->adapter;
 	struct mbuf		*m;
 	bus_dma_segment_t	segs[1];
 	bus_dmamap_t		map;
 	struct em_buffer	*rxbuf;
-	int			i, error, nsegs, cleaned;
+	int			error, nsegs;
 
-	i = rxr->next_to_refresh;
-	cleaned = -1;
-	while (i != limit) {
-		m = m_getcl(M_DONTWAIT, MT_DATA, M_PKTHDR);
-		if (m == NULL)
-			goto update;
-		m->m_len = m->m_pkthdr.len = MCLBYTES;
+	m = m_getcl(M_DONTWAIT, MT_DATA, M_PKTHDR);
+	if (m == NULL)
+		return (ENOBUFS);
+	m->m_len = m->m_pkthdr.len = MCLBYTES;
 
-		if (adapter->max_frame_size <= (MCLBYTES - ETHER_ALIGN))
-			m_adj(m, ETHER_ALIGN);
+	if (adapter->max_frame_size <= (MCLBYTES - ETHER_ALIGN))
+		m_adj(m, ETHER_ALIGN);
 
-		/*
-		 * Using memory from the mbuf cluster pool, invoke the
-		 * bus_dma machinery to arrange the memory mapping.
-		 */
-		error = bus_dmamap_load_mbuf_sg(rxr->rxtag, rxr->rx_sparemap,
-		    m, segs, &nsegs, BUS_DMA_NOWAIT);
-		if (error != 0) {
-			m_free(m);
-			goto update;
-		}
+	/*
+	 * Using memory from the mbuf cluster pool, invoke the
+	 * bus_dma machinery to arrange the memory mapping.
+	 */
+	error = bus_dmamap_load_mbuf_sg(rxr->rxtag, rxr->rx_sparemap, m, segs,
+	    &nsegs, 0);
+	if (error != 0) {
+		m_free(m);
+		return (error);
+	}
 
-		/* If nsegs is wrong then the stack is corrupt. */
-		KASSERT(nsegs == 1, ("Too many segments returned!"));
+	/* If nsegs is wrong then the stack is corrupt. */
+	KASSERT(nsegs == 1, ("Too many segments returned!"));
 
-		rxbuf = &rxr->rx_buffers[i];
-		if (rxbuf->m_head != NULL)
-			bus_dmamap_unload(rxr->rxtag, rxbuf->map);
+	rxbuf = &rxr->rx_buffers[i];
+	if (rxbuf->m_head != NULL) {
+		bus_dmamap_sync(rxr->rxtag, rxbuf->map, BUS_DMASYNC_POSTREAD);
+		bus_dmamap_unload(rxr->rxtag, rxbuf->map);
+	}
 
-		map = rxbuf->map;
-		rxbuf->map = rxr->rx_sparemap;
-		rxr->rx_sparemap = map;
-		bus_dmamap_sync(rxr->rxtag,
-		    rxbuf->map, BUS_DMASYNC_PREREAD);
-		rxbuf->m_head = m;
-		rxr->rx_base[i].buffer_addr = htole64(segs[0].ds_addr);
-
-		cleaned = i;
-		/* Calculate next index */
-		if (++i == adapter->num_rx_desc)
-			i = 0;
-		/* This is the work marker for refresh */
-		rxr->next_to_refresh = i;
-	}
-update:
-	bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map,
-	    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
-	if (cleaned != -1) /* Update tail index */
-		E1000_WRITE_REG(&adapter->hw,
-		    E1000_RDT(rxr->me), cleaned);
-
-	return;
+	map = rxbuf->map;
+	rxbuf->map = rxr->rx_sparemap;
+	rxr->rx_sparemap = map;
+	bus_dmamap_sync(rxr->rxtag, rxbuf->map, BUS_DMASYNC_PREREAD);
+	rxbuf->m_head = m;
+	rxr->rx_base[i].buffer_addr = htole64(segs[0].ds_addr);
+	return (0);
 }
 
 
@@ -3840,6 +3823,7 @@ em_setup_receive_ring(struct rx_ring *rxr)
 			    BUS_DMASYNC_POSTREAD);
 			bus_dmamap_unload(rxr->rxtag, rxbuf->map);
 			m_freem(rxbuf->m_head);
+			rxbuf->m_head = NULL;
 		}
 	}
 
@@ -3873,7 +3857,7 @@ em_setup_receive_ring(struct rx_ring *rxr)
 
 	/* Setup our descriptor indices */
 	rxr->next_to_check = 0;
-	rxr->next_to_refresh = 0;
+	rxr->rxdiscard = 0;
 
 	bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map,
 	    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
@@ -4107,13 +4091,13 @@ em_rxeof(struct rx_ring *rxr, int count, int *done
 	struct mbuf		*mp, *sendmp;
 	u8			status = 0;
 	u16			len;
-	int			i, processed, rxdone = 0;
+	int			i, processed, rdt, rxdone = 0;
 	bool			eop;
 	struct e1000_rx_desc	*cur;
 
 	EM_RX_LOCK(rxr);
 
-	for (i = rxr->next_to_check, processed = 0; count != 0;) {
+	for (i = rdt = rxr->next_to_check, processed = 0; count != 0;) {
 
 		if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
 			break;
@@ -4133,9 +4117,21 @@ em_rxeof(struct rx_ring *rxr, int count, int *done
 		count--;
 
 		if ((cur->errors & E1000_RXD_ERR_FRAME_ERR_MASK) == 0) {
-
+			mp = rxr->rx_buffers[i].m_head;
+			if (em_refresh_mbufs(rxr, i) != 0) {
+				ifp->if_iqdrops++;
+				if (rxr->fmp != NULL) {
+					/* Mark discarding chained mbufs. */
+					if (eop == 0)
+						rxr->rxdiscard++;
+					m_freem(rxr->fmp);
+					rxr->fmp = NULL;
+					rxr->lmp = NULL;
+				}
+				sendmp = NULL;
+				goto discard;
+			}
 			/* Assign correct length to the current fragment */
-			mp = rxr->rx_buffers[i].m_head;
 			mp->m_len = len;
 
 			if (rxr->fmp == NULL) {
@@ -4151,6 +4147,14 @@ em_rxeof(struct rx_ring *rxr, int count, int *done
 			}
 
 			if (eop) {
+				if (rxr->rxdiscard > 0) {
+					m_freem(rxr->fmp);
+					rxr->fmp = NULL;
+					rxr->lmp = NULL;
+					rxr->rxdiscard = 0;
+					sendmp = NULL;
+					goto discard;
+				}
 				rxr->fmp->m_pkthdr.rcvif = ifp;
 				ifp->if_ipackets++;
 				em_receive_checksum(cur, rxr->fmp);
@@ -4179,6 +4183,9 @@ skip:
 			}
 		} else {
 			ifp->if_ierrors++;
+			/* Mark discarding chained mbufs. */
+			if (eop == 0)
+				rxr->rxdiscard++;
 			/* Reuse loaded DMA map and just update mbuf chain */
 			mp = rxr->rx_buffers[i].m_head;
 			mp->m_len = mp->m_pkthdr.len = MCLBYTES;
@@ -4195,11 +4202,18 @@ skip:
 			sendmp = NULL;
 		}
 
+discard:
 		/* Zero out the receive descriptors status. */
 		cur->status = 0;
 		++rxdone;	/* cumulative for POLL */
-		++processed;
-
+		/* Only refresh mbufs every 8 descriptors */
+		rdt = i;
+		if (++processed == 8) {
+			bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map,
+			    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
+			E1000_WRITE_REG(&adapter->hw, E1000_RDT(rxr->me), rdt);
+			processed = 0;
+		}
 		/* Advance our pointers to the next descriptor. */
 		if (++i == adapter->num_rx_desc)
 			i = 0;
@@ -4212,18 +4226,13 @@ skip:
 			EM_RX_LOCK(rxr);
 			i = rxr->next_to_check;
 		}
-
-		/* Only refresh mbufs every 8 descriptors */
-		if (processed == 8) {
-			em_refresh_mbufs(rxr, i);
-			processed = 0;
-		}
 	}
 
 	/* Catch any remaining refresh work */
 	if (processed != 0) {
-		em_refresh_mbufs(rxr, i);
-		processed = 0;
+		bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map,
+		    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
+		E1000_WRITE_REG(&adapter->hw, E1000_RDT(rxr->me), rdt);
 	}
 
 	rxr->next_to_check = i;
Index: sys/dev/e1000/if_em.h
===================================================================
--- sys/dev/e1000/if_em.h	(revision 211984)
+++ sys/dev/e1000/if_em.h	(working copy)
@@ -310,11 +310,11 @@ struct rx_ring {
 	struct taskqueue	*tq;
 	struct e1000_rx_desc	*rx_base;
 	struct em_dma_alloc	rxdma;
-	u32			next_to_refresh;
 	u32			next_to_check;
 	struct em_buffer	*rx_buffers;
 	struct mbuf		*fmp;
 	struct mbuf		*lmp;
+	u8			rxdiscard;
 
 	/* Interrupt resources */
 	void			*tag;

--d6Gm4EdcadzBjdND--
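
As a reading aid, the refill scheme the patch appears to move to can be sketched in plain userland C. A replacement buffer is obtained for a slot before the received buffer is handed on; if no replacement is available, the frame is dropped and the slot keeps its old buffer, so a slot is never left empty. The tail index is published every 8 descriptors, plus once for any remainder. Everything here (struct ring, ring_init, ring_refresh, ring_rx, the fail_alloc hook, the ring size) is hypothetical and only models the control flow. It is not the driver code and does not touch bus_dma or hardware registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RING_SIZE	16	/* hypothetical ring size */
#define TAIL_BATCH	8	/* publish the tail every 8 descriptors */

struct ring {
	void	*buf[RING_SIZE];	/* buffer owned by each slot */
	int	tail;			/* last index published (stands in for RDT) */
	int	tail_writes;		/* number of simulated tail writes */
	int	drops;			/* frames dropped, no replacement buffer */
	int	fail_alloc;		/* test hook: force allocation failure */
};

void
ring_init(struct ring *r)
{
	for (int i = 0; i < RING_SIZE; i++) {
		r->buf[i] = malloc(64);
		assert(r->buf[i] != NULL);
	}
	r->tail = -1;
	r->tail_writes = r->drops = r->fail_alloc = 0;
}

/* Swap a fresh buffer into slot i; on failure the slot is left untouched. */
int
ring_refresh(struct ring *r, int i, void **old)
{
	void *fresh = r->fail_alloc ? NULL : malloc(64);

	if (fresh == NULL)
		return (-1);
	*old = r->buf[i];
	r->buf[i] = fresh;
	return (0);
}

/* Process count descriptors starting at start; returns the next index. */
int
ring_rx(struct ring *r, int start, int count, int *delivered)
{
	int i = start, rdt = start, processed = 0;

	while (count-- > 0) {
		void *old = NULL;

		if (ring_refresh(r, i, &old) == 0) {
			free(old);	/* stands in for passing the frame up */
			(*delivered)++;
		} else {
			r->drops++;	/* old buffer stays loaded and is reused */
		}
		rdt = i;
		if (++processed == TAIL_BATCH) {
			r->tail = rdt;
			r->tail_writes++;
			processed = 0;
		}
		if (++i == RING_SIZE)
			i = 0;
	}
	if (processed != 0) {	/* catch any remaining work */
		r->tail = rdt;
		r->tail_writes++;
	}
	return (i);
}
```

Processing 10 descriptors from index 0 in this model publishes the tail twice (at 7 and at 9). With fail_alloc set, the next descriptor is counted as a drop and its slot still holds the original buffer.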