Date: Mon, 4 Jun 2007 10:01:02 +0900
From: Pyun YongHyeon
To: Arne H Juul
Message-ID: <20070604010102.GA6456@cdnetworks.co.kr>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="Qxx1br4bt0+wmkIi"
Content-Disposition: inline
User-Agent: Mutt/1.4.2.1i
Cc: current@freebsd.org
Subject: Re: panic in tulip_rx_intr after recent changes
Reply-To: pyunyh@gmail.com
List-Id: Discussions about the use of FreeBSD-current

--Qxx1br4bt0+wmkIi
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sun, Jun 03, 2007 at 03:39:56PM +0200, Arne H Juul wrote:
> (this mail didn't make it to the list from my private
> address, so I'm resending it from work instead; my
> apologies if it suddenly appears multiple times)
>
>
> I'm getting a kernel panic during network startup with the
> "de" driver. Here's the messages from the crash dump:
>
> <118>Mounting local file systems:
> <118>.
> <118>Setting hostname: bluebox.trondheim.corp.yahoo.com.
> <118>net.inet6.ip6.auto_linklocal:
> <118>1
> <118> ->
> <118>0
> <118>
> de0: unable to load rx map, error = 27
> panic: tulip_rx_intr
> cpuid = 0
> KDB: enter: panic
> Uptime: 13s
>
> I think this must have been introduced during the last week
> or so on -CURRENT; my old kernel works OK:
>
> arnej@bluebox:~ $ uname -a
> FreeBSD bluebox 7.0-CURRENT FreeBSD 7.0-CURRENT #13: Tue May 29 08:02:41
> CEST 2007 root@bluebox:/usr/obj/home/src.cur/sys/GENERIC amd64
>
> as you can see this is on amd64 platform.
>
> it crashes here (in if_de.c):
>
>    3557         error = bus_dmamap_load_mbuf(ri->ri_data_tag, *nextout->di_map, ms,
>    3558                     tulip_dma_map_rxbuf, nextout->di_desc, BUS_DMA_NOWAIT);
>    3559         if (error) {
>    3560             device_printf(sc->tulip_dev,
>    3561                 "unable to load rx map, error = %d\n", error);
>    3562             panic("tulip_rx_intr"); /* XXX */
>    3563         }
>
> errno 27 is EFBIG, and indeed the mbuf is MCLBYTES:
>
> (kgdb) print ms[0].M_dat.MH.MH_pkthdr.len
> $22 = 2048
>
> while the tag has a lower limit:
>
> (kgdb) print ri->ri_data_tag[0].maxsegsz
> $21 = 2032
>
> it looks like this is the triggering change:
>
> RCS file: /usr/cvs/src/sys/amd64/amd64/busdma_machdep.c,v
> ----------------------------
> revision 1.81
> date: 2007/05/29 06:30:25; author: yongari; state: Exp; lines: +2 -0
> Honor maxsegsz of less than a page size in a DMA tag. Previously it
> used to return PAGE_SIZE without respect to restrictions of a DMA tag.
> This affected all of the busdma load functions that use
> _bus_dmamap_loader_buffer() as their back-end.
>
> so the questions are...
>
> Is the above change wrong?
> or is the "de" driver buggy?
> or should bus_dmamap_load_mbuf handle this somehow?
> and does it cause problems other places too?
>

I'm not familiar with de(4), but it seems to need a big cleanup.
All busdma load functions can fail, so it is the driver's job to
recover from a busdma load failure. I think explicitly invoking
panic(9) is a really bad idea.

de(4) sets the maximum size of a DMA segment to TULIP_DATA_PER_DESC
in tulip_busdma_allocring(). I don't know why the author limited the
segment size to TULIP_DATA_PER_DESC, but I guess it comes from a limit
of the hardware's DMA engine (e.g. the hardware can DMA up to
TULIP_DATA_PER_DESC bytes per segment in SG operations).
In the Rx path it allocates an mbuf with m_getcl(9), so the length of
the mbuf is MCLBYTES, which is greater than the segment size supported
by the hardware.

I guess we have two possible ways to fix de(4).
1. Nuke TULIP_DATA_PER_DESC and use MCLBYTES instead. Of course, this
   assumes the hardware can handle a segment of that size in a DMA
   operation.
2. Set the mbuf length to TULIP_DATA_PER_DESC in the Rx path after
   allocating an mbuf with m_getcl(9).

See the attached patch (I don't have de(4) hardware, so it's just
guesswork, but you should get the point). However, it still lacks
code to recover from a busdma load failure. :-(

-- 
Regards,
Pyun YongHyeon

--Qxx1br4bt0+wmkIi
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="if_de.patch"

Index: if_de.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/de/if_de.c,v
retrieving revision 1.182
diff -u -r1.182 if_de.c
--- if_de.c     23 Feb 2007 12:18:37 -0000      1.182
+++ if_de.c     4 Jun 2007 00:47:16 -0000
@@ -3553,7 +3553,7 @@
         M_ASSERTPKTHDR(ms);
         KASSERT(ms->m_data == ms->m_ext.ext_buf,
             ("rx mbuf data doesn't point to cluster"));
-        ms->m_len = ms->m_pkthdr.len = MCLBYTES;
+        ms->m_len = ms->m_pkthdr.len = TULIP_RX_BUFLEN;
         error = bus_dmamap_load_mbuf(ri->ri_data_tag, *nextout->di_map, ms,
             tulip_dma_map_rxbuf, nextout->di_desc, BUS_DMA_NOWAIT);
         if (error) {

--Qxx1br4bt0+wmkIi--
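
To make the failure mode and the missing recovery path a bit more
concrete, here is a rough user-space sketch in C. It is not kernel code
and not the de(4) implementation: struct toy_tag, struct toy_ring,
toy_dma_load() and toy_rx_refill() are hypothetical stand-ins, and the
constants 2048 and 2032 are simply the values from the kgdb output
quoted above. It also assumes the Rx tag allows only a single segment
per load. The model only shows two things: a buffer longer than
maxsegsz needs a second segment and so fails with EFBIG, and a caller
can count the failure and carry on instead of calling panic(9).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical constants, copied from the kgdb output in the thread. */
#define TOY_MCLBYTES 2048u /* ms->m_pkthdr.len of a full cluster */
#define TOY_MAXSEGSZ 2032u /* maxsegsz of the Rx data tag */

/* Toy stand-in for a DMA tag; this is NOT the real bus_dma_tag_t. */
struct toy_tag {
    size_t maxsegsz;  /* largest size of one DMA segment */
    int    nsegments; /* segments a single load may use (assumed 1) */
};

/* Toy stand-in for an Rx ring with two counters. */
struct toy_ring {
    struct toy_tag tag;
    unsigned loaded;  /* buffers successfully "mapped" */
    unsigned dropped; /* buffers given up on after a load failure */
};

/*
 * Simplified model of a busdma load: cut the buffer into chunks of at
 * most maxsegsz and return EFBIG if that takes more segments than the
 * tag allows. Real busdma also deals with alignment, boundaries and
 * bounce pages, none of which is modelled here.
 */
int
toy_dma_load(const struct toy_tag *tag, size_t len)
{
    size_t needed;

    assert(tag->maxsegsz > 0);
    needed = (len + tag->maxsegsz - 1) / tag->maxsegsz; /* round up */
    if (needed > (size_t)tag->nsegments)
        return (EFBIG);
    return (0);
}

/*
 * Refill one Rx slot. On a load failure, record a drop and hand the
 * error back to the caller rather than panicking.
 */
int
toy_rx_refill(struct toy_ring *ri, size_t len)
{
    int error;

    error = toy_dma_load(&ri->tag, len);
    if (error != 0) {
        ri->dropped++;
        return (error);
    }
    ri->loaded++;
    return (0);
}
```

With a ring whose tag is { TOY_MAXSEGSZ, 1 }, toy_rx_refill(&ri,
TOY_MCLBYTES) fails with EFBIG and bumps the drop counter, while
toy_rx_refill(&ri, TOY_MAXSEGSZ) succeeds. That mirrors option 2
above (clamp the length before loading). What a real driver should do
with the failed buffer (reuse the old mbuf, retry later, and so on) is
a separate design decision that this sketch leaves open.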