From: Jack Vogel
To: Arnaud Lacombe
Cc: Olivier Smedts, FreeBSD current mailing list
Date: Wed, 4 May 2011 23:59:18 -0700
Subject: Re: problems with em(4) since update to driver 7.2.2
References: <4D94A354.9080903@sentex.net> <4DC07013.9070707@gmx.net> <4DC078BD.9080908@gmx.net>

OK, but what this does not explain is why I do not see this if it's so
easily reproduced. What causes the failure case, any idea?

As I said, given the code was not feasible for igb anyway, I would not be
unhappy about returning to the old way of doing things.

Jack


On Wed, May 4, 2011 at 11:03 PM, Arnaud Lacombe wrote:

> Hi,
>
> On Thu, May 5, 2011 at 1:20 AM, Arnaud Lacombe wrote:
> > Hi,
> >
> > On Wed, May 4, 2011 at 5:38 PM, Jack Vogel wrote:
> >> I have had my validation engineer busy all day, we have tried both
> >> a 9 kernel as well as 8.2, using the code from HEAD, and we
> >> cannot reproduce this problem.
> >>
> > Actually, it can be trivially reproduced by tainting `error'. As it is
> > uninitialized in HEAD, its value can be _anything_, so let's mark it
> > as explicitly invalid.
> >
> > diff -u ./if_em.c /data/src/freebsd/em-7.2.2/src/if_em.c
> > --- ./if_em.c 2011-02-18 01:18:23.000000000 -0500
> > +++ /data/src/freebsd/em-7.2.2/src/if_em.c 2011-05-05 01:12:01.000000000 -0400
> > @@ -3912,7 +3912,7 @@
> >         struct adapter *adapter = rxr->adapter;
> >         struct em_buffer *rxbuf;
> >         bus_dma_segment_t seg[1];
> > -       int i, j, nsegs, error;
> > +       int i, j, nsegs, error = -1;
> >
> > The error pointed out in this thread pops up in the next boot.
> >
> I put a call to kdb_enter() at the beginning of the function and, with
> the help of a textdump, got the backtrace [0] for every call to
> em_setup_receive_ring(). All are exactly the same:
>
> kdb_enter_why(0,c09f6511,f391aaa8,c09be1e2,c09f6511,...) at kdb_enter_why+0x3b
> kdb_enter(c09f6511,0,3810,ffffffff,5dc,...) at kdb_enter+0x19
> em_setup_receive_ring(c3c8d600,c3c8d7a4,c3c96004,310000fa,c3c8d600,...) at em_setup_receive_ring+0x22
> em_setup_receive_structures(c3c96000,f15f2000,38,8100,3,...) at em_setup_receive_structures+0x26
> em_init_locked(c3c96000,0,c09f5de5,414,10000,...) at em_init_locked+0x2f2
> em_ioctl(c3c7d000,80206934,c3ce9d00,c07b7a0b,c3f2a230,...) at em_ioctl+0x1c3
> ifhwioctl(c3f2a230,f391ac34,c07b7a0b,c3f3e3d0,c08df1c0,...) at ifhwioctl+0x4b8
> ifioctl(c3f3e3d0,80206934,c3ce9d00,c3f2a230,c3f2a230,...) at ifioctl+0x82
> kern_ioctl(c3f2a230,3,80206934,c3ce9d00,c3ce9d00,...) at kern_ioctl+0xa8
> ioctl(c3f2a230,f391acf8,c,c,f391ad2c,...) at ioctl+0xc5
> syscall(f391ad38) at syscall+0x17d
> Xint0x80_syscall() at Xint0x80_syscall+0x20
> --- syscall (54, FreeBSD ELF32, ioctl), eip = 0x4816ee23, esp = 0xbfbfe67c, ebp = 0xbfbfe698 ---
>
> This fully explains why the main loop in em_setup_receive_ring() is
> never entered, as we always verify `j == rxr->next_to_check' (provided
> that mbufs have been refreshed if some packets were transferred) and
> return the value on the stack. As of now, besides changing the
> call-site of em_setup_receive_ring() to ensure it is never re-entered,
> I'd guess that the patch I sent earlier today is the only way to
> ensure that no junk is returned.
>
> I'd guess that the driver _is_ able to transmit, if the code was not
> explicitly calling em_stop() upon em_setup_receive_structures()
> failure.
>
> - Arnaud
>
> [0]: I wish that would have been as easy as in Linux, where a WARN()
> call does all the job automatically, but still, I should not hope for
> that much unless I am the one implementing it ... yes, free whining,
> it's 2 a.m. ...
>
> > - Arnaud
> >
> >> The data your netstat -m shows suggests to me that what's happening
> >> is somehow setup of the receive ring is running more than once maybe??
> >>
> >> You asked at one point how this could go into STABLE, well, because
> >> not only here at Intel, but at lots of external customers this code
> >> has been used and tested thoroughly.
> >>
> >> I am not calling into question your problem, but until I understand
> >> what it is I cannot "fix" it :)
> >>
> >> The thing I am guessing right now is the culprit is the setup code.
> >> The reason is that when I ported to the igb driver I found that it
> >> did not work on our newer hardware, and so I went back to the older
> >> version of setup for igb. Now, even though I have not seen hardware
> >> fail with em, maybe there is some.
> >>
> >> To help me, give me a complete pciconf -lv, and if it's a name-brand
> >> system tell me that, including all hardware in it.
> >>
> >> If you like, Olivier, I can make a version of em for you that also
> >> reverts the setup code the way I did for igb, to see if that fixes
> >> it for you?
> >>
> >> Thanks for your patience,
> >>
> >> Jack
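
The failure mode Arnaud describes in the quoted text can be illustrated with a small standalone C sketch. It is not the driver code: `sketch_ring`, `sketch_setup_ring`, the `start` field, and the `initial_error` parameter are all hypothetical names invented here. The parameter stands in for whatever value an uninitialized `error` happens to hold, so the sketch shows the effect without actually reading an uninitialized variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for an RX ring. */
struct sketch_ring {
	int start;		/* index where the setup loop begins */
	int next_to_check;	/* index where the setup loop stops */
	int num_desc;		/* number of descriptors in the ring */
	int filled;		/* slots populated by the last setup call */
};

/*
 * Walk the ring from `start' up to (not including) `next_to_check'.
 * `initial_error' models the starting value of `error': in the code
 * discussed in the thread it was left uninitialized, and the taint
 * patch set it to -1.
 */
int
sketch_setup_ring(struct sketch_ring *ring, int initial_error)
{
	int error = initial_error;
	int j;

	assert(ring != NULL && ring->num_desc > 0);
	assert(ring->start >= 0 && ring->start < ring->num_desc);
	assert(ring->next_to_check >= 0 &&
	    ring->next_to_check < ring->num_desc);

	ring->filled = 0;
	for (j = ring->start; j != ring->next_to_check;
	    j = (j + 1) % ring->num_desc) {
		ring->filled++;
		error = 0;	/* only assigned when the body runs */
	}

	/* If the loop was skipped, this is still the initial value. */
	return (error);
}
```

When `start` equals `next_to_check` the loop body never runs, so a call with `initial_error` set to -1 (mirroring the taint in the diff) reports failure even though nothing went wrong, while a call with 0 reports success. When the loop does run, the initial value is overwritten and no longer matters, which may be one reason the problem does not show up on every system.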