From: Vlad Galu
Date: Tue, 14 Sep 2010 01:08:08 +0300
To: pyunyh@gmail.com
Cc: freebsd-net@freebsd.org, Igor Sysoev
Subject: Re: bge hangs on recent 7.3-STABLE
In-Reply-To: <20100913180447.GA1229@michelle.cdnetworks.com>
References: <20100909102826.GB53812@rambler-co.ru>
 <20100909201050.GG7203@michelle.cdnetworks.com>
 <20100909211808.GJ7203@michelle.cdnetworks.com>
 <20100913142707.GL10050@rambler-co.ru>
 <20100913180447.GA1229@michelle.cdnetworks.com>
List-Id: Networking and TCP/IP with FreeBSD

On Mon, Sep 13, 2010 at 9:04 PM, Pyun YongHyeon wrote:
> On Mon, Sep 13, 2010 at 06:27:08PM +0400, Igor Sysoev wrote:
>> On Thu, Sep 09, 2010 at 02:18:08PM -0700, Pyun YongHyeon wrote:
>>
>> > On Thu, Sep 09, 2010 at 01:10:50PM -0700, Pyun YongHyeon wrote:
>> > > On Thu, Sep 09, 2010 at 02:28:26PM +0400, Igor Sysoev wrote:
>> > > > Hi,
>> > > >
>> > > > I have several hosts running FreeBSD/amd64
>> > > > 7.2-STABLE updated on 11.01.2010
>> > > > and 25.02.2010. Hosts process about 10K input and 10K output packets/s
>> > > > without issues. One of them, however, is loaded more than others, so it
>> > > > processes 20K/20K packets/s.
>> > > >
>> > > > Recently, I have upgraded one host to 7.3-STABLE, 24.08.2010.
>> > > > Then bge on this host hung two times. I was able to restart it from
>> > > > console using:
>> > > >   /etc/rc.d/netif restart bge0
>> > > >
>> > > > Then I have upgraded the most loaded (20K/20K) host to 7.3-STABLE, 07.09.2010.
>> > > > After reboot bge hung every several seconds. I was able to restart it,
>> > > > but bge hung again after several seconds.
>> > > >
>> > > > Then I have downgraded this host to 7.3-STABLE, 14.08.2010, since there
>> > > > were several if_bge.c commits on 15.08.2010. The same hangs.
>> > > > Then I have downgraded this host to 7.3-STABLE, 17.03.2010, before
>> > > > the first if_bge.c commit after 25.02.2010. Now it runs without hangs.
>> > > >
>> > > > The hosts are amd64 dual core SMP with 4G machines. bge information:
>> > > >
>> > > > bge0@pci0:4:0:0:        class=0x020000 card=0x165914e4 chip=0x165914e4 rev=0x11 hdr=0x00
>> > > >     vendor     = 'Broadcom Corporation'
>> > > >     device     = 'NetXtreme Gigabit Ethernet PCI Express (BCM5721)'
>> > > >
>> > > > bge0: mem 0xfe5f0000-0xfe5fffff irq 19 at device 0.0 on pci4
>> > > > miibus1: on bge0
>> > > > brgphy0: PHY 1 on miibus1
>> > > > brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
>> > > > bge0: Ethernet address: 00:e0:81:5f:6e:8a
>> > > >
>> > >
>> > > Could you show me verbose boot message(bge part only)?
>> > > Also show me the output of "pciconf -lcbv".
>> > >
>> >
>> > Forgot to send a patch. Let me know whether attached patch fixes
>> > the issue or not.
>>
>> > Index: sys/dev/bge/if_bge.c
>> > ===================================================================
>> > --- sys/dev/bge/if_bge.c    (revision 212341)
>> > +++ sys/dev/bge/if_bge.c    (working copy)
>> > @@ -3386,9 +3386,11 @@
>> >      sc->bge_rx_saved_considx = rx_cons;
>> >      bge_writembx(sc, BGE_MBX_RX_CONS0_LO, sc->bge_rx_saved_considx);
>> >      if (stdcnt)
>> > -           bge_writembx(sc, BGE_MBX_RX_STD_PROD_LO, sc->bge_std);
>> > +           bge_writembx(sc, BGE_MBX_RX_STD_PROD_LO, (sc->bge_std +
>> > +               BGE_STD_RX_RING_CNT - 1) % BGE_STD_RX_RING_CNT);
>> >      if (jumbocnt)
>> > -           bge_writembx(sc, BGE_MBX_RX_JUMBO_PROD_LO, sc->bge_jumbo);
>> > +           bge_writembx(sc, BGE_MBX_RX_JUMBO_PROD_LO, (sc->bge_jumbo +
>> > +               BGE_JUMBO_RX_RING_CNT - 1) % BGE_JUMBO_RX_RING_CNT);
>> >  #ifdef notyet
>> >      /*
>> >       * This register wraps very quickly under heavy packet drops.
>>
>> Thank you, it seems the patch has fixed the bug.
>> BTW, I noticed the same hangs on FreeBSD 8.1, date=2010.09.06.23.59.59
>> I will apply the patch on all my updated hosts.
>>
>
> Thanks for testing. I'm afraid bge(4) in HEAD, stable/8 and
> stable/7 (including 8.1-RELEASE and 7.3-RELEASE) may suffer from
> this issue. Let me know what other hosts work with the patch.

Hi Pyun,

Thanks for the patch. It seems to have fixed the symptom in my case,
on a card identical to Igor's, but on board an IBM eServer 306m.

Regards,
Vlad

-- 
Good, fast & cheap. Pick any two.