Date: Mon, 12 Apr 2010 17:02:55 -0700
From: Pyun YongHyeon <pyunyh@gmail.com>
To: "Erich Jenkins, Fuujin Group Ltd" <erich@fuujingroup.com>
Cc: freebsd-net@freebsd.org, Evgenii Davidov <dado@korolev-net.ru>
Subject: Re: Broadcom BCM5701 / HP NC6770
Message-ID: <20100413000255.GH1444@michelle.cdnetworks.com>
In-Reply-To: <4BC3B676.3070503@fuujingroup.com>
References: <20100409070147.GA77350@korolev-net.ru>
 <4BBEE18C.6040204@fuujingroup.com>
 <20100409173821.GD1085@michelle.cdnetworks.com>
 <4BC016F3.4020300@fuujingroup.com>
 <20100410212520.GB6481@michelle.cdnetworks.com>
 <4BC12097.4030508@fuujingroup.com>
 <4BC19324.3050800@fuujingroup.com>
 <20100412175701.GC1444@michelle.cdnetworks.com>
 <20100412194209.GF1444@michelle.cdnetworks.com>
 <4BC3B676.3070503@fuujingroup.com>

On Mon, Apr 12, 2010 at 06:10:30PM -0600, Erich Jenkins, Fuujin Group Ltd wrote:
> Pyun YongHyeon wrote:
> >On Mon, Apr 12, 2010 at 10:57:01AM -0700, Pyun YongHyeon wrote:
> >>On Sun, Apr 11, 2010 at 03:15:16AM -0600, Erich Jenkins, Fuujin Group Ltd
> >>wrote:
> >>>I've been muddling around in src/sys/dev on the old system and the new
> >>>system and there appear to be rather major changes to MII and bge,
> >>>possibly the whole stack?
> >>>
> >>It was not completely rewritten but many improvements were made.
> >>
> >>>There are a number of things that seem to have been merged with other
> >>>parts of the network stack, or perhaps written into the individual
> >>>drivers (someone working on the net stack would have to verify that).
> >>>
> >>>For instance, some files called in 5.3-REL seem to have gone away
> >>>completely, and in the new (unpatched) version of if_bge.c under
> >>>7.3-REL, calls to these modules are gone:
> >>>
> >>>- #include <vm/vm.h> /* for vtophys */
> >>>- #include <vm/pmap.h> /* for vtophys */
> >>One of the most significant changes would be the bus_dma(9) conversion,
> >>which is required of all drivers to make them work correctly on a
> >>variety of platforms. bus_dma(9) does not directly use vtophys
> >>anymore so these headers were nuked.
> >>
> >>>- #include <machine/clock.h> /* for DELAY */
> >>>- #include <machine/bus_memio.h>
> >>>
> >>>- #include <dev/pci/pcireg.h> (called but something changed in here)
> >>>- #include <dev/pci/pcivar.h> (ditto above)
> >>>
> >>No, these headers are still present.
> >>
> >>>It appears that the checksum features have been completely rewritten,
> >>Checksum offloading was not completely rewritten but a workaround
> >>for buggy controllers was added.
> >>
> >>>and some of the ring settings have changed. It's interesting that the
> >>>driver only fills 256 of the rx rings in the hopes that the cpu is "fast
> >>>enough to keep up with the NIC". Would a subroutine here to grab the cpu
> >>That magic number 256 is adequate for most cases but it may not be
> >>enough to handle heavy loads. Internally the controller uses a fixed
> >>512 RX buffers but bge(4) uses only half of them to save
> >>resources. I think you can increase SSLOTS to 512 to get the full 512
> >>RX buffers.
> >>
> >>>clock and count (number of procs/pipelines) be more trouble than it's
> >>>worth to "automagically" increase the number of rx rings the driver
> >>>fills based on the system in which it's installed?
> >>>
> >>Dynamically increasing the number of RX buffers is doable but it would
> >>add much more code. If there is high demand for that I would just
> >>increase the number of RX buffers to 512. The controller can't be
> >>configured to have more than 512 RX buffers.
> >>
> >>>Something also changed in pci/pcireg.h and pci/pcivar.h, but I haven't
> >>>had the time to hunt down and expand the source tree from the 5.3-REL
> >>>branch yet.
> >>>
> >>>I have other machines with copper nics utilizing the bge driver, and
> >>>there are no issues at all. Perhaps I'm getting ahead of things, but
> >>Yes, that is expected. :-)
> >>
> >>>since this seems to have been broken through several releases, would it
> >>>make any sense to split the support between the BCM5701KHB chipset and
> >>>the more recent BCM chipset to avoid causing issues with cards/systems
> >>>not currently experiencing troubles?
> >>>
> >>I'd like to if I can. Supporting a huge number of different
> >>controllers in a single driver is a maintenance nightmare. However,
> >>rewriting the parts that require special handling for certain
> >>controllers/revisions is too risky because I don't have access to
> >>most controllers.
> >>
> >>One theory I came up with while reading the code is link state
> >>handling. As I said in a previous mail, link state handling for TBI
> >>is somewhat tricky in bge(4) and the driver seems to rely on periodic
> >>register access to keep track of link state. I guess polling(4) may
> >>give different behavior on link state handling as it does not rely
> >>on interrupts at all. So would you try polling(4) and see
> >>whether that makes any difference on your box?
> >
> >If polling(4) makes it work, try the attached patch.
> >
> >
> >------------------------------------------------------------------------
> >
> >_______________________________________________
> >freebsd-net@freebsd.org mailing list
> >http://lists.freebsd.org/mailman/listinfo/freebsd-net
> >To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
> I'll get this set up. I've got a jail issue on 7.0-REL that I'm trying
> to figure out too, so it might take a few hours before I get to this.
>

I believe bge(4) in 7.0-RELEASE and 7.3-RELEASE are quite different, so
I'm not sure whether the patch works on 7.0-RELEASE.

> I just checked on a reported iSCSI error on a machine using a BCM5721
> nic (copper gigE) and I'm seeing issues like this:
>
> Apr 11 06:24:59 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad "Opcode": Got 0 expected 5.
> Apr 11 06:24:59 san0 iscsi-target: pid 863:target.c:1317: ***ERROR*** iscsi_write_data_decap() failed
> Apr 11 16:51:52 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad "Opcode": Got 0 expected 5.
> Apr 11 16:51:52 san0 iscsi-target: pid 863:target.c:1317: ***ERROR*** iscsi_write_data_decap() failed
> Apr 12 10:32:49 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad "Opcode": Got 0 expected 5.
> Apr 12 10:32:49 san0 iscsi-target: pid 863:target.c:1317: ***ERROR*** iscsi_write_data_decap() failed
> Apr 12 11:55:42 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad "Opcode": Got 0 expected 5.
> Apr 12 11:55:42 san0 iscsi-target: pid 863:target.c:1317: ***ERROR*** iscsi_write_data_decap() failed
> Apr 12 14:07:13 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad "Opcode": Got 0 expected 5.
> Apr 12 14:07:13 san0 iscsi-target: pid 863:target.c:1317: ***ERROR*** iscsi_write_data_decap() failed
>
> Any chance this could be because of the NIC chipset? I don't see this on
> any of the machines configured identically, using the em driver for
> Intel GigE nics.
>

I have no idea what is happening here. Does this also happen on
7.3-RELEASE?
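
For anyone wanting to repeat the polling(4) test suggested in this message, here is a minimal sketch of how polling is usually enabled on a FreeBSD 7.x box. The interface name bge0 is an assumption for illustration and does not come from the thread; check polling(4) and ifconfig(8) for your release before relying on it.

```shell
# Kernel configuration (requires building and booting a new kernel).
# This is a kernel config line, not a shell command:
#   options DEVICE_POLLING

# Enable polling on the interface (bge0 is an assumed interface name).
ifconfig bge0 polling

# Check the interface; POLLING should be listed among its options.
ifconfig bge0

# Return to interrupt-driven operation when the test is done.
ifconfig bge0 -polling
```

If the link behaves differently with polling enabled than with interrupts, that would be consistent with the link state theory described in the message, but it would not prove it.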