Date: Thu, 22 Apr 2021 22:22:30 -0700
From: Kevin Bowling <kevin.bowling@kev009.com>
To: FreeBSD Net <freebsd-net@freebsd.org>
Subject: Client Networking Issues / NIC Lab
Message-ID: <CAK7dMtBy=wvi4=ES6yhO0t%2BVfXcjcTtSuMK1_Vt3t3eZPY53Yg@mail.gmail.com>
Greetings,

I have been looking into client networking issues in FreeBSD lately. To summarize the situation, common NICs like Intel gigabit (e1000 aka lem(4)/em(4)/igb(4)), Realtek (re(4)), Aquantia, and Tehuti Networks are unmaintained or not present on FreeBSD. The purpose of this thread is to gauge whether that matters and, if it does, what to do about it.

I believe it is important because we are losing out on a pipeline of new contributors by not supporting client hardware well. We risk losing NAS, firewall, and other embedded users who may not be large enough to negotiate with these vendors for support, or who lack the volume to do custom BOMs that avoid risky parts.

My opinion has been developed after researching the drivers, Bugzilla, and various internet forums where end users exchange advice or ask for help where FreeBSD is the underlying cause.

e1000 is in the best shape, with recent vendor involvement, but covers 20 years of silicon with over 100 chipsets (of which at least 60 are significant variations). Datasheets are readily available for most of them, as well as "specification updates" which list errata. There are chipsets which have been completely broken for several years. More commonly, there are cases that lead to user frustration, including with the most recent hardware. All of the silicon tends to have significant bugs around PCIe, TSO, NC-SI (IPMI sideband), arbitration conflicts with ME, and more. Intel doesn't patch the microcode on these, but many of the issues can be worked around in software. Performing an audit of the driver will take quite a while, and making and testing changes gives me concern. When we (my previous employer and team) converted these drivers to iflib, we fixed some of the common cases for PCIe and TSO issues but only had a handful of chips to test against, so the driver works better for some chips and worse, or not at all, for others.
I have started fixing some of the bugs in Bugzilla, but I only have a few e1000 variants on hand to test, and I have an unrelated full-time job, so this is just occupying limited spare time as a hobby.

re(4) is in a pretty abhorrent state. All of these chips require runtime patching of the PHY (which I believe is a DSP algorithm that gets improved over time) and MCU code. That is totally absent in FreeBSD. A vendor driver exists in net/realtek-re-kmod which contains the fixups and works alright for many users. This driver cannot be imported into FreeBSD as is. There is a strange use of the C preprocessor which blows up compile time and driver size needlessly. The out-of-tree driver has a different set of supported adapters, so some kind of meld is necessary. Realtek does not provide public chip documentation; I am trying to see if they will grant NDA access to contributors.

Aquantia has an out-of-tree driver in net/aquantia-atlantic-kmod. The code is not currently in a place where I'd like to see it in the tree. I am not really sure how common these are. The company was acquired by Marvell, which is still producing them as a client networking option while it has other IP for the higher end and higher speeds.

Tehuti Networks seems to have gone out of business. Probably not worth worrying about.

1) Do nothing. This situation has gone on for a while. Users are somewhat accustomed to purchasing FreeBSD-specific hardware for things like SOHO gateways and NAS. A lot of people just go back to Linux for client use. OpenBSD seems to have more active contribution around this kind of thing and works better for common cases, so that may be another exit ramp.

2) Quantify usage data and beg the vendors for help. This might work for Intel; however, these devices have transferred to a client team at Intel that does not plan to support FreeBSD, and Intel does not keep test systems around long enough to meet FreeBSD users' needs.
Realtek is a similar story. I am unsure how long they hold on to test systems, and they would probably need technical guidance to work with the FreeBSD community. Unsure about Marvell; I've never worked with them.

3) Build a NIC lab and focus on building community support. It would also give the vendors a place to test hardware their labs have purged (due to IT asset management policies or other bureaucratic blunders). Set some boundaries, like a 15-year window of chipsets, which should cover practical embedded use cases. There are backplane systems and/or external PCI(e) expansion systems that could be assembled to house a large number of NICs. It would probably be cheaper than this, but say a budget of $15,000 USD is enough to purchase some expansions, a couple of managed switches, and a few dozen common NICs. Community members may also send in NICs they wish to see supported or tested. For this to work out long term, there needs to be a quorum of people interested in collaborating on the issue. There are some risks around simply setting this up. Depending on the configuration, the bus topology may introduce problems unrelated to the NICs, and we'd probably need some semi-automated device.hints or devctl stuff to keep from overprovisioning system resources (work on a subset of cards at a time). An interesting extension of this would be a semi-automated validation setup for subsystem changes (significant driver changes, iflib, LRO, etc).

4) ???

Regards,
Kevin