From owner-freebsd-xen@FreeBSD.ORG Mon Sep 12 23:37:03 2011 Return-Path: Delivered-To: freebsd-xen@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D62B41065673 for ; Mon, 12 Sep 2011 23:37:03 +0000 (UTC) (envelope-from gibbs@scsiguy.com) Received: from aslan.scsiguy.com (www.scsiguy.com [70.89.174.89]) by mx1.freebsd.org (Postfix) with ESMTP id A8DC88FC08 for ; Mon, 12 Sep 2011 23:37:03 +0000 (UTC) Received: from Justins-MacBook-Pro.local (207-225-98-3.dia.static.qwest.net [207.225.98.3]) (authenticated bits=0) by aslan.scsiguy.com (8.14.4/8.14.4) with ESMTP id p8CNcJHW037161 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Mon, 12 Sep 2011 17:38:19 -0600 (MDT) (envelope-from gibbs@scsiguy.com) Message-ID: <4E6E978C.5060608@scsiguy.com> Date: Mon, 12 Sep 2011 17:36:44 -0600 From: "Justin T. Gibbs" User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:6.0.2) Gecko/20110902 Thunderbird/6.0.2 MIME-Version: 1.0 To: "freebsd-xen@freebsd.org" Content-Type: multipart/mixed; boundary="------------030606050402080907060207" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (aslan.scsiguy.com [70.89.174.89]); Mon, 12 Sep 2011 17:38:19 -0600 (MDT) Cc: Subject: [CFT][PATCH] Netfront fixes for FreeBSD-head X-BeenThere: freebsd-xen@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion of the freebsd port to xen - implementation and usage List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Sep 2011 23:37:03 -0000 This is a multi-part message in MIME format. --------------030606050402080907060207 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Folks, I'm planning to ask RE for permission to merge the following netfront fixes (listed below) into 9.0/head. Before I do so, I'd appreciate some community testing on the proposed patch set (attached). The main areas of concern are: o TSO/LRO and performance o Compatibility with NetBSD and other Dom0s that lack sg or TSO/LRO support. o Compatibility with configurations enabling ioemu on an interface. Thanks, Justin Change 504547 on 2011/09/12 by justing@justing-ns1 Update netfront so that it queries and honors published back-end features. Reported by: Hugo Silva (fix inspired by patch provided) sys/dev/xen/netfront/netfront.c: o Add xn_query_features() which reads the XenStore and records the TSO, LRO, and chained ring-request support of the backend. o Rename xn_configure_lro() to xn_configure_features() and use this routine to manage the setup of TSO, LRO, and checksum offload. o In create_netdev(), initialize if_capabilities and if_hwassist to the capabilities found on all backends. Delegate configuration of if_capenable and the TSO flag if if_hwassist to xn_configure_features(). Change 504474 on 2011/09/12 by justing@justing-ns1 Modify the netfront driver so it can successfully attach to PV devices with the ioemu attribute set. Reported by: Janne Snabb (fix inspired by patch provided) PR: kern/154302 sys/dev/xen/netfront/netfront.c: o If a mac address for the interface cannot be found in the front-side XenStore tree, look for an entry in the back-side tree. With ioemu devices, the emulator does not populate the front side tree and neither does Xend. o Return an error rather than panic when an attach attempt fails. Change 503342 on 2011/09/01 by justing@justing-ns1 Correct suspend/resume support in the Netfront driver. sys/dev/xen/netfront/netfront.c: o Implement netfront_suspend(), a specialized suspend handler for the netfront driver. This routine simply disables the carrier so the driver is idle during system suspend processing. o Fix a leak when re-initializing LRO during a link reset. o In netif_release_tx_bufs(), when cleaning up the grant references for our TX ring, use gnttab_end_foreign_access_ref instead of attempting to grant the page again. o In netif_release_tx_bufs(), we do not track mbufs associated with mbuf chains, but instead just free each mbuf directly. Use m_free(), not m_freem(), to avoid double frees of mbufs. o Refactor some code to enhance clarity. --------------030606050402080907060207 Content-Type: text/plain; x-mac-type="0"; x-mac-creator="0"; name="netfront.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="netfront.diff" --- //depot/vendor/FreeBSD/head/sys/dev/xen/netfront/netfront.c 2011-06-11 05:00:28.000000000 -0600 +++ //depot/SpectraBSD/head/sys/dev/xen/netfront/netfront.c 2011-09-12 23:13:27.000000000 -0600 @@ -92,7 +92,8 @@ #include "xenbus_if.h" -#define XN_CSUM_FEATURES (CSUM_TCP | CSUM_UDP | CSUM_TSO) +/* Features supported by all backends. TSO and LRO can be negotiated */ +#define XN_CSUM_FEATURES (CSUM_TCP | CSUM_UDP) #define NET_TX_RING_SIZE __RING_SIZE((netif_tx_sring_t *)0, PAGE_SIZE) #define NET_RX_RING_SIZE __RING_SIZE((netif_rx_sring_t *)0, PAGE_SIZE) @@ -159,6 +160,8 @@ static void xn_ifinit_locked(struct netfront_info *); static void xn_ifinit(void *); static void xn_stop(struct netfront_info *); +static void xn_query_features(struct netfront_info *np); +static int xn_configure_features(struct netfront_info *np); #ifdef notyet static void xn_watchdog(struct ifnet *); #endif @@ -174,7 +177,7 @@ static int create_netdev(device_t dev); static void netif_disconnect_backend(struct netfront_info *info); static int setup_device(device_t dev, struct netfront_info *info); -static void end_access(int ref, void *page); +static void free_ring(int *ref, void *ring_ptr_ref); static int xn_ifmedia_upd(struct ifnet *ifp); static void xn_ifmedia_sts(struct ifnet *ifp, struct ifmediareq *ifmr); @@ -261,6 +264,7 @@ u_int irq; u_int copying_receiver; u_int carrier; + u_int maxfrags; /* Receive-ring batched refills. */ #define RX_MIN_TARGET 32 @@ -405,11 +409,33 @@ { int error, i; char *s, *e, *macstr; + const char *path; - error = xs_read(XST_NIL, xenbus_get_node(dev), "mac", NULL, - (void **) &macstr); - if (error) + path = xenbus_get_node(dev); + error = xs_read(XST_NIL, path, "mac", NULL, (void **) &macstr); + if (error == ENOENT) { + /* + * Deal with missing mac XenStore nodes on devices with + * HVM emulation (the 'ioemu' configuration attribute) + * enabled. + * + * The HVM emulator may execute in a stub device model + * domain which lacks the permission, only given to Dom0, + * to update the guest's XenStore tree. For this reason, + * the HVM emulator doesn't even attempt to write the + * front-side mac node, even when operating in Dom0. + * However, there should always be a mac listed in the + * backend tree. Fallback to this version if our query + * of the front side XenStore location doesn't find + * anything. + */ + path = xenbus_get_otherend_path(dev); + error = xs_read(XST_NIL, path, "mac", NULL, (void **) &macstr); + } + if (error != 0) { + xenbus_dev_fatal(dev, error, "parsing %s/mac", path); return (error); + } s = macstr; for (i = 0; i < ETHER_ADDR_LEN; i++) { @@ -450,7 +476,7 @@ err = create_netdev(dev); if (err) { xenbus_dev_fatal(dev, err, "creating netdev"); - return err; + return (err); } #if __FreeBSD_version >= 700000 @@ -460,10 +486,22 @@ &xn_enable_lro, 0, "Large Receive Offload"); #endif - return 0; + return (0); } +static int +netfront_suspend(device_t dev) +{ + struct netfront_info *info = device_get_softc(dev); + XN_RX_LOCK(info); + XN_TX_LOCK(info); + netfront_carrier_off(info); + XN_TX_UNLOCK(info); + XN_RX_UNLOCK(info); + return (0); +} + /** * We are reconnecting to the backend, due to a suspend/resume, or a backend * driver restart. We tear down our netif structure and recreate it, but @@ -749,10 +787,7 @@ */ if (((uintptr_t)m) <= NET_TX_RING_SIZE) continue; - gnttab_grant_foreign_access_ref(np->grant_tx_ref[i], - xenbus_get_otherend_id(np->xbdev), - virt_to_mfn(mtod(m, vm_offset_t)), - GNTMAP_readonly); + gnttab_end_foreign_access_ref(np->grant_tx_ref[i]); gnttab_release_grant_reference(&np->gref_tx_head, np->grant_tx_ref[i]); np->grant_tx_ref[i] = GRANT_REF_INVALID; @@ -761,7 +796,7 @@ if (np->xn_cdata.xn_tx_chain_cnt < 0) { panic("netif_release_tx_bufs: tx_chain_cnt must be >= 0"); } - m_freem(m); + m_free(m); } } @@ -1494,7 +1529,7 @@ * deal with nfrags > MAX_TX_REQ_FRAGS, which is a quirk of * the Linux network stack. */ - if (nfrags > MAX_TX_REQ_FRAGS) { + if (nfrags > sc->maxfrags) { m = m_defrag(m_head, M_DONTWAIT); if (!m) { /* @@ -1911,6 +1946,8 @@ return (error); /* Step 1: Reinitialise variables. */ + xn_query_features(np); + xn_configure_features(np); netif_release_tx_bufs(np); /* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */ @@ -1978,6 +2015,67 @@ #endif } +static void +xn_query_features(struct netfront_info *np) +{ + int val; + + device_printf(np->xbdev, "backend features:"); + + if (xs_scanf(XST_NIL, xenbus_get_otherend_path(np->xbdev), + "feature-sg", NULL, "%d", &val) < 0) + val = 0; + + np->maxfrags = 1; + if (val) { + np->maxfrags = MAX_TX_REQ_FRAGS; + printf(" feature-sg"); + } + + if (xs_scanf(XST_NIL, xenbus_get_otherend_path(np->xbdev), + "feature-gso-tcpv4", NULL, "%d", &val) < 0) + val = 0; + + np->xn_ifp->if_capabilities &= ~(IFCAP_TSO4|IFCAP_LRO); + if (val) { + np->xn_ifp->if_capabilities |= IFCAP_TSO4|IFCAP_LRO; + printf(" feature-gso-tcp4"); + } + + printf("\n"); +} + +static int +xn_configure_features(struct netfront_info *np) +{ + int err; + + err = 0; +#if __FreeBSD_version >= 700000 + if ((np->xn_ifp->if_capenable & IFCAP_LRO) != 0) + tcp_lro_free(&np->xn_lro); +#endif + np->xn_ifp->if_capenable = + np->xn_ifp->if_capabilities & ~(IFCAP_LRO|IFCAP_TSO4); + np->xn_ifp->if_hwassist &= ~CSUM_TSO; +#if __FreeBSD_version >= 700000 + if (xn_enable_lro && (np->xn_ifp->if_capabilities & IFCAP_LRO) != 0) { + err = tcp_lro_init(&np->xn_lro); + if (err) { + device_printf(np->xbdev, "LRO initialization failed\n"); + } else { + np->xn_lro.ifp = np->xn_ifp; + np->xn_ifp->if_capenable |= IFCAP_LRO; + } + } + if ((np->xn_ifp->if_capabilities & IFCAP_TSO4) != 0) { + np->xn_ifp->if_capenable |= IFCAP_TSO4; + np->xn_ifp->if_hwassist |= CSUM_TSO; + } +#endif + return (err); +} + /** Create a network device. * @param handle device handle */ @@ -2002,7 +2100,7 @@ np->rx_target = RX_MIN_TARGET; np->rx_min_target = RX_MIN_TARGET; np->rx_max_target = RX_MAX_TARGET; - + /* Initialise {tx,rx}_skbs to be a free chain containing every entry. */ for (i = 0; i <= NET_TX_RING_SIZE; i++) { np->tx_mbufs[i] = (void *) ((u_long) i+1); @@ -2032,11 +2130,8 @@ } err = xen_net_read_mac(dev, np->mac); - if (err) { - xenbus_dev_fatal(dev, err, "parsing %s/mac", - xenbus_get_node(dev)); + if (err) goto out; - } /* Set up ifnet structure */ ifp = np->xn_ifp = if_alloc(IFT_ETHER); @@ -2055,19 +2150,6 @@ ifp->if_hwassist = XN_CSUM_FEATURES; ifp->if_capabilities = IFCAP_HWCSUM; -#if __FreeBSD_version >= 700000 - ifp->if_capabilities |= IFCAP_TSO4; - if (xn_enable_lro) { - int err = tcp_lro_init(&np->xn_lro); - if (err) { - device_printf(dev, "LRO initialization failed\n"); - goto exit; - } - np->xn_lro.ifp = ifp; - ifp->if_capabilities |= IFCAP_LRO; - } -#endif - ifp->if_capenable = ifp->if_capabilities; ether_ifattach(ifp, np->mac); callout_init(&np->xn_stat_ch, CALLOUT_MPSAFE); @@ -2078,8 +2160,7 @@ exit: gnttab_free_grant_references(np->gref_tx_head); out: - panic("do something smart"); - + return (err); } /** @@ -2133,12 +2214,8 @@ XN_TX_UNLOCK(info); XN_RX_UNLOCK(info); - end_access(info->tx_ring_ref, info->tx.sring); - end_access(info->rx_ring_ref, info->rx.sring); - info->tx_ring_ref = GRANT_REF_INVALID; - info->rx_ring_ref = GRANT_REF_INVALID; - info->tx.sring = NULL; - info->rx.sring = NULL; + free_ring(&info->tx_ring_ref, &info->tx.sring); + free_ring(&info->rx_ring_ref, &info->rx.sring); if (info->irq) unbind_from_irqhandler(info->irq); @@ -2146,12 +2223,17 @@ info->irq = 0; } - static void -end_access(int ref, void *page) +free_ring(int *ref, void *ring_ptr_ref) { - if (ref != GRANT_REF_INVALID) - gnttab_end_foreign_access(ref, page); + void **ring_ptr_ptr = ring_ptr_ref; + + if (*ref != GRANT_REF_INVALID) { + /* This API frees the associated storage. */ + gnttab_end_foreign_access(*ref, *ring_ptr_ptr); + *ref = GRANT_REF_INVALID; + } + *ring_ptr_ptr = NULL; } static int @@ -2174,7 +2256,7 @@ DEVMETHOD(device_attach, netfront_attach), DEVMETHOD(device_detach, netfront_detach), DEVMETHOD(device_shutdown, bus_generic_shutdown), - DEVMETHOD(device_suspend, bus_generic_suspend), + DEVMETHOD(device_suspend, netfront_suspend), DEVMETHOD(device_resume, netfront_resume), /* Xenbus interface */ --------------030606050402080907060207--