From owner-svn-src-stable@freebsd.org Tue Nov 19 21:15:13 2019 Return-Path: Delivered-To: svn-src-stable@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id A92221BE567; Tue, 19 Nov 2019 21:15:13 +0000 (UTC) (envelope-from vmaffione@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 47Hdrd40M2z3xZb; Tue, 19 Nov 2019 21:15:13 +0000 (UTC) (envelope-from vmaffione@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 6C4E91870F; Tue, 19 Nov 2019 21:15:13 +0000 (UTC) (envelope-from vmaffione@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id xAJLFDfJ096787; Tue, 19 Nov 2019 21:15:13 GMT (envelope-from vmaffione@FreeBSD.org) Received: (from vmaffione@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id xAJLFC8l096782; Tue, 19 Nov 2019 21:15:12 GMT (envelope-from vmaffione@FreeBSD.org) Message-Id: <201911192115.xAJLFC8l096782@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: vmaffione set sender to vmaffione@FreeBSD.org using -f From: Vincenzo Maffione Date: Tue, 19 Nov 2019 21:15:12 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-12@freebsd.org Subject: svn commit: r354866 - stable/12/usr.sbin/bhyve X-SVN-Group: stable-12 X-SVN-Commit-Author: vmaffione X-SVN-Commit-Paths: stable/12/usr.sbin/bhyve X-SVN-Commit-Revision: 354866 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Nov 2019 21:15:13 -0000 Author: vmaffione Date: Tue Nov 19 21:15:12 2019 New Revision: 354866 URL: https://svnweb.freebsd.org/changeset/base/354866 Log: MFC r354288: bhyve: add backend rx backpressure to virtio-net If a VM is flooded with more ingress packets than the guest OS can handle, the current virtio-net code will keep reading those packets and drop most of them as no space is available in the receive queue. This is an undesirable receive livelock, which is a waste of CPU and memory resources and potentially opens to DoS attacks. With this change, virtio-net uses the new netbe_rx_disable() function to disable ingress operation in the backend while the guest is short on RX buffers. Once the guest makes more buffers available to the RX virtqueue, ingress operation is enabled again by calling netbe_rx_enable(). Reviewed by: bryanv, jhb Differential Revision: https://reviews.freebsd.org/D20987 Modified: stable/12/usr.sbin/bhyve/mevent.c stable/12/usr.sbin/bhyve/mevent.h stable/12/usr.sbin/bhyve/net_backends.c stable/12/usr.sbin/bhyve/pci_e82545.c stable/12/usr.sbin/bhyve/pci_virtio_net.c Directory Properties: stable/12/ (props changed) Modified: stable/12/usr.sbin/bhyve/mevent.c ============================================================================== --- stable/12/usr.sbin/bhyve/mevent.c Tue Nov 19 21:14:15 2019 (r354865) +++ stable/12/usr.sbin/bhyve/mevent.c Tue Nov 19 21:15:12 2019 (r354866) @@ -321,6 +321,14 @@ mevent_add(int tfd, enum ev_type type, return mevent_add_state(tfd, type, func, param, MEV_ADD); } +struct mevent * +mevent_add_disabled(int tfd, enum ev_type type, + void (*func)(int, enum ev_type, void *), void *param) +{ + + return mevent_add_state(tfd, type, func, param, MEV_ADD_DISABLED); +} + static int mevent_update(struct mevent *evp, int newstate) { Modified: stable/12/usr.sbin/bhyve/mevent.h ============================================================================== --- stable/12/usr.sbin/bhyve/mevent.h Tue Nov 19 21:14:15 2019 (r354865) +++ stable/12/usr.sbin/bhyve/mevent.h Tue Nov 19 21:15:12 2019 (r354866) @@ -43,6 +43,9 @@ struct mevent; struct mevent *mevent_add(int fd, enum ev_type type, void (*func)(int, enum ev_type, void *), void *param); +struct mevent *mevent_add_disabled(int fd, enum ev_type type, + void (*func)(int, enum ev_type, void *), + void *param); int mevent_enable(struct mevent *evp); int mevent_disable(struct mevent *evp); int mevent_delete(struct mevent *evp); Modified: stable/12/usr.sbin/bhyve/net_backends.c ============================================================================== --- stable/12/usr.sbin/bhyve/net_backends.c Tue Nov 19 21:14:15 2019 (r354865) +++ stable/12/usr.sbin/bhyve/net_backends.c Tue Nov 19 21:15:12 2019 (r354866) @@ -220,7 +220,7 @@ tap_init(struct net_backend *be, const char *devname, errx(EX_OSERR, "Unable to apply rights for sandbox"); #endif - priv->mevp = mevent_add(be->fd, EVF_READ, cb, param); + priv->mevp = mevent_add_disabled(be->fd, EVF_READ, cb, param); if (priv->mevp == NULL) { WPRINTF(("Could not register event\n")); goto error; @@ -432,7 +432,7 @@ netmap_init(struct net_backend *be, const char *devnam priv->cb_param = param; be->fd = priv->nmd->fd; - priv->mevp = mevent_add(be->fd, EVF_READ, cb, param); + priv->mevp = mevent_add_disabled(be->fd, EVF_READ, cb, param); if (priv->mevp == NULL) { WPRINTF(("Could not register event\n")); return (-1); Modified: stable/12/usr.sbin/bhyve/pci_e82545.c ============================================================================== --- stable/12/usr.sbin/bhyve/pci_e82545.c Tue Nov 19 21:14:15 2019 (r354865) +++ stable/12/usr.sbin/bhyve/pci_e82545.c Tue Nov 19 21:15:12 2019 (r354866) @@ -2351,6 +2351,8 @@ e82545_init(struct vmctx *ctx, struct pci_devinst *pi, net_genmac(pi, sc->esc_mac.octet); } + netbe_rx_enable(sc->esc_be); + /* H/w initiated reset */ e82545_reset(sc, 0); Modified: stable/12/usr.sbin/bhyve/pci_virtio_net.c ============================================================================== --- stable/12/usr.sbin/bhyve/pci_virtio_net.c Tue Nov 19 21:14:15 2019 (r354865) +++ stable/12/usr.sbin/bhyve/pci_virtio_net.c Tue Nov 19 21:15:12 2019 (r354866) @@ -101,7 +101,6 @@ struct pci_vtnet_softc { net_backend_t *vsc_be; - int vsc_rx_ready; int resetting; /* protected by tx_mtx */ uint64_t vsc_features; /* negotiated features */ @@ -156,7 +155,6 @@ pci_vtnet_reset(void *vsc) pthread_mutex_lock(&sc->tx_mtx); } - sc->vsc_rx_ready = 0; sc->rx_merge = 1; sc->rx_vhdrlen = sizeof(struct virtio_net_rxhdr); @@ -180,30 +178,29 @@ pci_vtnet_rx(struct pci_vtnet_softc *sc) int len, n; uint16_t idx; - if (!sc->vsc_rx_ready) { - /* - * The rx ring has not yet been set up. - * Drop the packet and try later. - */ - netbe_rx_discard(sc->vsc_be); - return; - } - - /* - * Check for available rx buffers - */ vq = &sc->vsc_queues[VTNET_RXQ]; - if (!vq_has_descs(vq)) { + for (;;) { /* - * No available rx buffers. Drop the packet and try later. - * Interrupt on empty, if that's negotiated. + * Check for available rx buffers. */ - netbe_rx_discard(sc->vsc_be); - vq_endchains(vq, /*used_all_avail=*/1); - return; - } + if (!vq_has_descs(vq)) { + /* No rx buffers. Enable RX kicks and double check. */ + vq_kick_enable(vq); + if (!vq_has_descs(vq)) { + /* + * Still no buffers. Interrupt if needed + * (including for NOTIFY_ON_EMPTY), and + * disable the backend until the next kick. + */ + vq_endchains(vq, /*used_all_avail=*/1); + netbe_rx_disable(sc->vsc_be); + return; + } - do { + /* More rx buffers found, so keep going. */ + vq_kick_disable(vq); + } + /* * Get descriptor chain. */ @@ -215,7 +212,8 @@ pci_vtnet_rx(struct pci_vtnet_softc *sc) if (len <= 0) { /* * No more packets (len == 0), or backend errored - * (err < 0). Return unused available buffers. + * (err < 0). Return unused available buffers + * and stop. */ vq_retchain(vq); /* Interrupt if needed/appropriate and stop. */ @@ -225,10 +223,8 @@ pci_vtnet_rx(struct pci_vtnet_softc *sc) /* Publish the info to the guest */ vq_relchain(vq, idx, (uint32_t)len); - } while (vq_has_descs(vq)); + } - /* Interrupt if needed, including for NOTIFY_ON_EMPTY. */ - vq_endchains(vq, /*used_all_avail=*/1); } /* @@ -254,13 +250,11 @@ pci_vtnet_ping_rxq(void *vsc, struct vqueue_info *vq) struct pci_vtnet_softc *sc = vsc; /* - * A qnotify means that the rx process can now begin + * A qnotify means that the rx process can now begin. */ pthread_mutex_lock(&sc->rx_mtx); - if (sc->vsc_rx_ready == 0) { - sc->vsc_rx_ready = 1; - vq_kick_disable(vq); - } + vq_kick_disable(vq); + netbe_rx_enable(sc->vsc_be); pthread_mutex_unlock(&sc->rx_mtx); }