From owner-freebsd-net@freebsd.org Sat Dec 28 04:44:02 2019
MIME-Version: 1.0
References: <67dc1ce9-274c-7e70-30dc-97e2d5767237@FreeBSD.org> <963e3042-90b4-4de2-e18c-3e29627a25a9@FreeBSD.org>
In-Reply-To:
<963e3042-90b4-4de2-e18c-3e29627a25a9@FreeBSD.org>
From: Patrick Kelsey
Date: Fri, 27 Dec 2019 23:43:47 -0500
Subject: Re: vmx: strange issue, related to to tso?
To: Andriy Gapon
Cc: Vincenzo Maffione, freebsd-net
Content-Type: text/plain; charset="UTF-8"
List-Id: Networking and TCP/IP with FreeBSD

On Fri, Dec 27, 2019 at 5:01 PM Andriy Gapon wrote:

> On 27/12/2019 15:34, Vincenzo Maffione wrote:
> > It may be useful to check what happens if you replace the vmx0 interface with an em0.
> > In this way you would know if the issue is vmx-specific or not.
>
> I'll put this on my to-do, can't test right now.
>
> But one thing I noticed when comparing the TCP control block of the connection
> before and after the "TSO dance" is that TF_TSO gets cleared after any outgoing
> traffic while TSO is disabled on the interface. And the flag does not come back
> after TSO is reenabled. Any new connections get the flag, of course.
>
> So, I indeed suspect that there is a problem with vmx TSO.
> As another data point, an older system from before vmx->iflib conversion does
> not exhibit the problem.
>
> > On Thu, 26 Dec 2019 at 20:04, Andriy Gapon wrote:
> >
> > Maybe someone would have any pointers for me with the following problem.
> > This happens with CURRENT as of the beginning of September.
> > I connect via ssh to a VM running on VMware, it has a single vmx0 interface.
> > The problem is that when I print a moderately large amount of text to the
> > terminal (e.g., tail -100 /var/log/messages) I literally see it printed in
> > chunks with noticeable pauses between chunks. It takes several seconds for all
> > lines to get shown. This happens every time I do it.
> > There is an interesting twist. If I disable TSO with ifconfig vmx0 -tso and
> > print the same output in the same ssh session, then the output is smooth and
> > fast as I would expect it. The lines scroll by almost instantly.
> > If then I re-enable TSO and again produce the same output in the same ssh, then
> > it is still fast.
> >
> > It appears that the TCP connection gets tuned to some very sub-optimal
> > parameters when TSO is enabled. When I disable TSO, the parameters get re-tuned
> > to better values and the values stick when I re-enable TSO.
> > This is just a conjecture, of course.
> >
> > I have some tcpdump captures, but I do not see anything that would really stand
> > out. One difference is that in the slow case only "full sized" packets are sent
> > while in the fast case there are shorter packets with push flag.
> >
> > Some packets for the slow case:
> > 00:00:00.453202 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags [.], seq 37:1485, ack 36, win 128, options [nop,nop,TS val 1403195134 ecr 4966311], length 1448
> > 00:00:00.096859 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags [.], ack 1485, win 1026, options [nop,nop,TS val 4966864 ecr 1403195134], length 0
> > 00:00:00.442963 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags [.], seq 1485:2933, ack 36, win 128, options [nop,nop,TS val 1403195664 ecr 4966864], length 1448
> > 00:00:00.092677 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags [.], ack 2933, win 1026, options [nop,nop,TS val 4967400 ecr 1403195664], length 0
> > 00:00:00.437336 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags [.], seq 2933:4381, ack 36, win 128, options [nop,nop,TS val 1403196194 ecr 4967400], length 1448
> > 00:00:00.097190 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags [.], ack 4381, win 1026, options [nop,nop,TS val 4967934 ecr 1403196194], length 0
> >
> > Some packets after the TSO dance:
> > 00:00:00.000450 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [.], seq 4077:5525, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510], length 1448
> > 00:00:00.000016 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [P.], seq 5525:6097, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510], length 572
> > 00:00:00.000009 IP 10.180.1.29.25369 > 10.180.106.180.22: Flags [.], ack 5525, win 1003, options [nop,nop,TS val 21706510 ecr 2124310129], length 0
> > 00:00:00.000303 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [.], seq 6097:7545, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510], length 1448
> > 00:00:00.000019 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [P.], seq 7545:8117, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510], length 572
> > 00:00:00.000013 IP 10.180.1.29.25369 > 10.180.106.180.22: Flags [.], ack 7545, win 1003, options [nop,nop,TS val 21706510 ecr 2124310129], length 0
> > 00:00:00.000162 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [.], seq 8117:9565, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510], length 1448
> > 00:00:00.000012 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [P.], seq 9565:10137, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510], length 572
> > 00:00:00.000007 IP 10.180.1.29.25369 > 10.180.106.180.22: Flags [.], ack 9565, win 1003, options [nop,nop,TS val 21706510 ecr 2124310129], length 0
> >
> > What else can I examine to debug the problem further?
> > Thank you!
> > --
> > Andriy Gapon
> > _______________________________________________
> > freebsd-net@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> >
>

I am not able to test this at the moment, nor likely in the very near
future, but I did have a few minutes to do some code reading and now
believe that the following is part of the problem, if not the entire
problem. Using r353803 as a reference, I believe line 1323
in sys/dev/vmware/vmxnet3/if_vmx.c (in vmxnet3_isc_txd_encap()) should be:

    sop->hlen = hdrlen + ipi->ipi_tcp_hlen;

instead of the current:

    sop->hlen = hdrlen;

This can be seen by going back to r333813 and examining the CSUM_TSO case
of vmxnet3_txq_offload_ctx(). The final increment of *start in that case is
what was literally lost in translation when converting the driver to iflib.

-Patrick