From owner-freebsd-transport@freebsd.org Thu Feb 25 11:26:24 2016
Date: Thu, 25 Feb 2016 20:26:26 +0900
From: Yongmin Cho
To: transport@freebsd.org
Subject: In TCP recovery state, problem of computing the pipe(amount of data in flight).
Message-ID: <20160225112625.GA5680@yongmincho-All-Series>
List-Id: Discussions of transport level network protocols in FreeBSD

Hi, all.

I have a question about the net.inet.tcp.rfc6675_pipe sysctl.
In r290122, the computation of the bytes in flight was changed to:

    pipe = snd_max - snd_una - sackhint.sacked_bytes +
        sackhint.sack_bytes_rexmit

I think the implementation of sackhint.sack_bytes_rexmit is right, but I
don't think the way sackhint.sacked_bytes is computed is right.

sackhint.sacked_bytes is computed from the sack_blocks array in the
tcp_sack_doack() function. As you know, a TCP header can carry at most
four SACK blocks (three if the timestamp option is in use). Even if the
receiver holds more than three or four SACKed blocks, it can only report
the three or four most recent ones in each ACK. So if the receiver has
many SACKed blocks, the sender only learns about three or four blocks'
worth of sacked bytes from each ACK.

The snd_holes tail queue in struct tcpcb, on the other hand, holds all
of the SACK holes above snd_una. So I think sack_bytes_rexmit is
correct, because it is computed from the snd_holes tail queue. But
sackhint.sacked_bytes is too small, because it is computed only from the
three or four SACK blocks in the latest ACK.

As a result, the return value of tcp_compute_pipe() is too big during
the recovery phase. In recovery, the sender may send data only if the
return value of tcp_compute_pipe() is less than snd_ssthresh. If the
sender knows about many SACK holes, it sometimes takes a long time
before it can send data. Sometimes the sender cannot send data at all
because of the value returned by tcp_compute_pipe(), and a
retransmission timeout is triggered.

IMO, sackhint.sacked_bytes should be computed from the snd_holes tail
queue. Because snd_holes holds all of the SACK holes above snd_una,
sackhint.sacked_bytes can be derived from it.

Please check my opinion.
I'm sorry, I'm not good at English.
Thank you in advance for your answers.

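The effect described in this message can be illustrated with a small worked example. The following is a minimal sketch in Python, not FreeBSD kernel code: it applies the pipe formula quoted from r290122 to made-up numbers. The segment size, the loss pattern, the value of ssthresh, and the helper names compute_pipe and sacked_bytes_from_blocks are all assumptions chosen for illustration.

```python
# Illustrative model only; not taken from the FreeBSD sources.
MSS = 1000  # assumed segment size in bytes


def compute_pipe(snd_una, snd_max, sacked_bytes, sack_bytes_rexmit):
    """Apply the formula quoted in the message above."""
    return snd_max - snd_una - sacked_bytes + sack_bytes_rexmit


def sacked_bytes_from_blocks(blocks):
    """Sum the lengths of (start, end) SACK blocks; end is exclusive."""
    return sum(end - start for start, end in blocks)


# 20 segments are outstanding: sequence space [0, 20000).
snd_una, snd_max = 0, 20 * MSS

# Assumed loss pattern: every even-numbered segment is lost, every
# odd-numbered one arrives, so the receiver holds 10 disjoint blocks.
receiver_blocks = [(i * MSS, (i + 1) * MSS) for i in range(1, 20, 2)]

# One ACK carries only the most recent blocks (3 with timestamps).
MAX_BLOCKS_PER_ACK = 3
last_ack_blocks = receiver_blocks[-MAX_BLOCKS_PER_ACK:]

sack_bytes_rexmit = 0   # nothing has been retransmitted yet
ssthresh = 12 * MSS     # assumed value for the example

pipe_last_ack = compute_pipe(
    snd_una, snd_max,
    sacked_bytes_from_blocks(last_ack_blocks), sack_bytes_rexmit)
pipe_all_blocks = compute_pipe(
    snd_una, snd_max,
    sacked_bytes_from_blocks(receiver_blocks), sack_bytes_rexmit)

print("pipe from last ACK only:", pipe_last_ack, "may send:", pipe_last_ack < ssthresh)
print("pipe from all blocks:   ", pipe_all_blocks, "may send:", pipe_all_blocks < ssthresh)
```

With these assumed numbers, counting only the blocks in the latest ACK leaves the pipe estimate above ssthresh, while counting every block the receiver holds brings it below. That is the stall the message describes.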
From owner-freebsd-transport@freebsd.org Sat Feb 27 03:16:10 2016
Date: Fri, 26 Feb 2016 19:16:04 -0800
From: hiren panchasara
To: Yongmin Cho
Cc: transport@freebsd.org
Subject: Re: In TCP recovery state, problem of computing the pipe(amount of data in flight).
Message-ID: <20160227031604.GP31665@strugglingcoder.info>
References: <20160225112625.GA5680@yongmincho-All-Series>
In-Reply-To: <20160225112625.GA5680@yongmincho-All-Series>

On 02/25/16 at 08:26P, Yongmin Cho wrote:
> Hi, all.
>
> I have a question about net.inet.tcp.rfc6675_pipe in sysctl.
> The bytes in flight was changed to be like below in r290122.
> pipe = snd_max - snd_una - sackhint.sacked_bytes +
> sackhint.sack_bytes_rexmit.
> I think, The implementation of sackhint.sack_bytes_rexmit is right.
> But, I don't think, sackhint.sacked_bytes is right way.
> The sackhint.sacked_bytes is computed by array of sack_blocks in
> tcp_sack_doack function.
> You know, tcp header can have four sacked blocks.
> (If tcp uses timestmap option, tcp header can have three sacked
> blocks.)
> Even if The receiver has sacked blocks greater than three or four,
> The receiver can send ack with three or four last sack blocks.
> So if the receiver has many sacked blocks, the sender only knows three
> sacked_bytes.
> the snd_holes tail queue in struct tcpcb has all of sack holes which
> is greater than snd_una.
> So, i think, sack_bytes_rexmit is correct.
> Because sack_bytes_rexmit is computed by snd_holes tail queue in
> struct tcpcb.
> but sackhint.sacked_bytes is too small.
> Because sackhint.sacked_bytes is just computed by ack with three or
> four last sacked blocks.
> So, the return value of tcp_compute_pipe() function is too big, while
> recovery phase.
> In recovery state, the sender can send data,
> if the return value of tcp_compute_pipe() should be less than
> snd_ssthresh.
> Sometimes it takes a long time to send data, if the sender knows many
> sack holes.
> Furthermore, Sometimes the sender can't send data, Because the return
> value of tcp_compute_pipe() function.
> And retransmission timeout is triggered.

Your analysis is correct and we did think about this. Please look at the
summary section of https://reviews.freebsd.org/D3971. The main reason
for going with this approach was that it is at least on the conservative
side, i.e. it sends less data (not more) and does not bloat the network.

BTW, have you actually run into this problem causing slower recovery?

>
> IMO, sackhint.sack_bytes should be computed using snd_holes tail
> queue.
> Because snd_holes has all of sack holes which is greater than snd_una,
> sackhint.sack_bytes can be computed using snd_holes.

I thought snd_holes also gets populated from the info in SACKs, so if
for some reason the other end has more than 3 or 4 holes and can't
report them all, snd_holes would also have incorrect info. I'd have to
look at the code again to see if it's possible to do this more correctly
with snd_holes. Though, I do see the point that this approach would
provide better protection against transient problems where the other end
cannot send SACK hole info a couple of times and then resumes. Again,
I'd have to go look at the code closely.

It'd be even better if you have a patch for this. If not, no worries. :-)

Cheers,
Hiren
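
The question left open in this reply is whether a scoreboard built up over several ACKs knows more than any single ACK does. The following Python sketch models that idea under simplifying assumptions: no ACKs are lost, each ACK carries the three most recent blocks, and segments are 1000 bytes. The helpers merge_blocks, holes and sacked_bytes_from_holes are hypothetical names for this illustration and do not correspond to FreeBSD functions. Whether snd_holes in the kernel behaves this way is exactly what the thread says still needs checking.

```python
# Illustrative model only; not taken from the FreeBSD sources.
MSS = 1000  # assumed segment size in bytes


def merge_blocks(scoreboard, new_blocks):
    """Return sorted, non-overlapping (start, end) ranges covering both lists."""
    merged = []
    for start, end in sorted(scoreboard + new_blocks):
        if merged and start <= merged[-1][1]:
            # Overlapping or adjacent range: extend the previous one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def holes(snd_una, scoreboard):
    """Gaps between snd_una and the highest SACKed byte."""
    result, cursor = [], snd_una
    for start, end in scoreboard:
        if start > cursor:
            result.append((cursor, start))
        cursor = max(cursor, end)
    return result


def sacked_bytes_from_holes(snd_una, scoreboard):
    """Bytes below the highest SACKed byte that are not in a hole."""
    if not scoreboard:
        return 0
    highest = scoreboard[-1][1]
    hole_bytes = sum(end - start for start, end in holes(snd_una, scoreboard))
    return (highest - snd_una) - hole_bytes


snd_una = 0
received = []       # blocks held by the receiver, oldest first
scoreboard = []     # sender-side view accumulated across ACKs
last_ack_bytes = 0  # bytes covered by the blocks in the latest ACK alone

# Segments 1, 3, 5, 7, 9 arrive; segments 0, 2, 4, 6, 8 are lost.
for seg in (1, 3, 5, 7, 9):
    received.append((seg * MSS, (seg + 1) * MSS))
    ack_blocks = received[-3:]  # at most three blocks fit in one ACK
    last_ack_bytes = sum(end - start for start, end in ack_blocks)
    scoreboard = merge_blocks(scoreboard, ack_blocks)

print("bytes in latest ACK's blocks:", last_ack_bytes)
print("bytes derived from holes:    ", sacked_bytes_from_holes(snd_una, scoreboard))
```

In this model the accumulated scoreboard accounts for all five SACKed segments while the latest ACK alone accounts for three. If ACKs were lost, the scoreboard could still miss blocks, which is the concern raised in the reply.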