Date: Fri, 28 Oct 2011 07:46:07 +0200
From: Pawel Jakub Dawidek <pjd@FreeBSD.org>
To: Lawrence Stewart <lstewart@freebsd.org>
Cc: Kostik Belousov <kostikbel@gmail.com>, freebsd-net@freebsd.org, freebsd-current@freebsd.org, Andre Oppermann <andre@freebsd.org>, John Baldwin <jhb@freebsd.org>
Subject: Re: 9.0-RC1 panic in tcp_input: negative winow.
Message-ID: <20111028054605.GF1667@garage.freebsd.pl>
In-Reply-To: <4EA9F76E.9010008@freebsd.org>
References: <20111022084931.GD1697@garage.freebsd.pl> <201110240814.22368.jhb@freebsd.org> <20111026075431.GB1672@garage.freebsd.pl> <201110260753.37264.jhb@freebsd.org> <4EA9F76E.9010008@freebsd.org>

On Fri, Oct 28, 2011 at 11:29:34AM +1100, Lawrence Stewart wrote:
> On 10/26/11 22:53, John Baldwin wrote:
> > The assertion would be triggered when the next packet arrives (as I said
> > above).  Try modifying your debugging output to also log if the ACK is
> > delayed.  I suspect it is not delayed until the last one.  (Pushing out an
> > ACK will reset rcv_adv to be beyond rcv_nxt in tcp_output(), so in the case
> > of an immediate ACK, rcv_nxt > rcv_adv is only a transient condition all
> > under a single lock invocation so never visible to other consumers of the
> > protocol control block.)  If that is what you see, then that confirms what
> > I guessed above and I will likely just remove the assertion in tcp_input()
> > and patch the timewait code to handle this case.
> >
>
> Pawel, have you been able to confirm John's hypothesis? [...]

Yeah, sorry. I moved the debug to the points where we drop the t_inpcb
lock and I still see rcv_nxt being greater than rcv_adv:

	tcp_do_segment:2970 negative window: tp 0xfffffe00685ee3d0 rcv_nxt 1312878324 rcv_adv 1312878187

This is just before the INP_WUNLOCK(tp->t_inpcb) under 'check_delack'
label. I see this a lot (it was logged 545 times for 11 different tp
pointers during 24h period).

	tcp_do_segment:3009 negative window: tp 0xfffffe005cfc6000 rcv_nxt 1442546453 rcv_adv 1442545722

This is just before calling tcp_output(). This one was logged 65 times
for 3 different tp pointers.

I placed a debug also after tcp_output() call, but it is not logged, so
once we return from tcp_output() everything is fine.

The panic would be triggered 115 times for 5 different tp pointers
during that time.

I write 'tp pointers' as I'm not 100% sure if the same pointer always
represents the same connection or if it is reused.

> [...]
> What I don't
> quite get is why we haven't had a lot more reports of this issue...

Maybe because my TCP/IP stack is heavily modified? ...not:)
No idea to be honest. Ask Ken to turn on INVARIANTS in 9.0-RC2 and we
will see:)

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://yomoli.com
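
For readers following the numbers in the log lines above, here is a minimal userland sketch of the comparison being discussed. It is not the kernel code and not the debug patch used in this thread: the names seq_gt(), adv_window() and log_if_negative() are hypothetical, and the only assumption carried over from the stack is that sequence numbers are 32-bit and compared with wraparound-safe arithmetic, in the spirit of the SEQ_GT() macro.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* 32-bit TCP sequence number; local stand-in for the kernel typedef. */
typedef uint32_t tcp_seq;

/*
 * Wraparound-safe "a is after b". Hypothetical helper that mirrors the
 * idea behind SEQ_GT(): subtract modulo 2^32, then look at the sign.
 */
static int
seq_gt(tcp_seq a, tcp_seq b)
{
	return ((int32_t)(a - b) > 0);
}

/*
 * Signed distance from rcv_nxt to rcv_adv. A negative result is the
 * "negative window" case: more data has been received than the last
 * advertised window edge allowed for.
 */
static int32_t
adv_window(tcp_seq rcv_nxt, tcp_seq rcv_adv)
{
	return ((int32_t)(rcv_adv - rcv_nxt));
}

/*
 * Hypothetical logging helper: prints a line shaped like the ones in the
 * message (without the tp pointer) and returns 1 if rcv_nxt > rcv_adv,
 * otherwise prints nothing and returns 0.
 */
static int
log_if_negative(const char *func, int line, tcp_seq rcv_nxt, tcp_seq rcv_adv)
{
	if (!seq_gt(rcv_nxt, rcv_adv))
		return (0);
	printf("%s:%d negative window: rcv_nxt %u rcv_adv %u (window %d)\n",
	    func, line, (unsigned)rcv_nxt, (unsigned)rcv_adv,
	    (int)adv_window(rcv_nxt, rcv_adv));
	return (1);
}
```

With the two pairs logged above, adv_window(1312878324, 1312878187) evaluates to -137 and adv_window(1442546453, 1442545722) to -731, so in both samples rcv_nxt is ahead of rcv_adv by a few hundred bytes.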