Date:      Thu, 29 Dec 2011 16:18:42 -0500
From:      John Baldwin <jhb@freebsd.org>
To:        Pawel Jakub Dawidek <pjd@freebsd.org>
Cc:        Коньков Евгений <kes-kes@yandex.ru>, Andre Oppermann <andre@freebsd.org>, freebsd-net@freebsd.org, freebsd-current@freebsd.org, Kostik Belousov <kostikbel@gmail.com>, Lawrence Stewart <lstewart@freebsd.org>
Subject:   Re: 9.0-RC1 panic in tcp_input: negative winow.
Message-ID:  <201112291618.43170.jhb@freebsd.org>
In-Reply-To: <20111229202501.GA1889@garage.freebsd.pl>
References:  <20111022084931.GD1697@garage.freebsd.pl> <201112291112.59912.jhb@freebsd.org> <20111229202501.GA1889@garage.freebsd.pl>

On Thursday, December 29, 2011 3:25:02 pm Pawel Jakub Dawidek wrote:
> On Thu, Dec 29, 2011 at 11:12:59AM -0500, John Baldwin wrote:
> > On Sunday, December 25, 2011 11:01:33 am Коньков Евгений wrote:
> > > Hello, John.
> > >
> > > You wrote on December 20, 2011, 16:52:44:
> > >
> > > JB> On Saturday, December 17, 2011 6:21:27 pm Pawel Jakub Dawidek wrote:
> > > >> On Mon, Dec 12, 2011 at 11:00:23AM -0500, John Baldwin wrote:
> > > >> > An update.  I've sent Pawel a testing patch to see if my hypothesis is correct
> > > >> > (www.freebsd.org/~jhb/patches/tcp_negwin_test.patch).  If it is then I intend
> > > >> > to commit www.freebsd.org/~jhb/patches/tcp_negwin2.patch as the fix.
> > > >>
> > > >> Unfortunately it panicked today. Take a look at:
> > > >>
> > > >>       http://people.freebsd.org/~pjd/misc/tcp_panic.jpg
> > >
> > > JB> Ok, the one use case I was worried about is happening regularly before your
> > > JB> panic, so that is good.  Can you use gdb to figure out which call to
> > > JB> tcp_output() is actually panic'ing?  I wonder if it is this case:
> > >
> > > JB>         /*
> > > JB>          * Return any desired output.
> > > JB>          */
> > > JB>         if (needoutput || (tp->t_flags & TF_ACKNOW)) {
> > > JB>                 (void) tcp_output(tp);
> > > JB>                 /* XXX: Debug */
> > > JB>                 KASSERT(SEQ_GEQ(tp->rcv_adv, tp->rcv_nxt),
> > > JB>                     ("tcp_input: negative window after ACK"));
> > >
> > > JB> And if 'needoutput' is true, but TF_ACKNOW is not set, and tcp_output() decides
> > > JB> to not do anything.  I've updated tcp_negwin_test.patch to not panic if that call
> > > JB> to tcp_output() doesn't actually send a packet.  Please re-test.
> > >
> > >
> > > # uname -a
> > > FreeBSD meta-up 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #4: Sat Dec 24 13:59:20 EET 2011     @:/usr/obj/usr/src/sys/KES_KERN_v10  i386
> > >
> > > The box is rebooting once per day. I am now compiling the kernel with debug options.
> > > Can you advise me which options to use, and where I can find the debug info the
> > > next time it reboots, so that I can help debug the problem?
> >
> > Are you using the patch at the URL above (tcp_negwin_test.patch)?  If not,
> > can you try applying that patch and seeing if you still get any panics?
>
> I applied it 1.5 days ago; so far no panics and no other messages.
> I modified the patch a bit to not panic, but to print a message when a panic
> was supposed to happen. This box is too valuable for me to panic it too
> often. Because there were no debug messages, I understand that the
> scenario hasn't happened yet, and not that the problem is fixed, right?

Yes.  It would be best to see the messages logged to be safe.  Thanks.
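
For anyone following the thread, the check under discussion can be sketched
in user space roughly as below.  This is only an illustration and not the
contents of tcp_negwin_test.patch: struct win_state and window_ok() are
hypothetical names invented here, and SEQ_GEQ is modeled on the
wraparound-safe comparison macro in FreeBSD's netinet/tcp_seq.h.  It logs a
message instead of panicking, in the spirit of the modification described
above.

```c
#include <stdint.h>
#include <stdio.h>

/* Wraparound-safe "a >= b" for 32-bit TCP sequence numbers
 * (modeled on SEQ_GEQ in FreeBSD's netinet/tcp_seq.h). */
#define SEQ_GEQ(a, b)	((int32_t)((a) - (b)) >= 0)

/* Hypothetical stand-in for the two struct tcpcb fields the check reads. */
struct win_state {
	uint32_t rcv_nxt;	/* next sequence number expected from the peer */
	uint32_t rcv_adv;	/* right edge of the window advertised so far */
};

/*
 * Return 1 if the advertised window is non-negative, 0 otherwise.
 * Hypothetical helper: prints a message where the test patch would
 * have tripped its KASSERT.
 */
static int
window_ok(const struct win_state *ws, const char *where)
{
	if (SEQ_GEQ(ws->rcv_adv, ws->rcv_nxt))
		return (1);
	fprintf(stderr, "%s: negative window (rcv_adv=%u rcv_nxt=%u)\n",
	    where, (unsigned)ws->rcv_adv, (unsigned)ws->rcv_nxt);
	return (0);
}
```

Because the comparison is done on the signed difference, it still gives the
right answer when rcv_adv has wrapped past zero and rcv_nxt has not.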

-- 
John Baldwin


