Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 25 Feb 2021 19:20:55 +0100
From:      Michael Tuexen <tuexen@freebsd.org>
To:        Andriy Gapon <avg@FreeBSD.org>
Cc:        "Scheffenegger, Richard" <Richard.Scheffenegger@netapp.com>, "net@FreeBSD.org" <net@freebsd.org>
Subject:   Re: panic: sackhint bytes rtx >= 0
Message-ID:  <58AED4F6-FBA4-46D5-98D4-DD31BD9A946B@freebsd.org>
In-Reply-To: <c03e903f-5386-4c7b-ee95-b558791d6e35@FreeBSD.org>
References:  <960081ac-2839-a226-3142-5a4666eb54ad@FreeBSD.org> <8BB910F9-B828-49A2-8941-4F6C6419D8EA@freebsd.org> <SN4PR0601MB37288873BB2967AC78B26C8986809@SN4PR0601MB3728.namprd06.prod.outlook.com> <c03e903f-5386-4c7b-ee95-b558791d6e35@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help


> On 25. Feb 2021, at 19:08, Andriy Gapon <avg@FreeBSD.org> wrote:
>=20
> On 24/02/2021 00:40, Scheffenegger, Richard wrote:
>> Hi Andriy,
>>=20
>> I guess I am currently the person who has the most recent knowledge =
about that
>> part of the base stack=E2=80=A6
>>=20
>> Do you happen to have more (preceding) information about this, or a =
way to
>> reproduce this?
>=20
> Unfortunately, no.  It "just happened".
> There was no unusual activity at the time of the crash.
> At least, from a user / administrator perspective.
So you did not use specific TCP stacks or congestion control modules?
>=20
>> Are you running any special stack (RACK, BBR) which may have switched =
back to
>> the base stack in the middle of a loss recovery (I suspected at one =
point that
>> this may cause issues, potentially)?
>=20
> I do not have any customizations to the network stack, everything at =
defaults.
>=20
>> Or was something done with the ipfw that may have temporarily =
impacted a tcp
>> session?
>=20
> I use pf as a firewall.  I did not touch it at the time of the crash.
Does pf change the packets?
>=20
>> The accounting with sack_bytes_rexmit is rather old, and not touched =
recently
>> (but the sackhint struct was changed recently, and other/additional =
scoreboard
>> accounting was added).
>>=20
>> (kgdb) p *cur
>> $1 =3D {start =3D 3846347980, end =3D 3846352300, rxmit =3D =
3846352300, scblink =3D
>> {tqe_next =3D 0xfffff8013da5a220, tqe_prev =3D 0xfffff80754818930}}
>>=20
>> This indicates, that the current hole in the SACK scoreboard (3 =
segments of size
>> 1440 bytes) were retransmitted  (rxmit =3D=3D end), before the =
current
>> acknowledgement came back.
>>=20
>> Thus the expectation is, that sackhint.sack_bytes_rexmit also has a =
value of at
>> least that number of bytes (4320). It is increased in tcp_output() =
for each
>> packet leaving while performing a retransmission.
>>=20
>> But this is the peculiar part:
>>=20
>> (kgdb) p tp@entry->sackhint.sack_bytes_rexmit
>> $3 =3D -1440
>>=20
>> Indicating negative one packet had been retransmitted before (thus =
subtracting
>> the hole, which was previously retransmitted violates the invariant). =
And the
>> only piece of code decrementing it appears to be in tcp_output() =
during
>> non-permanent error handling=E2=80=A6
>>=20
>> All updates to sackhint should be protected by the INPLOCK, so even =
if the rx
>> and tx paths are running on different core, the sack_bytes_rexmit =
should never
>> become negative.
>>=20
>> The sack blocks returned indicate that (with snd.una as zero =
baseline, in
>> segments) the client knows about segments 2..34 and 35..47.
>>=20
>> The first hole has shrunk from the right (unusual; possible when two
>> retransmissions were lost again, or the 3 segment originally sent, =
delayed by
>> ~50 segments (unlikely).
>>=20
>> Sorry to not being able to spot something obvious right away=E2=80=A6
>=20
> Thank you very much for the response.
> Unfortunately, I cannot help much in terms of re-creating the issue as =
for me it
> really "just happened" without me touching anything.  There were just =
some
> HTTP(S) and SSH sessions, the usual stuff.
>=20
> If you need anything else from the crash dump, please let me know.
>=20
>> *Von:*tuexen@freebsd.org <tuexen@freebsd.org>
>> *Gesendet:* Dienstag, 23. Februar 2021 22:21
>> *An:* Richard Scheffenegger <rscheff@freebsd.org>
>> *Betreff:* Fwd: panic: sackhint bytes rtx >=3D 0
>>=20
>> =20
>>=20
>> *NetApp Security WARNING*: This is an external email. Do not click =
links or open
>> attachments unless you recognize the sender and know the content is =
safe.
>>=20
>>=20
>>=20
>> FYI
>>=20
>>=20
>>=20
>>    Begin forwarded message:
>>=20
>>    =20
>>=20
>>    *From: *Andriy Gapon <avg@FreeBSD.org <mailto:avg@FreeBSD.org>>
>>=20
>>    *Subject: panic: sackhint bytes rtx >=3D 0*
>>=20
>>    *Date: *23. February 2021 at 22:02:20 CET
>>=20
>>    *To: *FreeBSD Current <current@FreeBSD.org =
<mailto:current@FreeBSD.org>>,
>>    net@FreeBSD.org <mailto:net@FreeBSD.org>
>>=20
>>    =20
>>=20
>>=20
>>    Got this panic on 13.0-STABLE 4b2a20dfde9c using a custom kernel =
with INVARIANTS
>>    enabled.
>>    Below is some information from the crash dump.
>>    If anyone has any clues, suggestions, etc, please help.
>>    I will try to help you to help me the best I can.
>>=20
>>    #0  doadump (textdump=3Dtextdump@entry=3D1)
>>       at /usr/devel/git/trant/sys/kern/kern_shutdown.c:399
>>    #1  0xffffffff808396b2 in kern_reboot (howto=3D260)
>>       at /usr/devel/git/trant/sys/kern/kern_shutdown.c:486
>>    #2  0xffffffff80839d07 in vpanic (
>>       fmt=3D0xffffffff80cbd551 "sackhint bytes rtx >=3D 0", =
ap=3D0xfffffe0120b9e6d0)
>>       at /usr/devel/git/trant/sys/kern/kern_shutdown.c:919
>>    #3  0xffffffff808398b3 in panic (fmt=3D<unavailable>)
>>       at /usr/devel/git/trant/sys/kern/kern_shutdown.c:843
>>    #4  0xffffffff8098a82c in tcp_sack_doack (tp=3D<optimized out>,
>>       tp@entry=3D0xfffff807548187f0, to=3D<optimized out>,
>>       to@entry=3D0xfffffe0120b9e780, th_ack=3D<optimized out>)
>>       at /usr/devel/git/trant/sys/netinet/tcp_sack.c:691
>>    #5  0xffffffff80983699 in tcp_do_segment (m=3D0xfffff8029868ca00,
>>       m@entry=3D<error reading variable: value is not available>,
>>       th=3D<optimized out>,
>>       th@entry=3D<error reading variable: value is not available>,
>>       so=3D0xfffff804e7359b10,
>>       so@entry=3D<error reading variable: value is not available>,
>>       tp=3D0xfffff807548187f0,
>>       tp@entry=3D<error reading variable: value is not available>,
>>       drop_hdrlen=3D60,
>>       drop_hdrlen@entry=3D<error reading variable: value is not =
available>,
>>       tlen=3D<optimized out>,
>>       tlen@entry=3D<error reading variable: value is not available>,
>>       iptos=3D72 'H',
>>       iptos@entry=3D<error reading variable: value is not available>)
>>       at /usr/devel/git/trant/sys/netinet/tcp_input.c:2497
>>    #6  0xffffffff80980d97 in tcp_input (mp=3D<optimized out>,
>>       mp@entry=3D<error reading variable: value is not available>,
>>       offp=3D<optimized out>,
>>       offp@entry=3D<error reading variable: value is not available>,
>>       proto=3D<optimized out>)
>>       at /usr/devel/git/trant/sys/netinet/tcp_input.c:1381
>>    #7  0xffffffff80976eb7 in ip_input (m=3D0x0)
>>       at /usr/devel/git/trant/sys/netinet/ip_input.c:833
>>    #8  0xffffffff8094c78f in netisr_dispatch_src (proto=3D1,
>>       source=3Dsource@entry=3D0, m=3D0xfffff8029868ca00)
>>       at /usr/devel/git/trant/sys/net/netisr.c:1143
>>    #9  0xffffffff8094cb0e in netisr_dispatch (proto=3D<unavailable>,
>>       m=3D<unavailable>) at =
/usr/devel/git/trant/sys/net/netisr.c:1234
>>    #10 0xffffffff80943345 in ether_demux =
(ifp=3Difp@entry=3D0xfffff80008c75000,
>>       m=3D<unavailable>) at =
/usr/devel/git/trant/sys/net/if_ethersubr.c:923
>>    #11 0xffffffff809446c1 in ether_input_internal =
(ifp=3D0xfffff80008c75000,
>>       m=3D<unavailable>) at =
/usr/devel/git/trant/sys/net/if_ethersubr.c:709
>>    #12 0xffffffff809443d0 in ether_nh_input (m=3D<unavailable>)
>>       at /usr/devel/git/trant/sys/net/if_ethersubr.c:739
>>    #13 0xffffffff8094c78f in netisr_dispatch_src =
(proto=3Dproto@entry=3D5,
>>       source=3Dsource@entry=3D0, m=3Dm@entry=3D0xfffff8029868ca00)
>>       at /usr/devel/git/trant/sys/net/netisr.c:1143
>>    #14 0xffffffff8094cb0e in netisr_dispatch (proto=3D<unavailable>,
>>       proto@entry=3D5, m=3D<unavailable>, m@entry=3D0xfffff8029868ca00)=

>>       at /usr/devel/git/trant/sys/net/netisr.c:1234
>>    #15 0xffffffff8094383d in ether_input (ifp=3D0xfffff80008c75000,
>>       ifp@entry=3D<error reading variable: value is not available>,
>>       m=3D0xfffff8029868ca00,
>>       m@entry=3D<error reading variable: value is not available>)
>>       at /usr/devel/git/trant/sys/net/if_ethersubr.c:830
>>    #16 0xffffffff8066da13 in re_rxeof =
(sc=3Dsc@entry=3D0xfffffe00391a5000,
>>       rx_npktsp=3D0x0) at =
/usr/devel/git/trant/sys/dev/re/if_re.c:2388
>>    #17 0xffffffff8066b3ab in re_intr_msi (xsc=3D0xfffffe00391a5000)
>>       at /usr/devel/git/trant/sys/dev/re/if_re.c:2684
>>    #18 0xffffffff80801bcf in intr_event_execute_handlers (
>>       p=3Dp@entry=3D0xfffff80006a94a50, =
ie=3Die@entry=3D0xfffff80008c57800)
>>       at /usr/devel/git/trant/sys/kern/kern_intr.c:1168
>>    #19 0xffffffff80801942 in ithread_execute_handlers (
>>       p=3Dp@entry=3D0xfffff80006a94a50, =
ie=3Die@entry=3D0xfffff80008c57800)
>>       at /usr/devel/git/trant/sys/kern/kern_intr.c:1181
>>    #20 0xffffffff80801758 in ithread_loop =
(arg=3Darg@entry=3D0xfffff80008c20c40)
>>       at /usr/devel/git/trant/sys/kern/kern_intr.c:1269
>>    #21 0xffffffff807fe0a9 in fork_exit (
>>       callout=3D0xffffffff808016c0 <ithread_loop>, =
arg=3D0xfffff80008c20c40,
>>       frame=3D0xfffffe0120b9ed40)
>>       at /usr/devel/git/trant/sys/kern/kern_fork.c:1069
>>=20
>>=20
>>    (kgdb) fr 4
>>    #4  0xffffffff8098a82c in tcp_sack_doack (tp=3D<optimized out>,
>>    tp@entry=3D0xfffff807548187f0, to=3D<optimized out>, =
to@entry=3D0xfffffe0120b9e780,
>>    th_ack=3D<optimized out>) at =
/usr/devel/git/trant/sys/netinet/tcp_sack.c:691
>>    691                     KASSERT(tp->sackhint.sack_bytes_rexmit >=3D =
0,
>>    (kgdb) i loc
>>    sack_blocks =3D {{start =3D 3846350860, end =3D 3846396940}, =
{start =3D 3846398380, end
>>    =3D 3846415660}, {start =3D 0, end =3D 0}, {start =3D 1417775088, =
end =3D 4294965255},
>>    {start =3D 549054304, end =3D 4294966785}}
>>    left_edge_delta =3D <optimized out>
>>    delivered_data =3D 0
>>    sack_changed =3D <optimized out>
>>    num_sack_blks =3D <optimized out>
>>    i =3D <optimized out>
>>    j =3D <optimized out>
>>    sack =3D {start =3D <optimized out>, end =3D <optimized out>}
>>    sblkp =3D 0xfffffe0120b9e700
>>    temp =3D <optimized out>
>>    cur =3D 0xfffff8013da5a320
>>    (kgdb) p *cur
>>    $1 =3D {start =3D 3846347980, end =3D 3846352300, rxmit =3D =
3846352300, scblink =3D
>>    {tqe_next =3D 0xfffff8013da5a220, tqe_prev =3D =
0xfffff80754818930}}
>>    (kgdb) p 3846352300 - 3846347980
>>    $2 =3D 4320
>>    (kgdb) p tp->sackhint.sack_bytes_rexmit
>>    value has been optimized out
>>    (kgdb) p tp@entry->sackhint.sack_bytes_rexmit
>>    <mailto:tp@entry-%3esackhint.sack_bytes_rexmit>
>>    $3 =3D -1440
>>    (kgdb) p 4320 - 1440
>>    $4 =3D 2880
>>    (kgdb) p *sblkp
>>    $5 =3D {start =3D 3846350860, end =3D 3846396940}
>>=20
>>    (kgdb) p *tp@entry
>>    $7 =3D {t_inpcb =3D 0xfffff80640ca65b8, t_fb =3D =
0xffffffff80ef4700 <tcp_def_funcblk>,
>>    t_fb_ptr =3D 0x0, t_maxseg =3D 1440, t_logstate =3D 0, t_port =3D =
0, t_state =3D 4,
>>    t_idle_reduce =3D 0, t_delayed_ack =3D 0, t_fin_is_rst =3D 0,
>>     t_log_state_set =3D 0, bits_spare =3D 0, t_flags =3D 1613759072, =
snd_una =3D
>>    3846347980, snd_max =3D 3846417100, snd_nxt =3D 3846417100, snd_up =
=3D 3846347980,
>>    snd_wnd =3D 237568, snd_cwnd =3D 4320, t_peakrate_thr =3D 0, =
ts_offset =3D 71449215,
>>     rfbuf_ts =3D 162235772, rcv_numsacks =3D 0, t_tsomax =3D 0, =
t_tsomaxsegcount =3D 0,
>>    t_tsomaxsegsize =3D 0, rcv_nxt =3D 2281639092, rcv_adv =3D =
2281705332, rcv_wnd =3D
>>    66240, t_flags2 =3D 1030, t_srtt =3D 876, t_rttvar =3D 33, =
ts_recent =3D 0,
>>     snd_scale =3D 8 '\b', rcv_scale =3D 6 '\006', snd_limited =3D 2 =
'\002',
>>    request_r_scale =3D 6 '\006', last_ack_sent =3D 2281639092, =
t_rcvtime =3D 2309118641,
>>    rcv_up =3D 2281639092, t_segqlen =3D 0, t_segqmbuflen =3D 0, =
t_segq =3D {
>>       tqh_first =3D 0x0, tqh_last =3D 0xfffff80754818880}, t_in_pkt =3D=
 0x0, t_tail_pkt
>>    =3D 0x0, t_timers =3D 0xfffff80754818a78, t_vnet =3D 0x0, =
snd_ssthresh =3D 31680,
>>    snd_wl1 =3D 2281639092, snd_wl2 =3D 3846347980, irs =3D =
2281631223,
>>     iss =3D 3840447913, t_acktime =3D 0, t_sndtime =3D 2309118613, =
ts_recent_age =3D 0,
>>    snd_recover =3D 3846415660, cl4_spare =3D 0, t_oobflags =3D 0 =
'\000', t_iobc =3D 0
>>    '\000', t_rxtcur =3D 270, t_rxtshift =3D 1, t_rtttime =3D =
2309118613,
>>     t_rtseq =3D 3846415660, t_starttime =3D 2309086941, t_fbyte_in =3D =
2309087188,
>>    t_fbyte_out =3D 2309087159, t_pmtud_saved_maxseg =3D 0, =
t_blackhole_enter =3D 0,
>>    t_blackhole_exit =3D 0, t_rttmin =3D 30, t_rttbest =3D 845, =
t_softerror =3D 0,
>>     max_sndwnd =3D 237568, snd_cwnd_prev =3D 64800, snd_ssthresh_prev =
=3D 8640,
>>    snd_recover_prev =3D 3846347980, t_sndzerowin =3D 0, t_rttupdated =
=3D 368,
>>    snd_numholes =3D 2, t_badrxtwin =3D 0, snd_holes =3D {tqh_first =3D =
0xfffff8013da5a320,
>>       tqh_last =3D 0xfffff8013da5a230}, snd_fack =3D 3846415660, =
sackblks =3D {{start =3D
>>    2281632180, end =3D 2281632690}, {start =3D 0, end =3D 0}, {start =
=3D 0, end =3D 0},
>>    {start =3D 0, end =3D 0}, {start =3D 0, end =3D 0}, {start =3D 0, =
end =3D 0}},
>>     sackhint =3D {nexthole =3D 0xfffff8013da5a220, sack_bytes_rexmit =
=3D -1440,
>>    last_sack_ack =3D 3846415660, delivered_data =3D 1440, =
sacked_bytes =3D 61920,
>>    recover_fs =3D 67680, prr_delivered =3D 1440, _pad =3D {0}}, =
t_rttlow =3D 25,
>>     rfbuf_cnt =3D 0, tod =3D 0x0, t_sndrexmitpack =3D 520, =
t_rcvoopack =3D 0, t_toe =3D 0x0,
>>    cc_algo =3D 0xffffffff80ef2530 <newreno_cc_algo>, ccv =3D =
0xfffff80754818bc0, osd =3D
>>    0x0, t_bytes_acked =3D 11520, t_maxunacktime =3D 0, t_keepinit =3D =
0,
>>     t_keepidle =3D 0, t_keepintvl =3D 0, t_keepcnt =3D 0, t_dupacks =3D=
 4, t_lognum =3D 0,
>>    t_loglimit =3D 0, t_pacing_rate =3D -1, t_logs =3D {stqh_first =3D =
0x0, stqh_last =3D
>>    0x0}, t_lin =3D 0x0, t_lib =3D 0x0, t_output_caller =3D 0x0, =
t_stats =3D 0x0,
>>     t_logsn =3D 0, gput_ts =3D 0, gput_seq =3D 0, gput_ack =3D 0, =
t_stats_gput_prev =3D 0,
>>    t_tfo_client_cookie_len =3D 0 '\000', t_end_info_status =3D 0, =
t_tfo_pending =3D 0x0,
>>    t_tfo_cookie =3D {client =3D '\000' <repeats 15 times>, server =3D =
0}, {
>>       t_end_info_bytes =3D "\000\000\000\000\000\000\000", t_end_info =
=3D 0}}
>>=20
>>    (kgdb) p *tp@entry->sackhint.nexthole =
<mailto:*tp@entry-%3esackhint.nexthole>
>>    $8 =3D {start =3D 3846396940, end =3D 3846398380, rxmit =3D =
3846398380, scblink =3D
>>    {tqe_next =3D 0x0, tqe_prev =3D 0xfffff8013da5a330}}
>>=20
>>    --=20
>>    Andriy Gapon
>>    _______________________________________________
>>    freebsd-current@freebsd.org <mailto:freebsd-current@freebsd.org> =
mailing list
>>    https://lists.freebsd.org/mailman/listinfo/freebsd-current
>>    <https://lists.freebsd.org/mailman/listinfo/freebsd-current>;
>>    To unsubscribe, send any mail to =
"freebsd-current-unsubscribe@freebsd.org
>>    <mailto:freebsd-current-unsubscribe@freebsd.org>"
>>=20
>> =20
>>=20
>=20
>=20
> --=20
> Andriy Gapon




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?58AED4F6-FBA4-46D5-98D4-DD31BD9A946B>