Date:      Wed, 25 Jan 2006 09:20:33 -0600
From:      craig@olyun.gank.org
To:        freebsd-net@freebsd.org
Subject:   Race condition in ip6_getpmtu?
Message-ID:  <20060125152032.GA40581@nowhere>

I seem to be running into a race condition in ip6_getpmtu.  I've been
having sporadic panics recently -- sometimes the machine will last a
week, sometimes it'll panic twice in a day.  The backtrace is always the
same:

(ddb and doadump frames removed)
#9  0xc066548a in calltrap () at /compile/src/sys/i386/i386/exception.s:139
#10 0xc05d32f1 in ip6_getpmtu (ro_pmtu=0xc528a90c, ro=0xc528a90c, 
    ifp=0xc4fea800, dst=0xe886b964, mtup=0x0, alwaysfragp=0xe886b8e4)
    at /compile/src/sys/netinet6/ip6_output.c:1415
#11 0xc05d2197 in ip6_output (m0=0xc528a90c, opt=0x0, ro=0xc528a90c, flags=4, 
    im6o=0x0, ifpp=0x0, inp=0x0) at /compile/src/sys/netinet6/ip6_output.c:806
#12 0xc05c74d4 in in6_gif_output (ifp=0xc51ae400, family=-971248640, 
    m=0xc5362900) at /compile/src/sys/netinet6/in6_gif.c:216
#13 0xc05900cb in gif_output (ifp=0xc51ae400, m=0xc52d4400, dst=0xc51b1a50, 
    rt=0xc543e084) at /compile/src/sys/net/if_gif.c:435
#14 0xc05a75f0 in ip_output (m=0xc52d4400, opt=0xc51ae400, ro=0xe886baac, 
    flags=0, imo=0x0, inp=0xc6189bf4)
    at /compile/src/sys/netinet/ip_output.c:776
#15 0xc05b1828 in tcp_output (tp=0xc6060c94)
    at /compile/src/sys/netinet/tcp_output.c:1080
#16 0xc05b99ca in tcp_usr_rcvd (so=0x0, flags=0)
    at /compile/src/sys/netinet/tcp_usrreq.c:600
#17 0xc055129b in soreceive (so=0xc61b3de8, psa=0x0, uio=0xe886bcb0, mp0=0x0, 
    controlp=0x0, flagsp=0x0) at /compile/src/sys/kern/uipc_socket.c:1400
#18 0xc053c63f in soo_read (fp=0x0, uio=0x0, active_cred=0xc61c7e00, flags=0, 
    td=0xc61bec00) at /compile/src/sys/kern/sys_socket.c:91
... all the way back to read(), though sometimes it will be in write()

(kgdb) up 10
#10 0xc05d32f1 in ip6_getpmtu (ro_pmtu=0xc528a90c, ro=0xc528a90c, 
    ifp=0xc4fea800, dst=0xe886b964, mtup=0x0, alwaysfragp=0xe886b8e4)
    at /compile/src/sys/netinet6/ip6_output.c:1415
1415                            mtu = ro_pmtu->ro_rt->rt_rmx.rmx_mtu;

(kgdb) print ro_pmtu
$1 = (struct route_in6 *) 0xc528a90c
(kgdb) print ro_pmtu->ro_rt
$2 = (struct rtentry *) 0x0

But...  That line is in a block where ro_rt has already been checked to see if
it's null.

1400            if (ro_pmtu->ro_rt) {
...
1412                    if (mtu)
1413                            mtu = min(mtu, ro_pmtu->ro_rt->rt_rmx.rmx_mtu);
1414                    else
1415                            mtu = ro_pmtu->ro_rt->rt_rmx.rmx_mtu;
...
1441            } else if (ifp) {

So, somehow ro_pmtu->ro_rt is being set to null between lines 1400 and
1415.  The only function call inside the block is tcp_hc_getmtu(), and
that doesn't touch the routing table.  My guess is that it's being
preempted by another process and the cached neighbor entry is expiring.
I don't see any locks protecting ro_pmtu; however, I'm unfamiliar with
how locking in the IPv6 code works, so there may be one higher up.
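
To make the suspected window concrete, here is a minimal userspace sketch
of the check-then-use pattern.  It is not the kernel code: fake_rtentry,
fake_route, clear_route and getmtu_snapshot are hypothetical names, and the
hook only simulates another context clearing ro_rt after the null check.
Reading the pointer once into a local means the check and the use see the
same value.  This is only an illustration of the window, not a proposed
fix, since in the real kernel the entry itself could presumably be freed
and a lock or a held reference would still be needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct rtentry and struct route_in6. */
struct fake_rtentry { unsigned long rmx_mtu; };
struct fake_route   { struct fake_rtentry *volatile ro_rt; };

/* Hook type used to simulate another context running mid-function. */
typedef void (*interrupt_hook)(struct fake_route *);

/* Simulated "other context": drops the cached route pointer. */
static void
clear_route(struct fake_route *ro)
{
	ro->ro_rt = NULL;
}

/*
 * Read ro_rt exactly once, then check and use the local copy.
 * Returns the route MTU, or fallback when no route is cached.
 */
static unsigned long
getmtu_snapshot(struct fake_route *ro, unsigned long fallback,
    interrupt_hook hook)
{
	assert(ro != NULL);

	struct fake_rtentry *rt = ro->ro_rt;	/* single read */

	if (rt == NULL)
		return fallback;
	if (hook != NULL)
		hook(ro);	/* ro->ro_rt may now be NULL; rt is unchanged */
	return rt->rmx_mtu;
}
```

Had the function re-read ro->ro_rt after the hook ran, as lines 1413 and
1415 do, it would dereference a null pointer, which matches what kgdb
shows above.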

The traffic in question is IPv4 traffic going out over an IPv6 gif
tunnel.  So far it always seems to happen when trying to send an
encapsulated packet.

Any IPv6 gurus know if that's a reasonable theory?

Thanks,
Craig

---------

Addendum: While I was writing this message, it panicked again.  This
time, however, it was in a delayed ACK transmission (still for the gif
tunnel).

#10 0xc05d32f1 in ip6_getpmtu (ro_pmtu=0xc51ecc0c, ro=0xc51ecc0c, 
    ifp=0xc4fea800, dst=0xe3938a2c, mtup=0x0, alwaysfragp=0xe39389ac)
    at /compile/src/sys/netinet6/ip6_output.c:1415
#11 0xc05d2197 in ip6_output (m0=0xc51ecc0c, opt=0x0, ro=0xc51ecc0c, flags=4, 
    im6o=0x0, ifpp=0x0, inp=0x0) at /compile/src/sys/netinet6/ip6_output.c:806
#12 0xc05c74d4 in in6_gif_output (ifp=0xc51af400, family=-991434496, 
    m=0xc53a9d00) at /compile/src/sys/netinet6/in6_gif.c:216
#13 0xc05900cb in gif_output (ifp=0xc51af400, m=0xc53ef900, dst=0xc51ad970, 
    rt=0xc5439084) at /compile/src/sys/net/if_gif.c:435
#14 0xc05a75f0 in ip_output (m=0xc53ef900, opt=0xc51af400, ro=0xe3938b74, 
    flags=0, imo=0x0, inp=0xc61484ec)
    at /compile/src/sys/netinet/ip_output.c:776
#15 0xc05b1828 in tcp_output (tp=0xc6cf3398)
    at /compile/src/sys/netinet/tcp_output.c:1080
#16 0xc05b71c2 in tcp_timer_delack (xtp=0xc6cf3398)
    at /compile/src/sys/netinet/tcp_timer.c:175
#17 0xc051cb83 in softclock (dummy=0x0)
    at /compile/src/sys/kern/kern_timeout.c:290
#18 0xc04f5e90 in ithread_loop (arg=0xc4ee5280)
...

Exact same spot in ip6_getpmtu though.


