Date: Mon, 22 Nov 2004 17:11:54 -0500
From: John Baldwin <jhb@FreeBSD.org>
To: Sten Spans <sten@blinkenlights.nl>
Cc: freebsd-alpha@FreeBSD.org
Subject: Re: alpha and em mtu
Message-ID: <200411221711.54916.jhb@FreeBSD.org>
In-Reply-To: <Pine.SOC.4.61.0411222147180.10997@tea.blinkenlights.nl>
References: <Pine.SOC.4.61.0411142153430.26307@tea.blinkenlights.nl> <200411221432.42028.jhb@FreeBSD.org> <Pine.SOC.4.61.0411222147180.10997@tea.blinkenlights.nl>

On Monday 22 November 2004 04:15 pm, Sten Spans wrote:
> On Mon, 22 Nov 2004, John Baldwin wrote:
> > On Sunday 21 November 2004 07:35 am, Sten Spans wrote:
> >>> Does this panic go
> >>> away if you use a different MTU btw?
> >>
> >> I've tried running
> >>
> >> i=1; while true; echo $i; ifconfig em0 mtu $i; let i++; sleep 2;
> >>
> >> and on the client:
> >> while true; do echo bla | telnet alpha 22; sleep 1; done
> >>
> >> this caused no crashes with mtu 1-1500.
> >>
> >> But:
> >> deepthought# ifconfig em0 mtu 1666
> >> deepthought# tcp_input: ip 0xfffffc0018cdb00e is misaligned
> >> deepthought# ifconfig em0 mtu 1564
> >> deepthought# tcp_input: ip 0xfffffc001857c80e is misaligned
> >> deepthought# ifconfig em0 mtu 1532
> >> deepthought# tcp_input: ip 0xfffffc001859300e is misaligned
> >>
> >> If it has to be 8 bytes aligned then it's off by 4, doesn't
> >> seem to be vlanmtu though.
>
> erm, that would be 2.
>
> > Ok, this is helpful I think.  (Big MTU -> panic.)
>
> Another thing is :
>
> deepthought# ifconfig em0 mtu 9000
> sten@ford:~$ ping -s 8000 intern.dt
> PING intern.deepthought.blinkenlights.nl (192.168.1.3) 8000(8028) bytes of data.
> 8008 bytes from intern.deepthought.blinkenlights.nl (192.168.1.3): icmp_seq=1 ttl=64 time=1.19 ms
> 8008 bytes from intern.deepthought.blinkenlights.nl (192.168.1.3): icmp_seq=2 ttl=64 time=0.756 ms
>
> 21:59:12.587494 IP intern.ford > intern.deepthought.blinkenlights.nl: icmp 8008: echo request seq 1
> 21:59:12.588223 IP intern.deepthought.blinkenlights.nl > intern.ford: icmp 8008: echo reply seq 1
> 21:59:13.587730 IP intern.ford > intern.deepthought.blinkenlights.nl: icmp 8008: echo request seq 2
>
> Aka icmp does work, which makes me think that the
> problem is tcp specific. I've also tried disabling all
> the sack/tcp sysctl's but that didn't seem to help.
> And I've tried connecting from a box with mtu 1500,
> but that also caused the same panic.
>
>
> I'll get an sk card soonish which will allow me to double
> check this panic with another nic. Although I would not guess
> that the panic is driver specific. Which makes me wonder why
> lo0 does work:
> deepthought# ifconfig lo0 mtu 1501
> deepthought# telnet 127.0.0.1 22
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.
> SSH-2.0-OpenSSH_3.8.1p1 FreeBSD-20040419
>
> > The next step is probably
> > to start walking up the stack determining where the pointer starts off
> > and how it ends up aligned.  Can you use gdb to figure out the source
> > file/line of the previous stack frame before tcp_input()?
>
> sure:
>
> db> trace
> tcp_input() at tcp_input+0x3a4
> ip_input() at ip_input+0x9fc
> netisr_processqueue() at netisr_processqueue+0xac
> swi_net() at swi_net+0xf0
> ithread_loop() at ithread_loop+0x1d4
> fork_exit() at fork_exit+0x100
> exception_return() at exception_return
> --- root of call graph ---
>
> (gdb) l *tcp_input+0x3a4
> 0xfffffc00004cd054 is in tcp_input (/usr/src/sys/netinet/tcp_input.c:554).
> 549
> 550             /*
> 551              * Check that TCP offset makes sense,
> 552              * pull out TCP options and adjust length.              XXX
> 553              */
> 554             off = th->th_off << 2;
> 555             if (off < sizeof (struct tcphdr) || off > tlen) {
> 556                     tcpstat.tcps_rcvbadoff++;
> 557                     goto drop;
> 558             }
> (gdb) l *ip_input+0x9fc
> 0xfffffc00004c355c is in ip_input (/usr/src/sys/netinet/ip_input.c:739).
> 734             /*
> 735              * Switch out to protocol's input routine.
> 736              */
> 737             ipstat.ips_delivered++;
> 738
> 739             (*inetsw[ip_protox[ip->ip_p]].pr_input)(m, hlen);
> 740             return;
> 741     bad:
> 742             m_freem(m);
> 743     }
> (gdb) l *netisr_processqueue+0xac
> 0xfffffc00004ad45c is in netisr_processqueue (/usr/src/sys/net/netisr.c:233).
> 228
> 229             for (;;) {
> 230                     IF_DEQUEUE(ni->ni_queue, m);
> 231                     if (m == NULL)
> 232                             break;
> 233                     ni->ni_handler(m);
> 234             }
> 235     }

Hmm, so can you check here to see if the 'm' pointer in this routine is
misaligned?  If so, then this may be a driver bug.

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org