From: John Baldwin
To: Sten Spans
cc: Robert Watson
cc: freebsd-alpha@FreeBSD.org
Date: Mon, 22 Nov 2004 17:11:54 -0500
Subject: Re: alpha and em mtu
Message-Id: <200411221711.54916.jhb@FreeBSD.org>
References: <200411221432.42028.jhb@FreeBSD.org>
List-Id: Porting FreeBSD to the Alpha

On Monday 22 November 2004 04:15 pm, Sten Spans wrote:
> On Mon, 22 Nov 2004, John Baldwin wrote:
> > On Sunday 21 November 2004 07:35 am, Sten Spans wrote:
> >>> Does this panic go
> >>> away if you use a different MTU btw?
> >>
> >> I've tried running
> >>
> >> i=1; while true; do echo $i; ifconfig em0 mtu $i; let i++; sleep 2; done
> >>
> >> and on the client:
> >> while true; do echo bla | telnet alpha 22; sleep 1; done
> >>
> >> this caused no crashes with mtu 1-1500.
> >>
> >> But:
> >> deepthought# ifconfig em0 mtu 1666
> >> deepthought# tcp_input: ip 0xfffffc0018cdb00e is misaligned
> >> deepthought# ifconfig em0 mtu 1564
> >> deepthought# tcp_input: ip 0xfffffc001857c80e is misaligned
> >> deepthought# ifconfig em0 mtu 1532
> >> deepthought# tcp_input: ip 0xfffffc001859300e is misaligned
> >>
> >> If it has to be 8 bytes aligned then it's off by 4, doesn't
> >> seem to be vlanmtu though.
>
> erm, that would be 2.
>
> > Ok, this is helpful I think.  (Big MTU -> panic.)
>
> Another thing is :
>
> deepthought# ifconfig em0 mtu 9000
> sten@ford:~$ ping -s 8000 intern.dt
> PING intern.deepthought.blinkenlights.nl (192.168.1.3) 8000(8028) bytes of data.
> 8008 bytes from intern.deepthought.blinkenlights.nl (192.168.1.3): icmp_seq=1 ttl=64 time=1.19 ms
> 8008 bytes from intern.deepthought.blinkenlights.nl (192.168.1.3): icmp_seq=2 ttl=64 time=0.756 ms
>
> 21:59:12.587494 IP intern.ford > intern.deepthought.blinkenlights.nl: icmp 8008: echo request seq 1
> 21:59:12.588223 IP intern.deepthought.blinkenlights.nl > intern.ford: icmp 8008: echo reply seq 1
> 21:59:13.587730 IP intern.ford > intern.deepthought.blinkenlights.nl: icmp 8008: echo request seq 2
>
> Aka icmp does work, which makes me think that the
> problem is tcp specific. I've also tried disabling all
> the sack/tcp sysctl's but that didn't seem to help.
> And I've tried connecting from a box with mtu 1500,
> but that also caused the same panic.
>
>
> I'll get an sk card soonish which will allow me to double
> check this panic with another nic. Although I would not guess
> that the panic is driver specific.
> Which makes me wonder why
> lo0 does work:
> deepthought# ifconfig lo0 mtu 1501
> deepthought# telnet 127.0.0.1 22
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.
> SSH-2.0-OpenSSH_3.8.1p1 FreeBSD-20040419
>
> > The next step is probably
> > to start walking up the stack determining where the pointer starts off
> > and how it ends up misaligned.  Can you use gdb to figure out the source
> > file/line of the previous stack frame before tcp_input()?
>
> sure:
>
> db> trace
> tcp_input() at tcp_input+0x3a4
> ip_input() at ip_input+0x9fc
> netisr_processqueue() at netisr_processqueue+0xac
> swi_net() at swi_net+0xf0
> ithread_loop() at ithread_loop+0x1d4
> fork_exit() at fork_exit+0x100
> exception_return() at exception_return
> --- root of call graph ---
>
> (gdb) l *tcp_input+0x3a4
> 0xfffffc00004cd054 is in tcp_input (/usr/src/sys/netinet/tcp_input.c:554).
> 549
> 550             /*
> 551              * Check that TCP offset makes sense,
> 552              * pull out TCP options and adjust length.          XXX
> 553              */
> 554             off = th->th_off << 2;
> 555             if (off < sizeof (struct tcphdr) || off > tlen) {
> 556                     tcpstat.tcps_rcvbadoff++;
> 557                     goto drop;
> 558             }
> (gdb) l *ip_input+0x9fc
> 0xfffffc00004c355c is in ip_input (/usr/src/sys/netinet/ip_input.c:739).
> 734             /*
> 735              * Switch out to protocol's input routine.
> 736              */
> 737             ipstat.ips_delivered++;
> 738
> 739             (*inetsw[ip_protox[ip->ip_p]].pr_input)(m, hlen);
> 740             return;
> 741     bad:
> 742             m_freem(m);
> 743     }
> (gdb) l *netisr_processqueue+0xac
> 0xfffffc00004ad45c is in netisr_processqueue (/usr/src/sys/net/netisr.c:233).
> 228
> 229             for (;;) {
> 230                     IF_DEQUEUE(ni->ni_queue, m);
> 231                     if (m == NULL)
> 232                             break;
> 233                     ni->ni_handler(m);
> 234             }
> 235     }

Hmm, so can you check here to see if the 'm' pointer in this routine is
misaligned?  If so, then this may be a driver bug.

-- 
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org
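
For reference, the "off by 2" correction quoted above can be checked directly from the three addresses in the "is misaligned" messages. The sketch below is a standalone userland C fragment, not kernel code; misalignment() and reported_ip are names made up for this illustration. All three addresses end in 0x0e, so each one sits 2 bytes past a 4-byte boundary and 6 bytes past an 8-byte boundary.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes by which addr lies past the
 * previous multiple of `alignment`.  0 means the address is aligned.
 * `alignment` must be a nonzero power of two.
 */
static unsigned
misalignment(uint64_t addr, unsigned alignment)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	return (unsigned)(addr & (uint64_t)(alignment - 1));
}

/* Addresses copied from the "tcp_input: ip ... is misaligned" messages. */
static const uint64_t reported_ip[] = {
	0xfffffc0018cdb00eULL,	/* seen after: ifconfig em0 mtu 1666 */
	0xfffffc001857c80eULL,	/* seen after: ifconfig em0 mtu 1564 */
	0xfffffc001859300eULL,	/* seen after: ifconfig em0 mtu 1532 */
};
```

Calling misalignment(reported_ip[i], 4) gives 2 for every entry, and misalignment(reported_ip[i], 8) gives 6. The low byte 0x0e is 14, which is also the length of an Ethernet header; that would be consistent with the IP header sitting directly behind an Ethernet header in a buffer that itself starts on an aligned boundary, but nothing in this thread confirms that yet.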