From: Ian Lepore
To: Andre Oppermann
Cc: freebsd-arm
Subject: Re: ARM network trouble after recent mbuf changes
Date: Mon, 26 Aug 2013 15:41:40 -0600
Message-ID: <1377553300.1111.157.camel@revolution.hippie.lan>
In-Reply-To: <521BC472.7040804@freebsd.org>
References: <1377550636.1111.156.camel@revolution.hippie.lan> <521BC472.7040804@freebsd.org>

On Mon, 2013-08-26 at 23:11 +0200, Andre Oppermann wrote:
> On 26.08.2013 22:57, Ian Lepore wrote:
> > This new thread pulls together info from several other threads and IRC
> > conversations, to summarize what we know right now for Andre in case the
> > problem is directly related to the mbuf changes.
> >
> > It looks like ARM systems consistently get address translation faults
> > related to network operations during boot. Zbyszek Bodek bisected it
> > down to r254807; revisions before that work, beginning with that one
> > they don't. A representative dmesg appears below. The abort happens in
> > in_cksum(), or sbappendaddr_locked(), or soreceive_generic(), depending
> > on various kernel config options and what network operations happen
> > first.
> >
> > Thomas Skibo reports:
> >
> > I've been experiencing this too on the Zedboard and I spent some time
> > looking into it.
> >
> > In my case, arprequest() is overwriting past the end of an mbuf into the
> > m_next field of the next one. Later, something tries to reference
> > address 0x6401a8c0 which is actually the machine's IP address in network
> > order. It looks like MH_ALIGN() used in arprequest() isn't working
> > properly after the recent mbuf header changes.
> >
> > Here's the mbuf just after arprequest() has performed MH_ALIGN(). The
> > m_data pointer is 0xc2c41de8 and the length is 0x1c. That puts the data
> > over the edge into the next mbuf. The m_pkthdr appears to have been
> > placed at 0xc2c41d18 (I think). It looks like the compiler inserted
> > padding at 1d14 so MHLEN isn't correct.
> >
> > XMD% mrd 0xc2c41d00 32
> > C2C41D00: 00000000
> > C2C41D04: 00000000
> > C2C41D08: C2C41DE8 (m_data)
> > C2C41D0C: 0000001C (m_len)
> > C2C41D10: 00000201 (m_type,m_flags)
> > C2C41D14: 00000000 (?)
> > C2C41D18: 00000000 (pkthdr.rcvif)
> > C2C41D1C: 00000000 (pkthdr.tags)
> > C2C41D20: 0000001C (pkthdr.len)
> > C2C41D24: 00000000
> > C2C41D28: 00000000
> > C2C41D2C: 00000000
> >
> > Thomas also reports that removing the bitfield definitions, so that
> > flags and type are two separate integers, works around the problem.
> >
> > Could this be something related to how bitfields are handled in EABI?
>
> Can you try this patch and check if it makes a difference on the bitfield?
>

Nope, that made no difference for me, same abort in the same place.

-- Ian