Date:      Mon, 26 Aug 2013 23:04:51 +0200
From:      Andre Oppermann <andre@freebsd.org>
To:        Ian Lepore <ian@FreeBSD.org>
Cc:        freebsd-arm <freebsd-arm@FreeBSD.org>
Subject:   Re: ARM network trouble after recent mbuf changes
Message-ID:  <521BC2F3.4050600@freebsd.org>
In-Reply-To: <1377550636.1111.156.camel@revolution.hippie.lan>
References:  <1377550636.1111.156.camel@revolution.hippie.lan>

On 26.08.2013 22:57, Ian Lepore wrote:
> This new thread pulls together info from several other threads and irc
> conversations, to summarize what we know right now for Andre in case the
> problem is directly related to the mbuf changes.
>
> It looks like ARM systems consistently get address translation faults
> related to network operations during boot.  Zbyszek Bodek bisected it
> down to r254807; revisions before that work, beginning with that one
> they don't.  A representative dmesg appears below.  The abort happens in
> in_cksum(), or sbappendaddr_locked(), or soreceive_generic(), depending
> on various kernel config options and what network operations happen
> first.
>
> Thomas Skibo reports:
>
> I've been experiencing this too on the Zedboard and I spent some time
> looking into it.
>
> In my case, arprequest() is overwriting past the end of an mbuf into the
> m_next field of the next one.  Later, something tries to reference
> address 0x6401a8c0 which is actually the machine's IP address in network
> order.  It looks like MH_ALIGN() used in arprequest() isn't working
> properly after the recent mbuf header changes.
>
> Here's the mbuf just after arprequest() has performed MH_ALIGN().  The
> m_data pointer is 0xc2c41de8 and the length is 0x1c.  That puts the data
> over the edge into the next mbuf.  The m_pkthdr appears to have been
> placed at 0xc2c41d18 (I think).  It looks like the compiler inserted
> padding at 1d14 so MHLEN isn't correct.
>
> XMD% mrd 0xc2c41d00 32
> C2C41D00:   00000000
> C2C41D04:   00000000
> C2C41D08:   C2C41DE8 (m_data)
> C2C41D0C:   0000001C (m_len)
> C2C41D10:   00000201 (m_type,m_flags)
> C2C41D14:   00000000  (?)
> C2C41D18:   00000000 (pkthdr.rcvif)
> C2C41D1C:   00000000 (pkthdr.tags)
> C2C41D20:   0000001C (pkthdr.len)
> C2C41D24:   00000000
> C2C41D28:   00000000
> C2C41D2C:   00000000
>
> Thomas also reports that removing the bitfield definitions, so that
> flags and type are two separate integers, works around the problem.
>
> Could this be something related to how bitfields are handled in EABI?
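
[Editorial note] The numbers in Thomas's dump can be checked by hand, and a
small sketch may make the arithmetic easier to follow.  It assumes that mbufs
are MSIZE bytes long with MSIZE = 256 (an assumption, not stated in the
thread) and that the CPU is little-endian.  The helper names mbuf_overrun()
and le_word_to_bytes() are hypothetical, made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: each mbuf is MSIZE bytes long, with MSIZE = 256. */
#define MSIZE 256u

/*
 * Number of bytes by which the data region [m_data, m_data + m_len)
 * runs past the end of the mbuf that starts at address `mbuf`.
 * Returns 0 when the data fits inside the mbuf.
 */
static uint32_t
mbuf_overrun(uint32_t mbuf, uint32_t m_data, uint32_t m_len)
{
	uint32_t mbuf_end = mbuf + MSIZE;
	uint32_t data_end = m_data + m_len;

	return (data_end > mbuf_end ? data_end - mbuf_end : 0);
}

/*
 * Split a 32-bit word, as loaded by a little-endian CPU, into the four
 * bytes in the order they sit in memory (lowest address first).
 */
static void
le_word_to_bytes(uint32_t word, uint8_t out[4])
{
	out[0] = (uint8_t)(word & 0xff);
	out[1] = (uint8_t)((word >> 8) & 0xff);
	out[2] = (uint8_t)((word >> 16) & 0xff);
	out[3] = (uint8_t)((word >> 24) & 0xff);
}
```

With the values from the dump, mbuf_overrun(0xc2c41d00, 0xc2c41de8, 0x1c)
yields 4: the data ends at 0xc2c41e04, four bytes into the following mbuf,
which would be enough to clobber one pointer-sized field there.  Decoding
0x6401a8c0 with le_word_to_bytes() gives the bytes c0 a8 01 64, that is
192.168.1.100 in network order.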

It could be.  We have also used bitfields in ip.h for the IPv4 header since
forever.
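
[Editorial note] One way to test the padding theory is to compare offsetof()
with a sum of sizeof() values.  The sketch below does not use the real struct
mbuf; hdr_bitfield, hdr_plain, pkt64 and the helper functions are
hypothetical stand-ins.  It assumes, without confirmation from this thread,
that the packet header contains a member needing 8-byte alignment.  Under that
assumption a 20-byte header is followed by 4 bytes of hidden padding, while a
24-byte header (type and flags as two separate integers) is not, which would
be consistent with both the dump and the reported workaround.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Header with type and flags packed into one 32-bit word: 20 bytes. */
struct hdr_bitfield {
	uint32_t next, nextpkt, data;	/* stand-ins for 32-bit pointers */
	int32_t	 len;
	uint32_t type:8,
		 flags:24;
};

/* Same header with type and flags as separate integers: 24 bytes. */
struct hdr_plain {
	uint32_t next, nextpkt, data;
	int32_t	 len;
	int32_t	 type;
	int32_t	 flags;
};

/* Assumed packet header: the uint64_t member sets its alignment. */
struct pkt64 {
	uint64_t cookie;
	uint32_t len;
};

struct buf_bitfield { struct hdr_bitfield h; struct pkt64 p; };
struct buf_plain    { struct hdr_plain h;    struct pkt64 p; };

/* Padding the compiler inserted between the header and the packet header. */
static size_t
hidden_pad_bitfield(void)
{
	return (offsetof(struct buf_bitfield, p) - sizeof(struct hdr_bitfield));
}

static size_t
hidden_pad_plain(void)
{
	return (offsetof(struct buf_plain, p) - sizeof(struct hdr_plain));
}
```

Any length macro computed as "total size minus sizeof(header) minus
sizeof(packet header)" overestimates the usable space by exactly the hidden
padding; computing it from offsetof() of the data area avoids that.  On ABIs
where uint64_t needs only 4-byte alignment inside a struct, both helpers
return 0 and the problem would not show up.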

Besides that, do all the ARM systems with this problem use the same network
interface type?

-- 
Andre

> A sample dmesg with the fault...
>
> KDB: debugger backends: ddb
> KDB: current backend: ddb
> Copyright (c) 1992-2013 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>          The Regents of the University of California. All rights
> reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 10.0-CURRENT #1 r254935: Mon Aug 26 14:32:21 MDT 2013
>
> ilepore@revolution.hippie.lan:/local/build/staging/freebsd/bb/obj/arm.armv6/local/build/staging/freebsd/bb/src/sys/BB arm
> FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610
> CPU: Cortex A8-r3 rev 2 (Cortex-A core)
>   Supported features: ARM_ISA THUMB2 JAZELLE THUMBEE ARMv4 Security_Ext
>   WB disabled EABT branch prediction enabled
> LoUU:2 LoC:2 LoUIS:1
> Cache level 1:
>   32KB/64B 4-way data cache WT WB Read-Alloc
>   32KB/64B 4-way instruction cache Read-Alloc
> Cache level 2:
>   256KB/64B 8-way unified cache WT WB Read-Alloc Write-Alloc
> real memory  = 268435456 (256 MB)
> avail memory = 256483328 (244 MB)
> Texas Instruments AM3358 Processor, Revision ES1.0
> random device not loaded; using insecure entropy
> random: <Software, Yarrow> initialized
> simplebus0: <Flattened device tree simple bus> on fdtbus0
> aintc0: <TI AINTC Interrupt Controller> mem 0x48200000-0x48200fff on
> simplebus0
> aintc0: Revision 5.0
> ti_scm0: <TI Control Module> mem 0x44e10000-0x44e11fff on simplebus0
> am335x_prcm0: <AM335x Power and Clock Management> mem
> 0x44e00000-0x44e012ff on simplebus0
> am335x_prcm0: Clocks: System 24.0 MHz, CPU 720 MHz
> am335x_dmtimer0: <AM335x DMTimer> mem
> 0x44e05000-0x44e05fff,0x44e31000-0x44e31fff,0x48040000-0x48040fff,0x48042000-0x48042fff,0x48044000-0x48044fff,0x48046000-0x48046fff,0x48048000-0x48048fff,0x4804a000-0x4804afff irq 66,67,68,69,92,93,94,95 on simplebus0
> Timecounter "AM335x Timecounter" frequency 24000000 Hz quality 1000
> Event timer "AM335x Eventtimer0" frequency 24000000 Hz quality 1000
> gpio0: <TI General Purpose I/O (GPIO)> mem
> 0x44e07000-0x44e07fff,0x4804c000-0x4804cfff,0x481ac000-0x481acfff,0x481ae000-0x481aefff irq 96,97,98,99,32,33,62,63 on simplebus0
> gpioc0: <GPIO controller> on gpio0
> gpiobus0: <GPIO bus> on gpio0
> uart0: <TI UART (16550 compatible)> mem 0x44e09000-0x44e09fff irq 72 on
> simplebus0
> uart0: console (115384,n,8,1)
> ti_edma30: <TI EDMA Controller> mem
> 0x49000000-0x490fffff,0x49800000-0x498fffff,0x49900000-0x499fffff,0x49a00000-0x49afffff irq 12,13,14 on simplebus0
> ti_edma30: EDMA revision 40014c00
> sdhci_ti0: <TI MMCHS (SDHCI 2.0)> mem 0x48060000-0x48060fff irq 64 on
> simplebus0
> mmc0: <MMC/SD bus> on sdhci_ti0
> cpsw0: <3-port Switch Ethernet Subsystem> mem 0x4a100000-0x4a103fff irq
> 40,41,42,43 on simplebus0
> cpsw0: CPSW SS Version 1.12 (0)
> cpsw0: Initial queue size TX=128 RX=384
> cpsw0: Ethernet address: 00:18:31:8e:c0:96
> miibus0: <MII bus> on cpsw0
> smscphy0: <SMC LAN8710A 10/100 interface> PHY 0 on miibus0
> smscphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> iichb0: <TI I2C Controller> mem 0x44e0b000-0x44e0bfff irq 70 on
> simplebus0
> iichb0: I2C revision 4.0
> iicbus0: <OFW I2C bus> on iichb0
> iic0: <I2C generic I/O> on iicbus0
> am335x_pmic0: <TI TPS65217 Power Management IC> at addr 0x24 on iicbus0
> am335x_pwm0: <AM335x PWM> mem
> 0x48300000-0x483000ff,0x48300100-0x4830017f,0x48300180-0x483001ff,0x48300200-0x4830025f irq 86,58 on simplebus0
> am335x_pwm1: <AM335x PWM> mem
> 0x48302000-0x483020ff,0x48302100-0x4830217f,0x48302180-0x483021ff,0x48302200-0x4830225f irq 87,59 on simplebus0
> am335x_pwm2: <AM335x PWM> mem
> 0x48304000-0x483040ff,0x48304100-0x4830417f,0x48304180-0x483041ff,0x48304200-0x4830425f irq 88,60 on simplebus0
> Timecounters tick every 10.000 msec
> mmcsd0: 8GB <SDHC 00000 1.0 SN 2079402514 MFG 08/2010 by 27 SM> at mmc0
> 48.0MHz/4bit/65535-block
> am335x_pmic0: TPS65217B ver 1.1 powered by USB and AC
> Sending DHCP Discover packet from interface cpsw0 (00:18:31:8e:c0:96)
> cpsw0: link state changed to DOWN
> cpsw0: link state changed to UP
> Received DHCP Offer packet on cpsw0 from 172.22.42.240 (accepted)
> Sending DHCP Request packet from interface cpsw0 (00:18:31:8e:c0:96)
> Received DHCP Ack packet on cpsw0 from 172.22.42.240 (accepted) (got
> root path)
> cpsw0 at 172.22.42.234 server 172.22.42.240 boot
> file /bb/boot/kernel/kernel
> subnet mask 255.255.255.0 router 172.22.42.254 rootfs 172.22.42.240:/bb
> Adjusted interface cpsw0
>
> vm_fault(0xc05b0820, 57405000, 1, 0) -> 1
> Fatal kernel mode data abort: 'Translation Fault (S)'
> trapframe: 0xc05c1ae8
> FSR=00000005, FAR=5740540c, spsr=20000093
> r0 =c188d418, r1 =00000000, r2 =00000000, r3 =00000010
> r4 =f02a16ac, r5 =ea2a16ac, r6 =c188d3e8, r7 =00000000
> r8 =425443df, r9 =00000000, r10=00000014, r11=c188d40c
> r12=57405400, ssp=c05c1b3c, slr=c049d748, pc =c049d720
>
> [ thread pid 0 tid 100000 ]
> Stopped at      in_cksum+0x14:  ldr     r1, [r12, #0x00c]
> 	
> -- Ian
>
>
>
>



