Date:      Mon, 26 Aug 2013 14:57:16 -0600
From:      Ian Lepore <ian@FreeBSD.org>
To:        Andre Oppermann <andre@FreeBSD.org>, freebsd-arm <freebsd-arm@FreeBSD.org>
Subject:   ARM network trouble after recent mbuf changes
Message-ID:  <1377550636.1111.156.camel@revolution.hippie.lan>

This new thread pulls together info from several other threads and irc
conversations, to summarize what we know right now for Andre in case the
problem is directly related to the mbuf changes.

It looks like ARM systems consistently get address translation faults
related to network operations during boot.  Zbyszek Bodek bisected it
down to r254807; revisions before that work, and revisions from r254807
onward do not.  A representative dmesg appears below.  The abort happens in
in_cksum(), or sbappendaddr_locked(), or soreceive_generic(), depending
on various kernel config options and what network operations happen
first.

Thomas Skibo reports:

I've been experiencing this too on the Zedboard and I spent some time 
looking into it.

In my case, arprequest() is overwriting past the end of an mbuf into the
m_next field of the next one.  Later, something tries to reference
address 0x6401a8c0 which is actually the machine's IP address in network
order.  It looks like MH_ALIGN() used in arprequest() isn't working
properly after the recent mbuf header changes.
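
To make the "IP address in network order" observation concrete, here is a
small C sketch that formats a 32-bit word as the dotted quad formed by its
bytes in memory order.  It assumes a little-endian CPU, so the lowest byte
of the word is the first byte in memory; word_to_dotted_quad() is a made-up
name for illustration, not anything in the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 32-bit word as the dotted quad formed by its bytes in
 * little-endian memory order (lowest-addressed byte first).
 * Returns whatever snprintf() returns.
 */
int
word_to_dotted_quad(uint32_t word, char *buf, size_t buflen)
{
	assert(buf != NULL);
	return snprintf(buf, buflen, "%u.%u.%u.%u",
	    (unsigned)(word & 0xff),
	    (unsigned)((word >> 8) & 0xff),
	    (unsigned)((word >> 16) & 0xff),
	    (unsigned)((word >> 24) & 0xff));
}
```

Under that assumption, 0x6401a8c0 formats as 192.168.1.100, which is what
one would expect to see if packet data had been written over a pointer
field.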

Here's the mbuf just after arprequest() has performed MH_ALIGN().  The
m_data pointer is 0xc2c41de8 and the length is 0x1c.  That puts the data
over the edge into the next mbuf.  The m_pkthdr appears to have been
placed at 0xc2c41d18 (I think).  It looks like the compiler inserted
padding at 0xc2c41d14, so MHLEN isn't correct.

XMD% mrd 0xc2c41d00 32
C2C41D00:   00000000
C2C41D04:   00000000
C2C41D08:   C2C41DE8 (m_data)
C2C41D0C:   0000001C (m_len)
C2C41D10:   00000201 (m_type,m_flags)
C2C41D14:   00000000  (?)
C2C41D18:   00000000 (pkthdr.rcvif)
C2C41D1C:   00000000 (pkthdr.tags)
C2C41D20:   0000001C (pkthdr.len)
C2C41D24:   00000000
C2C41D28:   00000000
C2C41D2C:   00000000
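
As a cross-check on the dump, here is a small C sketch of the bounds
arithmetic.  It rests on assumptions that are not stated in the message:
that an mbuf is 256 bytes (SKETCH_MSIZE is my own name), that this mbuf
starts at 0xc2c41d00 where the dump begins, and that an MH_ALIGN-style
macro places the data at the end of the data area, rounded down to a
4-byte boundary.  The function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MSIZE 256u	/* assumed mbuf size, not from the message */

/*
 * Number of bytes by which [data, data + len) extends past the end of
 * an mbuf that starts at base; 0 if the data fits.
 */
uint32_t
mbuf_overrun(uint32_t base, uint32_t data, uint32_t len)
{
	uint32_t mbuf_end = base + SKETCH_MSIZE;
	uint32_t data_end = data + len;

	return (data_end > mbuf_end) ? data_end - mbuf_end : 0;
}

/*
 * Offset at which len bytes are placed at the end of a data area of
 * the given size, rounded down to a 4-byte boundary.
 */
uint32_t
align_at_end(uint32_t space, uint32_t len)
{
	assert(len <= space);
	return (space - len) & ~(uint32_t)3;
}
```

With the values from the dump, mbuf_overrun(0xc2c41d00, 0xc2c41de8, 0x1c)
gives 4: the data would end 4 bytes into the following mbuf, which is
where that mbuf's m_next would sit under these assumptions.  Similarly, if
the space passed to align_at_end() is 4 bytes larger than the real data
area, the data lands 4 bytes too far along.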

Thomas also reports that removing the bitfield definitions, so that
flags and type are two separate integers, works around the problem. 

Could this be something related to how bitfields are handled in EABI?
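
One way to investigate a layout question like this is to let the compiler
report the padding instead of inferring it from a memory dump.  The C
sketch below does that for two hypothetical headers, one with bitfields
and one with two separate integers.  None of these structs are FreeBSD's
real definitions; in particular the 64-bit member in pkt_info is my own
addition, included only so that alignment rules have something to act on.
It makes no claim about the real cause of the fault.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header with type and flags packed into bitfields. */
struct hdr_bitfield {
	void		*next;
	void		*nextpkt;
	char		*data;
	int32_t		 len;
	uint32_t	 type:8;
	uint32_t	 flags:24;
};

/* Hypothetical header with type and flags as two separate integers. */
struct hdr_split {
	void		*next;
	void		*nextpkt;
	char		*data;
	int32_t		 len;
	int32_t		 type;
	int32_t		 flags;
};

/* Hypothetical packet header; the 64-bit member is an assumption. */
struct pkt_info {
	void		*rcvif;
	void		*tags;
	int32_t		 len;
	uint64_t	 wide;
};

struct buf_bitfield { struct hdr_bitfield h; struct pkt_info p; };
struct buf_split { struct hdr_split h; struct pkt_info p; };

/* Padding the compiler put between the header and the packet header. */
size_t
pad_after_bitfield_hdr(void)
{
	return offsetof(struct buf_bitfield, p) - sizeof(struct hdr_bitfield);
}

size_t
pad_after_split_hdr(void)
{
	return offsetof(struct buf_split, p) - sizeof(struct hdr_split);
}

/* Data space left in a buffer of total bytes once used bytes are taken. */
size_t
data_space(size_t total, size_t used)
{
	assert(used <= total);
	return total - used;
}
```

The results depend entirely on the ABI.  On one with 4-byte pointers and
8-byte alignment for 64-bit integers, I would expect 4 bytes of padding
after the bitfield header and none after the split one, but I have not run
this on ARM EABI.  Any length constant derived by summing header sizes by
hand would need to be checked against offsetof() and sizeof() in the same
way.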

A sample dmesg with the fault...

KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.0-CURRENT #1 r254935: Mon Aug 26 14:32:21 MDT 2013
    ilepore@revolution.hippie.lan:/local/build/staging/freebsd/bb/obj/arm.armv6/local/build/staging/freebsd/bb/src/sys/BB arm
FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610
CPU: Cortex A8-r3 rev 2 (Cortex-A core)
 Supported features: ARM_ISA THUMB2 JAZELLE THUMBEE ARMv4 Security_Ext
 WB disabled EABT branch prediction enabled
LoUU:2 LoC:2 LoUIS:1 
Cache level 1: 
 32KB/64B 4-way data cache WT WB Read-Alloc
 32KB/64B 4-way instruction cache Read-Alloc
Cache level 2: 
 256KB/64B 8-way unified cache WT WB Read-Alloc Write-Alloc
real memory  = 268435456 (256 MB)
avail memory = 256483328 (244 MB)
Texas Instruments AM3358 Processor, Revision ES1.0
random device not loaded; using insecure entropy
random: <Software, Yarrow> initialized
simplebus0: <Flattened device tree simple bus> on fdtbus0
aintc0: <TI AINTC Interrupt Controller> mem 0x48200000-0x48200fff on simplebus0
aintc0: Revision 5.0
ti_scm0: <TI Control Module> mem 0x44e10000-0x44e11fff on simplebus0
am335x_prcm0: <AM335x Power and Clock Management> mem 0x44e00000-0x44e012ff on simplebus0
am335x_prcm0: Clocks: System 24.0 MHz, CPU 720 MHz
am335x_dmtimer0: <AM335x DMTimer> mem 0x44e05000-0x44e05fff,0x44e31000-0x44e31fff,0x48040000-0x48040fff,0x48042000-0x48042fff,0x48044000-0x48044fff,0x48046000-0x48046fff,0x48048000-0x48048fff,0x4804a000-0x4804afff irq 66,67,68,69,92,93,94,95 on simplebus0
Timecounter "AM335x Timecounter" frequency 24000000 Hz quality 1000
Event timer "AM335x Eventtimer0" frequency 24000000 Hz quality 1000
gpio0: <TI General Purpose I/O (GPIO)> mem 0x44e07000-0x44e07fff,0x4804c000-0x4804cfff,0x481ac000-0x481acfff,0x481ae000-0x481aefff irq 96,97,98,99,32,33,62,63 on simplebus0
gpioc0: <GPIO controller> on gpio0
gpiobus0: <GPIO bus> on gpio0
uart0: <TI UART (16550 compatible)> mem 0x44e09000-0x44e09fff irq 72 on simplebus0
uart0: console (115384,n,8,1)
ti_edma30: <TI EDMA Controller> mem 0x49000000-0x490fffff,0x49800000-0x498fffff,0x49900000-0x499fffff,0x49a00000-0x49afffff irq 12,13,14 on simplebus0
ti_edma30: EDMA revision 40014c00
sdhci_ti0: <TI MMCHS (SDHCI 2.0)> mem 0x48060000-0x48060fff irq 64 on simplebus0
mmc0: <MMC/SD bus> on sdhci_ti0
cpsw0: <3-port Switch Ethernet Subsystem> mem 0x4a100000-0x4a103fff irq 40,41,42,43 on simplebus0
cpsw0: CPSW SS Version 1.12 (0)
cpsw0: Initial queue size TX=128 RX=384
cpsw0: Ethernet address: 00:18:31:8e:c0:96
miibus0: <MII bus> on cpsw0
smscphy0: <SMC LAN8710A 10/100 interface> PHY 0 on miibus0
smscphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
iichb0: <TI I2C Controller> mem 0x44e0b000-0x44e0bfff irq 70 on simplebus0
iichb0: I2C revision 4.0
iicbus0: <OFW I2C bus> on iichb0
iic0: <I2C generic I/O> on iicbus0
am335x_pmic0: <TI TPS65217 Power Management IC> at addr 0x24 on iicbus0
am335x_pwm0: <AM335x PWM> mem 0x48300000-0x483000ff,0x48300100-0x4830017f,0x48300180-0x483001ff,0x48300200-0x4830025f irq 86,58 on simplebus0
am335x_pwm1: <AM335x PWM> mem 0x48302000-0x483020ff,0x48302100-0x4830217f,0x48302180-0x483021ff,0x48302200-0x4830225f irq 87,59 on simplebus0
am335x_pwm2: <AM335x PWM> mem 0x48304000-0x483040ff,0x48304100-0x4830417f,0x48304180-0x483041ff,0x48304200-0x4830425f irq 88,60 on simplebus0
Timecounters tick every 10.000 msec
mmcsd0: 8GB <SDHC 00000 1.0 SN 2079402514 MFG 08/2010 by 27 SM> at mmc0 48.0MHz/4bit/65535-block
am335x_pmic0: TPS65217B ver 1.1 powered by USB and AC
Sending DHCP Discover packet from interface cpsw0 (00:18:31:8e:c0:96)
cpsw0: link state changed to DOWN
cpsw0: link state changed to UP
Received DHCP Offer packet on cpsw0 from 172.22.42.240 (accepted)
Sending DHCP Request packet from interface cpsw0 (00:18:31:8e:c0:96)
Received DHCP Ack packet on cpsw0 from 172.22.42.240 (accepted) (got root path)
cpsw0 at 172.22.42.234 server 172.22.42.240 boot file /bb/boot/kernel/kernel
subnet mask 255.255.255.0 router 172.22.42.254 rootfs 172.22.42.240:/bb 
Adjusted interface cpsw0

vm_fault(0xc05b0820, 57405000, 1, 0) -> 1
Fatal kernel mode data abort: 'Translation Fault (S)'
trapframe: 0xc05c1ae8
FSR=00000005, FAR=5740540c, spsr=20000093
r0 =c188d418, r1 =00000000, r2 =00000000, r3 =00000010
r4 =f02a16ac, r5 =ea2a16ac, r6 =c188d3e8, r7 =00000000
r8 =425443df, r9 =00000000, r10=00000014, r11=c188d40c
r12=57405400, ssp=c05c1b3c, slr=c049d748, pc =c049d720

[ thread pid 0 tid 100000 ]
Stopped at      in_cksum+0x14:  ldr     r1, [r12, #0x00c]
	
-- Ian




