Date:      Sun, 02 Sep 2018 09:09:28 -0600
From:      Ian Lepore <ian@freebsd.org>
To:        "Dr. Rolf Jansen" <rj@obsigna.com>, freebsd-arm@freebsd.org
Subject:   Re: Kernel Panic on BBB cause by ti_adc intr
Message-ID:  <1535900968.9486.5.camel@freebsd.org>
In-Reply-To: <B259CA27-7D08-45B1-97BB-35A544E346BB@obsigna.com>
References:  <B259CA27-7D08-45B1-97BB-35A544E346BB@obsigna.com>

On Sun, 2018-09-02 at 00:15 -0300, Dr. Rolf Jansen wrote:
> I got signal sources connected to AIN0 and AIN1 of the BBB. The
> signals are divided, clipped and clamped and are guaranteed to stay
> in the range of 0 to 1.8 V. Generally, the circuitry does work and
> the ADC readings match very well the expectations.
> 
> Only, sometimes, usually when I power on some considerable load (e.g.
> a hair dryer) connected to a different AC plug, but in the same room,
> the BBB bails out, giving the stack backtrace shown below. It might
> well be, that a power-on spike traverses the AC electricity supply,
> but there is no way that the spike after clipping and clamping would
> exceed said limits.
> 
> My understanding of the stack backtrace is, that somehow an interrupt
> is triggered by said spike, and then it hits a bug in the interrupt
> handler. It seems that an exclusive sleep mutex is locked when it is
> not expected to be. This happened on FreeBSD 12.0-ALPHA3 and today
> also on -ALPHA4.
> 
> Question:
> 
>    I don't need interrupt handling in my project, since the signal
>    changes are slow, and the changes need to be read in defined
>    time intervals. So, is it possible to deactivate the interrupt
>    handler of the ti_adc?
> 
> Presumably then the feature of the exclusive sleep mutex on ti_adc0
> would not be challenged and therefore may continue sleeping forever.
> Of course, I want continue being able of timed reading of the ADC
> values.
> 
> Any suggestions would be greatly appreciated, since a BBB which can
> be DoS'ed by powering on a hair dryer is not as useful as it could
> be.
> 
> Best regards
> 
> Rolf
> 
> 
> Kernel page fault with the following non-sleepable locks held:
> exclusive sleep mutex ti_adc0 (ti_adc) r = 0 (0xc2277d08) locked @
> /usr/src/sys/arm/ti/ti_adc.c:508
> stack backtrace:
> Fatal kernel mode data abort: 'Translation Fault (L1)' on read
> trapframe: 0xd2ebeca0
> FSR=00000005, FAR=00000128, spsr=20000013
> r0 =00000000, r1 =00000003, r2 =00000001, r3 =00000000
> r4 =00000000, r5 =00000000, r6 =00000003, r7 =00000016
> r8 =00000000, r9 =c2280e00, r10=00000021, r11=d2ebed60
> r12=c0ace03c, ssp=d2ebed30, slr=c067d61c, pc =c00888c0
> 
> panic: Fatal abort
> cpuid = 0
> time = 1535844155
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
> 	 pc = 0xc05c7484  lr = 0xc0075d04 (db_trace_self_wrapper+0x30)
> 	 sp = 0xd2ebea80  fp = 0xd2ebeb98
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
> 	 pc = 0xc0075d04  lr = 0xc029d60c (vpanic+0x16c)
> 	 sp = 0xd2ebeba0  fp = 0xd2ebebc0
> 	 r4 = 0x00000100  r5 = 0x00000001
> 	 r6 = 0xc071bb22  r7 = 0xc0a8cfd8
> vpanic() at vpanic+0x16c
> 	 pc = 0xc029d60c  lr = 0xc029d3ec (doadump)
> 	 sp = 0xd2ebebc8  fp = 0xd2ebebcc
> 	 r4 = 0xd2ebeca0  r5 = 0x00000013
> 	 r6 = 0x00000128  r7 = 0x00000005
> 	 r8 = 0x00000005  r9 = 0xd2ebeca0
> 	r10 = 0x00000128
> doadump() at doadump
> 	 pc = 0xc029d3ec  lr = 0xc05e9bb0 (abort_align)
> 	 sp = 0xd2ebebd4  fp = 0xd2ebec00
> 	 r4 = 0xc029d3ec  r5 = 0xd2ebebd4
> abort_align() at abort_align
> 	 pc = 0xc05e9bb0  lr = 0xc05e9740 (abort_handler+0x2e0)
> 	 sp = 0xd2ebec08  fp = 0xd2ebec98
> 	 r4 = 0x00000013  r5 = 0x00000128
> abort_handler() at abort_handler+0x2e0
> 	 pc = 0xc05e9740  lr = 0xc05c9dd4 (exception_exit)
> 	 sp = 0xd2ebeca0  fp = 0xd2ebed60
> 	 r4 = 0x00000000  r5 = 0x00000000
> 	 r6 = 0x00000003  r7 = 0x00000016
> 	 r8 = 0x00000000  r9 = 0xc2280e00
> 	r10 = 0x00000021
> exception_exit() at exception_exit
> 	 pc = 0xc05c9dd4  lr = 0xc067d61c (ti_adc_intr+0x88)
> 	 sp = 0xd2ebed30  fp = 0xd2ebed60
> 	 r0 = 0x00000000  r1 = 0x00000003
> 	 r2 = 0x00000001  r3 = 0x00000000
> 	 r4 = 0x00000000  r5 = 0x00000000
> 	 r6 = 0x00000003  r7 = 0x00000016
> 	 r8 = 0x00000000  r9 = 0xc2280e00
> 	r10 = 0x00000021 r12 = 0xc0ace03c
> evdev_push_event() at evdev_push_event+0x4c
> 	 pc = 0xc00888c0  lr = 0xc067d61c (ti_adc_intr+0x88)
> 	 sp = 0xd2ebed68  fp = 0xd2ebedd0
> 	 r4 = 0xd2fce800  r5 = 0xc2277d00
> 	 r6 = 0x00000000  r7 = 0x00000421
> 	 r8 = 0xc2277d18  r9 = 0xc2280e00
> ti_adc_intr() at ti_adc_intr+0x88
> 	 pc = 0xc067d61c  lr = 0xc02662fc (ithread_loop+0x1f0)
> 	 sp = 0xd2ebedd8  fp = 0xd2ebee20
> 	 r4 = 0xd2fce800  r5 = 0x00000000
> 	 r6 = 0xd2fce844  r7 = 0x00000000
> 	 r8 = 0xc0719541  r9 = 0xc2280e00
> 	r10 = 0x00000000
> ithread_loop() at ithread_loop+0x1f0
> 	 pc = 0xc02662fc  lr = 0xc0262ef8 (fork_exit+0xa0)

That's a strange exception stack: lots of registers contain zeroes at
exception time that were non-zero in the prior stack frame. It makes me
think something has overwritten the stack with garbage data. When I
look at ti_adc_tsc_read_data(), it has a stack-allocated data array
with 16 elements and a loop that could load more than 16 elements into
that array (ADC_FIFO_COUNT_MSK is 0x7f, so the count can reach 127).
That seems like trouble.
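
To illustrate the kind of overrun I mean, here is a minimal standalone
sketch of a bounded FIFO read. It is not the ti_adc.c code: the
fake_fifo struct, fake_fifo_pop(), read_fifo_bounded(), and the
FIFO_COUNT_MASK and DATA_SLOTS names are all made up for the example,
and only the 0x7f mask and the 16-element array size come from what is
described above. The idea is simply to never store more samples than
the array can hold, while still draining the FIFO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the message; the names are hypothetical. */
#define FIFO_COUNT_MASK 0x7fu   /* same value as ADC_FIFO_COUNT_MSK */
#define DATA_SLOTS      16u     /* size of the stack-allocated array */

/* Hypothetical stand-in for the hardware FIFO, not a driver structure. */
struct fake_fifo {
    uint32_t count_reg;         /* raw FIFO count register value */
    uint32_t next;              /* next sample value to hand out */
};

/* Pop one sample from the simulated FIFO. */
static uint32_t
fake_fifo_pop(struct fake_fifo *f)
{
    return (f->next++);
}

/*
 * Drain the FIFO but store at most `slots` samples in `out`.
 * Samples beyond `slots` are read and dropped rather than written
 * past the end of the array. Returns the number of samples stored.
 */
static size_t
read_fifo_bounded(struct fake_fifo *f, uint32_t *out, size_t slots)
{
    size_t i, pending, stored;

    assert(f != NULL && out != NULL);

    /* The masked count can be as large as 127, far more than 16. */
    pending = f->count_reg & FIFO_COUNT_MASK;
    stored = 0;
    for (i = 0; i < pending; i++) {
        uint32_t sample = fake_fifo_pop(f);

        if (stored < slots)
            out[stored++] = sample;
    }
    f->count_reg = 0;           /* simulated FIFO is now empty */
    return (stored);
}
```

Calling read_fifo_bounded() with a 16-element array and a simulated
count of, say, 100 stores only the first 16 samples and leaves the
memory after the array untouched. Whether dropping the excess samples
is the right policy for the real driver is a separate question; the
sketch only shows the bounds check.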

You said you don't need interrupts. Does that mean you're reading the
values via sysctl and aren't using the EVDEV stuff? If so, you might be
able to quickly work around the panic by building a custom kernel using
'nooption EVDEV_SUPPORT'.

-- Ian
