Date: Sun, 02 Sep 2018 09:09:28 -0600
From: Ian Lepore <ian@freebsd.org>
To: "Dr. Rolf Jansen" <rj@obsigna.com>, freebsd-arm@freebsd.org
Subject: Re: Kernel Panic on BBB cause by ti_adc intr
Message-ID: <1535900968.9486.5.camel@freebsd.org>
In-Reply-To: <B259CA27-7D08-45B1-97BB-35A544E346BB@obsigna.com>
References: <B259CA27-7D08-45B1-97BB-35A544E346BB@obsigna.com>
On Sun, 2018-09-02 at 00:15 -0300, Dr. Rolf Jansen wrote:
> I got signal sources connected to AIN0 and AIN1 of the BBB. The
> signals are divided, clipped and clamped and are guaranteed to stay
> in the range of 0 to 1.8 V. Generally, the circuitry does work and
> the ADC readings match very well the expectations.
>
> Only, sometimes, usually when I power on some considerable load (e.g.
> a hair dryer) connected to a different AC plug, but in the same room,
> the BBB bails out, giving the stack backtrace shown below. It might
> well be, that a power-on spike traverses the AC electricity supply,
> but there is no way that the spike after clipping and clamping would
> exceed said limits.
>
> My understanding of the stack backtrace is, that somehow an interrupt
> is triggered by said spike, and then it hits a bug in the interrupt
> handler. It seems that an exclusive sleep mutex is locked when it is
> not expected to be. This happened on FreeBSD 12.0-ALPHA3 and today
> also on -ALPHA4.
>
> Question:
>
> I don't need interrupt handling in my project, since the signal
> changes are slow, and the changes need to be read in defined
> time intervals. So, is it possible to deactivate the interrupt
> handler of the ti_adc?
>
> Presumably then the feature of the exclusive sleep mutex on ti_adc0
> would not be challenged and therefore may continue sleeping forever.
> Of course, I want continue being able of timed reading of the ADC
> values.
>
> Any suggestions would be greatly appreciated, since a BBB which can
> be DoS'ed by powering on a hair dryer is not as useful as it could
> be.
>
> Best regards
>
> Rolf
>
>
> Kernel page fault with the following non-sleepable locks held:
> exclusive sleep mutex ti_adc0 (ti_adc) r = 0 (0xc2277d08) locked @
> /usr/src/sys/arm/ti/ti_adc.c:508
> stack backtrace:
> Fatal kernel mode data abort: 'Translation Fault (L1)' on read
> trapframe: 0xd2ebeca0
> FSR=00000005, FAR=00000128, spsr=20000013
> r0 =00000000, r1 =00000003, r2 =00000001, r3 =00000000
> r4 =00000000, r5 =00000000, r6 =00000003, r7 =00000016
> r8 =00000000, r9 =c2280e00, r10=00000021, r11=d2ebed60
> r12=c0ace03c, ssp=d2ebed30, slr=c067d61c, pc =c00888c0
>
> panic: Fatal abort
> cpuid = 0
> time = 1535844155
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
> pc = 0xc05c7484 lr = 0xc0075d04 (db_trace_self_wrapper+0x30)
> sp = 0xd2ebea80 fp = 0xd2ebeb98
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
> pc = 0xc0075d04 lr = 0xc029d60c (vpanic+0x16c)
> sp = 0xd2ebeba0 fp = 0xd2ebebc0
> r4 = 0x00000100 r5 = 0x00000001
> r6 = 0xc071bb22 r7 = 0xc0a8cfd8
> vpanic() at vpanic+0x16c
> pc = 0xc029d60c lr = 0xc029d3ec (doadump)
> sp = 0xd2ebebc8 fp = 0xd2ebebcc
> r4 = 0xd2ebeca0 r5 = 0x00000013
> r6 = 0x00000128 r7 = 0x00000005
> r8 = 0x00000005 r9 = 0xd2ebeca0
> r10 = 0x00000128
> doadump() at doadump
> pc = 0xc029d3ec lr = 0xc05e9bb0 (abort_align)
> sp = 0xd2ebebd4 fp = 0xd2ebec00
> r4 = 0xc029d3ec r5 = 0xd2ebebd4
> abort_align() at abort_align
> pc = 0xc05e9bb0 lr = 0xc05e9740 (abort_handler+0x2e0)
> sp = 0xd2ebec08 fp = 0xd2ebec98
> r4 = 0x00000013 r5 = 0x00000128
> abort_handler() at abort_handler+0x2e0
> pc = 0xc05e9740 lr = 0xc05c9dd4 (exception_exit)
> sp = 0xd2ebeca0 fp = 0xd2ebed60
> r4 = 0x00000000 r5 = 0x00000000
> r6 = 0x00000003 r7 = 0x00000016
> r8 = 0x00000000 r9 = 0xc2280e00
> r10 = 0x00000021
> exception_exit() at exception_exit
> pc = 0xc05c9dd4 lr = 0xc067d61c (ti_adc_intr+0x88)
> sp = 0xd2ebed30 fp = 0xd2ebed60
> r0 = 0x00000000 r1 = 0x00000003
> r2 = 0x00000001 r3 = 0x00000000
> r4 = 0x00000000 r5 = 0x00000000
> r6 = 0x00000003 r7 = 0x00000016
> r8 = 0x00000000 r9 = 0xc2280e00
> r10 = 0x00000021 r12 = 0xc0ace03c
> evdev_push_event() at evdev_push_event+0x4c
> pc = 0xc00888c0 lr = 0xc067d61c (ti_adc_intr+0x88)
> sp = 0xd2ebed68 fp = 0xd2ebedd0
> r4 = 0xd2fce800 r5 = 0xc2277d00
> r6 = 0x00000000 r7 = 0x00000421
> r8 = 0xc2277d18 r9 = 0xc2280e00
> ti_adc_intr() at ti_adc_intr+0x88
> pc = 0xc067d61c lr = 0xc02662fc (ithread_loop+0x1f0)
> sp = 0xd2ebedd8 fp = 0xd2ebee20
> r4 = 0xd2fce800 r5 = 0x00000000
> r6 = 0xd2fce844 r7 = 0x00000000
> r8 = 0xc0719541 r9 = 0xc2280e00
> r10 = 0x00000000
> ithread_loop() at ithread_loop+0x1f0
> pc = 0xc02662fc lr = 0xc0262ef8 (fork_exit+0xa0)

That's a strange exception stack, with lots of registers containing
zeroes at exception time that were non-zero in the prior stack frame.
It makes me think something has overwritten the stack with garbage
data.

When I look at ti_adc_tsc_read_data() it has a stack-allocated data
array with 16 elements, and a loop that could load more than 16
elements into that array (ADC_FIFO_COUNT_MSK is 0x7f), that seems like
trouble.

You said you don't need interrupts, does that mean you're reading the
values via sysctl and aren't using the EVDEV stuff? If so, you might
be able to quickly work around the panic by building a custom kernel
using 'nooption EVDEV_SUPPORT'.

-- Ian