Date:      Thu, 23 Apr 2009 21:26:33 +0200
From:      Andreas Tobler <andreast-list@fgznet.ch>
To:        Bartlomiej Sieka <tur@semihalf.com>, Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        freebsd-ppc@freebsd.org
Subject:   Re: Fatal kernel trap during boot 8.0-CURRENT
Message-ID:  <49F0C0E9.4030701@fgznet.ch>
In-Reply-To: <259915AE-12EF-4887-BB81-D955DDD970B3@semihalf.com>
References:  <D350790D-5AC3-4AD5-821E-6431ADAA0A2A@fahrners.de> <49E76291.4000002@fgznet.ch> <49EFD957.2010605@freebsd.org> <259915AE-12EF-4887-BB81-D955DDD970B3@semihalf.com>

Bartlomiej Sieka wrote:
> 
> On 2009-04-23, at 04:58, Nathan Whitehorn wrote:
> 
>> Andreas Tobler wrote:
>>> Jochen Fahrner wrote:
>>>> Hi,
>>>> after a few days of power off I wanted to boot the 8.0 kernel I 
>>>> installed last week on my iMac G3.
>>>> I got a kernel trap after starting sc0 driver:
>>>>
>>>> =======================
>>>> sc0: Unknown <16 virtual consoles, flags=0x300>
>>>> Timecounter "decrementer" frequency 24960000 Hz quality 0
>>>> Timecounters tick every 10.000 msec
>>>>
>>>> fatal kernel trap:
>>>> exception = 0x7 (program)
>>>> srr0      = 0x509168
>>>> srr1      = 0x83032
>>>> lr        = 0x4f9788
>>>> curthread = 0x633a30
>>>> pid = 0, comm=swapper
>>>> thread pid 0 tid 100000
>>>> Stopped at 0x509168
>>>> illegal instruction 7c0049ce
>>>> ==========================
>>>>
>>>> I could repeat this several times.
>>>> Then I booted my old 7.1 kernel without problems.
>>>> After that I also could boot 8.0 again.
>>>
>>> FYI, I see the same on my iMac G3. And I use the same 
>>> procedure to get back to -CURRENT.
>> This is related to Altivec support. 7c0049ce is stvx    v0,r0,r9, 
>> which is the first executed Altivec instruction in save_vec(), and the 
>> faulting address is close to where save_vec() ends up in my kernel. 
>> save_vec() can only be called if the process is marked with PCB_VEC. I 
>> have no idea how that ends up happening, and I can't duplicate the 
>> problem on my G3. One option would be to insert a panic() or a 
>> kdb_backtrace() into enable_vec(), which might at least tell us where 
>> it is getting called from...
>>
>> The only thing I can think of is that the 750 is taking a performance 
>> monitor exception and falling through to the EXC_VEC handler, which 
>> will try to turn on Altivec. The way Altivec support works is that 
>> only Altivec-aware processors should ever fault to EXC_VEC, in which 
>> case we should be fine setting PCB_VEC on the process. Very confusing...
> 
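
For anyone who wants to check the decoding of 7c0049ce quoted above by hand, here is a minimal sketch in C. It assumes the standard PowerPC X-form field layout; the struct and function names are made up for illustration and are not taken from the kernel sources. It only knows two AltiVec mnemonics (lvx and stvx), both under primary opcode 31.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of a PowerPC X-form instruction word (IBM bit 0 is the MSB). */
struct xform {
	unsigned opcd;	/* primary opcode, bits 0-5 */
	unsigned rt;	/* source/target register (vS for stvx), bits 6-10 */
	unsigned ra;	/* base register, bits 11-15 */
	unsigned rb;	/* index register, bits 16-20 */
	unsigned xo;	/* extended opcode, bits 21-30 */
};

/* Split a 32-bit instruction word into its X-form fields. */
static void
decode_xform(uint32_t insn, struct xform *out)
{
	assert(out != NULL);
	out->opcd = (insn >> 26) & 0x3f;
	out->rt = (insn >> 21) & 0x1f;
	out->ra = (insn >> 16) & 0x1f;
	out->rb = (insn >> 11) & 0x1f;
	out->xo = (insn >> 1) & 0x3ff;
}

/* Name the two AltiVec load/store forms this sketch knows about. */
static const char *
xform_mnemonic(const struct xform *f)
{
	assert(f != NULL);
	if (f->opcd != 31)
		return ("unknown");
	switch (f->xo) {
	case 103:
		return ("lvx");
	case 231:
		return ("stvx");
	default:
		return ("unknown");
	}
}
```

Feeding it 0x7c0049ce gives opcd 31, xo 231, register fields 0, 0 and 9, which matches the "stvx v0,r0,r9" reading.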

I added a kdb_backtrace() call at the beginning of save_vec().
I had to copy the trace down by hand:

0xd00048f0 at kdb_backtrace+0x4c
0xd0004910 at save_vec+0x1c
0xd0004930 at cpu_switch+0x54
0xd0004960 at mi_switch+0x290
0xd0004990 at sleepq_switch+0xcc
0xd00049b0 at sleepq_timedwait+0x58
0xd00049e0 at _cv_timedwait+0x1b4
0xd0004a20 at _sema_timedwait+0x84
0xd0004a50 at ata_queue_request+0x410
.....

If anyone needs the full stack trace, I can mail a JPG privately; I don't 
want to spam the list.


> Perhaps the problem is related to an issue we came across while working 
> on Efika support. The issue was that the Altivec-specific code was 
> executed, due to PCB_VEC being set when it shouldn't be (Efika has the 
> MPC5200B SoC, which is e300-based). PCB_VEC turned out to be set 
> because thread0.td_pcb contained garbage, and our problem went away 
> after zeroing the thread0.td_pcb in powerpc_init(), similarly to what 
> booke/machdep.c implementation does.
> 
> Please try the attached patch and see if it fixes the problem seen on 
> iMac G3.
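
To illustrate why an uncleared pcb can look like it has PCB_VEC set, here is a small user-space sketch in C. It is not the attached patch: struct mock_pcb, MOCK_PCB_VEC and its value, and the 0xa5 junk pattern are all assumptions made up for this example. The only point is that a flag test on memory that was never zeroed can come out true, and that clearing the structure once during init removes that possibility.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define	MOCK_PCB_VEC	0x4	/* hypothetical flag value, illustration only */

/* Hypothetical stand-in for the per-thread process control block. */
struct mock_pcb {
	unsigned long	pcb_flags;		/* state flags */
	unsigned char	pcb_vecregs[32][16];	/* vector register save area */
};

/* Simulate memory that was never cleared by filling it with junk. */
static void
mock_pcb_poison(struct mock_pcb *pcb)
{
	assert(pcb != NULL);
	memset(pcb, 0xa5, sizeof(*pcb));
}

/* What an early init routine would do: start from an all-zero pcb. */
static void
mock_pcb_init(struct mock_pcb *pcb)
{
	assert(pcb != NULL);
	memset(pcb, 0, sizeof(*pcb));
}

/* The check a context switch would make before saving vector state. */
static int
mock_pcb_uses_vec(const struct mock_pcb *pcb)
{
	assert(pcb != NULL);
	return ((pcb->pcb_flags & MOCK_PCB_VEC) != 0);
}
```

With the junk pattern in place mock_pcb_uses_vec() reports the flag as set; after mock_pcb_init() it does not. Since the junk depends on whatever was in memory at power-on, this would also fit the trap showing up mostly on cold boots.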

Bartlomiej, I tried your suggested fix and it looks good. So far I have 
not been able to reproduce the trap with it. Without the fix I can 
trigger the trap on a cold boot nearly every time, in about 8 of 10 
tries.

Thank you very much!

Andreas


