Date:      Wed, 01 Apr 2009 18:19:58 +1100
From:      Lawrence Stewart <lstewart@freebsd.org>
To:        freebsd-current@freebsd.org
Cc:        Alexander Motin <mav@freebsd.org>
Subject:   Re: [SOLVED->UNSOLVED] Re: kernel panic with snd_hda "panic: Duplicate free of item 0xffffff00025f8c00 from zone 0xffffff00b697d400(1024)" (possibly an ACPI issue?)
Message-ID:  <49D3159E.5000901@freebsd.org>
In-Reply-To: <49D0150E.8050601@room52.net>
References:  <49CF0754.9070907@room52.net> <49CF2B0C.7050301@FreeBSD.org>	<49CF6551.4080301@room52.net> <49CF68C4.8020806@FreeBSD.org> <49D0150E.8050601@room52.net>

Lawrence Stewart wrote:
> Alexander Motin wrote:
>> Lawrence Stewart wrote:
>>> Alexander Motin wrote:
>>>> I can't reproduce neither "Invalid corb size (0)" error, nor the 
>>>> crash in case of it. I have tried to simulate that error, but system 
>>>> handled it correctly. But I have INVARIANTS disabled on my system.
>>>>
>>>> Can you try to disable MSI?
>>>
>>> Setting hw.pci.enable_msix=0 and hw.pci.enable_msi=0 at the loader 
>>> prompt made no difference.
>>
>> It could also be done with hint.hdac.0.msi=0.
>>
>>>> Can you try to move hdac_irq_alloc() call after hdac_rirb_init() in 
>>>> hdac_attach()? May be interrupt shots while something is not yet 
>>>> initialized?
>>>
>>> Running with the following patch made no difference either.
>>>
>>> Any other ideas I could try?
>>
>> I have none. I haven't changed anything significant last time, except 
>> enabling MSI. You can try to investigate both problems: original 
>> "Invalid corb size (0)" and the consequent crash. As I have said, I 
>> can't reproduce none of them. Try to put some debug printfs inside 
>> hdac_get_capabilities(), may be it give some new clues.
>>
> 
> Seems to have been a red herring. I tried a couple of older kernel 
> revisions which made no difference. Booting into windows, the sound card 
> works fine. Then I booted into FreeBSD and it booted fine without the 
> panic, although the hda driver spewed out a heap of error messages like 
> this:
> 
> Mar 30 09:50:12 lstewart-laptop kernel: hdac0: 
> hdac_command_send_internal: TIMEOUT numcmd=1, sent=1, received=0
> Mar 30 09:50:12 lstewart-laptop kernel: hdac0: 
> hdac_command_send_internal: TIMEOUT numcmd=1, sent=1, received=0
> Mar 30 09:50:12 lstewart-laptop kernel: hdac0: Codec #0 is not 
> responding! Probing aborted.
> 
> repeated up to:
> 
> Mar 30 09:50:12 lstewart-laptop kernel: hdac0: 
> hdac_command_send_internal: TIMEOUT numcmd=1, sent=0, received=0
> Mar 30 09:50:12 lstewart-laptop kernel: hdac0: 
> hdac_command_send_internal: TIMEOUT numcmd=1, sent=0, received=0
> Mar 30 09:50:12 lstewart-laptop kernel: hdac0: Codec #14 is not 
> responding! Probing aborted.
> 
> Then I rebooted again and got the panic on every reboot. Weird.
> 
> On a whim, I suspected fishy BIOS settings left over after a BIOS 
> upgrade so I reset all BIOS values to factory defaults and now it seems 
> to be working fine. There are no audio related options in the BIOS, so 
> not sure what was broken but something must not have been happy or must 
> have been left in an inconsistent state somehow.
> 
> Sorry for not having thought of it sooner, but it looks like the case is 
> closed.
> 

So... I just got this panic again, for no discernible reason. Turned the 
laptop on and bam. Rebooted and tried again, same panic. So I tried my 
previous trick: entered the BIOS screen, reset everything to factory 
defaults, saved and exited, and everything is fine once again. I hadn't 
entered the BIOS since the last factory reset, which resolved the issue 
as noted in my previous email.

So, either the BIOS is dodgy, or FreeBSD is tickling something in a bad 
way. I haven't found any evidence to support either theory yet.

As a starting point, I pored over the kernel boot log and noticed this 
line:

Apr  1 16:53:34 lstewart-laptop kernel: ACPI Warning (tbutils-0243): 
Incorrect checksum in table [ASF!] -  C8, should be 5E [20070320]

Could anyone enlighten me as to whether that is a cause for concern? 
Might make the "dodgy BIOS" theory more likely if the incorrect checksum 
is indeed bad.
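
For context on what that warning is checking: the standard ACPI table 
header carries a one-byte checksum, chosen so that all bytes of the table 
sum to zero modulo 256. Below is a minimal sketch of that check, not the 
kernel's actual code. The function names and the header-only demo table 
are hypothetical, and it assumes the two values in the warning are the 
stored checksum byte followed by the value that would make the sum zero.

```python
# Sketch of the ACPI table checksum rule: every byte of the table,
# including the checksum byte itself, must sum to 0 modulo 256.
# Function names and the demo table are illustrative assumptions.

CHECKSUM_OFFSET = 9   # checksum byte position in the standard table header
HEADER_LENGTH = 36    # size of the standard table header in bytes


def table_sum(table: bytes) -> int:
    """Return the byte sum of the table modulo 256 (0 means valid)."""
    return sum(table) & 0xFF


def expected_checksum(table: bytes) -> int:
    """Return the checksum byte value that would make the table sum to 0."""
    stored = table[CHECKSUM_OFFSET]
    return (stored - table_sum(table)) & 0xFF


def build_demo_table() -> bytes:
    """Build a hypothetical header-only table with a valid checksum."""
    header = bytearray(HEADER_LENGTH)
    header[0:4] = b"ASF!"                              # signature
    header[4:8] = HEADER_LENGTH.to_bytes(4, "little")  # table length
    header[8] = 0x20                                   # revision (arbitrary)
    header[CHECKSUM_OFFSET] = expected_checksum(bytes(header))
    return bytes(header)


if __name__ == "__main__":
    good = build_demo_table()
    print("valid table sum:", hex(table_sum(good)))

    # Flip one bit outside the checksum byte to simulate corruption.
    bad = bytearray(good)
    bad[20] ^= 0x01
    print("stored: %02X, should be: %02X"
          % (bad[CHECKSUM_OFFSET], expected_checksum(bytes(bad))))
```

A mismatch only says the table bytes and the checksum byte disagree. It 
does not say which of the two is wrong, or whether the OS actually uses 
the table in question.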

The full verbose boot log is available at: 
http://people.freebsd.org/~lstewart/misc/intel_debug/verbose_boot.txt

Cheers,
Lawrence


