Date:      Thu, 24 May 2012 17:56:00 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        dane foster <dene@ilovedene.com>
Cc:        freebsd-hackers@freebsd.org, Mark Felder <feld@feld.me>, freebsd-questions@freebsd.org
Subject:   Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash
Message-ID:  <CAJ-VmokWTKdTOVF7fXnkERbgNurtGZB5OHV8QDA7TzXEAFVWQA@mail.gmail.com>
In-Reply-To: <62F1D149-FC1C-4E00-98FD-DF6C46A5DC55@ilovedene.com>
References:  <op.wbwe9s0k34t2sn@tech304> <op.wen3bwws34t2sn@tech304> <490F2075-3E4D-4F85-9935-937CED8FB10B@averesystems.com> <op.wen42clw34t2sn@tech304> <CAJ-Vmoneopo8xNpThbewfE2tg6HrdH74DXurO38P_aVs=YS9%2BA@mail.gmail.com> <op.wete9wbq34t2sn@tech304> <62F1D149-FC1C-4E00-98FD-DF6C46A5DC55@ilovedene.com>

Hi,

You guys now absolutely, positively have enough information for a PR.

It's still not clear whether it's a device/interrupt layer issue in
FreeBSD, or whether VMware is doing something wrong with how it
implements shared interrupts, or a bit of both.

Adrian

On 24 May 2012 13:54, dane foster <dene@ilovedene.com> wrote:
> Hey all,
>
> On 25/05/2012, at 1:47 AM, Mark Felder wrote:
>
>> On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd <adrian@freebsd.org> wrote:
>>
>>> Hi,
>>>
>>> can you please, -please- file a PR? And place all of the above
>>> information in it so we don't lose it?
>>>
>>
>> I'd be glad to post a PR and assist in helping to get it permanently
>> fixed. I certainly don't want this data to get lost and honestly our
>> business uses FreeBSD on VMWare so much that we really need a permanent
>> fix as much as anyone else :-)
>>
>> The reason I've hesitated to post a PR so far is that I didn't have any
>> truly useful or concrete evidence of where the problem lies. After Dane
>> Foster contacted me and told me he could recreate the crash on demand
>> with his workload it was easier to narrow things down. The suggestion
>> that it was an interrupts issue (by possibly Bjoern Zeeb?) and Dane's
>> discovery that his crashes ceased when em0 and mpt0 share an IRQ, but
>> em0 is completely unused was starting to prove there is some strong
>> evidence here in favor of the interrupts issue.
>>
>> Dane, what's the status on your end? Has your fix still been successful?
>> Is it also stable if you simply set hint.mpt.0.msi_enable="1" ?
>>
>
> The situation I've got that's stable now is:
>
> hw.pci.enable_msi="0"
> hw.pci.enable_msix="0"
>
> in /boot/loader.conf
>
> and:
>
> samael:~:% vmstat -i                                          [ 6:31PM]
> interrupt                          total       rate
> irq1: atkbd0                           6          0
> irq18: em0 mpt0                  3061100         15
> irq19: em1                       6891706         35
> cpu0: timer                    166383735        868
> cpu1: timer                    166382123        868
> cpu3: timer                    166382123        868
> cpu2: timer                    166382121        868
> Total                          675482914       3525
>
> Not using em0. This works for 8 (FreeBSD samael.slush.ca 8.3-STABLE
> FreeBSD 8.3-STABLE #1: Mon May  7 11:51:03 NZST 2012
> root@samael.slush.ca:/usr/obj/usr/src/sys/DENE  amd64).
>
> Neither of those settings on its own seems to stop it from happening.
>
> The 9 box I've tried this on still hangs almost every time I run
> handbrake, no matter whether MSI/MSIX is enabled, or I have separate IRQs
> for mpt0 and em0/1.
>
> I can cause the hang mostly on demand, but not quite sure what
> information to provide from the hung system. If somebody can let me know
> what they need, including root access, I can make that happen.
>
> Cheers,
>
> Dane
>
>
>
>>
>> Thanks!
>
>
>
>
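
For anyone collecting data for the PR, the shared-IRQ condition discussed in this thread (em0 and mpt0 both on irq18) can be spotted mechanically in `vmstat -i` output. The following is only a rough sketch, not a tool anyone in the thread used: the function name `shared_irqs` is hypothetical, and it assumes the output has exactly the column layout quoted above (name, device list, total, rate). Other FreeBSD versions may format these lines differently.

```python
import re

def shared_irqs(vmstat_output):
    """Return {irq_name: [devices]} for each irq line that lists more than one device.

    Assumes lines shaped like "irq18: em0 mpt0    3061100    15",
    as in the vmstat -i output quoted in this thread.
    """
    shared = {}
    for line in vmstat_output.splitlines():
        # irq name, device list, total count, rate
        m = re.match(r"\s*(irq\d+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$", line)
        if not m:
            continue  # header, cpuN timer lines, and Total are skipped
        devices = m.group(2).split()
        if len(devices) > 1:
            shared[m.group(1)] = devices
    return shared

# Sample input: the vmstat -i output from the quoted message.
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                           6          0
irq18: em0 mpt0                  3061100         15
irq19: em1                       6891706         35
cpu0: timer                    166383735        868
cpu1: timer                    166382123        868
cpu3: timer                    166382123        868
cpu2: timer                    166382121        868
Total                          675482914       3525
"""

if __name__ == "__main__":
    for irq, devices in shared_irqs(SAMPLE).items():
        print(irq, "is shared by", ", ".join(devices))
```

On a live system the same function could be fed the text captured from running `vmstat -i`, which would make it easy to record in the PR which devices shared an interrupt line at the time of each hang.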


