Date:      Tue, 30 Dec 2014 16:34:02 +0000
From:      Steven Hartland <killing@multiplay.co.uk>
To:        freebsd-stable@freebsd.org
Subject:   Re: Creating a bootable ZFS disk?
Message-ID:  <54A2D3FA.6000404@multiplay.co.uk>
In-Reply-To: <5AFFE5CE-ABC7-4D9C-B8E7-0AC9C3327D6B@tao.org.uk>
References:  <54a048f2.45c1c20a.6ffd.ffffe6d7SMTPIN_ADDED_BROKEN@mx.google.com> <54A062FE.6020500@multiplay.co.uk> <54A067D0.4050606@multiplay.co.uk> <349F0A87-5F85-4367-9A5C-E77DBFA16588@karthauser.co.uk> <7E1DA790-822F-4253-A3F6-1E5F5EFFEE04@karthauser.co.uk> <54A1129F.3040004@multiplay.co.uk> <C0A39010-8EBD-46C6-8AA5-B7970DF5280B@tao.org.uk> <54A18986.4000002@multiplay.co.uk> <5AFFE5CE-ABC7-4D9C-B8E7-0AC9C3327D6B@tao.org.uk>


On 30/12/2014 14:08, Dr Josef Karthauser wrote:
> On 29 Dec 2014, at 17:04, Steven Hartland <killing@multiplay.co.uk> wrote:
>
>> On 29/12/2014 16:02, Dr Josef Karthauser wrote:
>>> On 29 Dec 2014, at 08:36, Steven Hartland <killing@multiplay.co.uk> wrote:
>>>> That looks like a 10.0 boot not a 10.1 boot could you confirm and provide a 10.1 boot if thats the case please Joe?
>>> Whoops! Sorry.
>>>
>>> Attached is a verbose boot time dmesg for the 10.1 that causes the problem under load.
>>> I immediately rebooted back onto 10.0, so the (un-verbose) dmesg for that follows.
>>>
>>> Joe
>> Thanks Joe actually a verbose boot from 10.0 for comparison would good too.
> Ok - I’m attaching a 10.0, 10.1 verbose boot and a diff of the two.
>
>> Also something to try on the 10.1 to see if it makes any difference, add the following to /boot/loader.conf or run from the loader prompt:
>> hint.ahci.0.msi=1
>>
>> You can also try =0 as well if 1 makes no difference.
> I’ll try these later when the machine’s less busy.
Ah now that's very interesting!

On 10.0 we're only allocating 1 of the 8 possible MSI vectors, whereas
on 10.1 we're allocating all 8.

== 10.0 ==
ahci0: <ATI IXP700 AHCI SATA controller> port 
0xc000-0xc007,0xb000-0xb003,0xa000-0xa007,0x9000-0x9003,0x8000-0x800f 
mem 0xfe5ffc00-0xfe5fffff irq 19 at device 17.0 on pci0
ahci0: attempting to allocate 1 MSI vectors (8 supported)
msi: routing MSI IRQ 263 to local APIC 0 vector 58
ahci0: using IRQ 263 for MSI

== 10.1 ==
ahci0: <AMD SB7x0/SB8x0/SB9x0 AHCI SATA controller> port 
0xc000-0xc007,0xb000-0xb003,0xa000-0xa007,0x9000-0x9003,0x8000-0x800f 
mem 0xfe5ffc00-0xfe5fffff irq 19 at device 17.0 on pci0
ahci0: attempting to allocate 8 MSI vectors (8 supported)
msi: routing MSI IRQ 263 to local APIC 0 vector 64
msi: routing MSI IRQ 264 to local APIC 0 vector 65
msi: routing MSI IRQ 265 to local APIC 0 vector 66
msi: routing MSI IRQ 266 to local APIC 0 vector 67
msi: routing MSI IRQ 267 to local APIC 0 vector 68
msi: routing MSI IRQ 268 to local APIC 0 vector 69
msi: routing MSI IRQ 269 to local APIC 0 vector 70
msi: routing MSI IRQ 270 to local APIC 0 vector 71
ahci0: using IRQs 263-270 for MSI

This change was brought into stable/10 by r260387 and originally came 
from r256843.

I don't believe there's anything wrong with the change itself, but if it
is indeed the cause, it could indicate a hardware bug that only shows up
when the use of multiple MSI vectors increases throughput.

This theory is strengthened by the fact that ATI's previous generation
HW (SB600) had MSI disabled by r245875 due to a very similar issue.

So given all the evidence so far, hint.ahci.0.msi=1 may well be the fix.
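
For reference, a minimal sketch of setting the hint persistently. It
assumes the affected controller attaches as ahci0, as in the dmesg
excerpts above. Per ahci(4), a value of 0 disables MSI, 1 uses a single
vector, and 2 (the default) uses multiple vectors where supported:

```
# /boot/loader.conf
# Limit ahci0 to a single MSI vector (the behaviour seen on 10.0).
hint.ahci.0.msi="1"
```

To test it once without editing the file, the same value can be set at
the loader prompt with "set hint.ahci.0.msi=1" before booting. A verbose
boot should then show ahci0 attempting to allocate 1 MSI vector rather
than 8.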

     Regards
     Steve


