Message-ID: <54A2D3FA.6000404@multiplay.co.uk>
Date: Tue, 30 Dec 2014 16:34:02 +0000
From: Steven Hartland
To: freebsd-stable@freebsd.org
Subject: Re: Creating a bootable ZFS disk?
In-Reply-To: <5AFFE5CE-ABC7-4D9C-B8E7-0AC9C3327D6B@tao.org.uk>

On 30/12/2014 14:08, Dr Josef Karthauser wrote:
> On 29 Dec 2014, at 17:04, Steven Hartland wrote:
>
>> On 29/12/2014 16:02, Dr Josef Karthauser wrote:
>>> On 29 Dec 2014, at 08:36, Steven Hartland wrote:
>>>> That looks like a 10.0 boot not a 10.1 boot, could you confirm and provide a 10.1 boot if that's the case please Joe?
>>> Whoops! Sorry.
>>>
>>> Attached is a verbose boot time dmesg for the 10.1 that causes the problem under load.
>>> I immediately rebooted back onto 10.0, so the (un-verbose) dmesg for that follows.
>>>
>>> Joe
>> Thanks Joe, actually a verbose boot from 10.0 for comparison would be good too.
> Ok - I’m attaching a 10.0, 10.1 verbose boot and a diff of the two.
>
>> Also something to try on the 10.1 to see if it makes any difference, add the following to /boot/loader.conf or run from the loader prompt:
>> hint.ahci.0.msi=1
>>
>> You can also try =0 as well if 1 makes no difference.
> I’ll try these later when the machine’s less busy.

Ah now that's very interesting!

On 10.0 we're only allocating 1 out of the possible 8 MSI vectors, whereas on 10.1 we're allocating all 8.

== 10.0 ==
ahci0: port 0xc000-0xc007,0xb000-0xb003,0xa000-0xa007,0x9000-0x9003,0x8000-0x800f mem 0xfe5ffc00-0xfe5fffff irq 19 at device 17.0 on pci0
ahci0: attempting to allocate 1 MSI vectors (8 supported)
msi: routing MSI IRQ 263 to local APIC 0 vector 58
ahci0: using IRQ 263 for MSI

== 10.1 ==
ahci0: port 0xc000-0xc007,0xb000-0xb003,0xa000-0xa007,0x9000-0x9003,0x8000-0x800f mem 0xfe5ffc00-0xfe5fffff irq 19 at device 17.0 on pci0
ahci0: attempting to allocate 8 MSI vectors (8 supported)
msi: routing MSI IRQ 263 to local APIC 0 vector 64
msi: routing MSI IRQ 264 to local APIC 0 vector 65
msi: routing MSI IRQ 265 to local APIC 0 vector 66
msi: routing MSI IRQ 266 to local APIC 0 vector 67
msi: routing MSI IRQ 267 to local APIC 0 vector 68
msi: routing MSI IRQ 268 to local APIC 0 vector 69
msi: routing MSI IRQ 269 to local APIC 0 vector 70
msi: routing MSI IRQ 270 to local APIC 0 vector 71
ahci0: using IRQs 263-270 for MSI

This change was brought into stable/10 by r260387 and originally came from r256843.

I don't believe there's anything wrong with the change itself, but if this is indeed the cause it could indicate some sort of hardware bug that is triggered when throughput is increased by the use of multiple MSI vectors.

This is supported by the fact that ATI's previous generation HW (SB600) had MSI disabled by r245875 due to a very similar issue.

So given all the evidence so far, hint.ahci.0.msi=1 may well be the fix.

    Regards
    Steve
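
For anyone wanting to make the same comparison on their own verbose boots, here is a rough Python sketch that pulls the "attempting to allocate N MSI vectors" lines out of two dmesg captures and reports devices whose requested vector count changed. The function names (msi_allocations, compare) and the sample strings are illustrative assumptions, and the pattern is only matched against the line format shown in the excerpts above, so other drivers or releases may word it differently.

```python
import re

# Matches the line format seen in the dmesg excerpts above, e.g.
# "ahci0: attempting to allocate 8 MSI vectors (8 supported)".
# Assumption: other drivers/releases may phrase this differently.
MSI_RE = re.compile(
    r"^(?P<dev>\w+): attempting to allocate (?P<want>\d+) MSI vectors "
    r"\((?P<max>\d+) supported\)"
)


def msi_allocations(dmesg_text):
    """Return {device: (requested, supported)} for every matching line."""
    found = {}
    for line in dmesg_text.splitlines():
        m = MSI_RE.match(line.strip())
        if m:
            found[m.group("dev")] = (int(m.group("want")), int(m.group("max")))
    return found


def compare(old_text, new_text):
    """Return {device: (old_requested, new_requested)} where the counts differ."""
    old = msi_allocations(old_text)
    new = msi_allocations(new_text)
    return {
        dev: (old[dev][0], new[dev][0])
        for dev in old.keys() & new.keys()
        if old[dev][0] != new[dev][0]
    }


# Sample input copied from the 10.0 and 10.1 excerpts in this message.
BOOT_10_0 = """\
ahci0: attempting to allocate 1 MSI vectors (8 supported)
msi: routing MSI IRQ 263 to local APIC 0 vector 58
ahci0: using IRQ 263 for MSI
"""
BOOT_10_1 = """\
ahci0: attempting to allocate 8 MSI vectors (8 supported)
msi: routing MSI IRQ 263 to local APIC 0 vector 64
ahci0: using IRQs 263-270 for MSI
"""

if __name__ == "__main__":
    for dev, (before, after) in sorted(compare(BOOT_10_0, BOOT_10_1).items()):
        print(f"{dev}: requested MSI vectors {before} -> {after}")
```

In practice you would read the two saved verbose dmesg files into strings and pass them to compare() instead of the inline samples. A device that shows up here is only a candidate for the hint.ahci.0.msi test described above, not proof of the cause.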