From owner-freebsd-drivers@freebsd.org Sun Jan 14 14:34:15 2018
Subject: Re: Server doesn't boot when 3 PCIe slots are populated
From: Grzegorz Junka <list1@gjunka.com>
To: freebsd-questions@freebsd.org, freebsd-drivers@freebsd.org
Date: Sun, 14 Jan 2018 14:34:11 +0000
Message-ID: <4cd39c52-9bf0-ef44-8335-9b4cf6eb6a6b@gjunka.com>
In-Reply-To: <061ccfb3-ee6a-71a7-3926-372bb17b3171@kicp.uchicago.edu>
List-Id: Writing device drivers for FreeBSD

On 13/01/2018 18:31, Valeri Galtsev wrote:
>
> On 01/13/18 10:21, Grzegorz Junka wrote:
>> Hello,
>>
>> I am installing a FreeBSD server based on a Supermicro H8SML-iF
>> motherboard. It has three PCIe slots, into which I installed two NVMe
>> drives and one Intel I350-T4 network card (with 4 Ethernet ports).
>>
>> I am observing a strange behavior where the system doesn't boot if
>> all three PCIe slots are populated. It shows this message:
>>
>> nvme0: mem 0xfd8fc000-0xfd8fffff irq 24 at device 0.0 on pci1
>> nvme0: controller ready did not become 1 within 30000 ms
>> nvme0: did not complete shutdown within 5 seconds of notification
>>
>> Then I see a kernel panic/dump and the system reboots after 15 seconds.
>>
>> If I remove one card, either one of the NVMe drives or the network
>> card, the system boots fine. Also, if in the BIOS I set PnP OS to YES
>> then it sometimes boots (but not always). If I set PnP OS to NO and
>> all three cards are installed, the system never boots.
>>
>> When the system boots OK I can see that the network card is reported
>> as 4 separate devices on one of the PCIe slots. I tried different
>> NVMe drives as well as changing which device goes into which slot,
>> but the result seems to be the same in every case.
>>
>> What may be the issue? The amount of power drawn by the hardware? Too
>> many devices not supported by the motherboard? Too many interrupts
>> for the FreeBSD kernel to handle?
>
> That would be my first suspicion: either the total power drawn from
> the power supply, or the total power drawn from the PCIe bus power
> leads. Check whether any of the add-on cards have an extra power port
> (many video cards do).
> The card will likely work without the extra power connected, but
> connecting extra power to the card may solve your problem. Next:
> borrow a more powerful power supply and see if that resolves the
> issue. Or temporarily disconnect everything else (like all hard
> drives) and boot with all three cards off a live CD; if that doesn't
> crash, then the power supply is marginally insufficient.

Thanks for the suggestion. The power supply was able to power two NVMe
disks and 6 spinning HDDs without issues in another server, so the
total power should be fine. It may be that the PCIe bus power leads are
causing problems, but then, two NVMe drives wouldn't draw more than
5-9W and the network card even less, while the PCI Express
specification allows much more to be drawn from each slot. In total the
server shouldn't draw more than 50-70W. I am not saying that it's not
the power supply, but I think that's the least likely cause at this
point. I will try another power supply when I find one.

GregJ
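
P.S. For reference, once the box does boot (e.g. with one of the cards
pulled), the standard FreeBSD base-system tools below show how the
slots and the NVMe controllers enumerate; "nvme" is just the driver
name here and the exact output will of course differ on this hardware:

  pciconf -lvc          # list PCI devices with vendor strings and PCIe capability/link info
  nvmecontrol devlist   # list NVMe controllers and their namespaces
  dmesg | grep nvme     # review the probe/attach messages from the last boot

A verbose boot (type "boot -v" at the loader prompt, or set
boot_verbose="YES" in /boot/loader.conf) also prints more detail about
where the nvme probe stalls.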