Date:      Mon, 15 Jan 2018 08:43:31 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        Grzegorz Junka <list1@gjunka.com>
Cc:        freebsd-questions@freebsd.org, freebsd-drivers@freebsd.org
Subject:   Re: Server doesn't boot when 3 PCIe slots are populated
Message-ID:  <CANCZdfp1Hi9Zsnz7snFuUZGaVCNi_KciJspbX%2B7FQ5%2BRUyEFyg@mail.gmail.com>
In-Reply-To: <0e582bdb-e1f9-438c-3da2-2bcdc950aab5@gjunka.com>
References:  <ecce3fa6-3909-0947-685c-8a412684e99c@gjunka.com> <CAOgwaMsf9zByJYhL3KqpUMW5qKAzQEHpDWcwejY-uK=9swWbUQ@mail.gmail.com> <3d0ad00c-5214-71b0-017b-c2d5ba608e37@gjunka.com> <CAOgwaMsOKrGfGNmRt-C9Skjssj8JPAtFpk8bwG9v55LmaWdoVw@mail.gmail.com> <8df1e967-01e0-d3c2-e14c-64c7fc8c66b0@gjunka.com> <CANCZdfqZ-dogHXBdoyMPLOPs_R-vD%2BwLM-r6sm6ypesd0Nvp4A@mail.gmail.com> <0e582bdb-e1f9-438c-3da2-2bcdc950aab5@gjunka.com>

On Sun, Jan 14, 2018 at 11:44 PM, Grzegorz Junka <list1@gjunka.com> wrote:

>
> On 15/01/2018 06:18, Warner Losh wrote:
>
>
>
> On Jan 14, 2018 11:05 PM, "Grzegorz Junka" <list1@gjunka.com> wrote:
>
>
> On 14/01/2018 16:18, Mehmet Erol Sanliturk wrote:
>
>>
>>
>> On Sun, Jan 14, 2018 at 5:46 PM, Grzegorz Junka <list1@gjunka.com> wrote:
>>
>>
>>     On 13/01/2018 17:56, Mehmet Erol Sanliturk wrote:
>>
>>
>>
>>         On Sat, Jan 13, 2018 at 7:21 PM, Grzegorz Junka <list1@gjunka.com> wrote:
>>
>>             Hello,
>>
>>             I am installing a FreeBSD server based on Supermicro H8SML-iF.
>>             There are three PCIe slots, in which I installed 2 NVMe drives and
>>             one Intel I350-T4 network card (with 4 Ethernet ports).
>>
>>             I am observing a strange behavior where the system doesn't boot if
>>             all three PCIe slots are populated. It shows this message:
>>
>>             nvme0: <Generic NVMe Device> mem 0xfd8fc000-0xfd8fffff irq 24 at device 0.0 on pci1
>>             nvme0: controller ready did not become 1 within 30000 ms
>>             nvme0: did not complete shutdown within 5 seconds of notification
>>
>>             Then I see a kernel panic/dump and the system reboots after 15 seconds.
>>
>>             If I remove one card, either one of the NVMe drives or the network
>>             card, the system boots fine. Also, if in BIOS I set PnP OS to YES
>>             then sometimes it boots (but not always). If I set PnP OS to NO,
>>             and all three cards are installed, the system never boots.
>>
>>             When the system boots OK I can see that the network card is
>>             reported as 4 separate devices on one of the PCIe slots. I tried
>>             different NVMe drives as well as changing which device is
>>             installed to which slot but the result seems to be the same in any
>>             case.
>>
>>             What may be the issue? Amount of power drawn by the hardware? Too
>>             many devices not supported by the motherboard? Too many interrupts
>>             for the FreeBSD kernel to handle?
>>
>>             Any help would be greatly appreciated.
>>
>>             GregJ
>>
>>             _______________________________________________
>>
>>
>>
>>
>>
>>         In my experience with main boards from other vendors, it is worth
>>         checking the manual of your server board for rules about the use
>>         of these slots. Sometimes differently shaped slots are wired to the
>>         same port: if one slot is occupied, the other must be left empty.
>>         There may also be rules against inserting certain kinds of device,
>>         for example graphics cards, into a particular slot.
>>
>>
>>         Mehmet Erol Sanliturk
>>
>>
>>     I checked the manual but couldn't find any restrictions regarding
>>     PCIe ports. It only says how many lanes are available in each
>>     slot. Would there be any obvious BIOS setting that could cause
>>     this issue? I tried after resetting BIOS to default settings but
>>     maybe something is set incorrectly by default?
>>
>>     GregJ
>>     _______________________________________________
>>
>>
>>
>>
>>
>> http://www.supermicro.com/Aplus/motherboard/Opteron3000/SR56x0/H8SML-iF.cfm
>> H8SML-iF
>>
>>
>> On the above page , click "OS Compatibility"
>>
>>
>> On the following page , click "SR5650"
>>
>> http://www.supermicro.com/Aplus/support/resources/OS/OS_Comp_SR5650.cfm
>> OS Compatibility Chart
>>
>>
>> In the third column, for the boards
>>
>> H8SML-7F
>> H8SML-7
>> H8SML-iF
>> H8SML-i
>>
>> only the following are listed:
>>
>> FreeBSD 8.0
>> FreeBSD 9.1
>>
>> From this list, it appears that the motherboard is old and that OS
>> versions newer than these have not been tested on it.
>>
>>
>> To check the interaction between the operating system and your Supermicro
>> H8SML-iF, pick one of the operating systems tested on this board (Unix-class
>> OSes are more suitable) and try to install it with your components
>> installed the way you want them. If it boots successfully, there is an
>> incompatibility between FreeBSD and the main board. If none of them boots,
>> you may conclude that there is a problem in your settings.
>>
>>
>> BIOS settings are important because the OS communicates with the main
>> board through these settings.
>>
>>
>> The manual (downloaded from the above page: Manual Revision 1.0c,
>> Release Date: March 12, 2014) describes "PCI/PnP Configuration" on
>> page 4-9. If PnP is set to YES, the OS adjusts some device settings. If it
>> is set to NO, the BIOS adjusts them. When the settings chosen by the BIOS
>> do not conform to the OS parameters, the result will be "FAIL".
>>
>> Therefore, the more suitable selection is YES.
>>
>>
>> Another point is that there are many more BIOS-selectable parameters
>> and jumpers for PCI slots and other components.
>> There are some BIOS settings for PCI slots:
>>
>> PCI x4 Slot 6 (page 4-9)
>> PCI x8 Slot 7 (page 4-10)
>>
>>
>>
>> Please review these BIOS settings in your manual and set them
>> according to your requirements.
>>
>>
> Thanks Mehmet for looking into this. It's an old motherboard but my point
> is that it boots fine when either: one NVMe and the network card, or both
> NVMe are installed, but not when all three are installed. How would that be
> related to FreeBSD compatibility? The chipset and all devices that I am
> trying to install are supported by FreeBSD 11.x.
>
> I just tried booting into a Debian live system and it also didn't
> enumerate NVMe drives properly. This means that it's not FreeBSD related
> and is no longer relevant for this list. I will try to play with BIOS
> settings to see if I can make it work that way. Thanks for all the help.
>
>
>
> NVMe drives are weird about power. I distrust the power estimate of 5-9 W
> earlier in the thread... given the oddity with Debian, it's not too crazy
> to think that. How far does FreeBSD boot though?
>
>
> I tried with a different power supply but the outcome was exactly the
> same. Sometimes FreeBSD boots fine but one of the NVMe drives is not
> visible (i.e. dmesg grep shows only one NVMe). When it doesn't work it
> boots up to the point of enumerating drives (SATA, USB, NVMe). Then it
> stops at the first NVMe and reboots.
>

Any panic message / traceback, or just a system reset?


> The funny thing is that very often it's enough to pull out one of the
> cards and put it back in. Then the system boots fine with all three cards.
> I had that a few times. Once it's booted it works, I can restart the system
> and it boots every time. As soon as I power off, unplug from the power
> main, wait a few minutes and power it on again, the issue comes back -
> can't boot as NVMe can't be enumerated.
>

Sounds like misaligned cards then...

Warner


> I thought it might be caused by the hardware being too cold. I left the
> server once overnight but it didn't boot up, it was trying and restarting
> the whole night.
>
> GregJ
>
>
>
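
For anyone comparing boots in a case like this, here is a minimal sketch of a check on saved dmesg output. It assumes the text has been captured to a string and that each kernel message sits on one line, as in the nvme0 messages quoted at the start of the thread. The function name summarize_nvme and both regular expressions are hypothetical and written only for this illustration; they are not part of FreeBSD or of any tool mentioned above.

```python
import re

# Attach line, as quoted in the thread, e.g.
#   nvme0: <Generic NVMe Device> mem 0xfd8fc000-0xfd8fffff irq 24 at device 0.0 on pci1
ATTACH_RE = re.compile(r"^(nvme\d+): <[^>]*>", re.MULTILINE)

# The two failure messages quoted in the thread (timeouts may differ).
ERROR_RE = re.compile(
    r"^(nvme\d+): (controller ready did not become 1 within \d+ ms"
    r"|did not complete shutdown within \d+ seconds of notification)",
    re.MULTILINE,
)


def summarize_nvme(dmesg_text):
    """Return (attached, failed): controller names that attached, and
    those that logged one of the failure messages above."""
    attached = sorted(set(ATTACH_RE.findall(dmesg_text)))
    failed = sorted({name for name, _msg in ERROR_RE.findall(dmesg_text)})
    return attached, failed


if __name__ == "__main__":
    # Sample built from the messages reported in the original post.
    sample = (
        "nvme0: <Generic NVMe Device> mem 0xfd8fc000-0xfd8fffff "
        "irq 24 at device 0.0 on pci1\n"
        "nvme0: controller ready did not become 1 within 30000 ms\n"
        "nvme0: did not complete shutdown within 5 seconds of notification\n"
    )
    attached, failed = summarize_nvme(sample)
    print("attached:", attached)
    print("failed:", failed)
```

Running it on the output of each boot attempt would show at a glance whether both controllers attached or whether one of them timed out, which is the difference described between the good and bad boots.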


