Date: Mon, 15 Jan 2018 19:59:35 +0300
From: Mehmet Erol Sanliturk <m.e.sanliturk@gmail.com>
To: galtsev@kicp.uchicago.edu
Cc: Grzegorz Junka <list1@gjunka.com>, FreeBSD Questions Mailing List <freebsd-questions@freebsd.org>, Warner Losh <imp@bsdimp.com>, freebsd-drivers@freebsd.org
Subject: Re: Server doesn't boot when 3 PCIe slots are populated
Message-ID: <CAOgwaMuah3D46qu9efp_nNA7EDoFRyO-7KS9%2BxwJ5xkGBHxi%2Bg@mail.gmail.com>
In-Reply-To: <57715.108.68.169.115.1516033864.squirrel@cosmo.uchicago.edu>
References: <ecce3fa6-3909-0947-685c-8a412684e99c@gjunka.com> <CAOgwaMsf9zByJYhL3KqpUMW5qKAzQEHpDWcwejY-uK=9swWbUQ@mail.gmail.com> <3d0ad00c-5214-71b0-017b-c2d5ba608e37@gjunka.com> <CAOgwaMsOKrGfGNmRt-C9Skjssj8JPAtFpk8bwG9v55LmaWdoVw@mail.gmail.com> <8df1e967-01e0-d3c2-e14c-64c7fc8c66b0@gjunka.com> <CANCZdfqZ-dogHXBdoyMPLOPs_R-vD%2BwLM-r6sm6ypesd0Nvp4A@mail.gmail.com> <0e582bdb-e1f9-438c-3da2-2bcdc950aab5@gjunka.com> <CAOgwaMvusKzt%2BYvmKeuyox0c=wgqEv9UP475Eacm2B0OkF7OrQ@mail.gmail.com> <57715.108.68.169.115.1516033864.squirrel@cosmo.uchicago.edu>
On Mon, Jan 15, 2018 at 7:31 PM, Valeri Galtsev <galtsev@kicp.uchicago.edu> wrote:

>
> On Mon, January 15, 2018 3:44 am, Mehmet Erol Sanliturk wrote:
> > On Mon, Jan 15, 2018 at 9:44 AM, Grzegorz Junka <list1@gjunka.com> wrote:
> >
> >> On 15/01/2018 06:18, Warner Losh wrote:
> >>
> >>> On Jan 14, 2018 11:05 PM, "Grzegorz Junka" <list1@gjunka.com> wrote:
> >>>
> >>> On 14/01/2018 16:18, Mehmet Erol Sanliturk wrote:
> >>>
> >>> On Sun, Jan 14, 2018 at 5:46 PM, Grzegorz Junka <list1@gjunka.com> wrote:
> >>>
> >>> On 13/01/2018 17:56, Mehmet Erol Sanliturk wrote:
> >>>
> >>> On Sat, Jan 13, 2018 at 7:21 PM, Grzegorz Junka <list1@gjunka.com> wrote:
> >>>
> >>> Hello,
> >>>
> >>> I am installing a FreeBSD server based on Supermicro H8SML-iF.
> >>> There are three PCIe slots to which I installed 2 NVMe drives and
> >>> one network card Intel I350-T4 (with 4 Ethernet slots).
> >>>
> >>> I am observing a strange behavior where the system doesn't boot if
> >>> all three PCIe slots are populated. It shows this message:
> >>>
> >>> nvme0: <Generic NVMe Device> mem 0xfd8fc000-0xfd8fffff irq 24 at device 0.0 on pci1
> >>> nvme0: controller ready did not become 1 within 30000 ms
> >>> nvme0: did not complete shutdown within 5 seconds of notification
> >>>
> >>> The I see a kernel panic/dump and the system reboots after 15 seconds.
> >>>
> >>> If I remove one card, either one of the NVMe drives or the network
> >>> card, the system boots fine. Also, if in BIOS I set PnP OS to YES
> >>> then sometimes it boots (but not always). If I set PnP OS to NO,
> >>> and all three cards are installed, the system never boots.
> >>>
> >>> When the system boots OK I can see that the network card is
> >>> reported as 4 separate devices on one of the PCIe slots. I tried
> >>> different NVMe drives as well as changing which device is
> >>> installed to which slot but the result seems to be the same in any
> >>> case.
> >>>
> >>> What may be the issue? Amount of power drawn by the hardware? Too
> >>> many devices not supported by the motherboard? Too many interrupts
> >>> for the FreeBSD kernel to handle?
> >>>
> >>> Any help would be greatly appreciated.
> >>>
> >>> GregJ
> >>>
> >>> _______________________________________________
> >>>
> >>> From my experience from other trade marked main boards , an
> >>> action may be to check manual of your server board to see
> >>> whether there are rules about use of these slots : Sometimes
> >>> differently shaped slots are supplied with same ports : If one
> >>> slot is occupied , the other slot should be left open , or
> >>> rules about not to insert such a kind of device into a slot ,
> >>> for example , graphic cards .
> >>>
> >>> Mehmet Erol Sanliturk
> >>>
> >>> I checked the manual but couldn't find any restrictions regarding
> >>> PCIe ports. It only says how many lanes are available in each
> >>> slot. Would there be any obvious BIOS setting that could cause
> >>> this issue? I tried after resetting BIOS to default settings but
> >>> maybe something is set incorrectly by default?
> >>>
> >>> GregJ
> >>> _______________________________________________
> >>>
> >>> http://www.supermicro.com/Aplus/motherboard/Opteron3000/SR56x0/H8SML-iF.cfm
> >>> H8SML-iF
> >>>
> >>> On the above page , click "OS Compatibility"
> >>>
> >>> On the following page , click "SR5650"
> >>>
> >>> http://www.supermicro.com/Aplus/support/resources/OS/OS_Comp_SR5650.cfm
> >>> OS Compatibility Chart
> >>>
> >>> On the column ( third )
> >>>
> >>> H8SML-7F
> >>> H8SML-7
> >>> H8SML-iF
> >>> H8SML-i
> >>>
> >>> there listed only
> >>>
> >>> FreeBSD 8.0
> >>> FreeBSD 9.1
> >>>
> >>> From this list , it may be said that , this mother board date
> >>> is old , means , it seems that the new OS versions are not
> >>> tested after currently tested OS versions .
> >>>
> >>> To check interaction between operating system and your
> >>> Supermicro H8SML-iF , select one of the suitable operating
> >>> system ( Unix class OSes are more suitable ) for you and
> >>> tested on this card , and try to install it as you like your
> >>> installed components . If it boots successfully , it means
> >>> that there is an incompatibility between your FreeBSD and the
> >>> main board . If no one of them boots , then you may conclude
> >>> that , there is a problem in your settings .
> >>>
> >>> BIOS settings are important , because , OS communicates with
> >>> the main board through these settings .
> >>>
> >>> In manual ( downloaded from the above page :
> >>> Manual Revision 1.0c
> >>> Release Date: March 12, 2014 ) , page 4-9 , "PCI/PnP
> >>> Configuration" is defined .
> >>> If PnP is selected YES. OS adjusts some device settings . If
> >>> NO is selected , BIOS adjusts some device settings . When BIOS
> >>> adjusted device settings are not conforming to OS parameters ,
> >>> the result will be "FAIL" .
> >>>
> >>> Therefore , more suitable selection is YES .
> >>>
> >>> Another point is that , there are many more BIOS selectable
> >>> parameters and jumpers about PCI slots and others .
> >>> There are some BIOS settings for PCI slots :
> >>>
> >>> PCI X4 Slot 6 ( page 4-9 )
> >>> PCI x8 Slot 7 ( page 4-10 )
> >>>
> >>> Please review these BIOS settings in your manual and set them
> >>> with respect to your requirements .
> >>>
> >>> Thanks Mehmet for looking into this. It's an old motherboard but
> >>> my point is that it boots fine when either: one NVMe and the
> >>> network card, or both NVMe are installed, but not when all three
> >>> are installed. How would that be related to FreeBSD compatibility?
> >>> The chipset and all devices that I am trying to install are
> >>> supported by FreeBSD 11.x.
> >>>
> >>> I just tried booting into a Debian live system and it also didn't
> >>> enumerate NVMe drives properly. This means that it's not FreeBSD
> >>> related and is no longer relevant for this list. I will try to
> >>> play with BIOS settings to see if I can make it work that way.
> >>> Thanks for all the help.
> >>>
> >>> Nvme drives are weird about power. I distrust the power estimate of
> >>> 5-9w earlier in the thread... given the oddity with debian, it's not
> >>> too crazy to think that. How far does FreeBSD boot though?
> >>>
> >> I tried with a different power supply but the outcome was exactly the
> >> same. Sometimes FreeBSD boots fine but one of the NVMe drives is not
> >> visible (i.e. dmesg grep shows only one NVMe). When it doesn't work it
> >> boots up to the point of enumerating drives (SATA, USB, NVMe). Then it
> >> stops at the first NVMe and reboots.
> >>
> >> The funny thing is that very often it's enough to pull out one of the
> >> cards and put it back in. Then the system boots fine with all three
> >> cards. I had that a few times. Once it's booted it works, I can restart
> >> the system and it boots every time. As soon as I power off, unplug from
> >> the power main, wait a few minutes and power it on again, the issue
> >> comes back - can't boot as NVMe can't be enumerated.
> >>
> >> I though it might be caused by the hardware being too cold. I left the
> >> server once overnight but it didn't boot up, it was trying and
> >> restarting the whole night.
> >>
> >> GregJ
> >>
> >> _______________________________________________
> >>
> >
> > The above explanation brings mind to the "impedance mismatch in
> > electronics" problem .
>
> Hm, I wouldn't say so. First of all, I will seriously doubt that sane
> cards are out of specs as far as impedance is concerned.
>
> But before going further, let's make sure we talk about the same thing. I
> assume impedance mismatch is what is related to impedance of the load
> attached to transmission line to be different from impedance of
> transmission line itself. In such case part of transmitted signal is
> reflected from the load back into transmission line. This can make mess as
> transmitted signal is mixed with this reflected at different positions of
> the loads along the same transmission line. One has to have really large
> mismatch (over 20% at least) to make that matter. Many of us remember this
> in at least two computer related cases: 1. we used terminators at the end
> of SCSI cables (or attached "self-terminating SCSI device to the end of
> line). 2. In some system boards in which memory buses had no terminators
> the manual would say to populate slots beginning from the fartherst away
> from CPU (to defeat reflection from open end of memory bus lines).
>
> I have never heard of anything like that on PCI express bus. If I am
> wrong, could you give some pointer so I can read about it.
>
> Thanks in advance for pointers! (I know: you learn something every day -
> which I bet I am about to ;-)
>
> Valeri
>
> > ( Please search
> >
> > impedance mismatch in electronics
> > impedance matching in electronics
> >
> > in Internet if you want explanations about them . )
> >
> > When all of these cards are inserted into slots simultaneously , their
> > accumulated electronic effect may distort behaviour of your mother board
> > circuits or attached card circuit(s) .
> >
> > Therefore , if you can find another NVMe and/or network card , please
> > test their effect .
> > Such tests may be inconclusive because mother board circuits may be
> > affected negatively from "properly" operating add on cards when they are
> > inserted together .
> >
> > If it is feasible for you , you may use USB attached network card(s) to
> > eliminate network card attachment .
> > Or you may use a more capable one NVMe card instead of two smaller NVMe
> > cards , or you may use only one of them , or/and select an SATA SSD .
> > Such a choice would save your investment and produces a working server
> > with a "little" loss when compared to "all" .
> >
> > Mehmet Erol Sanliturk
> > _______________________________________________
> > freebsd-questions@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-questions
> > To unsubscribe, send any mail to
> > "freebsd-questions-unsubscribe@freebsd.org"
> >
>
> ++++++++++++++++++++++++++++++++++++++++
> Valeri Galtsev
> Sr System Administrator
> Department of Astronomy and Astrophysics
> Kavli Institute for Cosmological Physics
> University of Chicago
> Phone: 773-702-4247
> ++++++++++++++++++++++++++++++++++++++++
>

The problem of "impedance matching" arises between any two interacting
circuits: whenever one circuit feeds its "output" to another circuit as
"input", the problem exists regardless of the purpose or kind of the
circuits. Obviously, the behaviour is not exactly the same in every case.

If you search the Internet for the following phrase, you will find a large
number of links:

impedance matching circuit design

If we consider the slots of a computer main board, the following may occur:

Assume a slot has a threshold voltage level for triggering input into an
add-on card, i.e., the add-on card reacts only when it senses a voltage
equal to or greater than that level. Lower voltages will not trigger the
add-on card.

Assume one add-on card works when installed alone.
Assume a second add-on card also works when installed alone.

When both of these add-on cards are inserted into slots, the power they
draw together will lower the voltage level of the surrounding circuit more
than a single card does. If this lowered voltage is below the threshold
level of the added cards (one of them, or both of them), the affected
card(s) will not sense the signals from the surrounding circuits and
therefore will not respond to signals requesting an action.

In one of the previous messages,

https://lists.freebsd.org/pipermail/freebsd-questions/2018-January/280455.html

it is said that

"
I am observing a strange behavior where the system doesn't boot if all
three PCIe slots are populated. It shows this message:

nvme0: <Generic NVMe Device> mem 0xfd8fc000-0xfd8fffff irq 24 at device 0.0 on pci1
nvme0: controller ready did not become 1 within 30000 ms
nvme0: did not complete shutdown within 5 seconds of notification

The I see a kernel panic/dump and the system reboots after 15 seconds.

If I remove one card, either one of the NVMe drives or the network card,
the system boots fine.
"

The message above may be a good example of this effect.

Mehmet Erol Sanliturk
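
P.S. The threshold argument above can be put into a minimal sketch. It assumes the simplest possible model: every card draws current from one supply rail through a single shared series resistance, so the rail voltage drops by Ohm's law as cards are added. The function names and all numbers (3.3 V rail, 0.05 ohm path, 3.0 V threshold, 2.5 A per card) are hypothetical and chosen only for illustration; none of them were measured on the H8SML-iF.

```python
# Hypothetical model: all cards share one supply rail behind a single
# series resistance, so the rail voltage sags as more cards draw current.
# Every name and number here is an illustrative assumption, not a measurement.


def slot_voltage(supply_v, path_resistance_ohm, card_currents_a):
    """Return the rail voltage left at the slots after the Ohm's-law drop."""
    total_current_a = sum(card_currents_a)
    return supply_v - path_resistance_ohm * total_current_a


def cards_respond(supply_v, path_resistance_ohm, card_currents_a, threshold_v):
    """True if the sagged rail voltage still meets the cards' threshold."""
    voltage = slot_voltage(supply_v, path_resistance_ohm, card_currents_a)
    return voltage >= threshold_v


if __name__ == "__main__":
    SUPPLY_V = 3.3              # assumed nominal rail voltage
    PATH_RESISTANCE_OHM = 0.05  # assumed shared path resistance
    THRESHOLD_V = 3.0           # assumed minimum voltage a card needs
    CARD_CURRENT_A = 2.5        # assumed current drawn by each card

    for count in (1, 2, 3):
        currents = [CARD_CURRENT_A] * count
        voltage = slot_voltage(SUPPLY_V, PATH_RESISTANCE_OHM, currents)
        ok = cards_respond(SUPPLY_V, PATH_RESISTANCE_OHM, currents, THRESHOLD_V)
        print(f"{count} card(s): {voltage:.3f} V, respond={ok}")
```

With these made-up numbers, one or two cards stay above the threshold and the third pushes the rail below it. That only shows the shape of the argument; it says nothing about what actually happens on the real board.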