Date: Tue, 2 Jun 2015 22:06:48 -0500
From: Sean Kelly <smkelly@smkelly.org>
To: Jim Harris <jim.harris@gmail.com>
Cc: FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Subject: Re: 10.1 NVMe kernel panic
Message-ID: <EF729BA5-4D1A-47F6-AF55-DE82A49D46C4@smkelly.org>
In-Reply-To: <CAJP=Hc-w_J9wAJXqhtzdGa7fQ0bqFcSXm0sGi0Xnue8jqXOw5A@mail.gmail.com>
References: <90B2D392-01FD-415A-B3D9-3CEDFC8373C4@smkelly.org> <CAJP=Hc-w_J9wAJXqhtzdGa7fQ0bqFcSXm0sGi0Xnue8jqXOw5A@mail.gmail.com>
Jim,

Thanks for the reply. I set hw.nvme.force_intx=1 and get a new form of
kernel panic:

http://smkelly.org/stuff/nvme_crash_force_intx.txt

It looks like the NVMe drives are failing to initialize at all now. As long
as that tunable is in the kenv, I get this behavior. If I kldload the
modules after boot, initialization fails as well. But if I kldunload,
kenv -u the tunable, and kldload again, it works. The only difference is
that the kldload path doesn't result in a panic, just timeouts while
initializing all of the drives.

I also compiled and tried stable/10, and it crashed in a similar way, but
I've not captured the panic yet. It crashes even without the tunable in
place. I'll see if I can capture it.

--
Sean Kelly
smkelly@smkelly.org
http://smkelly.org

> On Jun 2, 2015, at 6:10 PM, Jim Harris <jim.harris@gmail.com> wrote:
>
> On Thu, May 21, 2015 at 8:33 AM, Sean Kelly <smkelly@smkelly.org> wrote:
> Greetings.
>
> I have a Dell R630 server with four of Dell's 800GB NVMe SSDs running
> FreeBSD 10.1-p10. According to the PCI vendor, they are some sort of
> rebranded Samsung drive. If I boot the system and then load nvme.ko and
> nvd.ko from a command line, the drives show up okay. If I put
>
> nvme_load="YES"
> nvd_load="YES"
>
> in /boot/loader.conf, the box panics on boot:
>
> panic: nexus_setup_intr: NULL irq resource!
>
> If I boot the system with "Safe Mode: ON" from the loader menu, it also
> boots successfully and the drives show up.
>
> You can see a full 'boot -v' here:
> http://smkelly.org/stuff/nvme-panic.txt
>
> Anyone have any insight into what the issue may be here? Ideally I need
> to get this working in the next few days or return this thing to Dell.
>
> Hi Sean,
>
> Can you try adding hw.nvme.force_intx=1 to /boot/loader.conf?
>
> I suspect you are able to load the drivers successfully after boot
> because interrupt assignments are not restricted to CPU0 at that point -
> see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199321 for a
> related issue. Your logs clearly show that vectors were allocated for the
> first two NVMe SSDs, but the third could not get its full allocation.
> There is a bug in the INTx fallback code that needs to be fixed - you do
> not hit this bug when loading after boot because bug #199321 only affects
> interrupt allocation during boot.
>
> If the force_intx test works, would you be able to upgrade your nvme
> driver to the latest on stable/10? There are several patches (one related
> to interrupt vector allocation) that have been pushed to stable/10 since
> 10.1 was released, and I will be pushing another patch for the issue you
> have reported shortly.
>
> Thanks,
>
> -Jim
>
> Thanks!
>
> --
> Sean Kelly
> smkelly@smkelly.org
> http://smkelly.org
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
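[Editor's note: the post-boot recovery sequence Sean describes above amounts
to roughly the following. This is a sketch only, assuming the stock nvme/nvd
modules and the hw.nvme.force_intx tunable named in the thread; it has not
been verified on this hardware.]

    # unload the block front-end first, then the controller driver
    kldunload nvd nvme
    # remove the force-INTx tunable from the kernel environment
    kenv -u hw.nvme.force_intx
    # reload the driver and front-end; per the thread, interrupt
    # allocation succeeds when the load happens after boot
    kldload nvme nvd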
