Date:      Tue, 2 Jun 2015 22:06:48 -0500
From:      Sean Kelly <smkelly@smkelly.org>
To:        Jim Harris <jim.harris@gmail.com>
Cc:        FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Subject:   Re: 10.1 NVMe kernel panic
Message-ID:  <EF729BA5-4D1A-47F6-AF55-DE82A49D46C4@smkelly.org>
In-Reply-To: <CAJP=Hc-w_J9wAJXqhtzdGa7fQ0bqFcSXm0sGi0Xnue8jqXOw5A@mail.gmail.com>
References:  <90B2D392-01FD-415A-B3D9-3CEDFC8373C4@smkelly.org> <CAJP=Hc-w_J9wAJXqhtzdGa7fQ0bqFcSXm0sGi0Xnue8jqXOw5A@mail.gmail.com>

Jim,

Thanks for the reply. I set hw.nvme.force_intx=1 and get a new form of
kernel panic:
http://smkelly.org/stuff/nvme_crash_force_intx.txt

It looks like the NVMes are just failing to initialize at all now. As
long as that tunable is in the kenv, I get this behavior. If I kldload
them after boot, the init fails as well. But if I kldunload, kenv -u,
kldload, it then works again. The only difference is kldload doesn’t
result in a panic, just timeouts initializing them all.
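
For anyone following along, the kldunload / kenv -u / kldload sequence
above can be sketched roughly as below. This is only a sketch: the
module names nvd and nvme are taken from the nvd.ko and nvme.ko files
mentioned in the thread, the unload order (nvd before nvme) is an
assumption, and RUN is a hypothetical dry-run hook that is not part of
any FreeBSD tool. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: reload the NVMe drivers without the hw.nvme.force_intx tunable.
# Assumption: module names are nvd and nvme (from nvd.ko / nvme.ko).
# RUN is a hypothetical dry-run hook. It defaults to "echo", so the
# commands are printed rather than executed. Run with RUN="" as root
# on FreeBSD to actually execute them.
RUN="${RUN-echo}"

reload_nvme_without_intx() {
    # Unload the disk layer first, then the controller driver (assumed order).
    $RUN kldunload nvd &&
    $RUN kldunload nvme &&
    # Remove the tunable from the kernel environment.
    $RUN kenv -u hw.nvme.force_intx &&
    # Load the drivers again; they now attach without forced INTx.
    $RUN kldload nvme &&
    $RUN kldload nvd
}

reload_nvme_without_intx
```

The chain stops at the first command that fails, so a failed unload
does not go on to clear the tunable or reload anything.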

I also compiled and tried stable/10 and it crashed in a similar way, but
I’ve not captured the panic yet. It crashes even without the tunable in
place. I’ll see if I can capture it.

-- 
Sean Kelly
smkelly@smkelly.org
http://smkelly.org

> On Jun 2, 2015, at 6:10 PM, Jim Harris <jim.harris@gmail.com> wrote:
>
> On Thu, May 21, 2015 at 8:33 AM, Sean Kelly <smkelly@smkelly.org> wrote:
> Greetings.
>
> I have a Dell R630 server with four of Dell’s 800GB NVMe SSDs running
> FreeBSD 10.1-p10. According to the PCI vendor, they are some sort of
> rebranded Samsung drive. If I boot the system and then load nvme.ko and
> nvd.ko from a command line, the drives show up okay. If I put
>         nvme_load="YES"
>         nvd_load="YES"
> in /boot/loader.conf, the box panics on boot:
>         panic: nexus_setup_intr: NULL irq resource!
>
> If I boot the system with “Safe Mode: ON” from the loader menu, it also
> boots successfully and the drives show up.
>
> You can see a full ‘boot -v’ here:
> http://smkelly.org/stuff/nvme-panic.txt
>
> Anyone have any insight into what the issue may be here? Ideally I need
> to get this working in the next few days or return this thing to Dell.
>
> Hi Sean,
>
> Can you try adding hw.nvme.force_intx=1 to /boot/loader.conf?
>
> I suspect you are able to load the drivers successfully after boot
> because interrupt assignments are not restricted to CPU0 at that point -
> see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199321 for a
> related issue.  Your logs clearly show that vectors were allocated for
> the first 2 NVMe SSDs, but the third could not get its full allocation.
> There is a bug in the INTx fallback code that needs to be fixed - you do
> not hit this bug when loading after boot because bug #199321 only
> affects interrupt allocation during boot.
>
> If the force_intx test works, would you be able to upgrade your nvme
> drivers to the latest on stable/10?  There are several patches (one
> related to interrupt vector allocation) that have been pushed to
> stable/10 since 10.1 was released, and I will be pushing another patch
> for the issue you have reported shortly.
>
> Thanks,
>
> -Jim
>
> Thanks!
>
> --
> Sean Kelly
> smkelly@smkelly.org
> http://smkelly.org
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to
> "freebsd-stable-unsubscribe@freebsd.org"
