Subject: Re: couple of nvidia-driver issues
To: Alan Somers <asomers@freebsd.org>, Andriy Gapon <avg@freebsd.org>
CC: Alexey Dokuchaev <danfe@freebsd.org>, freebsd-x11
 <freebsd-x11@freebsd.org>, FreeBSD Current <freebsd-current@freebsd.org>
References: <07b9dbda-60ef-3643-308f-18a05e8ca958@FreeBSD.org>
 <20171205140308.GA94043@FreeBSD.org>
 <5e95dc14-9d3b-e2eb-b89c-f66f7857eb58@FreeBSD.org>
 <CAOtMX2hTKcmstQaVgmEAU-YmFE+O89_Y-E=TgEnUzmV9skUfUw@mail.gmail.com>
From: Aaron Plattner <aplattner@nvidia.com>
X-Nvconfidentiality: public
Message-ID: <fd4f8bc8-ff56-4b70-498d-79cdf09aa2e4@nvidia.com>
Date: Thu, 7 Dec 2017 08:00:40 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.5.0
MIME-Version: 1.0
In-Reply-To: <CAOtMX2hTKcmstQaVgmEAU-YmFE+O89_Y-E=TgEnUzmV9skUfUw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable

On 12/07/2017 07:35 AM, Alan Somers wrote:
> On Thu, Dec 7, 2017 at 2:33 AM, Andriy Gapon <avg@freebsd.org> wrote:
>
>     [cc-ing current@ to raise more awareness]
>
>     On 05/12/2017 16:03, Alexey Dokuchaev wrote:
>      > On Fri, Nov 24, 2017 at 11:31:51AM +0200, Andriy Gapon wrote:
>      >>
>      >> I have reported a couple of nvidia-driver issues in the FreeBSD section
>      >> of the nVidia developer forum, but no replies so far.
>      >>
>      >> Well, the first issue is not with the driver, but with a utility that
>      >> comes with it, nvidia-smi:
>      >> https://devtalk.nvidia.com/default/topic/1026589/freebsd/nvidia-smi-query-gpu-spins-forever-on-freebsd-head-amd64-/
>      >> I wonder if I am the only one affected or if I see the problem because
>      >> I am on head or something else.
>      >> I am pretty sure that the problem is caused by a programming bug related
>      >> to strtok_r.
>      >
>      > I'll try to reproduce it and report back.
>
>     I've done some work with a debugger and it seems that there is code that
>     does something like this:
>
>     char *last = NULL;
>
>     while (1) {
>             if (last == NULL)
>                     p = strtok_r(str, sep, &last);
>             else
>                     p = strtok_r(NULL, sep, &last);
>             if (p == NULL)
>                     break;
>             ...
>     }
>
>     The problem is that when 'p' points to the last token, 'last' is NULL (in
>     the FreeBSD implementation of strtok_r).  That means that when we go to the
>     next iteration, the parsing starts all over again, leading to an endless
>     loop.
>     The code is incorrect from the standards point of view, because the value
>     of 'last' is completely opaque and should not be used for anything else
>     but passing it back to strtok_r.
>
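For reference, the failure mode described above can be sketched as follows.
This is only an illustration under stated assumptions: sketch_strtok_r is a
hypothetical stand-in that clears 'last' after the final token, as the message
says the FreeBSD implementation does; it is not the libc source.
count_tokens_buggy and count_tokens are likewise made-up names and are not
taken from nvidia-smi. The buggy variant takes an iteration limit only so
that the demonstration terminates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for strtok_r (an assumption for illustration, not
 * the real libc code): once the final token has been returned, *last is NULL.
 */
char *
sketch_strtok_r(char *s, const char *sep, char **last)
{
	char *tok;

	assert(sep != NULL && last != NULL);
	if (s == NULL && (s = *last) == NULL)
		return (NULL);
	s += strspn(s, sep);		/* skip leading separators */
	if (*s == '\0') {
		*last = NULL;
		return (NULL);
	}
	tok = s;
	s += strcspn(s, sep);		/* advance to the end of the token */
	if (*s == '\0') {
		*last = NULL;		/* final token: nothing left to scan */
	} else {
		*s = '\0';		/* terminate the token in place */
		*last = s + 1;
	}
	return (tok);
}

/*
 * The pattern from the message: 'last' doubles as a "first call" flag.
 * Returns the number of tokens seen, giving up after 'limit' iterations.
 */
int
count_tokens_buggy(char *str, const char *sep, int limit)
{
	char *last = NULL, *p;
	int n = 0;

	while (n < limit) {
		if (last == NULL)
			p = sketch_strtok_r(str, sep, &last);
		else
			p = sketch_strtok_r(NULL, sep, &last);
		if (p == NULL)
			break;
		n++;
	}
	return (n);
}

/* Portable form: 'last' is never inspected, only handed back. */
int
count_tokens(char *str, const char *sep)
{
	char *last, *p;
	int n = 0;

	for (p = sketch_strtok_r(str, sep, &last); p != NULL;
	    p = sketch_strtok_r(NULL, sep, &last))
		n++;
	return (n);
}
```

With a writable buffer holding "a,b,c" and "," as the separator, count_tokens
stops after the third token, while count_tokens_buggy keeps restarting from
the beginning of the string and only returns once it hits the limit.
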
>     I used gdb -w to change the logic to:
>
>     char *last = (char *)1;
>
>     while (1) {
>             if (last == (char *)1)
>                     p = strtok_r(str, sep, &last);
>             else
>                     p = strtok_r(NULL, sep, &last);
>             ...
>     }
>
>     Where 1 is used as an "impossible" pointer value which is neither NULL nor
>     a valid pointer that can be set by strtok_r.  It's not ideal, but binary
>     code editing is not as easy as that of source code.
>
>     The binary patch is here:
>     https://people.freebsd.org/~avg/nvidia-smi.bsdiff
>
>      >> The second issue is with the FreeBSD support for the kernel driver:
>      >> https://devtalk.nvidia.com/default/topic/1026645/freebsd/panic-related-to-nvkms_timers-lock-sx-lock-/
>      >> I would like to get some feedback on my analysis.
>      >> I am testing this patch right now:
>      >> https://people.freebsd.org/~avg/extra-patch-src_nvidia-modeset_nvidia-modeset-freebsd.c
>      >
>      > Unfortunately, I'm not enough of an expert on kernel locking primitives
>      > to give you a proper review; let's see what others have to say.
>
>     It's been a while since I posted the patch and there are no comments yet.
>     I can only add that I am running an INVARIANTS and WITNESS enabled kernel
>     all the time and before the patch I was getting kernel panics every now and
>     then.
>     Since I started using the patch I haven't had a single nvidia panic yet.
>
>      >> Also, what's the best place or who are the best people with whom to
>      >> discuss such issues?
>      >
>      > Yes, this is a problem now: since Christian Zander left nVidia, he
>      > could not tell me who'd be their next liaison to talk to from the
>      > FreeBSD community. :-(
>
>     Oh, I didn't know about Christian's departure.
>     So, we are not in a very good position now.
>
>
> How about Aaron Plattner (CC'd)?  Aaron, are you still working on
> FreeBSD driver issues?

Thanks for the heads up, Alan. I filed bug 2032249 to track this.

-- Aaron