From: Andriy Gapon
To: Alexey Dokuchaev
Cc: freebsd-x11, FreeBSD Current
Subject: Re: couple of nvidia-driver issues
Date: Thu, 7 Dec 2017 11:33:51 +0200
Message-ID: <5e95dc14-9d3b-e2eb-b89c-f66f7857eb58@FreeBSD.org>
In-Reply-To: <20171205140308.GA94043@FreeBSD.org>
References: <07b9dbda-60ef-3643-308f-18a05e8ca958@FreeBSD.org> <20171205140308.GA94043@FreeBSD.org>

[cc-ing current@ to raise more awareness]

On 05/12/2017 16:03, Alexey Dokuchaev wrote:
> On Fri, Nov 24, 2017 at 11:31:51AM +0200, Andriy Gapon wrote:
>>
>> I have reported a couple of nvidia-driver issues in the FreeBSD section
>> of the nVidia developer forum, but no replies so far.
>>
>> Well, the first issue is not with the driver, but with a utility that
>> comes with it, nvidia-smi:
>> https://devtalk.nvidia.com/default/topic/1026589/freebsd/nvidia-smi-query-gpu-spins-forever-on-freebsd-head-amd64-/
>> I wonder if I am the only one affected or if I see the problem because
>> I am on head or something else.
>> I am pretty sure that the problem is caused by a programming bug related
>> to strtok_r.
>
> I'll try to reproduce it and report back.

I've done some work with a debugger and it seems that there is code that does
something like this:

    char *last = NULL;
    while (1) {
        if (last == NULL)
            p = strtok_r(str, sep, &last);
        else
            p = strtok_r(NULL, sep, &last);
        if (p == NULL)
            break;
        ...
    }

The problem is that when 'p' points to the last token, 'last' is NULL (in the
FreeBSD implementation of strtok_r).  That means that on the next iteration the
parsing starts all over again, leading to an endless loop.
The code is incorrect from the standards point of view, because the value of
'last' is completely opaque and should not be used for anything other than
passing it back to strtok_r.

I used gdb -w to change the logic to:

    char *last = (char *)1;
    while (1) {
        if (last == (char *)1)
            p = strtok_r(str, sep, &last);
        else
            p = strtok_r(NULL, sep, &last);
        ...
    }

Here 1 is used as an "impossible" pointer value which is neither NULL nor a
valid pointer that can be set by strtok_r.
It's not ideal, but editing binary code is not as easy as editing source code.
The binary patch is here: https://people.freebsd.org/~avg/nvidia-smi.bsdiff

>> The second issue is with the FreeBSD support for the kernel driver:
>> https://devtalk.nvidia.com/default/topic/1026645/freebsd/panic-related-to-nvkms_timers-lock-sx-lock-/
>> I would like to get some feedback on my analysis.
>> I am testing this patch right now:
>> https://people.freebsd.org/~avg/extra-patch-src_nvidia-modeset_nvidia-modeset-freebsd.c
>
> Unfortunately, I'm not an expert on kernel locking primitives to give you
> a proper review, let's see what others have to say.

It's been a while since I posted the patch and there are no comments yet.
I can only add that I am running an INVARIANTS and WITNESS enabled kernel all
the time, and before the patch I was getting kernel panics every now and then.
Since I started using the patch I haven't had a single nvidia panic.

>> Also, what's the best place or who are the best people with whom to
>> discuss such issues?
>
> Yes, this is a problem now: since Christian Zander had left nVidia, he
> could not tell me who'd be their next liaison to talk to from FreeBSD
> community. :-(

Oh, I didn't know about Christian's departure.
So, we are not in a very good position now.

-- 
Andriy Gapon
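
P.S. For anyone following the strtok_r part of this message, here is a minimal sketch of a loop that does not depend on the value of the save pointer. It is only an illustration and not the actual nvidia-smi code, which is not available in source form; the function name count_tokens is hypothetical. The first call is selected by passing the string itself and later calls by passing NULL, so 'last' is only ever handed back to strtok_r:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count the tokens in 'str' that are separated by
 * any of the characters in 'sep'.  'str' is modified in place, as
 * strtok_r always does.
 */
size_t
count_tokens(char *str, const char *sep)
{
    char *last;     /* opaque strtok_r state, never inspected here */
    char *p;
    size_t n = 0;

    assert(str != NULL && sep != NULL);

    /* The first call passes 'str'; every later call passes NULL. */
    for (p = strtok_r(str, sep, &last); p != NULL;
        p = strtok_r(NULL, sep, &last))
        n++;

    return (n);
}
```

With a writable buffer holding "a,b,c" and the separator string ",", this returns 3. It should terminate regardless of what a particular libc stores in 'last' after the final token, because the loop condition only looks at the returned token pointer.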