From: Warner Losh
Date: Mon, 15 Oct 2018 09:21:23 -0600
Subject: Re: vm_fault on boot with NVMe/nda
To: Yuri Pankov
Cc: dnebdal@gmail.com, FreeBSD Current

At Netflix, our OCA firmware is based on FreeBSD -current (we take a
snapshot every 5 weeks or so). We've been booting thousands of machines
off nda for over a year. It absolutely works, and it is one of the things
that lets us deliver the content we do.

I have some patches in my queue, waiting for the freeze to lift, that do
much better TRIM shaping to the drives. You might also want to turn on
vfs.ffs.dotrimcons=1, a new feature that eliminates many of the
BIO_DELETE requests that come down from UFS and are turned into TRIMs.

nvd has no queueing policy at all: it shotguns every request to the drive
without collapsing or moderating them. That isn't so good for most drives
out there today.

Warner

On Mon, Oct 15, 2018 at 8:38 AM Yuri Pankov wrote:

> Daniel Nebdal wrote:
> > Hi. I have a 12-ALPHA9 / r339331 amd64 system (a HPE ProLiant ML30 G9),
> > with a Kingston NVMe SSD ("KINGSTON SKC1000480G") on a PCIe card.
> >
> > By default, it shows up as /dev/nvd0, and this is how I installed the
> > system. It has a single large UFS2 (with SJ and TRIM support) partition
> > mounted as /. (There's also a few other partitions on it that should be
> > irrelevant for this.) This works, but it does sometimes slow down for
> > minutes at a time with disturbing queue lengths in gstat; on the order
> > of tens of thousands. As I understand it, this is due to how TRIM
> > operations take precedence over everything else when using nvd?
> >
> > Looking around, I noticed the nda driver for NVMe-through-CAM. To test
> > it, I added hw.nvme.use_nvd=0 to loader.conf. On one level, this works:
> > The drive shows up as /dev/nda0. On the other hand, trying to mount
> > nda0p2 as / floods the console with "vm_fault: pager read error, pid 1
> > (init)", and never finishes booting.
> >
> > What is more interesting is that if I boot from the drive, but mount an
> > alpha9 usb stick as /, I can then mount the nda device just fine, and the
> > very minimal testing I did (using bin/cat and COPYRIGHT on the NVMe
> > drive) seems to work.
> >
> > So - is nda meant to be bootable, or am I a bit over-eager in trying to
> > do so?
> > If not, is there anything smart I can do to get better performance out of
> > nvd?
> > (Or have I just overlooked something obvious?)
> >
> > Dmesg from a normal nvd boot here:
> > https://openbenchmarking.org/system/1810159-RA-SSD30089593/SSD/dmesg
>
> FWIW, I set hw.nvme.use_nvd=0 in the installer, got 12-ALPHA8 installed
> on nda0, and it's happily booting from it (using ZFS, though), so it's
> certainly meant to be bootable.
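
For reference, the two settings mentioned in this thread could be written down roughly as follows. This is only a sketch: the thread confirms that hw.nvme.use_nvd=0 goes in loader.conf, but putting vfs.ffs.dotrimcons in /etc/sysctl.conf is an assumption (the thread only says to turn it on), and the quoting style is simply the usual loader.conf convention.

```
# /boot/loader.conf
# Attach NVMe namespaces through CAM as nda(4) instead of nvd(4).
hw.nvme.use_nvd="0"

# /etc/sysctl.conf  (placement is an assumption, not stated in the thread)
# Consolidate BIO_DELETE requests coming down from UFS before they
# are turned into TRIMs.
vfs.ffs.dotrimcons=1
```

With nda in use the disk shows up as /dev/nda0 rather than /dev/nvd0, so any fstab entries that name nvd0 partitions directly would presumably need to be adjusted to match.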