Date:      Tue, 5 Nov 2024 13:16:15 -0000 (UTC)
From:      "Peter 'PMc' Much" <pmc@citylink.dinoex.sub.org>
To:        freebsd-fs@freebsd.org
Subject:   Re: nvme device errors & zfs
Message-ID:  <slrnvik6kv.12qg.pmc@disp.intra.daemon.contact>
References:  <3293802b-3785-4715-8a6b-0802afb6f908@app.fastmail.com> <CANCZdfpPmVtt0wMWAYzhq4R0nkt39dg3S2-zVCCQcw%2BTSugkEg@mail.gmail.com> <ad8551cc-a595-454b-8645-89a16f60ab0f@app.fastmail.com> <CAFYkXjkdvq29aFvNfkmFjb%2BZN8gPJgZFMr942iju=KVcwieDYw@mail.gmail.com> <620f4e82-aa69-4af0-bd22-b21203ab8745@app.fastmail.com>

On 2024-11-05, Dave Cottlehuber <dch@FreeBSD.org> wrote:

> I would hope temperature throttling would not be quite so brutal as to
> remove itself from the bus entirely, but it's a reasonable explanation.

It might be a reasonable choice to protect the data first.
Also, people will then notice that there is a problem, rather than
just complain about bad performance.

If a more elegant reaction is desired, it could be implemented
by reading the current temperature and dynamically issuing some
"nvmecontrol power -p x -w y ..." as appropriate. (From what I hear,
these options behave in a rather device-specific way, so some testing
may be required.)
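
To make the idea concrete, here is a rough Python sketch of such a
control step. It is not taken from heatctl.rb and is untested against
real hardware. The thresholds, the power state numbers, the device name
"nvme0" and all function names are assumptions for illustration only;
the power states a drive actually offers can be listed with
"nvmecontrol power -l". The sketch also assumes that a higher state
number means lower power, and it leaves out how the temperature is
obtained.

```python
import subprocess

# Hypothetical steps, hottest first: (engage_at_C, release_below_C, power_state).
# Power state numbers are device specific; these values are placeholders.
STEPS = [
    (70.0, 65.0, 2),
    (60.0, 55.0, 1),
]
DEFAULT_STATE = 0  # assumed full-performance state


def choose_power_state(temp_c, current_state):
    """Pick a power state for temp_c, with hysteresis to avoid flapping."""
    for engage_at, release_below, state in STEPS:
        # Hot enough to engage this step outright.
        if temp_c >= engage_at:
            return state
        # Already throttled at least this far: hold the step until the
        # drive has cooled below its release threshold.
        if current_state >= state and temp_c >= release_below:
            return state
    return DEFAULT_STATE


def build_command(device, state, workload_hint=None):
    """Build the nvmecontrol argument list without running it."""
    cmd = ["nvmecontrol", "power", "-p", str(state)]
    if workload_hint is not None:
        cmd += ["-w", str(workload_hint)]
    cmd.append(device)
    return cmd


def apply_power_state(device, state, workload_hint=None):
    """Run nvmecontrol (needs root on FreeBSD); raises if the command fails."""
    subprocess.run(build_command(device, state, workload_hint), check=True)
```

A periodic job would call choose_power_state() with the latest reading
and the state it set last time, and call apply_power_state() only when
the result changes. The gap between the engage and release thresholds
keeps the drive from bouncing between states around a single limit.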

https://gitr.daemon.contact/tools/tree/heatctl.rb#n218

I'm not yet doing temperature-driven NVMe performance steering, but
practically everything else: fan engagement, scrub pausing, CPU
consumption (via rctl), etc.

cheerio,
PMc


