Date: Fri, 26 Mar 2021 13:29:45 +0100
From: Michael Gmelin <freebsd@grem.de>
To: Mathieu Chouquet-Stringer <me+freebsd@mathieu.digital>
Cc: Matt Churchyard <matt.churchyard@userve.net>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>, current@freebsd.org
Subject: Re: Scrub incredibly slow with 13.0-RC3 (as well as RC1 & 2)
Message-ID: <20210326132945.3274687e@bsd64.grem.de>
In-Reply-To: <YF2raxOUeN8Y23eT@weirdfishes>
References: <YFhuxr0qRzchA7x8@weirdfishes>
 <202103221515.12MFFHRK015188@higson.cam.lispworks.com>
 <YFi6Lwh3ISn8UMvS@weirdfishes>
 <YFk11A/j7URClN/l@weirdfishes>
 <YFm3BTK/J9XY/mCN@weirdfishes>
 <202103241230.12OCUqur030001@higson.cam.lispworks.com>
 <YFs3jFT7sEaGeQCe@weirdfishes>
 <33eb78e2de404a77b271880dbee4c22e@SERVER.ad.usd-group.com>
 <YF2raxOUeN8Y23eT@weirdfishes>

On Fri, 26 Mar 2021 10:37:47 +0100
Mathieu Chouquet-Stringer <me+freebsd@mathieu.digital> wrote:

> On Thu, Mar 25, 2021 at 08:55:12AM +0000, Matt Churchyard wrote:
> > Just an a aside, I did post a message a few weeks ago with a similar
> > problem on 13 (as well as snapshot issues). Scrub seemed ok for a
> > short while, but then ground to a halt. It would take 10+ minutes to
> > go 0.01%, with everything appearing fairly idle. I finally gave up
> > and stopped it after about 20 hours. Moving to 12.2 and rebuilding
> > the pool, the system scrubbed the same data in an hour, and I've
> > just scrubbed the same system after a month of use with about 4
> > times the data in 3 hours 20. As far as I'm aware, both should be
> > using effectively the same "new" scrub code.
> >
> > Will be interesting if you find a cause as I didn't get any response
> > to what for me was a complete showstopper for moving to 13.
>
> Bear with me, I'm slowly resilvering now... But same thing, it's not
> even maxing out my slow drives... Looks like it'll take 2 days...
>
> I did some flame graphs using dtrace. The first one is just the output
> of that:
> dtrace -x stackframes=100 -n 'profile-99 /arg0/ { @[stack()] =
> count(); } tick-60s { exit(0); }'
>
> Clearly my machine is not busy at all.
> And the second is the output of pretty much the same thing except I'm
> only capturing pid 31 which is the one busy.
> dtrace -x stackframes=100 -n 'profile-99 /arg0 && pid == 31/ {
> @[stack()] = count(); } tick-60s { exit(0); }'
>
> One striking thing is how many times hpet_get_timecount is present...

Does tuning of

- vfs.zfs.scrub_delay
- vfs.zfs.resilver_min_time_ms
- vfs.zfs.resilver_delay

make a difference?

Best,
Michael

-- 
Michael Gmelin