Date:      Fri, 4 Jan 2019 09:28:31 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        Borja Marcos <borjam@sarenet.es>
Cc:        FreeBSD FS <freebsd-fs@freebsd.org>
Subject:   Re: Interesting: ZFS scrub prefetch hurting sequential scrub performance?
Message-ID:  <CANCZdfqLAeRjb0sgW1KScrAuFy8n3CZopLhkueufWQD0_azCjw@mail.gmail.com>
In-Reply-To: <C324E072-44FD-49D3-8B32-91E392833CFB@sarenet.es>
References:  <8ECF7513-9DFB-46EF-86BA-DB717D713792@sarenet.es> <C324E072-44FD-49D3-8B32-91E392833CFB@sarenet.es>

On Fri, Jan 4, 2019 at 3:53 AM Borja Marcos <borjam@sarenet.es> wrote:

>
>
> > On 3 Jan 2019, at 11:34, Borja Marcos <borjam@sarenet.es> wrote:
> >
> >
> > Hi,
> >
> > I have noticed that my scrubs have become painfully slow. I am wondering
> > whether I've just hit some worst case or maybe there is some interaction
> > between the ZFS sequential scrub and scrub prefetch. I don't recall seeing
> > this behavior before the sequential scrub code was committed.
> >
> > Did I hit some worst case or should scrub prefetch be disabled with the
> > new sequential scrub code?
>
> I have done a test with the old scrub code (vfs.zfs.zfs_scan_legacy=1) and
> I see very similar behavior, with the scrub stalling again.
>
> Once more, disabling prefetch for the scrub (vfs.zfs.no_scrub_prefetch=1)
> solves the issue.
>
> I suffered this problem on 11 at some point but I attributed it (wrongly!)
> to hardware problems at the time.
>
> Now I've just found a talk about a new prefetch mechanism for the scrub by
> Tom Caputi. Could it be the problem?
> https://www.youtube.com/watch?v=upn9tYh917s
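
For reference, a minimal sketch of flipping that tunable from a program rather
than with sysctl(8), using sysctlbyname(3). It assumes a FreeBSD host that
exposes the vfs.zfs.no_scrub_prefetch OID mentioned above and must run as
root; it is illustrative only, not part of anything discussed in this thread.

#include <sys/types.h>
#include <sys/sysctl.h>

#include <err.h>
#include <stdio.h>

int
main(void)
{
	int one = 1;

	/* Same effect as "sysctl vfs.zfs.no_scrub_prefetch=1"; needs root. */
	if (sysctlbyname("vfs.zfs.no_scrub_prefetch", NULL, NULL,
	    &one, sizeof(one)) == -1)
		err(1, "sysctlbyname(vfs.zfs.no_scrub_prefetch)");

	printf("scrub prefetch disabled\n");
	return (0);
}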


It's always been a hard problem to schedule background activity without
affecting foreground performance. For Hard Drives this isn't so terrible to
do: keep the queue depths small so that when any new work arrives, the
latency in switching between the two workloads is small. With SSDs, it gets
harder, though in a read-only workload it degenerates to about the same.
SSDs do their own read ahead, sometimes, and they have lots of background
activity that can be triggered by reads (like if it could read the block,
but the error rate from the NAND was over some threshold, the drive might
decide to copy all the data out of that block because data with that error
rate won't be readable with the correction codes in place long enough to
meet the retention specs). And writes can also trigger this background
behavior. So switching between the foreground and background tasks becomes
even more sluggish.
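
To make the queue-depth point concrete, here is a toy sketch (not ZFS or
driver code; the structure names and the cap of 2 are illustrative
assumptions) of the policy described above: foreground requests always win,
and background (scrub) I/O is only issued while the in-flight count stays
under a small cap.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define BG_QDEPTH_MAX 2		/* keep the background queue depth small */

struct io_queues {
	size_t fg_pending;	/* foreground requests waiting to be issued */
	size_t bg_inflight;	/* background (scrub) I/Os already at the disk */
};

/* Policy: foreground always wins; background stays under a small cap. */
static bool
can_issue_background(const struct io_queues *q)
{
	if (q->fg_pending > 0)
		return (false);
	return (q->bg_inflight < BG_QDEPTH_MAX);
}

int
main(void)
{
	struct io_queues q = { .fg_pending = 0, .bg_inflight = 1 };

	printf("idle disk, one scrub I/O in flight -> issue more? %s\n",
	    can_issue_background(&q) ? "yes" : "no");

	q.fg_pending = 1;	/* a foreground read just arrived */
	printf("foreground pending -> issue more scrub I/O? %s\n",
	    can_issue_background(&q) ? "yes" : "no");
	return (0);
}

The small cap is what keeps the switch-over latency low: a newly arrived
foreground request never waits behind more than a couple of background
commands already queued at the drive.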

But I think in ZFS' case, it may just be a bit of a bug in backing off the
scrub operation to allow better local host performance...

Warner


