Date: Wed, 22 Nov 2023 00:27:39 -0700
From: "Edward Sanford Sutton, III" <mirror176@hotmail.com>
To: stable@freebsd.org
Subject: Re: Unusual ZFS behaviour
Message-ID: <CO1PR11MB47700ED7D924BBEB1FC68097E6BAA@CO1PR11MB4770.namprd11.prod.outlook.com>
In-Reply-To: <bdaae481-427e-2ce0-008f-30516b9a47d7@grosbein.net>
References: <f8764549-773a-4695-b1fc-76e70e49de1b@chen.org.nz> <bdaae481-427e-2ce0-008f-30516b9a47d7@grosbein.net>
On 11/22/23 00:04, Eugene Grosbein wrote:
> 22.11.2023 13:49, Jonathan Chen wrote:
>> Hi,
>>
>> I'm running a somewhat recent version of STABLE-13/amd64: stable/13-n256681-0b7939d725ba: Fri Nov 10 08:48:36 NZDT 2023, and I'm seeing some unusual behaviour with ZFS.
>>
>> To reproduce:
>> 1. one big empty disk, GPT scheme, 1 freebsd-zfs partition.
>> 2. create a zpool, eg: tank
>> 3. create 2 sub-filesystems, eg: tank/one, tank/two
>> 4. fill each sub-filesystem with large files until the pool is ~80% full. In my case I had 200 10Gb files in each.
>> 5. in one session run 'md5 tank/one/*'
>> 6. in another session run 'md5 tank/two/*'
>>
>> For most of my runs, one of the sessions against a sub-filesystem will be starved of I/O, while the other one is performant.
>>
>> Is anyone else seeing this?

More details of the disk, disk controller, and FreeBSD version may be helpful. If it is SATA, there may be an impact from the drive's own reordering of the command queue it is handed, on top of what FreeBSD+ZFS have already assigned, and knowing the OS version (plus the ZFS version, if it is not the one bundled with the OS) would say what I/O balancing is currently present/available.

> Please try repeating the test with atime updates disabled:
>
> zfs set atime=off tank/one
> zfs set atime=off tank/two

atime's impact is a write, and writes get priority, so if anything there would be 'little' breaks in the reads to write that data. I doubt the scenario+hardware under discussion is bottlenecking on writing atime data for the access of these 10GB files, but it would be interesting to check. On the other hand, I think it is atime that trashes a freshly created, smoothly laid-out file structure with many files once the default cron jobs pass over it: the atime updates plus COW end up fragmenting the metadata needed for basic things like listing directory contents. I have not properly tracked down the source of that repeatable issue yet. Accessing data within a file doesn't seem impacted the same way the directory listing is, though.

I was also thinking the sysctl settings involving prefetch may affect this access pattern; vfs.zfs.prefetch_disable=1 is probably the one I am thinking of, but I normally don't tweak ZFS and related settings if I don't have to, as I usually later find the tweaks are problems themselves. I keep my system running more smoothly under excessive load by putting idprio and nice on the heavier noninteractive work.

> Does it make any difference?
> Does it make any difference, if you import the pool with readonly=on instead?
>
> Writing to ~80% pool is almost always slow for ZFS. Been there, done that.

It is painful, but I think it is more complicated than just a total free-space counter before that issue shows up. Other performance issues also exist; I've had horrible I/O on disks that never exceeded 20% used since being formatted.
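A few concrete commands, in case they help. To see whether the disk itself is saturated while both md5 runs are active, watching the providers and the pool at the same time should show where the time goes (standard tools, nothing ZFS-specific assumed beyond the pool name "tank" from the report):

  # per-disk busy%% and queue depth for physical providers only
  gstat -p
  # per-vdev read/write throughput for the pool, sampled every 5 seconds
  zpool iostat -v tank 5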
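For the prefetch knob I mentioned, it is just a sysctl; note the exact name depends on the ZFS in use (the old in-tree ZFS called it vfs.zfs.prefetch_disable, OpenZFS on 13 may expose it as vfs.zfs.prefetch.disable), so check what your kernel actually has first:

  # see which prefetch-related knobs this kernel exposes
  sysctl -a | grep -i prefetch
  # read the current value, then disable prefetch for a test run
  sysctl vfs.zfs.prefetch_disable
  sysctl vfs.zfs.prefetch_disable=1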
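And what I mean by idprio/nice on the heavy noninteractive load is roughly the following; the priorities are only examples, and the /tank/one paths assume the default mountpoints for the datasets in the report:

  # run one bulk reader in the idle scheduling class (31 = lowest)
  idprio 31 md5 /tank/one/*
  # or, less aggressively, just lower its scheduling priority
  nice -n 20 md5 /tank/two/*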
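Eugene's readonly-import test would look something like the sketch below, assuming nothing else needs the pool while testing (export/import it back to normal afterwards):

  zpool export tank
  zpool import -o readonly=on tank
  # ...repeat the two md5 sessions, then restore read-write operation
  zpool export tank
  zpool import tank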