Date: Tue, 6 Dec 2016 19:17:40 +0300 From: Alex Tutubalin <lexa@lexa.ru> To: freebsd-fs@freebsd.org Subject: Re: 11-STABLE vs 11.0-RELENG test Message-ID: <a10b10ed-9200-26f3-5ca4-976a13922dee@lexa.ru> In-Reply-To: <477264f9-9263-6df8-4566-2911e84f2ad8@lexa.ru> References: <477264f9-9263-6df8-4566-2911e84f2ad8@lexa.ru>
next in thread | previous in thread | raw e-mail | index | archive | help
Followup, same box as in 1st message, but with different HBA and 5 HDD: Summary: 1) read speed depends on 'on-disk' data, not by code used while read. 2) FreeBSD 11.0-releng (svn up today) creates 'fast' big files (420+ Mb/s read speed) 3) FreeBSD 11-STABLE creates 'slow' big files. (~200 Mb/s) 4) zfs send slow-dataset | zfs recv (under 11.0-releng) creates 'intermediate' files (320Mb/s) 5) file copy slow-file ... (under 11.0-releng) creates fast copy (400+Mb/s) So, ZFS write code in 11-STABLE looks broken. Here detailed log of my tests: Box: Intel i5-2400, 16Gb of RAM, Adaptec 5805, 5x4Tb drives (HGST NAS drives) in JBOD, merged in RAIDZ2, enough free space: $ zpool list NAME SIZE ALLOC FREE EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT zdata 18,1T 5,11T 13,0T - 9% 28% 1.00x ONLINE - tunables in loader.conf: vm.kmem_size="15G" vfs.zfs.arc_max="14G" Boot 11.0-releng: $ uname -a FreeBSD iscsi.lexa.ru 11.0-RELEASE-p3 FreeBSD 11.0-RELEASE-p3 #7 r309605: Tue Dec 6 12:21:10 MSK 2016 lexa@iscsi.lexa.ru:/usr/obj/usr/src/sys/GENERIC amd64 create separate dataset for test: # zfs create -o recordsize=1m -o mountpoint=/test-11releng zdata/11releng write 100gb file: # dd if=/dev/zero of=/test-11releng/100g-created-on-11releng bs=1m count=100k 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 257.686751 secs (416684917 bytes/sec) reboot (still 11.0-releng) and read test file again: Cold read: $ dd if=/test-11releng/100g-created-on-11releng of=/dev/null bs=1m 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 250.526067 secs (428594850 bytes/sec) read again to see how cached metadata helps (if any): $ dd if=/test-11releng/100g-created-on-11releng of=/dev/null bs=1m 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 243.992662 secs (440071358 bytes/sec) Not much difference between cold and cached metadata, OK. Now boot 11.0-STABLE on same box: $ uname -a FreeBSD iscsi.lexa.ru 11.0-STABLE FreeBSD 11.0-STABLE #0 r309616: Tue Dec 6 16:15:48 MSK 2016 lexa@iscsi.lexa.ru:/usr/obj/usr/src11/sys/GENERIC amd64 Cold read of same data file: $ dd if=/test-11releng/100g-created-on-11releng of=/dev/null bs=1m 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 250.640950 secs (428398402 bytes/sec) Still OK: data created on 11.0-releng reads at full speed under 11-stable Create 11-stable own dataset: # zfs create -o recordsize=1m -o mountpoint=/test-11stable zdata/11stable And write similar 100g file: # dd if=/dev/zero of=/test-11stable/100g-created-on-11stable bs=1m count=100k 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 246.409501 secs (435755041 bytes/sec) Reboot (again into 11-stable) and read back created file $ dd if=/test-11stable/100g-created-on-11stable of=/dev/null bs=1m 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 524.488506 secs (204721707 bytes/sec) Hmm, about 1/2 of expected speed. Let's read fist test file, created on 11-releng: $ dd if=/test-11releng/100g-created-on-11releng of=/dev/null bs=1m 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 250.591102 secs (428483620 bytes/sec) Speed is still here. So, problem is related to on-disk data written by 11-stable. Let's try to recover 'bad data' into good one (assuming we're downgraded to 11-releng and want to get speed back) Boot into 11-releng: $ uname -a FreeBSD iscsi.lexa.ru 11.0-RELEASE-p3 FreeBSD 11.0-RELEASE-p3 #7 r309605: Tue Dec 6 12:21:10 MSK 2016 lexa@iscsi.lexa.ru:/usr/obj/usr/src/sys/GENERIC amd64 Try zfs send|recv: # zfs snapshot zdata/11stable@1 # zfs send zdata/11stable@1 | zfs recv zdata/11stable-sendrecv-on11releng # slow because same dataset used # zfs destroy zdata/11stable-sendrecv-on11releng@1 # zfs destroy zdata/11stable@1 # zfs set mountpoint=/test-11stable-sendrecv-on11releng zdata/11stable-sendrecv-on11releng Reboot and test read speed: $ dd if=/test-11stable-sendrecv-on11releng/100g-created-on-11stable of=/dev/null bs=1m 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 336.193935 secs (319381676 bytes/sec) INTERMEDIATE speed :( Better than 'bad datafile', worse than 'good datafile' Try file copy to another dataset: # zfs create -o recordsize=1m -o mountpoint=/test-11stable-copy-on11releng zdata/11stable-copy-on11releng # cd /test-11stable # tar cf - . | (cd /test-11stable-copy-on11releng/; tar xvf - ) Reboot again and test speed another time: $ dd if=/test-11stable-copy-on11releng/100g-created-on-11stable of=/dev/null bs=1m 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 261.484845 secs (410632526 bytes/sec) OK, per-file copy solves the problem (as expected). Finally, let's check that all testfiles still reads at the expected speed 'good' one, created on 11.0-releng: ]$ dd if=/test-11releng/100g-created-on-11releng of=/dev/null bs=1m 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 249.435476 secs (430468770 bytes/sec) 'bad' one, created on 11-stable: $ dd if=/test-11stable/100g-created-on-11stable of=/dev/null bs=1m 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 525.558477 secs (204304920 bytes/sec) and 'intermediate', created by zfs send bad-dataset | zfs recv: $ dd if=/test-11stable-sendrecv-on11releng/100g-created-on-11stable of=/dev/null bs=1m 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 336.570705 secs (319024148 bytes/sec) Boot into 11-stable again and test read speed too: $ uname -a FreeBSD iscsi.lexa.ru 11.0-STABLE FreeBSD 11.0-STABLE #0 r309616: Tue Dec 6 16:15:48 MSK 2016 lexa@iscsi.lexa.ru:/usr/obj/usr/src11/sys/GENERIC amd64 Good: ]$ dd if=/test-11releng/100g-created-on-11releng of=/dev/null bs=1m 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 256.322824 secs (418902151 bytes/sec) Bad: $ dd if=/test-11stable/100g-created-on-11stable of=/dev/null bs=1m 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 529.918422 secs (202623985 bytes/sec) Copied by tar (from bad to some intermediate state): $ dd if=/test-11stable-copy-on11releng/100g-created-on-11stable of=/dev/null bs=1m 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 260.234944 secs (412604783 bytes/sec) Copied by zfs send | zfs recv: $ dd if=/test-11stable-sendrecv-on11releng/100g-created-on-11stable of=/dev/null bs=1m 102400+0 records in 102400+0 records out 107374182400 bytes transferred in 336.240919 secs (319337047 bytes/sec) (virtually) same speed as under 11.0-releng, so read speed is limited by on-disk structures. -- Alex Tutubalin Web: http://blog.lexa.ru mailto:lexa@lexa.ru
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?a10b10ed-9200-26f3-5ca4-976a13922dee>