Date: Wed, 13 May 2015 13:54:00 -0500
From: Nathan Weeks <weeks@iastate.edu>
To: freebsd-fs@freebsd.org
Subject: ZFS read performance disparity between clone and parent
Message-ID: <CAHSgV3F0kJ=HrhT59TSpPgn3Vs-bFO87dB1RwkhrPUdrjy9ODA@mail.gmail.com>
While troubleshooting performance disparities between development and
production jails hosting PostgreSQL instances, I noticed (with the help of
dtruss) that the 8k read() performance in the production jail was an order
of magnitude worse than the read() performance in the development jail.
As the ZFS file system hosting the production jail was cloned from a
snapshot of the development jail, and had not been modified since, this
didn't make sense to me.

Using the "dd" command with an 8k block size to emulate the PostgreSQL
read() size, I observed a large performance difference between reading one
of the large (1G) underlying postgres database files in the development
jail's file system and the corresponding file in the cloned file system:

# dd if=/jails/dev/usr/local/pgsql/data/base/16399/16436 of=/dev/null bs=8192
131072+0 records in
131072+0 records out
1073741824 bytes transferred in 4.198993 secs (255714128 bytes/sec)

# dd if=/jails/prod/usr/local/pgsql/data/base/16399/16436 of=/dev/null bs=8192
131072+0 records in
131072+0 records out
1073741824 bytes transferred in 17.314135 secs (62015331 bytes/sec)

# ls -l /jails/dev/usr/local/pgsql/data/base/16399/16436 /jails/prod/usr/local/pgsql/data/base/16399/16436
-rw-------  1 70  70  1073741824 Feb  5 16:41 /jails/dev/usr/local/pgsql/data/base/16399/16436
-rw-------  1 70  70  1073741824 Feb  5 16:41 /jails/prod/usr/local/pgsql/data/base/16399/16436

I repeated this exercise several times to verify the read performance
difference. Interestingly, prefixing the "dd" command with
"/usr/bin/time -l" revealed that in both cases, "block input operations"
was 0, apparently indicating that both files were being read from cache.
In neither case did "zpool iostat 1" show significant I/O being performed
during the execution of the "dd" command.

Has anyone else encountered a similar issue, and does anyone know of an
explanation, solution, or better workaround? I had previously assumed that
there would be no performance difference between reading a file on a ZFS
file system and the corresponding file on a cloned file system when none
of the data blocks have changed (this is FreeBSD 9.3, so the "Single Copy
ARC" feature should apply). Dedup isn't being used on any file system.
The output of zfs-stats follows the command sketches below; I can provide
any additional info that might be of use in identifying the cause of this
issue.
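For reference, the "/usr/bin/time -l" invocations were essentially the
following; "block input operations" appears in the rusage summary that
time -l prints, and should be 0 if the file is served entirely from the
ARC (with "zpool iostat 1" running in another terminal to confirm no
significant pool I/O):

# /usr/bin/time -l dd if=/jails/dev/usr/local/pgsql/data/base/16399/16436 of=/dev/null bs=8192
# /usr/bin/time -l dd if=/jails/prod/usr/local/pgsql/data/base/16399/16436 of=/dev/null bs=8192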
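To double-check that the production file system really is an unmodified
clone, the "origin" and "written" properties can be inspected; a sketch,
using a hypothetical dataset name (substitute the actual pool/dataset
path):

# zfs get origin,written tank/jails/prod

"origin" names the snapshot the clone was created from, and "written"
reports the bytes written to the clone since that snapshot, so a value of
0 indicates the clone's data blocks are untouched.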
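One way to sanity-check the Single Copy ARC assumption is to watch the
ARC size while re-reading the clone's copy of an already-cached file; a
sketch, assuming the FreeBSD arcstats sysctl (the flat-ARC outcome is my
expectation here, not an observed result):

# sysctl kstat.zfs.misc.arcstats.size
# dd if=/jails/prod/usr/local/pgsql/data/base/16399/16436 of=/dev/null bs=8192
# sysctl kstat.zfs.misc.arcstats.size

If the clone shares the parent's cached blocks, the second reading should
not be roughly 1 GiB larger than the first.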
------------------------------------------------------------------------
ZFS Subsystem Report                            Wed May 13 12:22:00 2015
------------------------------------------------------------------------

System Information:

        Kernel Version:                         903000 (osreldate)
        Hardware Platform:                      amd64
        Processor Architecture:                 amd64

        ZFS Storage pool Version:               5000
        ZFS Filesystem Version:                 5

FreeBSD 9.3-RELEASE-p5 #0: Mon Nov 3 22:38:58 UTC 2014 root
12:22PM  up 166 days, 3:36, 7 users, load averages: 2.34, 2.31, 2.17

------------------------------------------------------------------------

System Memory:

        8.83%   21.95   GiB Active,     1.67%   4.14    GiB Inact
        68.99%  171.40  GiB Wired,      0.40%   1.00    GiB Cache
        20.10%  49.93   GiB Free,       0.01%   16.12   MiB Gap

        Real Installed:                         256.00  GiB
        Real Available:                 99.99%  255.97  GiB
        Real Managed:                   97.06%  248.43  GiB

        Logical Total:                          256.00  GiB
        Logical Used:                   78.49%  200.92  GiB
        Logical Free:                   21.51%  55.08   GiB

Kernel Memory:                                  117.28  GiB
        Data:                           99.98%  117.25  GiB
        Text:                           0.02%   26.07   MiB

Kernel Memory Map:                              241.10  GiB
        Size:                           43.83%  105.67  GiB
        Free:                           56.17%  135.43  GiB

------------------------------------------------------------------------

ARC Summary: (HEALTHY)
        Memory Throttle Count:                  0

ARC Misc:
        Deleted:                                143.56m
        Recycle Misses:                         275.73m
        Mutex Misses:                           1.50m
        Evict Skips:                            20.24b

ARC Size:                               99.77%  127.71  GiB
        Target Size: (Adaptive)         100.00% 128.00  GiB
        Min Size (Hard Limit):          12.50%  16.00   GiB
        Max Size (High Water):          8:1     128.00  GiB

ARC Size Breakdown:
        Recently Used Cache Size:       68.86%  88.15   GiB
        Frequently Used Cache Size:     31.14%  39.85   GiB

ARC Hash Breakdown:
        Elements Max:                           27.87m
        Elements Current:               40.13%  11.18m
        Collisions:                             1.95b
        Chain Max:                              26
        Chains:                                 2.44m

------------------------------------------------------------------------

ARC Efficiency:                                 88.77b
        Cache Hit Ratio:                99.52%  88.34b
        Cache Miss Ratio:               0.48%   426.00m
        Actual Hit Ratio:               98.86%  87.76b

        Data Demand Efficiency:         99.99%  58.75b
        Data Prefetch Efficiency:       98.47%  1.08b

        CACHE HITS BY CACHE LIST:
          Anonymously Used:             0.21%   187.51m
          Most Recently Used:           1.93%   1.71b
          Most Frequently Used:         97.41%  86.05b
          Most Recently Used Ghost:     0.04%   39.14m
          Most Frequently Used Ghost:   0.41%   358.78m

        CACHE HITS BY DATA TYPE:
          Demand Data:                  66.49%  58.74b
          Prefetch Data:                1.21%   1.07b
          Demand Metadata:              31.74%  28.04b
          Prefetch Metadata:            0.56%   491.01m

        CACHE MISSES BY DATA TYPE:
          Demand Data:                  1.70%   7.26m
          Prefetch Data:                3.89%   16.56m
          Demand Metadata:              83.84%  357.15m
          Prefetch Metadata:            10.57%  45.03m

------------------------------------------------------------------------

L2ARC is disabled

------------------------------------------------------------------------

File-Level Prefetch: (HEALTHY)

DMU Efficiency:                                 187.26b
        Hit Ratio:                      82.21%  153.94b
        Miss Ratio:                     17.79%  33.32b

        Colinear:                               33.32b
          Hit Ratio:                    0.01%   3.35m
          Miss Ratio:                   99.99%  33.32b

        Stride:                                 150.63b
          Hit Ratio:                    100.00% 150.63b
          Miss Ratio:                   0.00%   453.04k

DMU Misc:
        Reclaim:                                33.32b
          Successes:                    0.36%   118.64m
          Failures:                     99.64%  33.20b

        Streams:                                3.31b
          +Resets:                      0.00%   20.36k
          -Resets:                      100.00% 3.31b
          Bogus:                                0

------------------------------------------------------------------------

VDEV cache is disabled

------------------------------------------------------------------------

ZFS Tunables (sysctl):
        kern.maxusers                           16718
        vm.kmem_size                            266754412544
        vm.kmem_size_scale                      1
        vm.kmem_size_min                        0
        vm.kmem_size_max                        329853485875
        vfs.zfs.l2c_only_size                   0
        vfs.zfs.mfu_ghost_data_lsize            63695688192
        vfs.zfs.mfu_ghost_metadata_lsize        8300248064
        vfs.zfs.mfu_ghost_size                  71995936256
        vfs.zfs.mfu_data_lsize                  34951425024
        vfs.zfs.mfu_metadata_lsize              4976638976
        vfs.zfs.mfu_size                        41843978240
        vfs.zfs.mru_ghost_data_lsize            41844330496
        vfs.zfs.mru_ghost_metadata_lsize        23598693888
        vfs.zfs.mru_ghost_size                  65443024384
        vfs.zfs.mru_data_lsize                  67918019072
        vfs.zfs.mru_metadata_lsize              411918848
        vfs.zfs.mru_size                        71823354880
        vfs.zfs.anon_data_lsize                 0
        vfs.zfs.anon_metadata_lsize             0
        vfs.zfs.anon_size                       29893120
        vfs.zfs.l2arc_norw                      1
        vfs.zfs.l2arc_feed_again                1
        vfs.zfs.l2arc_noprefetch                1
        vfs.zfs.l2arc_feed_min_ms               200
        vfs.zfs.l2arc_feed_secs                 1
        vfs.zfs.l2arc_headroom                  2
        vfs.zfs.l2arc_write_boost               8388608
        vfs.zfs.l2arc_write_max                 8388608
        vfs.zfs.arc_meta_limit                  34359738368
        vfs.zfs.arc_meta_used                   34250008792
        vfs.zfs.arc_min                         17179869184
        vfs.zfs.arc_max                         137438953472
        vfs.zfs.dedup.prefetch                  1
        vfs.zfs.mdcomp_disable                  0
        vfs.zfs.nopwrite_enabled                1
        vfs.zfs.zfetch.array_rd_sz              1048576
        vfs.zfs.zfetch.block_cap                256
        vfs.zfs.zfetch.min_sec_reap             2
        vfs.zfs.zfetch.max_streams              8
        vfs.zfs.prefetch_disable                0
        vfs.zfs.no_scrub_prefetch               0
        vfs.zfs.no_scrub_io                     0
        vfs.zfs.resilver_min_time_ms            3000
        vfs.zfs.free_min_time_ms                1000
        vfs.zfs.scan_min_time_ms                1000
        vfs.zfs.scan_idle                       50
        vfs.zfs.scrub_delay                     4
        vfs.zfs.resilver_delay                  2
        vfs.zfs.top_maxinflight                 32
        vfs.zfs.write_to_degraded               0
        vfs.zfs.mg_noalloc_threshold            0
        vfs.zfs.condense_pct                    200
        vfs.zfs.metaslab.weight_factor_enable   0
        vfs.zfs.metaslab.preload_enabled        1
        vfs.zfs.metaslab.preload_limit          3
        vfs.zfs.metaslab.unload_delay           8
        vfs.zfs.metaslab.load_pct               50
        vfs.zfs.metaslab.min_alloc_size         10485760
        vfs.zfs.metaslab.df_free_pct            4
        vfs.zfs.metaslab.df_alloc_threshold     131072
        vfs.zfs.metaslab.debug_unload           0
        vfs.zfs.metaslab.debug_load             0
        vfs.zfs.metaslab.gang_bang              131073
        vfs.zfs.check_hostid                    1
        vfs.zfs.spa_asize_inflation             24
        vfs.zfs.deadman_enabled                 1
        vfs.zfs.deadman_checktime_ms            5000
        vfs.zfs.deadman_synctime_ms             1000000
        vfs.zfs.recover                         0
        vfs.zfs.txg.timeout                     5
        vfs.zfs.min_auto_ashift                 9
        vfs.zfs.max_auto_ashift                 13
        vfs.zfs.vdev.cache.bshift               16
        vfs.zfs.vdev.cache.size                 0
        vfs.zfs.vdev.cache.max                  16384
        vfs.zfs.vdev.trim_on_init               1
        vfs.zfs.vdev.write_gap_limit            4096
        vfs.zfs.vdev.read_gap_limit             32768
        vfs.zfs.vdev.aggregation_limit          131072
        vfs.zfs.vdev.scrub_max_active           2
        vfs.zfs.vdev.scrub_min_active           1
        vfs.zfs.vdev.async_write_max_active     10
        vfs.zfs.vdev.async_write_min_active     1
        vfs.zfs.vdev.async_read_max_active      3
        vfs.zfs.vdev.async_read_min_active      1
        vfs.zfs.vdev.sync_write_max_active      10
        vfs.zfs.vdev.sync_write_min_active      10
        vfs.zfs.vdev.sync_read_max_active       10
        vfs.zfs.vdev.sync_read_min_active       10
        vfs.zfs.vdev.max_active                 1000
        vfs.zfs.vdev.bio_delete_disable         0
        vfs.zfs.vdev.bio_flush_disable          0
        vfs.zfs.vdev.trim_max_pending           64
        vfs.zfs.vdev.trim_max_bytes             2147483648
        vfs.zfs.cache_flush_disable             0
        vfs.zfs.zil_replay_disable              0
        vfs.zfs.sync_pass_rewrite               2
        vfs.zfs.sync_pass_dont_compress         5
        vfs.zfs.sync_pass_deferred_free         2
        vfs.zfs.zio.use_uma                     0
        vfs.zfs.snapshot_list_prefetch          0
        vfs.zfs.version.ioctl                   3
        vfs.zfs.version.zpl                     5
        vfs.zfs.version.spa                     5000
        vfs.zfs.version.acl                     1
        vfs.zfs.debug                           0
        vfs.zfs.super_owner                     0
        vfs.zfs.trim.enabled                    1
        vfs.zfs.trim.max_interval               1
        vfs.zfs.trim.timeout                    30
        vfs.zfs.trim.txg_delay                  32

------------------------------------------------------------------------

--
Nathan Weeks
USDA-ARS Corn Insects and Crop Genetics Research Unit
Crop Genome Informatics Laboratory
Iowa State University
http://weeks.public.iastate.edu/