Date:      Thu, 24 Oct 2013 10:50:23 +0300
From:      Vitalij Satanivskij <satan@ukr.net>
To:        Vitalij Satanivskij <satan@ukr.net>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: FreeBSD 10.0-BETA1 #8 r256765M spend too  much time in locks
Message-ID:  <20131024075023.GA52443@hell.ukr.net>
In-Reply-To: <20131024074826.GA50853@hell.ukr.net>
References:  <20131024074826.GA50853@hell.ukr.net>



Eh, forgot to mention: the old system had no compression on the main ZFS dataset.



Vitalij Satanivskij wrote:
VS> Hello.
VS> 
VS> After upgrading the system from an old current (r245701) to a fresh current r255173 (and then switching to stable/10 r256765M),
VS> I found some strange system behavior:
VS> 
VS> The diff between r256765M and r256765 is:
VS> Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
VS> ===================================================================
VS> --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c        (revision 256765)
VS> +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c        (working copy)
VS> @@ -5147,7 +5147,7 @@
VS>         len = l2hdr->b_asize;
VS>         cdata = zio_data_buf_alloc(len);
VS>         csize = zio_compress_data(ZIO_COMPRESS_LZ4, l2hdr->b_tmp_cdata,
VS> -           cdata, l2hdr->b_asize, (size_t)SPA_MINBLOCKSIZE);
VS> +           cdata, l2hdr->b_asize, (size_t)(1ULL << l2hdr->b_dev->l2ad_vdev->vdev_ashift));
VS>  
VS>         if (csize == 0) {
VS>                 /* zero block, indicate that there's nothing to write */
VS> 
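VS> The only thing this local change does is pass the vdev's sector size, 1 << ashift, to zio_compress_data() as the minimum block size for L2ARC compression, instead of the hard-coded SPA_MINBLOCKSIZE of 512. A minimal sketch of just that arithmetic (not the ZFS code itself; ashift=12 for 4K-sector SSDs is an assumption):
VS> 
VS> #include <stdio.h>
VS> #include <stdint.h>
VS> 
VS> #define SPA_MINBLOCKSIZE 512    /* value the stock code passes */
VS> 
VS> int
VS> main(void)
VS> {
VS>         /* typical ashift values: 9 = 512B sectors, 12 = 4K, 13 = 8K */
VS>         uint64_t ashifts[] = { 9, 12, 13 };
VS> 
VS>         for (size_t i = 0; i < sizeof(ashifts) / sizeof(ashifts[0]); i++) {
VS>                 size_t patched = (size_t)(1ULL << ashifts[i]);
VS>                 printf("ashift=%2ju: stock minblocksize=%d, patched=%zu\n",
VS>                     (uintmax_t)ashifts[i], SPA_MINBLOCKSIZE, patched);
VS>         }
VS>         return (0);
VS> }
VS> 
VS> So on an ashift=9 vdev the patch changes nothing; only with 4K- or 8K-sector cache devices does the compression granularity actually grow.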
VS> 
VS> But the same situation existed before this patch as well.
VS> 
VS> 
VS> System load is too high:
VS> CPU:  6.8% user,  0.0% nice, 57.3% system,  0.8% interrupt, 35.1% idle
VS> 
VS> hotkernel (a DTrace script) says:
VS> 
VS> kernel`__mtx_unlock_flags                                 292   0.3%
VS> kernel`__mtx_lock_flags                                   315   0.3%
VS> zfs.ko`lzjb_compress                                      349   0.3%
VS> kernel`__rw_wlock_hard                                    686   0.7%
VS> kernel`spinlock_exit                                     1050   1.0%
VS> kernel`vmem_xalloc                                       7516   7.3%
VS> kernel`_sx_xlock_hard                                    8664   8.5%
VS> kernel`acpi_cpu_c1                                       9737   9.5%
VS> kernel`cpu_idle                                         20189  19.7%
VS> kernel`__mtx_lock_sleep                                 45952  44.9%
VS> 
VS> 
VS> 
VS> Trying to understand where the problem is, I built a kernel with lock profiling
VS> 
VS> and collected some data (for one minute)
VS> 
VS> (file attached) 
VS> 
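VS> For reference, this is roughly how the one-minute sample was taken. Just a sketch assuming the stock LOCK_PROFILING machinery and its debug.lock.prof.* sysctls; the same thing can be done by hand with sysctl(8):
VS> 
VS> #include <sys/types.h>
VS> #include <sys/sysctl.h>
VS> #include <err.h>
VS> #include <stdio.h>
VS> #include <stdlib.h>
VS> #include <unistd.h>
VS> 
VS> int
VS> main(void)
VS> {
VS>         int on = 1, off = 0;
VS>         size_t len;
VS>         char *stats;
VS> 
VS>         /* clear old counters, then enable profiling for one minute */
VS>         if (sysctlbyname("debug.lock.prof.reset", NULL, NULL, &on, sizeof(on)) == -1 ||
VS>             sysctlbyname("debug.lock.prof.enable", NULL, NULL, &on, sizeof(on)) == -1)
VS>                 err(1, "kernel built with options LOCK_PROFILING?");
VS>         sleep(60);
VS>         (void)sysctlbyname("debug.lock.prof.enable", NULL, NULL, &off, sizeof(off));
VS> 
VS>         /* dump the accumulated per-lock-site statistics (plain text) */
VS>         if (sysctlbyname("debug.lock.prof.stats", NULL, &len, NULL, 0) == -1)
VS>                 err(1, "debug.lock.prof.stats");
VS>         len += len / 8 + 4096;  /* slack in case the table grows */
VS>         if ((stats = malloc(len)) == NULL)
VS>                 err(1, "malloc");
VS>         if (sysctlbyname("debug.lock.prof.stats", stats, &len, NULL, 0) == -1)
VS>                 err(1, "debug.lock.prof.stats");
VS>         fwrite(stats, 1, len, stdout);
VS>         free(stats);
VS>         return (0);
VS> }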
VS> Using aggregation, I found that most of the lock time is spent in:
VS> 
VS> 14,159818        /usr/src/sys/kern/subr_vmem.c:1128(sleep mutex:kmem arena) 
VS> 9,553523         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1597(sx:buf_hash_table.ht_locks[i].ht_lock) 
VS> 8,386943         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:3541(sx:l2arc_buflist_mtx) 
VS> 8,110206         /usr/src/sys/kern/subr_vmem.c:1230(sleep mutex:kmem arena) 
VS> 5,909429         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1969(sx:arc_mru->arcs_locks[i].arcs_lock) 
VS> 5,452206         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1969(sx:arc_mfu->arcs_locks[i].arcs_lock) 
VS> 5,050224         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:303(sx:tx->tx_cpu[c].tc_open_lock) 
VS> 4,232819         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:891(sx:buf_hash_table.ht_locks[i].ht_lock) 
VS> 4,211348         /usr/src/sys/kern/vfs_subr.c:2101(lockmgr:zfs) 
VS> 4,011656         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:862(sx:buf_hash_table.ht_locks[i].ht_lock) 
VS> 3,823698         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2009(sx:arc_eviction_mtx) 
VS> 2,697344         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:126(sx:h->hash_mutexes[i]) 
VS> 2,343242         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1256(sx:arc_mfu->arcs_locks[i].arcs_lock) 
VS> 1,752713         /usr/src/sys/kern/vfs_lookup.c:707(lockmgr:zfs) 
VS> 1,580856         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1136(sx:zfsvfs->z_hold_mtx[i]) 
VS> 1,534242         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1291(sx:arc_mfu->arcs_locks[i].arcs_lock) 
VS> 1,331583         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:129(sx:db->db_mtx) 
VS> 1,105058         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:427(sx:vq->vq_lock) 
VS> 1,080855         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:396(sx:vq->vq_lock) 
VS> 0,858568         /usr/src/sys/kern/vfs_cache.c:488(rw:Name Cache) 
VS> 0,831652         /usr/src/sys/vm/vm_kern.c:344(rw:kmem vm object) 
VS> 0,815439         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1845(sx:buf_hash_table.ht_locks[i].ht_lock) 
VS> 0,812613         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1256(sx:arc_mru->arcs_locks[i].arcs_lock) 
VS> 0,794274         /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1529(lockmgr:zfs) 
VS> 0,669845         /usr/src/sys/vm/uma_core.c:2097(sleep mutex:zio_cache)
VS> 
VS> 
VS> 
VS> 
VS> Short system description:
VS> CPU: E5-1650
VS> MEM: 128GB DDR3-1600
VS> 
VS> Storage subsystem 
VS> 
VS> 36x 1TB WD RE4 drives on an LSI SAS2308 controller
VS> 3x 180GB Intel SSD 530 series as L2ARC cache
VS> 
VS> 
VS> The pool is 18 mirrors, two drives in each mirror, plus 3 cache devices,
VS> 
VS> e.g.:
VS> ....
VS>           mirror-14     ONLINE       0     0     0
VS>             gpt/disk28  ONLINE       0     0     0
VS>             gpt/disk29  ONLINE       0     0     0
VS>           mirror-15     ONLINE       0     0     0
VS>             gpt/disk30  ONLINE       0     0     0
VS>             gpt/disk31  ONLINE       0     0     0
VS>           mirror-16     ONLINE       0     0     0
VS>             gpt/disk32  ONLINE       0     0     0
VS>             gpt/disk33  ONLINE       0     0     0
VS>           mirror-17     ONLINE       0     0     0
VS>             gpt/disk34  ONLINE       0     0     0
VS>             gpt/disk35  ONLINE       0     0     0
VS>         cache
VS>           ada1          ONLINE       0     0     0
VS>           ada2          ONLINE       0     0     0
VS>           ada3          ONLINE       0     0     0
VS> 
VS> 
VS> The pool has two ZFS datasets.
VS> 
VS> The main one, with options (diffs from default):
VS> compression           lz4
VS> secondarycache        all
VS> sync                  disabled
VS> 
VS> Data size for it is around 6TB (compressed), e.g.:
VS> disk1  refcompressratio      1.56x                                          -
VS> disk1  written               5,99T                                          -
VS> disk1  logicalused           10,8T                                          -
VS> disk1  logicalreferenced     9,32T                                          -
VS> 
VS> 
VS> and another one, with options:
VS> recordsize            4K (it was 32K before, but the internal software mostly uses 4K data blocks, so we tried changing it, without reallocating the existing data)
VS> compression           off
VS> sync                  disabled
VS> secondarycache        all
VS> 
VS> Data size on it is around 1.4TB.
VS> 
VS> The ARC is configured to use at most 80GB.
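VS> The limit and the current ARC size can be double-checked with the usual sysctls (values in bytes); a trivial sketch, equivalent to running sysctl(8) by hand:
VS> 
VS> #include <sys/types.h>
VS> #include <sys/sysctl.h>
VS> #include <err.h>
VS> #include <stdint.h>
VS> #include <stdio.h>
VS> 
VS> static uint64_t
VS> get64(const char *name)
VS> {
VS>         uint64_t v = 0;
VS>         size_t len = sizeof(v);
VS> 
VS>         if (sysctlbyname(name, &v, &len, NULL, 0) == -1)
VS>                 err(1, "%s", name);
VS>         return (v);
VS> }
VS> 
VS> int
VS> main(void)
VS> {
VS>         printf("arc_max  = %ju MB\n", (uintmax_t)(get64("vfs.zfs.arc_max") >> 20));
VS>         printf("arc_min  = %ju MB\n", (uintmax_t)(get64("vfs.zfs.arc_min") >> 20));
VS>         printf("arc size = %ju MB\n", (uintmax_t)(get64("kstat.zfs.misc.arcstats.size") >> 20));
VS>         return (0);
VS> }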
VS> 
VS> Most of the time top looks like this:
VS> 
VS> Mem: 14G Active, 13G Inact, 94G Wired, 497M Cache, 3297M Buf, 2214M Free
VS> ARC: 80G Total, 2010M MFU, 70G MRU, 49M Anon, 3822M Header, 4288M Other
VS> 
VS> 
VS> Load averages are around 10-20, depending on the time of day.
VS> 
VS> 
VS> zpool iostat disk1 1
VS>                capacity     operations    bandwidth
VS> pool        alloc   free   read  write   read  write
VS> ----------  -----  -----  -----  -----  -----  -----
VS> disk1       7,45T  8,86T    546  1,49K  16,4M  13,4M
VS> disk1       7,45T  8,86T    272  3,91K  11,7M  27,4M
VS> disk1       7,45T  8,86T    344  2,94K  7,26M  25,2M
VS> disk1       7,45T  8,86T    224  2,02K  9,91M  21,8M
VS> disk1       7,45T  8,86T    244  2,35K  8,23M  18,4M
VS> disk1       7,45T  8,86T    245  2,54K  6,44M  23,4M
VS> disk1       7,45T  8,86T    114  2,94K  3,28M  13,3M
VS> disk1       7,45T  8,86T    288  4,43K  6,09M  33,5M
VS> disk1       7,45T  8,86T    157  1,26K  2,98M  15,7M
VS> disk1       7,45T  8,86T     94    842  3,07M  13,6M
VS> disk1       7,45T  8,86T    327  1,71K  9,63M  8,21M
VS> disk1       7,45T  8,86T    237  1,81K  5,73M  29,3M
VS> disk1       7,45T  8,86T    247  3,47K  5,17M  29,6M
VS> disk1       7,45T  8,86T    165  2,37K  3,22M  16,7M
VS> disk1       7,45T  8,86T    155  3,23K  3,27M  23,9M
VS> 
VS> This is strange, as the timeout is set to 10 seconds.
VS> 
VS> What is interesting: after a reboot the system works fine for some time, at least until the ARC grows to its 80G size.
VS> The low limit for the ARC is 40GB. Strangely enough, the old system could take memory back from the ARC, e.g. like this:
VS> 
VS> 
VS> Mem: 32G Active, 8797M Inact, 78G Wired, 2370M Cache, 209M Buf, 3977M Free
VS> ARC: 43G Total, 2204M MFU, 28G MRU, 135M Anon, 7898M Header, 5301M Other
VS> 
VS> On the new system the ARC grows all the way to its maximum allowed size.
VS> 
VS> So for now the question is: what could this be, and what can we try in order to improve system performance?
VS> 
VS> 
VS> 
VS> _______________________________________________
VS> freebsd-hackers@freebsd.org mailing list
VS> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
VS> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"


