From owner-freebsd-fs@FreeBSD.ORG Mon Dec 1 17:28:29 2014
Date: Mon, 01 Dec 2014 17:30:18 +0000
From: Steven Hartland
To: freebsd-fs@freebsd.org
Subject: Re: Irregular disk IO and poor performance (possibly after reading a lot of data from pool)
Message-ID: <547CA5AA.8080105@multiplay.co.uk>
References: <1417438604.143909513.k6b3b33f@frv41.fwdcdn.com>
In-Reply-To: <1417438604.143909513.k6b3b33f@frv41.fwdcdn.com>

What disks?

On 01/12/2014 13:21, Dmitriy Makarov wrote:
> We have a big ZFS pool (16 TiB) with 36 disks grouped into 18 mirror vdevs.
>
> This weekend we were maintaining data on the pool.
> For two days straight, 16 processes were busy reading files (to calculate checksums and the like).
>
> Starting Monday morning, a few hours after the maintenance was stopped,
> we began to observe abnormal ZFS behaviour, accompanied by
> very poor pool performance (many processes were blocked in zio->i).
>
> But the strangest thing is how IO is distributed between the mirror vdevs.
>
> Normally, our 'iostat -x 1' looks like this:
>
> device     r/s     w/s     kr/s     kw/s  qlen  svc_t   %b
> md0        0.0     5.9      0.0      0.0     0    0.0    0
> da0       28.7   178.2    799.6   6748.3     1    3.8   58
> da1       23.8   180.2    617.9   6748.3     1    3.4   56
> da2       44.6   168.3    681.3   6733.9     1    5.2   72
> da3       38.6   164.4    650.6   6240.3     1    4.9   65
> da4       29.7   176.3    471.3   5935.3     0    4.1   58
> da5       27.7   180.2    546.1   6391.3     1    3.9   57
> da6       27.7   238.6    555.0   6714.6     0    3.7   68
> da7       28.7   239.6    656.0   6714.6     0    3.3   58
> da8       26.7   318.8    738.7   8304.4     0    2.5   54
> da9       27.7   315.9    725.3   7769.7     0    3.0   77
> da10      23.8   268.3    510.0   7663.7     0    2.6   56
> da11      32.7   276.3    905.5   7697.9     0    3.4   70
> da12      24.8   293.1    559.0   6222.0     2    2.3   53
> da13      27.7   285.2    279.7   6058.1     1    2.9   62
> da14      29.7   226.8    374.3   5733.3     0    3.2   57
> da15      32.7   220.8    532.2   5538.7     1    3.3   65
> da16      30.7   165.4    638.2   4537.6     1    3.8   51
> da17      39.6   173.3    819.9   4884.2     1    3.2   46
> da18      28.7   221.8    765.4   5659.1     1    2.6   42
> da19      30.7   214.9    464.4   5417.4     0    4.6   78
> da20      32.7   177.2    725.3   4732.7     1    4.0   63
> da21      29.7   177.2    448.6   4722.8     0    5.3   66
> da22      19.8   153.5    398.6   4168.3     0    2.5   35
> da23      16.8   151.5    291.1   4243.6     1    2.9   39
> da24      26.7   186.2    547.1   5018.4     1    4.4   68
> da25      30.7   190.1    709.0   5096.6     1    5.0   71
> da26      28.7   222.8    690.7   5251.1     0    3.0   55
> da27      21.8   213.9    572.3   5248.6     0    2.8   49
> da28      34.7   177.2   1096.2   5027.8     1    4.9   65
> da29      36.6   175.3   1172.9   5012.0     2    4.9   63
> da30      22.8   197.1    462.9   5906.6     0    2.8   51
> da31      25.7   204.0    445.6   6138.3     0    3.4   62
> da32      31.7   170.3    557.0   5600.6     1    4.6   58
> da33      33.7   161.4    698.1   5509.5     1    4.8   60
> da34      28.7   269.3    473.8   6661.6     1    5.2   77
> da35      27.7   268.3    424.3   6440.8     0    5.6   75
>
> kw/s is always distributed pretty much evenly.
> Now it looks mostly like this:
>
> device     r/s     w/s     kr/s     kw/s  qlen  svc_t   %b
> md0        0.0    18.8      0.0      0.0     0    0.0    0
> da0       35.7     0.0   1070.9      0.0     0   13.3   37
> da1       38.7     0.0   1227.0      0.0     0   12.7   40
> da2       25.8     0.0    920.2      0.0     0   12.0   26
> da3       26.8     0.0    778.0      0.0     0   10.9   23
> da4       22.8     0.0    792.4      0.0     0   14.4   25
> da5       26.8     0.0   1050.5      0.0     0   13.4   27
> da6       32.7     0.0   1359.3      0.0     0   17.0   41
> da7       23.8   229.9    870.7  17318.1     0    3.0   55
> da8       58.5     0.0   1813.7      0.0     1   12.9   56
> da9       63.4     0.0   1615.0      0.0     0   12.4   61
> da10      48.6     0.0   1448.0      0.0     0   16.7   55
> da11      49.6     0.0   1148.2      0.0     1   16.7   60
> da12      47.6     0.0   1508.4      0.0     0   14.8   46
> da13      47.6     0.0   1417.7      0.0     0   17.9   55
> da14      44.6     0.0   1997.5      0.0     1   15.6   49
> da15      48.6     0.0   2061.4      0.0     1   14.2   47
> da16      44.6     0.0   1587.7      0.0     1   16.9   51
> da17      45.6     0.0   1326.1      0.0     2   15.7   55
> da18      50.5     0.0   1433.6      0.0     2   16.7   57
> da19      57.5     0.0   2415.8      0.0     3   20.4   70
> da20      52.5   222.0   2097.1  10613.0     5   12.8  100
> da21      52.5   256.7   1967.8  11498.5     3   10.6  100
> da22      37.7   433.1   1342.4  12880.1     4    5.5   99
> da23      42.6   359.8   2304.3  13073.8     5    7.2  101
> da24      33.7     0.0   1256.7      0.0     1   15.4   40
> da25      26.8     0.0    853.8      0.0     2   15.1   32
> da26      23.8     0.0    343.9      0.0     1   12.4   28
> da27      26.8     0.0    400.4      0.0     0   12.4   31
> da28      15.9     0.0    575.3      0.0     1   11.4   17
> da29      20.8     0.0    750.7      0.0     0   14.4   24
> da30      37.7     0.0    952.4      0.0     0   12.6   37
> da31      29.7     0.0    777.0      0.0     0   13.6   37
> da32      54.5   121.9   1824.6   6514.4     7   27.7  100
> da33      56.5   116.9   2017.3   6213.6     6   29.7   99
> da34      42.6     0.0   1303.3      0.0     1   14.9   43
> da35      45.6     0.0   1400.9      0.0     2   14.8   45
>
> Some devices show 0.0 kw/s for long periods of time,
> then others do, and so on.
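The same imbalance can also be watched one level up, per mirror vdev rather than per raw disk; a minimal sketch, assuming the pool name disk1 taken from the property dump later in this message:

    # Per-vdev (and per-member-disk) I/O statistics, refreshed every second;
    # mirrors whose writes stall for long stretches should stand out here directly.
    zpool iostat -v disk1 1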
>
> Here are some more results:
>
> device     r/s     w/s     kr/s     kw/s  qlen  svc_t   %b
> md0        0.0    37.9      0.0      0.0     0    0.0    0
> da0       58.9   173.7   1983.5   4585.3     3   11.2   87
> da1       49.9   162.7   1656.2   4548.4     3   14.0   95
> da2       40.9   187.6   1476.5   3466.6     1    4.8   58
> da3       42.9   188.6   1646.7   3466.6     0    5.3   64
> da4       54.9    33.9   2222.6   1778.4     1   13.3   63
> da5       53.9    37.9   2429.6   1778.4     2   12.9   68
> da6       42.9    33.9   1445.1    444.6     0   10.3   45
> da7       40.9    28.9   2045.9    444.6     0   12.3   43
> da8       53.9     0.0    959.6      0.0     1   22.7   62
> da9       29.9     0.0    665.2      0.0     1   52.1   64
> da10      52.9    83.8   1845.3   2084.8     2    8.2   64
> da11      44.9   103.8   1654.2   4895.2     1    8.8   71
> da12      50.9    60.9   1273.0   2078.3     1   10.3   69
> da13      39.9    57.9    940.1   2078.3     0   15.4   75
> da14      45.9    72.9    977.0   3178.6     0    8.5   63
> da15      48.9    72.9   1000.5   3178.6     0    9.6   72
> da16      42.9    74.9   1187.6   2118.8     1    6.7   51
> da17      48.9    82.8   1651.7   3013.0     0    5.7   52
> da18      67.9    78.8   2735.5   2456.1     0   11.5   75
> da19      52.9    79.8   2436.6   2456.1     0   13.1   82
> da20      48.9    91.8   2623.8   1682.6     1    7.2   60
> da21      52.9    92.8   1893.2   1682.6     0    7.1   61
> da22      67.9    20.0   2518.0    701.1     0   13.5   79
> da23      68.9    23.0   3331.8    701.1     1   13.6   77
> da24      45.9    17.0   2148.7    369.8     1   11.6   47
> da25      36.9    18.0   1747.5    369.8     1   12.6   46
> da26      46.9     1.0   1873.3      0.5     0   21.3   55
> da27      38.9     1.0   1395.7      0.5     0   34.6   58
> da28      34.9     9.0   1523.5     53.9     0   14.1   39
> da29      26.9    10.0   1124.8     53.9     1   13.8   28
> da30      44.9     0.0   1887.2      0.0     0   18.8   50
> da31      47.9     0.0   2273.0      0.0     0   20.2   49
> da32      65.9    90.8   2221.6   1730.5     3    9.7   77
> da33      79.8    90.8   3304.9   1730.5     1    9.9   88
> da34      75.8   134.7   3638.7   3938.1     2   10.2   90
> da35      49.9   209.6   1792.4   5756.0     2    8.1   85
>
> md0        0.0    19.0      0.0      0.0     0    0.0    0
> da0       38.0   194.8   1416.1   1175.8     1   10.6  100
> da1       40.0   190.8   1424.6   1072.9     2   10.4  100
> da2       37.0     0.0   1562.4      0.0     0   14.9   40
> da3       31.0     0.0   1169.8      0.0     0   14.0   33
> da4       44.0     0.0   2632.4      0.0     0   18.0   45
> da5       41.0     0.0   1944.6      0.0     0   19.0   45
> da6       38.0     0.0   1786.2      0.0     1   18.4   44
> da7       45.0     0.0   2275.7      0.0     0   16.0   48
> da8       80.9     0.0   4151.3      0.0     2   24.1   85
> da9       83.9     0.0   3256.2      0.0     3   21.2   83
> da10      61.9     0.0   3657.3      0.0     1   18.9   65
> da11      53.9     0.0   2532.5      0.0     1   18.7   56
> da12      54.9     0.0   2650.8      0.0     0   18.9   60
> da13      48.0     0.0   1975.5      0.0     0   19.6   53
> da14      43.0     0.0   1802.7      0.0     2   14.1   43
> da15      49.0     0.0   2455.5      0.0     0   14.0   48
> da16      45.0     0.0   1521.5      0.0     1   16.0   50
> da17      45.0     0.0   1650.8      0.0     4   13.7   47
> da18      48.0     0.0   1618.9      0.0     1   15.0   54
> da19      47.0     0.0   1982.0      0.0     0   16.5   55
> da20      52.9     0.0   2186.3      0.0     0   19.8   65
> da21      61.9     0.0   3020.5      0.0     0   16.3   61
> da22      70.9     0.0   3309.7      0.0     1   15.5   67
> da23      67.9     0.0   2742.3      0.0     2   16.5   73
> da24      38.0     0.0   1426.1      0.0     1   15.5   40
> da25      41.0     0.0   1905.6      0.0     1   14.0   39
> da26      43.0     0.0   2371.1      0.0     0   14.2   40
> da27      46.0     0.0   2178.3      0.0     0   15.2   45
> da28      44.0     0.0   2092.9      0.0     0   12.4   43
> da29      41.0     0.0   1442.1      0.0     1   13.4   37
> da30      42.0    37.0   1171.3    645.9     1   17.5   62
> da31      27.0    67.9    713.8    290.7     0   16.7   64
> da32      47.0     0.0   1043.5      0.0     0   13.3   43
> da33      50.0     0.0   1741.3      0.0     1   15.7   57
> da34      42.0     0.0   1119.9      0.0     0   18.2   55
> da35      45.0     0.0   1071.4      0.0     0   15.7   55
>
> The first thing we did was try a reboot.
> It took the system more than 5 minutes to import the pool (normally it takes a fraction of a second).
> Needless to say, the reboot did not help a bit.
>
> What can we do about this problem?
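One pattern worth ruling out first is a single slow or dying member disk: a transaction group cannot finish syncing until every vdev has completed its writes, so one sick drive can stall writes for the whole pool. A rough sketch of the usual checks, with da20 used purely as an example device name:

    # Per-disk latency and queue depth (ms/r, ms/w, %busy), sampled every second;
    # a drive with consistently higher ms/w than its mirror partner is suspect.
    gstat -I 1s

    # SMART health of one member disk (smartctl is in the sysutils/smartmontools port);
    # repeat for each of da0..da35 and look for reallocated or pending sectors.
    smartctl -a /dev/da20

    # Kernel wait channels of blocked threads, to see where they sleep in the zio path.
    procstat -kk -a | grep zio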
>
> System info:
> FreeBSD 11.0-CURRENT #5 r260625
>
> zpool get all disk1
> NAME   PROPERTY                       VALUE                SOURCE
> disk1  size                           16,3T                -
> disk1  capacity                       59%                  -
> disk1  altroot                        -                    default
> disk1  health                         ONLINE               -
> disk1  guid                           4909337477172007488  default
> disk1  version                        -                    default
> disk1  bootfs                         -                    default
> disk1  delegation                     on                   default
> disk1  autoreplace                    off                  default
> disk1  cachefile                      -                    default
> disk1  failmode                       wait                 default
> disk1  listsnapshots                  off                  default
> disk1  autoexpand                     off                  default
> disk1  dedupditto                     0                    default
> disk1  dedupratio                     1.00x                -
> disk1  free                           6,56T                -
> disk1  allocated                      9,76T                -
> disk1  readonly                       off                  -
> disk1  comment                        -                    default
> disk1  expandsize                     0                    -
> disk1  freeing                        0                    default
> disk1  feature@async_destroy          enabled              local
> disk1  feature@empty_bpobj            active               local
> disk1  feature@lz4_compress           active               local
> disk1  feature@multi_vdev_crash_dump  enabled              local
> disk1  feature@spacemap_histogram     active               local
> disk1  feature@enabled_txg            active               local
> disk1  feature@hole_birth             active               local
> disk1  feature@extensible_dataset     enabled              local
> disk1  feature@bookmarks              enabled              local
>
> zfs get all disk1
> NAME   PROPERTY              VALUE                  SOURCE
> disk1  type                  filesystem             -
> disk1  creation              Wed Sep 18 11:47 2013  -
> disk1  used                  9,75T                  -
> disk1  available             6,30T                  -
> disk1  referenced            9,74T                  -
> disk1  compressratio         1.63x                  -
> disk1  mounted               yes                    -
> disk1  quota                 none                   default
> disk1  reservation           none                   default
> disk1  recordsize            128K                   default
> disk1  mountpoint            /.........             local
> disk1  sharenfs              off                    default
> disk1  checksum              on                     default
> disk1  compression           lz4                    local
> disk1  atime                 off                    local
> disk1  devices               on                     default
> disk1  exec                  off                    local
> disk1  setuid                off                    local
> disk1  readonly              off                    default
> disk1  jailed                off                    default
> disk1  snapdir               hidden                 default
> disk1  aclmode               discard                default
> disk1  aclinherit            restricted             default
> disk1  canmount              on                     default
> disk1  xattr                 off                    temporary
> disk1  copies                1                      default
> disk1  version               5                      -
> disk1  utf8only              off                    -
> disk1  normalization         none                   -
> disk1  casesensitivity       sensitive              -
> disk1  vscan                 off                    default
> disk1  nbmand                off                    default
> disk1  sharesmb              off                    default
> disk1  refquota              none                   default
> disk1  refreservation        none                   default
> disk1  primarycache          all                    default
> disk1  secondarycache        none                   local
> disk1  usedbysnapshots       0                      -
> disk1  usedbydataset         9,74T                  -
> disk1  usedbychildren        9,71G                  -
> disk1  usedbyrefreservation  0                      -
> disk1  logbias               latency                default
> disk1  dedup                 off                    default
> disk1  mlslabel              -
> disk1  sync                  standard               local
> disk1  refcompressratio      1.63x                  -
> disk1  written               9,74T                  -
> disk1  logicalused           15,8T                  -
> disk1  logicalreferenced     15,8T                  -
>
> This is very severe for us; thanks in advance for any help.
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
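Since the trouble began a few hours after two days of whole-pool reading, it may also be worth checking whether that job left the ARC in a bad state (for example, mostly full of file data with little room for metadata), which would fit the read-heavy iostat picture above. A small sketch of the relevant FreeBSD sysctls; the statistic names are from the kstat.zfs.misc.arcstats tree:

    # Current ARC size, its target, and how much of it is metadata.
    sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c kstat.zfs.misc.arcstats.arc_meta_used

    # Demand hit/miss counters; sample twice a few minutes apart and compare the deltas.
    sysctl kstat.zfs.misc.arcstats.demand_data_hits kstat.zfs.misc.arcstats.demand_data_misses \
           kstat.zfs.misc.arcstats.demand_metadata_hits kstat.zfs.misc.arcstats.demand_metadata_misses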