From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 17:14:32 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 55B7C2DE for ; Thu, 11 Apr 2013 17:14:32 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta11.emeryville.ca.mail.comcast.net (qmta11.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:44:76:96:27:211]) by mx1.freebsd.org (Postfix) with ESMTP id 3B2A7B9E for ; Thu, 11 Apr 2013 17:14:32 +0000 (UTC) Received: from omta08.emeryville.ca.mail.comcast.net ([76.96.30.12]) by qmta11.emeryville.ca.mail.comcast.net with comcast id NckQ1l0010FhH24ABhEWJe; Thu, 11 Apr 2013 17:14:30 +0000 Received: from koitsu.strangled.net ([67.180.84.87]) by omta08.emeryville.ca.mail.comcast.net with comcast id NhEU1l00b1t3BNj8UhEU1b; Thu, 11 Apr 2013 17:14:30 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 51B6E73A33; Thu, 11 Apr 2013 10:14:28 -0700 (PDT) Date: Thu, 11 Apr 2013 10:14:28 -0700 From: Jeremy Chadwick To: Adam Nowacki Subject: Re: ZFS slow reads for unallocated blocks Message-ID: <20130411171428.GA56127@icarus.home.lan> References: <5166EA43.7050700@platinum.linux.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5166EA43.7050700@platinum.linux.pl> User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1365700470; bh=n139eCt+I4pSz9PE6DsND372GhSx/TOAQ9nnc+LzNJc=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=ZYxsQvm3tPhcEcjvBibP6s7ULTBz34ENeYnvZKqTk6Jt24somPUsVNiAlye++Fcdc Zkt8UHYU+C5VEbyGXy7jHRzamtK6qqG9xBMro99uFaVBjefqnoiKTneFKHWt+tXmGE HySAFUYbd3tnUy1YJZlsI35f1+kg2xXD7juyqV60g7pcWKJ/d9PhLo1ZKF9haNS8Wb L6YhkhUjEcECEBNXdHxBFa7vTBdy7fEHx6ph3ySYzU/L5uB1+mimbJ6auVwP6C4BG0 M9zBoo82eyF67c2Qq/RBkQIS7vF+rllq3vabgwdX+Cx/Qy9DFVrf7VgPL7PmRYb3lT Ni/fDTNOI5KiA== Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 17:14:32 -0000 On Thu, Apr 11, 2013 at 06:52:19PM +0200, Adam Nowacki wrote: > This one is quite weird - reads from files that were created and > resized with ftruncate (so no actual data on disk) are considerably > slower and use more CPU time than files with data. If compression is > enabled this will also affect files with long runs of zeroes as ZFS > won't write any data to disk in this case. There is no I/O on the > pool during the read tests - all fits into 10GB ARC. > > FreeBSD storage 9.1-RELEASE FreeBSD 9.1-RELEASE #0: Sat Feb 23 > 15:51:26 UTC 2013 root@storage:/usr/obj/usr/src/sys/GENERIC > amd64 > > Mem: 264M Active, 82M Inact, 12G Wired, 100M Cache, 13M Buf, 3279M Free > Swap: 5120M Total, 5120M Free > > # zfs create -o atime=off -o recordsize=128k -o compression=off -o > mountpoint=/home/testfs home/testfs > > --- truncated file: > > # time truncate -s 1G /home/testfs/trunc1g > 0.000u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w > > # time dd if=/home/testfs/trunc1g of=/dev/null bs=1024k > 1024+0 records in > 1024+0 records out > 1073741824 bytes transferred in 0.434817 secs (2469410435 bytes/sec) > 0.000u 0.435s 0:00.43 100.0% 26+2813k 0+0io 0pf+0w > > # time dd if=/home/testfs/trunc1g of=/dev/null bs=16k > 65536+0 records in > 65536+0 records out > 1073741824 bytes transferred in 3.809560 secs (281854564 bytes/sec) > 0.000u 3.779s 0:03.81 98.9% 25+2755k 0+0io 0pf+0w > > # time cat /home/testfs/trunc1g > /dev/null > 0.070u 14.031s 0:14.19 99.3% 15+2755k 0+0io 0pf+0w > ^^^^^^^ 14 seconds compared to 1 second for random data > > --- file filled with zeroes: > > # time dd if=/dev/zero of=/home/testfs/zero1g bs=1024k count=1024 > 1024+0 records in > 1024+0 records out > 1073741824 bytes transferred in 2.375426 secs (452020732 bytes/sec) > 0.000u 0.525s 0:02.37 21.9% 23+2533k 1+1io 0pf+0w > > # time dd if=/home/testfs/zero1g of=/dev/null bs=1024k > 1024+0 records in > 1024+0 records out > 1073741824 bytes transferred in 0.199078 secs (5393571244 bytes/sec) > 0.000u 0.200s 0:00.20 100.0% 26+2808k 0+0io 0pf+0w > > # time dd if=/home/testfs/zero1g of=/dev/null bs=16k > 65536+0 records in > 65536+0 records out > 1073741824 bytes transferred in 0.436472 secs (2460046434 bytes/sec) > 0.015u 0.421s 0:00.43 100.0% 26+2813k 0+0io 0pf+0w > > # time cat /home/testfs/zero1g > /dev/null > 0.023u 1.156s 0:01.18 99.1% 15+2779k 0+0io 0pf+0w > > --- file filled with random bytes: > > # time dd if=/dev/random of=/home/testfs/random1g bs=1024k count=1024 > 1024+0 records in > 1024+0 records out > 1073741824 bytes transferred in 16.116569 secs (66623474 bytes/sec) > 0.000u 13.214s 0:16.11 81.9% 25+2750k 0+1io 0pf+0w > > # time dd if=/home/testfs/random1g of=/dev/null bs=1024k > 1024+0 records in > 1024+0 records out > 1073741824 bytes transferred in 0.207115 secs (5184280044 bytes/sec) > 0.000u 0.208s 0:00.20 100.0% 26+2808k 0+0io 0pf+0w > > # time dd if=/home/testfs/random1g of=/dev/null bs=16k > 65536+0 records in > 65536+0 records out > 1073741824 bytes transferred in 0.432518 secs (2482536705 bytes/sec) > 0.023u 0.409s 0:00.43 97.6% 26+2828k 0+0io 0pf+0w > > # time cat /home/testfs/random1g > /dev/null > 0.031u 1.053s 0:01.08 100.0% 15+2770k 0+0io 0pf+0w > > --- compression on: > > # zfs create -o atime=off -o recordsize=128k -o compression=lzjb -o > mountpoint=/home/testfs home/testfs > > --- file filled with zeroes: > > # time dd if=/dev/zero of=/home/testfs/zero1g bs=1024k count=1024 > 1024+0 records in > 1024+0 records out > 1073741824 bytes transferred in 1.007765 secs (1065468404 bytes/sec) > 0.000u 0.458s 0:01.01 44.5% 26+2880k 1+1io 0pf+0w > > # time dd if=/home/testfs/zero1g of=/dev/null bs=1024k > 1024+0 records in > 1024+0 records out > 1073741824 bytes transferred in 0.630737 secs (1702360431 bytes/sec) > 0.000u 0.630s 0:00.63 100.0% 25+2742k 0+0io 0pf+0w > > # time dd if=/home/testfs/zero1g of=/dev/null bs=16k > 65536+0 records in > 65536+0 records out > 1073741824 bytes transferred in 4.089175 secs (262581530 bytes/sec) > 0.015u 4.036s 0:04.09 98.7% 25+2758k 0+0io 0pf+0w > > # time cat /home/testfs/zero1g > /dev/null > 0.031u 15.863s 0:15.95 99.6% 15+2754k 0+0io 0pf+0w > ^^^^^^^ > > --- it appears recordsize has a huge effect on this (recordsize=32k): > > # zfs create -o atime=off -o recordsize=32k -o compression=off -o > mountpoint=/home/testfs home/testfs > > # time truncate -s 1G testfs/trunc1g > 0.000u 0.000s 0:00.01 0.0% 0+0k 1+0io 0pf+0w > > # time cat /home/testfs/trunc1g > /dev/null > 0.047u 5.842s 0:05.93 99.1% 15+2761k 0+0io 0pf+0w > ^^^^^^ > > --- recordsize=4k: > > # zfs create -o atime=off -o recordsize=4k -o compression=off -o > mountpoint=/home/testfs home/testfs > > # time truncate -s 1G testfs/trunc1g > 0.000u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w > > # time cat /home/testfs/trunc1g > /dev/null > 0.047u 1.441s 0:01.52 97.3% 15+2768k 0+0io 0pf+0w > ^^^^^^ Take a look at src/bin/cat/cat.c, function raw_cat(). Therein lies the answer. TL;DR -- cat != dd. -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB |