Date: Sat, 17 Oct 2015 00:54:52 +0300
From: Slawa Olhovchenkov <slw@zxy.spb.ru>
To: Warner Losh
Cc: Warner Losh, src-committers, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r289405 - head/sys/ufs/ffs
Message-ID: <20151016215452.GQ6469@zxy.spb.ru>
In-Reply-To: <4FC55895-99AF-4E5B-9E1B-C5085F3FC178@bsdimp.com>
References: <201510160306.t9G3622O049128@repo.freebsd.org>
 <20151016131940.GE42243@zxy.spb.ru>
 <3ADA7934-3EE1-449E-A8D1-723B73020C13@bsdimp.com>
 <20151016201850.GP6469@zxy.spb.ru>
 <4FC55895-99AF-4E5B-9E1B-C5085F3FC178@bsdimp.com>

On Fri, Oct 16, 2015 at 03:00:50PM -0600, Warner Losh wrote:

> >>>> Do not relocate extents to make them contiguous if the underlying drive can do
> >>>> deletions. Ability to do deletions is a strong indication that this
> >>>> optimization will not help performance. It will only generate extra write
> >>>> traffic. These devices are typically flash based and have a limited number of
> >>>> write cycles. In addition, making the file contiguous in LBA space doesn't
> >>>> improve the access times from flash devices because they have no seek time.
> >>>
> >>> In reality, flash devices have a seek time of about 0.1 ms.
> >>> Many flash devices can do 8 simultaneous "seeks" (I think NVMe can do
> >>> more).
> >>
> >> That's just not true. tREAD for most flash is a few tens of microseconds. The
> >> streaming time is at most 10 microseconds. There's no "seek" time in the classic
> >> sense. Once you get the data, you have it. There's no extra "read time" in
> >> the NAND flash parts.
> >>
> >> And the number of simultaneous reads depends a lot on how the flash vendor
> >> organized the flash. Many of today's designs use 8 or 16 die parts that have 2
> >> to 4 planes on them, giving parallelism in the 16-64 range. And that's before
> >> we get into innovative strategies that use partial page reads to decrease tREAD
> >> time and novel data striping methods.
> >>
> >> Seek time, as a separate operation, simply doesn't exist.
> >>
> >> Furthermore, NAND-based devices are log-structured, with garbage collection
> >> both for retention and to deal with retired blocks in the underlying NAND.
> >> The relationship between LBA ranges and where the data is at any given time on
> >> the NAND is almost uncorrelated.
> >>
> >> So, rearranging data so that it is in LBA-contiguous ranges doesn't help once
> >> you're above the FFS block level.
> >
> > A stream of random 512-4096 byte reads from most flash SATA drives in one
> > thread gives about 10K IOPS. That is only 40 Mbit/s out of the 6*0.8 Gbit/s
> > SATA bandwidth. You can decompose the 0.1 ms into the different real delays
> > (bank select, command processing, etc.) or just call it a 0.1 ms seek time
> > for all practical purposes.
>
> I strongly disagree. That's not seek time in the classic sense. All of those 100 us
> are the delay from reading the data from the flash. The reason I'm so adamant
> is that adjacent page reads have exactly the same cost. In a spinning disk,
> adjacent sector reads have a tiny cost compared to moving the head (seeking).
>
> Then again, I spent almost three years building a PCIe NAND-based flash
> drive, so maybe I'm biased by that experience...

From the internal point of view you are right. From the external point of view
this delay behaves like a seek time. For a typical HDD,
total_time = seek_time + transfer_time: seek_time is independent of block_size,
while transfer_time depends on block_size and is block_size/transfer_speed.

http://tweakers.net/benchdb/testcombo/3817
https://docs.google.com/spreadsheets/d/1BJ-XY0xwc1JvHJxfOGbcW2Be-q5GQbATcMOh9W0d1-U/edit?usp=sharing

These results are very closely approximated by 0.11 + block_size/461456, the
same form as for an HDD. Yes, there is no real seek. But in every such model
the fixed term acts like a seek time, regardless of its real cause.
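
For concreteness, here is a minimal sketch of the model above, assuming the
fitted constants are in milliseconds and bytes (i.e. roughly a 0.11 ms fixed
"seek-like" cost plus ~460 MB/s of streaming); the numbers come from the
thread and the linked spreadsheet, not from a new measurement:

#include <stdio.h>

/*
 * Per-request time for a single outstanding random read, per the fit
 * above: a fixed ~0.11 ms cost plus block_size/461456 (bytes per
 * millisecond, i.e. ~460 MB/s of streaming).  The units are an
 * assumption; the constants are taken from the thread.
 */
static double
request_ms(double block_size)
{

	return (0.11 + block_size / 461456.0);
}

int
main(void)
{
	double bs, ms, iops, mbit;

	for (bs = 512; bs <= 65536; bs *= 2) {
		ms = request_ms(bs);
		iops = 1000.0 / ms;		/* one request in flight */
		mbit = iops * bs * 8.0 / 1e6;	/* delivered bandwidth */
		printf("%6.0f bytes  %6.3f ms  %7.0f IOPS  %8.1f Mbit/s\n",
		    bs, ms, iops, mbit);
	}
	return (0);
}

At 512 bytes this works out to about 9K IOPS and roughly 37 Mbit/s with a
single request in flight, close to the ~10K IOPS / ~40 Mbit/s figures quoted
earlier in the thread; the fixed term dominates until the blocks get large.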
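
The commit message quoted at the top keys the behaviour change on whether
"the underlying drive can do deletions". The following is not the r289405
change itself; it is only an illustration of that check, asked from userland
on FreeBSD via the DIOCGATTR ioctl and the stock GEOM::candelete attribute
(the program and device path are hypothetical, error handling is minimal):

#include <sys/types.h>
#include <sys/disk.h>
#include <sys/ioctl.h>

#include <err.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Ask GEOM whether the provider backing fd supports deletions
 * (TRIM/UNMAP).  Only the attribute name is standard; the rest of
 * this program is illustrative.
 */
static bool
candelete(int fd)
{
	struct diocgattr_arg arg;

	strlcpy(arg.name, "GEOM::candelete", sizeof(arg.name));
	arg.len = sizeof(arg.value.i);
	if (ioctl(fd, DIOCGATTR, &arg) == 0)
		return (arg.value.i != 0);
	return (false);
}

int
main(int argc, char **argv)
{
	int fd;

	if (argc != 2)
		errx(1, "usage: %s /dev/<disk>", argv[0]);
	if ((fd = open(argv[1], O_RDONLY)) < 0)
		err(1, "%s", argv[1]);
	printf("%s: candelete=%d\n", argv[1], candelete(fd));
	close(fd);
	return (0);
}

A provider that reports candelete=1 advertises TRIM/UNMAP-style deletions,
which is the signal the commit message treats as "flash, so skip the
relocation".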