Date:      Fri, 16 Oct 2015 15:00:50 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc:        Warner Losh <imp@FreeBSD.org>, src-committers <src-committers@freebsd.org>, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r289405 - head/sys/ufs/ffs
Message-ID:  <4FC55895-99AF-4E5B-9E1B-C5085F3FC178@bsdimp.com>
In-Reply-To: <20151016201850.GP6469@zxy.spb.ru>
References:  <201510160306.t9G3622O049128@repo.freebsd.org> <20151016131940.GE42243@zxy.spb.ru> <3ADA7934-3EE1-449E-A8D1-723B73020C13@bsdimp.com> <20151016201850.GP6469@zxy.spb.ru>

> On Oct 16, 2015, at 2:18 PM, Slawa Olhovchenkov <slw@zxy.spb.ru> wrote:
> 
> On Fri, Oct 16, 2015 at 01:22:44PM -0600, Warner Losh wrote:
> 
>> 
>>> On Oct 16, 2015, at 7:19 AM, Slawa Olhovchenkov <slw@zxy.spb.ru> wrote:
>>> 
>>> On Fri, Oct 16, 2015 at 03:06:02AM +0000, Warner Losh wrote:
>>> 
>>>> Author: imp
>>>> Date: Fri Oct 16 03:06:02 2015
>>>> New Revision: 289405
>>>> URL: https://svnweb.freebsd.org/changeset/base/289405
>>>> 
>>>> Log:
>>>> Do not relocate extents to make them contiguous if the underlying drive can do
>>>> deletions. Ability to do deletions is a strong indication that this
>>>> optimization will not help performance. It will only generate extra write
>>>> traffic. These devices are typically flash based and have a limited number of
>>>> write cycles. In addition, making the file contiguous in LBA space doesn't
>>>> improve the access times from flash devices because they have no seek time.
>>> 
>>> In reality, flash devices have seek time, about 0.1ms.
>>> Many flash devices can do 8 simultaneously "seek" (I think NVMe can do
>>> more).
>> 
>> That's just not true. tREAD for most flash is a few tens of microseconds. The
>> streaming time is at most 10 microseconds. There's no "seek" time in the classic
>> sense. Once you get the data, you have it. There's no extra "read time" in
>> the NAND flash parts.
>> 
>> And the number of simultaneous reads depends a lot on how the flash vendor
>> organized the flash. Many of today's designs use 8 or 16 die parts that have 2
>> to 4 planes on them, giving a parallelism in the 16-64 range. And that's before
>> we get into innovative strategies that use partial page reads to decrease tREAD
>> time and novel data striping methods.
>> 
>> Seek time, as a separate operation, simply doesn't exist.
>> 
>> Furthermore, NAND-based devices are log-structured with garbage collection
>> for both retention and to deal with retired blocks in the underlying NAND. The
>> relationship between LBA ranges and where the data is at any given time on
>> the NAND is almost uncorrelated.
>> 
>> So, rearranging data so that it is in LBA contiguous ranges doesn't help once
>> you're above the FFS block level.
> 
> Stream of random reads 512-4096 bytes from most flash SATA drives in one
> thread give about 10K IOPS. This is only 40Mbit/s from 6*0.8 Gbit/s
> SATA bandwidth. You may decompose 0.1ms to different, real delay (bank
> select, command process and etc.) or give 0.1ms seek time for all
> practical purpose.

I strongly disagree. That’s not seek time in the classic sense. All of those 100us
are spent reading the data from the flash. The reason I’m so adamant is that
reading adjacent pages costs exactly the same as reading pages anywhere else. On
a spinning disk, reading adjacent sectors costs very little compared to moving
the head (seeking).
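
A toy model can make the distinction concrete. The sketch below is not a
measurement of any real drive: the 100us flash figure is the one quoted in this
thread, while the HDD seek and transfer times, the function names, and the LBA
lists are assumptions chosen only for illustration. It also assumes one
outstanding read at a time (no die or plane parallelism).

```python
# Toy latency model. Every constant is an illustrative assumption,
# not a measurement of any particular device.

FLASH_READ_US = 100.0    # per-read latency discussed in this thread (~0.1 ms)
HDD_SEEK_US = 8000.0     # assumed average seek plus rotational delay
HDD_TRANSFER_US = 20.0   # assumed transfer time per sector once positioned


def flash_cost_us(lbas):
    """Flash: every read pays the same latency, adjacent or not."""
    return len(lbas) * FLASH_READ_US


def hdd_cost_us(lbas):
    """Disk: a read pays a seek unless it directly follows the previous LBA."""
    total = 0.0
    prev = None
    for lba in lbas:
        if prev is None or lba != prev + 1:
            total += HDD_SEEK_US
        total += HDD_TRANSFER_US
        prev = lba
    return total


def serial_iops(latency_us):
    """Reads per second with exactly one read in flight."""
    return 1_000_000 / latency_us


adjacent = list(range(1000, 1008))
scattered = [1000, 52000, 7, 910000, 333, 48000, 120, 77000]

for name, lbas in (("adjacent", adjacent), ("scattered", scattered)):
    print(f"{name:9s} flash={flash_cost_us(lbas):8.0f}us "
          f"hdd={hdd_cost_us(lbas):8.0f}us")

print(f"single-threaded flash IOPS: {serial_iops(FLASH_READ_US):.0f}")
```

Under these assumptions the flash cost is identical for both access patterns,
while the disk cost differs by roughly the number of seeks avoided. That
difference is what "seek time" means in the classic sense, and it is the only
thing that making a file contiguous in LBA space can recover.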

Then again, I spent almost three years building a PCIe NAND-based flash
drive, so maybe I’m biased by that experience...

Warner


