FreeBSD Mail Archives

Date:      Thu, 24 Nov 2011 00:04:14 +0400
From:      Lev Serebryakov <lev@freebsd.org>
To:        Kostik Belousov <kostikbel@gmail.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: Does UFS2 send BIO_FLUSH to GEOM when update metadata (with softupdates)?
Message-ID:  <1391930411.20111124000414@serebryakov.spb.ru>
In-Reply-To: <20111123194444.GE50300@deviant.kiev.zoral.com.ua>
References:  <1957615267.20111123230026@serebryakov.spb.ru> <20111123194444.GE50300@deviant.kiev.zoral.com.ua>


Hello, Kostik.
You wrote 23 November 2011, 23:44:44:

>>   Does UFS2 with softupdates (without journal) issues BIO_FLUSH to
>> GEOM layer when it need to ensure consistency on on-disk metadata?
> No. Softupdates do not need flushes.
  It does need flushes, because WITHOUT flushes on modern storage
architectures there is no way to be sure that (I'm quoting your last
sentence) "writes reported as done by disk driver are indeed safely
landed in the involatile storage."

  It is sad, but it is true. Disk controllers have caches, and disks have
caches. In virtual environments and with networked storage (iSCSI/FC/whatever)
everything is even worse. And every layer LIES about "landing"; this was
shown, for example, by Brad Fitzpatrick many years ago (http://brad.livejournal.com/2116715.html).

  If SU doesn't mark its writes in a special way as strictly synchronous,
SU cannot be sure that data has really LANDED when the bio is marked as
complete. As far as I understand, there is no way to mark a bio
with the BIO_WRITE command as such a special case, and the only way to
ensure landing is to issue BIO_FLUSH after BIO_WRITE.
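
  A toy model may make the point concrete. The sketch below is Python,
not FreeBSD kernel code; the class CachingDisk and the helper
survives_crash are invented for illustration, and the write and flush
methods are only loose analogues of BIO_WRITE and BIO_FLUSH. It assumes
a layer that reports a write as done as soon as the data sits in a
volatile cache.

```python
class CachingDisk:
    """Hypothetical storage layer with a volatile write cache (toy model)."""

    def __init__(self):
        self.cache = {}    # volatile: lost on panic or power failure
        self.stable = {}   # non-volatile media

    def write(self, block, data):
        # Loose analogue of BIO_WRITE: reported done once the data is cached.
        self.cache[block] = data
        return "done"

    def flush(self):
        # Loose analogue of BIO_FLUSH: push all cached data to stable media.
        self.stable.update(self.cache)
        self.cache.clear()
        return "done"

    def crash(self):
        # Panic or power loss: whatever exists only in the cache disappears.
        self.cache.clear()


def survives_crash(flush_after_write):
    """Return True if a completed write is still on stable media after a crash."""
    disk = CachingDisk()
    disk.write(7, "inode update")
    if flush_after_write:
        disk.flush()
    disk.crash()
    return disk.stable.get(7) == "inode update"


if __name__ == "__main__":
    print("write only:       ", survives_crash(False))
    print("write, then flush:", survives_crash(True))
```

  In this model the write alone is "complete" from the caller's point of
view but is lost in the crash; only the write followed by a flush
survives.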

> You are making wrong conclusions from the false assumptions.
> The only requirement of the SU is that writes reported as done by disk
> driver are indeed safely landed in the involatile storage.
  See above. Only BIO_FLUSH can give some guarantee (again, not 100%,
only "best effort") that a completed BIO_WRITE has really landed.
Data can be queued at many layers, and without an explicit FLUSH it
may not actually be written for seconds or even minutes (while being
reported as written).

  For example, for decent RAID5 performance it is vital to have some
write cache. And when it is a software implementation, a UPS cannot
protect against system panics.

  So, it is very sad that SU and SU+J don't express their requirements
in code! It seems that even SU+J will not protect against crashes
when some GEOM does write caching.
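
  The ordering side of the problem can be sketched the same way. This is
again a Python toy model, not GEOM or UFS code: WriteBackLayer and
ordering_violated are hypothetical names, and the model assumes a
simplified view in which one metadata block (the inode) must land
before another block that refers to it (the directory entry), while the
caching layer destages pending writes in arbitrary order.

```python
import random


class WriteBackLayer:
    """Hypothetical caching layer that destages writes in arbitrary order."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.pending = []   # acknowledged, but only in volatile cache
        self.stable = []    # actually on non-volatile media

    def write(self, name):
        # Reported complete immediately, although nothing has landed yet.
        self.pending.append(name)

    def flush(self):
        # Everything acknowledged so far is forced to stable media.
        self.stable.extend(self.pending)
        self.pending.clear()

    def crash(self):
        # An arbitrary subset of pending writes happened to land before
        # the crash; the rest is lost.
        landed = [w for w in self.pending if self.rng.random() < 0.5]
        self.stable.extend(landed)
        self.pending.clear()


def ordering_violated(use_flush, seed):
    """True if the dependent block landed without the block it depends on."""
    layer = WriteBackLayer(seed)
    layer.write("inode")      # must be on disk first
    if use_flush:
        layer.flush()         # barrier between the two dependent writes
    layer.write("dirent")     # refers to the inode
    layer.crash()
    return "dirent" in layer.stable and "inode" not in layer.stable


if __name__ == "__main__":
    seeds = range(100)
    print("violations without flush:",
          sum(ordering_violated(False, s) for s in seeds))
    print("violations with flush:   ",
          sum(ordering_violated(True, s) for s in seeds))
```

  With the flush between the two writes, the model can never leave the
directory entry on disk without its inode; without it, waiting for the
first write to be reported complete gives no such protection.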

-- 
// Black Lion AKA Lev Serebryakov <lev@serebryakov.spb.ru>

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1391930411.20111124000414>
