Date:      Sat, 8 Aug 2015 14:41:07 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Willem Jan Withagen <wjw@digiware.nl>
Cc:        fs@freebsd.org
Subject:   Re: Using SSDs as swap
Message-ID:  <20150808114107.GD2072@kib.kiev.ua>
In-Reply-To: <55C5E697.4080102@digiware.nl>
References:  <55C5D48E.6010605@digiware.nl> <20150808102900.GA2072@kib.kiev.ua> <20150808103810.GB2072@kib.kiev.ua> <55C5E697.4080102@digiware.nl>

On Sat, Aug 08, 2015 at 01:23:03PM +0200, Willem Jan Withagen wrote:
> On 8-8-2015 12:38, Konstantin Belousov wrote:
> > On Sat, Aug 08, 2015 at 01:29:00PM +0300, Konstantin Belousov wrote:
> >> On Sat, Aug 08, 2015 at 12:06:06PM +0200, Willem Jan Withagen wrote:
> >>> A commit just went by with the following in its log, and it
> >>> again raised a question I have had for some time.
> >>>
> >>> ----
> >>> Log:
> >>>   Enable BIO_DELETE passthru in GELI, so TRIM/UNMAP can work as expected
> >>>   when GELI is used on a SSD or inside virtual machine, so that guest can
> >>>   tell host that it is no longer using some of the storage.
> >>> -----
> >>>
> >>> In ZFS I slice my SSDs into log and cache devices, but on a server with
> >>> little memory (which can't be grown) I use a partition on each SSD as swap
> >>> as well. That way swapping does not have to seek, and loads faster.
> >>> Allocating a few GB of an SSD to swap is not really all that
> >>> painful, given current sizes, but the speed difference compared with
> >>> regular spindles is impressive.
> >>>
> >>> But the questions are:
> >>> 1) Does the swap driver understand that backing-store needs a TRIM?
> >> No.
> >>
> >>> 1a) if not would it be useful, and what would it take to implement?
> >> One good thing is that it is simply a question of coding: the VM
> >> already has a place where it informs the swap pager that the page copy
> >> in swap is no longer needed. This is the vm_pager_page_unswapped() call
> >> and the swap pager method swap_pager_unswapped(). swp_pager_meta_ctl()
> >> would need to issue BIO_DELETE to the backing storage.
> >>
> >> On the other hand, note that this would increase the amount of work
> >> performed, even for swap volumes located on rotating media,
> >> which is the more typical and reasonable setup.
> >>
> >> I think an implementation and a knob to turn it off, or configure per
> >> swap partition, would be reasonable.
> > 
> > One additional thing: while BIO_DELETE is in progress, the swap block
> > cannot be marked free, since otherwise we could write another page there
> > and have it obliterated by the TRIM. This can be done async, but the
> > consequence is that swap space would be released and become usable some
> > time after the page-in.  This will affect loads which are close to OOM.
> 
> Sort of makes sense to me...
> 
> I take it that BIO_DELETE fires and returns before TRIM is completed?
> But then the SSD accepts writes to a TRIMmed block and mixes them
> up? Possibly discarding a write to a block that is about to be trimmed?
> This strikes me as odd, but then I do not know the full intricate
> details of TRIM on SSDs.
> 
> Would it be possible to be notified that a TRIM has completed, only then
> to actually free the swap sectors?
This is exactly what I wrote above.

> And then perhaps the swap bookkeeping does not yet accommodate a
> possible extra state?
It does not need to.  The in-flight BIO_DELETE remembers the intermediate
state; the swap block should be freed only after the storage has reported the
BIO_DELETE as finished.  This is exactly how UFS handles trimming of free
blocks: the bitmap of used/free blocks is updated only after the BIO_DELETE
has finished, not when the inode drops its reference to the block.
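
To illustrate the ordering, here is a minimal userland sketch in C.  It is
not kernel code: struct swap_map, struct trim_req, swap_alloc(),
swap_unswapped_start() and trim_done() are hypothetical names invented for
this example, and the "trim" is only simulated.  The point is that the
allocation map needs no extra state: the block simply stays marked as used
until the completion handler of the in-flight request runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS 8

/* Hypothetical swap allocation map: true = in use, false = free. */
struct swap_map {
	bool used[NBLOCKS];
};

/* Hypothetical stand-in for an in-flight BIO_DELETE request. */
struct trim_req {
	struct swap_map *map;	/* map that owns the block */
	size_t blk;		/* block being trimmed */
	bool done;		/* set once the storage reports completion */
};

/* Allocate the first free block; return -1 if none is free. */
static int
swap_alloc(struct swap_map *m)
{
	for (size_t i = 0; i < NBLOCKS; i++) {
		if (!m->used[i]) {
			m->used[i] = true;
			return ((int)i);
		}
	}
	return (-1);
}

/*
 * The page no longer needs its swap copy: start the (simulated) trim.
 * The block is deliberately left marked as used, so the allocator
 * cannot hand it out while the trim is still in flight.
 */
static struct trim_req
swap_unswapped_start(struct swap_map *m, size_t blk)
{
	assert(blk < NBLOCKS && m->used[blk]);
	struct trim_req r = { .map = m, .blk = blk, .done = false };
	return (r);
}

/* Completion handler: only now does the block become free. */
static void
trim_done(struct trim_req *r)
{
	r->map->used[r->blk] = false;
	r->done = true;
}
```

Between swap_unswapped_start() and trim_done() an allocation skips the
block, which is the delayed release of swap space mentioned earlier in the
thread.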

> 
> Speaking of blocks.... Does swap take into account that disks could
> have a sector size other than 512 bytes? I would guess so, since we
> could have a 4K disk as a swap disk, and doing read-modify-write for swap
> is sure going to kill performance.
Swap performs I/O in chunks of at least one page, and pages are at least 4K
on all supported platforms (even on ARM, where we do not support smaller
pages, AFAIK).
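
As a small worked check of that arithmetic, the sketch below tests whether
an I/O covers only whole sectors, which is the condition for avoiding
read-modify-write.  The helper io_is_sector_aligned() is a hypothetical
name for this example, and the sketch assumes the swap partition itself
starts on a sector boundary.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if an I/O of io_size bytes at byte offset off touches
 * only whole sectors of size sector_size.
 */
static bool
io_is_sector_aligned(size_t off, size_t io_size, size_t sector_size)
{
	if (sector_size == 0)
		return (false);
	return (off % sector_size == 0 && io_size % sector_size == 0);
}
```

With 4096-byte pages, io_is_sector_aligned(n * 4096, 4096, 512) and
io_is_sector_aligned(n * 4096, 4096, 4096) both hold for any page index n,
so page-sized swap I/O never needs a partial-sector write on either kind of
disk.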



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150808114107.GD2072>