From owner-freebsd-fs@FreeBSD.ORG Thu Mar 19 10:27:11 2015
Sender: Alexander Motin
Message-ID: <550AA479.2020404@FreeBSD.org>
Date: Thu, 19 Mar 2015 12:27:05 +0200
From: Alexander Motin
To: john hood , freebsd-fs@FreeBSD.org
Subject: Re: MMCSD erase optimization not quite right?
References: <54E80BB6.2040501@glup.org> <54F42B6B.9080307@FreeBSD.org> <550A15E6.4060903@glup.org>
In-Reply-To: <550A15E6.4060903@glup.org>

On 19.03.2015 02:18, john hood wrote:
> Unfortunately, this scheme of trying to expand to a single large range
> and then erase it isn't very effective, at least with UFS. This isn't
> the fault of the code in mmcsd. UFS doesn't issue BIO_DELETEs in a very
> neat order -- it just issues them immediately when deallocating blocks,
> and not necessarily in sequential order. I often see UFS issue some
> BIO_DELETEs within a 4MB block, then some in another 4MB block (which
> discards the first accumulated range), and then some more in the first
> 4MB block. So the driver rarely accumulates a full 4MB block and erases
> it, even when UFS is actually deleting 4MB or more. The only situation
> that seems to work consistently enough to actually issue MMC erase
> commands is deleting a large file that was written with large (1MB?
> 4MB?) writes on an otherwise idle system. Attached is a diff with
> debugging printfs that shows this.
>
> This isn't the fault of the code in mmcsd; it shouldn't have to
> remember multiple regions that BIO_DELETE has been issued on. Also, UFS
> perhaps shouldn't issue BIO_DELETE immediately, because the block might
> be reused soon (I'm not sure what the right answer is here for best
> flash performance). It would be better if UFS had a kernel thread or
> daemon that eventually found free regions, coalesced them into large
> blocks, and issued BIO_DELETEs down the stack.
> (fsck -E already does this, but it's not the best place to handle it
> for routine operation.) Alternately, GEOM could have a BIO_DELETE
> manager module or functionality, but that incurs some cost in the GEOM
> stack.
>
> I have no idea how ZFS does this, but ZFS is not very likely to be seen
> on Raspberry Pis or SD cards :)

ZFS actually does what you are talking about above. It accumulates a
list of freed blocks, aggregates them where possible, tracks blocks that
get reused, and pushes the remainder down the stack after a timeout. I
don't know whether that results in many sequential 4MB chunks, but it
gives as much aggregation as possible. It would be good if UFS did at
least some aggregation of that kind; I know this topic comes up for
discussion from time to time.

> Probably this issue should go out to one of the mailing lists instead
> of just you and me...

Probably. Done.

-- 
Alexander Motin