From: Don Lewis
Date: Mon, 4 Mar 2013 21:27:00 -0800 (PST)
Subject: Re: access to hard drives is "blocked" by writes to a flash drive
To: ian@FreeBSD.org
Cc: deeptech71@gmail.com, phk@phk.freebsd.dk, freebsd-current@FreeBSD.org, peter@rulingia.com
Message-Id: <201303050527.r255R0Gd012437@gw.catspoiler.org>
In-Reply-To: <1362410177.1195.234.camel@revolution.hippie.lan>

On 4 Mar, Ian Lepore wrote:
> On Sun, 2013-03-03 at 19:01 -0800, Don Lewis wrote:
>> On 3 Mar, Poul-Henning Kamp wrote:
>>
>> > For various reasons (see: Lemming-syncer) FreeBSD will block all I/O
>> > traffic to other disks too, when these pileups gets too bad.
>>
>> The Lemming-syncer problem should have mostly been fixed by 231160 in
>> head (231952 in stable/9 and 231967 in stable/8) a little over a year
>> ago. The exceptions are atime updates, mmaped files with dirty pages,
>> and quotas. Under certain workloads I still notice periodic bursts of
>> seek noise.
>> After thinking about it for a bit, I suspect that it could
>> be atime updates, but I haven't tried to confirm that.
>>
>> When using TCQ or NCQ, perhaps we should limit the number of outstanding
>> writes per device to leave some slots open for reads. We should
>> probably also prioritize reads over writes unless we are under memory
>> pressure.
>>
>
> Then either those changes didn't have the intended effect, or the
> problem we're seeing with lack of system responsiveness when there's a
> large backlog of writes to a slow device is not the lemming-syncer
> problem. It's also not a lack of TCQ/NCQ slots, given that no such
> thing exists with SD card IO.
>
> When this is going on, the process driving the massive output spends
> almost all its time in a wdrain wait, and if you try to launch an app
> that isn't already in cache, a siginfo generally shows it to be in a
> getblk wait.

If your only drive is a single SD card, then you're pretty much hosed
when I/O is blocked because the SD card is doing an erase. It can only
handle one command at a time, and if a write blocks, there's nothing
that we can do to get it to execute a read until it is done with the
write command that it is hung up on.

I'm not familiar with the lower layers, but things might be less bad if
read ops can jump ahead and get sent to the drive before any queued
writes.
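
To make the "reads jump ahead of queued writes" idea concrete, here is a
rough userland sketch in C. It is not the FreeBSD bio/GEOM/CAM code, and
every name in it (io_sched, sched_dispatch, and so on) is made up for
illustration. It assumes two FIFO queues per device, a fixed number of
device slots (1 for an SD card, more for TCQ/NCQ), a cap on in-flight
writes, and a memory_pressure flag that some other part of the system
would have to set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types; none of this is an existing kernel interface. */
enum io_kind { IO_READ, IO_WRITE };

struct io_req {
    enum io_kind kind;
    int id;                     /* caller-chosen tag for the request */
};

#define QCAP 64                 /* arbitrary queue depth for the sketch */

struct fifo {
    struct io_req slot[QCAP];
    size_t head, count;
};

struct io_sched {
    struct fifo reads, writes;  /* pending requests, not yet issued */
    int inflight_total;         /* requests currently at the device */
    int inflight_writes;        /* how many of those are writes */
    int device_slots;           /* 1 for an SD card, >1 with TCQ/NCQ */
    int max_inflight_writes;    /* cap that keeps slots free for reads */
    bool memory_pressure;       /* when set, writes go first, uncapped */
};

static bool fifo_push(struct fifo *q, struct io_req r)
{
    if (q->count == QCAP)
        return false;
    q->slot[(q->head + q->count) % QCAP] = r;
    q->count++;
    return true;
}

static bool fifo_pop(struct fifo *q, struct io_req *out)
{
    if (q->count == 0)
        return false;
    *out = q->slot[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    return true;
}

void sched_init(struct io_sched *s, int device_slots, int max_inflight_writes)
{
    assert(device_slots >= 1);
    assert(max_inflight_writes >= 1 && max_inflight_writes <= device_slots);
    *s = (struct io_sched){0};
    s->device_slots = device_slots;
    s->max_inflight_writes = max_inflight_writes;
}

bool sched_enqueue(struct io_sched *s, enum io_kind kind, int id)
{
    struct io_req r = { kind, id };
    return fifo_push(kind == IO_READ ? &s->reads : &s->writes, r);
}

/* Pick the next request to hand to the device, if a slot is free. */
bool sched_dispatch(struct io_sched *s, struct io_req *out)
{
    if (s->inflight_total >= s->device_slots)
        return false;           /* device is busy; nothing can jump ahead */

    bool read_ok = s->reads.count > 0;
    bool write_ok = s->writes.count > 0 &&
        (s->memory_pressure || s->inflight_writes < s->max_inflight_writes);
    /* Reads win unless we are under memory pressure. */
    bool take_write = s->memory_pressure ? write_ok : (write_ok && !read_ok);

    if (take_write) {
        fifo_pop(&s->writes, out);
        s->inflight_writes++;
    } else if (read_ok) {
        fifo_pop(&s->reads, out);
    } else {
        return false;
    }
    s->inflight_total++;
    return true;
}

/* Called when the device reports that a request has finished. */
void sched_complete(struct io_sched *s, const struct io_req *r)
{
    assert(s->inflight_total > 0);
    s->inflight_total--;
    if (r->kind == IO_WRITE)
        s->inflight_writes--;
}
```

With device_slots set to 1, this only helps a read get in front of
writes that have not been issued yet. Once a write is at the card and
the card is off doing an erase, sched_dispatch just returns false until
sched_complete is called, which is the "pretty much hosed" case above.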