From owner-freebsd-hackers@FreeBSD.ORG Tue Dec 27 16:59:43 2011
Date: Tue, 27 Dec 2011 08:59:41 -0800
From: mdf@FreeBSD.org
To: Attilio Rao
Cc: freebsd-hackers@freebsd.org, Giovanni Trematerra, Venkatesh Srinivas
Subject: Re: Per-mount syncer threads and fanout for pagedaemon cleaning

On Tue, Dec 27, 2011 at 8:05 AM, Attilio Rao wrote:
> 2011/12/27 Giovanni Trematerra:
>> On Mon, Dec 26, 2011 at 9:24 PM, Venkatesh Srinivas wrote:
>>> Hi!
>>>
>>> I've been playing with two things in DragonFly that might be of
>>> interest here.
>>>
>>> Thing #1 :=
>>>
>>> First, per-mountpoint syncer threads. Currently there is a single
>>> thread, 'syncer', which periodically calls fsync() on dirty vnodes
>>> from every mount, along with calling vfs_sync() on each filesystem
>>> itself (via syncer vnodes).
>>>
>>> My patch modifies this to create syncer threads for mounts that
>>> request it. For these mounts, vnodes are synced from their
>>> mount-specific thread rather than from the global syncer.
>>>
>>> The idea is that periodic fsync/sync operations on one filesystem
>>> should not stall or delay synchronization for other ones.
>>>
>>> The patch was fairly simple:
>>> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/50e4012a4b55e1efc595db0db397b4365f08b640
>>>
>> There's something WIP by attilio@ in that area.
>> You might want to take a look at
>> http://people.freebsd.org/~attilio/syncer_alpha_15.diff
>>
>> I don't know what hammerfs needs, but UFS/FFS and the buffer cache
>> already do a good job performance-wise, so the authors are skeptical
>> about the boost such a change can give. We believe that brain cycles
>> need to be spent on other pieces of the system, such as ARC and ZFS.
>
> More specifically, it is likely that focusing on UFS and the buffer
> cache for performance is not really useful; we should direct our
> efforts at ARC and ZFS.
> Also, the real bottlenecks in our I/O paths are in GEOM's
> single-threaded design, the lack of unmapped I/O functionality,
> possibly the lack of prioritized I/O, etc.

Indeed, Isilon (and probably other vendors as well) entirely skips
VFS_SYNC when the WAIT argument is MNT_LAZY. Since we're a distributed,
journalled filesystem, syncing via a system thread is not a relevant
operation; i.e. all writes that have exited a VOP_WRITE or similar
operation are already in reasonably stable storage in a journal on the
relevant nodes.

However, we do then have our own threads running on each node to flush
the journal regularly (in addition to when it fills up), and I don't
know enough about this to say whether it could be fit into the syncer
thread idea or whether it is too tied to our architecture.

Cheers,
matthew
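
P.S. In case it helps make the VFS_SYNC/MNT_LAZY point concrete, here is
a rough, untested sketch of the early return described above. "myfs" and
myfs_journal_flush() are made-up names standing in for a journalled
filesystem's own flush path; only the VFS_SYNC(mp, waitfor) hook and the
MNT_LAZY/MNT_WAIT constants come from sys/mount.h.

#include <sys/param.h>
#include <sys/mount.h>

/* Hypothetical helper: pushes the on-disk journal; not a real API. */
static int	myfs_journal_flush(struct mount *mp, int waitfor);

static int
myfs_sync(struct mount *mp, int waitfor)
{
	/*
	 * The periodic lazy sync issued by the syncer thread has nothing
	 * to do here: anything that made it out of VOP_WRITE is already
	 * stable in the journal, so just report success.
	 */
	if (waitfor == MNT_LAZY)
		return (0);

	/*
	 * Explicit sync(2) or unmount (MNT_WAIT/MNT_NOWAIT): push the
	 * journal through whatever the filesystem's own flush path is.
	 */
	return (myfs_journal_flush(mp, waitfor));
}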