From owner-freebsd-fs@freebsd.org Sat Jan 21 10:03:59 2017
From: Jordan Hubbard
Subject: Re: Poor ZFS performance
Date: Sat, 21 Jan 2017 01:03:51 -0800
In-reply-to: <595d8117-e2f2-fa4f-a45e-3a9fb93d0687@webmail.sub.ru>
References: <595d8117-e2f2-fa4f-a45e-3a9fb93d0687@webmail.sub.ru>
To: Alex Povolotsky
Cc: freebsd-fs@freebsd.org

> On Jan 21, 2017, at 12:49 AM, Alex Povolotsky wrote:
>
> I'm writing lots of (tens of millions) relatively small files, hashing them out in three-level directory, 100 entries per level.

You’re never going to get good performance doing that. The constraints placed on ZFS’ design for directories (UNIX API, POSIX compliance, etc.) and small file representation formats mean it’ll never be a “database” - the filesystem's design, to say nothing of UNIX’s directory iteration / lookup APIs, just isn't optimized for millions of small files because it was never the intention that any UNIX filesystem be a low-cost KVS or database analog. Things will quickly get pathological from a performance perspective, and fixing the pathologies would require such a significant redesign of a number of different pieces of the puzzle here that it’s never likely to happen.

Your application would be far better served by using an actual database.
I’m not just suggesting this as a hypothetical, either. I’ve dealt with several folks who went down this path, storing millions and even billions of small files in ZFS, and the results have never been pretty, nor have there been any easy options or “simple tunables” that were going to make those scenarios significantly prettier. The advice was the same: This needs to be a database, and there are plenty of those to choose from with all kinds of performance / consistency / redundancy characteristics to pick and choose between.

- Jordan
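
To make the "use an actual database" advice concrete, here is a minimal sketch of keeping small objects as rows in one database file instead of as millions of individual files. It assumes Python with the standard-library sqlite3 module purely for illustration; the thread does not name a specific database, and the table name `blobs` and the helpers `open_store`, `put_many`, and `get` are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical single-table key-value store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blobs ("
        "key TEXT PRIMARY KEY, value BLOB NOT NULL)"
    )
    return conn


def put_many(conn, items):
    """Insert or overwrite many (key, bytes) pairs in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT OR REPLACE INTO blobs (key, value) VALUES (?, ?)", items
        )


def get(conn, key):
    """Return the stored bytes for key, or None if the key is absent."""
    row = conn.execute(
        "SELECT value FROM blobs WHERE key = ?", (key,)
    ).fetchone()
    return None if row is None else bytes(row[0])


if __name__ == "__main__":
    store = open_store()
    # Each small "file" becomes one row; the primary-key index replaces
    # the hashed directory tree as the lookup structure.
    put_many(
        store,
        ((f"obj-{i:08d}", f"payload {i}".encode()) for i in range(10000)),
    )
    print(get(store, "obj-00000042"))
```

Batching writes into one transaction is the main design choice here: the database's key index does the lookup work that the directory hierarchy was being asked to do. Whether an embedded store like this or a server database fits better depends on the performance / consistency / redundancy requirements mentioned above.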