From owner-freebsd-questions@FreeBSD.ORG  Thu Aug 15 21:21:55 2013
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTP id 412EC1CA
 for <freebsd-questions@freebsd.org>; Thu, 15 Aug 2013 21:21:55 +0000 (UTC)
 (envelope-from iamatt@gmail.com)
Received: from mail-we0-x22f.google.com (mail-we0-x22f.google.com
 [IPv6:2a00:1450:400c:c03::22f])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id C10A22A97
 for <freebsd-questions@freebsd.org>; Thu, 15 Aug 2013 21:21:54 +0000 (UTC)
Received: by mail-we0-f175.google.com with SMTP id q58so992057wes.6
 for <freebsd-questions@freebsd.org>; Thu, 15 Aug 2013 14:21:52 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=ekC369ntnDcMXyAVMTTOzOUAGTZRzrZTeHmBQ6fAzCo=;
 b=PjuW/ujrMEwnrD1owxm+W2XFBfs+5SblmDW3bC8h7y0QWz1DePi5eMu3FYtAQPeIVY
 llmYzf/gnEcOgyN7T6lN5lp/1AXZ+FNJ+A1kI+NOJ7nnXZCo7DPh2p2vuqPC+ptIZ3Wu
 6H1MsP4TN7Nh3aj1D5DliSRC0wJ2JApja+s0mUDBHIYmBiEnS6sLOPaMVB9O+a8jFxgr
 mgUTtx/YVi292A4+n/cnLi0yaXUJGQ06A4d3Wc+VjvZ/qBF5JLm9BP9HR66k3FDEi18v
 tRlbNSZV+l2nFYSh71dtjgtvpKclCkursuoWgkrLUpAyB0FAZ6Xi2mYZSLY3Kxeupa4B
 a9aw==
MIME-Version: 1.0
X-Received: by 10.180.37.164 with SMTP id z4mr3067872wij.30.1376601712865;
 Thu, 15 Aug 2013 14:21:52 -0700 (PDT)
Received: by 10.217.50.196 with HTTP; Thu, 15 Aug 2013 14:21:52 -0700 (PDT)
In-Reply-To: <611B3931-958B-4A46-A6BD-1CA541F32699@gmail.com>
References: <7E7AEB5A-7102-424E-8B1E-A33E0A2C8B2C@gmail.com>
 <CC3CFFD3-6742-447B-AA5D-2A4F6C483883@mac.com>
 <6483A298-6216-4306-913C-B3E0F4A3BC8D@gmail.com>
 <1F06D736-1019-4223-8546-5DBB0F5D878B@mac.com>
 <611B3931-958B-4A46-A6BD-1CA541F32699@gmail.com>
Date: Thu, 15 Aug 2013 16:21:52 -0500
Message-ID: <CAEeRwNVwZA5+cFFnB=nSJN2GTWmhvtoBmVmrCEzhXn04zxO8bQ@mail.gmail.com>
Subject: Re: copying milllions of small files and millions of dirs
From: iamatt <iamatt@gmail.com>
To: aurfalien <aurfalien@gmail.com>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: FreeBSD Questions <freebsd-questions@freebsd.org>
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-questions>, 
 <mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
 <mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 15 Aug 2013 21:21:55 -0000

I would use NDMP.  That is how we archive our NAS stuff (Isilon), though
we have the backend accelerators.  Not sure if there is an NDMP
implementation for FreeBSD.  As another poster said, you are most likely
I/O bound anyway.


On Thu, Aug 15, 2013 at 2:14 PM, aurfalien <aurfalien@gmail.com> wrote:

>
> On Aug 15, 2013, at 11:52 AM, Charles Swiger wrote:
>
> > On Aug 15, 2013, at 11:37 AM, aurfalien <aurfalien@gmail.com> wrote:
> >> On Aug 15, 2013, at 11:26 AM, Charles Swiger wrote:
> >>> On Aug 15, 2013, at 11:13 AM, aurfalien <aurfalien@gmail.com> wrote:
> >>>> Is there a faster way to copy files over NFS?
> >>>
> >>> Probably.
> >>
> >> Ok, thanks for the specifics.
> >
> > You're most welcome.
> >
> >>>> Currently breaking up a simple rsync over 7 or so scripts, each of
> >>>> which copies 22 dirs having ~500,000 dirs or files each.
> >>>
> >>> There's a maximum useful concurrency which depends on how many disk
> >>> spindles and what flavor of RAID is in use; exceeding it will result in
> >>> thrashing the disks and heavily reducing throughput due to competing I/O
> >>> requests.  Try measuring aggregate performance when running fewer rsyncs at
> >>> once and see whether it improves.
> >>
> >> It's 35 disks broken into 7 striped RAID-Z groups with an SLC-based ZIL
> >> and no atime; the server itself has 128GB ECC RAM.  I didn't have time to
> >> tune or really learn ZFS, but at this point it's only backing up the data
> >> for emergency purposes.
> >
> > OK.  If you've got 7 independent groups and can use separate network
> > pipes for each parallel copy, then using 7 simultaneous scripts is likely
> > reasonable.
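
A rough sketch of the batching idea discussed above, in case it is useful.
It splits a list of top-level dirs into batches and runs one batch per
worker, with a cap on how many run at once, so the concurrency can be
varied and timed.  The function names, the destination argument, and the
rsync "-a" option are my own assumptions for illustration, not the scripts
actually in use here.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def make_batches(dirs, n_batches):
    """Split dirs into n_batches round-robin slices of near-equal size."""
    return [dirs[i::n_batches] for i in range(n_batches)]


def copy_batch(batch, dest, rsync_args=("-a",)):
    """Run one rsync per directory in the batch, sequentially; return exit codes."""
    codes = []
    for src in batch:
        # Arguments are passed as a list, so no shell quoting is involved.
        result = subprocess.run(["rsync", *rsync_args, src, dest])
        codes.append(result.returncode)
    return codes


def copy_parallel(dirs, dest, n_batches=7, max_workers=7):
    """Copy the batches concurrently, with at most max_workers running at once."""
    batches = make_batches(dirs, n_batches)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda batch: copy_batch(batch, dest), batches))
```

Timing copy_parallel() with max_workers=7 against a lower value such as 4
(both numbers are placeholders) would show whether fewer simultaneous
rsyncs improve aggregate throughput on a given pool.
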
> >
> >>> Of course, putting half a million files into a single directory level
> >>> is also a bad idea, even with dirhash support.  You'd do better to break
> >>> them up into subdirs containing fewer than ~10K files apiece.
> >>
> >> I can't, that's our job structure, obviously developed by script kiddies
> >> and not systems people, but I digress.
> >
> > Identifying something which is "broken as designed" is still helpful,
> > since it indicates what needs to change.
> >
> >>>> Obviously reading all the meta data is a PITA.
> >>>
> >>> Yes.
> >>>
> >>>> Doing 10Gb with jumbo frames, but in this case it doesn't make much
> >>>> of a difference.
> >>>
> >>> Yeah, probably not-- you're almost certainly I/O bound, not network
> >>> bound.
> >>
> >> Actually it was network bound via 1 rsync process, which is why I broke
> >> up 154 dirs into 7 batches of 22 each.
> >
> > Oh.  Um, unless you can make more network bandwidth available, you've
> > saturated the bottleneck.
> > Doing a single copy task is likely to complete faster than splitting up
> > the job into subtasks in such a case.
>
> Well, using iftop, I am now at least able to get ~1Gb with 7 scripts going,
> where before it was in the tens of Mb with 1.
>
> Also, physically looking at my ZFS server, the drive lights are now
> blinking faster, about every second, whereas before it was sort of
> seldom, about every 3 seconds or so.
>
> I was thinking to perhaps zip dirs up and then transfer the file over, but
> it would probably take as long to zip/unzip.
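
One variation on the archive idea above is to stream an uncompressed tar
archive straight to the receiver instead of writing a zip file first, so
nothing is staged on disk and no CPU goes into compression.  A minimal
sketch follows; pack_tree() and unpack_tree() are hypothetical helper
names, and the stream would in practice be something like a socket file
or an ssh pipe, which is an assumption on my part.  Whether it beats
rsync on this tree is untested.

```python
import os
import tarfile


def pack_tree(src_dir, out_stream):
    """Write the contents of src_dir to out_stream as an uncompressed tar stream."""
    # Mode "w|" targets a non-seekable stream, so no archive file is staged.
    with tarfile.open(fileobj=out_stream, mode="w|") as archive:
        for name in sorted(os.listdir(src_dir)):
            # add() recurses into subdirectories by default.
            archive.add(os.path.join(src_dir, name), arcname=name)


def unpack_tree(in_stream, dest_dir):
    """Read a tar stream from in_stream and extract it under dest_dir."""
    # Mode "r|" reads sequentially without seeking.
    with tarfile.open(fileobj=in_stream, mode="r|") as archive:
        # Only do this with streams from a trusted sender.
        archive.extractall(dest_dir)
```

The sender would call pack_tree() on its end of the connection and the
receiver unpack_tree() on the other; the per-file metadata reads on the
source side still happen, so this only removes the intermediate archive
step.
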
>
> This bloody project structure we have is nuts.
>
> - aurf