From: Adam Vande More
Date: Thu, 18 Jan 2018 20:40:53 -0600
Subject: Re: Splitting up sets of files for archiving (e.g. to tape, optical media)
To: "Ronald F. Guilmette"
Cc: FreeBSD Questions
In-Reply-To: <68377.1516327618@segfault.tristatelogic.com>

On Thu, Jan 18, 2018 at 8:06 PM, Ronald F. Guilmette wrote:

> This isn't really FreeBSD specific, but in my experience the folks on
> this list have a lot of knowledge about a lot of nice, useful free software
> tools, so I hope nobody will begrudge me for asking this question here.
>
> I'm looking for a pre-existing software tool, which may or may not already
> exist, and which will do the following job...
>
> Problem statement:
>
> Imagine that you have a big set of files that you would like to archive
> to some sort of archiving media, such as tapes, or optical media, where
> each unit of said archiving media has a capacity considerably less than
> the total aggregate size of all of the files you want to archive.
>
> Imagine further that you would like your set of input files to be spread
> across the units of the output (archive) media such that no single input
> file is ever split across more than one unit of the output media, in order
> to simplify recovery/restore of individual files.
>
> Lastly, assume that it is desired to minimize, as much as reasonably
> possible, the total number of output (archive) media units used to
> archive the entire set of input files. (And to further this goal,
> it is acceptable for files from any single input subdirectory to be
> scattered among the various output media units.)
>
> +_+_+_+_+_+_+_+_+_+_
>
> In my case, I want to archive several hundred gigabytes onto a set of
> blank BD-R disks.
>
> I plan to use ImgBurn to actually write the BD-R disks.
>
> So basically, I just need a tool to analyze the input file set, applying
> some sort of bin packing algorithm, and then spit out a list of which
> specific files should go into each specific archive volume, e.g. #01, #02,
> #03... etc. Each such set of files will then, in turn, be hard-linked
> into a temporary directory, and then, one by one, ImgBurn will be told
> to write each of these temp directories to a single output BD-R disk.
>
> I have written a small software tool to do the above "splitting" job,
> and I am currently improving upon it, but it occurred to me that I
> should at least ask if someone else has perhaps already perfected this
> exact wheel that I am busy re-inventing.
>
>
> Regards,
> rfg
>
>
> P.S. It seems unlikely that I'm the first and only person to have ever
> written a tool to do this specific job, but on the off chance that I am,
> I am more than willing to contribute my little tool to the ever-expanding
> ports tree.

http://www.gnu.org/software/tar/manual/tar.html#Using-Multiple-Tapes

-- 
Adam
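
For reference, the "splitting" job described in the quoted message (assign whole files to volumes, never splitting a file, using as few volumes as reasonably possible, then hard-link each set into its own temporary directory) could be sketched roughly as below. This is only an illustration, not rfg's tool. The quoted text says only "some sort of bin packing algorithm"; first-fit decreasing is an assumption chosen here because it is simple and usually close to optimal. All function names are hypothetical, and the volume capacity is left to the caller, since usable space on a disc depends on the media type and filesystem overhead, which this sketch does not account for.

```python
import os


def scan_tree(root):
    """Return {path: size_in_bytes} for every regular file under root."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are packed.
            if os.path.isfile(full) and not os.path.islink(full):
                sizes[full] = os.path.getsize(full)
    return sizes


def pack_first_fit_decreasing(sizes, capacity):
    """Assign whole files to volumes with first-fit decreasing (assumed heuristic).

    sizes    -- dict mapping path -> size in bytes
    capacity -- usable bytes per volume (caller's choice)
    Returns a list of volumes, each a list of paths.
    """
    volumes = []  # each entry is [free_bytes, [paths]]
    # Largest files first; ties broken by path so the result is deterministic.
    for path, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        if size > capacity:
            raise ValueError(f"{path} ({size} bytes) does not fit on one volume")
        for vol in volumes:
            if size <= vol[0]:  # first volume with enough room wins
                vol[0] -= size
                vol[1].append(path)
                break
        else:
            volumes.append([capacity - size, [path]])  # open a new volume
    return [paths for _free, paths in volumes]


def stage_volumes(volumes, root, staging_dir):
    """Hard-link each volume's files into staging_dir/01, staging_dir/02, ..."""
    for number, paths in enumerate(volumes, start=1):
        vol_dir = os.path.join(staging_dir, f"{number:02d}")
        for src in paths:
            dst = os.path.join(vol_dir, os.path.relpath(src, root))
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            os.link(src, dst)  # requires src and dst on the same filesystem
```

Typical use would be `stage_volumes(pack_first_fit_decreasing(scan_tree(root), capacity), root, staging_dir)`, after which each numbered directory can be handed to the burning program. Hard links only work when the staging directory is on the same filesystem as the input files.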