From: Kenneth Vestergaard Schmidt
To: freebsd-fs@freebsd.org
Date: Mon, 3 Jan 2005 14:25:40 +0100
Subject: Extending di_nlink and its ilk
Message-ID: <20050103132540.GB21037@binarysolutions.dk>

Hello.

I've run into a wee problem trying to create a nice backup machine. We built it with rsync, hard links, and a modified version of the link-by-hash patch for rsync.

link-by-hash computes an MD4 checksum of each file's contents. It then stores the file in /dana/hashes/abcdef/1234567890 and hard-links it to the correct place. This way, identical files only get stored once.

At this point, we ran into the problem of di_nlink and related fields being only 16 bits wide, since we were creating more than 32765 sub-directories in a single directory (each sub-directory counts as a link to its parent). I worked around this by creating only 256 directories, each containing a large number of files.
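For anyone unfamiliar with the approach, the storage scheme can be sketched roughly as follows. This is a toy Python model, not the actual rsync patch: the function name `store_by_hash`, its parameters, and the choice of the first two hex digits as the bucket name are all assumptions made for illustration. SHA-256 stands in for MD4 here, because MD4 may not be available in a given Python `hashlib` build.

```python
import hashlib
import os
import shutil


def store_by_hash(src, dest, hash_root, algo="sha256"):
    """Store src once under hash_root, then hard-link it to dest.

    Hypothetical helper: the real link-by-hash patch uses MD4 and its
    own directory layout.
    """
    h = hashlib.new(algo)
    with open(src, "rb") as f:
        # Hash in 1 MiB chunks so large files are not read into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    digest = h.hexdigest()

    # The first two hex digits select one of at most 256 bucket
    # directories, which keeps the sub-directory count far below the
    # 16-bit link limit on the parent directory.
    bucket = os.path.join(hash_root, digest[:2])
    os.makedirs(bucket, exist_ok=True)
    stored = os.path.join(bucket, digest[2:])

    # Only the first file with this content is actually copied.
    if not os.path.exists(stored):
        shutil.copy2(src, stored)

    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    # Every further copy is just another hard link to the stored file.
    # This call is where EMLINK shows up once the link count is exhausted.
    os.link(stored, dest)
    return stored
```

Every call with identical content adds one more link to the same stored file, so the link count on that inode grows with the number of duplicates across all backups.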
However, we soon ran into yet another problem: more than 32767 links to one file. When we link by contents, this limit comes up very quickly.

My initial idea was to patch the file system to use one of the spare fields at the end of the various inode structs to hold a 32-bit or 64-bit link count. Of course, some backward-compatible scheme must be employed where the original di_nlink is read first. Before I start on it, I wanted to hear whether this is a totally hare-brained scheme, or whether it would actually be useful to others.

The only other choice I have is a couple of extremely ugly hacks to rsync, which I'd rather not do.

-- 
Best Regards
Kenneth Vestergaard Schmidt
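
P.S. To make the backward-compatible idea a bit more concrete, here is a rough sketch of one way the read/write logic could work. It is a toy model in Python, not kernel code. The two-field record layout, the names `pack_nlink` and `unpack_nlink`, and the rule of saturating di_nlink at 32767 when the count overflows are all assumptions for illustration; the real inode layout and spare fields would have to be checked in the UFS headers.

```python
import struct

LINK_MAX_16 = 32767   # largest value a signed 16-bit di_nlink can hold
INODE_FMT = "<hI"     # toy record: int16 di_nlink, uint32 spare field (hypothetical)


def pack_nlink(count):
    """Encode a link count into the toy record."""
    if count < LINK_MAX_16:
        # Small counts live in di_nlink alone; the spare field stays zero,
        # so an unmodified reader sees exactly the old on-disk format.
        return struct.pack(INODE_FMT, count, 0)
    # Large counts saturate di_nlink and put the real value in the spare field.
    return struct.pack(INODE_FMT, LINK_MAX_16, count)


def unpack_nlink(record):
    """Decode a link count, reading the original di_nlink first."""
    di_nlink, spare = struct.unpack(INODE_FMT, record)
    if di_nlink < LINK_MAX_16 or spare == 0:
        # Either a small count, or an inode written by old code that
        # never touched the spare field.
        return di_nlink
    return spare
```

With this rule an old inode holding exactly 32767 links and a zeroed spare field still reads back as 32767, while a count such as 100000 survives a round trip through `pack_nlink` and `unpack_nlink`. An unpatched kernel reading a saturated inode would see 32767, which is wrong but at least non-zero.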