Date:      Mon, 3 Jan 2005 14:25:40 +0100
From:      Kenneth Vestergaard Schmidt <kvs@binarysolutions.dk>
To:        freebsd-fs@freebsd.org
Subject:   Extending di_nlink and its ilk
Message-ID:  <20050103132540.GB21037@binarysolutions.dk>

Hello.

I've run into a wee problem trying to create a nice backup-machine. We made
it using rsync, hardlinks, and a modified link-by-hash patch for rsync.

link-by-hash creates an md4 checksum of the file's contents. It then stores
the file in /dana/hashes/abcdef/1234567890 and hardlinks it to the correct
place. This way, identical files only get stored once.
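A minimal sketch of the link-by-hash idea described above, not the actual rsync patch. Assumptions: SHA-256 stands in for md4 (md4 is often unavailable in current hashlib builds), the first six hex digits of the digest name the bucket directory, and the function name link_by_hash and the ".lbh-tmp" suffix are hypothetical.

```python
import hashlib
import os


def content_digest(path, chunk_size=1 << 20):
    """Hex digest of a file's contents (SHA-256 here, standing in for md4)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def link_by_hash(path, store_root):
    """Make `path` a hard link to a single stored copy keyed by its contents.

    Assumed layout: <store_root>/<first 6 hex digits>/<remaining digits>.
    Returns the path of the stored copy.
    """
    digest = content_digest(path)
    bucket = os.path.join(store_root, digest[:6])
    stored = os.path.join(bucket, digest[6:])
    os.makedirs(bucket, exist_ok=True)

    if not os.path.exists(stored):
        # First file with these contents: the store entry becomes a second
        # name for the same inode.
        os.link(path, stored)
    elif not os.path.samefile(stored, path):
        # Contents already stored: replace `path` with a link to the stored
        # inode, going through a temporary name so the swap is one rename.
        tmp = path + ".lbh-tmp"
        os.link(stored, tmp)
        os.replace(tmp, path)
    return stored
```

Every file with the same contents adds one link to the stored inode, so the link count of a popular file grows with each backup run.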

At this point, we ran into the problem with di_nlink and related fields
only being 16-bit, since we were creating more than 32765 sub-directories.

I fixed this by only creating 256 directories, each containing a lot of
files. However, we soon ran into yet another problem, that of more than
32767 links to one file - when we link by contents, this limit comes up
real quick.
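The two numbers above follow from a signed 16-bit link count. The sketch below works through that arithmetic and shows one way to ask the running system how many links a file can still take. It assumes a Unix-like system where os.pathconf supports "PC_LINK_MAX"; the helper names are hypothetical.

```python
import os

# Largest value a signed 16-bit link count can hold.
INT16_MAX = 2**15 - 1  # 32767


def max_subdirectories(link_max=INT16_MAX):
    """A directory's link count is 2 ("." and its entry in the parent)
    plus one for the ".." of each sub-directory."""
    return link_max - 2


def links_remaining(path):
    """Links that can still be added to `path`, or None if the
    filesystem reports no fixed limit."""
    link_max = os.pathconf(path, "PC_LINK_MAX")
    if link_max < 0:
        return None
    return link_max - os.stat(path).st_nlink
```

With a 16-bit counter, max_subdirectories() gives 32765, which matches the sub-directory limit mentioned earlier. Once links_remaining() reaches zero, a further os.link() call fails with EMLINK.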

My initial idea was to patch the file-system to use one of the spare
values at the end of various inode-structs to provide a 32-bit or 64-bit
value to the link count. Of course, some backward-compatible scheme must
be employed where the original di_nlink is read first, but I wanted to
hear if this is a totally hare-brained scheme before I start doing it,
or if it would actually be useful to others?
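One way such a backward-compatible scheme could look, modelled in Python rather than kernel C. Everything here is an assumption for illustration: the 6-byte record layout, the use of the saturated value 32767 as the signal to consult the wider field, and the function names. It is not the UFS on-disk format.

```python
import struct

LEGACY_MAX = 2**15 - 1   # largest value the 16-bit di_nlink can hold
EXT_MAX = 2**32 - 1      # largest value of the hypothetical 32-bit spare field

# Hypothetical record: int16 legacy di_nlink followed by a uint32 spare field.
INODE_FMT = "<hI"


def encode_nlink(count):
    """Store a link count so that small counts look exactly like old inodes."""
    if not 0 <= count <= EXT_MAX:
        raise ValueError("link count out of range")
    legacy = min(count, LEGACY_MAX)                    # saturate the old field
    extended = count if count >= LEGACY_MAX else 0     # spare stays zero otherwise
    return struct.pack(INODE_FMT, legacy, extended)


def decode_nlink(raw):
    """Read the legacy field first; consult the spare field only when the
    legacy field is saturated and the spare field has been written."""
    legacy, extended = struct.unpack(INODE_FMT, raw)
    if legacy < LEGACY_MAX or extended == 0:
        return legacy
    return extended
```

In this model, inodes written before the change (spare field zero) decode unchanged. Code that does not know about the spare field would only ever see the saturated value 32767 for heavily linked files, so mixing old and new code on the same filesystem would need care.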

The only other choice I have is a couple of extremely ugly hacks to rsync,
which I'd rather not do.


-- 
Best Regards

Kenneth Vestergaard Schmidt


