Date: Tue, 23 Oct 2007 03:37:11 -0700
From: Tim Kientzle <kientzle@freebsd.org>
To: josh.carroll@gmail.com
Cc: Bruce Cran <bruce@cran.org.uk>, current@freebsd.org
Subject: Re: bsdtar can't handle files >8GB
Message-ID: <471DCED7.2020500@freebsd.org>
In-Reply-To: <8cb6106e0710222017p133ddccyc973c6ebcd23e270@mail.gmail.com>
References: <471CF3F3.6070803@cran.org.uk> <8cb6106e0710222017p133ddccyc973c6ebcd23e270@mail.gmail.com>
This is a multi-part message in MIME format.
--------------040305050300030903010507
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Josh Carroll wrote:
>>tar: Unrecognized archive format: Inappropriate file type or format
>>tar: Error exit delayed from previous errors.
>
> Confirmed in RELENG_7 as well. Interestingly enough, if the file
> inside the tarball is nothing but zeros (dd if=/dev/zero ...), I don't
> get this error. However, it doesn't work either. The resulting file
> is 0 bytes, rather than 10 GB of \0.

Try the attached patch, which I think fixes this problem.
I need to do some more testing before I commit it, but
your feedback will certainly help.

The failure here is that libarchive was erroneously
interpreting the large file as having a zero-byte body,
then generating the error above when it tried to read
the next header.

This bug crept in when I was working on read support
for GNU tar's new --pax --sparse format. I need
to test that in the case where a sparse entry has more
than 8G of non-hole data.

Tim Kientzle

--------------040305050300030903010507
Content-Type: text/x-patch;
 name="archive_tar_largefile.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="archive_tar_largefile.patch"

Index: archive_read_support_format_tar.c
===================================================================
--- archive_read_support_format_tar.c	(revision 510)
+++ archive_read_support_format_tar.c	(working copy)
@@ -164,6 +164,7 @@
 	struct sparse_block	*sparse_last;
 	int64_t			 sparse_offset;
 	int64_t			 sparse_numbytes;
+	int64_t			 sparse_realsize;
 	int			 sparse_gnu_major;
 	int			 sparse_gnu_minor;
 	char			 sparse_gnu_pending;
@@ -440,6 +441,7 @@
 		free(sp);
 	}
 	tar->sparse_last = NULL;
+	tar->sparse_realsize = -1; /* Mark this as "unset" */
 
 	r = tar_read_header(a, tar, entry);
 
@@ -1388,9 +1390,10 @@
 		}
 		if (wcscmp(key, L"GNU.sparse.name") == 0)
 			archive_entry_copy_pathname_w(entry, value);
-		if (wcscmp(key, L"GNU.sparse.realsize") == 0)
-			archive_entry_set_size(entry,
-			    tar_atol10(value, wcslen(value)));
+		if (wcscmp(key, L"GNU.sparse.realsize") == 0) {
+			tar->sparse_realsize = tar_atol10(value, wcslen(value));
+			archive_entry_set_size(entry, tar->sparse_realsize);
+		}
 		break;
 	case 'L':
 		/* Our extensions */
@@ -1471,11 +1474,22 @@
 		/* POSIX has reserved 'security.*' */
 		/* Someday: if (wcscmp(key, L"security.acl")==0) { ... } */
 		if (wcscmp(key, L"size")==0) {
-			tar->entry_bytes_remaining = tar_atol10(value, wcslen(value));
-			archive_entry_set_size(entry, tar->entry_bytes_remaining);
+			/* "size" is the size of the data in the entry. */
+			tar->entry_bytes_remaining
+			    = tar_atol10(value, wcslen(value));
+			/*
+			 * But, "size" is not necessarily the size of
+			 * the file on disk; if this is a sparse file,
+			 * the disk size may have already been set from
+			 * GNU.sparse.realsize.
+			 */
+			if (tar->sparse_realsize < 0) {
+				archive_entry_set_size(entry,
+				    tar->entry_bytes_remaining);
+				tar->sparse_realsize
+				    = tar->entry_bytes_remaining;
+			}
 		}
-		tar->entry_bytes_remaining = 0;
-		break;
 	case 'u':
 		if (wcscmp(key, L"uid")==0)

--------------040305050300030903010507--
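
For readers who want to see the intent of the patch in isolation, here is a minimal sketch of the size bookkeeping it introduces. It is not libarchive code: struct reader_state, reader_reset() and reader_apply_attr() are hypothetical names made up for illustration, values are passed as already-parsed integers rather than wide strings, and only the two pax keys touched by the patch are modeled. The point it tries to show is that "size" always sets the number of data bytes to read from the archive, while the size reported for the entry is taken from "size" only when GNU.sparse.realsize has not already supplied it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the reader state; NOT libarchive's struct tar. */
struct reader_state {
	int64_t entry_bytes_remaining;	/* data bytes stored in the archive */
	int64_t sparse_realsize;	/* size on disk, or -1 while unset */
	int64_t entry_size;		/* size reported for the entry */
};

/* Called before each header is read; marks the real size as "unset". */
static void
reader_reset(struct reader_state *st)
{
	st->entry_bytes_remaining = 0;
	st->sparse_realsize = -1;
	st->entry_size = 0;
}

/* Apply one pax attribute. Only the two keys from the patch are modeled. */
static void
reader_apply_attr(struct reader_state *st, const char *key, int64_t value)
{
	if (strcmp(key, "GNU.sparse.realsize") == 0) {
		/* The size on disk of a sparse file. */
		st->sparse_realsize = value;
		st->entry_size = value;
	} else if (strcmp(key, "size") == 0) {
		/* "size" is the size of the data in the entry. */
		st->entry_bytes_remaining = value;
		/* Use it as the file size only if no real size was seen. */
		if (st->sparse_realsize < 0) {
			st->entry_size = value;
			st->sparse_realsize = value;
		}
	}
}
```

Under these assumptions, a plain 10 GB file carrying only a "size" attribute ends up with both the entry size and the bytes remaining set to 10737418240, whereas a sparse entry that supplies GNU.sparse.realsize first keeps that value as its entry size and uses "size" only for the amount of data to read.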