Date:      Tue, 27 Sep 2005 22:43:10 -0700
From:      Tim Kientzle <kientzle@freebsd.org>
To:        Tim Kientzle <kientzle@freebsd.org>
Cc:        Garrett Wollman <wollman@csail.mit.edu>, freebsd-current@freebsd.org, Ed Maste <emaste@phaedrus.sandvine.ca>
Subject:   Re: Bsdtar and archive torture tests
Message-ID:  <433A2D6E.7020205@freebsd.org>
In-Reply-To: <433A2882.4030003@freebsd.org>
References:  <20050926195807.GD95971@sandvine.com>	<17208.30606.117170.36398@khavrinen.csail.mit.edu>	<20050927001650.GA9994@sandvine.com>	<20050927180021.GB9994@sandvine.com> <433A2882.4030003@freebsd.org>

This is a multi-part message in MIME format.
--------------020704000509080109060909
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Ed,

Try the attached patch (for /usr/src/lib/libarchive) and
let me know if that fixes it for you.

libarchive was actually skipping the UTF-8 conversion when
storing the long linkname but then (correctly) converting
from UTF-8 on extraction.  The patch fixes the pax archive
writer so it does correctly convert to UTF-8.
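
For anyone following along, the 50-versus-51 boundary in Ed's script
(quoted below) comes from the 100-byte link name field of the ustar
header: each \303\240 pair is two bytes, so 50 copies fit exactly and
51 copies (102 bytes) spill into the pax 'linkpath' extended attribute,
which is the code path that was missing the UTF-8 conversion.  A rough
sketch in C, where needs_pax_linkpath() and repeat_a_grave() are
made-up helper names for illustration and not libarchive functions:

```c
#include <stddef.h>
#include <string.h>

/* Size of the link name field in a ustar header, in bytes. */
#define USTAR_LINKNAME_MAX 100

/*
 * Hypothetical helper (not part of libarchive): returns 1 if the link
 * target is too long for the ustar header and would need a pax
 * "linkpath" extended attribute, 0 otherwise.  This mirrors the
 * strlen(linkname) > 100 test in the patch.
 */
static int
needs_pax_linkpath(const char *linkname)
{
	return (linkname != NULL && strlen(linkname) > USTAR_LINKNAME_MAX);
}

/*
 * Hypothetical helper: fill buf with `count` copies of the two-byte
 * sequence \303\240, as the jot/printf pipeline in the script does.
 * buf must hold at least 2 * count + 1 bytes.
 */
static void
repeat_a_grave(char *buf, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		buf[2 * i] = '\303';
		buf[2 * i + 1] = '\240';
	}
	buf[2 * count] = '\0';
}
```

With count = 50 the string is 100 bytes and needs_pax_linkpath()
returns 0; with count = 51 it is 102 bytes and the function returns 1.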

Tim

Tim Kientzle wrote:
> Hmmm.... Looking at the internals of the generated archive
> shows that the extended attribute is definitely getting
> stored incorrectly.  I'll look into this.
> 
> If you see any other problems, please let me know!
> 
> Tim
> 
> 
> Ed Maste wrote:
> 
>> On Mon, Sep 26, 2005 at 08:16:50PM -0400, Ed Maste wrote:
>>
>>
>>> Hmm, good point.  I haven't set it to anything; locale(1) shows
>>> that the LC_ variables are set to "C".  So then I can see how this
>>> happens, but it's still surprising (to me) behaviour.
>>
>>
>>
>> Ok, now I've definitely encountered some non-obvious behaviour.
>> A symlink target of 100 bytes or less keeps the same name, while
>> a target of more than 100 bytes gets munged from the conversion
>> to UTF-8 and back.
>>
>> For example, the symlink created by the following script doesn't
>> change the link target:
>>
>> #!/bin/sh
>> fname=$(printf $(jot -b \\303\\240 -s '' 50))
>> ln -fs $fname test
>> tar -cf - test | tar -tvf -
>>
>> but if the 50 in the jot command is changed to 51, the target
>> changes.  So I guess that the link target doesn't fit in the
>> standard header anymore, and needs an extended tag.  Having
>> different behaviour for the two cases does seem odd.
>>
>> -- 
>> Ed Maste, Sandvine Incorporated
>> _______________________________________________
>> freebsd-current@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>> To unsubscribe, send any mail to 
>> "freebsd-current-unsubscribe@freebsd.org"
>>
>>


--------------020704000509080109060909
Content-Type: text/plain;
 name="longsymlinkwcs.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="longsymlinkwcs.patch"

Index: archive_entry.c
===================================================================
RCS file: /home/ncvs/src/lib/libarchive/archive_entry.c,v
retrieving revision 1.31
diff -u -r1.31 archive_entry.c
--- archive_entry.c	21 Sep 2005 04:25:05 -0000	1.31
+++ archive_entry.c	28 Sep 2005 05:36:04 -0000
@@ -203,6 +203,8 @@
 static const char *
 aes_get_mbs(struct aes *aes)
 {
+	if (aes->aes_mbs == NULL && aes->aes_wcs == NULL)
+		return NULL;
 	if (aes->aes_mbs == NULL && aes->aes_wcs != NULL) {
 		/*
 		 * XXX Need to estimate the number of byte in the
@@ -224,6 +226,8 @@
 static const wchar_t *
 aes_get_wcs(struct aes *aes)
 {
+	if (aes->aes_wcs == NULL && aes->aes_mbs == NULL)
+		return NULL;
 	if (aes->aes_wcs == NULL && aes->aes_mbs != NULL) {
 		/*
 		 * No single byte will be more than one wide character,
@@ -463,6 +467,12 @@
 	return (aes_get_mbs(&entry->ae_hardlink));
 }
 
+const wchar_t *
+archive_entry_hardlink_w(struct archive_entry *entry)
+{
+	return (aes_get_wcs(&entry->ae_hardlink));
+}
+
 ino_t
 archive_entry_ino(struct archive_entry *entry)
 {
@@ -536,6 +546,12 @@
 	return (aes_get_mbs(&entry->ae_symlink));
 }
 
+const wchar_t *
+archive_entry_symlink_w(struct archive_entry *entry)
+{
+	return (aes_get_wcs(&entry->ae_symlink));
+}
+
 uid_t
 archive_entry_uid(struct archive_entry *entry)
 {
Index: archive_entry.h
===================================================================
RCS file: /home/ncvs/src/lib/libarchive/archive_entry.h,v
retrieving revision 1.17
diff -u -r1.17 archive_entry.h
--- archive_entry.h	10 Sep 2005 22:58:06 -0000	1.17
+++ archive_entry.h	28 Sep 2005 05:36:05 -0000
@@ -80,6 +80,7 @@
 gid_t			 archive_entry_gid(struct archive_entry *);
 const char		*archive_entry_gname(struct archive_entry *);
 const char		*archive_entry_hardlink(struct archive_entry *);
+const wchar_t		*archive_entry_hardlink_w(struct archive_entry *);
 ino_t			 archive_entry_ino(struct archive_entry *);
 mode_t			 archive_entry_mode(struct archive_entry *);
 time_t			 archive_entry_mtime(struct archive_entry *);
@@ -92,6 +93,7 @@
 int64_t			 archive_entry_size(struct archive_entry *);
 const struct stat	*archive_entry_stat(struct archive_entry *);
 const char		*archive_entry_symlink(struct archive_entry *);
+const wchar_t		*archive_entry_symlink_w(struct archive_entry *);
 uid_t			 archive_entry_uid(struct archive_entry *);
 const char		*archive_entry_uname(struct archive_entry *);
 
Index: archive_write_set_format_pax.c
===================================================================
RCS file: /home/ncvs/src/lib/libarchive/archive_write_set_format_pax.c,v
retrieving revision 1.30
diff -u -r1.30 archive_write_set_format_pax.c
--- archive_write_set_format_pax.c	21 Sep 2005 04:25:05 -0000	1.30
+++ archive_write_set_format_pax.c	28 Sep 2005 05:36:06 -0000
@@ -393,11 +393,14 @@
 
 	/* If link name is too long, add 'linkpath' to pax extended attrs. */
 	linkname = hardlink;
-	if (linkname == NULL)
+	if (linkname == NULL) {
 		linkname = archive_entry_symlink(entry_main);
+		wp = archive_entry_symlink_w(entry_main);
+	} else
+		wp = archive_entry_hardlink_w(entry_main);
 
 	if (linkname != NULL && strlen(linkname) > 100) {
-		add_pax_attr(&(pax->pax_header), "linkpath", linkname);
+		add_pax_attr_w(&(pax->pax_header), "linkpath", wp);
 		if (hardlink != NULL)
 			archive_entry_set_hardlink(entry_main,
 			    "././@LongHardLink");

--------------020704000509080109060909--
