Date: Sun, 01 Jul 2007 21:32:19 -0700
From: Garrett Cooper <youshi10@u.washington.edu>
To: ports@FreeBSD.org
Subject: +CONTENTS files
Message-ID: <46887FD3.3080307@u.washington.edu>
Pardon me for being naive, but wouldn't it be wiser to aggregate the data in the +CONTENTS file into sections instead of repeating the same information line by line?

Example (net/samba_3.0.25a):

    @comment MD5:9e94560ac5e757d3bc5f922dcf3ab4fb
    man/man1/log2pcap.1.gz
    [~100 lines of repetitive data...]
    @comment MD5:9f5fc8df2a1383a175e165ef2e0b10cc
    man/man8/vfs_notify_fam.8.gz

Could be aggregated into:

    @MD5
    9e94560ac5e757d3bc5f922dcf3ab4fb man/man1/log2pcap.1.gz
    c58f068d603a12d4af867c15cf77e636 man/man1/nmblookup.1.gz
    [etc..]
    @end MD5

or something similar to XML.

This would reduce the file size from n bytes to n - (9 + 4 - 1) * i_entries + 8 bytes. For larger packages this would cut the amount of data to parse considerably. Also, scripting languages like Perl or Python, or a smart parser in C, could make short work of this data and extract just the MD5 entries for comparison.

With a little extra work at package-creation time to group all the sections together, I think the file size could be reduced considerably. Fields similar to @comment MD5 could be reduced too, I believe, though maybe with less benefit, except for the @unexec rmdir lines and the like.

Either that, or the data should be organized into separate files, I think (this increases the number of files, but reduces overall processing time IMO).

Thanks,
-Garrett
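
As a rough illustration of the "extract just the MD5 entries" idea, here is a minimal Python sketch that reads a hypothetical @MD5 ... @end MD5 section into a path-to-checksum mapping. The one-entry-per-line "<md5> <path>" layout and the function name parse_md5_section are assumptions made for this example only; they are not an existing format or API.

```python
def parse_md5_section(lines):
    """Return {path: md5} for entries between '@MD5' and '@end MD5'.

    Assumed (hypothetical) layout: one "<md5> <path>" entry per line.
    """
    checksums = {}
    in_section = False
    for raw in lines:
        line = raw.strip()
        if line == "@MD5":
            in_section = True
        elif line == "@end MD5":
            in_section = False
        elif in_section and line:
            # Split on the first space only, so paths containing spaces survive.
            digest, _, path = line.partition(" ")
            path = path.strip()
            if path:  # skip malformed entries that have no path
                checksums[path] = digest
    return checksums


# Sample input built from the two entries quoted in the message.
sample = """\
@MD5
9e94560ac5e757d3bc5f922dcf3ab4fb man/man1/log2pcap.1.gz
c58f068d603a12d4af867c15cf77e636 man/man1/nmblookup.1.gz
@end MD5
""".splitlines()

checksums = parse_md5_section(sample)
```

Lines outside the section are ignored, so the same loop could run over a whole file that also carries other directives.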