Date: Mon, 24 Sep 2001 21:49:34 +1000
From: Mark Hannon <markhannon@optushome.com.au>
To: freebsd-hackers@freebsd.org
Subject: dump files too large, nodump related??
Message-ID: <3BAF1DCE.4BA21B8D@optushome.com.au>
Hi,

I have noticed some strange behaviour with 4.3-RELEASE and dump. I have been
dumping my filesystems through gzip into a compressed dumpfile. Some of the
resulting dumps have been MUCH larger than I would expect.

As an example, I have just dumped my /home partition. Note that lots of
directories on this partition are marked nodump, e.g. /home/ftp, which is one
of the biggest users of disk space.

  Building 8 level dump of /home and writing it to /var/dumps//home8.gz (gzipped)
  DUMP: Date of this level 8 dump: Mon Sep 24 21:13:55 2001
  DUMP: Date of last level 1 dump: Tue Sep 18 20:15:43 2001
  DUMP: Dumping /dev/ad0s1h (/home) to standard output
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 360780 tape blocks.
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: 30.76% done, finished in 0:11
  DUMP: 60.89% done, finished in 0:06
  DUMP: DUMP: 360664 tape blocks
  DUMP: finished in 849 seconds, throughput 424 KBytes/sec
  DUMP: level 8 dump on Mon Sep 24 21:13:55 2001
  DUMP: DUMP IS DONE

The GZIPPED dumpfile is 289 MB!!! I wrote a little perl script to check the
table of contents and estimate how big the dump should be (see attached), and
this gives an interesting result.
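For reference, the backup pipeline described above presumably looks something like this (a sketch only; the exact flags and paths are assumptions reconstructed from the log above):

```shell
# Hypothetical reconstruction of the dump-through-gzip pipeline.
# -8    : take a level 8 (incremental) dump
# -u    : record this dump in /etc/dumpdates
# -f -  : write the dump to stdout so it can be piped through gzip
dump -8 -u -f - /home | gzip > /var/dumps/home8.gz
```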
  doorway:~> proj/dumpsize/dumpsize.pl /home /var/dumps/home8.gz
  Level 8 dump of /home on doorway.home.lan:/dev/ad0s1h
  Label: none
  The level 0 dump of /home partition written to /var/dumps/home8.gz
  contains 689 files totalling 146450 KB, cf size of dumpfile = 282063 ( 360660 ) KB

  The following files are larger than 1024 KB in size:
  163264	./mark/.netscape/xover-cache/host-news/athome.aus.service.snm
  1343488	./mark/.netscape/xover-cache/host-news/athome.aus.support.snm
  2097152	./mark/.netscape/xover-cache/host-news/athome.aus.users.linux.snm
  1754819	./mark/.netscape/xover-cache/host-news/hostinfo.dat
  1122336	./samba/profile.9x/mark/USER.DAT
  1441792	./samba/profile.9x/tuija/History/History.IE5/index.dat
  92440996	./tuija/Mail/Archive/Sent Items 2001
  2985510	./tuija/My Documents/gas1.JPG
  2528914	./tuija/My Documents/gas2.JPG

The interesting thing here is that the sum of all the file sizes in the dump
is only 147 MB, cf. the 361 MB uncompressed dump size! This is a discrepancy
of 210 MB. (That would line up with the 180 MB ISO image, plus other dribs and
drabs, that I have stored in a nodump-flagged directory since my last dump.)
Any ideas of what is wrong?
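The arithmetic behind that discrepancy can be checked directly from the figures the script reports:

```shell
# Figures reported by dumpsize.pl above, in KB.
toc_total_kb=146450     # sum of file sizes listed in the restore TOC
dump_size_kb=360660     # uncompressed dump size according to gzip -l
echo "discrepancy: $(( dump_size_kb - toc_total_kb )) KB"
# prints: discrepancy: 214210 KB (roughly the 210 MB quoted above)
```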
Are the nodump-flagged files stored in the dump for some reason (even though
they don't appear in the restore table of contents)?

Regards/Mark

[Attachment: dumpsize.pl]

#!/usr/bin/perl
#
# $Id: dumpsize.pl,v 1.4 2001/09/23 05:33:22 mark Exp mark $
#
# Usage: dumpsize.pl [-list] partition gzipped_dumpfilename
#
use strict;

my ($progname) = "dumpsize.pl";         # Name of this program for errors
my ($progusage) = "[-list] partition gzipped_dumpfilename";
my ($list_flag) = 0;                    # 1 if -list specified on command line
my ($threshold) = 1024 * 1024;          # Threshold to print out files
my ($tmp_dump_gzip) = "/tmp/dump_gzip";
my ($tmp_dump_toc) = "/tmp/dump_toc";
my ($partition);                        # Name of partition in dumpfile
my ($dumpfile);                         # Name of dumpfile
my ($dumplevel) = 0;                    # Dump level - not implemented yet!
my ($dump_size);                        # Size, in bytes, of unzipped dumpfile
my ($dump_size_gzip);                   # Size, in bytes, of gzipped dumpfile
my ($dump_is_gzipped) = 1;              # 1 if dumpfile has been gzipped
my ($i);                                # Loop counter
my ($total_size) = 0;                   # Sum of file sizes included in dump
my (@line);                             # All lines for TOC stored in this array
my (@token);                            # Temporary variable used to split lines
my (@leaf_file);                        # Unsorted array of leafnodes found in TOC
my (@file_list);                        # Sorted array of leafnode filenames
my (@file_size);                        # Parallel array of file sizes ...
my ($no_files) = 0;                     # Number of files on tape

# -----------------------------------------------------------------------------
# Parse command line, open dump file and table of contents

unless( $ARGV[0] ne "" ){
    die( "Usage: $progname $progusage\n" );  # die() interpolates; it does not
                                             # printf-format its arguments
}
if ( $ARGV[0] eq "-list" ){
    $list_flag = 1;
    $partition = $ARGV[1];
    $dumpfile = $ARGV[2];
} else {
    $list_flag = 0;
    $partition = $ARGV[0];
    $dumpfile = $ARGV[1];
}
unless( chdir( $partition ) ){
    die( "$progname : Unable to chdir to partition $partition\n" );
}
unless( -e $dumpfile ){
    die( "$progname : Unable to open gzipped dumpfile $dumpfile\n" );
}

# -----------------------------------------------------------------------------
# Calculate uncompressed size of dumpfile

$dump_size_gzip = -s $dumpfile;
system( "/usr/bin/gzip -l $dumpfile > $tmp_dump_gzip.$$" );
unless( open( GZIP_DETAILS, "$tmp_dump_gzip.$$" ) ){
    die( "$progname : Unable to open TOC file $tmp_dump_gzip.$$\n" );
}
@line = <GZIP_DETAILS>;
$line[1] =~ s/^ +//;                    # Strip leading spaces from the data line
@token = split( / +/, $line[1] );
$dump_size = $token[1];                 # Second column is the uncompressed size

# -----------------------------------------------------------------------------
# Parse restore TOC and look for all leaf entries, store contents into @file_list

system( "/usr/bin/zcat $dumpfile | /sbin/restore tvf - > $tmp_dump_toc.$$" );
unless( open( RESTORE_TOC, "$tmp_dump_toc.$$" ) ){
    die( "$progname : Unable to open TOC file $tmp_dump_toc.$$\n" );
}
@line = <RESTORE_TOC>;
for ( $i = 0 ; $i < @line ; $i++ ){
    if ( $line[$i] =~ /^leaf/ ){
        @token = split( /\t/, $line[$i] );
        chomp( $token[1] );             # chomp, not chop: only remove a newline
        push( @leaf_file, $token[1] );  # push, rather than indexing by $i, so
                                        # non-leaf lines leave no undef holes
    }
}
@file_list = sort( @leaf_file );
for ( $i = 0 ; $i < @file_list ; $i++ ){
    $file_size[$i] = -s $file_list[$i];
    if ( $file_size[$i] > 0 ){
        $no_files++;
    }
    $total_size += $file_size[$i];
}

# -----------------------------------------------------------------------------
# Print detailed list of dumpfiles

if ( $list_flag == 1 ){
    for ( $i = 0 ; $i < @file_list ; $i++ ){
        printf( "%d\t%s\n", $file_size[$i], $file_list[$i] );
    }
}

# -----------------------------------------------------------------------------
# Print summary of results

printf( "The level %d dump of %s partition written to %s\n",
        $dumplevel, $partition, $dumpfile );
printf( "contains %d files totalling %d KB, cf size of dumpfile = %d ( %d ) KB\n",
        $no_files, $total_size/1024, $dump_size_gzip/1024, $dump_size/1024 );
printf( "\nThe following files are larger than %d KB in size:\n", $threshold/1024 );
for ( $i = 0 ; $i < @file_list ; $i++ ){
    if ( $file_size[$i] > $threshold ){
        printf( "%d\t%s\n", $file_size[$i], $file_list[$i] );
    }
}

# -----------------------------------------------------------------------------
# Cleanup temporary files etc.

unlink( "$tmp_dump_gzip.$$" );
unlink( "$tmp_dump_toc.$$" );
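The `gzip -l` step the script relies on can be demonstrated in isolation (a sketch with a throwaway file; note that gzip stores the uncompressed size modulo 2^32, so the figure wraps for data larger than 4 GB):

```shell
# Create a 10000-byte file, compress it, and read back the
# uncompressed size from the second column of `gzip -l` output.
printf 'x%.0s' $(seq 1 10000) > /tmp/gzdemo.$$
gzip -f /tmp/gzdemo.$$
gzip -l /tmp/gzdemo.$$.gz | awk 'NR==2 {print $2}'   # prints 10000
rm -f /tmp/gzdemo.$$.gz
```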