Date: Sat, 27 Dec 2008 01:40:13 -0800
From: Gary Kline <kline@thought.org>
To: Giorgos Keramidas <keramida@ceid.upatras.gr>
Cc: FreeBSD Mailing List <freebsd-questions@freebsd.org>
Subject: Re: how can i be certain that a file has copied exactly?
Message-ID: <20081227094012.GA39306@thought.org>
In-Reply-To: <8763l61gbd.fsf@kobe.laptop>
References: <20081227011335.GA29354@thought.org> <87ocyy2you.fsf@kobe.laptop> <20081227015634.GB29639@thought.org> <8763l61gbd.fsf@kobe.laptop>
On Sat, Dec 27, 2008 at 04:51:18AM +0200, Giorgos Keramidas wrote:
> On Fri, 26 Dec 2008 17:56:34 -0800, Gary Kline <kline@thought.org> wrote:
> > On Sat, Dec 27, 2008 at 03:29:05AM +0200, Giorgos Keramidas wrote:
> >> On Fri, 26 Dec 2008 17:13:39 -0800, Gary Kline <kline@thought.org> wrote:
> >> > is there a way i can be sure that my little C program has copied a
> >> > dos/win file named, say, foo.htm\;7 to simply foo.htm?
> >> >
> >> > my program uses fopen/fgets/fputs to copy the markup files. of the
> >> > several i have copied, no problem. unless i hack cmp or diff, i have
> >> > to avoid the shell.
> >> >
> >> > any ideas? in other words, does anybody have a prefab cmp(oldfile,
> >> > newfile) fn?
> >>
> >> You don't need a prefab `cmp' function, because the base system already
> >> includes tools that can help:
> >>
> >>     cmp file1 file2 ; echo $?
> >>     md5 file1 file2
> >>     sha1 file1 file2
> >>     sha256 file1 file2
> >
> > the problem is that there are several thousands of these files with
> > dos names and an embedded '\;'7 in the file names. the shell gets in
> > the way. i have tried
> >
> >     sprintf(cmdbuf, "/usr/bin/cmp %s %s", orig, new);
> >     system(cmdbuf);
> >
> > chokes on the embedded bytes.
> >
> > i'm thinking of using
> >
> >     find . -name "*" -print -exec {} \;
> >
> > and let me program select out the file suffix. i unlink the screwy
> > dos-ish filename. that's why i want to be sure the copied/renamed
> > files are right.
>
> Use quoting (and snprintf() because it supports range-checks for the
> buffer you are passing to it):
>
>     snprintf(cmdbuf, sizeof(cmdbuf), "cmp \"%s\" \"%s\"", orig, new);
>

howdy,

in a word, YES, /usr/bin/cmp saved the day before i unlinked the
oldfile. here is the strangeness. maybe you know, giorgos, or
somebody else on-list.
At first, before i got smart and used your snprintf to simply /bin/cp
and then unlink (or /bin/mv, or simply rename()), i was creating a new
file via fgets/fputs, and everything went fine until i ran out of
buffer space. i increased buf[4096] to buf[65535]. more files were
successfully copied, from dos\;5 to .dos/*.htm, actually. suddenly, cmp
caught a mismatch and the program exited. a careful diff showed the err
at something like line 3751. my copy was missing a byte near the EOF:

    </body></html

minus the closing ">".

so i upped the buffer space to 256000; same thing. is there a lim on
the sizeof arrays, or is it [more likely] sloppy hacking? the size of
the last file that wouldn't copy is 202K.

just wondering. as i said, using snprintf() with quotes works, so i
can do the same with the jpeg and gif files. just cp or mv them to a
cleaner, more rational unix-esque [[ :-) ]] name.

gary

-- 
Gary Kline  kline@thought.org  http://www.thought.org  Public Service Unix
http://jottings.thought.org  http://transfinite.thought.org
The 2.17a release of Jottings: http://jottings.thought.org/index.php
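For the "prefab cmp(oldfile, newfile) fn" asked about earlier in the thread, here is a minimal sketch in C that never touches the shell, so odd bytes in the file names do not matter. The names copy_file() and files_identical() are made up for illustration and are not from any library. It assumes plain regular files and uses fread/fwrite in binary mode with a small fixed buffer; the loop handles files of any size, so the buffer does not have to be as large as the file. It does not try to diagnose the missing-byte problem described above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src to dst byte for byte.
 * Returns 0 on success, -1 on any open/read/write/close error. */
int copy_file(const char *src, const char *dst)
{
    unsigned char buf[4096];
    size_t n;
    int rc = 0;

    FILE *in = fopen(src, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* fread reports how many bytes it got, so NUL bytes and a
     * missing final newline are copied like any other data. */
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            rc = -1;
            break;
        }
    }
    if (ferror(in))
        rc = -1;

    fclose(in);
    /* fclose flushes buffered output; a failure here means the
     * tail of the file may not have reached the disk. */
    if (fclose(out) != 0)
        rc = -1;
    return rc;
}

/* Hypothetical helper: compare two files byte for byte.
 * Returns 1 if identical, 0 if they differ, -1 on error. */
int files_identical(const char *a, const char *b)
{
    unsigned char ba[4096], bb[4096];
    size_t na, nb;
    int result = 1;

    FILE *fa = fopen(a, "rb");
    if (fa == NULL)
        return -1;
    FILE *fb = fopen(b, "rb");
    if (fb == NULL) {
        fclose(fa);
        return -1;
    }

    do {
        na = fread(ba, 1, sizeof(ba), fa);
        nb = fread(bb, 1, sizeof(bb), fb);
        /* Different chunk lengths mean different file lengths. */
        if (na != nb || memcmp(ba, bb, na) != 0) {
            result = 0;
            break;
        }
    } while (na > 0);

    if (ferror(fa) || ferror(fb))
        result = -1;

    fclose(fa);
    fclose(fb);
    return result;
}
```

One way to use it: call copy_file(orig, new), and unlink(orig) only if files_identical(orig, new) returns 1. If the files stay on the same filesystem, rename(orig, new) as mentioned above avoids the copy altogether.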