From owner-freebsd-bugs@FreeBSD.ORG Tue Apr 27 19:10:23 2004 Return-Path: Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 99C2216A4CF for ; Tue, 27 Apr 2004 19:10:23 -0700 (PDT) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 82E1943D48 for ; Tue, 27 Apr 2004 19:10:23 -0700 (PDT) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) i3S2ANmE083320 for ; Tue, 27 Apr 2004 19:10:23 -0700 (PDT) (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.12.11/8.12.11/Submit) id i3S2ANNO083319; Tue, 27 Apr 2004 19:10:23 -0700 (PDT) (envelope-from gnats) Date: Tue, 27 Apr 2004 19:10:23 -0700 (PDT) Message-Id: <200404280210.i3S2ANNO083319@freefall.freebsd.org> To: freebsd-bugs@FreeBSD.org From: "Dorr H. Clark" Subject: Re: bin/53475: cp(1) copies files in reverse order to destination X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: "Dorr H. Clark" List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Apr 2004 02:10:23 -0000 The following reply was made to PR bin/53475; it has been noted by GNATS. From: "Dorr H. Clark" To: freebsd-gnats-submit@FreeBSD.org, louie@TransSys.COM Cc: Subject: Re: bin/53475: cp(1) copies files in reverse order to destination Date: Tue, 27 Apr 2004 19:03:14 -0700 --- cp.c_orig Sun Apr 25 06:32:27 2004 +++ cp.c Sun Apr 25 06:33:50 2004 @@ -94,7 +94,6 @@ enum op { FILE_TO_FILE, FILE_TO_DIR, DIR_TO_DNE }; static int copy(char *[], enum op, int); -static int mastercmp(const FTSENT * const *, const FTSENT * const *); static void siginfo(int __unused); int @@ -274,7 +273,7 @@ mask = ~umask(0777); umask(~mask); - if ((ftsp = fts_open(argv, fts_options, mastercmp)) == NULL) + if ((ftsp = fts_open(argv, fts_options, NULL)) == NULL) err(1, "fts_open"); for (badcp = rval = 0; (curr = fts_read(ftsp)) != NULL; badcp = 0) { switch (curr->fts_info) { @@ -478,32 +477,6 @@ if (errno) err(1, "fts_read"); return (rval); -} - -/* - * mastercmp -- - * The comparison function for the copy order. The order is to copy - * non-directory files before directory files. The reason for this - * is because files tend to be in the same cylinder group as their - * parent directory, whereas directories tend not to be. Copying the - * files first reduces seeking. - */ -static int -mastercmp(const FTSENT * const *a, const FTSENT * const *b) -{ - int a_info, b_info; - - a_info = (*a)->fts_info; - if (a_info == FTS_ERR || a_info == FTS_NS || a_info == FTS_DNR) - return (0); - b_info = (*b)->fts_info; - if (b_info == FTS_ERR || b_info == FTS_NS || b_info == FTS_DNR) - return (0); - if (a_info == FTS_D) - return (-1); - if (b_info == FTS_D) - return (1); - return (0); } static void As quoted above, the comments in cp.c tell us the function mastercmp() is an attempt to improve performance based on knowing something about physical disks. This is an old optimization strategy (it's in the original version of cp.c). AFAIK, in the updated BSD filesystem, when we copy a file, we don't actually move the physical data block of the file but change the information in its inode such as the address of its data block and owner. Deleting mastercmp() and setting the comparison paramter to NULL for the function fts_open() suppresses the behavior in the bug. The next question is whether deleting mastercmp eliminates an optimization. Our testing shows the exact opposite, mastercmp is degrading performance. We did several experiments with cp -R to measure elapsed time on transfers between devices of differing file system types (to avoid UFS2 optimizations). Our results show removing mastercmp yields a small performance gain (note: we had no SCSI devices available, and second note: variability in file system performance seems dominated by other factors). M. K. McKusick has indicated in seminars that modern disk drives lie to the driver about their physical layouts. The use of mastercmp in cp.c is a legacy optimization from a different era of disk technology. We recommend removing this call from cp.c to address 53475. Ting Hui, engineer Dorr H. Clark, advisor Graduate School of Engineering Santa Clara University