From: Garance A Drosihn
Date: Sun, 20 Nov 2005 21:56:58 -0500
To: Brian Candler, Tim Kientzle
Cc: freebsd-current@freebsd.org
Subject: Re: Order of files with 'cp'
In-Reply-To: <20051120192914.GC19572@uk.tiscali.com>
List-Id: Discussions about the use of FreeBSD-current

At 7:29 PM +0000 11/20/05, Brian Candler wrote:
>On Sat, Nov 19, 2005 at 11:33:54AM -0800, Tim Kientzle wrote:
>> Brian Candler wrote:
> > > I've noticed on FreeBSD-5.4 and -6.0 that the order in which
> > > 'cp' copies multiple files does not match the order they're
> > > given on the command line.
> > ...
> > > I've had a look through the code, and it seems that cp calls
> > > fts_open() with the list of files in argv; fts_open then does
> > > a qsort() on the arguments, using the comparison function
> > > mastercmp() provided by cp:
> >
>> My suggestion: Have 'cp' call fts_open once for each
>> command-line argument, instead of giving fts_open the entire
>> argv list to muck with.
>
>Erm, but that just undoes the reason for calling fts_open with
>mastercmp in the first place, which is to get it to pick files
>before directories (or vice versa, as its behaviour seems to
>be) as an 'optimisation'.

If I understand the situation right, the suggestion would not
completely undo the optimization that 'cp' is trying to do.
Consider the command:

    cp -rp file1 dir1 file2 dir2 destdir

The suggestion would mean the files going into destdir itself
would not be sorted, but (if I understand this thread) files
copied into destdir/dir1 and destdir/dir2 would still be sorted.

Apparently this "sorting optimization" in `cp' goes all the way
back to the original version of `cp' from 1994.  While I expect
we should change it to something better, I don't think we have
any urgent reason to fix it immediately.  Which is to say, let's
figure out what the issues are, and come up with the best fix
instead of the "easiest change" which we can rush to implement.

*Assuming* the comment is correct, and that there *is* some
performance benefit by copying files before directories, then
it still seems to me that sorting all the files is a pretty
clumsy, heavy-handed way to accomplish that.  These days some
people have directories with tens of thousands of entries in
them.  Do we really want the overhead of "sorting" all of those
entries just so files are copied before directories?

I think a better fix might be to add an option to fts_open()
which tells it to "process files before directories" (or
vice versa) in any given directory.  Then `cp' could turn on
that bit, and avoid the fake sort.
It seems to me that if fts_open realizes that is wanted, then it
could implement that behavior in some manner which is faster
than sorting all entries.

-- 
Garance Alistair Drosehn            =   gad@gilead.netel.rpi.edu
Senior Systems Programmer           or  gad@freebsd.org
Rensselaer Polytechnic Institute    or  drosih@rpi.edu
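
To make the "faster than sorting" idea concrete, here is a minimal
sketch of one way it might look: a two-pass partition that puts
non-directories ahead of directories in linear time and keeps the
original order inside each group.  This is only an illustration
under stated assumptions.  The struct entry type and the
files_before_dirs() function are hypothetical names made up for
this example; they are not part of fts(3), and the fts_open()
option discussed above is a proposal in this thread, not an
existing flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a directory entry; NOT the real FTSENT. */
struct entry {
	const char *name;
	int is_dir;		/* nonzero if the entry is a directory */
};

/*
 * Reorder entries[0..n-1] so that every non-directory precedes every
 * directory, keeping the original relative order within each group.
 * Two linear passes plus one copy: O(n) time, O(n) scratch space,
 * and no comparison sort.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated.
 */
static int
files_before_dirs(struct entry *entries, size_t n)
{
	struct entry *tmp;
	size_t i, nfiles, f, d;

	assert(entries != NULL || n == 0);
	if (n < 2)
		return (0);

	tmp = malloc(n * sizeof(*tmp));
	if (tmp == NULL)
		return (-1);

	/* Pass 1: count the files, so we know where directories start. */
	nfiles = 0;
	for (i = 0; i < n; i++)
		if (!entries[i].is_dir)
			nfiles++;

	/* Pass 2: place each entry in its group, in arrival order. */
	f = 0;
	d = nfiles;
	for (i = 0; i < n; i++) {
		if (entries[i].is_dir)
			tmp[d++] = entries[i];
		else
			tmp[f++] = entries[i];
	}

	memcpy(entries, tmp, n * sizeof(*tmp));
	free(tmp);
	return (0);
}
```

Applied to the arguments of the example command (file1, dir1,
file2, dir2), this yields file1, file2, dir1, dir2.  Swapping the
two branches in the second pass would give directories first
instead.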