From: Hartmut Brandt
Date: Mon, 28 May 2007 12:51:13 +0200
To: Stephen Montgomery-Smith
Cc: ports@freebsd.org, hackers@freebsd.org
Subject: Re: Looking for speed increases in "make index" and pkg_version for ports
Message-ID: <465AB421.10802@dlr.de>
In-Reply-To: <4659EF80.70100@math.missouri.edu>

Stephen Montgomery-Smith wrote:
> I have been thinking a lot about looking for speed increases for "make
> index" and pkg_version and things like that.  So for example, in
> pkg_version, it calls "make -V PKGNAME" for every installed package.
> Now "make -V PKGNAME" should be a speedy operation, but the make has to
> load in and analyze bsd.port.mk, a quite complicated file with about
> 200,000 characters in it, when all it is needing to do is to figure out
> the value of the variable PKGNAME.
>
> I suggest rewriting "make" so that variables are only evaluated on a
> "need to know" basis.  So, for example, if all we need to know is
> PKGNAME, there is no need to evaluate, for example, _RUN_LIB_DEPENDS,
> unless the writer of that particular port has done something like having
> PORTNAME depend on the value of _RUN_LIB_DEPENDS.  So "make" should
> analyze all the code it is given, and only figure it out if it is needed
> to do so.  This would include, for example, figuring out .for and .if
> directives on a need to know basis as well.
>
> I have only poked around a little inside the source for make, but I have
> a sense that this would be a major undertaking.  I certainly have not
> thought through what it entails in more than a cursory manner.  However
> I am quite excited about the possibility of doing this, albeit I may
> well put off the whole thing for a year or two or even forever depending
> upon other priorities in my life.
>
> However, in the mean time I want to throw this idea out there to get
> some feedback, either of the form of "this won't work," or of the form
> "I will do it," or "I have tried to do this."

Having done a great deal of rewriting of make some two years ago I can
tell you that even a small change to make is a tough job testing-wise:
run all the combinations of !-j and -j on all architectures and run the
change through the port-building cluster. That's a warning to start with.

Second I would start with careful profiling to find out where the
problem actually is. You might be surprised. As an example: several
times the idea came up to use a hash structure instead of linear lists
for make variables.
I got a patch for this and it makes absolutely no difference
performance-wise (well, there was some indication that performance gets
worse, but that was around or below noise level).

By careful I mean finding out what takes the time:

1. make and its sub-makes for
   a) reading the file;
   b) parsing the file (note that .if and .for processing is done while
      parsing);
   c) processing targets.

2. sub-shells executed for running the targets' commands (note that make
   optimizes the sub-shells away when there are no special shell symbols
   in the command line)

3. executed programs (find, sort, ...)

Until you have numbers for this everything is rather moot. It might be a
good idea to put some performance measurement hooks into make to do this.

If anybody wants to work on make, I would rather recommend implementing
%-rules :-)

And if anybody wants to recommend gmake over make(1) - look into the
code, what a mess that is :-/

Regards,
harti
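
P.S. A very coarse first number can be had without touching make at all.
The sketch below is only an illustration, not a measurement hook inside
make: it runs a command a few times and reports average wall-clock time
and the CPU time consumed by the child processes. The names time_command
and measure_port are made up for this example, and the port directory
passed to measure_port is whatever port you want to look at. It cannot
separate reading, parsing, target processing, sub-shells and executed
programs from one another; it only tells you how much time one
"make -V PKGNAME" costs in total.

```python
import os
import subprocess
import sys
import time


def time_command(argv, cwd=None, repeats=3):
    """Run argv `repeats` times; return average wall and child CPU seconds."""
    wall = 0.0
    before = os.times()
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time is of interest here.
        subprocess.run(argv, cwd=cwd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        wall += time.perf_counter() - start
    after = os.times()
    # children_user/children_system accumulate CPU time of waited-for
    # children, which includes sub-makes and programs they started.
    cpu = ((after.children_user - before.children_user)
           + (after.children_system - before.children_system))
    return {"wall": wall / repeats, "cpu": cpu / repeats}


def measure_port(port_dir, repeats=3):
    """Hypothetical helper: time 'make -V PKGNAME' in one port directory."""
    return time_command(["make", "-V", "PKGNAME"], cwd=port_dir,
                        repeats=repeats)
```

Calling measure_port() on a port directory and comparing the "wall" and
"cpu" values gives a first hint: a wall time well above the CPU time
points at waiting (file reads, process start-up) rather than computation.
Anything finer than that needs the hooks mentioned above.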