Date: Wed, 16 Sep 1998 10:03:11 +0200 From: Eivind Eklund <eivind@yes.no> To: Terry Lambert <tlambert@primenet.com> Cc: freebsd-hackers@FreeBSD.ORG Subject: Re: Unused functions Message-ID: <19980916100311.62440@follo.net> In-Reply-To: <199809152127.OAA25996@usr09.primenet.com>; from Terry Lambert on Tue, Sep 15, 1998 at 09:27:00PM %2B0000 References: <19980915120620.32623@follo.net> <199809152127.OAA25996@usr09.primenet.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Sep 15, 1998 at 09:27:00PM +0000, Terry Lambert wrote: > > [... on dead code elimination at the final code level ...] > > > > It allows the programmer and the C scoping rules to > > > > work together to determine what should be associated and what need not. > > > > > > Instead of the compiler merely calculating hamiltonian cycles in > > > the dependency graph to do dead code elimination. > > > > I don't get what Hamilton cycles has to do with this. It looks like a > > simple mark-and-sweep GC to me, and I can't see how looking for > > Hamilton cycles are going to find. > > You are looking for code without cycles through known referenced > symbols (if it has a cyle through a known referenced symbol, you > add it to the list of symbols). Since you already have the tree > in memory, with the edges correctly arranged, this is simpler than > trying to run Warshall's across the entrie symbol table from the > point of view of each node (and cheaper by an order of magnitude). Hmmm. I can't see any reason why you would want to do a Floyd-Warshall - that calculate the connectivity of the entire graph, which is really producing a lot of data you don't need. I can see two ways of doing this that seem useful. Both of them require (well, prefer) a bit of used/unused for each symbol (partial pseudo-code follows below). (1) Recursive mark'n'sweep void Symbol_MarkAllChildren(struct Symbol *pSymbol) { int iChild; if (!pSymbol->isUsed) { pSymbol->isUsed = 1; for (iChild = 0; iChild < pSymbol->nChildren; iChild++) Symbol_MarkAllChildren(pSymbol->aChildren[iChild]); } } Clear_isUsed_in_all_symbols(); for (pSymbol over each symbol in the initially used set) { Symbol_MarkAllChildren(pSymbol); } /* Symbols marked with 'isUsed' are needed - the rest can be discarded */ (2) Non-recursive mark'n'sweep void Symbol_MarkChildren(struct Symbol *pSymbol) { int iChild; for (iChild = 0; iChild < pSymbol->nChildren; iChild++) { if (!pSymbol->aChildren[iChild]->isUsed) { pSymbol->aChildren[iChild]->isUsed = 1; Symbol_TransferToTempUsedList(pSymbol->aChildren[iChild]); } } } Clear_isUsed_in_all_symbols(); for (pSymbol over each symbol in the initially used set) { Symbol_MarkChildren(pSymbol); Symbol_TransferToFinalUsedList(pSymbol); } do { for (pSymbol over each symbol in the TempUsedList) { if (pSymbol->isUsed) { Symbol_MarkChildren(pSymbol); Symbol_TransferToFinalUsedList(pSymbol); } } } while(!List_isEmpty(TempUsedList)); /* Symbols marked isUsed are needed - the rest can be discarded. The symbols to be used are in the 'FinalUsedList', while the ones that are to be discarded are left in the inital pool */ Both algorithm 1 and 2 are o(N) and O(N) in the number of edges connecting the set of used symbols. The only case you can avoid examining all of these edges is if all symbols are included in the final used set; then you could do a cutoff. I don't think it would be worthwhile, though. The algorithms are sort-of equal - algorithm 1 just use the stack for storing examination order, while algorithm 2 does the same thing using three lists, giving a slightly different order for the marking (basically a depth-first vs a breadth-first walk). > > Also, I can't think of a single case where I have written code that is > > likely to have even a single Hamilton cycle - I usually don't call > > main() from elsewhere in my program (and I certainly don't call > > _start). If you can involve Hamilton cycles at all here, it sounds > > like it must be on a subgraph. How? > > Consider all top level "known referenced" symbols to be connected > by an edge. Connected to _what_ by an edge? > > For those following: A Hamilton cycle touch every node in the graph > > exactly once, and forms a cycle. > > More importantly, you can tell as DAG from a DCG. Symbols in a DCG, > where "known refrenced" forms an edge to the rest of the graph, means > that you have to reference increasingly fewer nodes as you go on. [...] > You sort these onto either the "used" or the "unknown" list. > > Then you look for a DCG using the "used" as a list, and traverse only > the "unknown" entries. If you find a cycle, you add the "unknown" > symbols in the cycle to the "used" list. Why is this easier/faster than the trivial algorithms I present above? And where does hamilton cycles (as opposed to just cycles in the graph) enter into this? I can sort of get a picture of using cycles to separate used/unused symbols, but I can't see any direct advantages to it, and I can't see Hamiltonian cycles in there at all. What am I missing here? (I'm interested due to sometimes needing this type of algorithms, and if something else is more effective than the techniques outlined above - well, then I want that :-) > Actually, you should talk to SEF about this; he's an old compiler > guru, and he worked on the Microsoft compiler for SCO. SEF, this is your call. Speak up. Eivind. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?19980916100311.62440>