Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 16 Sep 1998 10:03:11 +0200
From:      Eivind Eklund <eivind@yes.no>
To:        Terry Lambert <tlambert@primenet.com>
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   Re: Unused functions
Message-ID:  <19980916100311.62440@follo.net>
In-Reply-To: <199809152127.OAA25996@usr09.primenet.com>; from Terry Lambert on Tue, Sep 15, 1998 at 09:27:00PM %2B0000
References:  <19980915120620.32623@follo.net> <199809152127.OAA25996@usr09.primenet.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Sep 15, 1998 at 09:27:00PM +0000, Terry Lambert wrote:
> > [... on dead code elimination at the final code level ...]
> > > > It allows the programmer and the C scoping rules to 
> > > > work together to determine what should be associated and what need not.
> > > 
> > > Instead of the compiler merely calculating hamiltonian cycles in
> > > the dependency graph to do dead code elimination.
> > 
> > I don't get what Hamilton cycles has to do with this.  It looks like a
> > simple mark-and-sweep GC to me, and I can't see how looking for
> > Hamilton cycles are going to find.
> 
> You are looking for code without cycles through known referenced
> symbols (if it has a cyle through a known referenced symbol, you
> add it to the list of symbols).  Since you already have the tree
> in memory, with the edges correctly arranged, this is simpler than
> trying to run Warshall's across the entrie symbol table from the
> point of view of each node (and cheaper by an order of magnitude).

Hmmm.  I can't see any reason why you would want to do a
Floyd-Warshall - that calculate the connectivity of the entire graph,
which is really producing a lot of data you don't need.

I can see two ways of doing this that seem useful.  Both of them
require (well, prefer) a bit of used/unused for each symbol (partial
pseudo-code follows below).

(1) Recursive mark'n'sweep
void
Symbol_MarkAllChildren(struct Symbol *pSymbol) {
   int iChild;

   if (!pSymbol->isUsed) {
      pSymbol->isUsed = 1;
      for (iChild = 0; iChild < pSymbol->nChildren; iChild++)
         Symbol_MarkAllChildren(pSymbol->aChildren[iChild]);
   }
}

Clear_isUsed_in_all_symbols();
for (pSymbol over each symbol in the initially used set) {
   Symbol_MarkAllChildren(pSymbol);
}
/* Symbols marked with 'isUsed' are needed - the rest can be discarded */

(2) Non-recursive mark'n'sweep
void
Symbol_MarkChildren(struct Symbol *pSymbol) {
   int iChild;
   for (iChild = 0; iChild < pSymbol->nChildren; iChild++) {
      if (!pSymbol->aChildren[iChild]->isUsed) {
         pSymbol->aChildren[iChild]->isUsed = 1;
         Symbol_TransferToTempUsedList(pSymbol->aChildren[iChild]);
      }
   }
}

Clear_isUsed_in_all_symbols();
for (pSymbol over each symbol in the initially used set) {
   Symbol_MarkChildren(pSymbol);
   Symbol_TransferToFinalUsedList(pSymbol);
}
do {
   for (pSymbol over each symbol in the TempUsedList) {
      if (pSymbol->isUsed) {
	   Symbol_MarkChildren(pSymbol);
           Symbol_TransferToFinalUsedList(pSymbol);
      }
   }
} while(!List_isEmpty(TempUsedList));
/* Symbols marked isUsed are needed - the rest can be discarded.  The
   symbols to be used are in the 'FinalUsedList', while the ones that
   are to be discarded are left in the inital pool */

Both algorithm 1 and 2 are o(N) and O(N) in the number of edges
connecting the set of used symbols.  The only case you can avoid
examining all of these edges is if all symbols are included in the
final used set; then you could do a cutoff.  I don't think it would be
worthwhile, though.

The algorithms are sort-of equal - algorithm 1 just use the stack for
storing examination order, while algorithm 2 does the same thing using
three lists, giving a slightly different order for the marking
(basically a depth-first vs a breadth-first walk).

> > Also, I can't think of a single case where I have written code that is
> > likely to have even a single Hamilton cycle - I usually don't call
> > main() from elsewhere in my program (and I certainly don't call
> > _start).  If you can involve Hamilton cycles at all here, it sounds
> > like it must be on a subgraph.  How?
> 
> Consider all top level "known referenced" symbols to be connected
> by an edge.

Connected to _what_ by an edge?

> > For those following: A Hamilton cycle touch every node in the graph
> > exactly once, and forms a cycle.
> 
> More importantly, you can tell as DAG from a DCG.  Symbols in a DCG,
> where "known refrenced" forms an edge to the rest of the graph, means
> that you have to reference increasingly fewer nodes as you go on.

[...]
> You sort these onto either the "used" or the "unknown" list.
> 
> Then you look for a DCG using the "used" as a list, and traverse only
> the "unknown" entries.  If you find a cycle, you add the "unknown"
> symbols in the cycle to the "used" list.

Why is this easier/faster than the trivial algorithms I present above?
And where does hamilton cycles (as opposed to just cycles in the
graph) enter into this?

I can sort of get a picture of using cycles to separate used/unused
symbols, but I can't see any direct advantages to it, and I can't see
Hamiltonian cycles in there at all.

What am I missing here?  (I'm interested due to sometimes needing this
type of algorithms, and if something else is more effective than the
techniques outlined above - well, then I want that :-)

> Actually, you should talk to SEF about this; he's an old compiler
> guru, and he worked on the Microsoft compiler for SCO.

SEF, this is your call.  Speak up.

Eivind.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?19980916100311.62440>