Date: Thu, 13 Feb 1997 16:05:31 -0800 (PST)
From: Jake Hamby <hamby@aris.jpl.nasa.gov>
To: Satoshi Asami <asami@vader.cs.berkeley.edu>
Cc: jmb@freefall.freebsd.org, hackers@freebsd.org
Subject: Re: Sun Workshop compiler vs. GCC?
Message-ID: <Pine.GSO.3.95.970213154912.10210A-100000@aris>
In-Reply-To: <199702132243.OAA18747@vader.cs.berkeley.edu>

On Thu, 13 Feb 1997, Satoshi Asami wrote:

> I'm not saying options don't make a huge difference, I know I can make
> my compiler do totally stupid things (like if I take out -O :).
>
> I don't know what the -native option does, but what I'm saying is that
> once the "simple" optimizations are covered, adding more and more
> complex optimizations (as suggested by the "taking 3 times more to
> compile" comment) is not going to give you much difference.

-native causes it to optimize for whatever CPU is on the machine you're
using to compile (e.g. 486, Pentium, PPro, sun4c, sun4m, sun4u, etc.).
It will generate code that's compatible with the other processors,
though (you can choose some other options and generate code that ONLY
runs on UltraSPARC or PPro). -fast includes -xO4 and -native.

As for "not making much difference", I'd be inclined to disagree. Here's
what the Sun compiler does at the different levels (for x86):

     -xO1      Preloads arguments from memory, cross jumping
               (tail merging), as well as the single pass of
               the default optimization.

     -xO2      Schedules both high- and low-level instructions
               and performs improved spill analysis, loop
               memory-reference elimination, register lifetime
               analysis, enhanced register allocation, and
               elimination of global common subexpression.

     -xO3      Performs loop strength reduction, induction
               variable elimination, as well as the
               optimization done by level 2.

     -xO4      Performs loop unrolling, avoids creating stack
               frames when possible, and automatically inlines
               functions contained in the same file, as well as
               the optimization done by levels 2 and 3. Note
               that this optimization level can cause stack
               traces from adb and dbx to be incorrect.

     -xO5      Generates the highest level of optimization.
               Uses optimization algorithms that take more
               compilation time or that do not have as high a
               certainty of improving execution time. Some of
               these include generating local calling
               convention entry points for exported functions,
               further optimizing spill code, and added
               analysis to improve instruction scheduling.

Your argument probably applies to -xO5, but it sounds like -xO4 performs
some very useful optimizations indeed. I routinely run the Metrowerks
compiler at -O7 (it has four levels of optimization, plus peephole
optimization plus code scheduling), so either of these compilers sounds
a lot more sophisticated than GCC, if for no other reason than the
granularity of choices available.

> Of course, if the original Sun compiler was very brain damaged, you
> could see a big improvement. Maybe it was running in 386 mode without
> -native or something? :)

If you read the man page for the 4.0 ProCompiler, it sounds much less
sophisticated than what I've just excerpted, at least for x86. It also
didn't support PPro optimization.

BTW, Sun is dropping support for the 386 as of Solaris 2.6, so one would
presume that they'll recompile with optimizations for best performance
across 486/Pentium/PPro, with perhaps separate versions of bcopy() or
other speed-critical functions (they already do this on UltraSPARC).

------------------------------------------------------------------------------
|Jake Hamby| APT Engineer at JPL, CS student at Cal Poly, and BeOS developer!|
------------------------------------------------------------------------------
"Life is hard..."
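
For readers unfamiliar with the terms in the -xO3 and -xO4 descriptions
above, here is a minimal hand-written C sketch of what "loop strength
reduction" and "loop unrolling" mean at the source level. It is only an
illustration: the function names are hypothetical, the unroll factor of
4 is an arbitrary assumption, and none of this is output from (or a
claim about the exact behavior of) the Sun compiler.

```c
#include <stddef.h>

/* Baseline: every iteration recomputes the index with a multiply. */
long sum_strided_naive(const long *a, size_t n, size_t stride)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i * stride];
    return sum;
}

/* Strength-reduced by hand: the multiply i * stride is replaced by a
 * running index j that is advanced with an add on each iteration. */
long sum_strided_reduced(const long *a, size_t n, size_t stride)
{
    long sum = 0;
    size_t j = 0;                 /* always equals i * stride */
    for (size_t i = 0; i < n; i++) {
        sum += a[j];
        j += stride;
    }
    return sum;
}

/* Unrolled by hand (factor 4, chosen arbitrarily): fewer loop-control
 * branches per element; a cleanup loop handles the remainder. */
long sum_unrolled4(const long *a, size_t n)
{
    long sum = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        sum += a[i] + a[i + 1] + a[i + 2] + a[i + 3];
    for (; i < n; i++)            /* leftover 0..3 elements */
        sum += a[i];
    return sum;
}
```

Each transformed function returns the same result as its plain
counterpart; an optimizer performing these rewrites would do so on its
internal representation rather than on the source, and whether it pays
off depends on the target CPU.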