Date: Wed, 25 Feb 2004 16:06:03 -0500
From: Chuck Swiger <cswiger@mac.com>
To: Petri Helenius <pete@he.iki.fi>
Cc: freebsd-performance@freebsd.org
Subject: Re: Bad performance on alpha? (make buildworld)
Message-ID: <403D0E3B.6090805@mac.com>
In-Reply-To: <403D06AE.8070903@he.iki.fi>
References: <20040223192103.59ad7b69.lehmann@ans-netz.de> <20040224202652.GA13675@diogenis.ceid.upatras.gr> <5410C982-6730-11D8-8D4C-003065ABFD92@mac.com> <20040225025953.GH10121@gsmx07.alcatel.com.au> <403C6A24.80804@he.iki.fi> <20040225194324.GI10121@gsmx07.alcatel.com.au> <403D06AE.8070903@he.iki.fi>
Petri Helenius wrote:
> Talking about different instruction sets and compiler scheduling
> options. Would it be considered a good idea to introduce a sysctl which
> would contain the maximum mcpu= value for the currently running
> architecture? This way one could provide multiple executables and
> a startup script, in the fashion of:
> prog.i386
> prog.pentium2
> prog.pentium3
> prog.pentium4
> prog.athlon-mp
> etc...

The idea you've suggested is interesting, although the difference in code
generation between a P2 and a P4, or for the AMD chips, is fairly minimal for
most code. The obvious exception is code which tries to take advantage of CPU
features like MMX, SSE, and 3DNow!.

In other words, your suggestion wouldn't help grep or the kernel very much,
but could be fairly useful for multimedia apps.

There's also a very good implementation for supporting multiple architectures
within a single binary: the Mach-O executable format (rather than ELF), which
is used to create "fat binaries", or "MAB"s (multi-architecture binaries).
Mach-O is the format used by NEXTSTEP and Mac OS X. Typically, adding a new
architecture only adds about 15% to the size of a particular executable,
although that can vary quite widely.

From /usr/include/mach-o/arch.h:

/* The NXArchInfo structs contain the architectures symbolic name
 * (such as "ppc"), its CPU type and CPU subtype as defined in
 * mach/machine.h, the byte order for the architecture, and a
 * describing string (such as "PowerPC").
 * There will both be entries for specific CPUs (such as ppc604e) as
 * well as generic "family" entries (such as ppc).
 */
typedef struct {
    const char *name;
    cpu_type_t cputype;
    cpu_subtype_t cpusubtype;
    enum NXByteOrder byteorder;
    const char *description;
} NXArchInfo;

#if __cplusplus
extern "C" {
#endif /* __cplusplus */

/* NXGetAllArchInfos() returns a pointer to an array of all known
 * NXArchInfo structures.  The last NXArchInfo is marked by a NULL name.
 */
extern const NXArchInfo *NXGetAllArchInfos(void);

/* NXGetLocalArchInfo() returns the NXArchInfo for the local host, or NULL
 * if none is known.
 */
extern const NXArchInfo *NXGetLocalArchInfo(void);

/* NXGetArchInfoFromName() and NXGetArchInfoFromCpuType() return the
 * NXArchInfo from the architecture's name or cputype/cpusubtype
 * combination.  A cpusubtype of CPU_SUBTYPE_MULTIPLE can be used
 * to request the most general NXArchInfo known for the given cputype.
 * NULL is returned if no matching NXArchInfo can be found.
 */
extern const NXArchInfo *NXGetArchInfoFromName(const char *name);
extern const NXArchInfo *NXGetArchInfoFromCpuType(cpu_type_t cputype,
                                                  cpu_subtype_t cpusubtype);

/* NXFindBestFatArch() is passed a cputype and cpusubtype and a set of
 * fat_arch structs and selects the best one that matches (if any) and returns
 * a pointer to that fat_arch struct (or NULL).  The fat_arch structs must be
 * in the host byte order and correct such that the fat_archs really points to
 * enough memory for nfat_arch structs.  It is possible that this routine could
 * fail if new cputypes or cpusubtypes are added and an old version of this
 * routine is used.  But if there is an exact match between the cputype and
 * cpusubtype and one of the fat_arch structs this routine will always succeed.
 */
extern struct fat_arch *NXFindBestFatArch(cpu_type_t cputype,
                                          cpu_subtype_t cpusubtype,
                                          struct fat_arch *fat_archs,
                                          unsigned long nfat_archs);
[ ... ]

----------

/usr/include/mach/machine.h supports the following CPUTYPEs:

/*
 *      Machine types known by all.
 */

#define CPU_TYPE_ANY            ((cpu_type_t) -1)

#define CPU_TYPE_VAX            ((cpu_type_t) 1)
/* skip                         ((cpu_type_t) 2)        */
/* skip                         ((cpu_type_t) 3)        */
/* skip                         ((cpu_type_t) 4)        */
/* skip                         ((cpu_type_t) 5)        */
#define CPU_TYPE_MC680x0        ((cpu_type_t) 6)
#define CPU_TYPE_I386           ((cpu_type_t) 7)
/* skip CPU_TYPE_MIPS           ((cpu_type_t) 8)        */
/* skip                         ((cpu_type_t) 9)        */
#define CPU_TYPE_MC98000        ((cpu_type_t) 10)
#define CPU_TYPE_HPPA           ((cpu_type_t) 11)
/* skip CPU_TYPE_ARM            ((cpu_type_t) 12)       */
#define CPU_TYPE_MC88000        ((cpu_type_t) 13)
#define CPU_TYPE_SPARC          ((cpu_type_t) 14)
#define CPU_TYPE_I860           ((cpu_type_t) 15)
/* skip CPU_TYPE_ALPHA          ((cpu_type_t) 16)       */
/* skip                         ((cpu_type_t) 17)       */
#define CPU_TYPE_POWERPC        ((cpu_type_t) 18)

...which appear to be a proper superset of the platforms FreeBSD supports.
For the sake of reference, since the CPU_SUBTYPE list is ~200 lines, here are
the x86 variants Mach-O knows about:

/*
 *      I386 subtypes.
 */

#define CPU_SUBTYPE_I386_ALL    ((cpu_subtype_t) 3)
#define CPU_SUBTYPE_386         ((cpu_subtype_t) 3)
#define CPU_SUBTYPE_486         ((cpu_subtype_t) 4)
#define CPU_SUBTYPE_486SX       ((cpu_subtype_t) 4 + 128)
#define CPU_SUBTYPE_586         ((cpu_subtype_t) 5)
#define CPU_SUBTYPE_INTEL(f, m) ((cpu_subtype_t) (f) + ((m) << 4))
#define CPU_SUBTYPE_PENT        CPU_SUBTYPE_INTEL(5, 0)
#define CPU_SUBTYPE_PENTPRO     CPU_SUBTYPE_INTEL(6, 1)
#define CPU_SUBTYPE_PENTII_M3   CPU_SUBTYPE_INTEL(6, 3)
#define CPU_SUBTYPE_PENTII_M5   CPU_SUBTYPE_INTEL(6, 5)

#define CPU_SUBTYPE_INTEL_FAMILY(x)     ((x) & 15)
#define CPU_SUBTYPE_INTEL_FAMILY_MAX    15

#define CPU_SUBTYPE_INTEL_MODEL(x)      ((x) >> 4)
#define CPU_SUBTYPE_INTEL_MODEL_ALL     0

-- 
-Chuck