Date: Sun, 10 Mar 2002 21:05:28 +0100 (CET)
Message-Id: <200203102005.g2AK5SN99334@beerswilling.netscum.dyndns.dk>
From: BOUWSMA Beery
Subject: Performance of FreeBSD vs NetBSD (was: Re: Performance of -current vs -stable)
To: hackers@freebsd.org

[replies sent directly to me may timeout and bounce, since I'm not
online as often as I should be, but I'll check the list archives]

I wrote a while back, in an old freebsd-current list thread, in which
it was determined that WITNESS was to be avoided in -current if one
wanted decent performance...

> Hmmm, a few weeks ago I did some totally unscientific testing, noting
> that -current was much slower than -stable, by playing an mp3 with an
[...]
> explain why FreeBSD's mpg123 takes ~60% CPU and NetBSD's ~30% (vs the
> ~90+% usage by -current)...

> Oh, I'll try rebuilding -current Real Soon^W^W later today, without
> WITNESS, and compare, just to stay on-topic for this list.

And so I did.  (build sans WITNESS, not stay on-topic)

The results were, well, interesting.

I built both a WITNESS and a WITNESSless kernel with more recent
k0deZ, and in the case of playing an mp3 file with `mpg123', I saw
practically no difference between the two, based on %cpu as shown
by `top' (like I say, completely unscientific and inaccurate)

That's interesting, because the previous -current+WITNESS reported
a sound-related lock order reversal and mpg123 took >90% cpu, while
neither of the more recent kernels had this lock order reversal.

In fact, the %cpu needed by `mpg123' seemed identical between
-current, both with and without WETNESS, and -stable.  Look:

Stable
CPU states: 52.7% user,  0.0% nice,  2.3% system,  5.8% interrupt, 39.2% idle

  PID USERNAME PRI NICE   SIZE    RES STATE   TIME   WCPU    CPU COMMAND
  272 beer      45    0   344M   908K RUN     3:10 56.10% 56.10% mpg123-O3
  260 root      28    0  1440K  1176K RUN     0:04  0.24%  0.24% top

Current+WITNESS
CPU states: 51.9% user,  0.0% nice,  8.3% system, 12.4% interrupt, 27.4% idle

  PID USERNAME PRI NICE   SIZE    RES STATE   TIME   WCPU    CPU COMMAND
  305 beer     115    0   113M   960K RUN     2:06 55.91% 55.91% mpg123-O3
   10 root     -16    0     0K    12K RUN     1:52 27.39% 27.39% idle
   29 root     -60 -179     0K    12K WAIT    0:14  5.96%  5.96% irq5: sbc0
   12 root     -48 -167     0K    12K WAIT    0:07  2.00%  2.00% swi6: tty:sio
  313 root      97    0  1544K  1264K RUN     0:02  1.19%  1.17% top

Current-without
CPU states: 54.0% user,  0.0% nice,  6.8% system, 10.6% interrupt, 28.7% idle

  PID USERNAME PRI NICE   SIZE    RES STATE   TIME   WCPU    CPU COMMAND
  322 beer     117    0 97400K  1384K RUN     4:44 55.18% 55.18% mpg123-O3
   10 root     -16    0     0K    12K RUN     5:38 26.95% 26.95% idle
   29 root     -60 -179     0K    12K WAIT    0:35  6.01%  6.01% irq5: sbc0
   12 root     -48 -167     0K    12K WAIT    0:19  2.83%  2.83% swi6: tty:sio
  313 root      97    0  1544K  1252K RUN     0:16  1.07%  1.07% top

In both -current and -stable, the audio is usually smooth but
periodically has a hiccup or two and loops briefly.

But the very same hardware, booted into NetBSD off the same disk,
running a NetBSD-native binary of mpg123 on NetBSD-current shows this:

CPU states: 38.1% user,  0.0% nice,  1.5% system,  1.0% interrupt, 59.4% idle

  PID USERNAME PRI NICE   SIZE    RES STATE   TIME   WCPU    CPU COMMAND
  229 beer      10    0   308K  3828K aud_wr  1:17 37.16% 37.16% mpg123
  245 root      28    0   164K   860K CPU     0:00  0.89%  0.29% top

This machine can happily do a second task without it needing to be
`nice'd and still exude clean audio.  Not possible with FreeBSD.

Just in case I had botched the optimizations for the FreeBSD versions
of mpg123, I compiled them statically (I couldn't get the NetBSD
version to run under FreeBSD tho), and ran those under NetBSD with
the COMPAT_FREEBSD kernel option.  Under FreeBSD I saw no change in
CPU needed, whilst surprisingly, the static FreeBSD binary run under
NetBSD on the same hardware needed *less* cpu than before:

CPU states: 20.3% user,  0.0% nice,  1.0% system,  0.0% interrupt, 78.7% idle

  PID USERNAME PRI NICE   SIZE    RES STATE   TIME   WCPU    CPU COMMAND
  241 beer      36    0   512K  2020K RUN     0:24 21.71% 21.63% mpg123-stati
  268 root      28    0   164K   860K CPU     0:00  0.74%  0.24% top

I'm concerned.  I know this isn't at all scientific, but why would
NetBSD require about 1/3 the CPU to run the same binary as either
-stable or -current FreeBSD on the same hardware?  Can I be slowing
my FreeBSD OSen by using a kernel option that I shouldn't, or is
something else coming into play?

Another thing of note, with this hardware, under FreeBSD, it takes
some seconds (ten or so) from the time I start mpg123 until it stops
pegging the CPU and starts to play audio, while with NetBSD, playing
starts immediately with both binaries.  If that means anything.
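
For anyone who wants to check the "about 1/3" figure, here is a minimal
sketch in Python that turns the mpg123 WCPU values from the `top'
snapshots above into ratios against -stable.  It is not part of the
original measurements: the dictionary labels and the helper name
relative_to() are made up for illustration, and only the percentages
come from this message.  Single `top' samples are noisy, so the
ratios are rough at best.

```python
# mpg123 WCPU percentages copied from the top(1) snapshots in this message.
# The labels are illustrative; only the numbers come from the post.
wcpu = {
    "FreeBSD -stable": 56.10,
    "FreeBSD -current + WITNESS": 55.91,
    "FreeBSD -current, no WITNESS": 55.18,
    "NetBSD, native binary": 37.16,
    "NetBSD, static FreeBSD binary": 21.71,
}


def relative_to(baseline, figures):
    """Return every figure as a fraction of the baseline entry."""
    base = figures[baseline]
    return {name: value / base for name, value in figures.items()}


ratios = relative_to("FreeBSD -stable", wcpu)
for name, ratio in ratios.items():
    # One row per configuration: raw WCPU and the ratio against -stable.
    print(f"{name:30s} {wcpu[name]:6.2f}%  {ratio:4.2f}x")
```

With these numbers the static FreeBSD binary under NetBSD comes out
at roughly 0.39 of the -stable figure, and the native NetBSD binary
at roughly 0.66, so "about 1/3" is slightly generous but in the
right neighbourhood.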

-stable:
ls: /etc/malloc.conf: No such file or directory
-current:
/FreeBSD-CURRENT/etc/malloc.conf -> aj

sound card:
sbc0: at port 0x220-0x22f,0x388-0x38b,0x330-0x331 irq 5 drq 1,0 on isa0
pcm0: on sbc0

cpu:
CPU: Pentium/P54C (75.00-MHz 586-class CPU)
  Origin = "GenuineIntel"  Id = 0x524  Stepping = 4
  Features=0x1bf
real memory  = 75497472 (73728K bytes)

(entire dmesg on request if of interest)

kernel config options used, if something is obviously a hog, -stable:

options  INET                    #InterNETworking
options  FFS                     #Berkeley Fast Filesystem
options  FFS_ROOT                #FFS usable as root device [keep this!]
options  SOFTUPDATES             #Enable FFS soft updates support
options  MFS                     #Memory Filesystem
options  PROCFS                  #Process filesystem
options  COMPAT_43               #Compatible with BSD 4.3 [KEEP THIS!]
options  SCSI_DELAY=2000         #Delay (in ms) before probing SCSI
options  USERCONFIG              #boot -c editor
options  VISUAL_USERCONFIG       #visual boot -c editor
options  KTRACE                  #ktrace(1) support
options  SYSVSHM                 #SYSV-style shared memory
options  SYSVMSG                 #SYSV-style message queues
options  SYSVSEM                 #SYSV-style semaphores
options  P1003_1B                #Posix P1003_1B real-time extensions
options  _KPOSIX_PRIORITY_SCHEDULING
options  ICMP_BANDLIM            #Rate limit bad replies
options  KBD_INSTALL_CDEV        # install a CDEV entry in /dev
options  ATA_STATIC_ID           #Static device numbering
options  PQ_CACHESIZE=512        # color for 512k/16k cache
options  CPU_SUSP_HLT
options  USER_LDT                #allow user-level control of i386 ldt
options  SHMMAXPGS=10000         # max amount of shared memory pages (4k on i386)
options  SHMALL=8192             # max amount of shared memory (bytes)
options  SHMMIN=1                # min shared memory segment size (bytes)
options  SHMMNI=200              # max number of shared memory identifiers
options  SHMSEG=200              # max shared memory segments per process
options  SEMMAP=31               # amount of entries in semaphore map
options  SEMMNI=128              # number of semaphore identifiers in the system
options  SEMMNS=65536            # number of semaphores in the system
options  SEMMNU=31               # number of undo structures in the system
options  SEMMSL=512              # max number of semaphores per id
options  SEMOPM=101              # max number of operations per semop call
options  SEMUME=11               # max number of undo entries per process
options  DDB
options  DDB_UNATTENDED
options  PERFMON
options  PPP_BSDCOMP             #PPP BSD-compress support
options  PPP_DEFLATE             #PPP zlib/deflate/gzip support
options  PPP_FILTER              #enable bpf filtering (needs bpf)
options  RANDOM_IP_ID
options  ICMP_BANDLIM
options  NFS_NOSERVER            #Disable the NFS-server code.
options  UFS_DIRHASH
options  EXT2FS
options  CAM_MAX_HIGHPOWER=2
options  MSGBUF_SIZE=40960
options  PPS_SYNC
options  MAXCONS=12              # number of virtual consoles
options  SC_DISABLE_REBOOT       # disable reboot key sequence
options  SC_HISTORY_SIZE=500     # number of history buffer lines
options  SC_NORM_ATTR="(FG_YELLOW|BG_BLACK)"
options  SC_NORM_REV_ATTR="(FG_BLACK|BG_GREEN)"
options  SC_KERNEL_CONS_ATTR="(FG_RED|BG_BLACK)"
options  SC_KERNEL_CONS_REV_ATTR="(FG_BLACK|BG_RED)"
options  CLK_USE_I8254_CALIBRATION
options  CLK_USE_TSC_CALIBRATION
options  NMBCLUSTERS=4096

(I post this in case something is glaringly obvious, before I start
to randomly disable half these options to see if things change)

Rest of kernel config skipped here, but can be posted if of interest

I haven't paid much attention to CPU usage of mpg123 on a 500MHz
machine with both OSen, but it's in the low single digits, and is
not as blindingly obvious as with this slow machine.

I've done a `buildworld' on this 75MHz machine for FreeBSD-current
and -stable is in progress, so I think I'll try a NetBSD build too
and see how elapsed time (many many hours) compares, not that that
is a reliable indicator of speed and efficiency either.

confused,
barry bouwsma
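
A possible way to make comparisons like the buildworld one a little
less hand-wavy than eyeballing `top': record elapsed time together
with the user and system CPU time the command itself was charged.
The sketch below is an assumption about how one might do that in
Python, not something that was run for this message; the function
name measure() and the example command line are hypothetical.  It
only sees CPU time accounted to the command and its waited-for
children, so interrupt time such as the irq5 load shown in the
-current snapshots is not captured.

```python
import resource
import subprocess
import time


def measure(argv):
    """Run argv to completion and return (elapsed, user, system) in seconds.

    user and system are the CPU times charged to the child process,
    taken as the difference in RUSAGE_CHILDREN before and after the run.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(argv, check=False)  # waits for the command to finish
    elapsed = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user = after.ru_utime - before.ru_utime
    system = after.ru_stime - before.ru_stime
    return elapsed, user, system


# Hypothetical usage (file name is a placeholder):
#   elapsed, user, system = measure(["mpg123", "some-file.mp3"])
#   print("cpu share: %.1f%%" % (100.0 * (user + system) / elapsed))
```

Running the same command under each OS and comparing user + system
seconds sidesteps the sampling noise of a single `top' screen, though
it is still only as good as each kernel's own accounting.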