Date:      Wed, 13 Mar 2002 23:34:11 +0100 (CET)
From:      BOUWSMA Beery <freebsd-user@dcf77-zeit.netscum.dyndns.dk>
To:        hackers@freebsd.org
Subject:   [LONG] Re: Performance of FreeBSD vs NetBSD (was: Re: Performance of -current vs -stable)
Message-ID:  <200203132234.g2DMYBO03364@beerswilling.netscum.dyndns.dk>

(sorry for the delay in following-up to this thread; when the Big Blue
 Room is cloudless and approaching 25 degrees at this time of year, I feel
 an uncontrollable craving to lock myself in that room most of the day)

I wrote:

> > Hmmm, a few weeks ago I did some totally unscientific testing, noting
> > that -current was much slower than -stable, by playing an mp3 with an

And then a lot of people responded.  So let me attempt to restate
things, and possibly clear things up thereby.  Or, you can just skip
straight to the end, where I reveal just what I did to restore similar
performance with FreeBSD that I saw under NetBSD, which shall terminate
this thread.  One can hope.


My observations were as follows:
o)  I had problems doing ``work'' and listening to mp3s with a native
    mpg123 binary under both FreeBSD-CURRENT and 4.5-STABLE.
o)  I had no problems with a comparable native binary and NetBSD-current.

o)  Both FreeBSD-CURRENT and FreeBSD-stable performed roughly identically,
    both with and without the kernel WITNESS option, so I wasn't seeing
    the killer performance there that others have noted, just as a side note.

I then asked if any of the config options I posted from part of my kernel
configuration for -STABLE were known show-stoppers to be avoided.  By the
time I had updated my archive of the mailing lists a day or two later,
nobody had pointed an accusing finger, so I've decided to do somewhat
more extensive testing.

Seat-of-the-pants observation with `top' showed an apparent improvement
by a factor of two in CPU usage when running the native NetBSD binary
under NetBSD.  Other observations I've made, that I'll be using as
datapoints later, are that a normal `buildworld' for both -current
and -stable on this 75MHz hardware takes somewhere around 1000 to 1100
minutes; also, a `nice'd `installworld' (out of necessity niced
in order to get relatively real-time audio playback with only a few
pauses each minute) took two or three hours when running mpg123.


Then I took one of the FreeBSD binaries and re-linked it statically,
in order to run it under NetBSD as well as both FreeBSDen.  With this,
the FreeBSD performance was unchanged, whilst that of NetBSD actually
improved, with `top' showing FreeBSD using roughly three times the CPU.

Now I'll be doing other tests, to guess whether this is a real system-
like issue, or if it only affects mpg123, or my audio setup, or what.
Ideas include timing a comparable build process under NetBSD (which
does rather differ from that of FreeBSD, so perhaps only for amusement
value), and attempting to run the same build process with both Net-
and FreeBSD.  Other tests limited to a `buildkernel' may be tried,
so I can get more results than a build per day and a half.


Hey, oboyoboy, one -STABLE FreeBSD test gave these results:
bash-2.05a$ time /usr/obj/ports/5.0/usr/ports/audio/mpg123/work/mpg123-0.59r/mpg123-static-O3-current -t -v /usr/home/mp3/hr-XXL-chillout-11.aug.mp3
[...]
Playing MPEG stream from hr-XXL-chillout-11.aug.mp3 ...
Junk at the beginning 00000000
MPEG 1.0, Layer: III, Freq: 44100, mode: Joint-Stereo, modext: 2, BPF : 522
Channels: 2, copyright: No, original: Yes, CRC: No, emphasis: 0.
Bitrate: 160 Kbits/s, Extension value: 0
Audio: 1:1 conversion, rate: 44100, encoding: signed 16 bit, channels: 2
Frame# 308633 [    0], Time: 134:22.24 [00:00.00],
[120:20] Decoding of hr-XXL-chillout-11.aug.mp3 finished.
real    35m43.727s
user    33m41.078s
sys     0m19.797s
bash-2.05a$

This seems to imply that at 35 realtime minutes for a 120 minute file,
FreeBSD-STABLE can decode at about 3 1/2 times realtime on a lightly
loaded system (-t decodes without touching the audio device).  That is
much closer to the NetBSD `top' CPU ratio.  This points
to the actual sound k0deZ as being responsible for the slowdown that I
experience.
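
The arithmetic behind that estimate can be sketched as follows.  This is
only an illustration using the figures from the run above; the helper
names are made up for the example and are not part of mpg123.

```python
def to_seconds(minutes, seconds=0.0):
    """Convert a minutes/seconds pair to plain seconds."""
    return minutes * 60 + seconds


def realtime_factor(audio_seconds, wall_seconds):
    """How many seconds of audio are decoded per second of wall time."""
    return audio_seconds / wall_seconds


def cpu_share(user_seconds, sys_seconds, audio_seconds):
    """Fraction of one CPU that decoding alone would need at 1x playback."""
    return (user_seconds + sys_seconds) / audio_seconds


# Figures taken from the -STABLE run above.
audio = to_seconds(120, 20)        # [120:20] reported by mpg123
wall = to_seconds(35, 43.727)      # real 35m43.727s
user = to_seconds(33, 41.078)      # user 33m41.078s
system = to_seconds(0, 19.797)     # sys  0m19.797s

factor = realtime_factor(audio, wall)
share = cpu_share(user, system, audio)

print(f"decode speed: {factor:.2f}x realtime")
print(f"CPU needed for decoding alone: {share:.1%}")
```

In other words, decoding by itself should need well under a third of
this CPU, so anything far beyond that during real playback has to come
from somewhere other than the decoder.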


Now I'll try to respond to points others have made and further muddy
the waters, or something.

Martin Ankerl noted:

> One real test is to
> measure how long your machine needs to decode a stream without threads with
> 100% CPU. Using mpg123 you can do this with
> time mpg123 -t mp3stream.mp3

Good idea.  Here's NetBSD-current compared with FreeBSD-current (same
static binary on all three OSen):
(time /usr/obj/ports/5.0/usr/ports/audio/mpg123/work/mpg123-0.59r/mpg123-static-O3-current -t -v /usr/home/mp3/hr-XXL-chillout-21.okt.mp3 )

NetBSD:
[...]
Playing MPEG stream from hr-XXL-chillout-21.okt.mp3 ...
Junk at the beginning 00000000
MPEG 1.0, Layer: III, Freq: 44100, mode: Joint-Stereo, modext: 2, BPF : 417
Channels: 2, copyright: No, original: Yes, CRC: No, emphasis: 0.
Bitrate: 128 Kbits/s, Extension value: 0
Audio: 1:1 conversion, rate: 44100, encoding: signed 16 bit, channels: 2
Frame# 382927 [    0], Time: 166:42.99 [00:00.00],
[174:13] Decoding of hr-XXL-chillout-21.okt.mp3 finished.
real    49m56.748s
user    48m10.644s
sys     0m18.401s

FreeBSD-CURRENT:
[...]
Frame# 382927 [    0], Time: 166:42.99 [00:00.00],
[174:13] Decoding of hr-XXL-chillout-21.okt.mp3 finished.
real    52m28.443s
user    49m5.392s
sys     0m48.884s


This difference here is *not* something I'm going to lose sleep over;
both systems were mostly, and comparably, idle -- this would be a Junior Nitpicking
Kernel Hacker task, if anything.
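
For anyone wanting to put numbers on "not losing sleep", here is a rough
sketch that parses the bash-style `time' values above and prints how
much slower FreeBSD-CURRENT was in each category.  The function names
are invented for this example.

```python
import re


def parse_bash_time(text):
    """Convert a bash `time` value such as '49m56.748s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def percent_slower(base, other):
    """How much larger `other` is than `base`, in percent."""
    return (other - base) / base * 100.0


# Values copied from the two runs above.
netbsd = {
    "real": parse_bash_time("49m56.748s"),
    "user": parse_bash_time("48m10.644s"),
    "sys": parse_bash_time("0m18.401s"),
}
freebsd_current = {
    "real": parse_bash_time("52m28.443s"),
    "user": parse_bash_time("49m5.392s"),
    "sys": parse_bash_time("0m48.884s"),
}

for key in ("real", "user", "sys"):
    delta = percent_slower(netbsd[key], freebsd_current[key])
    print(f"{key:4s}: FreeBSD-CURRENT {delta:+.1f}% relative to NetBSD")
```

The wall-clock gap works out to about 5%; the system time is the only
figure that differs by a large ratio, and it is tiny in absolute terms.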


> or
> time mpg123 -s mp3stream.mp3 > /dev/null
> if you additionally want to measure the I/O time.

Why not?  I've got three OSen to compare, so here I try FreeBSD-4.5-STABLE
against NetBSD.
time /usr/obj/ports/5.0/usr/ports/audio/mpg123/work/mpg123-0.59r/mpg123-static-O3-current -s /home/mp3/radio1-johnpeel-07.nov.mp3 > /dev/null

NetBSD:
[...]
Playing MPEG stream from radio1-johnpeel-07.nov.mp3 ...
Junk at the beginning 00000000
MPEG 1.0 layer III, 128 kbit/s, 44100 Hz joint-stereo
[120:00] Decoding of radio1-johnpeel-07.nov.mp3 finished.
2306.8u 10.6s 39:43.09 97.2% 0+0k 0+0io 533pf+0w

FreeBSD-STABLE:
[...]
[120:00] Decoding of radio1-johnpeel-07.nov.mp3 finished.
2076.543u 13.556s 35:59.23 96.7%        539+482k 0+0io 1582pf+0w

Well, well.  According to *this* test, I would expect *not* to be seeing
far worse performance with -stable than with NetBSD.
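
The two csh-style `time' lines above can be compared with a small
sketch like the following.  It assumes the usual field order (user,
system, elapsed, CPU percentage) and ignores the remaining fields; the
function names are made up for the example.

```python
def parse_csh_time(line):
    """Return (user, system, elapsed) in seconds from a csh `time` line."""
    fields = line.split()
    user = float(fields[0].rstrip("u"))
    system = float(fields[1].rstrip("s"))
    elapsed = 0.0
    # Elapsed is mm:ss.ff (or hh:mm:ss); fold each part into seconds.
    for part in fields[2].split(":"):
        elapsed = elapsed * 60 + float(part)
    return user, system, elapsed


def cpu_percent(user, system, elapsed):
    """Share of wall time spent on the CPU, as csh reports it."""
    return (user + system) / elapsed * 100.0


NETBSD_LINE = "2306.8u 10.6s 39:43.09 97.2% 0+0k 0+0io 533pf+0w"
FREEBSD_LINE = "2076.543u 13.556s 35:59.23 96.7%        539+482k 0+0io 1582pf+0w"

nb_user, nb_sys, nb_elapsed = parse_csh_time(NETBSD_LINE)
fb_user, fb_sys, fb_elapsed = parse_csh_time(FREEBSD_LINE)

saving = (nb_elapsed - fb_elapsed) / nb_elapsed * 100.0
print(f"NetBSD:         {nb_elapsed:.2f}s elapsed")
print(f"FreeBSD-STABLE: {fb_elapsed:.2f}s elapsed")
print(f"FreeBSD-STABLE needed {saving:.1f}% less wall time")
```

By this measure -stable finished the same file in roughly 9% less wall
time than NetBSD, which is the opposite of what the playback experience
suggested.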



Brian T.Schellenberger noted:

> FWIW, I listen to music and do other things all the time, but I have 512M of
> ram and a 900MHz CPU, and I'm guessing he doesn't.

Well, I do (sort of), but what fun is that? :-)   I mean, for most of
what I do, it makes little difference if my machine is 99,9% idle vs
97% idle.  Using today's `slow' machines makes inefficiencies all the
more obvious, and it's great when I can take these two machines that
a friend tossed out, never wanting to set eyes upon again, and with
FreeBSD or NetBSD or whatever, get a workstation (or server) which
can play smooth audio plus let one do Real Work.



Kris Kennaway observed:

> > | As you are no doubt aware there are significant infrastructural
> > | changes in -current relating to SMP scalability.  [...]
> > | Basically, it's a known issue.

> > At -stable as well as -current or at -current only?

> What I'm talking about is a -current issue only.  I don't recall
> reading the earlier thread.

Okay.  I see a minor performance hit with -current that isn't enough
to get me riled up, with the above tests (decoding only, audio path
doesn't enter the picture).

However, when the audio path enters the picture in both -current and
-stable, then I see a major bigtime performance drop, while NetBSD's
observed unscientific unofficial performance doesn't appear to suffer.

Given that -current has known issues, and is in a state of flux, and
I see only a minor performance difference between -current and -stable,
I'll probably just conduct further tests against -stable.

Also, the last time I did any serious audio work with FreeBSD was back
when 3.3(?) was -stable, and then there were definite issues with 4.0
as -current that made it unsuitable for any of the uses I needed.  I
really should throw together a releng-3 machine as Yet Another Reference
for a number of things I see.  However, I've been using one of the PCI
audio cards with 4-stable recently without experiencing the problems
with audio sampling that 4-current of a couple years back gave me.

And another thing, back when I was using FreeBSD-3.x for my audio
machines, I saw noticeably better performance with them compared to
NetBSD-current of the time, such that I settled on FreeBSD for the
audio sampling/encoding machines I built then.  Not a factor of two
or three or anything, but enough to give me more breathing room.



Luigi Rizzo asked again:

> > > > what compile time options were used in the two cases ?
> > > > They surely can make a huge difference.

> > > Could it also be a possibility, that the NetBSD defaults differ from
> > > the FreeBSD defaults, I think this could make some difference too. :)

> > actually he mentioned in his post that he used the _same_ binary on fbsd
> > and netbsd (statically linked, netbsd with fbsd emu layer)

> actually later in the same message he mentioned he used a different
> binary, and the  "top" output showed two different names.

This all goes to show that my messages are no glistening example of
brevity and clarity, and that I probably need to go into politics.

The long story is that I recompiled all three *BSD binaries long ago
to get a modest speedup (50%?) by stealing the NetBSD options for
FreeBSD, or vice versa, or something.  So the native binaries were
somewhat similar, if not identical.

For reference, here's what gave me the FreeBSD-STABLE binary that may
have been used for my initial report, before building the static version:
<clickety-click>
waaah!  I just lost my working directory from .build_done.mpg123-0.59r_4
including all my hacks and Makefile modifications!  Curses.

Oh well, here's a guess as to what I used to get the FreeBSD-CURRENT
binary, which I relinked statically to use for the tests after my
initial observations, based on the Makefile contents and the options
I gave:

-O -pipe -O3 -mpentium  -mcpu=pentium -march=pentium -Wall -ansi -pedantic
-funroll-all-loops -ffast-math -fomit-frame-pointer \
         -DROT_I386 -DI386_ASSEM -DREAL_IS_FLOAT  -DPENTIUM_OPT
-DREAD_MMAP -DUSE_MMAP -DOSS -DTERM_CONTROL


The NetBSD options probably resemble
netbsd-i386-elf:
        $(MAKE) CC=cc LDFLAGS=-static \
                OBJECTS='decode_i386.o dct64_i386.o decode_i586.o \
                        audio_sun.o term.o' \
                CFLAGS='$(CFLAGS) -Wall -ansi -pedantic -O4 -fomit-frame-pointer \
                        -funroll-all-loops -ffast-math -DROT_I386 \
                        -DI386_ASSEM -DPENTIUM_OPT -DREAL_IS_FLOAT -DUSE_MMAP \
                        -DREAD_MMAP -DNETBSD -DTERM_CONTROL' \
                mpg123-make
Since I wasn't suffering awful performance with this binary natively,
I didn't tweak it much beyond what that package build gave me.

Initially I was using the native binaries from each OS and release,
but after building the FreeBSD static binaries, I used that for later
tests.  When I ran that on both FreeBSD-current and -stable, I saw
performance almost identical to the non-static binaries, so I didn't
bother to capture the `top' output that I already had from the native
shared library binaries, which is why the executable names differed.

Where I did see a change, in running the FreeBSD static binary with
NetBSD compatibility, I did capture the new `top' output.  Sorry that
it wasn't clear that the pictured `top' outputs under FreeBSD were
valid for the -static binary as well.


FWIW, I've just built a new -stable binary, using
cc -O -pipe -DMAXPARTITIONS=16  -DINET6 -O3 -mpentium  -mcpu=pentium \
   -march=pentium -O3  -Wall -ansi -pedantic -funroll-all-loops -ffast-math \
   -fomit-frame-pointer  -DROT_I386 -DI386_ASSEM -DREAL_IS_FLOAT -DPENTIUM_OPT \
   -DREAD_MMAP -DUSE_MMAP  -DTERM_CONTROL -c mpg123.c
(the -DMAXPARTITIONS is intended for kernel/world building, but I
didn't see an obvious way to specify a world-only `make' in make.conf
the way one can for kernel builds, so everything gets it so that I
don't have to hack as many kernel source files -- I think `COPTS'
does the opposite of what I want, giving additional flags used when
*not* building world, if I read right)

The -DOSS is missing to match NetBSD, but still no joy at startup,
where `truss' reveals many seconds spent in

THIS SOFTWARE COMES WITH ABSOLUTELY NO WARRANTY! USE AT YOUR OWN RISK!
write(2,0xbfbff384,71)                           = 71 (0x47)
open("/dev/dsp",0x1,00)                          = 3 (0x3)
ioctl(3,SNDCTL_DSP_GETBLKSIZE,0x8093f20)         = 0 (0x0)
ioctl(3,AUDIO_COMPAT_FLUSH,0x0)                  = 0 (0x0)
ioctl(3,SNDCTL_DSP_SETFMT,0xbfbffa10)            = 0 (0x0)
ioctl(3,SNDCTL_DSP_STEREO,0xbfbffa0c)            = 0 (0x0)
ioctl(3,SNDCTL_DSP_SPEED,0xbfbffa08)             = 0 (0x0)
ioctl(3,SNDCTL_DSP_SETFMT,0xbfbffa10)            = 0 (0x0)
ioctl(3,SNDCTL_DSP_STEREO,0xbfbffa0c)            = 0 (0x0)
ioctl(3,SNDCTL_DSP_SPEED,0xbfbffa08)             = 0 (0x0)
   [hundreds such lines snipped]
ioctl(3,SNDCTL_DSP_SETFMT,0xbfbffa10)            = 0 (0x0)
ioctl(3,SNDCTL_DSP_STEREO,0xbfbffa0c)            = 0 (0x0)
ioctl(3,SNDCTL_DSP_SPEED,0xbfbffa08)             = 0 (0x0)
close(3)                                         = 0 (0x0)
__sysctl(0xbfbffa00,0x2,0x8094900,0xbfbff9fc,0x0,0x0) = 0 (0x0)
sigaction(SIGINT,0xbfbffa5c,0xbfbffa44)          = 0 (0x0)
open("hr-XXL-clubnite-from-27.okt.2001.mp3",0x0,07) = 3 (0x3)
lseek(3,0x0,2)                                   = 146176000 (0x8b67800)
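
To quantify how much of the startup goes into those repeated ioctls, a
saved `truss' log could be tallied with something like the sketch below.
The sample is just the first few lines shown above, and the pattern
assumes the `ioctl(fd,REQUEST,arg)' layout seen in this output; it is
not a general truss parser.

```python
import re
from collections import Counter

# Matches lines such as: ioctl(3,SNDCTL_DSP_SPEED,0xbfbffa08) = 0 (0x0)
IOCTL_RE = re.compile(r"^ioctl\(\d+,([A-Z_0-9]+),")


def count_ioctls(lines):
    """Count how often each ioctl request name appears in truss output."""
    counts = Counter()
    for line in lines:
        match = IOCTL_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


SAMPLE = """\
open("/dev/dsp",0x1,00)                          = 3 (0x3)
ioctl(3,SNDCTL_DSP_GETBLKSIZE,0x8093f20)         = 0 (0x0)
ioctl(3,AUDIO_COMPAT_FLUSH,0x0)                  = 0 (0x0)
ioctl(3,SNDCTL_DSP_SETFMT,0xbfbffa10)            = 0 (0x0)
ioctl(3,SNDCTL_DSP_STEREO,0xbfbffa0c)            = 0 (0x0)
ioctl(3,SNDCTL_DSP_SPEED,0xbfbffa08)             = 0 (0x0)
ioctl(3,SNDCTL_DSP_SETFMT,0xbfbffa10)            = 0 (0x0)
ioctl(3,SNDCTL_DSP_STEREO,0xbfbffa0c)            = 0 (0x0)
ioctl(3,SNDCTL_DSP_SPEED,0xbfbffa08)             = 0 (0x0)
close(3)                                         = 0 (0x0)
"""

counts = count_ioctls(SAMPLE.splitlines())
for name, total in counts.most_common():
    print(f"{total:4d}  {name}")
```

Run against the full log, the totals for SETFMT, STEREO and SPEED would
show how many times the device gets reconfigured before the first
sample is written.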


=-=-=-=-=-=-=-=-=-=-=-=

Well, if you're down here, you either read all of the above, so that
you've earned hearing about what I discovered gave me CPU back, or
else you've skipped down here to learn that.  Here goes.  Now, with
-stable, I'm running the static mpg123 I used earlier, and I see

CPU states: 28.3% user,  0.0% nice,  3.1% system,  2.7% interrupt, 65.9% idle
  PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
  250 beer      33   0   141M  2524K RUN      3:41 29.15% 29.15% mpg123-static-
  157 root      10 -52   888K   548K nanslp   1:18  0.10%  0.10% radioclkd
   90 root       2 -52  2560K  1544K select   0:11  0.00%  0.00% ntpd
  244 root      28   0  1440K  1180K RUN      0:07  0.00%  0.00% top

As a reminder, here's how NetBSD-native looked:
> CPU states: 38.1% user,  0.0% nice,  1.5% system,  1.0% interrupt, 59.4% idle
>   PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
>   229 beer      10    0   308K 3828K aud_wr     1:17 37.16% 37.16% mpg123

And here's how NetBSD with this same FreeBSD static binary seemed to look:

> CPU states: 20.3% user,  0.0% nice,  1.0% system,  0.0% interrupt, 78.7% idle
>   PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
>   241 beer      36    0   512K 2020K RUN        0:24 21.71% 21.63% mpg123-stati

I'm not going to worry over this unscientific difference; I'm already
seeing about a factor of two improvement under FreeBSD.  But why?

> sound card:  sbc0: <ESS ES1868> at port 0x220-0x22f,0x388-0x38b,0x330-0x331 irq 5 drq 1,0 on isa0
>              pcm0: <ESS 18xx DSP> on sbc0

Just as a test, I switched out this card, which I had previously used
for all the measurements and observations, for a different one before
taking the -stable `top' reading above:

pcm0: <Creative CT5880-C> port 0xfcc0-0xfcff irq 9 at device 11.0 on pci0

So, it seems that the `sbc' soundblaster k0deZ, as used by my ES1868, are
responsible for the slowdown I saw.

What remains to do are such things as...
o)  Seeing if NetBSD's performance changes in any way with this card
    [answer:  maybe...]
o)  Seeing if the 20%-CPU figure I saw under NetBSD is repeatable, or...
    [answer:  yes, except that with this card, it's more like 17%]
o)  Trying other genuine SB16 cards to see if there's a difference
o)  Timing a `buildkernel' or something with mpg123 in parallel
    [answer:  dropouts happen occasionally since I didn't `nice' the
     mpg123, but it takes a whopping 64min]
o)  Doing real ``work'' while listening to music



thanks to all who replied, and hope something above is useful
barry bouwsma
 
