Date:      Fri, 18 Jan 2002 15:43:47 +0100 (CET)
From:      Michal Mertl <mime@traveller.cz>
To:        Terry Lambert <tlambert2@mindspring.com>
Cc:        arch@FreeBSD.ORG
Subject:   Re: 64 bit counters again
Message-ID:  <Pine.BSF.4.41.0201181420210.15107-100000@prg.traveller.cz>
In-Reply-To: <3C47E1B2.6938136@mindspring.com>

On Fri, 18 Jan 2002, Terry Lambert wrote:

> Michal Mertl wrote:
> > > 4)    Measure CPU overhead as well as I/O overhead.
> >
> > I don't know what you mean by I/O overhead here.
>
> Say you could flood a gigabit interface, and it was 6% of the
> CPU on average.  Now after your patches, suppose that it's 10%
> of the CPU.  The limiting factor is the interface... but that's
> only for your application, which is not doing CPU intensive
> processing.  Something that did a lot of CPU work (like SSL),
> would have a different profile, and you would be limiting the
> application by causing it to become CPU bound earlier.
>

That explains only the CPU overhead, which I already knew existed.

> > > 6)    Use an SMP system, make sure that you have a sender
> > >       on both CPUs, and measure TLB shootdown and page
> > >       mapping turnover to ensure you get that overhead in
> > >       there, too (plus the lock overhead).
> >
> > I'm afraid I don't understand. I don't see that deep into the kernel,
> > unfortunately. If you tell me what to look at and how...
>
> The additional locks required for i386 64 bit atomicity will,
> if the counter is accessed by more than one CPU, result in
> bus contention for inter-CPU coherency.
>

What additional locks? Do you mean the lock prefix for cmpxchg8b? The lock
prefix is required for the 32 bit case too, where it increases the time
spent on the operation from 3 to 21 clocks. That makes the difference
between 32 and 64 bit "only" 29 clocks instead of 47.
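
To make the locked 64 bit update concrete, here is a minimal sketch of a
counter add built on a compare-and-swap retry loop. It is not code from any
patch in this thread. It assumes a C11 compiler with <stdatomic.h>, and the
names counter64_t and counter64_add are hypothetical. On a 32 bit i386
target the compiler would typically have to implement the 64 bit CAS with
"lock cmpxchg8b" (or a library call); that is an assumption about code
generation, not something the C code guarantees.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64 bit statistics counter; the name is illustrative only. */
typedef struct {
    _Atomic uint64_t value;
} counter64_t;

/*
 * Add delta to the counter with a compare-and-swap retry loop and return
 * the new value. Relaxed ordering is used because a statistics counter
 * only needs the update itself to be atomic, not ordered against other
 * memory accesses.
 */
uint64_t
counter64_add(counter64_t *c, uint64_t delta)
{
    assert(c != NULL);

    uint64_t old = atomic_load_explicit(&c->value, memory_order_relaxed);

    /* On failure, old is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak_explicit(&c->value, &old,
        old + delta, memory_order_relaxed, memory_order_relaxed))
        ;

    return old + delta;
}
```

The retry loop is what distinguishes this from the 32 bit case, where a
single locked add instruction is enough; under contention from another CPU
the loop may run more than once.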

> > > 7)    Make sure you are sending data already in the kernel,
> > >       so you aren't including copy overhead in the CPU cost,
> > >       since practically no one implements servers with copy
> > >       overhead these days.
> >
> > What do you mean by that? Zero-copy operation? Like sendfile? Is Apache
> > 1.x zero-copy?
>
> Yes, zero copy.  Sendfile isn't ideal, but works.  Apache is
> not zero copy.  The idea is to not include a lot of CPU work
> on copies between the user space and the kernel, which aren't
> going to happen in an extremely optimized application.
>

An "extremely optimized" application is one whose administrator would
not enable costly counters in the first place.

> > > If you push data at 100Mbit, and not even at full throttle at
> > > that, you can't reasonably expect to see a slowdown when you
> > > have other bottlenecks between you and the changes.
> > >
> > > In particular, you're not going to see things like the pool
> > > size go up because of increased pool retention time, etc.,
> > > due to the overhead of doing the calculations.
> >
> > That's probably correct even though I again don't fully understand what
> > you're talking about :-).
>
> Look at the max number of mbufs allocated.  They form a pool
> of type stable memory from which mbufs are allocated (things

<snip>

Thanks.

> > > Also, realize that even though simply pushing data doesn't
> > > use up a lot of CPU on FreeBSD if you do it right, even 2%
> > > or 4% increase in CPU overhead overall is enough to cause
> > > problems for already CPU-bound applications (i.e. that's
> > > ~40 fewer SSL connections per server).
> >
> > You're right with that too. Of course I know that at full CPU load the
> > clocks will be missing and maybe other things (memory bandwidth with
> > locked operations?) will suffer.
>
> Yes.  It's important to know whether it is significant for
> the bottleneck figure of merit for a particular application.
>
> For SSL, this is CPU cycles.  For an NFS server, this is how
> much data it can push in a given period of time (overall
> throughput).  For some other application, it's some other
> number.

<snip>

Agreed.

> > > But we can wait for your effects on the mbuf count high
> > > watermark and CPU utilization values before jumping to any
> > > conclusions...
> >
> > I'm afraid I can't provide any measurements with faster interfaces. I can
> > try to use a real server to send me some data so it's executing on both
> > processors, but I would probably be limited by 100Mbit sooner than
> > I'd notice that the processors have less time to do their job :(.
>
> Well, you probably should collect *all* statistics you can,
> in the most "this is the only thing I'm doing with the box"
> way you can, before and after the code change, and then plot
> the ones that get worse (or better) as a result of the change.

I will do that eventually, but unfortunately I don't have the time to
devote to it at the moment.

> > THE MOST IMPORTANT QUESTION, to which lots of you probably know answer
> > is, DO WE NEED ATOMIC OPERATIONS FOR ACCESSING DIFFERENT COUNTERS (e.g.
> > network-device (modified in ISR? - YES/NO) or network-protocol or
> > filesystem ...)? NO MATTER WHAT THE SIZE OF THE COUNTER IS.
> >
> I think the answer is "yes, we need atomic counters".  Whether they
> need to be 64 bit or just 32 bit is really application dependent
> (we have all agreed to that, I think).

Thanks. Do you think that's always true (STABLE/CURRENT, network device
ISRs, /sys/netinet routines)?

> See Bruce's posting about atomicity; I think it speaks very
> eloquently on the issue (much more brief than what I'd write
> to say the same thing ;^)).

If you mean the email where he talks about atomic_t ("atomic_t would be
"int" if anything"), it doesn't fully apply. I am not inventing atomic_t
anymore anyway :-). Isn't there a platform that works better with 64 bit
ints than with 32 bit ones (a la 32 vs. 16 bits on modern i386)?
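
One way to ask that question of a given platform, sketched here under the
assumption of a C11 compiler, is to compare the lock-free macros from
<stdatomic.h>. The function name counter_lock_free_level is hypothetical,
and the sketch assumes that int is 32 bits and long long is 64 bits, which
holds on common platforms but is not guaranteed by the C standard. It only
says whether atomics of each width are lock-free, not how fast they are.

```c
#include <stdatomic.h>

/*
 * Report the lock-free level for a counter of the given width.
 * Returns 2 if atomics of that width are always lock-free, 1 if they are
 * sometimes lock-free, and 0 if they are never lock-free.
 * Assumption: int is 32 bits wide and long long is 64 bits wide.
 */
int
counter_lock_free_level(int bits)
{
    return (bits == 64) ? ATOMIC_LLONG_LOCK_FREE : ATOMIC_INT_LOCK_FREE;
}
```

A platform where both widths report 2 can at least update either counter
size without falling back to a separate lock; the actual cost per update
still has to be measured.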


-- 
Michal Mertl
mime@traveller.cz









