Date:      Mon, 25 Oct 1999 19:52:54 -0700 (PDT)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Terry Lambert <tlambert@primenet.com>
Cc:        freebsd@gndrsh.dnsmgr.net (Rodney W. Grimes), tlambert@primenet.com, twinkle.star@263.net, freebsd-smp@FreeBSD.ORG
Subject:   Re: inquire(second time)
Message-ID:  <199910260252.TAA11014@apollo.backplane.com>
References:   <199910260226.TAA26348@usr06.primenet.com>

:> CPU demand vs memory bandwidth is only aggravated, and thus
:> requiring a faster memory subsystem, by SMP.
:
:It was my understanding that this would apply to memory mapped I/O,
:as well, but of course I haven't investigated RAMBus closely enough
:(obviously) to be able to say for sure.

    Like all memory subsystems, RAMBus depends heavily on pipelined burst
    memory ops (i.e. cache line fills and drains) to obtain its performance.
    The two main features of RAMBus are a MIPS-like command queueing
    interface between the cpu and the controller, which allows memory ops to
    be pipelined, and a special very high speed serial-like interface between
    the controller and memory.

    But this does diddly for I/O operations.  If you read from an I/O location,
    whether it is memory-mapped or not, the cpu will stall badly.  The same
    goes for writing to an I/O location, though in the case of writing there
    is usually a small write pipeline that allows some decoupling to occur.
    RAMBus doesn't help here.

    The only way to decouple the I/O bus from the processor completely is to 
    use a bus-master-DMA messaging interface whereby the cpu stores an I/O
    request in main memory and the I/O board DMAs the message in and then
    DMAs the response back out.  A DMA transaction involves no stalls (or,
    more specifically, the I/O board can release the bus when its pipeline
    is full to avoid stalling other memory or I/O transactions on the bus).
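
    As a rough illustration of the messaging scheme described above, here
    is a minimal user-space sketch in C, assuming a fixed-size ring of
    message slots in ordinary memory.  Every name in it (io_msg, io_ring,
    cpu_submit, device_poll, cpu_reap, OP_DOUBLE) is hypothetical, and the
    "device" is just a function that copies a slot in and out where a real
    board would DMA it.  A real driver would also need memory barriers, a
    doorbell, and physically contiguous DMA-safe memory, none of which is
    modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 8u               /* hypothetical ring size, power of two */

enum { OP_DOUBLE = 1 };             /* hypothetical request opcode */
enum { ST_PENDING = 0, ST_OK = 1, ST_BADOP = 2 };

/* One request/response message living in main memory. */
struct io_msg {
    uint32_t opcode;                /* written by the cpu */
    uint32_t arg;                   /* written by the cpu */
    uint32_t status;                /* written by the device */
    uint32_t result;                /* written by the device */
};

/* Invariant: reap <= complete <= submit, and submit - reap <= RING_SLOTS. */
struct io_ring {
    struct io_msg slot[RING_SLOTS];
    uint32_t submit;                /* next slot the cpu will fill */
    uint32_t complete;              /* next slot the device will process */
    uint32_t reap;                  /* next response the cpu will collect */
};

static void io_ring_init(struct io_ring *r)
{
    memset(r, 0, sizeof(*r));
}

/* cpu side: store a request in memory.  No I/O-space access happens here. */
static bool cpu_submit(struct io_ring *r, uint32_t opcode, uint32_t arg)
{
    assert(r != NULL);
    if (r->submit - r->reap == RING_SLOTS)
        return false;               /* ring full, caller retries later */
    struct io_msg *m = &r->slot[r->submit % RING_SLOTS];
    m->opcode = opcode;
    m->arg = arg;
    m->status = ST_PENDING;
    m->result = 0;
    r->submit++;
    return true;
}

/* Simulated device: "DMA" each pending message in, act on it, and
 * "DMA" the response back out.  Returns the number of messages handled. */
static unsigned device_poll(struct io_ring *r)
{
    unsigned handled = 0;
    while (r->complete != r->submit) {
        struct io_msg *m = &r->slot[r->complete % RING_SLOTS];
        struct io_msg local = *m;   /* stands in for the DMA read */
        if (local.opcode == OP_DOUBLE) {
            local.result = local.arg * 2u;
            local.status = ST_OK;
        } else {
            local.status = ST_BADOP;
        }
        *m = local;                 /* stands in for the DMA write */
        r->complete++;
        handled++;
    }
    return handled;
}

/* cpu side: collect the oldest finished response, if there is one. */
static bool cpu_reap(struct io_ring *r, struct io_msg *out)
{
    if (r->reap == r->complete)
        return false;               /* nothing finished yet */
    *out = r->slot[r->reap % RING_SLOTS];
    r->reap++;
    return true;
}
```

    The cpu only ever touches cacheable main memory in cpu_submit() and
    cpu_reap(), so in this model it never waits on an I/O-space read.  The
    three free-running indices are one possible bookkeeping choice, not
    anything specific to a particular board.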

    Alternatively, it is possible for the I/O device to have a certain amount
    of cacheable local memory which is mapped into the cpu's address space,
    allowing the cpu to access I/O requests and responses through its L1 and
    L2 caches directly (and to read the I/O request back with a burst
    cache-line fill).  This alternative offers double the bandwidth (you
    avoid a copy) but requires a bus-snooping cache to allow the I/O device
    to cache-invalidate the area the message response is stored in.

    The former mechanism can also theoretically use the L2 cache and not
    even bother to flush the I/O requests to main memory, depending on the
    sophistication of the cpu/cache/memory subsystem.  This allows the I/O
    board to be implemented without dual-port memory and maintains the
    fiction of a main-memory rendezvous without actually incurring the
    overhead of main memory.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

:The idea I was trying to communicate is that memory mapped I/O is
:a hell of a lot more friendly to SMP than inb/outb based device
:communication (cf. my keyboard LED example later in the posting).
:
:Anyway, if the RAMBus stuff is useless for direct memory mapped
:regions on devices (e.g. they can't come in except via slower
:memory elsewhere), then consider me thwacked.  8-).
:
:					Terry Lambert
:					terry@lambert.org


