Date: Mon, 25 Oct 1999 19:52:54 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Terry Lambert <tlambert@primenet.com>
Cc: freebsd@gndrsh.dnsmgr.net (Rodney W. Grimes), tlambert@primenet.com, twinkle.star@263.net, freebsd-smp@FreeBSD.ORG
Subject: Re: inquire(second time)
Message-ID: <199910260252.TAA11014@apollo.backplane.com>
References: <199910260226.TAA26348@usr06.primenet.com>

:> CPU demand vs memory bandwidth is only aggravated, and thusly
:> requiring a faster memory subsystem, by SMP.
:
:It was my understanding that this would apply to memory mapped I/O,
:as well, but of course I haven't investigated RAMBus closely enough
:(obviously) to be able to say for sure.

    Like all memory subsystems, RAMBus depends heavily on pipelined burst
    memory ops -- i.e. cache line fills and drains -- to obtain its
    performance. The two main features of RAMBus are a MIPS-like command
    queueing interface between the cpu and the controller, which allows
    memory ops to be pipelined, and a special very high speed serial-like
    interface between the controller and memory.

    But this does diddly for I/O operations. If you read from an I/O
    location, whether it is memory-mapped or not, the cpu will stall badly.
    The same goes for writing to an I/O location, though in the case of
    writing there is usually a small write pipeline that allows some
    decoupling to occur. RAMBus doesn't help here.

    The only way to decouple the I/O bus from the processor completely is
    to use a bus-master-DMA messaging interface whereby the cpu stores an
    I/O request in main memory and the I/O board DMAs the message in and
    then DMAs the response back out. A DMA transaction involves no stalls
    (or, more specifically, the I/O board can release the bus when its
    pipeline is full to avoid stalling other memory or I/O transactions on
    the bus).

    Alternatively, it is possible for the I/O device to have a certain
    amount of cacheable local memory which is mapped into the cpu's address
    space, allowing the cpu to access I/O requests and responses through
    its L1 and L2 caches directly (and to read back the I/O request with a
    burst cache-line fill). This alternative offers double the bandwidth
    (you avoid a copy) but requires a bus-snooping cache to allow the I/O
    device to cache-invalidate the area the message response is stored in.
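
    As an illustration only, the bus-master-DMA messaging interface
    described above might be sketched in C roughly as follows. Everything
    here is an assumption made for the example rather than any real
    driver or hardware interface: the names io_msg, io_ring, io_post,
    io_device_run and io_reap are hypothetical, the "device" is simulated
    in software with memcpy() standing in for DMA, and the doorbell
    counter stands in for the single posted write to a device register.
    A real implementation would also need memory barriers and
    cache-coherency handling, which are omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 8u            /* hypothetical ring size (power of two) */

/* One message slot living in ordinary cacheable main memory. */
struct io_msg {
    uint32_t opcode;             /* hypothetical request code */
    uint32_t arg;                /* request argument */
    uint32_t result;             /* written back by the device via DMA */
    uint32_t done;               /* set to 1 once result is valid */
};

struct io_ring {
    struct io_msg slot[RING_SLOTS];
    uint32_t head;               /* next slot the cpu will fill */
    uint32_t dev_next;           /* next slot the device will fetch */
    uint32_t tail;               /* oldest slot the cpu has not reaped */
    uint32_t doorbells;          /* counts posted I/O writes (illustration) */
};

/* cpu side: store the request in main memory, then ring the doorbell.
 * Returns the slot index used, or -1 if the ring is full. */
int io_post(struct io_ring *r, uint32_t opcode, uint32_t arg)
{
    assert(r != NULL);
    if (r->head - r->tail == RING_SLOTS)
        return -1;
    uint32_t i = r->head % RING_SLOTS;
    r->slot[i].opcode = opcode;
    r->slot[i].arg = arg;
    r->slot[i].result = 0;
    r->slot[i].done = 0;
    r->head++;
    r->doorbells++;              /* the only write that touches the I/O bus */
    return (int)i;
}

/* Device side (simulated): DMA each pending message in, process it,
 * and DMA the response back out. Returns the number handled. */
unsigned io_device_run(struct io_ring *r)
{
    unsigned handled = 0;
    while (r->dev_next != r->head) {
        struct io_msg local;
        uint32_t i = r->dev_next % RING_SLOTS;
        memcpy(&local, &r->slot[i], sizeof local);   /* "DMA in" */
        local.result = local.arg + 1;                /* placeholder work */
        local.done = 1;
        memcpy(&r->slot[i], &local, sizeof local);   /* "DMA out" */
        r->dev_next++;
        handled++;
    }
    return handled;
}

/* cpu side: reap the oldest completed response straight from memory,
 * without ever reading a device register. Returns 1 on success. */
int io_reap(struct io_ring *r, uint32_t *result)
{
    if (r->tail == r->head)
        return 0;                /* nothing outstanding */
    uint32_t i = r->tail % RING_SLOTS;
    if (!r->slot[i].done)
        return 0;                /* device has not answered yet */
    *result = r->slot[i].result;
    r->tail++;
    return 1;
}
```

    The point of the sketch is that the cpu only ever touches main memory
    (io_post and io_reap) plus one posted doorbell write per request, so
    it never has to stall on a read from an I/O location.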

    The bus-master-DMA mechanism can also theoretically use the L2 cache
    and not even bother to flush the I/O requests to main memory, depending
    on the sophistication of the cpu/cache/memory subsystem. This allows
    the I/O board to be implemented without dual-port memory and maintains
    the fiction of a main-memory rendezvous without actually incurring the
    overhead of main memory.

					-Matt
					Matthew Dillon
					<dillon@backplane.com>

:The idea I was trying to communicate is that memory mapped I/O is
:a hell of a lot more friendly to SMP than inb/outb based device
:communication (c.v. my keyboard LED example later in the posting).
:
:Anyway, if the RAMBus stuff is useless for direct memory mapped
:regions on devices (e.g. they can't come in except via slower
:memory elsewhere), then consider me thwacked. 8-).
:
:					Terry Lambert
:					terry@lambert.org

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message