Date: Sun, 05 Jul 2009 22:16:27 +0300
From: Alexander Motin <mav@FreeBSD.org>
To: Adrian Chadd <adrian@freebsd.org>
Cc: freebsd-arch@freebsd.org
Subject: Re: DFLTPHYS vs MAXPHYS
Message-ID: <4A50FC0B.9090601@FreeBSD.org>
In-Reply-To: <d763ac660907051158i256c0f93n4a895a992c2a8c34@mail.gmail.com>
References: <4A4FAA2D.3020409@FreeBSD.org> <20090705100044.4053e2f9@ernst.jennejohn.org>
 <4A50667F.7080608@FreeBSD.org> <20090705223126.I42918@delplex.bde.org>
 <4A50BA9A.9080005@FreeBSD.org> <20090706005851.L1439@besplex.bde.org>
 <4A50DEE8.6080406@FreeBSD.org> <20090706034250.C2240@besplex.bde.org>
 <4A50F619.4020101@FreeBSD.org> <d763ac660907051158i256c0f93n4a895a992c2a8c34@mail.gmail.com>
Adrian Chadd wrote:
> 2009/7/6 Alexander Motin <mav@freebsd.org>:
>
>> In these tests you've got almost only the negative side of the effect, as
>> you have said, due to cache misses. Do you really have a CPU with such a
>> small L2 cache? Some kind of P3 or old Celeron? But with 64K MAXPHYS you
>> just didn't get any benefit from using a bigger block size.
>
> All the world isn't your current desktop box with only SATA devices :)

This is a laptop, and what do you mean by "only SATA"? Do you know of any
storage whose performance degrades with big transactions?

> There have been and will be plenty of little embedded CPUs with tiny
> amounts of cache for quite some time to come.

Fine, let's set it to 8K on ARM. What do you want to say by that?

> You're also doing simple stream IO tests. Please re-think the thought
> experiment with a whole lot of parallel IO going on rather than just
> straight single stream IO.

Please don't. Parallel access with big blocks just becomes more linear as
the block length grows. For modern drives with >100MB/s speeds and 10ms
access time it is madness to transfer less than 1MB per transaction with
random access.

> Also, please realise that part of having your cache thrashed is what
> it does to the performance of -other- code. dd may be fast, but if
> you're constantly purging your caches by copying around all of that
> data, subsequent code has to go and freshen the cache again. On older
> and anaemic embedded/low power boxes the cost of a cache miss vs cache
> hit can still be quite expensive.

I think that anaemic embedded/low-power boxes would prefer to have the
chipset hardware handle the operation as much as possible without
interrupting the CPU. Also, please read one of my previous posts. I don't
see why, with, for example, a 1M user-level buffer, buffer-cache-backed
access split into many small disk transactions would trash the CPU cache
any less. It transfers the same amount of data into the same buffer cache
memory addresses. It is not the disk transaction DMA size that trashes the
cache. If you want to fight that - OK, but not there.

-- 
Alexander Motin
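As a back-of-the-envelope illustration of the transfer-size argument above,
here is a small standalone sketch (not part of the original thread; it uses
the 100 MB/s sequential rate and 10 ms access time quoted in the mail, and the
transfer sizes are just examples) that computes the effective random-I/O
throughput for a given per-transaction size:

/*
 * Illustrative sketch only: effective random-access throughput of a drive
 * with an assumed ~100 MB/s sequential rate and ~10 ms access time, as a
 * function of the per-transaction transfer size.
 */
#include <stdio.h>

int
main(void)
{
	const double seq_mbps = 100.0;	/* assumed sequential rate, MB/s   */
	const double access_s = 0.010;	/* assumed access (seek) time, sec */
	const int sizes_kb[] = { 64, 128, 256, 512, 1024, 2048 };

	for (size_t i = 0; i < sizeof(sizes_kb) / sizeof(sizes_kb[0]); i++) {
		double mb = sizes_kb[i] / 1024.0;
		double xfer_s = mb / seq_mbps;		/* time spent transferring */
		double eff = mb / (access_s + xfer_s);	/* effective MB/s          */

		printf("%5d KB per transaction -> %6.1f MB/s effective (%.0f%% of sequential)\n",
		    sizes_kb[i], eff, 100.0 * eff / seq_mbps);
	}
	return (0);
}

With those assumed numbers a 64K transaction yields roughly 6 MB/s effective
throughput, while a 1MB transaction yields about 50 MB/s, which is the point
being made about sub-1MB transfers under random access.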