Date: Wed, 3 Sep 2008 11:56:09 -0600 (MDT)
From: Scott Long <scottl@samsco.org>
To: Igor Sysoev <is@rambler-co.ru>
Cc: Kostik Belousov <kostikbel@gmail.com>, freebsd-stable@freebsd.org, Tor Egge <tegge@freebsd.org>
Subject: Re: vfs.ffs.rawreadahead
Message-ID: <20080903114853.Q39726@pooker.samsco.org>
In-Reply-To: <20080903174452.GB73831@rambler-co.ru>
References: <20080903095352.GA62541@rambler-co.ru> <20080903123955.GE2038@deviant.kiev.zoral.com.ua> <20080903124733.GH62541@rambler-co.ru> <20080903103846.T39726@pooker.samsco.org> <20080903174452.GB73831@rambler-co.ru>
On Wed, 3 Sep 2008, Igor Sysoev wrote:

> On Wed, Sep 03, 2008 at 10:44:46AM -0600, Scott Long wrote:
>
>> On Wed, 3 Sep 2008, Igor Sysoev wrote:
>>> On Wed, Sep 03, 2008 at 03:39:55PM +0300, Kostik Belousov wrote:
>>>
>>>> On Wed, Sep 03, 2008 at 01:53:52PM +0400, Igor Sysoev wrote:
>>>>> Hi,
>>>>>
>>>>> could anyone tell what vfs.ffs.rawreadahead enables?
>>>>> As I understand it, it is used in the DIRECTIO code that reads data
>>>>> directly into a userland buffer, bypassing the buffer cache.
>>>>> What I cannot understand is where the read-ahead data is placed.
>>>>
>>>> The operation of ffs_rawread is more accurately described as
>>>> bypassing the page cache. It creates a physical buffer that maps
>>>> the user pages.
>>>>
>>>> The readahead is performed only when the supplied user memory region
>>>> is bigger than the blocksize. In this case, two reads are performed
>>>> simultaneously, with the two buffers mapping consecutive blocks of
>>>> the user-supplied buffer. The read operation looks like footsteps.
>>>
>>> Nice!
>>>
>>> As I understand it, the size limit of one read operation is MAXPHYS,
>>> which is equal to 128K due to the LBA28 ATA limit. On SCSI, SATA, and
>>> LBA48 ATA this limit can be increased. Is it safe?
>>
>> The value of MAXPHYS is unrelated to the capabilities or limitations
>> of ATA. It was chosen to prevent an excessive amount of parallel I/O
>> from exhausting the kernel address space and system memory. In fact,
>> the concern was with SCSI, not with ATA.
>>
>> MAXPHYS can be raised, especially on 64-bit platforms, but doing so
>> also bloats the sizes of a few key data structures. I've been looking
>> at a solution for this, and I'd rather that people keep their MAXPHYS
>> changes confined to their local trees rather than changing FreeBSD
>> unless they also solve the associated side effects.
> As I understand it, MAXPHYS affects at least the pager_map size: on
> modern machines it's usually 256 * MAXPHYS = 32M, therefore increasing
> MAXPHYS will increase that map too.

This is intended and desirable.

> The 128K is probably a good value and I do not suggest increasing it
> by default; I just want to increase MAXPHYS to improve disk throughput
> on some hosts where nginx serves large files (1G+) using DIRECTIO.

I've tested increases up to 1M, and they are all very beneficial, not
only for silly sequential-style benchmarks but also for clustered I/O.
256-512K is the sweet spot, but Windows has set the standard at 1M and
I'd like to have FreeBSD follow suit eventually.

> BTW, is it possible to make MAXPHYS a loader tunable?

No. Struct buf is sized based on MAXPHYS, and there's no convenient way
yet to dynamically size that at runtime.

Scott