Date: Wed, 3 Sep 2008 21:44:53 +0400
From: Igor Sysoev
To: Scott Long
Cc: Kostik Belousov, freebsd-stable@freebsd.org, Tor Egge
Subject: Re: vfs.ffs.rawreadahead
Message-ID: <20080903174452.GB73831@rambler-co.ru>
In-Reply-To: <20080903103846.T39726@pooker.samsco.org>

On Wed, Sep 03, 2008 at 10:44:46AM -0600, Scott Long wrote:
> On Wed, 3 Sep 2008, Igor Sysoev wrote:
> >On Wed, Sep 03, 2008 at 03:39:55PM +0300, Kostik Belousov wrote:
> >
> >>On Wed, Sep 03, 2008 at 01:53:52PM +0400, Igor Sysoev wrote:
> >>>Hi,
> >>>
> >>>could anyone tell what does vfs.ffs.rawreadahead enable ?
> >>>As I understand it's used in DIRECTIO code that allows read data
> >>>directly to an userland buffer bypassing the buffer cache.
> >>>What I can not understand where the read ahead data can be placed in ?
> >>
> >>The operation of the ffs_rawread is more accurately described as
> >>bypassing the page cache. It creates the physical buffer that maps
> >>the user pages.
> >>
> >>The readahead is performed only when the supplied user memory region
> >>is bigger than blocksize. In this case, two reads are performed
> >>simultaneously, with both buffers mapping consequent blocks from
> >>user-supplied buffers. The read operation looks like footsteps.
> >
> >Nice!
> >
> >As I understand the size limit of one read operation is MAXPHYS, which is
> >equal to 128K due to LBA28 ATA limit. On SCSI, SATA, and LBA48 ATA this
> >limit can be increased. Is it safe ?
>
> The value of MAXPHYS is unrelated to capabilities or limitations of ATA.
> It was chosen based on the needs to prevent an excessive amount of
> parallel I/O from exhausting the kernel address space and system memory.
> In fact, the concern was with SCSI, not with ATA.
>
> MAXPHYS can be raised, especially on 64bit platforms, but doing so also
> bloats the sizes of a few key data structures. I've been looking at a
> solution for this, and I'd rather that people keep their MAXPHYS changes
> confined to their local trees rather than changing FreeBSD unless they
> also solve the associated side effects.

As I understand it, MAXPHYS affects at least the pager_map size: on modern
machines it is usually 256 * MAXPHYS = 32M, so increasing MAXPHYS will
increase the map too.

128K is probably a good value, and I am not suggesting that the default be
increased. I just want to increase MAXPHYS to improve disk throughput on
some hosts where nginx serves large files (1G+) using DIRECTIO.

BTW, is it possible to make MAXPHYS a loader tunable ?

-- 
Igor Sysoev
http://sysoev.ru/en/
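
To make the pager_map arithmetic above concrete, here is a minimal Python sketch. It assumes only the relation stated in the message (pager_map is about 256 * MAXPHYS); the real kernel sizing may differ between FreeBSD versions and platforms, and the helper name `pager_map_bytes` and the list of candidate MAXPHYS values are hypothetical, chosen for illustration.

```python
KIB = 1024
MIB = 1024 * KIB

# Assumption taken from the message: pager_map is sized as 256 * MAXPHYS.
# This is not read from kernel sources and may not hold for every release.
PAGER_MAP_BUFFERS = 256


def pager_map_bytes(maxphys: int, nbufs: int = PAGER_MAP_BUFFERS) -> int:
    """Return the estimated pager_map size in bytes for a given MAXPHYS."""
    return nbufs * maxphys


if __name__ == "__main__":
    # Hypothetical candidate values; 128K is the default discussed above.
    for maxphys_kib in (128, 256, 512, 1024):
        size = pager_map_bytes(maxphys_kib * KIB)
        print(f"MAXPHYS={maxphys_kib}K -> pager_map={size // MIB}M")
```

Under that assumption the map grows linearly with MAXPHYS, which is one of the side effects Scott mentions for address-space-constrained (32-bit) platforms.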