Date: Sat, 28 Mar 2015 22:56:17 -0400 (EDT)
From: Rick Macklem
To: Konstantin Belousov
Cc: freebsd-fs@FreeBSD.org, freebsd-hackers@freebsd.org, Alexander Motin
Subject: Re: MAXBSIZE increase
Message-ID: <69948517.7558875.1427597777962.JavaMail.root@uoguelph.ca>
In-Reply-To: <20150328171315.GU2379@kib.kiev.ua>

Kostik wrote:
> On Fri, Mar 27, 2015 at 10:57:05PM +0200, Alexander Motin wrote:
> > Hi.
> >
> > Experimenting with NFS and ZFS I found an interoperation issue: ZFS
> > by default uses 128KB blocks, while FreeBSD NFS (both client and
> > server) is limited to 64KB requests by the value of MAXBSIZE. On
> > file rewrite, that limitation makes ZFS do slow read-modify-write
> > cycles for every write operation instead of just writing the new
> > data. A trivial iozone test shows a major difference between
> > initial-write and rewrite speeds because of this issue.
> >
> > Looking through the sources, I found (and in r280347 fixed) a
> > number of improper MAXBSIZE uses in device drivers. After that, I
> > see no reason why MAXBSIZE cannot be increased to at least 128KB
> > to match the ZFS default (ZFS now supports blocks up to 1MB, but
> > that is not the default and is so far rare). I made a test build
> > and also successfully created a UFS file system with a 128KB block
> > size -- not sure that is needed, but it seems to survive this
> > change well too.
> >
> > Is there anything I am missing, or is it safe to raise this limit
> > now?
>
> This post is useless after Bruce's explanation, but I still want to
> highlight the most important point from that long story:
>
> increasing MAXBSIZE without tuning the other buffer cache parameters
> would unbalance the buffer cache. Allowing bigger buffers increases
> fragmentation while limiting the total number of buffers. Also, it
> changes the tuning of the runtime limits on the amount of I/O in
> flight; see the hi/lo runningspace initialization.
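Just to put toy numbers on Kostik's point (the KVA budgets below are
round figures I made up for illustration, not the kernel's actual
tunables; BKVASIZE should match the stock definition, if I remember
it correctly):

#include <stdio.h>

#define	BKVASIZE	(16 * 1024)	/* KVA reserved per buffer */

static void
show(const char *tag, unsigned long long bufmap_kva,
    unsigned long long maxbsize)
{
	unsigned long long nominal = bufmap_kva / BKVASIZE;
	unsigned long long worst = bufmap_kva / maxbsize;

	printf("%s: %llu nominal buffers, %llu if all are %lluK\n",
	    tag, nominal, worst, maxbsize / 1024);
}

int
main(void)
{
	show("i386-ish,  64K max", 100ULL << 20, 64ULL << 10);
	show("i386-ish, 128K max", 100ULL << 20, 128ULL << 10);
	show("amd64-ish, 128K max", 1ULL << 30, 128ULL << 10);
	return (0);
}

Nothing exact, but it shows why allowing bigger buffers shrinks the
buffer count unless the other tunables move with it.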
From an NFS perspective, all NFS cares about is the maximum size of a
buffer cache block that it can use. Maybe it would help to create a
separate constant that specifically means "maximum buffer cache block
size" but does not define the maximum block size of any file system?
(I have sketched what I mean in the ps below.) If the constant only
defined the maximum buffer cache block size, it could be tuned per
architecture, so that amd64 could use much larger values for the
buffer cache tunables. (As Bruce explained, i386 puts a very low
limit on the buffer cache due to KVM limitations.)

Put another way: separate the maximum buffer cache block size from
the maximum block size used by any on-disk file system. Other than
the KVM limits, I think the problems with increasing MAXBSIZE arise
because it is used as the maximum block size for file systems like
UFS.

Btw, since NFS already uses 64K buffers by default, the buffer cache
is already unbalanced. Unfortunately, increasing BKVASIZE would allow
even fewer buffers on i386.

Has buffer cache fragmentation been causing anyone problems? Does
anyone know whether it can cause an outright failure, or does it just
hurt performance? (All I can see is that allocating buffers larger
than BKVASIZE can fragment the buffer cache's address space, so that
there might not be a contiguous area large enough for a buffer's
allocation. I don't know what happens then.)

I would like to see the NFS client be able to use a 128K rsize/wsize.
I would also like to see a larger buffer cache on machines like amd64
boxes with lots of RAM, so that wcommitsize (the size of write that
the client can do asynchronously) can be much larger, too. (For i386,
we probably have to live with a small buffer cache and maybe a 64K
maximum buffer cache block size.)

rick
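ps: Here is a rough sketch of the constant split I mean. MAXBCACHEBUF
is just a name I made up for this sketch (nothing by that name exists
in sys/param.h), the values are illustrative, and nfs_clamp_iosize()
is a stand-in, not the actual client mount code:

#include <stddef.h>

/*
 * Hypothetical split: MAXBSIZE keeps its current meaning for on-disk
 * file systems, while the buffer cache (and NFS) clamp against a
 * separate per-architecture limit.
 */
#define	MAXBSIZE	(64 * 1024)	/* still the fs block size limit */
#if defined(__amd64__)
#define	MAXBCACHEBUF	(128 * 1024)	/* plenty of KVM; ZFS default */
#else
#define	MAXBCACHEBUF	MAXBSIZE	/* i386 etc. keep today's limit */
#endif

/* Stand-in for the rsize/wsize fields of the NFS mount structure. */
struct nfs_iosize {
	size_t	rsize;
	size_t	wsize;
};

/* The NFS client would clamp against the cache limit, not MAXBSIZE. */
static void
nfs_clamp_iosize(struct nfs_iosize *iop)
{
	if (iop->rsize > MAXBCACHEBUF)
		iop->rsize = MAXBCACHEBUF;
	if (iop->wsize > MAXBCACHEBUF)
		iop->wsize = MAXBCACHEBUF;
}

With something like that, raising the amd64 buffer cache limit (and
rsize/wsize and wcommitsize along with it) would not change what UFS
and friends consider a legal block size.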