Date: Fri, 3 Jun 2011 08:28:50 +0000
From: Alexander Best
To: Bruce Evans
Cc: svn-src-head@FreeBSD.org, mdf@FreeBSD.org, svn-src-all@FreeBSD.org,
 src-committers@FreeBSD.org, Pieter de Goeje
Subject: Re: svn commit: r221853 - in head/sys: dev/md dev/null sys vm
Message-ID: <20110603082850.GA91167@freebsd.org>
In-Reply-To: <20110531032658.N5049@besplex.bde.org>
References: <201105131848.p4DIm1j7079495@svn.freebsd.org>
 <201105282103.43370.pieter@degoeje.nl>
 <20110531004247.C4034@besplex.bde.org>
 <20110531032658.N5049@besplex.bde.org>

On Tue May 31 11, Bruce Evans wrote:
> On Mon, 30 May 2011 mdf@FreeBSD.org wrote:
> 
> > On Mon, May 30, 2011 at 8:25 AM, Bruce Evans wrote:
> >> On Sat, 28 May 2011 mdf@FreeBSD.org wrote:
> >>> ...
> >>> Meanwhile you could try setting ZERO_REGION_SIZE to PAGE_SIZE and I
> >>> think that will restore things to the original performance.
> >>
> >> Using /dev/zero always thrashes caches by the amount <source buffer
> >> size> + <target buffer size> (unless the arch uses nontemporal memory
> >> accesses for uiomove, which none do AFAIK).  So a large source buffer
> >> is always just a pessimization.  A large target buffer size is also a
> >> pessimization, but for the target buffer a fairly large size is needed
> >> to amortize the large syscall costs.  In this PR, the target buffer
> >> size is 64K.  ZERO_REGION_SIZE is 64K on i386 and 2M on amd64.  64K+64K
> >> on i386 is good for thrashing the L1 cache.
> >
> > That depends -- is the cache virtually or physically addressed?  The
> > zero_region only has 4k (PAGE_SIZE) of unique physical addresses.  So
> > most of the cache thrashing is due to the user-space buffer, if the
> > cache is physically addressed.
> 
> Oops.  I now remember thinking that the much larger source buffer would
> be OK since it only uses 1 physical page.  But it is apparently
> virtually addressed.
> 
> >> It will only have a
> >> noticeable impact on a current L2 cache in competition with other
> >> threads.  It is hard to fit everything in the L1 cache even with
> >> non-bloated buffer sizes and 1 thread (16 for the source (I)cache, 0
> >> for the source (D)cache and 4K for the target cache might work).  On
> >> amd64, 2M+2M is good for thrashing most L2 caches.  In this PR, the
> >> thrashing is limited by the target buffer size to about 64K+64K, up
> >> from 4K+64K, and it is marginal whether the extra thrashing from the
> >> larger source buffer makes much difference.
> >>
> >> The old zbuf source buffer size of PAGE_SIZE was already too large.
> >
> > Wouldn't this depend on how far down from the use of the buffer the
> > actual copy happens?  Another advantage to a large virtual buffer is
> > that it reduces the number of times the copy loop in uiomove has to
> > return up to the device layer that initiated the copy.  This is all
> > pretty fast, but again assuming a physical cache fewer trips is
> > better.
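
for concreteness, the workload being discussed (a 64K target buffer filled
from /dev/zero) boils down to a read loop like the one below. just a sketch
for illustration, not the actual benchmark from the PR; the iteration count
is made up:

/*
 * read /dev/zero into a 64K user buffer over and over.  each read() is
 * served by uiomove() copying from the kernel's zero region, so every
 * pass touches roughly <source buffer size> + <target buffer size>
 * worth of cache.
 */
#include <err.h>
#include <fcntl.h>
#include <unistd.h>

#define TARGETSIZE      (64 * 1024)     /* target buffer size from the PR */

int
main(void)
{
        static char buf[TARGETSIZE];
        int fd, i;

        fd = open("/dev/zero", O_RDONLY);
        if (fd == -1)
                err(1, "open(/dev/zero)");
        for (i = 0; i < 16384; i++)     /* ~1GB in total */
                if (read(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
                        err(1, "read");
        close(fd);
        return (0);
}
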
> 
> Yes, I had forgotten that I have to keep going back to the uiomove()
> level for each iteration.  That's a lot of overhead, although not nearly
> as much as going back to the user level.  If this is actually important
> to optimize, then I might add a repeat count to uiomove() and copyout()
> (actually a different function for the latter).
> 
> linux-2.6.10 uses a mmapped /dev/zero and has had this since Y2K
> according to its comment.  Sigh.  You will never beat that by copying,
> but I think mmapping /dev/zero is only much more optimal for silly
> benchmarks.
> 
> linux-2.6.10 also has a seekable /dev/zero.  Seeks don't really work,
> but some of them "succeed" and keep the offset at 0.  ISTR
> a FreeBSD PR about the file offset for /dev/zero not "working" because
> it is garbage instead of 0.  It is clearly a Linuxism to depend on it
> being nonzero.  IIRC, the file offset for device files is at best
> implementation-defined in POSIX.

i think you're referring to [1]. i posted a patch as a followup to that PR,
but later noticed that it was completely wrong. there was also a discussion
i started on @hackers with the subject line "seeking into /dev/{null,zero}",
but not much came out of it.

POSIX doesn't have anything to say about seeking in connection with
/dev/{null,zero}. it only states that:

"The behavior of lseek() on devices which are incapable of seeking is
implementation-defined. The value of the file offset associated with such a
device is undefined."

so basically we can decide for ourselves whether /dev/{null,zero} shall be
capable or incapable of seeking. i really think this issue should be settled
once and for all and then documented in the zero(4) and null(4) man pages.

so the question is: how do we want /dev/zero and /dev/null to behave when
seeking into these devices? right now HEAD features the following semantics:

reading from /dev/null != seeking
writing to /dev/null   != seeking
reading from /dev/zero == seeking
writing to /dev/zero   != seeking

please don't get me wrong: i'm NOT saying the current semantics are wrong.
the point is that the semantics need to be agreed upon and then documented
in the zero(4) and null(4) man pages, so people don't trip over this
question every couple of years. (i've appended a small test program below
my sig that prints the offset after each of the four cases above.)

cheers.
alex

[1] http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/152485

> 
> Bruce

-- 
a13x
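
just a sketch to illustrate the four cases above: it does one read() or
write() on each device and prints the file offset afterwards. not the patch
from the PR, and the 4k i/o size is arbitrary:

#include <sys/types.h>

#include <err.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* do a single read() or write() on path and report the resulting offset */
static void
check(const char *path, int flags)
{
        char buf[4096] = { 0 };
        off_t off;
        int fd;

        fd = open(path, flags);
        if (fd == -1)
                err(1, "open(%s)", path);
        if (flags == O_RDONLY)
                (void)read(fd, buf, sizeof(buf));
        else
                (void)write(fd, buf, sizeof(buf));
        off = lseek(fd, 0, SEEK_CUR);
        printf("%s\t%s:\toffset after i/o = %jd\n", path,
            flags == O_RDONLY ? "read" : "write", (intmax_t)off);
        close(fd);
}

int
main(void)
{
        check("/dev/null", O_RDONLY);
        check("/dev/null", O_WRONLY);
        check("/dev/zero", O_RDONLY);
        check("/dev/zero", O_WRONLY);
        return (0);
}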