From: Nate Lawson
To: Bruce Evans
cc: cvs-src@FreeBSD.org
cc: Mike Silbersack
cc: src-committers@FreeBSD.org
cc: cvs-all@FreeBSD.org
Date: Thu, 27 Mar 2003 09:58:48 -0800 (PST)
In-Reply-To: <20030327180247.D1825@gamplex.bde.org>
Subject: Re: Checksum/copy (was: Re: cvs commit: src/sys/netinet ip_output.c)

On Thu, 27 Mar 2003, Bruce Evans wrote:
> On Wed, 26 Mar 2003, Mike Silbersack wrote:
> > On Wed, 26 Mar 2003, Nate Lawson wrote:
> > > I don't want to hijack the thread too much, but has thought gone into a
> > > combined checksum and copy function?  The first mention I can remember of
> > > this is in RFC 817 p. 19-20.
>
> Is this RFC old?  Combined checksum and copy hasn't been a large
> optimization since L1 caches became large enough, since to a first
> approximation, everything is dominated by memory bandwidth and another
> pass to calculate the checksum is free because copying left all the
> data in the L1 cache.

Yes, the RFC is old.  However, there may still be a performance advantage
from instruction-level parallelism (ILP): while the data is being fetched
the first time (for the copy), idle issue slots can be filled with an
incremental checksum update.

> > Heh, I don't think anyone has.  What actually would make sense is for
> > someone who feels like doing ASM timing to look at our bcopy routines /
> > etc.
>
> I spent a lot of time on this about 7 years ago.  See ~bde/cache on
> freefall for old versions of programs that try lots of different
> copy/read/write checksum methods.  Better hardware made the differences
> between various methods relatively small.  One can probably do better
> (50%?) for largish (1K+ ?) buffers using SSE instructions on i386's
> now.

We definitely should have an SSE version for P3+.  The 128-bit
instructions make a big difference.  And for checksumming, you can do
8 packed 16-bit adds at once.

-Nate
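
For concreteness, here is a rough sketch of the combined checksum-and-copy
idea discussed above.  It is a portable scalar illustration only, not the
kernel's bcopy or in_cksum code and not an SSE version.  The name
copy_and_cksum is hypothetical, and the checksum is assumed to be the
16-bit one's-complement Internet checksum, with words read in network
byte order.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an existing kernel routine): copy len bytes
 * from src to dst and return the 16-bit one's-complement Internet
 * checksum of the data, touching each source byte only once.
 */
uint16_t
copy_and_cksum(void *dst, const void *src, size_t len)
{
	const uint8_t *s = src;
	uint8_t *d = dst;
	uint64_t sum = 0;

	/* Main loop: load two bytes, store them, and add them to the sum. */
	while (len >= 2) {
		uint8_t hi = s[0], lo = s[1];

		d[0] = hi;
		d[1] = lo;
		sum += ((uint32_t)hi << 8) | lo;	/* network byte order */
		s += 2;
		d += 2;
		len -= 2;
	}

	/* An odd trailing byte is treated as if padded with a zero byte. */
	if (len == 1) {
		d[0] = s[0];
		sum += (uint32_t)s[0] << 8;
	}

	/* Fold the carries back into the low 16 bits (end-around carry). */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

Calling copy_and_cksum(dst, src, len) leaves dst holding a copy of src and
returns the checksum as a host-order integer.  Whether this beats a
separate copy pass followed by a checksum pass depends on the hardware,
so it would need to be timed rather than assumed.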