From: Giorgos Keramidas
To: Xin LI
Cc: freebsd-amd64@freebsd.org, freebsd-arch@freebsd.org
Date: Mon, 1 Aug 2005 22:55:39 +0300
Subject: Re: [RFC] Port of NetBSD's optimized amd64 string code
Message-ID: <20050801195539.GB1406@beatrix.daedalusnetworks.priv>
In-Reply-To: <20050801182518.GA85423@frontfree.net>

On 2005-08-02 02:25, Xin LI wrote:
> Hi, Guys,
>
> Here is a patchset that I have produced to make our libc aware of the
> NetBSD assembly implementation of the string related operations.

I can't speak for the asm code, since I know barely enough amd64 things
to read it, but there are a few typos you might want to fix before this
gets committed.

> +     /*
> +      * Align to word boundry
> +      * Consider unrolling loop?

s/boundry/boundary/

> +      * (1) ~(((x & 0x7f....7f) + 0x7f....7f) | (x | 0x7f....7f))
> +      *
> +      * evaluates to a non-zero value if any of the bytes in the
> +      * original word is zero.
> +      *
> +      * It also has the useful property that bytes in the result word
> +      * that coorespond to non-zero bytes in the original word have
> +      * the value 0x00, while bytes cooresponding to zero bytes have

s/coorespond/correspond/ in the 2 last lines.

> +      * On little endian machines, the first byte in the result word
> +      * that cooresponds to a zero byte in the original byte is 0x80,

Ditto.

> +      * so clz() can be used as above.  On big endian machines, and
> +      * little endian machines without (or with a slow) clz() insn,
> +      * testing each byte in the original for zero is necessary

Missing final period.

> +     testq   $7,%rdx         # copy first group of 1 to 7 words
> +     jz      L2              # while swaping alternate bytes.

s/swaping/swapping/

That's all I could spot.
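
For readers who, like me, find the asm hard to follow, expression (1) from the quoted comment can be tried out in plain C. This is only a minimal sketch, assuming 64-bit words; the names LOW7, zero_byte_mask, first_zero_byte and check_examples are made up for illustration and do not come from the patch or from NetBSD.

```c
#include <assert.h>
#include <stdint.h>

/* 0x7f repeated in every byte of a 64-bit word (the "0x7f....7f" of the comment). */
#define LOW7 UINT64_C(0x7f7f7f7f7f7f7f7f)

/*
 * Expression (1): the result has 0x80 in every byte position where x
 * holds a zero byte, and 0x00 in every other byte position.
 * (x & LOW7) + LOW7 cannot carry between bytes, because each byte
 * sum is at most 0x7f + 0x7f = 0xfe.
 */
static uint64_t zero_byte_mask(uint64_t x)
{
    return ~(((x & LOW7) + LOW7) | (x | LOW7));
}

/*
 * Index of the first zero byte, counting from the least significant
 * byte (the lowest address on a little endian machine), or -1 if the
 * word has no zero byte.  A plain loop stands in here for the
 * bit-scanning instruction the asm would use.
 */
static int first_zero_byte(uint64_t x)
{
    uint64_t mask = zero_byte_mask(x);

    for (int i = 0; i < 8; i++) {
        if ((mask >> (8 * i)) & 0x80)
            return i;
    }
    return -1;
}

/* A few hand-checked cases (hypothetical helper, for illustration only). */
static void check_examples(void)
{
    assert(zero_byte_mask(UINT64_C(0x1122334455667788)) == 0);
    assert(zero_byte_mask(UINT64_C(0x1122330055667788)) ==
        UINT64_C(0x0000008000000000));
    assert(first_zero_byte(UINT64_C(0x0000000000434241)) == 3);
}
```

The last case is the word a little endian machine would load from the string "ABC" followed by zero padding, so the index 3 is the string length.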