From: "Jayachandran C." <c.jayachandran@gmail.com>
To: Artem Belevich
Cc: freebsd-mips@freebsd.org
Date: Tue, 22 Feb 2011 10:34:17 +0530
Subject: Re: lib/libc/mips/string/bzero.S -- problem in 64-bit mode.
List-Id: Porting FreeBSD to MIPS

On Tue, Feb 22, 2011 at 6:37 AM, Artem Belevich wrote:
> Hi,
>
> I think there's a problem with the bzero implementation for 64-bit
> MIPS (SZREG==8).
>
> http://svn.freebsd.org/viewvc/base/head/lib/libc/mips/string/bzero.S?revision=209231&view=markup
>
> LEAF(bzero)
>         .set    noreorder
>         blt             a1, 3*SZREG, smallclr # small amount to clear?
>         PTR_SUBU        a3, zero, a0    # compute # bytes to word align address
>         and             a3, a3, SZREG-1
>         beq             a3, zero, 1f    # skip if word aligned
> #if SZREG == 4
>         PTR_SUBU        a1, a1, a3      # subtract from remaining count
>         SWHI            zero, 0(a0)     # clear 1, 2, or 3 bytes to align
>         PTR_ADDU        a0, a0, a3
> #endif
>
> #if SZREG == 8
>         PTR_SUBU        a1, a1, a3      # subtract from remaining count
>         PTR_ADDU        a0, a0, a3      # align dst to next word
>         sll             a3, a3, 3       # bits to bytes
>         li              a2, -1          # make a mask
> #if _BYTE_ORDER == _BIG_ENDIAN
> (a)     REG_SRLV        a2, a2, a3      # we want to keep the MSB bytes
> #endif
> #if _BYTE_ORDER == _LITTLE_ENDIAN
> (b)     REG_SLLV        a2, a2, a3      # we want to keep the LSB bytes
> #endif
> (c)     nor             a2, zero, a2    # complement the mask
>         REG_L           v0, -SZREG(a0)  # load the word to partially clear
>         and             v0, v0, a2      # clear the bytes
>         REG_S           v0, -SZREG(a0)  # store it back
> #endif
>
> Let's suppose we're trying to bzero something at 0x1234567. A3 will
> contain the number of bytes *remaining* until the next
> register-aligned address, i.e. 1 in this case.
> When we make it to (c), A2=0x00FFFFFF_FFFFFFFF on big-endian
> platforms and A2=0xFFFFFFFF_FFFFFF00 on little-endian ones.
> After (c) it's 0xFF000000_00000000 and 0x00000000_000000FF
> respectively, unless I've got the NOR semantics wrong.
>
> Now we load the register, AND it with a2 and write the result back.
> It does not look right -- we're clearing *7* bytes instead of only
> one, clobbering the data that precedes the start address.
>
> I believe the correct code should look like this:
>
> #if SZREG == 8
>         PTR_SUBU        a1, a1, a3      # subtract from remaining count
>         PTR_ADDU        a0, a0, a3      # align dst to next word
>         sll             a3, a3, 3       # bits to bytes
>         li              a2, -1          # make a mask
> #if _BYTE_ORDER == _BIG_ENDIAN
>         REG_SLLV        a2, a2, a3      # we want to keep the MSB bytes
> #endif
> #if _BYTE_ORDER == _LITTLE_ENDIAN
>         REG_SRLV        a2, a2, a3      # we want to keep the LSB bytes
> #endif
>         REG_L           v0, -SZREG(a0)  # load the word to partially clear
>         and             v0, v0, a2      # clear the bytes
>         REG_S           v0, -SZREG(a0)  # store it back
> #endif
>
> One thing I don't quite understand is -- why do we bother with all
> this manual masking at all? Why not just use REG_SHI for both the
> SZREG==4 and SZREG==8 cases? If I read the code I quoted above
> correctly, it attempts to emulate the SDL/SDR instructions. Using the
> REG_SHI macro would pick the correct swl/swr/sdl/sdr variant based on
> register size and endianness and would clear the unaligned bytes.
>
>         PTR_SUBU        a1, a1, a3      # subtract from remaining count
>         REG_SHI         zero, 0(a0)     # clear 1..SZREG-1 bytes to align
>         PTR_ADDU        a0, a0, a3

I just tested this with a simple program, and there is certainly an
issue here. If you can send me a patch, I can check it in after
testing.

The kernel version of bzero() does not seem to have the SZREG==8
case, or this bug.

Thanks,
JC.
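
The mask arithmetic discussed in this thread can be checked off-target with a small model. The sketch below is not the test program mentioned above and does not run any MIPS code; it only mimics the sll / li / shift / nor / and sequence on a 64-bit value and reports which byte offsets inside the aligned doubleword end up zeroed. The function names (head_mask, cleared_offsets) and the "fixed" flag are made up for illustration, and a3 is passed in as pad_bytes.

```python
MASK64 = (1 << 64) - 1  # model a 64-bit register


def head_mask(pad_bytes, big_endian, fixed):
    """Value of a2 that is ANDed into the doubleword at -SZREG(a0).

    pad_bytes models a3 (1..7), the bytes remaining up to the next
    8-byte boundary. fixed=False follows the code in revision 209231,
    fixed=True follows the variant proposed in the quoted message.
    """
    shift = pad_bytes * 8              # sll a3, a3, 3
    ones = MASK64                      # li  a2, -1
    if fixed:
        # proposed: shift left on big-endian, right on little-endian, no nor
        return (ones << shift) & MASK64 if big_endian else ones >> shift
    # current: shift right on big-endian, left on little-endian, then nor
    mask = ones >> shift if big_endian else (ones << shift) & MASK64
    return ~mask & MASK64              # nor a2, zero, a2


def cleared_offsets(pad_bytes, big_endian, fixed):
    """Byte offsets (0..7) within the aligned doubleword that get zeroed."""
    order = "big" if big_endian else "little"
    word = int.from_bytes(b"\xff" * 8, order)            # REG_L v0, -SZREG(a0)
    word &= head_mask(pad_bytes, big_endian, fixed)      # and   v0, v0, a2
    stored = word.to_bytes(8, order)                     # REG_S v0, -SZREG(a0)
    return [i for i, b in enumerate(stored) if b == 0]


if __name__ == "__main__":
    # 0x1234567 is 1 byte short of alignment, so only offset 7 should clear.
    for fixed in (False, True):
        for big_endian in (True, False):
            label = ("proposed" if fixed else "current",
                     "BE" if big_endian else "LE")
            print(label, cleared_offsets(1, big_endian, fixed))
```

For the 0x1234567 case (pad_bytes = 1), the model of the current code zeroes offsets 1 through 7 on both byte orders, while the proposed shifts zero only offset 7, which matches the analysis in the quoted message.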