From: Artem Belevich
To: Juli Mallett, "C. Jayachandran"
Cc: freebsd-mips@freebsd.org
Date: Mon, 21 Feb 2011 17:07:57 -0800
Subject: lib/libc/mips/string/bzero.S -- problem in 64-bit mode.
List-Id: Porting FreeBSD to MIPS

Hi,

I think there's a problem with the bzero implementation for 64-bit MIPS (SZREG == 8).

http://svn.freebsd.org/viewvc/base/head/lib/libc/mips/string/bzero.S?revision=209231&view=markup

    LEAF(bzero)
            .set noreorder
            blt a1, 3*SZREG, smallclr   # small amount to clear?
            PTR_SUBU a3, zero, a0       # compute # bytes to word align address
            and a3, a3, SZREG-1
            beq a3, zero, 1f            # skip if word aligned
    #if SZREG == 4
            PTR_SUBU a1, a1, a3         # subtract from remaining count
            SWHI zero, 0(a0)            # clear 1, 2, or 3 bytes to align
            PTR_ADDU a0, a0, a3
    #endif
    #if SZREG == 8
            PTR_SUBU a1, a1, a3         # subtract from remaining count
            PTR_ADDU a0, a0, a3         # align dst to next word
            sll a3, a3, 3               # bits to bytes
            li a2, -1                   # make a mask
    #if _BYTE_ORDER == _BIG_ENDIAN
    (a)     REG_SRLV a2, a2, a3         # we want to keep the MSB bytes
    #endif
    #if _BYTE_ORDER == _LITTLE_ENDIAN
    (b)     REG_SLLV a2, a2, a3         # we want to keep the LSB bytes
    #endif
    (c)     nor a2, zero, a2            # complement the mask
            REG_L v0, -SZREG(a0)        # load the word to partially clear
            and v0, v0, a2              # clear the bytes
            REG_S v0, -SZREG(a0)        # store it back
    #endif

Let's suppose we're trying to bzero something at 0x1234567. a3 will contain the number of bytes *remaining* until the next register-aligned address, i.e. 1 in this case, and after the sll it holds the shift count of 8 bits.

When we make it to (c), a2 = 0x00FFFFFF_FFFFFFFF on big-endian platforms and a2 = 0xFFFFFFFF_FFFFFF00 on little-endian ones. After (c) it's 0xFF000000_00000000 and 0x00000000_000000FF respectively, unless I've got the NOR semantics wrong.

Now we load the register, AND it with a2 and write the result back. That does not look right -- we're clearing *7* bytes instead of only one, and clobbering the data that precedes the start address.
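For what it's worth, the mask arithmetic above can be double-checked with a small Python model. This is only a sketch: the names current_mask and cleared_bytes are made up for illustration, it assumes a 64-bit register (SZREG == 8), and it models just the li/shift/nor sequence, not the load and store.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # model a 64-bit register


def current_mask(addr, big_endian):
    """Model the SZREG == 8 mask computation in the quoted bzero.S.

    Assumes addr is not already 8-byte aligned, since the real code
    branches around this sequence when a3 == 0.
    """
    a3 = (-addr) & 7          # PTR_SUBU a3, zero, a0 ; and a3, a3, SZREG-1
    shift = a3 * 8            # sll a3, a3, 3
    a2 = MASK64               # li a2, -1
    if big_endian:
        a2 = a2 >> shift              # (a) REG_SRLV a2, a2, a3
    else:
        a2 = (a2 << shift) & MASK64   # (b) REG_SLLV a2, a2, a3
    return ~a2 & MASK64       # (c) nor a2, zero, a2


def cleared_bytes(mask):
    """Count the byte lanes that an AND with this mask would zero."""
    return sum(1 for i in range(8) if (mask >> (8 * i)) & 0xFF == 0)


if __name__ == "__main__":
    for big_endian in (True, False):
        m = current_mask(0x1234567, big_endian)
        print(big_endian, hex(m), cleared_bytes(m))
```

For 0x1234567 the model gives 0xFF000000_00000000 (big-endian) and 0x00000000_000000FF (little-endian), and in both cases seven byte lanes are zeroed where only one should be.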
I believe the correct code should look like this:

    #if SZREG == 8
            PTR_SUBU a1, a1, a3         # subtract from remaining count
            PTR_ADDU a0, a0, a3         # align dst to next word
            sll a3, a3, 3               # bytes to bits
            li a2, -1                   # make a mask
    #if _BYTE_ORDER == _BIG_ENDIAN
            REG_SLLV a2, a2, a3         # we want to keep the MSB bytes
    #endif
    #if _BYTE_ORDER == _LITTLE_ENDIAN
            REG_SRLV a2, a2, a3         # we want to keep the LSB bytes
    #endif
            REG_L v0, -SZREG(a0)        # load the word to partially clear
            and v0, v0, a2              # clear the bytes
            REG_S v0, -SZREG(a0)        # store it back
    #endif

One thing I don't quite understand is why we bother with all this manual masking at all. Why not just use REG_SHI for both the SZREG == 4 and SZREG == 8 cases? If I read the code I quoted above correctly, it attempts to emulate the SDL/SDR instructions. Using the REG_SHI macro would pick the correct swl/swr/sdl/sdr variant based on register size and endianness, and would clear the unaligned bytes:

            PTR_SUBU a1, a1, a3         # subtract from remaining count
            REG_SHI zero, 0(a0)         # clear 1..SZREG-1 bytes to align
            PTR_ADDU a0, a0, a3

Thanks,
--Artem
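P.S. A rough way to sanity-check the proposed masks is to model the load/AND/store on an 8-byte buffer in Python. This is a sketch under assumptions, not a test of the real assembly: proposed_mask and apply_to_memory are hypothetical names, the 0xAA fill value is arbitrary, and the register is assumed to be 64 bits wide.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # model a 64-bit register


def proposed_mask(addr, big_endian):
    """Mask from the proposed fix: swapped shifts, no NOR."""
    a3 = (-addr) & 7          # bytes remaining until the aligned address
    shift = a3 * 8            # sll a3, a3, 3
    if big_endian:
        return (MASK64 << shift) & MASK64   # REG_SLLV: keep the MSB bytes
    return MASK64 >> shift                  # REG_SRLV: keep the LSB bytes


def apply_to_memory(addr, big_endian, fill=0xAA):
    """Model REG_L / and / REG_S on the aligned word containing addr.

    Returns the 8 bytes of that word in address order after the store.
    """
    order = "big" if big_endian else "little"
    word = int.from_bytes(bytes([fill] * 8), order)   # REG_L v0, -SZREG(a0)
    word &= proposed_mask(addr, big_endian)           # and v0, v0, a2
    return word.to_bytes(8, order)                    # REG_S v0, -SZREG(a0)


if __name__ == "__main__":
    for big_endian in (True, False):
        print(big_endian, apply_to_memory(0x1234567, big_endian).hex())
```

In this model, for 0x1234567 only the last byte of the word (the one at the start address) ends up zero on both endiannesses, and the seven bytes before it keep their contents.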