Date:      Mon, 21 Feb 2011 17:07:57 -0800
From:      Artem Belevich <fbsdlist@src.cx>
To:        Juli Mallett <jmallett@freebsd.org>, "C. Jayachandran" <c.jayachandran@gmail.com>
Cc:        freebsd-mips@freebsd.org
Subject:   lib/libc/mips/string/bzero.S -- problem in 64-bit mode.
Message-ID:  <AANLkTik2evgf4-k85P%2Bsm953ofa0=UNd7o2uWhQw7qiB@mail.gmail.com>

Hi,

I think there's a problem with the bzero implementation for 64-bit mips (SZREG==8).

http://svn.freebsd.org/viewvc/base/head/lib/libc/mips/string/bzero.S?revision=209231&view=markup

LEAF(bzero)
	.set	noreorder
	blt		a1, 3*SZREG, smallclr # small amount to clear?
	PTR_SUBU	a3, zero, a0	# compute # bytes to word align address
	and		a3, a3, SZREG-1
	beq		a3, zero, 1f	# skip if word aligned
#if SZREG == 4
	PTR_SUBU	a1, a1, a3	# subtract from remaining count
	SWHI		zero, 0(a0)	# clear 1, 2, or 3 bytes to align
	PTR_ADDU	a0, a0, a3
#endif

#if SZREG == 8
	PTR_SUBU	a1, a1, a3	# subtract from remaining count
	PTR_ADDU	a0, a0, a3	# align dst to next word
	sll		a3, a3, 3	# bits to bytes
	li		a2, -1		# make a mask
#if _BYTE_ORDER == _BIG_ENDIAN
(a)	REG_SRLV	a2, a2, a3	# we want to keep the MSB bytes
#endif
#if _BYTE_ORDER == _LITTLE_ENDIAN
(b)	REG_SLLV	a2, a2, a3	# we want to keep the LSB bytes
#endif
(c)	nor		a2, zero, a2	# complement the mask
	REG_L		v0, -SZREG(a0)	# load the word to partially clear
	and		v0, v0, a2	# clear the bytes
	REG_S		v0, -SZREG(a0)	# store it back
#endif

Let's suppose we're trying to bzero something at 0x1234567. A3 will
contain the number of bytes *remaining* until the next register-aligned
address, i.e. 1 in this case; the sll turns that into a shift of 8 bits.
When we reach (c), A2=0x00FFFFFF_FFFFFFFF on big-endian platforms
and A2=0xFFFFFFFF_FFFFFF00 on little-endian ones.
After (c) it is 0xFF000000_00000000 and 0x00000000_000000FF
respectively, unless I've got the NOR semantics wrong.

Now we load the register, AND it with a2 and write the result back.
This does not look right -- we're clearing *7* bytes instead of only
one, and clobbering the data that precedes the start address.
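
To double-check the arithmetic above, here is a minimal C sketch that
models the current mask computation on plain 64-bit integers. It is
only a model of the register values, not the libc code: the function
names and the big_endian flag are hypothetical, and pad stands in for
a3 (1..7 bytes up to the next 8-byte boundary).

```c
#include <assert.h>
#include <stdint.h>

/* Mask built by the current SZREG == 8 path (hypothetical model).
 * pad        - a3 before the shift: bytes up to the next 8-byte boundary, 1..7
 * big_endian - nonzero selects branch (a), zero selects branch (b) */
uint64_t current_mask(unsigned pad, int big_endian)
{
	unsigned bits = pad * 8;	/* sll a3, a3, 3 */
	uint64_t mask = UINT64_MAX;	/* li a2, -1 */

	if (big_endian)
		mask >>= bits;		/* (a) REG_SRLV a2, a2, a3 */
	else
		mask <<= bits;		/* (b) REG_SLLV a2, a2, a3 */
	return ~mask;			/* (c) nor a2, zero, a2 */
}

/* Count how many of the 8 bytes "and v0, v0, mask" would zero. */
unsigned bytes_cleared(uint64_t mask)
{
	unsigned n = 0;

	for (int i = 0; i < 8; i++)
		if (((mask >> (8 * i)) & 0xff) == 0)
			n++;
	return n;
}
```

With pad == 1 the model reports seven cleared bytes for either byte
order, which matches the values worked out by hand.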

I believe correct code should look like this:

#if SZREG == 8
	PTR_SUBU	a1, a1, a3	# subtract from remaining count
	PTR_ADDU	a0, a0, a3	# align dst to next word
	sll		a3, a3, 3	# bytes to bits
	li		a2, -1		# make a mask
#if _BYTE_ORDER == _BIG_ENDIAN
	REG_SLLV	a2, a2, a3	# we want to keep the MSB bytes
#endif
#if _BYTE_ORDER == _LITTLE_ENDIAN
	REG_SRLV	a2, a2, a3	# we want to keep the LSB bytes
#endif
	REG_L		v0, -SZREG(a0)	# load the word to partially clear
	and		v0, v0, a2	# clear the bytes
	REG_S		v0, -SZREG(a0)	# store it back
#endif
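
As a sanity check of the proposed version, here is a small C sketch
of the mask it would build, again modelled on 64-bit integers rather
than real registers. The function name and the big_endian flag are
hypothetical; pad is a3 before the shift (1..7).

```c
#include <assert.h>
#include <stdint.h>

/* Mask built by the proposed SZREG == 8 path (hypothetical model).
 * A set bit means "keep this bit of the loaded word". */
uint64_t proposed_mask(unsigned pad, int big_endian)
{
	unsigned bits = pad * 8;	/* a3 converted from bytes to bits */
	uint64_t mask = UINT64_MAX;	/* li a2, -1 */

	if (big_endian)
		mask <<= bits;		/* REG_SLLV: keep the MSB bytes */
	else
		mask >>= bits;		/* REG_SRLV: keep the LSB bytes */
	return mask;			/* no NOR step */
}
```

Take eight bytes 0x11, 0x22, ..., 0x88 at increasing addresses and a
start address that falls on the last of them (pad == 1). Loaded
big-endian the word is 0x11223344_55667788 and the mask
0xFFFFFFFF_FFFFFF00 zeroes only the 0x88 byte; loaded little-endian
the word is 0x88776655_44332211 and the mask 0x00FFFFFF_FFFFFFFF
again zeroes only the 0x88 byte. The bytes before the start address
survive in both cases.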

One thing I don't quite understand is why we bother with all this
manual masking at all. Why not just use REG_SHI for both the SZREG==4
and SZREG==8 cases? If I read the code I quoted above correctly, it
attempts to emulate the SDL/SDR instructions. Using the REG_SHI macro
would pick the correct swl/swr/sdl/sdr variant based on register size
and endianness, and would clear the unaligned bytes.

	PTR_SUBU	a1, a1, a3	# subtract from remaining count
	REG_SHI	zero, 0(a0)	# clear 1..SZREG-1 bytes to align
	PTR_ADDU	a0, a0, a3
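
For reference, the memory effect I expect from that alignment step
can be written as a byte-level C sketch. This is only a model of the
intended result, independent of byte order, and not how the
instruction is implemented; clear_to_alignment and its parameters are
hypothetical names, and szreg is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Zero the bytes from offset off up to (not including) the next
 * szreg-aligned offset, and return how many bytes were zeroed.
 * Returns 0 and touches nothing if off is already aligned. */
size_t clear_to_alignment(unsigned char *mem, size_t off, size_t szreg)
{
	size_t pad = ((size_t)0 - off) & (szreg - 1);	/* same as a3 */

	memset(mem + off, 0, pad);
	return pad;
}
```

For an offset of 7 and szreg of 8 this zeroes exactly one byte and
leaves offsets 0..6 alone, which is the behaviour the masking code is
supposed to have.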

Thanks,
--Artem


