Date:      Sun, 7 Jan 2007 05:00:38 GMT
From:      Julian Seward <jseward@acm.org>
To:        freebsd-bugs@FreeBSD.org
Subject:   Re: bin/106734: [patch] SSE2 optimization for bzip2/libbz2
Message-ID:  <200701070500.l0750cTs018266@freefall.freebsd.org>

The following reply was made to PR bin/106734; it has been noted by GNATS.

From: Julian Seward <jseward@acm.org>
To: Mikhail Teterin <mi@corbulon.video-collage.com>
Cc: bug-followup@freebsd.org
Subject: Re: bin/106734: [patch] SSE2 optimization for bzip2/libbz2
Date: Sun, 7 Jan 2007 05:08:43 +0000

 I believe this analysis is correct:
 
 >         /* Load the bytes: */
 >         n1 = (__m128i)_mm_loadu_pd((double *)(block + i1));
 >         n2 = (__m128i)_mm_loadu_pd((double *)(block + i2));
 > 
 > read beyond the end of the defined area of block.  block is
 > defined for [0 .. nblock + BZ_N_OVERSHOOT - 1], but I think
 > you are doing a SSE load at &block[nblock + BZ_N_OVERSHOOT - 2],
 > hence loading 15 bytes of garbage.
 
 Valgrind doesn't complain about the out-of-range access, because you
 are still accessing inside a valid malloc-allocated block.  But it
 does know that the read data is uninitialised, hence it complains
 when you do a comparison with that data followed by a conditional
 branch (or move) based on the result of the comparison.
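
 The distinction between "addressable" and "initialised" can be reproduced
 outside bzip2. The following is a minimal sketch, not code from the PR;
 the function name byte_is_zero and its parameters are made up for
 illustration. It reads a byte that lies inside a malloc-allocated block
 and branches on it. When idx falls in the part that was never written,
 Memcheck is expected to stay quiet about the load and to report
 "Conditional jump or move depends on uninitialised value(s)" at the branch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical demo helper (not part of bzip2 or Valgrind).
 * Allocates `alloc` bytes, initialises only the first `init` of them,
 * then branches on the byte at `idx`.
 * Returns 1 if that byte is zero, 0 if it is not, -1 if malloc fails.
 * For idx >= init the byte is inside the allocation but was never written,
 * so the result is indeterminate. */
static int byte_is_zero(size_t alloc, size_t init, size_t idx)
{
    unsigned char *buf;
    int result;

    assert(init <= alloc && idx < alloc);

    buf = malloc(alloc);
    if (buf == NULL)
        return -1;

    memset(buf, 0, init);      /* bytes [init, alloc) stay uninitialised */

    if (buf[idx] == 0)         /* the branch is what Memcheck would flag */
        result = 1;
    else
        result = 0;

    free(buf);
    return result;
}
```

 Calling byte_is_zero(32, 20, 5) reads an initialised byte and is clean;
 calling byte_is_zero(32, 20, 25) stays inside the allocation but branches
 on a byte that was never written, which is the same situation as the
 16-byte load near the end of block.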
 
 > This is possible... You think the loop should exit earlier and test
 > the last (up to) 15 bytes one-by-one?
 
 Certainly the loop-end stuff needs to be fixed up somehow to reflect
 the 16-byte loads, but without further investigation I'm not sure how.
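
 One possible shape for such a fix, along the lines suggested in the quoted
 question (leave the vector loop early and test the remaining bytes one by
 one), is sketched below. This is an illustration under assumptions, not
 the patch from the PR and not libbz2 code: the helper name match_length
 and the parameter len (the number of defined bytes in block, i.e.
 nblock + BZ_N_OVERSHOOT in the terms used above) are hypothetical. It
 uses _mm_loadu_si128 for the unaligned loads and falls back to a plain
 byte loop when SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper, not part of libbz2.
 * Returns how many leading bytes of block[i1..] and block[i2..] are equal,
 * without ever reading block[len] or anything beyond it. */
static size_t match_length(const unsigned char *block, size_t len,
                           size_t i1, size_t i2)
{
    size_t hi, avail, k = 0;

    assert(i1 <= len && i2 <= len);

    hi = i1 > i2 ? i1 : i2;
    avail = len - hi;          /* bytes readable from BOTH positions */

#if defined(__SSE2__)
    /* Vector loop: runs only while a full 16-byte load stays in bounds. */
    while (avail - k >= 16) {
        __m128i n1 = _mm_loadu_si128((const __m128i *)(block + i1 + k));
        __m128i n2 = _mm_loadu_si128((const __m128i *)(block + i2 + k));
        /* Bit j of mask is set when byte j of the two loads is equal. */
        unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(n1, n2));

        if (mask != 0xFFFFu) {
            /* Advance k to the lowest-numbered differing byte. */
            unsigned diff = ~mask & 0xFFFFu;
            while ((diff & 1u) == 0) {
                diff >>= 1;
                k++;
            }
            return k;
        }
        k += 16;
    }
#endif

    /* Scalar tail: at most 15 bytes remain when SSE2 is in use. */
    while (k < avail && block[i1 + k] == block[i2 + k])
        k++;

    return k;
}
```

 Whether a scalar tail like this is cheap enough for the sorting inner
 loop, or whether padding the block so that full loads are always safe
 would be preferable, is left open here.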


