Date:      Sun, 7 Jan 2007 05:00:38 GMT
From:      Julian Seward <jseward@acm.org>
To:        freebsd-bugs@FreeBSD.org
Subject:   Re: bin/106734: [patch] SSE2 optimization for bzip2/libbz2
Message-ID:  <200701070500.l0750cTs018266@freefall.freebsd.org>

The following reply was made to PR bin/106734; it has been noted by GNATS.

From: Julian Seward <jseward@acm.org>
To: Mikhail Teterin <mi@corbulon.video-collage.com>
Cc: bug-followup@freebsd.org
Subject: Re: bin/106734: [patch] SSE2 optimization for bzip2/libbz2
Date: Sun, 7 Jan 2007 05:08:43 +0000

 I believe this analysis is correct:
 
 >         /* Load the bytes: */
 >         n1 = (__m128i)_mm_loadu_pd((double *)(block + i1));
 >         n2 = (__m128i)_mm_loadu_pd((double *)(block + i2));
 > 
 > read beyond the end of the defined area of block.  block is
 > defined for [0 .. nblock + BZ_N_OVERSHOOT - 1], but I think
 > you are doing a SSE load at &block[nblock + BZ_N_OVERSHOOT - 2],
 > hence loading 15 bytes of garbage.
 
 Valgrind doesn't complain about the out-of-range access, because you
 are still accessing inside a valid malloc-allocated block.  But it
 does know that the read data is uninitialised, hence it complains
 when you do a comparison with that data followed by a conditional
 branch (or move) based on the result of the comparison.
 
 > This is possible... You think, the loop should exit earlier and test
 > the last (up to) 15 bytes one-by-one?
 
 Certainly the loop termination needs to be fixed up somehow to account
 for the 16-byte loads, but without further investigation I'm not sure how.
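
 One possible shape for such a fix, following the suggestion quoted above
 (stop the vector loop early and test the remaining bytes one-by-one), is
 sketched below. This is only an illustration under stated assumptions, not
 the change made for this PR: first_mismatch() and its limit parameter are
 hypothetical names, limit is assumed to be the number of defined bytes in
 block (nblock + BZ_N_OVERSHOOT in the discussion above), and
 _mm_loadu_si128 is used in place of the casted _mm_loadu_pd from the patch.
 Without SSE2 the sketch falls back to the scalar loop alone.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: returns the offset of the first byte at which
 * block[i1 + k] and block[i2 + k] differ, never reading at or beyond
 * block[limit].  If no difference is found inside the defined area, it
 * returns the number of bytes that could be compared. */
static size_t first_mismatch(const unsigned char *block, size_t limit,
                             size_t i1, size_t i2)
{
    size_t hi = i1 > i2 ? i1 : i2;
    size_t n = hi < limit ? limit - hi : 0; /* bytes defined at both positions */
    size_t k = 0;

#if defined(__SSE2__)
    /* Vector loop: only runs while a full 16-byte load is in range for
     * both positions, so no undefined bytes are ever loaded. */
    while (n - k >= 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(block + i1 + k));
        __m128i b = _mm_loadu_si128((const __m128i *)(block + i2 + k));
        unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(a, b));
        if (mask != 0xFFFFu) {
            /* A clear bit in mask marks a differing byte; find the lowest. */
            unsigned diff = ~mask & 0xFFFFu;
            while ((diff & 1u) == 0) {
                diff >>= 1;
                k++;
            }
            return k;
        }
        k += 16;
    }
#endif

    /* Scalar tail: the last (up to) 15 bytes are tested one-by-one. */
    while (k < n && block[i1 + k] == block[i2 + k])
        k++;
    return k;
}
```

 Because the vector loop never loads a byte at or beyond block[limit], the
 comparison result never depends on uninitialised data, which is the
 condition Valgrind was reporting.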


