Date: Tue, 5 Nov 2013 11:55:20 -0800 From: John-Mark Gurney <jmg@funkthat.com> To: Andriy Gapon <avg@FreeBSD.org> Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, Xin LI <delphij@FreeBSD.org> Subject: Re: svn commit: r256132 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs Message-ID: <20131105195519.GX73243@funkthat.com> In-Reply-To: <5278C496.4000002@FreeBSD.org> References: <201310080138.r981cPcT048197@svn.freebsd.org> <5278C496.4000002@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Andriy Gapon wrote this message on Tue, Nov 05, 2013 at 12:12 +0200: > on 08/10/2013 04:38 Xin LI said the following: > > Author: delphij > > Date: Tue Oct 8 01:38:24 2013 > > New Revision: 256132 > > URL: http://svnweb.freebsd.org/changeset/base/256132 > > > > Log: > > Improve lzjb decompress performance by reorganizing the code > > to tighten the copy loop. > > > > Submitted by: Denis Ahrens <denis h3q com> > > MFC after: 2 weeks > > Approved by: re (gjb) > > > > Modified: > > head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lzjb.c > > > > Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lzjb.c > > ============================================================================== > > --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lzjb.c Mon Oct 7 22:30:03 2013 (r256131) > > +++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lzjb.c Tue Oct 8 01:38:24 2013 (r256132) > > @@ -117,7 +117,9 @@ lzjb_decompress(void *s_start, void *d_s > > src += 2; > > if ((cpy = dst - offset) < (uchar_t *)d_start) > > return (-1); > > - while (--mlen >= 0 && dst < d_end) > > + if (mlen > (d_end - dst)) > > + mlen = d_end - dst; > > + while (--mlen >= 0) > > *dst++ = *cpy++; > > } else { > > *dst++ = *src++; > > > > Intuitively it might seem that this change is indeed an improvement. > But given how "not dumb" (not sure if that always amounts to smart) the modern > compilers are, has anyone actually measured if this change indeed improves the > performance? > > Judging from the conversations on the ZFS mailing lists this change event hurt > performance in some environments. > Looks like the upstream is not going to take this change. > > So does it make any sense for us to make the code more obscure and different > from upstream? Unless performance benefits could indeed be demonstrated. The old code compiles to: 62c13: 49 f7 de neg %r14 62c16: 44 8d 58 ff lea -0x1(%rax),%r11d 62c1a: 4c 89 d3 mov %r10,%rbx 62c1d: 0f 1f 00 nopl (%rax) 62c20: 42 8a 14 33 mov (%rbx,%r14,1),%dl 62c24: 88 13 mov %dl,(%rbx) 62c26: 48 ff c3 inc %rbx 62c29: ff c8 dec %eax 62c2b: 85 c0 test %eax,%eax 62c2d: 7f f1 jg 62c20 <lzjb_decompress+0xa0> The load/store line is addresses 62c20-62c29... So it looks like clang is already doing the optimization that was added... -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not."
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20131105195519.GX73243>