Date: Fri, 10 Feb 2017 18:11:23 -0800 From: Mark Millard <markmi@dsl-only.net> To: Alexandre Martins <alexandre.martins@stormshield.eu> Cc: freebsd-arm <freebsd-arm@freebsd.org>, Ian Lepore <ian@FreeBSD.org> Subject: Re: bcopy/memmove optimization broken ? [looks like you are correct to me, I give supporting detail] Message-ID: <25360EAB-3079-4037-9FB5-B7781ED40FA6@dsl-only.net> In-Reply-To: <5335118.oK1KXXDaG5@pc-alex>
index | next in thread | previous in thread | raw e-mail
On 2017-Feb-10, at 7:51 AM, Alexandre Martins <alexandre.martins at stormshield.eu> wrote: > Hi all > > I see in the kernel code of bcopy/memmove an optimization if the copied region > don't overlap: > > https://svnweb.freebsd.org/base/head/sys/arm/arm/support.S?view=annotate#l403 > > I'm a newbie in ARM assembly, so, sorry if i'm wrong, but > - R0 is the dst (1st parameter) > - R1 is the src (2nd parameter) > > So : > subcc r3, r0, r1 /* if (dst > src) r3 = dst - src */ > subcs r3, r1, r0 /* if (src > dsr) r3 = src - dst */ > cmp r3, r2 /* if (r3 < len) we have an overlap */ > bcc PIC_SYM(_C_LABEL(memcpy), PLT) > > Seems to be inverted. Should it be that ? : > subcs r3, r0, r1 /* if (dst > src) r3 = dst - src */ > subcc r3, r1, r0 /* if (src > dsr) r3 = src - dst */ > cmp r3, r2 /* if (r3 < len) we have an overlap */ > bcs PIC_SYM(_C_LABEL(memcpy), PLT) > > > Best regards > > -- > Alexandre Martins > STORMSHIELD I did not expect something so central that has been around so long to look wrong but. . . Going through the supporting details of the original code based on my looking around here is what I found: #include <string.h> void *memmove(void *dest, const void *src, size_t n); So I'd expect (c'ish notation): r0==dest r1==src r2==n Then for (the comments vs. code is being challenged): (The comments do seem to give the intent correctly.) cmp r0, r1 RETeq /* Bail now if src/dst are the same */ subcc r3, r0, r1 /* if (dst > src) r3 = dst - src */ subcs r3, r1, r0 /* if (src > dsr) r3 = src - dst */ cmp r3, r2 /* if (r3 < len) we have an overlap */ bcc PIC_SYM(_C_LABEL(memcpy), PLT) . . . cmp r0,r1 should result in condition code (c'ish notation): eq=(r0==r1) cc=(r0<r1) cs=(r0>=r1) (Only the r0 position has no immediate-value alternative.) subcc r3,r0,r1 is: if (cc) r3=r0-r1 // no condition code change In other words: if (dst<src) r3=dst-src So it does not match the test listed in the comment as far as I can see. And in (mathematical) integer arithmetic the r3 result would be negative for dst-src. For non-negative arithmetic (natural or whole): undefined. subcs r3,r1,r0 is: if (cs) r3=r1-r0 // no condition code change In other words: if (dst>=src) r3=src-dst So it does not match the test listed in the comment as far as I can see. And in (mathematical) integer arithmetic the r3 result would be nonpositive for src-dst. But the earlier RETeq prevents the zero case, so: negative. For non-negative arithmetic (natural or whole): undefined. If it was only a normal mathemetical context r3=-abs(dst-src) would be a summary of the two sub instruction sequence as it is from what I can tell. For the purpose the summary should be: r3=abs(dst-src), given dst!=src . There is no need to wonder outside normal mathematical non-negative arithmetic in the process either. Your code change would have the right summary and use only normal mathematical rules from what I can tell: cmp r0, r1 RETeq /* Bail now if src/dst are the same */ subcs r3, r0, r1 /* if (dst > src) r3 = dst - src */ subcc r3, r1, r0 /* if (src > dsr) r3 = src - dst */ cmp r3, r2 /* if (r3 < len) we have an overlap */ bcs PIC_SYM(_C_LABEL(memcpy), PLT) . . . subcs r3,r0,r1 is: if (cs) r3=r0-r1 // no condition code change In other words: if (dst>=src) r3=dst-src. Given the prior RETeq, that is effectively: if (dst>src) r3=dst-src. And that matches the comments and would produce a positive result for r3, matching the normal mathematical result. subcc r3,r1,r0 is: if (cc) r3=r1-r0 // no condition code change In other words: if (dst<src) r3=src-dst And that matches the comments and would produce a positive result for r3, matching the normal mathematical result. Overall summary of the two updated sub?? instructions: r3=abs(dst-src), given dst!=src. And that would make for an appropriate comparison to n (i.e., to r2). It appears to have been as it is now since -r143175 when memmove was added to sys/arm/arm/support.S (2005-Apr-12). === Mark Millard markmi at dsl-only.nethelp
Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?25360EAB-3079-4037-9FB5-B7781ED40FA6>
