Date: Fri, 25 Jan 2008 09:49:55 +0100
From: Willem Jan Withagen <wjw@digiware.nl>
To: John Hay <jhay@meraka.org.za>
Cc: des@des.no, freebsd-arm@freebsd.org
Subject: Re: sshd broken on arm?
Message-ID: <4799A2B3.4060003@digiware.nl>
In-Reply-To: <20080125041540.GA30262@zibbi.meraka.csir.co.za>
References: <479880A7.1030107@digiware.nl> <20080124.084828.1608359032.imp@bsdimp.com> <864pd386mj.fsf@ds4.des.no> <20080124.110954.179240992.imp@bsdimp.com> <47991E08.6070609@digiware.nl> <20080125041540.GA30262@zibbi.meraka.csir.co.za>
John Hay wrote:
>>> The problem is that the char array isn't guaranteed to be aligned in
>>> any way. The fix posted is correct.
>>>
>>> There may be other fixes too, such as using a union to force
>>> alignment.
>> Well I'm sort of puzzled right now since after preprocessing the
>> variable allocation part boils down to:
>> =====
>> struct msghdr msg;
>> struct iovec vec;
>> char ch = '\0';
>> ssize_t n;
>>
>> char tmp[((((unsigned)(sizeof(struct cmsghdr)) + (sizeof(int) - 1)) &
>> ~(sizeof(int) - 1)) + (((unsigned)(sizeof(int)) + (sizeof(int
>> ) - 1)) & ~(sizeof(int) - 1)))];
>> struct cmsghdr *cmsg;
>> =====
>> So as far as I can see is char tmp[] included between 2 4-byte items and
>> allocation should be "automagically" 4-byte aligned.
>>
>> Now adding simple code like tmp[0] = 50, the first part of the assembly
>> is: (Comments are mine for as far as I can grasp them)
>
> Just doing tmp[0] = 50 will cause a byte access which should not be a
> problem. The original code does something like this (simplified):
>
> char tmp[CMSG_SPACE(sizeof(int))];
> int *ti;
>
> ti = tmp;
> *ti = 50;
>
> Now the 50 is an int and not a byte and then the alignment does matter.

I know. But the easiest way to find out where the array tmp is allocated
on the stack is to assign a value to its first element. In the assembly
you will then find the starting address that the compiler has calculated
for this array. And as far as I can tell, [fp, #-80] is going to be
4-byte aligned, assuming that the frame pointer is also 4-byte aligned.

Now, the unoptimised version first loads 50 into r3, then moves it into
r2, and only then stores r2 to memory, whereas it would be allowed to
skip the 'mov r2, r3' step and store directly from r3.

You'd have to dump your erroneous code to asm and have a look at what
was generated. The gcc switch for that is -S, and it leaves a .s file.
Also do this for the case that works, and look at the differences.
Which is not going to be easy, because by the time -O2 optimizing gets
going, the code becomes rather obfuscated.

I'm still trying to find an option that puts the source code and/or
line numbers in the asm code. I seem to remember that there used to be
such a beast. But the gcc man page never ends....

--WjW