Date: Sun, 13 Nov 2022 04:50:10 -0500
From: Paul Procacci <pprocacci@gmail.com>
To: Daniel Cervus <DanieltheDeer@outlook.com>
Cc: FreeBSD Questions Mailing List <freebsd-questions@freebsd.org>
Subject: Re: Question about AMD64 ABI
Message-ID: <CAFbbPuiQHDa6GdmD3BZDfgXJ9_pmQ1g2FYsQ=gvxK12veq%2BSfw@mail.gmail.com>
In-Reply-To: <TYWP286MB2667FDC5425C52AD3A492052B8029@TYWP286MB2667.JPNP286.PROD.OUTLOOK.COM>
References: <CAFbbPujfaSZ%2BxGsKPL4J-arydLCr7=YGyrBTt18Cg8q16z3Tdg@mail.gmail.com> <TYWP286MB2667FDC5425C52AD3A492052B8029@TYWP286MB2667.JPNP286.PROD.OUTLOOK.COM>

On Sun, Nov 13, 2022 at 4:07 AM Daniel Cervus <DanieltheDeer@outlook.com>
wrote:

> Hi Paul,
>
> You mean I need to ensure higher bits are zeros (including signed int,
> short, char...) when passing the parameters? Okay, I get it.

That's not what I said.
I said: "32-bit operands generate a 32-bit result, zero-extended to a
64-bit result in the destination general-purpose register."
I also said: "16 and 8 bit operands don't have this "built-in" so you
would indeed need to ensure the higher bits are cleared."

So what does this mean? Let's take the following C function prototype and
its implementation. It's something I pulled out of thin air and it's
useless, but it serves for demonstration purposes:

        ;                                       rdi
        ; long long iAmGoingToDoStuff(long long value)
        iAmGoingToDoStuff:
                inc     rdi
                mov     rax, rdi
                ret

Now let's take two separate asm snippets: one that WILL work as expected
and another that WON'T.

The working example is the easiest. Even though the prototype declares
that it accepts a long long, we only have to explicitly set edi, because a
write to a 32-bit register is automatically zero-extended to 64 bits.

        workingExample:
                xor     edi, edi        ; Pass 0; the 32-bit write zero-extends, so all of rdi is 0
                call    iAmGoingToDoStuff
                ret

In the nonworking example I've tried to set up a scenario in which the
value stored in edi was previously set by an outside force. That's the
first `mov'.
The second `mov' only sets the lower 8 bits of edi (dil), leaving the rest
of the register alone. If the intent was to set the register to 0, then
this is obviously a problem, and you would have to clear the remaining
bits yourself.

        nonWorkingExample:
                mov     edi, 0xCF0A     ; Set edi to an arbitrary value
                mov     dil, 0          ; Clears bits 0-7 only; rdi is now 0xCF00, not 0
                call    iAmGoingToDoStuff
                ret

My advice regarding manipulating 16-bit or smaller registers is: don't,
unless you absolutely have to.
If you know what EFLAGS stalls are and know that the 16-bit computations
you're doing won't stall, then it's not a problem. It's best, however, to
just avoid them altogether. (Exceptions apply, but they are few and far
between.)

> By the way, are there any good resources that I can learn 64-bit FB (or
> *nix) assembly programming from?

This answer might seem cliché, but I swear that's not the intention. The
best reference for instructions, associated operands, rules regarding zero
extension of 32-bit registers, etc., is right in the manual:

https://cdrdv2.intel.com/v1/dl/getContent/671110

The above naturally doesn't "teach" assembly, but it will be your go-to
for just about everything. I've been programming in assembly for well over
2 decades, damn near 3 decades now, and it's STILL the Holy Bible of how a
CPU ticks (no pun intended).

With that said, each assembler has its own syntax when it comes to
translating its version of assembly into machine code.
I certainly advocate for nasm as it's my go-to, but others might find the
syntax of fasm, for example, more to their liking.
It's up to you to find an assembler that's to your liking and read its
manual through and through.

> Regards,
> Daniel

~Paul

> On Nov 13, 2022, at 4:52 PM, Paul Procacci <pprocacci@gmail.com> wrote:
>
> On Sat, Nov 12, 2022 at 10:31 PM Daniel Cervus
> <DanieltheDeer@outlook.com> wrote:
>
>> Hi everyone,
>>
>> I'm trying to do assembly programming on FB in 64-bit mode. I have a
>> question: 64-bit mode requires parameters to be passed in 64-bit
>> registers. But when a parameter is 32-bit or smaller, do I need to
>> sign-extend (or zero-extend) it to 64-bit? The System V ABI
>> specification only says "The size of each argument gets rounded up to
>> eightbytes." It's somewhat ambiguous. How do I round up a 'float' when
>> it is passed on the stack?
>>
>> Thanks,
>> Daniel
>
> (Didn't Reply all)
>
> Hi Daniel,
>
> There are a handful of operations on 32-bit registers that
> automatically clear the high bits for you.
>
> 32-bit operands generate a 32-bit result, zero-extended to a 64-bit
> result in the destination general-purpose register.
> 16 and 8 bit operands don't have this "built-in" so you would indeed
> need to ensure the higher bits are cleared.
>
> mov edi, 1 effectively sets rdi to the value 0x0000000000000001.
>
> As for sign extending the values, the answer is `no', under most
> circumstances. If you are sticking to widths of 32 and 64 bits then you
> are fine.
> The moment you mess with 16 bits or smaller, then yes, you need to
> ensure no garbage lives in your higher bits, because the CPU doesn't
> clear them for you.
>
> "The size of each argument gets rounded up to eight bytes."
>
> The size of ALL arguments passed to the callee via general-purpose
> registers is 8 bytes, "regardless of what a function def says". It's HOW
> the callee operates upon the register arguments that matters.
>
> As for passing arguments on the stack ... you shouldn't have to. Not
> only are there the GPRs rdi, rsi, rdx, rcx, r8 and r9 at your disposal
> for int/scalar types, there are also xmm0-xmm7 for your floats.
>
> Thanks,
> ~Paul
>
> --
> __________________
>
> :(){ :|:& };:

-- 
__________________

:(){ :|:& };: