From: Paul Procacci
Date: Sun, 13 Nov 2022 04:50:10 -0500
Subject: Re: Question about AMD64 ABI
To: Daniel Cervus
Cc: FreeBSD Questions Mailing List
List-Archive: https://lists.freebsd.org/archives/freebsd-questions

On Sun, Nov 13, 2022 at 4:07 AM Daniel Cervus wrote:

> Hi Paul,
>
> You mean I need to ensure higher bits are zeros (including signed int,
> short, char...) when passing the parameters? Okay, I get it.
>

That's not what I said.
I said: "32-bit operands generate a 32-bit result, zero-extended to a 64-bit
result in the destination general-purpose register."
I also said: "16 and 8 bit operands don't have this "built-in" so you would
indeed need to ensure the higher bits are cleared."

So what does this mean? Let's take the following C function prototype and an
implementation of it. I pulled it out of thin air and it's useless, but it's
only here for demonstration purposes:

        ;                                       rdi
        ; long long iAmGoingToDoStuff(long long value)
        iAmGoingToDoStuff:
                inc     rdi
                mov     rax, rdi
                ret

Now let's take two separate asm snippets: one that WILL work as expected and
another that WON'T.

The working example is the easiest. Even though the prototype declares that
it accepts a long long, we only have to explicitly set edi: a write to a
32-bit register is automatically zero-extended to 64 bits.

        workingExample:
                xor     edi, edi                ; Pass 0; all 64 bits of rdi are 0 because 32-bit writes zero-extend
                call    iAmGoingToDoStuff
                ret

With the nonworking example I've tried to provide a scenario in which the
value stored in edi was previously set by an outside force. That's the first
`mov'. The second `mov' only sets the lower 8 bits of the register (dil),
leaving the rest of it alone. If the intent was to set the register to 0,
then this is obviously a problem and you would have to make sure the
remaining bits are cleared as well.

        nonWorkingExample:
                mov     edi, 0xCF0A             ; Set edi to a random value
                mov     dil, 0                  ; Clears only bits 0-7; rdi is now 0xCF00
                call    iAmGoingToDoStuff
                ret

My advice regarding manipulating 16-bit or smaller registers is ... don't,
unless you absolutely have to. If you know what partial-register stalls are
and know that the 16-bit computations you're doing won't trigger them, then
it's not a problem. It's best, however, to just avoid them altogether.
(Exceptions apply but are few and far between.)

> By the way, is there any good resources that I can learn 64-bit FB (or
> *nix) assembly programming from?
>

This answer might seem cliché but I swear that's not the intention.
The best reference for instructions, associated operands, rules regarding
zero extending of 32-bit registers, etc., is right in the manual:

https://cdrdv2.intel.com/v1/dl/getContent/671110

The above naturally doesn't "teach" assembly, but it will be your "go-to"
for just about everything. I've been programming in assembly for well over
2 decades, damn near 3 decades now, and it's STILL the Holy Bible of how a
CPU ticks (no pun intended).

With that said, each assembler has its own syntax when it comes to
translating its version of assembly into machine code. I certainly advocate
for nasm as it's my go-to, but others might find the syntax of fasm, for
example, more to their liking. It's up to you to find an assembler that's to
your liking and read its manual through and through.

> Regards,
> Daniel
>

~Paul

> On Nov 13, 2022, at 4:52 PM, Paul Procacci wrote:
>
> On Sat, Nov 12, 2022 at 10:31 PM Daniel Cervus wrote:
>
>> Hi everyone,
>>
>> I'm trying to do assembly programming on FB in 64-bit mode. I have a
>> question, 64-bit mode requires parameters to be passed on 64-bit registers.
>> But when a parameter is 32-bit or smaller, do I need to sign-extend (or
>> zero-extend) them to 64-bit? The System V ABI specifications only says "The
>> size of each argument gets rounded up to eightbytes." It's somewhat
>> ambiguous. How to round up 'float', when they are passed on stack?
>>
>> Thanks,
>> Daniel
>
> (Didn't Reply all)
>
> Hi Daniel,
>
> There are a handful of operations that operate on 32bit registers that
> automatically clear the high bits for you.
>
> 32-bit operands generate a 32-bit result, zero-extended to a 64-bit result
> in the destination general-purpose register.
> 16 and 8 bit operands don't have this "built-in" so you would indeed need
> to ensure the higher bits are cleared.
>
> mov dword edi, 1  is effectively setting rdi to the value of
> 0x0000000000000001
>
> As for sign extending the values, the answer is `no'.....under most
> circumstances. If you are sticking to widths of 32 and 64 bits then you
> are fine.
> The moment you mess with 16bits or smaller, then yes, you need to ensure
> no garbage lives in your higher bits because the cpu doesn't clear this
> for you.
>
> "The size of each argument gets rounded up to eight bytes."
>
> The size of ALL arguments passed to the callee via general purpose
> registers is 8 bytes "regardless of what a function def says". It's HOW
> the callee operates upon the register arguments that matters.
>
> As for passing arguments on the stack ... you shouldn't have to. Not
> only are there the GPR's rdi, rsi, rdx, rcx, r8, r9, r10 at your disposal
> for int/scalar types there are also xmm0-xmm7 for your floats.
>
> Thanks,
> ~Paul
>
> --
> __________________
>
> :(){ :|:& };:

-- 
__________________

:(){ :|:& };:


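A minimal NASM sketch of callers that do clear the upper bits before calling iAmGoingToDoStuff, assuming the System V AMD64 convention (first integer argument in rdi) and assembly with `nasm -f elf64`. The labels passZero, passUnsignedByte and passSignedWord are made up for illustration, as is the assumption that the small value already sits in dil or di. The `sub rsp, 8` / `add rsp, 8` pair is only there to keep the stack 16-byte aligned at the call, which the ABI expects.

```nasm
        section .text

        ; long long iAmGoingToDoStuff(long long value), as in the message above
iAmGoingToDoStuff:
        inc     rdi
        mov     rax, rdi
        ret

        ; Hypothetical caller: pass the constant 0.
passZero:
        sub     rsp, 8                  ; keep rsp 16-byte aligned at the call
        xor     edi, edi                ; 32-bit write, so bits 32-63 of rdi become 0 too
        call    iAmGoingToDoStuff
        add     rsp, 8
        ret

        ; Hypothetical caller: an unsigned 8-bit value is in dil, the rest of rdi is unknown.
passUnsignedByte:
        sub     rsp, 8
        movzx   edi, dil                ; zero-extend 8 -> 32; the 32-bit write clears bits 32-63
        call    iAmGoingToDoStuff
        add     rsp, 8
        ret

        ; Hypothetical caller: a signed 16-bit value is in di, the rest of rdi is unknown.
passSignedWord:
        sub     rsp, 8
        movsx   rdi, di                 ; sign-extend 16 -> 64 so negative values stay negative
        call    iAmGoingToDoStuff
        add     rsp, 8
        ret
```

Which extension to pick depends on how the callee interprets the argument: movzx for unsigned values, movsx for signed ones.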