Date:      Fri, 28 Jul 2023 18:19:23 -0700
From:      "Pat Maddox" <pat@patmaddox.com>
To:        "Warner Losh" <imp@bsdimp.com>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: How are syscall functions defined?
Message-ID:  <f78938e0-fb58-459d-8d8e-b054fc31ee38@app.fastmail.com>
In-Reply-To:  <CANCZdfrrxn6kh%2BTzkNdkQGhPPE_Zser2QaPDqHjMTO2PaUvf4A@mail.gmail.com>
References:  <5f311275-e307-4e78-a479-c6d4e7f116d5@app.fastmail.com> <32a0f7e7-11b7-443e-a601-40bec7798d8f@app.fastmail.com> <CANCZdfrrxn6kh%2BTzkNdkQGhPPE_Zser2QaPDqHjMTO2PaUvf4A@mail.gmail.com>

Hey Warner,

Thanks for taking the time to walk through that, it's super helpful.
Full disclosure: this is much deeper into src/ than I've ever ventured
before.

The generated jail_attach.S and corresponding RSYSCALL / KERNCALL
definitions make sense to me.

Now I'm wondering about the other side of the boundary - how that
assembly makes its way to the kernel implementation.

Here's what I think happens:

1. The CPU traps into the kernel, which leads to `call amd64_syscall` [1]
2. amd64_syscall [2] calls syscallenter [3]
3. syscallenter calls sv_fetch_syscall_args [4], which is set to cpu_fetch_syscall_args [5]
4. cpu_fetch_syscall_args uses the syscall code (the number placed in %eax) as an index into sysent [6]
5. syscallenter calls the sy_call member of that sysent entry [7]
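
As a rough sketch of steps 3 through 5, here is a small userland C model of the lookup and call. This is not kernel code: every `model_` name is made up for illustration, the handler's behaviour (only jid 1 exists) is invented, and falling back to entry 0 for an unknown code is an assumption about how out-of-range numbers are handled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_NSYS    437   /* hypothetical: just large enough to hold entry 436 */
#define MODEL_MAXARGS 8

/* Hypothetical handler signature; the real kernel handlers take other arguments. */
typedef int model_call_t(const long *args, long *retval);

struct model_sysent {
    int           sy_narg;  /* number of arguments the handler expects */
    model_call_t *sy_call;  /* handler to invoke for this syscall number */
};

struct model_syscall_args {
    unsigned int               code;   /* syscall number supplied by userland */
    const struct model_sysent *callp;  /* table entry chosen by the fetch step */
    long                       args[MODEL_MAXARGS];
};

/* Handler used for numbers with no implementation. */
static int model_nosys(const long *args, long *retval)
{
    (void)args;
    (void)retval;
    return ENOSYS;
}

/* Invented behaviour: pretend that only jail id 1 exists. */
static int model_sys_jail_attach(const long *args, long *retval)
{
    *retval = 0;
    return (args[0] == 1) ? 0 : EINVAL;
}

/* Sparse table: slots not listed here have a NULL sy_call. */
static const struct model_sysent model_table[MODEL_NSYS] = {
    [0]   = { 0, model_nosys },
    [436] = { 1, model_sys_jail_attach },
};

/* Step 4: use the code as an index into the table, then copy the arguments. */
static void model_fetch_syscall_args(struct model_syscall_args *sa,
    unsigned int code, const long *regs)
{
    assert(sa != NULL && regs != NULL);
    sa->code = code;
    if (code >= MODEL_NSYS || model_table[code].sy_call == NULL)
        sa->callp = &model_table[0];   /* assumed fallback for unknown codes */
    else
        sa->callp = &model_table[code];
    for (int i = 0; i < sa->callp->sy_narg; i++)
        sa->args[i] = regs[i];
}

/* Steps 3 and 5: fetch the arguments, then call through the entry's sy_call. */
static int model_syscallenter(unsigned int code, const long *regs, long *retval)
{
    struct model_syscall_args sa;

    model_fetch_syscall_args(&sa, code, regs);
    return sa.callp->sy_call(sa.args, retval);
}
```

Calling `model_syscallenter(436, regs, &rv)` with `regs[0] == 1` goes through the table to `model_sys_jail_attach`; any code without an entry ends up in `model_nosys`.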

So what we get, in shortened form, is:

1. libc produces the assembly `mov $436,%eax; KERNCALL`
2. syscallenter grabs sysent[436] and calls its sy_call member, which in this case is sys_jail_attach [8]
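
The userland half of that summary can be modelled in plain C. The sketch below uses token pasting to stamp out a stub from a name, loosely following the shape of the RSYSCALL macro quoted further down, and mimics the `jb cerror` branch by setting errno and returning -1 when the carry flag is set. All `MODEL_` and `model_` names are hypothetical, and `model_kerncall` is an ordinary function standing in for the real trap, with invented behaviour (only jid 1 exists).

```c
#include <assert.h>
#include <errno.h>

#define MODEL_SYS_jail_attach 436   /* the number used in the summary above */

/* Hypothetical stand-in for the register state after returning from the kernel. */
struct model_trap_result {
    int  carry;  /* nonzero models the carry flag being set on error */
    long value;  /* return value on success, error number on failure */
};

/* Stand-in for KERNCALL: a plain function, not a real trap into the kernel. */
static struct model_trap_result model_kerncall(int code, long arg0)
{
    struct model_trap_result r = { 0, 0 };

    if (code != MODEL_SYS_jail_attach) {
        r.carry = 1;
        r.value = ENOSYS;
    } else if (arg0 != 1) {          /* invented: only jail id 1 exists */
        r.carry = 1;
        r.value = EINVAL;
    }
    return r;
}

/* Stand-in for cerror: store the error in errno and return -1. */
static long model_cerror(long err)
{
    errno = (int)err;
    return -1;
}

/*
 * Token pasting builds both the stub name and the syscall-number constant
 * from a single argument, much as RSYSCALL(name) does in assembly.
 * The carry check models `jb cerror`; the final return models `ret`.
 */
#define MODEL_RSYSCALL(name)                                            \
    static long model_stub_##name(long arg0)                            \
    {                                                                   \
        struct model_trap_result r =                                    \
            model_kerncall(MODEL_SYS_##name, arg0);                     \
        if (r.carry)                                                    \
            return model_cerror(r.value);                               \
        return r.value;                                                 \
    }

/* Expands to a definition of model_stub_jail_attach(). */
MODEL_RSYSCALL(jail_attach)
```

With this in place, `model_stub_jail_attach(1)` returns 0, while any other jid returns -1 with errno set to EINVAL.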

Whew.

Is that right?

Pat

---

[1] https://cgit.freebsd.org/src/tree/sys/amd64/amd64/exception.S#n580
[2] https://cgit.freebsd.org/src/tree/sys/amd64/amd64/trap.c#n1187
[3] https://cgit.freebsd.org/src/tree/sys/kern/subr_syscall.c#n58
[4] https://cgit.freebsd.org/src/tree/sys/kern/subr_syscall.c#n82
[5] https://cgit.freebsd.org/src/tree/sys/amd64/amd64/elf_machdep.c#n87
[6] https://cgit.freebsd.org/src/tree/sys/amd64/amd64/trap.c#n1080
[7] https://cgit.freebsd.org/src/tree/sys/kern/subr_syscall.c#n162
[8] https://cgit.freebsd.org/src/tree/sys/kern/kern_jail.c#n2599

On Sat, Jul 1, 2023, at 6:26 AM, Warner Losh wrote:
> OK. System calls are a pain. there's a lot of boilerplate needed to make
> them all work.
>
> So, it's been automated. The process starts after you add a system call to
> syscalls.master.
> 'make sysent' is run which creates a number of different files. It creates
> the kernel glue.
> These glue files are then committed to the tree. On the kernel side we have
> sys/kern/init_sysent.c which has the 'sysent' array which is used to
> dispatch the system calls. sys/kern/syscalls.c has the names, and
> sys/kern/systrace_args has information for dtrace decoding them.
>
> In userland, though, the system calls live in libc. But there's no source
> file for them.
> Instead, libc's sys/Makefile.inc includes sys/sys/syscall.mk, which is also
> generated above, which has a list of all the system call files to create.
> Dependency rules in sys/Makefile.inc cause those .o's to be created with
> this rule:
> ${SASM}:
>         printf '/* %sgenerated by libc/sys/Makefile.inc */\n' @ > ${.TARGET}
>         printf '#include "compat.h"\n' >> ${.TARGET}
>         printf '#include "SYS.h"\nRSYSCALL(${.PREFIX})\n' >> ${.TARGET}
>         printf  ${NOTE_GNU_STACK} >>${.TARGET}
>
> which is where the source winds up: in the object tree as jail_attach.S
> likely with the contents (generated by hand):
>
> /* jail_attach.S generated by libc/sys/Makefile.inc */
> #include "compat.h"
> #include "SYS.h"
> RSYSCALL(jail_attach)
> .section .note.GNU-stack,"",%progbits
>
> The different __sys_jail_attach wrapping for the threading
> libraries also is part of the RSYSCALL macro, for example amd64:
> #define RSYSCALL(name)  ENTRY(__sys_##name);                            \
>                         WEAK_REFERENCE(__sys_##name, name);             \
>                         WEAK_REFERENCE(__sys_##name, _##name);          \
>                         mov $SYS_##name,%eax; KERNCALL;                 \
>                         jb HIDENAME(cerror); ret;                       \
>                         END(__sys_##name)
>
> The System.map file, etc, all know that this is generated, and is used to
> put the symbols in the proper version area. Symbol versions are beyond
> the scope of this post.
>
> Warner
>
> On Sat, Jul 1, 2023 at 5:23 AM Pat Maddox <pat@patmaddox.com> wrote:
>
>> On Sat, Jul 1, 2023, at 3:11 AM, Pat Maddox wrote:
>> > jail_attach is defined in syscalls.master [1] which generates a
>> > declaration in jail.h [2]. Try as I might, I can’t find any definition
>> > of that specific syscall function (or any other).  I think the closest
>> > I’ve found is sys_jail_attach in kern_jail.c [3]. I suspect there’s
>> > some generation going on that defines jail_attach - but if that’s the
>> > case, I haven’t been able to track it down.
>> >
>> > Can someone point me to how the C function gets defined?
>> >
>> > Thanks,
>> > Pat
>> >
>> > [1] https://github.com/freebsd/freebsd-src/blob/releng/13.2/sys/kern/syscalls.master#L2307
>> > [2] https://github.com/freebsd/freebsd-src/blob/releng/13.2/sys/sys/jail.h#L119
>> > [3] https://github.com/freebsd/freebsd-src/blob/releng/13.2/sys/kern/kern_jail.c#L2340
>>
>> Symbol.map [1] is used to produce a version map [2] which is then fed to
>> the linker [3], which I assume maps the symbols in the resulting binary. I
>> intend to experiment with that a bit, but I think that makes sense.
>>
>> Pat
>>
>> [1] https://github.com/freebsd/freebsd-src/blob/releng/13.2/lib/libc/sys/Symbol.map#L672
>> [2] https://github.com/freebsd/freebsd-src/blob/releng/13.2/share/mk/bsd.symver.mk#L43
>> [3] https://github.com/freebsd/freebsd-src/blob/releng/13.2/share/mk/bsd.lib.mk#L253
>>
>>


