Date:      Sun, 14 Oct 2012 20:19:41 +0400
From:      Daniil Cherednik <dcherednik@roshianokatachi.com>
To:        freebsd-hackers@freebsd.org
Cc:        Konstantin Belousov <kostikbel@gmail.com>, davidxu@freebsd.org
Subject:   Re: Fast syscalls via sysenter
Message-ID:  <507AE61D.7030709@roshianokatachi.com>
In-Reply-To: <20120623165823.GX2337@deviant.kiev.zoral.com.ua>
References:  <201206182256.30535.dcherednik@roshianokatachi.com> <201206210811.20427.jhb@freebsd.org> <4FE55F91.5070303@gmail.com> <20120623165823.GX2337@deviant.kiev.zoral.com.ua>

On 06/23/2012 08:58 PM, Konstantin Belousov wrote:
> On Sat, Jun 23, 2012 at 02:17:53PM +0800, David Xu wrote:
>> On 2012/06/21 20:11, John Baldwin wrote:
>>> On Monday, June 18, 2012 2:56:30 pm Daniil Cherednik wrote:
>>>> Hi!
>>>>
>>>> I am trying to continue the work started by DavidXu on implementation of
>>>> fast syscalls via sysenter/sysexit.
>>>> http://people.freebsd.org/~davidxu/sysenter/kernel/
>>>> I have ported it on FreeBSD9. It looks like it works. Unfortunately I am a
>>>> beginner in kernel so I have some questions:
>>>>
>>>> 1. see http://people.freebsd.org/~davidxu/sysenter/kernel/kernel.patch
>>>> /*
>>>> * If %edx was changed, we can not use sysexit, because it
>>>> * needs %edx to restore userland %eip.
>>>> */
>>>> if (orig_edx != frame.tf_edx)
>>>> 	td->td_pcb->pcb_flags |= PCB_FULLCTX;
>>>>
>>>> What is the reason why we have to do this additional check? In
>>>> http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s
>>>> we store %edx to the stack in
>>>> pushl %edx		/* ring 3 next %eip */
>>>> and we restore the register in
>>>> popl	%edx		/* ring 3 %eip */
>>> Some system calls return two return values (pipe(2)) or return a 64-bit
>>> off_t (lseek(2)).  Those system calls change %edx's value and need that
>>> changed value to make it out to userland.
>>>
>>>> 2. see http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s
>>>> movl	PCPU(CURPCB),%esi
>>>> call	syscall
>>>>
>>>> Why do we movl PCPU(CURPCB),%esi before calling syscall? syscall is just
>>>> a C function.
>>> No clue on this one, looks like it is not needed.
>>>
>> [kib@ is cc'ed]
>> I implemented the sysenter syscall long time ago, it indeed can reduce
>> system call overhead on i386. I think it might be the time to implement
>> linux like vdso syscall now based on the work kib@ recently has done,
>> though I don't know how to hook it into kib's code.
>> I quick googled it, and found they put some data into aux vector:
>> http://www.trilithium.com/johan/2005/08/linux-gate/
>> http://www.takatan.net/lxr/source/arch/um/os-Linux/elf_aux.c?a=x86_64#L40
> Yes, intent is to eventually switch to VDSO from current situation where
> libc is aware of shared page content. This was extensively discussed in
> flame that resulted in me writing the current gettimeofday(2) patch.
> It was arch@ several weeks ago, AFAIR.
>
> Committed gettimeofday() code structure allows for VDSO interposing without
> breaking normal symbol visibility rules.
>
> I do not see a sense in implementing syscall or sysenter support for
> i386 kernel. On the other hand, using syscall for 32bit binaries on amd64
> looks reasonable.
Sorry, I was not able to write for some time.
So, what about implementing vdso now? I know there was a patch and a
feature request:
http://lists.freebsd.org/pipermail/freebsd-bugs/2010-April/039597.html

About sysenter: I have ported the sysenter patch to 9.0-RELEASE-p4, and
it looks fine. I made some fixes in SYS.h. The reason (if I understand
it right) is that ld-elf.so has to be built as an ELF object without
DT_TEXTREL.
You can find the patch here:
https://redmine.sportcomitet.org/projects/dev-freebsd/repository/revisions/master/raw/sysenter.patch
https://redmine.sportcomitet.org/projects/dev-freebsd/repository/revisions/master/raw/sys/i386/i386/sysenter.s

However, this patch currently breaks compatibility with the i386 Xen PV
kernel. I wanted to fix it, but without VDSO it would be a limited
solution. That is one of the reasons why I am interested in the vdso
status.
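
For comparison, the Linux approach David mentioned (data passed in the
aux vector) can be probed from userland. This is only a rough sketch
and it is Linux-specific: getauxval() and AT_SYSINFO_EHDR come from
glibc and the Linux ELF headers, and nothing here is a FreeBSD
interface. The helper names vdso_base() and vdso_looks_like_elf() are
made up for illustration.

```c
#include <elf.h>        /* AT_SYSINFO_EHDR */
#include <string.h>     /* memcmp() */
#include <sys/auxv.h>   /* getauxval(), Linux with glibc >= 2.16 */

/*
 * Hypothetical helper: return the address of the vDSO ELF header that
 * the kernel published in the aux vector, or 0 if there is none.
 */
static unsigned long vdso_base(void)
{
    return getauxval(AT_SYSINFO_EHDR);
}

/*
 * Hypothetical helper: return 1 if the mapping at 'base' starts with
 * the ELF magic bytes, 0 otherwise (including when base is 0).
 */
static int vdso_looks_like_elf(unsigned long base)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (base == 0)
        return 0;
    return memcmp((const void *)base, magic, sizeof(magic)) == 0;
}
```

A libc or rtld built this way looks up the syscall entry point in that
ELF image instead of hardcoding int $0x80 or sysenter, which is why the
same userland could run on kernels with different entry methods.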

As for using syscall for 32-bit binaries on amd64: it is reasonable.
But if we do that, I think we have to implement vdso support in the
i386 kernel too, for compatibility, and then it is better to implement
sysenter there as well.
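
P.S. To check that I understood John's answer to question 1, here is a
small userland model of the %edx problem. It is not kernel code: struct
ret_regs and the three functions are hypothetical names, and the only
facts assumed are that a 64-bit result comes back in %edx:%eax and that
sysexit takes the user %eip from %edx.

```c
#include <stdint.h>

/* Hypothetical model of the i386 return-register pair. */
struct ret_regs {
    uint32_t eax;   /* low 32 bits, or first return value */
    uint32_t edx;   /* high 32 bits, or second return value */
};

/* Split a 64-bit result (e.g. an off_t from lseek) across %edx:%eax. */
static struct ret_regs split_ret64(int64_t value)
{
    struct ret_regs r;

    r.eax = (uint32_t)((uint64_t)value & 0xffffffffu);
    r.edx = (uint32_t)((uint64_t)value >> 32);
    return r;
}

/* Rebuild the 64-bit value the way userland sees it after the call. */
static int64_t join_ret64(struct ret_regs r)
{
    return (int64_t)(((uint64_t)r.edx << 32) | r.eax);
}

/*
 * Mirrors the check in the patch: nonzero if the syscall changed %edx,
 * in which case %edx cannot also carry the user %eip for sysexit.
 */
static int needs_full_context(uint32_t orig_edx, struct ret_regs after)
{
    return orig_edx != after.edx;
}
```

So when a syscall writes a result into %edx, the sysexit path would
clobber it with the saved %eip, and setting PCB_FULLCTX selects the
full-context return path instead.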
