Date: Sun, 14 Oct 2012 20:19:41 +0400
From: Daniil Cherednik <dcherednik@roshianokatachi.com>
To: freebsd-hackers@freebsd.org
Cc: Konstantin Belousov <kostikbel@gmail.com>, davidxu@freebsd.org
Subject: Re: Fast syscalls via sysenter
Message-ID: <507AE61D.7030709@roshianokatachi.com>
In-Reply-To: <20120623165823.GX2337@deviant.kiev.zoral.com.ua>
References: <201206182256.30535.dcherednik@roshianokatachi.com> <201206210811.20427.jhb@freebsd.org> <4FE55F91.5070303@gmail.com> <20120623165823.GX2337@deviant.kiev.zoral.com.ua>
On 06/23/2012 08:58 PM, Konstantin Belousov wrote:
> On Sat, Jun 23, 2012 at 02:17:53PM +0800, David Xu wrote:
>> On 2012/06/21 20:11, John Baldwin wrote:
>>> On Monday, June 18, 2012 2:56:30 pm Daniil Cherednik wrote:
>>>> Hi!
>>>>
>>>> I am trying to continue the work started by DavidXu on implemention of
>>>> fast syscalls via sysenter/sysexit.
>>>> http://people.freebsd.org/~davidxu/sysenter/kernel/
>>>> I have ported it on FreeBSD9. It looks like it works. Unfortunately I am a
>>>> beginner in kernel so I have some questions:
>>>>
>>>> 1. see http://people.freebsd.org/~davidxu/sysenter/kernel/kernel.patch
>>>> /*
>>>>  * If %edx was changed, we can not use sysexit, because it
>>>>  * needs %edx to restore userland %eip.
>>>>  */
>>>> if (orig_edx != frame.tf_edx)
>>>>         td->td_pcb->pcb_flags |= PCB_FULLCTX;
>>>>
>>>> What is the reason why we have to do this additional check? In
>>>> http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s
>>>> we store %edx to the stack in
>>>>         pushl %edx /* ring 3 next %eip */
>>>> and we restore the register in
>>>>         popl %edx /* ring 3 %eip */
>>> Some system calls return two return values (pipe(2)) or return a 64-bit
>>> off_t (lseek(2)). Those system calls change %edx's value and need that
>>> changed value to make it out to userland.
>>>
>>>> 2. see http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s
>>>>         movl PCPU(CURPCB),%esi
>>>>         call syscall
>>>>
>>>> Why do we movl PCPU(CURPCB),%esi before calling syscall? syscall is just
>>>> c-function.
>>> No clue on this one, looks like it is not needed.
>>>
>> [kib@ is cc'ed]
>> I implemented the sysenter syscall long time ago, it indeed can reduce
>> system call overhead on i386. I think it might be the time to implement
>> linux like vdso syscall now based on the work kib@ recently has done,
>> though I don't know how to hook it into kib's code.
>> I quick googled it, and found they put some data into aux vector:
>> http://www.trilithium.com/johan/2005/08/linux-gate/
>> http://www.takatan.net/lxr/source/arch/um/os-Linux/elf_aux.c?a=x86_64#L40
> Yes, intent is to eventually switch to VDSO from current situation were
> libc is aware of shared page content. This was extensively discussed in
> flame that resulted in me writing the current gettimeofday(2) patch.
> It was arch@ several weeks ago, AFAIR.
>
> Committed gettimeofday() code structure allows for VDSO interposing without
> breaking normal symbol visibility rules.
>
> I do not see a sense in implementing syscall or sysenter support for
> i386 kernel. On the other hand, using syscall for 32bit binaries on amd64
> looks reasonable.

I was not able to write for some time, sorry.

So, what about implementing VDSO now? I know there was a patch and a
feature request:
http://lists.freebsd.org/pipermail/freebsd-bugs/2010-April/039597.html

About sysenter: I have ported the sysenter patch to 9.0-RELEASE-p4, and it
looks fine. I made some fixes in SYS.h. The reason (if I understand it
correctly) is that ld-elf.so has to be built as an ELF object without
DT_TEXTREL. You can find the patch here:
https://redmine.sportcomitet.org/projects/dev-freebsd/repository/revisions/master/raw/sysenter.patch
https://redmine.sportcomitet.org/projects/dev-freebsd/repository/revisions/master/raw/sys/i386/i386/sysenter.s

However, this patch currently breaks compatibility with the i386 Xen PV
kernel. I wanted to fix that, but without VDSO it would be a limited
solution. That is one of the reasons I am interested in the status of VDSO.

As for using syscall for 32-bit binaries on amd64: it is reasonable. But if
we do that, I think we have to implement VDSO support in the i386 kernel
too, for compatibility, and it would be better to implement sysenter there
as well.
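
For reference, the %edx check discussed in the quoted thread can be modelled
in plain C. This is only a sketch: struct ret_regs, split_ret64(),
join_ret64() and needs_full_context() are hypothetical names made up for
illustration and are not taken from kernel.patch. It assumes the usual i386
convention that a 64-bit result comes back in %edx:%eax, and that sysexit
takes the userland %eip from %edx, so a system call that writes %edx cannot
return through sysexit.

```c
#include <stdint.h>

/* Hypothetical model of the i386 return registers after a system call. */
struct ret_regs {
    uint32_t eax;   /* low 32 bits of the result */
    uint32_t edx;   /* high 32 bits, or the second return value */
};

/* Split a 64-bit result the way i386 returns it in %edx:%eax. */
static struct ret_regs split_ret64(int64_t value)
{
    struct ret_regs r;

    r.eax = (uint32_t)((uint64_t)value & 0xffffffffu);
    r.edx = (uint32_t)((uint64_t)value >> 32);
    return r;
}

/* Reassemble the value userland sees from the register pair. */
static int64_t join_ret64(struct ret_regs r)
{
    return (int64_t)(((uint64_t)r.edx << 32) | r.eax);
}

/*
 * Model of the check from kernel.patch: returns 1 when the system call
 * changed %edx, i.e. when the slower full-context return path is needed
 * because %edx can no longer carry the userland %eip for sysexit.
 */
static int needs_full_context(uint32_t orig_edx, uint32_t new_edx)
{
    return orig_edx != new_edx;
}
```

With an lseek(2)-style offset above 4 GB, split_ret64() leaves a non-zero
high half in edx, so needs_full_context() reports that the value in %edx
must reach userland intact and the sysexit shortcut has to be skipped.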