From: David Xu
Date: Sat, 23 Jun 2012 14:17:53 +0800
To: John Baldwin
Cc: freebsd-hackers@freebsd.org, Daniil Cherednik, kib@freebsd.org
Subject: Re: Fast syscalls via sysenter
Message-ID: <4FE55F91.5070303@gmail.com>
In-Reply-To: <201206210811.20427.jhb@freebsd.org>
References: <201206182256.30535.dcherednik@roshianokatachi.com> <201206210811.20427.jhb@freebsd.org>
Reply-To: davidxu@freebsd.org

On 2012/06/21 20:11, John Baldwin wrote:
> On Monday, June 18, 2012 2:56:30 pm Daniil Cherednik wrote:
>> Hi!
>>
>> I am trying to continue the work started by David Xu on the implementation
>> of fast syscalls via sysenter/sysexit.
>> http://people.freebsd.org/~davidxu/sysenter/kernel/
>> I have ported it to FreeBSD 9. It looks like it works. Unfortunately I am a
>> beginner in the kernel, so I have some questions:
>>
>> 1. see http://people.freebsd.org/~davidxu/sysenter/kernel/kernel.patch
>> /*
>>  * If %edx was changed, we can not use sysexit, because it
>>  * needs %edx to restore userland %eip.
>>  */
>> if (orig_edx != frame.tf_edx)
>>         td->td_pcb->pcb_flags |= PCB_FULLCTX;
>>
>> What is the reason why we have to do this additional check? In
>> http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s
>> we store %edx to the stack in
>>         pushl %edx /* ring 3 next %eip */
>> and we restore the register in
>>         popl %edx /* ring 3 %eip */
> Some system calls return two return values (pipe(2)) or return a 64-bit
> off_t (lseek(2)). Those system calls change %edx's value and need that
> changed value to make it out to userland.
>
>> 2. see http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s
>>         movl PCPU(CURPCB),%esi
>>         call syscall
>>
>> Why do we movl PCPU(CURPCB),%esi before calling syscall? syscall is just a
>> C function.
> No clue on this one, looks like it is not needed.
>

[kib@ is cc'ed]

I implemented the sysenter syscall a long time ago; it can indeed reduce
system call overhead on i386. I think it might be time to implement a
Linux-like vDSO syscall now, based on the work kib@ has recently done,
though I don't know how to hook it into kib's code. I quickly googled it
and found that they put some data into the aux vector:

http://www.trilithium.com/johan/2005/08/linux-gate/
http://www.takatan.net/lxr/source/arch/um/os-Linux/elf_aux.c?a=x86_64#L40

Regards,
David Xu
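
P.S. To make the %edx point in question 1 concrete, here is a minimal user-space sketch in C. It is not the code from the patch. The names mini_frame, set_retval64, set_retval_pair and needs_full_context are hypothetical and exist only for illustration. It assumes the usual i386 convention in which a 64-bit result comes back in %edx:%eax and a second return value comes back in %edx.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, cut-down stand-in for the i386 trapframe. Only the two
 * registers that carry syscall return values are modelled here.
 */
struct mini_frame {
	uint32_t tf_eax;	/* low 32 bits, or first return value */
	uint32_t tf_edx;	/* high 32 bits, or second return value */
};

/* Store a 64-bit result (such as the off_t from lseek(2)) in %edx:%eax. */
static void
set_retval64(struct mini_frame *f, uint64_t v)
{
	assert(f != NULL);
	f->tf_eax = (uint32_t)(v & 0xffffffffu);
	f->tf_edx = (uint32_t)(v >> 32);
}

/* Store two 32-bit results (such as the two descriptors from pipe(2)). */
static void
set_retval_pair(struct mini_frame *f, uint32_t first, uint32_t second)
{
	assert(f != NULL);
	f->tf_eax = first;
	f->tf_edx = second;
}

/*
 * Same comparison as the check quoted from kernel.patch: if the handler
 * changed %edx, the sysexit path cannot be used, because sysexit consumes
 * %edx as the userland %eip and the new value would be lost.
 */
static bool
needs_full_context(uint32_t orig_edx, const struct mini_frame *f)
{
	assert(f != NULL);
	return (orig_edx != f->tf_edx);
}
```

A caller would save tf_edx before dispatching the system call and call needs_full_context() afterwards. A true result corresponds to the case where the patch sets PCB_FULLCTX.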