Date: Mon, 17 Mar 2008 14:43:25 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Julian Elischer <julian@elischer.org>
Cc: Alexander Sack <pisymbol@gmail.com>, jgordeev@dir.bg, "Andrey V. Elsukov" <bu7cher@yandex.ru>, Robert Watson <rwatson@freebsd.org>, freebsd-hackers@freebsd.org
Subject: Re: vkernel & GSoC, some questions
Message-ID: <200803172143.m2HLhP03021235@apollo.backplane.com>
References: <20080316122108.S44049@fledge.watson.org> <E1JatyK-000FfY-00.shmukler-mail-ru@f8.mail.ru> <200803162313.m2GNDbvl009550@apollo.backplane.com> <3c0b01820803171243k5eb6abd3y1e1c44694c6be0f6@mail.gmail.com> <200803172016.m2HKGfjA020263@apollo.backplane.com> <47DEDDF9.7010200@elischer.org>

:> In all three cases the emulated hardware -- disk and network basically,
:> devolves down into calling read() or write() or the real-kernel
:> equivalent. A hypervisor has the most work to do since it is trying to
:> emulate a hardware interface (adding another layer). XEN has less work
:> to do as it is really not trying to emulate hardware. A vkernel has
:> even less work to do because it is running as a userland program and can
:> simply make the appropriate system call to implement the back-end.
:
:And jails and similar have the absolute minimum..
:at the cost of making a single accessible point of failure
:(the one kernel).

Yes, absolutely. Jails have the greatest performance, though the characterization of a single point of failure is a bit misleading. The problem with a jail is that all programs running under it are directly accessing the real kernel and are able to exercise *ALL* code paths into that kernel, even many root code paths, and thus expose all the bugs in that kernel.

A vkernel or hypervisor uses only a subset of the real kernel's functionality, resulting in much lower exposure to potential kernel bugs. While a vkernel, or a kernel running under a hypervisor, is itself fully exposed, its failure does not cause the whole machine to fail, and a recovery 'reboot' can be as short as 5 seconds. The cost is performance.

Even if you were to instrument the kernel code with full resource control (jailed memory use, I/O, descriptors, real-kernel memory use, etc.), it still wouldn't solve the bug exposure issue.

In any case, there are only two performance bottlenecks that really matter for a vkernel or hypervisor: (1) system calls from virtualized processes to their virtualized kernels, and (2) MMU invalid page faults. The I/O path is a distant third, really requiring only a co-thread or two for write()s to be made efficient.

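To illustrate why the I/O path is the cheap part, here is a minimal sketch of a virtual-disk back-end that devolves into pread() and pwrite() on an image file. It is not the actual vkernel code: the names vdisk, vdisk_open, vdisk_rw, vdisk_close and the 512-byte sector size are assumptions made up for this example, and it runs synchronously rather than handing writes to a co-thread.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define VDISK_SECTOR_SIZE 512   /* assumed sector size for this sketch */

/* Hypothetical back-end state: just a descriptor on the image file. */
struct vdisk {
    int fd;
};

/* Open (or create) the file backing the virtual disk. Returns 0 or -1. */
static int vdisk_open(struct vdisk *vd, const char *path)
{
    assert(vd != NULL && path != NULL);
    vd->fd = open(path, O_RDWR | O_CREAT, 0600);
    return vd->fd < 0 ? -1 : 0;
}

/*
 * Transfer `count` sectors starting at sector `lba`.
 * The emulated "hardware" request is nothing more than a system call
 * on the real kernel; the loop only handles short transfers and EINTR.
 * Returns 0 on success, -1 on error or on a read past the end of the image.
 */
static int vdisk_rw(struct vdisk *vd, uint64_t lba, void *buf,
                    size_t count, int is_write)
{
    char *p = buf;
    size_t left = count * VDISK_SECTOR_SIZE;
    off_t off = (off_t)lba * VDISK_SECTOR_SIZE;

    assert(vd != NULL && buf != NULL);
    while (left > 0) {
        ssize_t n = is_write ? pwrite(vd->fd, p, left, off)
                             : pread(vd->fd, p, left, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry the same transfer */
            return -1;
        }
        if (n == 0)
            return -1;          /* EOF: request extends past the image */
        p += n;
        left -= (size_t)n;
        off += n;
    }
    return 0;
}

/* Release the descriptor on the image file. */
static void vdisk_close(struct vdisk *vd)
{
    assert(vd != NULL);
    close(vd->fd);
    vd->fd = -1;
}
```

A caller would open the image once with vdisk_open() and then service each virtual disk request with a single vdisk_rw() call. Making writes efficient, as suggested above, would mean queueing the write requests to a helper thread instead of blocking in pwrite().
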
(1) and (2) are not easy problems to solve, mainly due to the need for the real kernel to have exclusive access to the context when doing an iret (a R/W shared mapping of the top of the kernel stack is a security hole). I do think a read-only mapping might be doable, particularly for the standard syscall path which only modifies EAX and EDX in the critical path. That would cut the overhead in half.

-Matt
Matthew Dillon
<dillon@backplane.com>