Date: Sun, 7 Dec 2003 17:41:06 -0500 (EST)
From: Robert Watson <rwatson@freebsd.org>
To: Anand Subramanian <anand@pythagoras.math.uwaterloo.ca>
Cc: anand@cs.uwaterloo.ca
Subject: Re: Sharing data between user space and kernel
Message-ID: <Pine.NEB.3.96L.1031207173708.29768A-100000@fledge.watson.org>
In-Reply-To: <20031207144117.A6912@pythagoras.math>

On Sun, 7 Dec 2003, Anand Subramanian wrote:

> A look at the copyin() code in the kernel reveals that all the kernel
> needs to do to access the data(address space) of a user process is

Note that the copyin/copyout implementation is machine-dependent (MD),
so while this is true on i386, it may not be true on other systems. The
fuword/suword/copyin/copyout/uio code is intentionally designed to avoid
the assumption that userspace pointers are directly dereferenceable by
kernel code. One important example of a situation where this difference
has to be maintained is implementing 32-bit emulation on 64-bit
platforms. On amd64, you can't just dereference a 32-bit pointer when
the kernel is running in 64-bit mode.

> 1. Get the current thread, which I saw is done using the PCPU_GET macro.
> So I suppose this is always preserved upon a system call.
>
> 2. Set the segment register for the user process correctly.
>
> And magically, all the user process's data can now be accessed by the
> kernel directly.
>
> Is that correct? In the event of which, it would become really easy for
> a user process to allocate a chunk of memory and all a kernel module
> needs to do to "implement shared memory" is do the steps 1 & 2 and
> access the data.
>
> Of course there is the question that the user process is "swapped" out
> after the system call and some other thread starts running in between in
> which case curthread should point to some other thread and not the one
> that issued the system call. But then, isn't this what happens upon
> every system call normally, when the kernel does the steps 1 & 2 to
> obtain the data arguments which are passed to the system call. So this
> is hardly a problem. So, can shared memory be implemented this way
> instead of the more traditional "pseudo-device" way?
>
> Appreciate any comments on this(please do a CC to my email address, in
> case you choose to respond).

An additional issue is that user pages are pageable to disk, so they may
not be in memory. If you're holding any mutexes/etc in the kernel when
you touch one of those pages, the page fault has to be processed, and
you risk (a) holding the locks for a long time, and (b) lock order
problems. This is one reason why copyin()/copyout() have to be used very
carefully, and this would apply also to any code replicating that
functionality. If you take a look at the sysctl() code, you'll see that
it wires userspace pages into memory to avoid the risk of sleeping.

What you probably want to do is actually allocate wired kernel pages and
export them to userspace. Take a look at the GEOM gstat(8)
implementation, which does exactly that. However, you have to make sure
that if you ever decide to reuse that kernel memory for something else
(i.e., free it back to the allocator), you've GC'd all userspace
references to it.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Senior Research Scientist, McAfee Research
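
The point above about copyin()/copyout() being an abstraction, rather
than a plain pointer dereference, can be illustrated with a small
userspace model. This is only a sketch and not kernel code: struct
toy_uspace, toy_copyin() and toy_copyout() are hypothetical names
invented for the illustration. The model assumes a "user" address is a
32-bit integer that the accessor must validate and translate. It borrows
only the return convention of the real copyin(9)/copyout(9) (0 on
success, EFAULT on a bad address) and does not model page faults,
sleeping, or locking.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 32-bit user address space hosted in our memory. */
struct toy_uspace {
    unsigned char *backing; /* where the "user" bytes really live */
    uint32_t base;          /* lowest valid 32-bit user address */
    uint32_t size;          /* number of valid bytes starting at base */
};

/* Translate a user range to a host pointer, or NULL if it is not mapped. */
static unsigned char *
toy_translate(const struct toy_uspace *us, uint32_t uaddr, size_t len)
{
    uint32_t off;

    if (uaddr < us->base)
        return (NULL);
    off = uaddr - us->base;
    if (off > us->size || len > (size_t)(us->size - off))
        return (NULL);
    return (us->backing + off);
}

/* Copy len bytes from user address uaddr into kernel buffer kaddr. */
int
toy_copyin(const struct toy_uspace *us, uint32_t uaddr, void *kaddr,
    size_t len)
{
    unsigned char *p = toy_translate(us, uaddr, len);

    if (p == NULL)
        return (EFAULT);
    memcpy(kaddr, p, len);
    return (0);
}

/* Copy len bytes from kernel buffer kaddr out to user address uaddr. */
int
toy_copyout(const struct toy_uspace *us, const void *kaddr, uint32_t uaddr,
    size_t len)
{
    unsigned char *p = toy_translate(us, uaddr, len);

    if (p == NULL)
        return (EFAULT);
    memcpy(p, kaddr, len);
    return (0);
}
```

In this model the "kernel" side never casts uaddr to a pointer. The
32-bit value only has meaning relative to a particular address space,
which is the situation described above for 32-bit emulation on amd64. A
caller checks the return value and propagates EFAULT instead of assuming
the access succeeded.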