Date:      Sun, 7 Dec 2003 17:41:06 -0500 (EST)
From:      Robert Watson <rwatson@freebsd.org>
To:        Anand Subramanian <anand@pythagoras.math.uwaterloo.ca>
Cc:        anand@cs.uwaterloo.ca
Subject:   Re: Sharing data between user space and kernel
Message-ID:  <Pine.NEB.3.96L.1031207173708.29768A-100000@fledge.watson.org>
In-Reply-To: <20031207144117.A6912@pythagoras.math>

On Sun, 7 Dec 2003, Anand Subramanian wrote:

> A look at the copyin() code in the kernel reveals that all the kernel
> needs to do to access the data(address space) of a user process is

Note that the copyin/copyout implementation is machine-dependent (MD), so
while this is true on i386, it may not be true on other systems.  The
fuword/suword/copyin/copyout/uio code is intentionally designed to avoid
the assumption that userspace pointers are directly dereferenceable by
kernel code.  One important example of a situation where this difference
has to be maintained is implementing 32-bit emulation on 64-bit platforms. 
On amd64, you can't just dereference a 32-bit pointer when the kernel is
running in 64-bit mode. 
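
To make the 32-bit emulation point concrete, here is a minimal userland
sketch, not the actual FreeBSD compatibility code.  The names iovec32,
iovec_native and iovec32_to_native are hypothetical.  It only shows that a
structure filled in by a 32-bit process carries its addresses as 32-bit
integers, so a 64-bit kernel has to translate each field explicitly rather
than treat the structure as if it held native pointers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of an I/O vector as a 32-bit process would see it. */
struct iovec32 {
	uint32_t iov_base;	/* user address, carried as a 32-bit integer */
	uint32_t iov_len;
};

/* The same information in the layout native to the running code. */
struct iovec_native {
	void	*iov_base;
	size_t	 iov_len;
};

/*
 * Widen each field explicitly.  The result is still a *user* address:
 * in a kernel it would be handed to copyin()/copyout(), not dereferenced.
 */
static void
iovec32_to_native(const struct iovec32 *in, struct iovec_native *out)
{
	assert(in != NULL && out != NULL);
	out->iov_base = (void *)(uintptr_t)in->iov_base;
	out->iov_len = in->iov_len;
}
```

The converted iov_base is never dereferenced here; the sketch is only
about the layout translation step.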

> 1.  Get the current thread, which I saw is done using the PCPU_GET macro.
>     So I suppose this is always preserved upon a system call.
> 
> 2. Set the segment register for the user process correctly.
> 
> And magically, all the user process's data can now be accessed by the
> kernel directly. 
> 
> Is that correct? In the event of which, it would become really easy for
> a user process to allocate a chunk of memory and all a kernel module
> needs to do to "implement shared memory" is do the steps 1 & 2 and
> access the data.
> 
> Of course there is the question that the user process is "swapped" out
> after the system call and some other thread starts running in between in
> which case curthread should point to some other thread and not the one
> that issued the system call. But then, isn't this what happens upon
> every system call normally, when the kernel does the steps 1 & 2 to
> obtain the data arguments which are passed to the system call. So this
> is hardly a problem. So, can shared memory be implemented this way
> instead of the more traditional "pseudo-device" way? 
> 
> Appreciate any comments on this(please do a CC to my email address, in
> case you choose to respond). 

An additional issue is that user pages are pageable to disk, so they may
not be in memory.  If you're holding any mutexes, etc., in the kernel when
you touch one of those pages, the page fault has to be processed, and you
risk (a) holding the locks for a long time, and (b) lock order problems.
This is one reason why copyin()/copyout() have to be used very carefully,
and this would also apply to any code replicating that functionality.  If
you take a look at the sysctl() code, you'll see that it wires userspace
pages into memory to avoid the risk of sleeping.  What you probably want
to do is actually allocate wired kernel pages and export them to
userspace.  Take a look at the GEOM gstat(8) implementation, which does
exactly that.  However, you have to make sure that if you ever decide to
reuse that kernel memory for something else (i.e., free it back to the
allocator), you've GC'd all userspace references to it.
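
One common way to stay clear of the locking problem is to fetch the user
data into a private buffer before taking any lock, and only then update the
shared state.  The following is a userland analogue under loud assumptions:
fake_copyin() stands in for copyin(9), a pthread mutex stands in for a
kernel mutex, and shared_rec, rec_update and REC_MAX are hypothetical names
made up for illustration.  It is a sketch of the ordering, not kernel code:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define REC_MAX 64

/* Hypothetical shared state; the pthread mutex stands in for a kernel mutex. */
struct shared_rec {
	pthread_mutex_t lock;
	char		data[REC_MAX];
	size_t		len;
};

/* Stand-in for copyin(9).  The real one may page-fault and sleep. */
static int
fake_copyin(const void *uaddr, void *kaddr, size_t len)
{
	if (uaddr == NULL)
		return (EFAULT);
	memcpy(kaddr, uaddr, len);
	return (0);
}

static int
rec_update(struct shared_rec *rec, const void *uaddr, size_t len)
{
	char tmp[REC_MAX];
	int error;

	assert(rec != NULL);
	if (len > sizeof(tmp))
		return (EINVAL);

	/* Step 1: touch user memory while holding no locks. */
	error = fake_copyin(uaddr, tmp, len);
	if (error != 0)
		return (error);

	/* Step 2: hold the lock only for the short in-memory update. */
	pthread_mutex_lock(&rec->lock);
	memcpy(rec->data, tmp, len);
	rec->len = len;
	pthread_mutex_unlock(&rec->lock);
	return (0);
}
```

The cost is an extra copy through tmp.  In exchange, a slow or failing
fetch never happens with the lock held, and a failed fetch leaves the
shared record untouched.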

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Senior Research Scientist, McAfee Research



