Date: Wed, 21 Jul 1999 09:37:18 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Zhihui Zhang <zzhang@cs.binghamton.edu>
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: understanding code related to forced COW for debugger
Message-ID: <199907211637.JAA29468@apollo.backplane.com>
References: <37952EBF.3960E7D4@cs.binghamton.edu>
:I have tried to understand the following code in vm_map_lookup() without
:much success:
:
:    if (fault_type & VM_PROT_OVERRIDE_WRITE)
:        prot = entry->max_protection;
:    else
:        prot = entry->protection;
:    ........
:
:    if (entry->wired_count && (fault_type & VM_PROT_WRITE) &&
:        (entry->eflags & MAP_ENTRY_COW) &&
:        (fault_typea & VM_PROT_OVERRIDE_WRITE) == 0) {
:        RETURN(KERN_PROTECTION_FAILURE);
:    }
:
:At first, it seems to me that if you want to write a COW page, you must
:have OVERRIDE_WRITE set.
The VM_PROT_OVERRIDE_WRITE flag is only used for user-wired pages, so
it does not affect 'normal' page handling. Look carefully at the
vm_fault() code (vm/vm_fault.c line 212): the lookup is only done
with VM_PROT_OVERRIDE_WRITE set if the normal lookup fails and the
user has wired the page.
So if a normal lookup fails and this is a user-wired page, we try
the lookup again with VM_PROT_OVERRIDE_WRITE, presumably to handle
a faked copy-on-write fault for the debugger. This results in the
following:
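To make the retry concrete, here is a minimal user-space sketch of the
two-attempt pattern described above. It is not the vm_fault() source: the
sk_* names, the constant values, the callback type, and the toy lookup are
all made up for illustration, and passing the original fault type plus the
override bit on the retry is an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; values are illustrative, not taken from kernel headers. */
enum { SK_SUCCESS = 0, SK_PROTECTION_FAILURE = 2 };
#define SK_PROT_READ            0x01u
#define SK_PROT_WRITE           0x02u
#define SK_PROT_OVERRIDE_WRITE  0x08u

/* A lookup routine returns SK_SUCCESS or an error code. */
typedef int (*sk_lookup_fn)(void *ctx, unsigned fault_type);

/*
 * Try the normal lookup first. Only when it fails with a protection
 * failure AND the fault comes from a user wire do we retry with the
 * override bit set.
 */
int sk_fault(sk_lookup_fn lookup, void *ctx, unsigned fault_type, bool user_wire)
{
    int result = lookup(ctx, fault_type);

    if (result == SK_SUCCESS)
        return result;
    if (result != SK_PROTECTION_FAILURE || !user_wire)
        return result;

    /* Second attempt: same access, but with the override requested. */
    return lookup(ctx, fault_type | SK_PROT_OVERRIDE_WRITE);
}

/*
 * Toy lookup for experimenting: counts its calls through ctx (an int *)
 * and succeeds only when the override bit is present.
 */
int sk_toy_lookup(void *ctx, unsigned fault_type)
{
    ++*(int *)ctx;
    return (fault_type & SK_PROT_OVERRIDE_WRITE) ? SK_SUCCESS
                                                 : SK_PROTECTION_FAILURE;
}
```

With the toy lookup, a user-wire write fault takes two calls and succeeds,
while the same fault without a user wire fails after a single call.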
First, we temporarily increase the protections to make the page *appear*
writeable. Note: only 'appear' writeable, not actually be writeable.
    if (fault_type & VM_PROT_OVERRIDE_WRITE)
        prot = entry->max_protection;
    else
        prot = entry->protection;
Next we strip off only the fault bits that we care about. Note that
we have already adjusted 'prot' based on the VM_PROT_OVERRIDE_WRITE
flag so 'prot' is probably writeable. We will thus fall through this
conditional:
    fault_type &= (VM_PROT_READ|VM_PROT_WRITE|VM_PROT_EXECUTE);
    if ((fault_type & prot) != fault_type) {
        RETURN(KERN_PROTECTION_FAILURE);
    }
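The effect of choosing 'prot' before masking can be tried out in isolation.
The following is a small sketch, not kernel code: the function name and the
SK_* bit values are invented for the example, and only the selection and
mask logic is modelled on the two snippets above.

```c
/* Hypothetical bit values, chosen only for this example. */
#define SK_PROT_READ            0x01u
#define SK_PROT_WRITE           0x02u
#define SK_PROT_EXECUTE         0x04u
#define SK_PROT_OVERRIDE_WRITE  0x08u

/*
 * Returns 1 when every requested access bit is permitted, 0 otherwise.
 * The override bit selects max_protection instead of protection and is
 * then stripped, so it never takes part in the comparison itself.
 */
int sk_access_allowed(unsigned fault_typea, unsigned protection,
                      unsigned max_protection)
{
    unsigned prot = (fault_typea & SK_PROT_OVERRIDE_WRITE)
                        ? max_protection
                        : protection;
    unsigned fault_type = fault_typea &
        (SK_PROT_READ | SK_PROT_WRITE | SK_PROT_EXECUTE);

    return (fault_type & prot) == fault_type;
}
```

For an entry whose protection is read-only but whose max_protection allows
writing, a plain write request is rejected, while the same request with the
override bit passes the check.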
If this is part of a user wire and we have a write fault and the
page is copy-on-write, *AND* VM_PROT_OVERRIDE_WRITE was not set,
we return a failure. This is, in fact, the failure that is returned
when the vm_fault code initially attempts to do the lookup before
vm_fault falls through and makes a second attempt with
VM_PROT_OVERRIDE_WRITE.
    if (entry->wired_count && (fault_type & VM_PROT_WRITE) &&
        (entry->eflags & MAP_ENTRY_COW) &&
        (fault_typea & VM_PROT_OVERRIDE_WRITE) == 0) {
        RETURN(KERN_PROTECTION_FAILURE);
    }
Now that we've gotten past this code, we revert the protection bits
if the page is user-wired, since the wiring was (indirectly, anyway)
the reason the protections were increased in the first place. We drop
entry->max_protection and go back to entry->protection. Essentially,
we make the page (probably) read-only again.
    *wired = (entry->wired_count != 0);
    if (*wired)
        prot = fault_type = entry->protection;
... but we've already gotten past the conditionals that can cause
a failure to be returned, so the code that follows will *still* do
the copy-on-write for the debugger.
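Putting the quoted pieces together, the sequence of checks can be modelled
in user space. This is only a sketch under stated assumptions: struct
sk_entry, sk_lookup(), and the SK_* values are hypothetical stand-ins, and
everything else the real vm_map_lookup() does (locking, object lookup, the
actual copy) is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; values are illustrative, not the kernel's. */
enum { SK_SUCCESS = 0, SK_PROTECTION_FAILURE = 2 };
#define SK_PROT_READ            0x01u
#define SK_PROT_WRITE           0x02u
#define SK_PROT_EXECUTE         0x04u
#define SK_PROT_OVERRIDE_WRITE  0x08u
#define SK_ENTRY_COW            0x01u

/* Simplified map entry holding only the fields the checks look at. */
struct sk_entry {
    unsigned protection;      /* current protection */
    unsigned max_protection;  /* upper bound on protection */
    int      wired_count;     /* non-zero when the entry is wired */
    unsigned eflags;          /* SK_ENTRY_COW when copy-on-write */
};

int sk_lookup(const struct sk_entry *entry, unsigned fault_typea,
              unsigned *out_prot, bool *wired)
{
    unsigned fault_type = fault_typea;
    unsigned prot;

    /* Step 1: the override makes the entry appear as permissive as allowed. */
    if (fault_type & SK_PROT_OVERRIDE_WRITE)
        prot = entry->max_protection;
    else
        prot = entry->protection;

    /* Step 2: keep only the access bits and compare against prot. */
    fault_type &= (SK_PROT_READ | SK_PROT_WRITE | SK_PROT_EXECUTE);
    if ((fault_type & prot) != fault_type)
        return SK_PROTECTION_FAILURE;

    /* Step 3: a write to a wired COW entry fails unless the override is set. */
    if (entry->wired_count && (fault_type & SK_PROT_WRITE) &&
        (entry->eflags & SK_ENTRY_COW) &&
        (fault_typea & SK_PROT_OVERRIDE_WRITE) == 0)
        return SK_PROTECTION_FAILURE;

    /* Step 4: for wired entries, fall back to the current protection. */
    *wired = (entry->wired_count != 0);
    if (*wired)
        prot = entry->protection;

    *out_prot = prot;
    return SK_SUCCESS;
}
```

With a wired COW entry, a write request fails at step 3 on the first
attempt and succeeds once the override bit is added, and the protection
handed back is entry->protection rather than max_protection.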
:But later I find that when wired_count is non zero, we are actually
:simulating a page fault, not a real one.
:Anyway, I do not know how the above code (1) prevents a debugger from
:writing a binary code, (2) forces
:a COW when a debugger write other data.
:
:I also have some questions on wiring a page:
:
:(1) According to the man pages of mlock(2), a wired page can still
:cause protection-violation faults.
:But in the same vm_map_lookup(), we have the following code:
:
: if (*wired)
: prot = fault_type = entry->protection;
:
:and the comment says "get it for all possible accesses". As I understand
:it, we wire a page by simulating
:a page fault (no matter whether it is kernel or user who is wiring a
:page).
I'm pretty sure this piece is simply reverting the mess that the
copy-on-write stuff does for the debugger. entry->protection is what
we normally want to use.
The debugger copy-on-write junk is there so the debugger can modify a
program's TEXT area but the program itself *cannot* modify its own TEXT
area. It's a big mess and I don't fully understand how the structures
are faked up to handle the case.
:(2) Can the kernel wire a page of a user process without that user's
:request (by calling mlock)?
:
:Any help is appreciated.
Yes. The kernel can wire a page. It usually busies the page for the
duration, however, so vm_fault will block on the page and then retry
without actually noticing that the page has been wired. I'm probably
not entirely correct here; John may be able to say more about it.
-Matt
Matthew Dillon
<dillon@backplane.com>
