Date:      Wed, 08 Feb 2006 13:39:23 +0800
From:      David Xu <davidxu@freebsd.org>
To:        Sam Lawrance <boris@brooknet.com.au>
Cc:        Kris Kennaway <kris@obsecurity.org>, Jason Evans <jasone@freebsd.org>, current@freebsd.org
Subject:   Re: Which signal occurs due to a page protection violation?
Message-ID:  <43E9840B.2030709@freebsd.org>
In-Reply-To: <ADC5EE4E-2304-4F95-A32F-8BB37F2A9A7E@brooknet.com.au>
References:  <3458D5B9-860C-4185-9359-1F48FC35B048@brooknet.com.au> <31986988-9FB7-4EFC-986B-50DB99934E32@freebsd.org> <ADC5EE4E-2304-4F95-A32F-8BB37F2A9A7E@brooknet.com.au>

Sam Lawrance wrote:
> 
> [ moved to -current ]
> 
> On 01/02/2006, at 6:41 AM, Jason Evans wrote:
> 
>> On Jan 31, 2006, at 1:06 AM, Sam Lawrance wrote:
>>
>>> ElectricFence is failing during its self test on i386 7-current:
>>>
>>> Testing Electric Fence.
>>> After the last test, it should print that the test has PASSED.
>>> EF_PROTECT_BELOW= && EF_PROTECT_FREE= && EF_ALIGNMENT= && ./eftest
>>> Segmentation fault (core dumped)
>>> *** Error code 139
>>>
>>> The program intentionally overruns and underruns buffers in order  to 
>>> test the functionality of ElectricFence.
>>> I think it's failing because:
>>> 1) the new jemalloc is actually catching the problem and throwing  
>>> SIGSEGV
>>> 2) ElectricFence is being compiled with
>>> -DPAGE_PROTECTION_VIOLATED_SIGNAL=SIGBUS on that platform.
>>
>>
>> I'm not sure about this, but I think the change of which signal  
>> occurs is unrelated to jemalloc.  I think Kris Kennaway at one  point 
>> told me that jemalloc broke the efence port, but then later  retracted 
>> that claim when efence also failed on a machine that was  still using 
>> phkmalloc.  This may be due to a signal delivery bugfix  that someone 
>> put in, probably in early December 2005.
> 
> 
> You are right.  The change below delivers SIGSEGV instead of SIGBUS  
> (also on amd64).
> 
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/i386/trap.c?annotate=1.282
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/amd64/amd64/trap.c.diff?r1=1.294&r2=1.295&f=h
> 
> David, was this an intentional change?  It broke ElectricFence, which  
> depended on the old behaviour.  The 4.x, 5.x, and 6.x package builds  
> are hosted on machines running -current, so ElectricFence self tests  
> will fail in that environment.
> 
> I haven't seen any other fallout from this change.  However, binaries  
> depending on the old behaviour may have issues with backward  
> compatibility.
> 
> 
> 
> 
Yes, I made the change while making signal delivery POSIX compatible.
I didn't intend to break ElectricFence; that was an unfortunate side effect.
The real problem is the return code of vm_fault. In i386/i386/trap.c,
line 768:

  return((rv == KERN_PROTECTION_FAILURE) ? SIGBUS : SIGSEGV);

It only detects a protection fault and, if that is the case, returns
SIGBUS. I have since mapped this SIGBUS to SIGSEGV + SEGV_ACCERR. To
understand the POSIX signal encoding, please read the manual page:
man siginfo

I think the return codes from vm_fault should be documented; I really
need that to implement this correctly. Currently the following values are
defined in /sys/vm/vm_param.h:

#define KERN_SUCCESS            0
#define KERN_INVALID_ADDRESS    1
#define KERN_PROTECTION_FAILURE 2
#define KERN_NO_SPACE           3
#define KERN_INVALID_ARGUMENT   4
#define KERN_FAILURE            5
#define KERN_RESOURCE_SHORTAGE  6
#define KERN_NOT_RECEIVER       7
#define KERN_NO_ACCESS          8

I don't know in which situations each value is returned.

One situation where I think the kernel should deliver SIGBUS to user code:

For example, suppose a binary file named hello is 1024 bytes long, and
the following code tries to read the byte at offset 10239:

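#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <signal.h>
#include <stddef.h>
#include <stdio.h>

int main(void)
{
         char *p;
         int fd = open("./hello", O_RDONLY);

         if (fd == -1) {
                 printf("open failed\n");
                 return (1);
         }

         /* Map 10240 bytes although the file holds only 1024. */
         p = mmap(NULL, 10240, PROT_READ, MAP_PRIVATE, fd, 0);
         if (p == MAP_FAILED) {  /* mmap fails with MAP_FAILED, not NULL */
                 printf("can not map\n");
                 return (1);
         }

         /* With 4 KB pages, offset 10239 has no file data behind it. */
         printf("%c\n", *(p + 10239));
         return (0);
}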

The kernel should post SIGBUS to userland, because no physical page is
available: the file pager won't provide a page for that offset. The
process should get SIGBUS + BUS_ADRERR. The same applies when you access
a device mapped into memory at an address outside the device object's
range.
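
One way to check what a given kernel actually does in this case is to
perform the access in a child process and look at the signal that
terminates it. The sketch below is only an illustration:
signal_for_read_past_eof is a made-up name, the temporary file under /tmp
stands in for ./hello, and the mapping is sized as three pages from
sysconf() rather than the fixed 10240 bytes, so that the last byte is
beyond the file's data whatever the page size.

```c
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map three pages of a 1024-byte file and read the last mapped byte in
 * a child. Returns the signal that killed the child, 0 if it exited
 * normally, or -1 if the setup failed.
 */
int signal_for_read_past_eof(void)
{
    char path[] = "/tmp/hello.XXXXXX";
    char buf[1024];
    long pagesz = sysconf(_SC_PAGESIZE);
    size_t len;
    volatile char *p;
    int fd, status;
    pid_t pid;

    if (pagesz <= 0)
        return -1;
    len = (size_t)pagesz * 3;

    fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);               /* the open descriptor keeps the file alive */
    memset(buf, 'x', sizeof(buf));
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
        close(fd);
        return -1;
    }

    p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    pid = fork();
    if (pid == 0) {
        char c = p[len - 1];    /* last page has no file data behind it */
        (void)c;
        _exit(0);
    }

    /* Parent: wait for the child, then clean up. */
    if (pid == -1 || waitpid(pid, &status, 0) == -1)
        status = -1;
    munmap((void *)p, len);
    close(fd);

    if (status == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Comparing the return value against SIGBUS and SIGSEGV shows which of the
two the running kernel chose. This approach does not expose si_code; that
would need an SA_SIGINFO handler in the child.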

That is my understanding of the POSIX signal codes; I may be wrong.

Regards,
David Xu



