Date: Wed, 08 Feb 2006 13:39:23 +0800
From: David Xu <davidxu@freebsd.org>
To: Sam Lawrance <boris@brooknet.com.au>
Cc: Kris Kennaway <kris@obsecurity.org>, Jason Evans <jasone@freebsd.org>, current@freebsd.org
Subject: Re: Which signal occurs due to a page protection violation?
Message-ID: <43E9840B.2030709@freebsd.org>
In-Reply-To: <ADC5EE4E-2304-4F95-A32F-8BB37F2A9A7E@brooknet.com.au>
References: <3458D5B9-860C-4185-9359-1F48FC35B048@brooknet.com.au> <31986988-9FB7-4EFC-986B-50DB99934E32@freebsd.org> <ADC5EE4E-2304-4F95-A32F-8BB37F2A9A7E@brooknet.com.au>

Sam Lawrance wrote:
> [ moved to -current ]
>
> On 01/02/2006, at 6:41 AM, Jason Evans wrote:
>
>> On Jan 31, 2006, at 1:06 AM, Sam Lawrance wrote:
>>
>>> ElectricFence is failing during its self test on i386 7-current:
>>>
>>>   Testing Electric Fence.
>>>   After the last test, it should print that the test has PASSED.
>>>   EF_PROTECT_BELOW= && EF_PROTECT_FREE= && EF_ALIGNMENT= && ./eftest
>>>   Segmentation fault (core dumped)
>>>   *** Error code 139
>>>
>>> The program intentionally overruns and underruns buffers in order to
>>> test the functionality of ElectricFence.
>>> I think it's failing because:
>>> 1) the new jemalloc is actually catching the problem and throwing
>>>    SIGSEGV
>>> 2) ElectricFence is being compiled with
>>>    -DPAGE_PROTECTION_VIOLATED_SIGNAL=SIGBUS on that platform.
>>
>> I'm not sure about this, but I think the change of which signal
>> occurs is unrelated to jemalloc. I think Kris Kennaway at one point
>> told me that jemalloc broke the efence port, but then later retracted
>> that claim when efence also failed on a machine that was still using
>> phkmalloc. This may be due to a signal delivery bugfix that someone
>> put in, probably in early December 2005.
>
> You are right. The change below delivers SIGSEGV instead of SIGBUS
> (also on amd64).
>
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/i386/trap.c?annotate=1.282
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/amd64/amd64/trap.c.diff?r1=1.294&r2=1.295&f=h
>
> David, was this an intentional change? It broke ElectricFence, which
> depended on the old behaviour. The 4.x, 5.x, and 6.x package builds
> are hosted on machines running -current, so ElectricFence self tests
> will fail in that environment.
>
> I haven't seen any other fallout from this change. However, binaries
> depending on the old behaviour may have issues with backward
> compatibility.

Yes, I made the change while making signal delivery POSIX compatible.
I did not intend to break ElectricFence; that was an unfortunate side
effect. The real problem is how the return code of vm_fault is handled.
In file i386/i386/trap.c, line 768:

        return((rv == KERN_PROTECTION_FAILURE) ? SIGBUS : SIGSEGV);

It only detects a protection fault and, if that is the case, returns
SIGBUS. I have mapped this SIGBUS to SIGSEGV + SEGV_ACCERR. To
understand the POSIX signal encoding, please read the manual page:

        man siginfo

I think the return codes from vm_fault should be documented; I really
need that to implement this correctly. Currently the following values
are defined in /sys/vm/vm_param.h:

#define KERN_SUCCESS            0
#define KERN_INVALID_ADDRESS    1
#define KERN_PROTECTION_FAILURE 2
#define KERN_NO_SPACE           3
#define KERN_INVALID_ARGUMENT   4
#define KERN_FAILURE            5
#define KERN_RESOURCE_SHORTAGE  6
#define KERN_NOT_RECEIVER       7
#define KERN_NO_ACCESS          8

I don't know in which situation each value is returned.

One situation in which I think the kernel should deliver SIGBUS to user
code is the following. I have a binary file named hello whose size is
1024 bytes, and this code tries to read the byte at offset 10239:

#include <stddef.h>
#include <stdio.h>      /* printf */
#include <sys/types.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/mman.h>

int main(void)
{
        char *p;
        int fd = open("./hello", O_RDONLY);

        if (fd == -1) {
                printf("open failed\n");
                return (0);
        }
        /* Map 10240 bytes although the file is only 1024 bytes long. */
        p = mmap(NULL, 10240, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) {  /* mmap reports failure as MAP_FAILED, not NULL */
                printf("can not map\n");
                return (0);
        }
        /* Read beyond the end of the file's backing store. */
        printf("%c\n", *(p + 10239));
        return 0;
}

The kernel should post SIGBUS to userland because no physical page is
available: the file pager won't provide a page for that address. The
process should get SIGBUS + BUS_ADRERR. The same applies when you access
a device mapped into memory at an address outside the device object's
range.

The above is my understanding of the POSIX signal codes; I may be wrong.

Regards,
David Xu
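
To check which signal and si_code a particular kernel actually delivers for the mapping case above, a small probe along the following lines may help. It is only a sketch under stated assumptions: probe_fault, struct fault_info and report_fd are hypothetical names made up for illustration, the faulting read is done in a forked child so the caller survives, and the result depends on the OS, its version and its page size.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction/siginfo_t in strict modes */
#include <fcntl.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result type: what the kernel delivered for one read. */
struct fault_info {
        int signo;      /* 0 if the read completed without a signal */
        int code;       /* si_code, e.g. BUS_ADRERR or SEGV_ACCERR */
};

static int report_fd = -1;      /* pipe write end, used by the handler */

/* Runs in the child: forward signal number and si_code, then exit. */
static void on_fault(int signo, siginfo_t *si, void *ctx)
{
        struct fault_info fi = { signo, si->si_code };

        (void)ctx;
        /* write(2) and _exit(2) are async-signal-safe. */
        if (write(report_fd, &fi, sizeof(fi)) != (ssize_t)sizeof(fi))
                _exit(2);
        _exit(1);
}

/*
 * Map map_len bytes of path read-only, read one byte at offset off in a
 * child process and record the signal, if any, in *out.
 * Returns 0 on success, -1 if open/mmap/pipe/fork failed.
 */
int probe_fault(const char *path, size_t map_len, size_t off,
    struct fault_info *out)
{
        int pfd[2];
        int fd = open(path, O_RDONLY);

        if (fd == -1)
                return (-1);
        char *p = mmap(NULL, map_len, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);
        if (p == MAP_FAILED)
                return (-1);
        if (pipe(pfd) == -1) {
                munmap(p, map_len);
                return (-1);
        }
        pid_t pid = fork();
        if (pid == -1) {
                close(pfd[0]);
                close(pfd[1]);
                munmap(p, map_len);
                return (-1);
        }
        if (pid == 0) {
                /* Child: catch both candidate signals with siginfo. */
                struct sigaction sa;

                close(pfd[0]);
                report_fd = pfd[1];
                memset(&sa, 0, sizeof(sa));
                sa.sa_sigaction = on_fault;
                sigemptyset(&sa.sa_mask);
                sa.sa_flags = SA_SIGINFO;
                sigaction(SIGBUS, &sa, NULL);
                sigaction(SIGSEGV, &sa, NULL);
                volatile char c = *(volatile char *)(p + off);
                (void)c;
                _exit(0);       /* no fault: nothing written to the pipe */
        }
        /* Parent: EOF on the pipe means the child read the byte cleanly. */
        close(pfd[1]);
        ssize_t n = read(pfd[0], out, sizeof(*out));
        if (n != (ssize_t)sizeof(*out)) {
                out->signo = 0;
                out->code = 0;
        }
        close(pfd[0]);
        waitpid(pid, NULL, 0);
        munmap(p, map_len);
        return (0);
}
```

For the 1024-byte hello file, calling probe_fault("./hello", 10240, 10239, &fi) and printing fi.signo and fi.code shows what was delivered. On a system with pages larger than 10240 bytes that offset still falls inside the first, valid page, so no signal is expected there; choosing map_len and off as multiples of sysconf(_SC_PAGESIZE) avoids that.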