Date:      Mon, 20 Jun 2005 16:49:05 -0400
From:      John Baldwin <jhb@FreeBSD.org>
To:        freebsd-alpha@freebsd.org
Subject:   Re: Kernel trap on linux compat load
Message-ID:  <200506201649.06845.jhb@FreeBSD.org>
In-Reply-To: <42B72778.5090009@coldhaus.com>
References:  <42AA1C23.4080901@coldhaus.com> <200506201433.46335.jhb@FreeBSD.org> <42B72778.5090009@coldhaus.com>

On Monday 20 June 2005 04:30 pm, Eric Millbrandt wrote:
> John Baldwin wrote:
> >On Friday 17 June 2005 05:56 pm, Eric Millbrandt wrote:
> >>John Baldwin wrote:
> >>>On Thursday 16 June 2005 08:21 pm, Eric Millbrandt wrote:
> >>>>John Baldwin wrote:
> >>>>>On Friday 10 June 2005 07:02 pm, Eric Millbrandt wrote:
> >>>>>>I've noticed that linux compatibility caused my system to crash since
> >>>>>> I upgraded from RELENG_4_9 (RELENG_4_10, RELENG_4_11, RELENG_4).  I
> >>>>>> upgraded using buildworld; I haven't had a chance to test using a
> >>>>>> clean binary install.  Has anyone else seen this behavior?  It's
> >>>>>> easy to reproduce: the system traps right on linux compatibility
> >>>>>> load.  The only similar behavior I've found on google is an
> >>>>>> unanswered post
> >>>>>> http://lists.freebsd.org/pipermail/freebsd-alpha/2004-June/001555.html
> >>>>>>
> >>>>>>Jun  8 07:18:17 mongoloid /kernel: fatal kernel trap:
> >>>>>>Jun  8 07:18:17 mongoloid /kernel:
> >>>>>>Jun  8 07:18:17 mongoloid /kernel: trap entry = 0x4 (unaligned access fault)
> >>>>>>Jun  8 07:18:17 mongoloid /kernel: a0         = 0xfffffe0015285e34
> >>>>>>Jun  8 07:18:17 mongoloid /kernel: a1         = 0x2d
> >>>>>>Jun  8 07:18:17 mongoloid /kernel: a2         = 0x1
> >>>>>>Jun  8 07:18:17 mongoloid /kernel: pc         = 0xfffffc0000386e68
> >>>>>>Jun  8 07:18:17 mongoloid /kernel: ra         = 0xfffffc0000386dd8
> >>>>>>Jun  8 07:18:17 mongoloid /kernel: curproc    = 0xfffffe001177e780
> >>>>>>Jun  8 07:18:17 mongoloid /kernel: pid = 173, comm = ldconfig
> >>>>>>Jun  8 07:18:17 mongoloid /kernel:
> >>>>>>Jun  8 07:18:17 mongoloid /kernel: panic: trap
> >>>>>>
> >>>>>>EEM
> >>>>>
> >>>>>Did you rebuild your kernel with buildkernel / installkernel?  I
> >>>>> believe that buildworld on 4.x still builds and installs new modules
> >>>>> into /modules including the linux and osf1 compat modules.  New
> >>>>> modules aren't guaranteed to work on older kernels, so if your kernel
> >>>>> is still 4.9, that is probably your problem.
> >>>>
> >>>>I used buildworld, buildkernel, installkernel, drop to single user,
> >>>>installworld, and mergemaster.  This is with linux_compat loading from
> >>>>rc.conf.  I misspoke earlier: loading linux.ko works fine, but running a
> >>>>linux binary, /compat/linux/sbin/ldconfig or sophie, causes the trap.
> >>>>On a side note, osf1_compat works fine.
> >>>>
> >>>>Here are my stack traces after running /compat/linux/sbin/ldconfig...
> >>>>Debugger() at Debugger+0x2c
> >>>>panic() at panic+0x100
> >>>>trap() at trap+0x600
> >>>>XentUna() at XentUna+0x2c
> >>>>--- unaligned access fault (from ipl0) ---
> >>>>kernel_sysctl at kernel_sysctl+0x1a8
> >>>>347() at -0x1fffe4a63d0
> >>>>Here ddb traps on itself (memory management fault)
> >>>>
> >>>>This one is from running sophie
> >>>>Debugger() at Debugger+0x2c
> >>>>panic() at panic+0x100
> >>>>trap() at trap+0x600
> >>>>XentUna() at XentUna+0x2c
> >>>>--- unaligned access fault (from ipl0) ---
> >>>>kernel_sysctl at kernel_sysctl+0x1a8
> >>>>linux_newuname() at linux_newuname+0xd0
> >>>>syscall() at syscall+0x224
> >>>>XentSys at XentSys+0x5c
> >>>>--- syscall (339, Linux ELF, linux_newuname) ---
> >>>>--- user mode ---
> >>>
> >>>Ok, this is helpful.  Can you do 'gdb kernel.debug' and then type 'list
> >>>*kernel_sysctl+0x1a8'?  This will give us the source file/line that it
> >>>faulted at.  Thanks!
> >>
> >>Ok here is what I found.
> >>
> >>(gdb) list *kernel_sysctl+0x1a8
> >>0xfffffc000038e9a8 is in kernel_sysctl (/usr/src/sys/kern/kern_sysctl.c:938).
> >>933                     if (req.oldptr && req.oldidx > req.oldlen)
> >>934                             *retval = req.oldlen;
> >>935                     else
> >>936                             *retval = req.oldidx;
> >>937             }
> >>938             return (error);
> >>939     }
> >>940
> >>941     int
> >>942     kernel_sysctlbyname(struct proc *p, char *name, void *old, size_t *oldlenp,
> >
> >Ok, try this patch please:
> >
> >Index: compat/linux/linux_misc.c
> >===================================================================
> >RCS file: /usr/cvs/src/sys/compat/linux/linux_misc.c,v
> >retrieving revision 1.85.2.11
> >diff -u -r1.85.2.11 linux_misc.c
> >--- compat/linux/linux_misc.c   23 Mar 2004 12:16:48 -0000      1.85.2.11
> >+++ compat/linux/linux_misc.c   20 Jun 2005 18:33:07 -0000
> >@@ -687,7 +687,8 @@
> >        struct l_new_utsname utsname;
> >        char *osrelease, *osname;
> >        int name[2];
> >-       int error, plen, olen;
> >+       int error;
> >+       size_t plen, olen;
> >
> > #ifdef DEBUG
> >        if (ldebug(newuname))
>
> Ok, I rebuilt and installed linux.ko and linux binaries appear to be
> working!  At least sophie, compiled for linux, catches viruses again.
> Let me know if you need any further testing and where you will apply the
> patch in cvs.  Thanks for the help.

Excellent.  Thanks for testing.  I've just committed the fix to RELENG_4.
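
For anyone finding this thread in the archives, here is a minimal sketch of
the likely failure mode and of the shape of the fix.  It is not the actual
linux_newuname() code: fake_sysctl() and query_len() are hypothetical
stand-ins, and the assumption is that the length was reported through a
size_t pointer, as the `*retval = req.oldidx` line in the gdb listing
suggests.  On alpha, size_t is 8 bytes and int is 4, so handing the address
of an int to such a parameter asks for an 8-byte store into a 4-byte object
that is only guaranteed 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a sysctl-style kernel interface: it copies a
 * value into `old` and reports the byte count through a size_t pointer.
 */
static int
fake_sysctl(void *old, size_t oldlen, size_t *retval)
{
	static const char value[] = "FreeBSD";
	size_t n = sizeof(value) < oldlen ? sizeof(value) : oldlen;

	assert(retval != NULL);
	if (old != NULL)
		memcpy(old, value, n);
	*retval = n;		/* a full size_t-wide store */
	return (0);
}

/*
 * Caller in the post-patch style: the length variable has the same type
 * the callee stores through, so it is wide enough and suitably aligned
 * on every platform.  Declaring it as `int plen` and passing its address
 * where a size_t pointer is expected is the pattern the patch removes.
 */
static size_t
query_len(char *buf, size_t buflen)
{
	size_t plen = 0;	/* was `int plen` before the patch */
	int error;

	error = fake_sysctl(buf, buflen, &plen);
	return (error == 0 ? plen : 0);
}
```

On i386 both types are 4 bytes wide, so the mismatch goes unnoticed there,
which would explain why only the 64-bit platform trapped.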

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org


