Date: Fri, 21 Jul 1995 14:31:27 -0400 (EDT)
From: A boy and his worm gear <wpaul@skynet.ctr.columbia.edu>
To: graichen@omega.physik.fu-berlin.de (Thomas Graichen)
Cc: bugs@freebsd.org
Subject: Re: 3 ways to crash FreeBSD (2.0.5 and 950412-SNAP)
Message-ID: <199507211831.OAA05262@skynet.ctr.columbia.edu>
In-Reply-To: <9507210817.AA25956@omega.physik.fu-berlin.de> from "Thomas Graichen" at Jul 21, 95 10:17:53 am

Of all the gin joints in all the world, Thomas Graichen had to walk into mine and say:

> hello - here are my 3 ways to crash FreeBSD:

[three-finger salute during boot causes strange crash]

Just a guess: I remember someone saying that CTRL-ALT-DEL is supposed to cause the kernel to send a signal to init to tell it to start shutting down the system. If it tries to do the same thing when init isn't running (yet), then I imagine nasty things would happen.

> * simply do the following as root at the console:
>
> modload -u -o /tmp/saver_mod -e saver_init -q /lkm/${saver}_saver_mod.o
> modload -u -o /tmp/saver_mod -e saver_init -q /lkm/${saver}_saver_mod.o
> (this will giva an error)
> modunload -n star_saver

I think I fixed this one. I noticed a similar problem with the if_sl module; all the MISC type modules are susceptible to the bug. The problem is that duplicate checking (i.e. checking to see if the user is trying to load a second instance of the same module) didn't work quite right with MISC modules: there are handler routines for other module types (VFS, EXEC, etc...) which do the checking before the module is actually called for the first time, but there is no such handler function for MISC modules, and MISC modules aren't smart enough to do it themselves.

What's happening is that the duplicate checking is done _after_ the module's internal initialization routine is called. By the time the kernel notices the problem, the module has already wired itself in. As part of the error handling, the kernel tries to unmap the duplicate instance of the module. This is akin to gnawing your own arm off: the next time the now-dead module's address space is referenced, the system will blow up.

What you need to do to fix this is grab a new copy of /usr/src/sys/sys/lkm.h, install it (it goes in /usr/include/sys too, if you have just the lkm sources installed) and rebuild all the MISC modules.
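
To make the ordering problem concrete, here is a toy model of it. It is only a sketch: the real code is kernel C, and every name below (`ToyKernel`, `ToyModule`, `load_check_after_init`, `load_check_before_init`, `dangling_hooks`) is hypothetical and invented for illustration. It shows why checking for a duplicate after the module's init routine has run leaves the kernel holding a reference to unmapped code, and why checking first avoids that.

```python
class ToyModule:
    """Stand-in for a MISC-type loadable module (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.mapped = True  # becomes False once the "kernel" unmaps it

    def init(self, kernel):
        # The module wires itself into the kernel (think: installs a hook).
        kernel.hooks.append(self)


class ToyKernel:
    """Stand-in for the module loader (hypothetical)."""

    def __init__(self):
        self.loaded = {}  # module name -> module instance
        self.hooks = []   # references that modules wired in during init

    def load_check_after_init(self, module):
        # Problematic order: init runs first, duplicate check second.
        module.init(self)
        if module.name in self.loaded:
            # Error handling unmaps the duplicate, but its hook stays wired in.
            module.mapped = False
            return False
        self.loaded[module.name] = module
        return True

    def load_check_before_init(self, module):
        # Safer order: refuse the duplicate before its init ever runs.
        if module.name in self.loaded:
            return False
        module.init(self)
        self.loaded[module.name] = module
        return True


def dangling_hooks(kernel):
    """Return the hooks that still point at unmapped modules."""
    return [m for m in kernel.hooks if not m.mapped]


if __name__ == "__main__":
    bad = ToyKernel()
    bad.load_check_after_init(ToyModule("saver"))
    bad.load_check_after_init(ToyModule("saver"))   # duplicate load
    print("check after init, dangling hooks:", len(dangling_hooks(bad)))

    good = ToyKernel()
    good.load_check_before_init(ToyModule("saver"))
    good.load_check_before_init(ToyModule("saver"))  # duplicate load
    print("check before init, dangling hooks:", len(dangling_hooks(good)))
```

In the first case the second load is rejected, yet a hook to the unmapped instance remains; the next call through that hook is the "blow up" described above. In the second case the duplicate never gets a chance to wire itself in.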
The new lkm.h has a tiny modification in the DISPATCH() macro: it makes a quick call to lkmexists() before actually trying to run the module's initialization routine. (This should be pulled into the STABLE branch if it hasn't already.)

> * the last is a problem i have since the early january SNAP's - this is what
> i've written some times before to jordan:
>
> the system crashes then i log in (but couriously not if somebody else or
> root does this) via xdm - /var/log/messages says
>
> Feb 9 10:49:44 julia /vmunix: Error in getattr: 70

Lessee... errno 70 is Stale NFS file handle.

> Feb 9 10:49:45 julia /vmunix: instruction pointer = 0x8:0xf0125b1b
                                                            ^^^^^^^^

Do an 'nm /vmunix' and see if you can find a symbol with an address close to this one. This will give you a rough idea of where the system is getting hosed (though it may not point you directly at the problem).

> its absolutely reproducable: reboot - xdm is started - i try to login - the
> xdm login window disappears - i here the disk writing the coredump - but the
> problem did'nt appear if somebody else logs in (who has his homedirectory on
> another machine - but both dec alpha's osf/1 3.0 - we are mounting the
> homedirs via amd with nfs) - my nfs-homedir-server says:
>
> Feb 9 10:45:48 sirius vmunix: NFS server: stale file handle fs(8,2054) file
> 116839 gen 792314558
> Feb 9 10:45:48 sirius vmunix: getattr, client address = 130.133.3.235, errno
> 22

Errno 22 is Invalid Argument. This stuff is out of my league, but it sounds like a locking problem or race condition. It happens that there have been many changes to the NFS and VM code in FreeBSD-current. You might try setting up a -current system and seeing if the problem persists.
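
The 'nm /vmunix' lookup suggested above can be done by eye, but a small script makes it less tedious. The following is a rough sketch, assuming nm prints one symbol per line as "address type name" with the address in hex. The function name `nearest_symbol` is hypothetical, and the sample symbol table is made up purely for illustration; only the faulting address 0xf0125b1b comes from the log above.

```python
def nearest_symbol(nm_text, addr):
    """Return (name, offset) for the closest symbol at or below addr.

    nm_text is the text output of nm; lines that do not look like
    "hexaddress type name" (for example undefined symbols) are skipped.
    Returns None if no symbol lies at or below addr.
    """
    best = None
    for line in nm_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # undefined symbols have no address column
        try:
            sym_addr = int(fields[0], 16)
        except ValueError:
            continue  # first column was not a hex address
        if sym_addr <= addr and (best is None or sym_addr > best[0]):
            best = (sym_addr, fields[2])
    if best is None:
        return None
    return best[1], addr - best[0]


# Made-up sample in nm's usual layout; NOT taken from a real kernel.
SAMPLE_NM_OUTPUT = """\
f0125a40 T _made_up_func_a
f0125b00 T _made_up_func_b
f0125c10 T _made_up_func_c
         U _made_up_undefined
"""

if __name__ == "__main__":
    name, offset = nearest_symbol(SAMPLE_NM_OUTPUT, 0xF0125B1B)
    print("%s+0x%x" % (name, offset))
```

To use it on a real system, feed it the saved output of 'nm /vmunix' instead of the sample string. The input does not need to be sorted, since the function scans every line. As noted above, the result only narrows down the neighbourhood of the crash.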

> * and one last thing:
> we mount all our homedir's via amd - which mounts them if they are needed
> (user logs in) and tries to unmount them automatic if they are no longer used
> - but FreeBSD seems to loose the directories from time to time - that means
> the directory will be mounted again and again (this way i sometimes get 20
> times the same dir mounted) - and after each of these overmountings i get an
> "getting cwd failed" from my tcsh (because the directory is new ... mounted) -
> do you have any ideas ?

Sorry: I use amd on my system and it works fine. My configuration is probably different from yours though. (I mount each user's home directory just once and then use amd to create symlinks that point to the right directories in each filesystem. So amd mounts /q/elara/home/elara via NFS, creates a /home/elara link that points to /q/elara/home/elara, then it makes, for example, a /homes/foouser link that points to /home/elara/foouser (and a /homes/baruser that points to /home/elara/baruser, and a /homes/bazuser, etc...). This way, everyone's home directory is always /homes/<username>. Note that I use the Berkeley amd on all my machines too. The only special thing I have to do with FreeBSD is use the resvport option.)

> to all the points above - i'll try to give you all the information you need
> and as far as i can all the help i may give you - thanks in advance - t

Try -current first and see if the problems are still there.

-Bill

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~T~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 -Bill Paul            (212) 854-6020 | System Manager
 Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
 Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 The Møøse Illuminati: ignore it and be confused, or join it and be confusing!
~~~~~~ "Welcome to All Things BSDish! If it's not BSDish, it's crap!" ~~~~~~~