Date: Thu, 23 Sep 1999 23:13:02 -0700 (PDT)
From: "Rodney W. Grimes" <freebsd@gndrsh.dnsmgr.net>
To: john@nlc.net.au (John Saunders)
Cc: freebsd-current@FreeBSD.ORG (FreeBSD current)
Subject: Re: Automating filesystem check at boot time
Message-ID: <199909240613.XAA01586@gndrsh.dnsmgr.net>
In-Reply-To: <011701bf064e$d0f7f4f0$6cb611cb@scitec.com.au> from John Saunders at "Sep 24, 1999 03:36:56 pm"

> I administer a number of remote FreeBSD boxes and starting with 3.x
> they have been unreliable at rebooting. We all know FreeBSD wants to
> keep running forever, however it seems to be at the expense of
> reboot stability. I have found the following problems occurring.
>
> 1) After a power failure the filesystem is inconsistent such that
> a manual fsck is required. Actually this can also occur following
> a crash or failed shutdown. However I must admit that FreeBSD does
> this less than Linux, but it still does it.

Unattended, remotely administered systems shall have a UPS; power
failure is not an acceptable event for 99% of these types of systems.
There is no good reason to have to hack them up to deal with this
situation. Get a good UPS, and install one of the UPS monitor daemons
from ports to shut down properly before total battery failure.

> 2) After running "shutdown -r now" FreeBSD will kill off all processes
> but complain that it is unable to kill everything. It then says Syncing
> disks...done. Then hangs until the reset button is pressed. I think
> that amd is causing this. The time this happened was following a
> reboot to clear an amd problem when the NFS server was isolated
> from the network for some time.

You need to find and fix whatever it is that is not dying when it is
told to die. Your workaround is a band-aid that only hides the real
problem, which is probably a bug somewhere. amd and NFS are good first
candidates. Just what process does it complain that it is unable to
kill, or does it just say it could not kill?

> My previous hacks at Linux have led me to the following patch to /etc/rc
> which I have been using for a while on FreeBSD to solve point 1. It has
> saved me a lot of driving on 2 occasions.
> The program "waitkey" is one I wrote that sleeps for the specified
> time and returns TRUE (0), unless a key is pressed, in which case it
> returns the ASCII code for the key.

Not a bad hack, but I will never ever run an fsck -y without at least
a log file someplace, and then only as a last resort after imaging the
disk and doing everything else I can to try to recover it, if it has
anything important at all on it.

>
> +++ rc Wed Aug 18 13:59:59 1999
> @@ -69,6 +69,12 @@
>  		;;
>  	8)
>  		echo "Automatic file system check failed... help!"
> +		if waitkey 30; then
> +			exit 1
> +		fi
> +		fsck -y

fsck -y >to some place writable. Hard to figure out at this stage, but
if you run a separate /tmp like we do you can always change this to:

	newfs /dev/rawtmpdev
	mount -u /tmp
	fsck -y >&/tmp/fsck-y.OUT

So you have some clue as to what got destroyed during the fsck.

> +		reboot
> +		echo "reboot failed... help!"

Have you actually ever seen this echo execute??

>  		exit 1
>  		;;
>  	12)
>
> Anyway I am proposing a method where FreeBSD can be configured through an
> rc.conf knob to be more friendly in an unattended situation. I propose
> as a first step that a knob called "unattended_operation" be added with
> a default value of "NO". Enabling this knob can be used to allow code
> like the above to be executed. It can also be used to force the sysctl
> variable "debug.debugger_on_panic" to 0 in the rc file.

You might want to get input from Julian and friends at Whistle; they
are experts at unattended operation...

>
> I can also contribute the waitkey.c program. It may even be useful for
> other stuff with some changes to the command syntax.

This can actually be done from /bin/sh using a background process:
just run a read from the console in the background, sleep for X, then
check the child's status. If the child is still alive, no keypress
occurred; kill the child.

>
> Does anybody have any strong opinions on this, either way?
> I have this running on my machine at present so I'm not too fussed
> either way, just thought it might be useful for other people as well.
> I can supply code and patches, but I would like somebody with commit
> privs to look over the code, make suggestions and eventually commit
> the work.

I'll strongly object to any automated running of an fsck -y; it is far
too dangerous for way too many folks.

--
Rod Grimes - KD7CAX - (RWG25)                rgrimes@gndrsh.dnsmgr.net


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
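
The /bin/sh approach described above (run a read from the console in
the background, sleep, check whether the child is still alive, kill it
if so) might look roughly like the following. This is only a sketch
under some assumptions, not code from the thread: the function name
waitkey_sh and its optional device argument are made up for
illustration. It keeps waitkey's convention of returning 0 when nothing
was pressed, but it returns 1 rather than the ASCII code of the key, a
plain read only completes on a full line (so the operator has to hit
Enter), and it always sleeps for the whole timeout.

```shell
#!/bin/sh
# waitkey_sh SECONDS [DEVICE]   (hypothetical helper, not the waitkey.c
# program from the thread)
# Returns 0 if no input line arrived on DEVICE within SECONDS,
# 1 if something was typed.
waitkey_sh() {
    wk_timeout=$1
    wk_dev=${2:-/dev/console}    # assumption: the console is readable here

    # Background reader: blocks until a line arrives on the device.
    # The explicit redirect matters, because sh otherwise gives
    # background jobs /dev/null as their standard input.
    ( read -r wk_key ) < "$wk_dev" &
    wk_reader=$!

    sleep "$wk_timeout"

    # kill -0 delivers no signal; it only tests whether the child exists.
    if kill -0 "$wk_reader" 2>/dev/null; then
        # Reader is still blocked, so nothing was typed: clean it up.
        kill "$wk_reader" 2>/dev/null
        wait "$wk_reader" 2>/dev/null || :
        return 0
    fi

    # Reader already finished, so a line was entered.
    wait "$wk_reader" 2>/dev/null || :
    return 1
}
```

With this in place, the "if waitkey 30; then" test in the quoted patch
could be written as "if waitkey_sh 30; then" without changing its
meaning, subject to the differences noted above.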