Date: Sat, 10 Mar 2001 00:15:01 +1000
From: "Doug Young" <dougy@gargoyle.apana.org.au>
To: "Ted Mittelstaedt" <tedm@toybox.placo.com>, "Denis J. Cirulis" <monster@okb.lv>, <freebsd-newbies@FreeBSD.ORG>
Subject: Re: About Unix
Message-ID: <001b01c0a8a3$5629f5e0$0200a8c0@apana.org.au>
References: <003d01c0a85c$7955d620$1401a8c0@tedm.placo.com>
Nothing is trivial to the user who doesn't know everything an expert
does, and particularly in the case of Linux, the documentation is so
poor that the chances of anyone stumbling across info on esoteric
subjects are virtually nil. Switching from Linux to FreeBSD solved most
of the problems here anyway.

None of our systems should need to be restarted from one upgrade until
the next; however, the reliability of the power supply in most parts of
OZ is comparable to what I've heard of California (fair to dreadful).
I know that we should have UPSes everywhere ... however with any
non-profit organization the budget is always tight. I've been doing my
best to institute changes aimed at improving the financial position, but
when you have a bunch of committees to contend with, that's another
exercise.

> However, this is just a default. It is trivial to change on both systems,
> and if you have Linux systems that are going to be in environments where
> they are going to be regularly restarting, then it's a lot easier to
> change the config to have the automounter mount the Linux filesystems
> synchronously.
>
> In any case, all of this is begging 2 very important questions:
>
> 1) Why don't you organize your systems to be resistant to this?
>
> 2) Why don't you correct the environment so the systems don't have to
> restart?
>
> As far as #1 is concerned, I manage a Usenet news server that is very busy.
> About once every 2-3 months it gets a SCSI bus error and reboots itself.
> After the first one of these I changed the system so that when it does
> reboot itself, there's not a problem.
>
> You see, the issue with uncontrolled shutdowns is this. If the partition
> (note partition, not filesystem) is quiescent during an uncontrolled
> shutdown, when it is fsck'd during reboot, there won't be any corruption -
> and fsck will mark it clean and remount it.
>
> This leads to an obvious solution - you arrange your filesystem mount
> points so that anything that is being written is NOT on a filesystem
> containing startup scripts (typically in /etc), or on a partition that's
> automounted.
>
> For example, with FreeBSD, the default mount points are to put /etc and / on
> the same partition. Fine - but /tmp is created on /, and /tmp is usually
> going to be in use during an uncontrolled shutdown.
>
> What I did with my news server is /var is on its own partition, and all
> logs in it are softlinked to another disk. /tmp and /usr/tmp are also
> softlinked to this disk. The filesystems on this disk are NOT automounted.
>
> If the system crashes and reboots itself, then /, /var, and /usr are all on
> partitions that are NEVER written to during normal operation, thus they are
> always quiescent, and they always come back up with no problem. I can then
> Telnet into the system and manually run fsck on the other disks. Granted,
> it's a nuisance because the log and temp directories are unavailable during
> this limited maintenance mode, but the system won't deny me access.
>
> Once the rest of the disks are clean, I mount them, then restart syslogd and
> the other programs that need to be started and away we go. No need to be
> physically at the system to do all this, nor is sync/async mounting an
> issue.
>
> Now, as far as #2 is concerned, with the exception of my news server, none
> of my other servers ever have uncontrolled shutdowns. This is because of
> several things. First, all servers have their own UPS's and are plugged
> into the sense port of the UPS, and if the UPS goes onto battery for too
> long, the server does a controlled shutdown. The servers and UPS's are also
> all on remote reboot switches. Secondly, if I find a flaky server I work
> with it until I fix it or scrap it. I tolerate the news server because I
> know that the problem is a software driver bug and I have not yet gotten
> time to rebuild it and fix the bug. (news servers typically take a long
> time to rebuild and tune)
>
>
> Ted Mittelstaedt                      tedm@toybox.placo.com
> Author of:          The FreeBSD Corporate Networker's Guide
> Book website:         http://www.freebsd-corp-net-guide.com
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-newbies" in the body of the message
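
The layout described in the quoted message (boot-time filesystems stay quiescent, everything that gets written lives on a disk that is not mounted at boot) can be checked mechanically. What follows is only a rough sketch, not anything taken from Ted's actual setup. It assumes that "NOT automounted" corresponds to the noauto option in /etc/fstab, and the device names, the /scratch mount point, and every function name are made up for illustration.

```python
# Sketch only: device names and the /scratch mount point are hypothetical,
# and "not automounted" is assumed to mean the fstab option "noauto".
SAMPLE_FSTAB = """\
# Device      Mountpoint  FStype  Options     Dump  Pass#
/dev/da0s1a   /           ufs     rw          1     1
/dev/da0s1e   /var        ufs     rw          2     2
/dev/da0s1f   /usr        ufs     rw          2     2
/dev/da1s1e   /scratch    ufs     rw,noauto   0     0
"""


def parse_fstab(text):
    """Return (mountpoint, options) pairs for each non-comment fstab line."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or malformed line
        entries.append((fields[1], fields[3].split(",")))
    return entries


def split_by_boot_mount(entries):
    """Separate mount points mounted at boot from those marked noauto."""
    boot = [mp for mp, opts in entries if "noauto" not in opts]
    deferred = [mp for mp, opts in entries if "noauto" in opts]
    return boot, deferred


def owning_mountpoint(path, mountpoints):
    """Return the longest mount point that contains an already-resolved path."""
    best = None
    for mp in mountpoints:
        prefix = mp.rstrip("/") + "/"
        if path == mp or path.startswith(prefix):
            if best is None or len(mp) > len(best):
                best = mp
    return best


def risky_write_paths(write_targets, entries):
    """List write targets that still land on a filesystem mounted at boot."""
    boot, _ = split_by_boot_mount(entries)
    all_mps = [mp for mp, _ in entries]
    return [p for p in write_targets if owning_mountpoint(p, all_mps) in boot]
```

On a real machine the write targets would first be passed through os.path.realpath so that softlinks such as /tmp are followed to the disk they actually point at. Any path the last function returns is one that could leave a boot-time filesystem dirty after an uncontrolled shutdown.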