Date:      Sat, 10 Mar 2001 00:15:01 +1000
From:      "Doug Young" <dougy@gargoyle.apana.org.au>
To:        "Ted Mittelstaedt" <tedm@toybox.placo.com>, "Denis J. Cirulis" <monster@okb.lv>, <freebsd-newbies@FreeBSD.ORG>
Subject:   Re: About Unix
Message-ID:  <001b01c0a8a3$5629f5e0$0200a8c0@apana.org.au>
References:  <003d01c0a85c$7955d620$1401a8c0@tedm.placo.com>

Nothing is trivial to the user who doesn't know everything an expert does,
and particularly in the case of Linux, the documentation is so poor that
the chances of anyone stumbling across info on esoteric subjects are
virtually nil. Switching from Linux to FreeBSD solved most of the problems
here anyway. None of our systems should need to be restarted from one
upgrade till the next; however, the reliability of the power supply in most
parts of OZ is comparable to what I've heard of California (fair to
dreadful). I know that we should have UPSes everywhere ... however, with
any non-profit organization the budget is always tight. I've been doing my
best to institute changes aimed at improving the financial position, but
when you have a bunch of committees to contend with, that's another
exercise.

> However, this is just a default.  It is trivial to change on both systems,
> and if you have Linux systems that are going to be in environments where
> they are going to be regularly restarting, then it's a lot easier to
> change the config to have the automounter mount the Linux filesystems
> synchronously.
>
> In any case, all of this is begging 2 very important questions:
>
> 1) Why don't you organize your systems to be resistant to this?
>
> 2) Why don't you correct the environment so the systems don't have to
> restart?
>
> As far as #1 is concerned, I manage a Usenet news server that is very
> busy.  About once every 2-3 months it gets a SCSI bus error and reboots
> itself.  After the first one of these I changed the system so that when
> it does reboot itself, there's not a problem.
>
> You see, the issue with uncontrolled shutdowns is this.  If the partition
> (note partition, not filesystem) is quiescent during an uncontrolled
> shutdown, when it is fsck'd during reboot, there won't be any corruption -
> and fsck will mark it clean and remount it.
>
> This leads to an obvious solution - you arrange your filesystem mount
> points so that anything that is being written is NOT on a filesystem
> containing startup scripts, (typically in /etc) or on a partition that's
> automounted.
>
> For example, with FreeBSD, the default mount points are to put /etc and /
> on the same partition.  Fine - but /tmp is created on /, and /tmp is
> usually going to be in use during an uncontrolled shutdown.
>
> What I did with my news server is /var is on its own partition, and all
> logs in it are softlinked to another disk.  /tmp and /usr/tmp are also
> softlinked to this disk.  The filesystems on this disk are NOT
> automounted.
>
> If the system crashes and reboots itself, then /, /var, and /usr are all
> on partitions that are NEVER written to during normal operation, thus
> they are always quiescent, and they always come back up with no problem.
> I can then Telnet into the system and manually run fsck on the other
> disks.  Granted, it's a nuisance because the log and temp directories are
> unavailable during this limited maintenance mode, but the system won't
> deny me access.
>
> Once the rest of the disks are clean, I mount them, then restart syslogd
> and the other programs that need to be started and away we go.  No need
> to be physically at the system to do all this, nor is sync/async mounting
> an issue.
>
> Now, as far as #2 is concerned, with the exception of my news server, none
> of my other servers ever have uncontrolled shutdowns.  This is because of
> several things.  First, all servers have their own UPS's and are plugged
> into the sense port of the UPS, and if the UPS goes onto battery for too
> long, the server does a controlled shutdown.  The servers and UPS's are
> also all on remote reboot switches.  Secondly, if I find a flaky server I
> work with it until I fix it or scrap it.  I tolerate the news server
> because I know that the problem is a software driver bug and I have not
> yet had time to rebuild it and fix the bug.  (News servers typically
> take a long time to rebuild and tune.)
>
>
> Ted Mittelstaedt                      tedm@toybox.placo.com
> Author of:          The FreeBSD Corporate Networker's Guide
> Book website:         http://www.freebsd-corp-net-guide.com
>
>
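
The layout Ted describes above (quiescent partitions mounted at boot, the
busy disk left out of the automatic mounts and checked by hand) can be
sketched roughly as below. This is only an illustration under stated
assumptions: the device names, the /scratch and /spool mount points, and
the helper function names are all made up for the example and are not
Ted's actual configuration. It relies on the standard "noauto" fstab
option to keep a filesystem from being mounted at boot.

```python
# Sketch: split fstab entries into boot-mounted and "noauto" filesystems,
# then list the manual recovery steps for the noauto ones.
# SAMPLE_FSTAB is hypothetical; devices and mount points are illustrative.

SAMPLE_FSTAB = """\
# Device        Mountpoint  FStype  Options     Dump  Pass#
/dev/da0s1a     /           ufs     rw          1     1
/dev/da0s1e     /var        ufs     rw          2     2
/dev/da0s1f     /usr        ufs     rw          2     2
/dev/da1s1e     /scratch    ufs     rw,noauto   0     0
/dev/da1s1f     /spool      ufs     rw,noauto   0     0
"""


def parse_fstab(text):
    """Return one dict per fstab entry, ignoring comments and blank lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete entry
        device, mountpoint, fstype, options = fields[:4]
        entries.append({
            "device": device,
            "mountpoint": mountpoint,
            "fstype": fstype,
            "options": options.split(","),
        })
    return entries


def split_by_automount(entries):
    """Separate entries mounted at boot from those marked noauto."""
    auto = [e for e in entries if "noauto" not in e["options"]]
    manual = [e for e in entries if "noauto" in e["options"]]
    return auto, manual


def recovery_commands(manual):
    """Commands to run by hand after a crash: fsck everything, then mount."""
    cmds = [f"fsck {e['device']}" for e in manual]
    cmds += [f"mount {e['mountpoint']}" for e in manual]
    return cmds


if __name__ == "__main__":
    auto, manual = split_by_automount(parse_fstab(SAMPLE_FSTAB))
    print("Mounted at boot:", ", ".join(e["mountpoint"] for e in auto))
    print("After a crash, run by hand:")
    for cmd in recovery_commands(manual):
        print("  " + cmd)
```

In this sketch the logs, /tmp and /usr/tmp would be softlinks pointing
into the noauto filesystems, so nothing that is mounted automatically is
being written to when the machine goes down. Restarting syslogd and the
other daemons afterwards is left out.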

