Date:      Tue, 4 Jun 2002 21:07:24 -0400
From:      "Robert Blayzor" <rblayzor@inoc.net>
To:        "'Matthew Dillon'" <dillon@apollo.backplane.com>, <freebsd-stable@FreeBSD.ORG>
Subject:   RE: RE: Swap_pager error
Message-ID:  <001701c20c2d$5a689460$080010ac@z0.inoc.net>
In-Reply-To: <200206041749.g54Hn5SL096100@apollo.backplane.com>

I looked through all the periodic daily stuff.  It doesn't seem that any
of the scripts will thrash NFS mounted partitions; almost everything I
saw only looks at UFS mounted partitions.

One thing I did notice is that the security check is quite brutal.  While
any server should survive it, I believe this is what is causing the
system to crash.  The security check seems to run a find on the NFS
server's local UFS mounts.  We have some very, very large volumes with
hundreds of thousands of small files (maildirs, boxes, webmail, web,
etc).  On this box, the security check takes 3-4 minutes to complete
with that find, and it totally saturates the box while it runs.

So, I think there may be a loading issue with all these files/inodes in
relation to the find process... Perhaps the SCSI or driver stuff in
FreeBSD.  If I can be of any help on this, I surely will lend a hand.  I
would like to see FreeBSD be able to survive this without a hitch.

A suggestion: perhaps lower the priority of the "find" tasks in those
scripts with nice or something.  I mean, the box was really bogged down
when we ran "periodic daily" manually.
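
To make the nice idea concrete, here is a minimal sketch, not the
actual code from the periodic scripts.  The function name
low_priority_scan is made up for illustration, and the find expression
(setuid/setgid files on one filesystem) is only an assumption about
what such a check looks for.  Note that nice lowers CPU scheduling
priority only, so it may not do much if the bottleneck is disk I/O.

```shell
# Hypothetical helper (not part of /etc/periodic): scan one filesystem
# for setuid/setgid files at the lowest CPU priority.
low_priority_scan() {
    # -xdev keeps find on the starting filesystem, so it does not
    # descend into other mounts (for example NFS) below it.
    # nice -n 19 lowers CPU priority only; disk I/O is not throttled.
    nice -n 19 find "$1" -xdev -type f \
        \( -perm -4000 -o -perm -2000 \) -print 2>/dev/null
}
```

It would be called with the mount point to scan, e.g.
low_priority_scan /usr.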

As a workaround, we moved periodic daily to run at 9:01am instead of
3:01am, and only on Monday - Friday.  We don't need any more weekend
surprise pages and then call-ins.  :-)
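
For reference, a sketch of what that schedule change looks like in
/etc/crontab, assuming the stock entry runs "periodic daily" as root at
3:01am.  The exact line on any given system may differ, so treat this
as an illustration rather than our literal file.

```
# minute  hour  mday  month  wday  who   command
# stock entry (assumed):  1  3  *  *  *  root  periodic daily
1         9     *     *      1-5   root  periodic daily
```

The wday field 1-5 restricts the job to Monday through Friday.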

Since this box is an internal server only with no accounts on it, and it
has no route to the outside and sits behind a firewall, we're going to
go ahead and disable the security check altogether.  I'm hoping that
this will provide a workaround for this "loading" issue.  If I can be
any help to the core team to debug this problem, I'll do my best to do
what I can.
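
A sketch of how the security check can be switched off, assuming the
daily security run is controlled by the daily_status_security_enable
knob.  Check /etc/defaults/periodic.conf on your release for the exact
variable name before relying on this.

```shell
# /etc/periodic.conf (read as sh variable assignments by periodic)
# Assumed knob name; verify against /etc/defaults/periodic.conf.
daily_status_security_enable="NO"
```

Overrides belong in /etc/periodic.conf rather than in the defaults
file, so they survive upgrades.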

--
Robert Blayzor, BOFH
INOC, LLC
rblayzor@inoc.net

One picture is worth 128K words.



> -----Original Message-----
> From: Matthew Dillon [mailto:dillon@apollo.backplane.com] 
> Sent: Tuesday, June 04, 2002 1:49 PM
> To: Robert Blayzor; freebsd-stable@FreeBSD.ORG
> Subject: Re: RE: Swap_pager error
> 
> 
>     I have one more idea... daily cron jobs tend to really load down the
>     system for a short period of time, especially the disks.  In your case
>     the local daily cron is combining with the daily cron running on the
>     NFS clients.  There could be a hardware problem with the system that
>     is most likely to show up under heavy loads.
> 
>     It is also possible that this is revealing a driver bug somewhere.
>     For example, the extreme disk load could be revealing a bug in the
>     driver's tag handling or in the RAID card's tag handling.  The lack
>     of driver-based error messages is rather odd.  I don't see how that
>     can happen unless the RAID card itself is locking up.
> 
> 						-Matt
> 
> ::
> ::Both times the box has crashed at ~3:02am.  I'm thinking that
> ::something in periodic daily is causing the crashes.
> ::
> ::Keep in mind, that this server serves several NFS clients which mount
> ::things such as FreeBSD ports and /usr/src.  Those are soft linked to on
> 


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message



