Date: Fri, 22 Oct 2004 16:43:23 +0200
From: "Tom Jensen" <tom@motd.dk>
To: "'Robert Watson'" <rwatson@freebsd.org>
Cc: freebsd-current@freebsd.org
Subject: RE: Machine hangs(Beta7), only reset button works
Message-ID: <20041022144231.C023D62EF@bart.motd.dk>
In-Reply-To: <Pine.NEB.3.96L.1041022045223.34569C-100000@fledge.watson.org>

The problem is seen when running my backup script from cron. The script
is executed six times at midnight, creating six .tgz files (of different
directories). After this is done, a mount_smbfs connection is made to a
Windows box, and the .tgz files are copied to it.

The script contains the following:

#!/bin/sh
export BACKUPDATE=$(date +DATE.%Y.%m.%d-TIME.%H.%M.%S)
export BACKUPPATH=/home/backup
export BACKUPFILE=BACKUP.$1.tgz
tar czfv $BACKUPPATH/$BACKUPFILE $2
export REMOTEPATH=/home/backup/remote
export REMOTEFILE=BACKUP.$1.$BACKUPDATE.tgz
mount_smbfs -N -I backup.motd.dk //backup/'sharenamehere$' /home/backup/remote/$1
cp $BACKUPPATH/$BACKUPFILE $REMOTEPATH/$1/$REMOTEFILE
umount $BACKUPPATH/remote/$1
exit $?

/etc/crontab:

# Run the backup every 24 hour
0 0 * * * root /home/backup-script.sh tom /home/tom

There is no NFS etc. involved, and all the directories are located on
the box. The largest of the directories is approx. 210 MB at the moment.

I'll recompile the kernel with DEBUG_LOCKS right away. I'm not sure what
you mean by "compile all modules in, as this produces a kernel that is
ABI-incompatible with most modules".

- Tom

-----Original Message-----
From: owner-freebsd-current@freebsd.org
[mailto:owner-freebsd-current@freebsd.org] On Behalf Of Robert Watson
Sent: 22 October 2004 10:55
To: Tom Jensen
Cc: freebsd-current@freebsd.org
Subject: RE: Machine hangs(Beta7), only reset button works

On Fri, 22 Oct 2004, Tom Jensen wrote:

> Ok, I managed to break into the debugger with a break over serial
> console, attached is some info from KDB (please note that I have no
> knowledge about using the debugger)
>
> I also noticed that I don't have "makeoptions DEBUG=-g" in my kernel
> conf so I have no debug kernel. I'll rebuild my kernel ASAP so I'm able
> to provide more info sometime tomorrow.
>
> Please let me know if there is more info needed.

Great, getting into the debugger is good news. Looking at the thread
wait states, it looks like you might have a vnode deadlock going on.
It would be useful if you could use the "show lockedvnods" command to
show what vnode locks are held. If you recompile your kernel with
DEBUG_LOCKS (and compile all modules in, as this produces a kernel that
is ABI-incompatible with most modules), you will get extra debugging
information when you use that command (it will say what locks were
acquired where, not just what locks are held).

It looks like you're using SMBfs; could you mention a little about how
that's in use, and whether things like home directories, etc., are
mounted that way? Is it an element of the system you could remove
during testing to see if the problem goes away? Also, are you using NFS
or other distributed file systems?

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research

>
> - Tom
>
> -----Original Message-----
> From: owner-freebsd-current@freebsd.org
> [mailto:owner-freebsd-current@freebsd.org] On Behalf Of Robert Watson
> Sent: 21 October 2004 13:46
> To: Tom Jensen
> Cc: freebsd-current@freebsd.org
> Subject: Re: Machine hangs(Beta7), only reset button works
>
>
> On Thu, 21 Oct 2004, Tom Jensen wrote:
>
> > I've been seeing a pretty strange problem lately with my server.
> >
> > The box completely freezes, typically when it's done running the
> > first part of my backup script, resulting in no possibility to log
> > in on the console or by SSH; the freeze even happens when I'm
> > sitting in a terminal and working.
> >
> > There is no indication in log files etc. about what's causing the
> > problem and it's not breaking into the debugger either :-(
>
> This should probably be in debugging lore somewhere, but I've observed
> that it's often possible to break into the debugger using a break over
> serial console when it's not possible to break in using syscons. This
> is because syscons requires the Giant lock, so if the freeze happens
> because a thread is spinning while holding Giant, you can't get in.
> This needs to be fixed, but hasn't yet been fixed, so in the meantime
> often useful advice is to use a serial console to generate the break.
>
> If you still can't get into the debugger, you might try some of the
> various watchdog drivers -- some hardware comes with built-in watchdog
> parts, such as ichwd(4), or you could try options MP_WATCHDOG on an
> SMP box if you're willing to dedicate a CPU to running as a watchdog
> for the other cpu(s).
>
> Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
> robert@fledge.watson.org      Principal Research Scientist, McAfee Research
>
> >
> > The backup script is really simple, creating a .tgz file of a given
> > directory, mounting a Windows share (mount_smbfs) and copying the
> > file. The script is run by cron six times (starting at the same
> > time) in six different directories; this results in the box freezing
> > after the tar processes finish.
> >
> > Attached is the dmesg.boot and the latest top, don't know if it's
> > any use but it seems rather strange that a lot of processes are in
> > a STATE usf (not sure what this means but I don't see this when the
> > box is running normally)
> >
> > The kernel is mostly a generic with the following modifications:
> >
> > options IPFIREWALL
> > options IPFIREWALL_VERBOSE
> > options IPFIREWALL_VERBOSE_LIMIT=400
> > options IPDIVERT
> > options IPSEC
> > options IPSEC_ESP
> > options IPSEC_DEBUG
> > device ath
> > device ath_hal
> > options KDB
> > options DDB
> >
> > bash-2.05b# uname -a
> > FreeBSD bart.motd.dk 5.3-BETA7 FreeBSD 5.3-BETA7 #6: Tue Oct 19 00:36:59
> > CEST 2004 root@bart.motd.dk:/usr/obj/usr/src/sys/GW i386
> >
> > Any more info needed please let me know.
> >
> > Best regards
> >
> > - Tom
> >
>
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
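
One way to act on the question of whether SMBfs can be removed during
testing is a variant of the backup script that can skip the SMB step and
that runs the six cron jobs one at a time instead of all at once. The
following is only a minimal sketch under stated assumptions, not
something posted in the thread: the LOCKDIR and SKIP_SMB variables and
all function names are hypothetical, and whether serializing the runs
has any effect on the hang is untested.

```shell
#!/bin/sh
# Sketch of a test variant of the backup script from the thread.
# Assumptions (not from the thread): LOCKDIR, SKIP_SMB, and the function names.

LOCKDIR=${LOCKDIR:-/tmp/backup-script.lock}

# mkdir either creates the directory or fails, atomically, so only one
# running copy of the script holds the lock at a time.
acquire_lock() {
    until mkdir "$LOCKDIR" 2>/dev/null; do
        sleep 5
    done
}

release_lock() {
    rmdir "$LOCKDIR"
}

# Same naming scheme as REMOTEFILE in the original script.
remote_name() {
    echo "BACKUP.$1.$(date +DATE.%Y.%m.%d-TIME.%H.%M.%S).tgz"
}

# run_backup NAME DIRECTORY
run_backup() {
    name=$1
    dir=$2
    backuppath=${BACKUPPATH:-/home/backup}
    archive=$backuppath/BACKUP.$name.tgz
    remote=$backuppath/remote/$name

    acquire_lock
    tar czf "$archive" "$dir"
    status=$?

    # SKIP_SMB=1 leaves out mount_smbfs, cp and umount entirely.
    if [ "$status" -eq 0 ] && [ "${SKIP_SMB:-0}" != 1 ]; then
        if mount_smbfs -N -I backup.motd.dk '//backup/sharenamehere$' "$remote"; then
            cp "$archive" "$remote/$(remote_name "$name")"
            status=$?
            umount "$remote" || status=$?
        else
            status=$?
        fi
    fi

    release_lock
    return "$status"
}

# Run only when called like the original: backup-script.sh NAME DIRECTORY
if [ "$#" -eq 2 ]; then
    run_backup "$1" "$2"
fi
```

With SKIP_SMB=1 set in the crontab entry, only the tar step runs, which
would show whether the hang still occurs without any SMB mount. Note that
a lock directory left behind by a crashed run has to be removed by hand,
otherwise later runs wait forever.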
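
On the question about "compile all modules in": one reading is that
DEBUG_LOCKS changes kernel data structures, so modules built without it
no longer match the kernel, and anything normally loaded as a module
(smbfs, for example) should instead be built statically into the kernel.
The fragment below is a sketch of the kernel configuration lines that
reading implies. DEBUG_LOCKS, "makeoptions DEBUG=-g", KDB and DDB come
from the thread; the SMBFS, NETSMB, LIBMCHAIN and LIBICONV lines are an
assumption about what smbfs needs when built in and should be checked
against the NOTES file for the release in use.

```
# From the thread
makeoptions DEBUG=-g        # build a kernel with debug symbols
options     KDB
options     DDB
options     DEBUG_LOCKS     # record where locks were acquired

# Assumption: build smbfs and its dependencies into the kernel
# instead of loading them as modules
options     NETSMB
options     LIBMCHAIN
options     LIBICONV
options     SMBFS
```

With a kernel built this way, "show lockedvnods" at the debugger prompt
should report the extra lock information described above.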