Date: Fri, 10 Oct 2008 10:40:01 +0200
From: Laszlo Nagy <gandalf@shopzeus.com>
To: freebsd-questions@freebsd.org
Subject: 7.1 hangs, shutdown terminated
Message-ID: <48EF14E1.9080808@shopzeus.com>
Hi,

A computer hangs every day in the morning at a specific time, between 8 AM and 9 AM. We can ping it. Apparently the console works, and gdm works on it too, but we are not able to log in at all. ssh accepts connections, but the authentication does not continue (i.e. the ssh client waits for the server forever...). I cannot even log in on the console as "root": it accepts the user name, but does not ask for the password!

After I press Ctrl+Alt+Del on the console, it waits for about one or two minutes, then I see this on the screen:

http://www.imghype.com/viewer.php?imgdata=9d95ee9d1fstrange_shutdown.jpg

Here is /var/log/messages just before the crash:

Oct 10 01:52:47 shopzeus postgres[81114]: [5-1] WARNING: nonstandard use of escape in a string literal at character 193
Oct 10 01:52:47 shopzeus postgres[81114]: [5-2] HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
Oct 10 01:57:11 shopzeus postgres[84132]: [5-1] WARNING: nonstandard use of escape in a string literal at character 188
Oct 10 01:57:11 shopzeus postgres[84132]: [5-2] HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
Oct 10 02:00:01 shopzeus postfix/postfix-script[86167]: fatal: the Postfix mail system is already running
Oct 10 02:30:00 shopzeus postfix/postfix-script[7240]: fatal: the Postfix mail system is already running
Oct 10 03:00:00 shopzeus postfix/postfix-script[27437]: fatal: the Postfix mail system is already running
Oct 10 04:07:54 shopzeus rc.shutdown: 30 second watchdog timeout expired. Shutdown terminated.
Oct 10 04:09:16 shopzeus postgres[30455]: [5-1] FATAL: terminating connection due to administrator command
Oct 10 04:09:17 shopzeus syslogd: exiting on signal 15
Oct 10 04:11:31 shopzeus syslogd: kernel boot file is /boot/kernel/kernel
Oct 10 04:11:31 shopzeus kernel: Copyright (c) 1992-2008 The FreeBSD Project.
Oct 10 04:11:31 shopzeus kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994

After rebooting the machine, nothing happens until the next day. Here are some possible problems I can think of:

#1. We are using gjournal. It might be that the journal size is too small, although I do not think this is the case, because we have 40 GB of journal space for each journaled partition below (except for /home, which has only 10 GB, but /home is rarely used).

Filesystem           1G-blocks  Used  Avail  Capacity  Mounted on
/dev/da0s1a                  9     1      7       14%  /
devfs                        0     0      0      100%  /dev
/dev/da0s1f.journal        140    12    117        9%  /home
/dev/da0s2d.journal        106     8     89        8%  /pgdata0
/dev/da0s1d                 29     0     26        0%  /tmp
/dev/da0s2e.journal        585    74    464       14%  /usr
/dev/da0s1e.journal        145    17    116       13%  /var
/dev/da1s1d.journal        416     0    383        0%  /data

Is it possible that gjournal is hanging the machine?

#2. Yesterday, when I logged in in the morning, I saw a process running as root. It was something like "find / -sx ..." followed by more arguments. I don't remember the rest, but it was scanning the whole filesystem and using 100% CPU and 100% disk I/O. I wonder if that might be freezing the computer. I do not know how to disable this maintenance process, but I should. After I killed this process, the system worked fine. (We have zillions of files on the disks, so running "find / ..." is a bad idea.)

#3. In the screenshot above, you can see that the IMAP server "dovecot" was terminated on signal 11. Can that be the problem? I can't believe that dovecot could freeze the whole system.

#4. Hardware error. I don't think this is the case: the computer freezes at the same time every day, so a software problem is more likely.

Any thoughts on what is causing this?

uname -a:

FreeBSD shopzeus.com 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #3: Mon Oct 6 07:50:31 EDT 2008 gandalf@shopzeus.com:/usr/obj/usr/src/sys/SHOPZEUS amd64

Thank you,

Laszlo
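
On item #2: the "find / -sx ..." process described above resembles the setuid and negative-permission scans that the daily periodic(8) security run performs on FreeBSD. Assuming that is what was running (the post does not confirm it), a minimal sketch of switching those scans off in /etc/periodic.conf might look like the fragment below. The variable names are the ones listed in /etc/defaults/periodic.conf on FreeBSD 7.x and should be checked against the installed copy before use.

```shell
# /etc/periodic.conf -- local overrides for /etc/defaults/periodic.conf
# Assumption: the heavy "find -sx" process comes from the daily security
# scripts 100.chksetuid and 110.neggrpperm; this is not confirmed in the post.

# Skip the filesystem-wide scan for setuid/setgid files.
daily_status_security_chksetuid_enable="NO"

# Skip the filesystem-wide scan for negative group permissions.
daily_status_security_neggrpperm_enable="NO"
```

The daily periodic run is started from /etc/crontab, so the time listed there can be compared with the time of the hang to see whether the two actually coincide.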