Date:      Thu, 05 Apr 2007 13:23:07 +0200
From:      viper@fx-services.com
To:        freebsd-questions@freebsd.org
Subject:   Random reboots 5.4 with CPanel
Message-ID:  <20070405132307.r71n36m3uogscc0s@www.fxs.se>

Hi guys,

For a couple of years now I've been trying to find out why our
hosting machine reboots randomly. I've posted about this on the list
before and got some tips, mostly about hardware. What happens is that
both the main server and the backup server (which is just idling)
simply reboot, sometimes after 60 days, sometimes after one day. No
logs, no strange traffic patterns, nothing. I enabled kernel debugging
and caught a crash dump on our backup machine, which I've posted
below. The process that crashed is the CPU monitor for cPanel
(dcpumon). I disabled that one, and then it crashed in other processes
instead (httpd, perl, etc.). I tried disabling ACPI, rebuilding world
with just -O in make.conf, and so on. This morning the main server
rebooted again, and it didn't even leave a dump in /var/crash. The
hardware is not the same across machines: I've seen this behaviour on
dual Athlons (two different mainboards) and dual Xeons. It seems
related to the SMP code. I also played around with the idle and
hyperthreading settings in sysctl. Nothing seems to make any
difference at all. The crash dump is below. Does anyone have ANY idea
what might cause this?

I think it has to be the cPanel hosting panel, but such an application
shouldn't be able to crash the OS...

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 01
fault virtual address   = 0x98
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc06b7f1e
stack pointer           = 0x28:0xece5f730
frame pointer           = 0x28:0xece5f774
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 69885 (dcpumon)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 2d22h1m13s
Dumping 2047 MB (2 chunks)
   chunk 0: 1MB (159 pages) ... ok
   chunk 1: 2047MB (523904 pages) 2031 2015 1999 1983 1967 1951 1935
1919 1903 1887 1871 1855 1839 1823 1807 1791 1775 1759 1743 1727 1711
1695 1679 1663 1647 1631 1615 1599 1583 1567 1551 1535 1519 1503 1487
1471 1455 1439 1423 1407 1391 1375 1359 1343 1327 1311 1295 1279 1263
1247 1231 1215 1199 1183 1167 1151 1135 1119 1103 1087 1071 1055 1039
1023 1007 991 975 959 943 927 911 895 879 863 847 831 815 799 783 767
751 735 719 703 687 671 655 639 623 607 591 575 559 543 527 511 495
479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223
207 191 175 159 143 127 111 95 79 63 47 31 15

#0  doadump () at pcpu.h:165
165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) backtrace
#0  doadump () at pcpu.h:165
#1  0xc063efca in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xc063f396 in panic (fmt=0xc0870bd4 "%s")
    at /usr/src/sys/kern/kern_shutdown.c:555
#3  0xc082e16c in trap_fatal (frame=0xece5f6f0, eva=0)
    at /usr/src/sys/i386/i386/trap.c:831
#4  0xc082de52 in trap_pfault (frame=0xece5f6f0, usermode=0, eva=152)
    at /usr/src/sys/i386/i386/trap.c:742
#5  0xc082da02 in trap (frame=
      {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = 4, tf_esi = 0,
       tf_ebp = -320473228, tf_isp = -320473316, tf_ebx = 4098,
       tf_edx = -1002850048, tf_ecx = 0, tf_eax = 4, tf_trapno = 12,
       tf_err = 2, tf_eip = -1066696930, tf_cs = 32, tf_eflags = 66118,
       tf_esp = -320473100, tf_ss = 1017})
    at /usr/src/sys/i386/i386/trap.c:432
#6  0xc0817d0a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc06b7f1e in vn_lock (vp=0x0, flags=4098, td=0xc439b900) at atomic.h:149
#8  0xc05eee46 in procfs_doprocfile (td=0xc439b900, p=0xc9068830,
    pn=0xc35f3900, sb=0x4, uio=0x0) at /usr/src/sys/fs/procfs/procfs.c:73
#9  0xc05f3f5b in pfs_readlink (va=0x4) at pcpu.h:162
#10 0xc0841a13 in VOP_READLINK_APV (vop=0x4, a=0xc439b900) at vnode_if.c:1481
#11 0xc06b14e3 in kern_readlink (td=0xc439b900, path=0xc439b900
    "<j\006É x\006É", pathseg=3292117248, buf=0x4 <Address 0x4 out of
    bounds>, bufseg=4, count=1024) at vnode_if.h:772
#12 0xc06b13e8 in readlink (td=0x4, uap=0xc439b900)
    at /usr/src/sys/kern/vfs_syscalls.c:2261
#13 0xc082e573 in syscall (frame=
      {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 135512892,
       tf_esi = 135663632, tf_ebp = -1077940936, tf_isp = -320471708,
       tf_ebx = 674109588, tf_edx = -1077941960, tf_ecx = 0, tf_eax = 58,
       tf_trapno = 0, tf_err = 2, tf_eip = 672579140, tf_cs = 51,
       tf_eflags = 647, tf_esp = -1077942020, tf_ss = 59})
    at /usr/src/sys/i386/i386/trap.c:976
#14 0xc0817d5f in Xint0x80_syscall ()
    at /usr/src/sys/i386/i386/exception.s:200
#15 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
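
For anyone cross-checking the numbers: kgdb prints the trap frame
registers in frame #5 as signed 32-bit integers, while the trap report
shows them in hex. The snippet below is only a rough sketch of that
conversion. The helper name as_u32_hex is made up for illustration, and
the input values are copied from frames #4 and #5 of the backtrace.

```python
# Sketch: reinterpret the signed 32-bit values kgdb shows in the trap
# frame as unsigned hex, so they can be compared with the trap report.
# as_u32_hex is a hypothetical helper, not part of any FreeBSD tool.

def as_u32_hex(value):
    """Reinterpret a signed 32-bit integer as unsigned and format as hex."""
    return hex(value & 0xFFFFFFFF)

# Values copied from frame #5 of the backtrace.
frame5 = {
    "tf_eip": -1066696930,
    "tf_ebp": -320473228,
}

decoded = {name: as_u32_hex(value) for name, value in frame5.items()}

# eva=152 from frame #4 (trap_pfault), shown in decimal by kgdb.
fault_address = hex(152)

for name, text in decoded.items():
    print(name, text)
print("eva", fault_address)
```

This gives tf_eip = 0xc06b7f1e and tf_ebp = 0xece5f774, matching the
instruction pointer and frame pointer in the trap report, and eva = 0x98,
matching the fault virtual address. With vp=0x0 shown for vn_lock in
frame #7, a write fault at 0x98 would be consistent with a write through
a NULL vnode pointer at some small offset, although the arguments kgdb
shows can be unreliable.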

/Robin


