Date: Thu, 05 Apr 2007 13:03:20 -0400
From: "Simon" <simon@optinet.com>
To: "freebsd-hardware@freebsd.org" <freebsd-hardware@freebsd.org>, "Robin Vley" <viper@fx-services.com>
Subject: Re: SMP crashes / reboots 5.4 with CPanel
Message-ID: <20070405170522.A443A13C484@mx1.freebsd.org>
In-Reply-To: <4614F070.2000302@fx-services.com>

Get 6.2; 5.4 is old and is known to have kernel bugs that are fixed in 6.x.

-Simon

On Thu, 05 Apr 2007 14:49:52 +0200, Robin Vley wrote:

>Hi!
>I posted this to the FBSD-Questions mailing list, because I'm completely
>not sure if this is hardware or software. Last time I got some good
>pointers there, but since I'm 100% in the dark where this is coming
>from, I crosspost it here.
>For a couple of years already I've been trying to find out why our
>hosting machine reboots randomly. Got some tips, mostly about hardware.
>What happens is that both the main server and the backup server (which
>is just idling) just reboot. Sometimes after 60 days, sometimes after
>one day. No logs, no strange traffic patterns, nothing. I enabled kernel
>debugging. Caught a crashdump on our backup machine which I will post
>below. The process that crashes is the CPU monitor for Cpanel. I
>disabled that one, so it crashed on any other process (httpd, perl,
>etc). I tried disabling ACPI, rebuild world with just -O in make.conf,
>etc etc. This morning the main server rebooted again, it didn't even
>leave a dump in /var/crash. Hardware is not the same. This behaviour
>I've seen on dual athlons (two different mainboards) and dual Xeons. It
>seems related to SMP code. Played around with idle and hyperthreading
>settings in sysctl too. Nothing seems to make any difference at all. The
>crashdump is below, does anyone have ANY idea what might cause this?
>The machine is running on a SuperMicro Dual Xeon board (X5DPA-TMG+).
>Crashes happen on this board, but also on the Tyan MPX dual athlon
>systems. I think it has to be the cpanel hosting panel, but such an
>application shouldn't be able to crash the OS...
>Fatal trap 12: page fault while in kernel mode
>cpuid = 0; apic id = 01
>fault virtual address   = 0x98
>fault code              = supervisor write, page not present
>instruction pointer     = 0x20:0xc06b7f1e
>stack pointer           = 0x28:0xece5f730
>frame pointer           = 0x28:0xece5f774
>code segment            = base 0x0, limit 0xfffff, type 0x1b
>                        = DPL 0, pres 1, def32 1, gran 1
>processor eflags        = interrupt enabled, resume, IOPL = 0
>current process         = 69885 (dcpumon)
>trap number             = 12
>panic: page fault
>cpuid = 0
>Uptime: 2d22h1m13s
>Dumping 2047 MB (2 chunks)
>  chunk 0: 1MB (159 pages) ... ok
>  chunk 1: 2047MB (523904 pages) 2031 2015 1999 1983 1967 1951 1935 1919
>1903 1887 1871 1855 1839 1823 1807 1791 1775 1759 1743 1727 1711 1695
>1679 1663 1647 1631 1615 1599 1583 1567 1551 1535 1519 1503 1487 1471
>1455 1439 1423 1407 1391 1375 1359 1343 1327 1311 1295 1279 1263 1247
>1231 1215 1199 1183 1167 1151 1135 1119 1103 1087 1071 1055 1039 1023
>1007 991 975 959 943 927 911 895 879 863 847 831 815 799 783 767 751 735
>719 703 687 671 655 639 623 607 591 575 559 543 527 511 495 479 463 447
>431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159
>143 127 111 95 79 63 47 31 15
>#0  doadump () at pcpu.h:165
>165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
>(kgdb) backtrace
>#0  doadump () at pcpu.h:165
>#1  0xc063efca in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
>#2  0xc063f396 in panic (fmt=0xc0870bd4 "%s") at
>/usr/src/sys/kern/kern_shutdown.c:555
>#3  0xc082e16c in trap_fatal (frame=0xece5f6f0, eva=0) at
>/usr/src/sys/i386/i386/trap.c:831
>#4  0xc082de52 in trap_pfault (frame=0xece5f6f0, usermode=0, eva=152) at
>/usr/src/sys/i386/i386/trap.c:742
>#5  0xc082da02 in trap (frame=
>      {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = 4, tf_esi = 0, tf_ebp
>= -320473228, tf_isp = -320473316, tf_ebx = 4098, tf_edx = -1002850048,
>tf_ecx = 0, tf_eax = 4, tf_trapno = 12, tf_err = 2, tf_eip =
>-1066696930, tf_cs = 32, tf_eflags = 66118, tf_esp = -320473100, tf_ss =
>1017})
>    at /usr/src/sys/i386/i386/trap.c:432
>#6  0xc0817d0a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
>#7  0xc06b7f1e in vn_lock (vp=0x0, flags=4098, td=0xc439b900) at
>atomic.h:149
>#8  0xc05eee46 in procfs_doprocfile (td=0xc439b900, p=0xc9068830,
>pn=0xc35f3900, sb=0x4, uio=0x0) at /usr/src/sys/fs/procfs/procfs.c:73
>#9  0xc05f3f5b in pfs_readlink (va=0x4) at pcpu.h:162
>#10 0xc0841a13 in VOP_READLINK_APV (vop=0x4, a=0xc439b900) at
>vnode_if.c:1481
>#11 0xc06b14e3 in kern_readlink (td=0xc439b900, path=0xc439b900 "<j\006É
>x\006É", pathseg=3292117248, buf=0x4 <Address 0x4 out of bounds>, bufseg=4,
>    count=1024) at vnode_if.h:772
>#12 0xc06b13e8 in readlink (td=0x4, uap=0xc439b900) at
>/usr/src/sys/kern/vfs_syscalls.c:2261
>#13 0xc082e573 in syscall (frame=
>      {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 135512892, tf_esi =
>135663632, tf_ebp = -1077940936, tf_isp = -320471708, tf_ebx =
>674109588, tf_edx = -1077941960, tf_ecx = 0, tf_eax = 58, tf_trapno = 0,
>tf_err = 2, tf_eip = 672579140, tf_cs = 51, tf_eflags = 647, tf_esp =
>-1077942020, tf_ss = 59}) at /usr/src/sys/i386/i386/trap.c:976
>#14 0xc0817d5f in Xint0x80_syscall () at
>/usr/src/sys/i386/i386/exception.s:200
>#15 0x00000033 in ?? ()
>Previous frame inner to this frame (corrupt stack?)
>/Robin
>_______________________________________________
>freebsd-questions@freebsd.org mailing list
>http://lists.freebsd.org/mailman/listinfo/freebsd-questions
>To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"
>-- 
>Robin Vley
>F/X Services Managed Hosting
>http://www.fx-services.com
>_______________________________________________
>freebsd-hardware@freebsd.org mailing list
>http://lists.freebsd.org/mailman/listinfo/freebsd-hardware
>To unsubscribe, send any mail to "freebsd-hardware-unsubscribe@freebsd.org"
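
A note for anyone reading the quoted trap frame: kgdb prints the tf_* fields as signed 32-bit decimals, while the panic message prints the same registers in hex, so the two are hard to compare by eye. Below is a minimal Python sketch for lining them up. The helper name to_u32_hex is my own and not part of kgdb or FreeBSD; the input values are copied from the backtrace above.

```python
def to_u32_hex(value: int) -> str:
    """Reinterpret a (possibly negative) 32-bit decimal as unsigned hex.

    Hypothetical helper for illustration only; not a kgdb or FreeBSD tool.
    """
    return f"0x{value & 0xFFFFFFFF:08x}"


# Values copied from frames #4 and #5 of the quoted backtrace.
fields = {
    "tf_eip": -1066696930,  # saved instruction pointer
    "tf_ebp": -320473228,   # saved frame pointer
    "eva": 152,             # faulting virtual address passed to trap_pfault
}

for name, value in fields.items():
    print(f"{name:8s} {to_u32_hex(value)}")
```

With these inputs, tf_eip comes out as 0xc06b7f1e (the "instruction pointer" line and the vn_lock frame #7), tf_ebp as 0xece5f774 (the "frame pointer" line), and eva as 0x00000098 (the "fault virtual address" line). A write to address 0x98 with vp=0x0 in vn_lock would be consistent with a field being updated through a NULL vnode pointer, though the argument values kgdb shows in optimized frames are not always reliable.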