Date:      Tue, 04 Feb 2014 09:50:51 +0100
From:      msch@snafu.de
To:        freebsd-stable@freebsd.org
Subject:   Re: Stack overflow with kernel r254683
Message-ID:  <98238cf30dfc3f61ea0491c9cbb4c7161192157a@mein.snafu.de>

This mail apparently went out encoded the first time, so I am sending it
once more, verified as 'plain text', from my private e-mail account.
**************************************************************************

Hello,

I finally managed to upgrade and test my server last weekend.

Good news: so far, kernel r261208 (FreeBSD 9.2-STABLE) runs without
problems.

I could not apply the patch you supplied, but I saw that the code had
been modified anyway, so I gave it a try :-)

It seems that the problem has been solved.

Thank you very much! :-)

With best regards,
Matthias Schuendehuette

> -----Original Message-----
> From: Rick Macklem [mailto:rmacklem@uoguelph.ca]
> Sent: Sunday, 19 January 2014 03:19
> To: Schuendehuette, Matthias
> Cc: Konstantin Belousov
> Subject: Re: Stack overflow with kernel r254683
>
> I just found a bug that causes a stack overflow in the file handle
> affinity code done by ken@. It occurs for an NFSv2 client mounting
> a server, where sizeof(fhandle_t) < 32.
>
> I've attached the patch that fixes this, in case you can test it?
>
> Since your stack trace looks completely different, I won't guess if
> this was the bug, but this bug definitely trashed the stack.
>
> rick
>
> ----- Original Message -----
> > On Mon, Aug 26, 2013 at 07:11:48PM -0400, Rick Macklem wrote:
> > > Matthias Schuendehuette wrote:
> > > > Hello,
> > > >
> > > > yesterday I got a kernel crash on my server (a ProLiant DL380 G5):
> > > >
> > > > "panic: stack overflow detected; backtrace may be corrupted"
> > > >
> > > > Kernel is "9.2-PRERELEASE FreeBSD 9.2-PRERELEASE #7 r254683"
> > > >
> > > >
> > > > The stack trace reads:
> > > >
> > > > #0 doadump (textdump=1) at pcpu.h:249
> > > > 249 pcpu.h: No such file or directory.
> > > >     in pcpu.h
> > > > (kgdb) #0 doadump (textdump=1) at pcpu.h:249
> > > > #1 0xc0668a4d in kern_reboot (howto=260)
> > > >     at /usr/src/sys/kern/kern_shutdown.c:449
> > > > #2 0xc0668f07 in panic (fmt=0x104 )
> > > >     at /usr/src/sys/kern/kern_shutdown.c:637
> > > > #3 0xc0691da2 in __stack_chk_fail ()
> > > >     at /usr/src/sys/kern/stack_protector.c:17
> > > > #4 0xc7fdb175 in nfsrvd_setattr (nd=0xc73b4400, isdgram=-952596480,
> > > >     vp=0xc8001140, p=0xf405ecc8, exp=0xc07af7f0)
> > > >     at /usr/src/sys/modules/nfsd/../../fs/nfsserver/nfs_nfsdserv.c:371
> > > > #5 0xc7fdb6e0 in nfsrvd_releaselckown (nd=0xc7442a00, isdgram=-952596480,
> > > >     vp=0xc7388848, p=0xf405ecb8, exp=0x0)
> > > >     at /usr/src/sys/modules/nfsd/../../fs/nfsserver/nfs_nfsdserv.c:3481
> > > > #6 0xc07af7f0 in svc_run_internal (pool=0xc7de8b80, ismaster=0)
> > > >     at /usr/src/sys/rpc/svc.c:1109
> > > > #7 0xc07b006d in svc_thread_start (arg=0xc7de8b80)
> > > >     at /usr/src/sys/rpc/svc.c:1200
> > > > #8 0xc06384f7 in fork_exit (callout=0xc07b0060 ,
> > > >     arg=0xc7de8b80, frame=0xf405ed08)
> > > >     at /usr/src/sys/kern/kern_fork.c:992
> > > > #9 0xc08787c4 in fork_trampoline ()
> > > >     at /usr/src/sys/i386/i386/exception.s:279
> > > >
> > > Well, when I've looked on i386, the nfsd threads normally don't use
> > > 1 page and the stacks are 2 pages, so I doubt an nfsd thread is
> > > blowing the stack.
> > It is overflowing the frame, not the whole stack. In other words,
> > something overwrote the canary which was put on the stack between
> > local variables and the return address, possibly corrupting the
> > return address as well.
> >
> > > Also, nfsrvd_releaselckown() doesn't call nfsrvd_setattr(), so the
> > > backtrace doesn't make much sense.
> > Yes, this might be one of the consequences of the stack smashing.
> >
> > >
> > >
> > > Afraid I can't help more than this. Good luck with it, rick
> > >
> > > >
> > > > I have all the files in /var/crash, so if someone wants additional
> > > > information, I should be able to deliver it.
> > > >
> > > > The kernel config file is customized in the sense that I have
> > > > removed kernel items that aren't used on that machine.
> > > >
> > > > One major difference: I use
> > > >
> > > > < options NFSCLIENT # Network Filesystem Client
> > > > < options NFSSERVER # Network Filesystem Server
> > > >
> > > > instead of
> > > >
> > > > > options NFSCL # New Network Filesystem Client
> > > > > options NFSD # New Network Filesystem Server
> > > >
> > > > because a kernel from a few weeks ago immediately crashed with the
> > > > new NFS code.
> > > >
> > > > But it now seems that the old NFS code is also somehow damaged.
> > > >
> > > > Ah, and I still have the following loader options from older
> > > > releases of FreeBSD - do they still make sense?
> > > >
> > > > geom_vinum_load="YES"
> > > > kern.maxdsiz="734003200"
> > > > vm.pmap.shpgperproc=256
> > > > vm.pmap.pv_entry_max=3145728
> > > >
> > > >
> > > > 'geom_vinum' is used as LVM only, no RAIDs are configured.
> > > >
> > > > This server is primarily a Samba server with the SMB shares
> > > > exported as NFS shares as well for the other *nix servers around.
> > > >
> > > > Because this is the most loaded production server, testing is a
> > > > bit difficult, restricted to the evenings and the weekends.
> > > >
> > > > On my two other FreeBSD machines I have no problems at all; one
> > > > of them is an identical ProLiant server with a nearly identical
> > > > kernel config - runs like a charm...
> > > >
> > > > Does anyone have good advice or further questions?
> > > >
> > > >
> > > >
> > > > with best regards
> > > > Matthias Schuendehuette
> > > >
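
The frame overwrite discussed in this thread (a canary sitting between the local variables and the return address, and a 32-byte NFSv2 file handle copied into an fhandle_t smaller than 32 bytes) can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code and not the patch mentioned above: the "frame" is a plain byte array so that the overwrite stays well-defined, and LOCAL_FH_SIZE, CANARY_VALUE, the placeholder return address, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NFSV2_FH_SIZE 32u          /* NFSv2 file handles are a fixed 32 bytes */
#define LOCAL_FH_SIZE 28u          /* hypothetical sizeof(fhandle_t) below 32 */
#define CANARY_VALUE  0xdeadc0deu  /* hypothetical guard value */

/* Toy frame layout: [ local file handle | canary | return address ] */
#define LOCAL_OFF   0u
#define CANARY_OFF  (LOCAL_OFF + LOCAL_FH_SIZE)
#define RETADDR_OFF (CANARY_OFF + sizeof(uint32_t))
#define FRAME_SIZE  (RETADDR_OFF + sizeof(uint32_t))

/* Zero the local buffer and place the canary and a placeholder return address. */
static void frame_init(unsigned char *frame)
{
    const uint32_t canary = CANARY_VALUE;
    const uint32_t retaddr = 0x12345678u; /* arbitrary placeholder */

    memset(frame, 0, FRAME_SIZE);
    memcpy(frame + CANARY_OFF, &canary, sizeof(canary));
    memcpy(frame + RETADDR_OFF, &retaddr, sizeof(retaddr));
}

/* Roughly what a stack-protector epilogue checks before returning. */
static int frame_canary_ok(const unsigned char *frame)
{
    uint32_t canary;

    memcpy(&canary, frame + CANARY_OFF, sizeof(canary));
    return canary == CANARY_VALUE;
}

/* Buggy pattern: always copies the full 32-byte wire handle into the
 * smaller local buffer, so the last 4 bytes land on the canary. */
static void copy_fh_unchecked(unsigned char *frame, const unsigned char *wire_fh)
{
    /* The model array is large enough, so this write stays inside it. */
    assert(FRAME_SIZE >= LOCAL_OFF + NFSV2_FH_SIZE);
    memcpy(frame + LOCAL_OFF, wire_fh, NFSV2_FH_SIZE);
}

/* Bounded pattern: never copies more than the local buffer can hold.
 * Returns the number of bytes actually copied. */
static size_t copy_fh_bounded(unsigned char *frame, const unsigned char *wire_fh,
                              size_t wire_len)
{
    size_t n = wire_len < LOCAL_FH_SIZE ? wire_len : LOCAL_FH_SIZE;

    memcpy(frame + LOCAL_OFF, wire_fh, n);
    return n;
}
```

To try it, declare `unsigned char frame[FRAME_SIZE]`, call `frame_init()`, and fill a 32-byte wire handle with any nonzero pattern. After `copy_fh_unchecked()`, `frame_canary_ok()` reports the canary as damaged, which is the condition that leads to `__stack_chk_fail()` in the backtrace above. After `copy_fh_bounded()` on a freshly initialized frame, the canary is intact.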



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?98238cf30dfc3f61ea0491c9cbb4c7161192157a>