From: msch@snafu.de
To: freebsd-stable@freebsd.org
Subject: Re: Stack overflow with kernel r254683
Date: Tue, 04 Feb 2014 09:50:51 +0100
Message-Id: <98238cf30dfc3f61ea0491c9cbb4c7161192157a@mein.snafu.de>
List-Id: Production branch of FreeBSD source code

It seems that this mail has been sent encoded...
Here I try once more, verified as 'plain text', from my private e-mail account.
**************************************************************************

Hello,

I finally managed to upgrade and test my server last weekend.

There is good news: so far kernel r261208 (FreeBSD 9.2-STABLE) runs
without problems.

I could not apply the patch you supplied, but I saw that the code had been
modified nonetheless, so I gave it a try :-)

It seems that the problem has been solved.

Thank you very much! :-)

With best regards,
Matthias Schuendehuette

> -----Original Message-----
> From: Rick Macklem [mailto:rmacklem@uoguelph.ca]
> Sent: Sunday, 19 January 2014 03:19
> To: Schuendehuette, Matthias
> Cc: Konstantin Belousov
> Subject: Re: Stack overflow with kernel r254683
>
> I just found a bug that causes a stack overflow in the file handle
> affinity code done by ken@.
> It occurs for an NFSv2 client mounting
> a server, where sizeof(fhandle_t) < 32.
>
> I've attached the patch that fixes this, in case you can test it?
>
> Since your stack trace looks completely different, I won't guess if
> this was the bug, but this bug definitely trashed the stack.
>
> rick
>
> ----- Original Message -----
> > On Mon, Aug 26, 2013 at 07:11:48PM -0400, Rick Macklem wrote:
> > > Matthias Schuendehuette wrote:
> > > > Hello,
> > > >
> > > > yesterday I got a kernel crash on my server (a ProLiant DL380
> > > > G5):
> > > >
> > > > "panic: stack overflow detected; backtrace may be corrupted"
> > > >
> > > > Kernel is "9.2-PRERELEASE FreeBSD 9.2-PRERELEASE #7 r254683"
> > > >
> > > >
> > > > The stack trace reads:
> > > >
> > > > #0 doadump (textdump=1) at pcpu.h:249
> > > > 249 pcpu.h: No such file or directory.
> > > >     in pcpu.h
> > > > (kgdb) #0 doadump (textdump=1) at pcpu.h:249
> > > > #1 0xc0668a4d in kern_reboot (howto=260)
> > > >     at /usr/src/sys/kern/kern_shutdown.c:449
> > > > #2 0xc0668f07 in panic (fmt=0x104 )
> > > >     at /usr/src/sys/kern/kern_shutdown.c:637
> > > > #3 0xc0691da2 in __stack_chk_fail ()
> > > >     at /usr/src/sys/kern/stack_protector.c:17
> > > > #4 0xc7fdb175 in nfsrvd_setattr (nd=0xc73b4400, isdgram=-952596480,
> > > >     vp=0xc8001140, p=0xf405ecc8, exp=0xc07af7f0)
> > > >     at /usr/src/sys/modules/nfsd/../../fs/nfsserver/nfs_nfsdserv.c:371
> > > > #5 0xc7fdb6e0 in nfsrvd_releaselckown (nd=0xc7442a00, isdgram=-952596480,
> > > >     vp=0xc7388848, p=0xf405ecb8, exp=0x0)
> > > >     at /usr/src/sys/modules/nfsd/../../fs/nfsserver/nfs_nfsdserv.c:3481
> > > > #6 0xc07af7f0 in svc_run_internal (pool=0xc7de8b80, ismaster=0)
> > > >     at /usr/src/sys/rpc/svc.c:1109
> > > > #7 0xc07b006d in svc_thread_start (arg=0xc7de8b80)
> > > >     at /usr/src/sys/rpc/svc.c:1200
> > > > #8 0xc06384f7 in fork_exit (callout=0xc07b0060,
> > > >     arg=0xc7de8b80, frame=0xf405ed08)
> > > >     at /usr/src/sys/kern/kern_fork.c:992
> > > > #9 0xc08787c4 in fork_trampoline ()
> > > >     at /usr/src/sys/i386/i386/exception.s:279
> > > >
> > > Well, when I've looked on i386, the nfsd threads normally don't use
> > > 1 page and the stacks are 2 pages, so I doubt an nfsd thread is
> > > blowing the stack.
> > It is overflowing the frame, not the whole stack. In other words,
> > something overwrote the canary which was put on the stack between
> > local variables and the return address, possibly corrupting the
> > return address as well.
> >
> > > Also, nfsrvd_releaselckown() doesn't call nfsrvd_setattr(), so the
> > > backtrace doesn't make much sense.
> > Yes, this might be one of the consequences of the stack smashing.
> >
> > >
> > > Afraid I can't help more than this.
> > > Good luck with it, rick
> > >
> > > >
> > > > I have all the files in /var/crash, so if someone wants
> > > > additional information, I should be able to deliver it.
> > > >
> > > > The kernel config file is customized in the sense that I have
> > > > removed kernel items that aren't used on that machine.
> > > >
> > > > One major difference: I use
> > > >
> > > > < options NFSCLIENT # Network Filesystem Client
> > > > < options NFSSERVER # Network Filesystem Server
> > > >
> > > > instead of
> > > >
> > > > > options NFSCL # New Network Filesystem Client
> > > > > options NFSD # New Network Filesystem Server
> > > >
> > > > because a kernel a few weeks ago immediately crashed with the new
> > > > NFS-code.
> > > >
> > > > But it seems now that the old NFS-code is also somehow damaged.
> > > >
> > > > Ah, and I still have the following loader options from older
> > > > releases of FreeBSD - do they still make sense?
> > > >
> > > > geom_vinum_load="YES"
> > > > kern.maxdsiz="734003200"
> > > > vm.pmap.shpgperproc=256
> > > > vm.pmap.pv_entry_max=3145728
> > > >
> > > >
> > > > 'geom_vinum' is used as LVM only, no RAIDs are configured.
> > > >
> > > > This server is primarily a Samba server with the SMB-shares
> > > > exported as NFS-shares as well for the other *nix-servers around.
> > > >
> > > > Because this is the most loaded production server, testing is a
> > > > bit difficult, restricted to the evenings and the weekends.
> > > >
> > > > On my two other FreeBSD machines I have no problems at all; one
> > > > of them is an identical ProLiant server with a nearly identical
> > > > kernel config - runs like a charm...
> > > >
> > > > Does anyone have good advice or further questions?
> > > >
> > > >
> > > >
> > > > with best regards
> > > > Matthias Schuendehuette
> > > >
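
The distinction drawn in the quoted thread, overflowing a frame rather than the whole stack, can be illustrated with a toy model. The Python sketch below is not the FreeBSD code and not the patch mentioned above. It only simulates a frame as a byte array laid out as locals, canary, return address, and shows how a copy that assumes a 32-byte NFSv2 file handle overwrites the canary when the destination buffer is smaller than 32 bytes. The function names, the 28-byte buffer size, the fixed 4-byte canary and the placeholder return address are all assumptions made for illustration.

```python
# Toy model of a stack frame guarded by a canary. Everything here is
# illustrative; none of it is taken from the FreeBSD sources.

NFSV2_FH_SIZE = 32                          # NFSv2 file handles are 32 bytes on the wire
CANARY = bytes.fromhex("deadbeef")          # fixed for reproducibility (real canaries are random)
RETURN_ADDRESS = bytes.fromhex("f0f77ac0")  # arbitrary placeholder value


def make_frame(local_size):
    """Lay out a frame as [local buffer | canary | return address]."""
    return bytearray(local_size) + CANARY + RETURN_ADDRESS


def unchecked_copy(frame, data):
    """Copy data to the start of the frame without checking the buffer size."""
    frame[:len(data)] = data


def canary_intact(frame, local_size):
    """Return True if the bytes right after the local buffer still match the canary."""
    return bytes(frame[local_size:local_size + len(CANARY)]) == CANARY


def run_copy(local_size, copy_len):
    """Build a frame, copy copy_len bytes into its local buffer, report canary state."""
    frame = make_frame(local_size)
    unchecked_copy(frame, b"A" * copy_len)
    return canary_intact(frame, local_size)


if __name__ == "__main__":
    # Buffer large enough for a 32-byte handle: the canary survives.
    print("32-byte buffer:", run_copy(32, NFSV2_FH_SIZE))
    # Hypothetical 28-byte buffer: the last 4 bytes land on the canary.
    print("28-byte buffer:", run_copy(28, NFSV2_FH_SIZE))
```

In this model the check corresponds to what `__stack_chk_fail` in the backtrace reports: the guard value between the locals and the return address no longer matches, even though the copy stayed well inside the thread's stack pages.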