From: Alexander Motin
Date: Tue, 18 Oct 2011 18:40:17 +0300
To: Alexey Shuvaev
Cc: freebsd-current@freebsd.org
Subject: Re: Panics after AHCI timeouts
Message-ID: <4E9D9DE1.8060501@FreeBSD.org>
References: <20111008201456.GA3529@lexx.ifp.tuwien.ac.at> <20111017190027.GA9873@lexx.ifp.tuwien.ac.at>
In-Reply-To: <20111017190027.GA9873@lexx.ifp.tuwien.ac.at>

Hi.

Alexey Shuvaev wrote:
> On Sat, Oct 08, 2011 at 10:14:56PM +0200, Alexey Shuvaev wrote:
> Errr... Replying to myself... Ping? Should I file a PR and put it
> in the back burner? :)

Sorry for not replying, I wasn't home to look at it closely.

>> In the view of upcoming RELEASE-9.0 I should have reported it earlier,
>> but it is better later than never... Every time I wanted to report
>> this, the system was ~one month old and I tried to upgrade it
>> to see, if the problem was still there, waiting for the next panic...
>> and when it finally paniced it was one month old again.
>>
> [snip]
>> From core.txt.5:
>> [snip]
>> Unread portion of the kernel message buffer:
>> Memory modified after free 0xfffffe000416e200(248) val=79e8800 @ 0xfffffe000416e200
>> panic: Most recently used by cred
>>
>> cpuid = 2
>> Uptime: 20h11m1s
>> Dumping 1308 out of 7914 MB:..2%..12%..21%..31%..41%..51%..62%..71%..81%..91%
>> [snip]
>> #0 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:252
>> 252 if (textdump && textdump_pending) {
>> (kgdb) #0 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:252
>> #1 0xffffffff808234aa in kern_reboot (howto=260)
>>     at /usr/src/sys/kern/kern_shutdown.c:430
>> #2 0xffffffff80822f41 in panic (fmt=Variable "fmt" is not available.
>> )
>>     at /usr/src/sys/kern/kern_shutdown.c:595
>> #3 0xffffffff80a6f7b4 in mtrash_ctor (mem=Variable "mem" is not available.
>> ) at /usr/src/sys/vm/uma_dbg.c:137
>> #4 0xffffffff80a6f01c in uma_zalloc_arg (zone=0xfffffe021ffe0700, udata=0x0,
>>     flags=258) at /usr/src/sys/vm/uma_core.c:2018
>> #5 0xffffffff808108be in malloc (size=Variable "size" is not available.
>> ) at uma.h:305
>> #6 0xffffffff8081c21f in crget () at /usr/src/sys/kern/kern_prot.c:1809
>> #7 0xffffffff8081c269 in crdup (cr=0xfffffe0143103300)
>>     at /usr/src/sys/kern/kern_prot.c:1911
>> #8 0xffffffff808c5ca6 in kern_accessat (td=0xfffffe0007dd7000, fd=-100,
>>     path=0x80065c000,
>>     pathseg=UIO_USERSPACE, flags=Variable "flags" is not available.
>> ) at /usr/src/sys/kern/vfs_syscalls.c:2201
>> #9 0xffffffff8086719a in syscallenter (td=0xfffffe0007dd7000,
>>     sa=0xffffff8223f67bb0) at /usr/src/sys/kern/subr_trap.c:344
>> #10 0xffffffff80b0b43c in syscall (frame=0xffffff8223f67c50)
>>     at /usr/src/sys/amd64/amd64/trap.c:910
>> #11 0xffffffff80af617d in Xfast_syscall ()
>>     at /usr/src/sys/amd64/amd64/exception.S:384
>> #12 0x000000080062dbdc in ?? ()
>> Previous frame inner to this frame (corrupt stack?)
>> [snip]
>> [last message in dmesg]
>> ahcich0: Timeout on slot 29 port 0
>> ahcich0: is 00000000 cs 00000000 ss ffffffff rs ffffffff tfd 40 serr 00000000 cmd 0000fc17
>> [snip]

Looking at your two backtraces, I don't see anything in common between
them. The first crash happened within a timer event handler, but it was
not an AHCI-related event. The second crash happened inside an unrelated
syscall. I can suppose that some memory corruption caused both, but I
have no idea what it is or how it could be related to AHCI. I could just
as well say that some other hardware problem causes both. Try to collect
more statistics.

-- 
Alexander Motin
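
For context on the "Memory modified after free" line in the quoted panic: the backtrace shows it being raised from mtrash_ctor in uma_dbg.c during a malloc call. As far as I understand that kind of debug check, freed memory is filled with a known pattern, and the pattern is verified when the memory is handed out again, so a mismatch means something wrote to the buffer while it was free. Below is a minimal user-space sketch of that idea, not the kernel's actual code. The names trash_fill, trash_check and simulate_stray_write are hypothetical, and the pattern value is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill pattern written into freed memory (assumed value, for illustration). */
#define JUNK_PATTERN 0xdeadc0deU

/* "Free" side: overwrite the whole buffer with the junk pattern. */
static void trash_fill(void *mem, size_t size)
{
    uint32_t *p = mem;

    assert(mem != NULL);
    for (size_t i = 0; i < size / sizeof(*p); i++)
        p[i] = JUNK_PATTERN;
}

/*
 * "Allocate" side: verify the pattern is still intact.
 * Returns the index of the first modified 32-bit word, or -1 if the
 * buffer was not touched while it was free.
 */
static long trash_check(const void *mem, size_t size)
{
    const uint32_t *p = mem;

    for (size_t i = 0; i < size / sizeof(*p); i++)
        if (p[i] != JUNK_PATTERN)
            return (long)i;
    return -1;
}

/*
 * Simulate the situation from the report: a 248-byte buffer is freed,
 * then something writes 0x79e8800 into its first word before reuse.
 * Returns the index of the modified word found by the check.
 */
static long simulate_stray_write(void)
{
    uint32_t buf[248 / sizeof(uint32_t)];

    trash_fill(buf, sizeof(buf));
    if (trash_check(buf, sizeof(buf)) != -1)
        return -2;               /* pattern should be intact here */
    buf[0] = 0x79e8800;          /* stray write after "free" */
    return trash_check(buf, sizeof(buf));
}
```

In this sketch simulate_stray_write() reports word 0 as modified, which mirrors the report where the bad value sits at the very start of the buffer. The check only tells you that a write happened and where, not who did it, which is why such a panic can surface in an unrelated code path like the syscall above.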