From: Konstantin Belousov
Date: Wed, 27 Nov 2013 22:00:50 +0200
To: Don Lewis
Cc: freebsd-current@FreeBSD.org
Subject: Re: panic: double fault with 11.0-CURRENT r258504
Message-ID: <20131127200050.GE59496@kib.kiev.ua>
References: <20131127192008.GD59496@kib.kiev.ua> <201311271935.rARJZJfC042361@gw.catspoiler.org>
In-Reply-To: <201311271935.rARJZJfC042361@gw.catspoiler.org>

On Wed, Nov 27, 2013 at 11:35:19AM -0800, Don Lewis wrote:
> On 27 Nov, Konstantin Belousov wrote:
> > On Wed, Nov 27, 2013 at 11:02:57AM -0800, Don Lewis wrote:
> >> On 27 Nov, Konstantin Belousov wrote:
> >> > On Wed, Nov 27, 2013 at 10:33:30AM -0800, Don Lewis wrote:
> >> >> On 27 Nov, Konstantin Belousov wrote:
> >> >> > On Wed, Nov 27, 2013 at 09:41:36AM -0800, Don Lewis wrote:
> >> >> >> On 27 Nov, Konstantin Belousov wrote:
> >> >> >> > On Wed, Nov 27, 2013 at 02:49:12AM -0800, Don Lewis wrote:
> >> >> >> >>
> >> >> >> >
> >> >> >> > What is the instruction at cpu_switch+0x9b?
> >> >> >>
> >> >> >> movl 0x8(%edx),%eax
> >> >> > So it is line 176 in swtch.s. Is the machine still in ddb, or did you
> >> >> > obtain the core? If yes, please print out the content of the words at
> >> >> > 0xe4f62bb0 + 4, +8 (*), +16. Please print the content of the word at
> >> >> > address (*) + 8.
> >> >>
> >> >> It is still in ddb.
> >> >>
> >> >> , though not in
> >> >> the above order.
> >> > Uhm, sorry, I mistyped the last part of the instructions.
> >> >
> >> > The new thread pointer is 0xd2f4e000, there is nothing incriminating.
> >> > Please print the word at 0xd2f4e000+0x254 == 0xd2f4e254, which would be
> >> > the address of the new thread pcb. It is the load from pcb + 8 that
> >> > faults.
> >>
> >> 0xf3d44d60
> > Again, the pointer looks fine, and its tail is 0xd60, which is correct for
> > the pcb offset in the last page of the thread stack.
> >
> > Please do 'show thread 0xd2f4e000' before trying the instructions below.
>
> Ok, see below:
>
> > What happens if you try to read the word at 0xf3d44d68?
>
> Nothing bad ...
>
> >

So the thread structure looks sane, the stack region is where it is
supposed to be, and all the gathered data looks self-consistent. And the
access to the faulting address from ddb does not fault.

Thread stacks can only be invalidated when the process is swapped out
and the kernel stack is written to swap. Your thread flags indicate that
it is in memory, and TDF_CANSWAP is not set. I do not believe that our
swapout code would invalidate the stack mapping in such a situation;
otherwise we would have had many complaints already.

Just in case, do you use swap on this box?

And, as a last resort (I do understand that this sounds like giving up),
do you monitor the temperature of the CPUs? BTW, which CPUs are these?
Please show the CPU identification lines from the boot dmesg.