From owner-freebsd-hackers@freebsd.org Sun Aug 7 15:07:04 2016
From: Mateusz Piotrowski <0mp@FreeBSD.org>
Date: Sun, 7 Aug 2016 17:06:56 +0200
To: freebsd-hackers@freebsd.org
Subject: Cannot login in after running freebsd-update install
Message-Id: <225F271C-004E-4A9B-B9EA-0CEA29961EE6@FreeBSD.org>
List-Id: Technical Discussions relating to FreeBSD

Hello,

I just ran `freebsd-update fetch && freebsd-update install`. Everything went
rather smoothly except for one error message, which I ignored (I guess it was

    cap_mkdb: potential reference loop detected

but I am not sure).

Unfortunately, afterwards I was unable to use `login`, `su` or `sudo`.
I was getting log messages like

    login_getclass: 'tc=' reference loop 'root'
    pam_acct_mgmt: error in service module

I rebooted into single-user mode (since I couldn't log in when I
started in multiuser mode). I tried to run

    cap_mkdb /etc/login.conf

but I got "cap_mkdb: potential reference loop detected" every time.

Apart from that, I ran `mergemaster` and `freebsd-update rollback`, but that
didn't solve the issue.

Finally, I fixed the issue by removing ":tc=default:" from the default login
class in /etc/login.conf and running `cap_mkdb /etc/login.conf` afterwards.

Why did `freebsd-update install` break /etc/login.conf? Is there anything I
did wrong or could have done better?
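The "reference loop" in these messages means that a chain of tc= capabilities
leads back to a class that was already visited; a default class containing
":tc=default:" refers to itself. A minimal sketch of such a check follows,
assuming a simplified login.conf syntax (backslash continuations and
":"-separated capabilities). It only illustrates the idea and is not how
cap_mkdb itself is implemented; parse_classes and find_tc_loop are
hypothetical helper names, and the sample records are made up.

```python
import re


def parse_classes(text):
    """Map each class name to the list of classes it references via tc=."""
    # Join backslash-continued lines into single logical records.
    joined = re.sub(r"\\\n\s*", "", text)
    classes = {}
    for line in joined.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = [f.strip() for f in line.split(":")]
        names = fields[0].split("|")  # a record may carry aliases: name|alias
        targets = [f[3:] for f in fields[1:] if f.startswith("tc=")]
        for name in names:
            classes[name] = targets
    return classes


def find_tc_loop(classes, start):
    """Return the chain of class names forming a loop, or None."""
    chain = []

    def visit(name):
        if name in chain:
            return chain[chain.index(name):] + [name]
        chain.append(name)
        for target in classes.get(name, []):
            loop = visit(target)
            if loop:
                return loop
        chain.pop()
        return None

    return visit(start)


# Made-up sample: the default class refers to itself.
BROKEN = "\n".join([
    "default:\\",
    "\t:passwd_format=sha512:\\",
    "\t:tc=default:",
    "root:\\",
    "\t:ignorenologin:\\",
    "\t:tc=default:",
])

# Same file with ":tc=default:" removed from the default class.
FIXED = "\n".join([
    "default:\\",
    "\t:passwd_format=sha512:",
    "root:\\",
    "\t:ignorenologin:\\",
    "\t:tc=default:",
])

print(find_tc_loop(parse_classes(BROKEN), "root"))  # ['default', 'default']
print(find_tc_loop(parse_classes(FIXED), "root"))   # None
```

Running a check like this against a copy of /etc/login.conf before calling
cap_mkdb would show which class closes the loop.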
Cheers,

-mateusz

From owner-freebsd-hackers@freebsd.org Mon Aug 8 14:55:08 2016
From: Christian Mauderer
Date: Mon, 8 Aug 2016 16:54:56 +0200
To: Kristof Provost
Cc: "freebsd-hackers@freebsd.org"
Subject: Re: Changes to pfctl to allow easier integration into a library
Message-ID: <4571f890-6d35-b843-9bd8-86966fe515f5@embedded-brains.de>
In-Reply-To: <336150f6-9dcd-873f-1f8f-a264dfa4c4ed@embedded-brains.de>
References: <25df9fd5-be75-b9ae-aa3a-22abef3bddf0@embedded-brains.de>
 <0C7EC45D-C3BC-4417-AF77-3ACC027D28B5@FreeBSD.org>
 <336150f6-9dcd-873f-1f8f-a264dfa4c4ed@embedded-brains.de>

On 02.08.2016 at 07:36, Christian Mauderer wrote:
>
> On 01.08.2016 at 20:36, Kristof Provost wrote:
>> On 1 Aug 2016, at 16:32, Christian Mauderer wrote:
>>
>>     On 01.08.2016 at 16:02, Kristof Provost wrote:
>>
>>         On 1 Aug 2016, at 15:03, Christian Mauderer wrote:
>>
>>             Can I improve anything to make the patches more acceptable?
>>
>>         Can you explain why
>>         0003-pfctl-Pull-static-variables-out-of-the-function.patch is
>>         required?
>>         I’m not sure I see why you need it.
>>
>>     I use roughly the following method for the global variables:
>>     - I put all initialized (zero or value) variables into a special named
>>       linker section.
>>     - In a wrapper around main() I do the following:
>>       o First save the content of the section to a temporary memory space
>>       o Execute the original (mostly unchanged) main()
>>       o After main() finishes, I restore the content of the section
>>     To simplify a later update to a newer source version, the difference
>>     between the sources we use and the original FreeBSD sources should be
>>     minimal. Therefore my attempt to put the variables into a section is
>>     the following:
>>     I create a header file (i.e. pfctl-data.h) that contains a matching
>>     declaration of the global variables but with an added gcc attribute
>>     __attribute__((__section__("my_section_name"))). This header file is
>>     included at the end of the original pfctl.c.
>>
>> Oh.
>> Ick.
>> Clever, but … ick.
>>
>> I’m not a big fan of this patch, because it makes the code a bit harder
>> to follow, rather than improving things as most of your other patches do.
>> I suspect that something similar can be accomplished with a bit of
>> linker trickery.
>>
>> A first idea is to insert two new symbols (e.g. pf_begin/pf_end) in two
>> separate files, the first before all of the pfctl object files, the
>> second after them.
>> This should let you group the .data section of the pfctl globals,
>> accomplishing what you do here with the __attribute__ attribute.
>>
>> Regards,
>> Kristof
>>
>
> Hello Kristof,
>
> I agree that my solution is not optimal and that this specific patch
> does not really improve the code for you. So I'll try to find alternatives.
>
> The method you suggested would not work for us. We are using part of the
> FreeBSD sources as a library that is statically linked with the rest of
> the system. Using our build process, the order of the object files does
> not guarantee an order of the symbols. As far as I know a fixed order
> can only be achieved by the section names.
>
> Theoretically it could be possible to get a similar result with some
> object file manipulation (renaming sections, ...). I'll check if I'm
> able to use such a solution instead and report back as soon as I can
> tell you more.
>
> Kind regards,
>
> Christian Mauderer
>

Hello Kristof,

just the promised report: I'm quite convinced that a solution using
object file manipulation is possible. It only needs some additional work
(some adaptation of our build system) and therefore some time. But we are
working toward it. So please just ignore patch 0003.

I noticed that you have already imported the other patches into FreeBSD.
Thanks for that.

Kind regards

Christian Mauderer
-- 
--------------------------------------------
embedded brains GmbH
Christian Mauderer
Dornierstr. 4
D-82178 Puchheim
Germany
email: christian.mauderer@embedded-brains.de
Phone: +49-89-18 94 741 - 18
Fax:   +49-89-18 94 741 - 08
PGP: Public key available on request.

This message is not a business communication within the meaning of the EHUG.
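The save / run / restore wrapper quoted in this thread can be illustrated
without linker sections. The following is a loose analogue in Python, not the
C mechanism under discussion: a module-level dictionary stands in for the
variables that would be placed in the named section, and
run_with_restored_state (a hypothetical name, as are tool_main and the keys
in STATE) snapshots it before calling the tool's main() and restores it
afterwards.

```python
import copy

# Stand-in for the tool's global state (the variables that would live in
# the named linker section in the C version). The keys are made up.
STATE = {"verbose": 0, "last_arg": "", "seen": []}


def tool_main(argv):
    """A pretend command-line tool that leaves its globals modified."""
    STATE["verbose"] += 1
    STATE["last_arg"] = argv[-1]
    STATE["seen"].append(argv[-1])
    return 0


def run_with_restored_state(main, argv, state):
    """Snapshot `state`, run `main(argv)`, then restore the snapshot."""
    saved = copy.deepcopy(state)      # 1. save the "section" contents
    try:
        return main(argv)             # 2. run the mostly unchanged main()
    finally:
        state.clear()                 # 3. restore, even if main() raised
        state.update(saved)


# Each call starts from the same initial state, as a fresh process would.
run_with_restored_state(tool_main, ["tool", "-v", "rules.conf"], STATE)
run_with_restored_state(tool_main, ["tool", "-v", "other.conf"], STATE)
print(STATE)
```

The point of the design is that main() itself stays untouched; only the
wrapper knows that the state has to be reset between invocations.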
From owner-freebsd-hackers@freebsd.org Tue Aug 9 06:12:57 2016
From: Sebastian Huber
Date: Tue, 9 Aug 2016 08:12:51 +0200
To: Christian Mauderer, Kristof Provost
Cc: "freebsd-hackers@freebsd.org"
Subject: Re: Changes to pfctl to allow easier integration into a library
Message-ID: <57A97463.9000801@embedded-brains.de>
In-Reply-To: <4571f890-6d35-b843-9bd8-86966fe515f5@embedded-brains.de>
References: <25df9fd5-be75-b9ae-aa3a-22abef3bddf0@embedded-brains.de>
 <0C7EC45D-C3BC-4417-AF77-3ACC027D28B5@FreeBSD.org>
 <336150f6-9dcd-873f-1f8f-a264dfa4c4ed@embedded-brains.de>
 <4571f890-6d35-b843-9bd8-86966fe515f5@embedded-brains.de>

On 08/08/16 16:54, Christian Mauderer wrote:
> [...]
> Hello Kristof,
>
> just the promised report: I'm quite convinced that a solution using
> object file manipulation is possible. It only needs some additional work
> (some adaption to our build system) and therefore needs some time. But
> we work toward it. So just ignore the patch 0003.

I think on PowerPC the object file manipulations could turn out to be
rather difficult, since we would also have to change the relocation
types. On PowerPC we use the small-data area, so if you have in your
source code a global variable like

    int g;

then the compiler generates small-data area relocations. However, if you
use

    int g __attribute__((__section__("x")));

then the compiler generates normal relocations. If you simply rename the
sections in an object file, e.g. from ".sdata" to ".x", then the
relocations cannot be resolved or are invalid at run-time. I think we
have to add the section attribute to all global variable declarations, at
least in the RTEMS port of the FreeBSD code.

-- 
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.huber@embedded-brains.de
PGP     : Public key available on request.

This message is not a business communication within the meaning of the EHUG.
From owner-freebsd-hackers@freebsd.org Tue Aug 9 09:17:34 2016
From: Anthony Pankov
Date: Tue, 9 Aug 2016 11:32:48 +0300
To: FreeBSD Hackers
Subject: Root fs overflow. Updater tools must warn?
Message-ID: <747363738.20160809113248@mail.ru>

Greetings.

Recently I had a very bad experience with a root filesystem (UFS) overflow.

I was preparing to install a port and wanted to update the ports tree
(ports was on the root fs). I estimated the free space at 1 GB+.
Then I ran

    portsnap fetch
    portsnap extract

After some time the server stopped responding. On reboot I saw the
fatal error "can't write, filesystem is full", followed by the kernel
messages "can't allocate .. soft_.. dep ... system going to reboot".

Rebooting in single-user mode showed the following:
- fsck produces the error;
- make clean in ports produces the error;
- removing a snapshot produces the error;
- removing a large file produces the error.

The same happened when booting from a live CD. It was very sad.

Of course, I think this is my fault. But tools such as portsnap and
freebsd-update always run as root, so the system cannot hold back reserved
filesystem space as it does for an unprivileged user.

Maybe tools that run as root and consume filesystem space should estimate
the available space first and stop if there is not enough?
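The check suggested in the last paragraph can be sketched as follows. This
only illustrates the idea and is not something portsnap or freebsd-update is
known to do; has_room, the 10% headroom and the 1 GiB estimate are
assumptions. The sketch uses os.statvfs, whose f_bavail field counts the
blocks available to unprivileged users, so a tool run as root that honours
it stays out of the filesystem's reserve.

```python
import os


def free_bytes_unprivileged(path):
    """Bytes available to non-root users on the filesystem holding `path`."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def has_room(path, needed_bytes, headroom=0.10):
    """True if `needed_bytes` plus a safety margin fits without touching
    the space reserved for root."""
    required = int(needed_bytes * (1.0 + headroom))
    return free_bytes_unprivileged(path) >= required


if __name__ == "__main__":
    target = "/"                # the filesystem that would receive the files
    estimate = 1 * 1024**3      # assumed size of the extracted tree (1 GiB)
    if has_room(target, estimate):
        print("enough space, proceeding")
    else:
        print("refusing to run: not enough free space on", target)
```

A wrapper script could run such a check and only then call the real tool;
the hard part in practice is a trustworthy size estimate, which this sketch
simply assumes.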
-- 
Best regards,
Anthony Pankov
mailto:ap00@mail.ru

From owner-freebsd-hackers@freebsd.org Tue Aug 9 19:42:09 2016
From: Adrian Chadd
Date: Tue, 9 Aug 2016 12:42:07 -0700
To: Sebastian Huber
Cc: Christian Mauderer, Kristof Provost, "freebsd-hackers@freebsd.org"
Subject: Re: Changes to pfctl to allow easier integration into a library
In-Reply-To: <57A97463.9000801@embedded-brains.de>
References: <25df9fd5-be75-b9ae-aa3a-22abef3bddf0@embedded-brains.de>
 <0C7EC45D-C3BC-4417-AF77-3ACC027D28B5@FreeBSD.org>
 <336150f6-9dcd-873f-1f8f-a264dfa4c4ed@embedded-brains.de>
 <4571f890-6d35-b843-9bd8-86966fe515f5@embedded-brains.de>
 <57A97463.9000801@embedded-brains.de>

[snip]

hi,

I'm looking at this, because I'm thinking of what it'd take to run
some of this stuff in a daemon and keep it persistent, versus having to
fork/exec commands in a memory-constrained environment.

Some static vars are effectively consts and are consts. Those may just
be committed as-is to make things clearer.

Some are actual state, which means I can't run this in a
multi-threaded environment..
:)

-adrian

From owner-freebsd-hackers@freebsd.org Wed Aug 10 05:52:37 2016
From: Christian Mauderer
Date: Wed, 10 Aug 2016 07:52:23 +0200
To: Adrian Chadd
Cc: "freebsd-hackers@freebsd.org"
Subject: Re: Changes to pfctl to allow easier integration into a library
Message-ID: <23714335-7f8c-63b6-4f9c-747ceba954d7@embedded-brains.de>
References: <25df9fd5-be75-b9ae-aa3a-22abef3bddf0@embedded-brains.de>
 <0C7EC45D-C3BC-4417-AF77-3ACC027D28B5@FreeBSD.org>
 <336150f6-9dcd-873f-1f8f-a264dfa4c4ed@embedded-brains.de>
 <4571f890-6d35-b843-9bd8-86966fe515f5@embedded-brains.de>
 <57A97463.9000801@embedded-brains.de>

On 09.08.2016 at 21:42, Adrian Chadd wrote:
> [snip]
>
> hi,
>
> I'm looking at this, because I'm thinking of what it'd take to make
> some of this stuff in a daemon and be persistent, versus having to
> fork/exec commands in a memory constrained environment.
>
> Some static vars are effectively consts and are consts. Those may just
> be committed as-is to make things clearer.
>
> Some are actual state, which means I can't run this in a
> multi-threaded environment.. :)
>
>
>
> -adrian
>

Hello Adrian,

I'm quite sure there are more use cases where a library would be
preferable to a stand-alone tool. It could also simplify calls from
Perl or Python scripts.

As I already mentioned in my first mail, Sebastian had a discussion in
this direction in 2013:

https://lists.freebsd.org/pipermail/freebsd-hackers/2013-October/043553.html

If it is of any use to you, you can take a look at the method we use here:

https://git.rtems.org/rtems-libbsd/

Please note that this work is currently based on FreeBSD 9.3. Some of
the changes (mainly in the user space tools) make it hard to update to
the current version. I hoped that the approach I used on pfctl would
bring us closer to the FreeBSD sources. Some of the other tools (like
route, which was one of the earlier tools to be ported) have some
extensive modifications. We plan to reduce the modifications in the future.

Most of the tools use a mutex to prevent parallel execution of multiple
instances of the tool. But it could be really tricky to run multiple
instances of a configuration tool like pfctl (or ifconfig, or ...) at
the same time anyway. So this is not really a restriction.

Kind regards

Christian
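The mutex arrangement mentioned in this message, where only one instance of
a library-hosted tool runs at a time, might look roughly like the following.
It is a Python sketch of the idea and not the rtems-libbsd code; serialized,
tool_main and the counters are hypothetical names.

```python
import threading


def serialized(main):
    """Wrap a tool entry point so only one call runs at a time."""
    lock = threading.Lock()

    def wrapper(argv):
        with lock:                    # a second caller blocks here
            return main(argv)

    return wrapper


active = 0        # number of calls currently inside tool_main
max_active = 0    # highest value `active` ever reached


def tool_main(argv):
    """Stand-in for a tool's main() that relies on unshared global state."""
    global active, max_active
    active += 1
    max_active = max(max_active, active)
    threading.Event().wait(0.01)      # pretend to do some work
    active -= 1
    return 0


safe_main = serialized(tool_main)
threads = [threading.Thread(target=safe_main, args=(["tool"],))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(max_active)  # 1: the four calls never overlapped
```

Serializing at the entry point keeps the tool's own code unchanged, at the
cost of making concurrent callers wait for each other.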
From owner-freebsd-hackers@freebsd.org Wed Aug 10 13:00:52 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 232A1BB43AB for ; Wed, 10 Aug 2016 13:00:52 +0000 (UTC) (envelope-from afiskon@devzen.ru) Received: from relay11.nicmail.ru (relay11.nicmail.ru [195.208.3.7]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CF4D01673 for ; Wed, 10 Aug 2016 13:00:50 +0000 (UTC) (envelope-from afiskon@devzen.ru) Received: from [109.70.25.187] (port=51838 helo=e733) by f06.mail.nic.ru with esmtp (Exim 5.55) (envelope-from ) id 1bXT7o-0004cv-01 for freebsd-hackers@freebsd.org; Wed, 10 Aug 2016 16:00:48 +0300 Received: from [93.174.131.138] (account afiskon@devzen.ru HELO e733) by proxy08.mail.nic.ru (Exim 5.55) with id 1bXT7l-00054w-VA for freebsd-hackers@freebsd.org; Wed, 10 Aug 2016 16:00:46 +0300 Date: Wed, 10 Aug 2016 15:59:35 +0300 From: Aleksander Alekseev To: freebsd-hackers@freebsd.org Subject: A few noob question regarding system calls on x86/x64 Message-ID: <20160810155935.608a908a@e733> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2016 13:00:52 -0000 Hello. Currently I'm exploring how system calls work in FreeBSD. For now I'm interested only in x86 and x64 architectures and not considering Linux ABI support. I've found a few good articles [1][2], however I would like to clarify something: * In general: 'int 80h' on x86, 'syscall' on x64 - right? 
* Do I right understand that there is no such thing as system calls through 'sysenter' in FreeBSD? * I've found a mentioning of vDSO in FreeBSD but there is little information on this topic. There is currently only one vDSO implementation in FreeBSD, not two like in Linux (vdso/vsyscall)? Where can I find more details on vDSO implementation? [1] https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/x86-system-calls.html [2] https://thebrownnotebook.wordpress.com/2009/10/27/native-64-bit-hello-world-with-nasm-on-freebsd/ -- Best regards, Aleksander Alekseev From owner-freebsd-hackers@freebsd.org Wed Aug 10 14:05:21 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 12963BB4BEF for ; Wed, 10 Aug 2016 14:05:21 +0000 (UTC) (envelope-from hoomanfazaeli@gmail.com) Received: from mail-wm0-x242.google.com (mail-wm0-x242.google.com [IPv6:2a00:1450:400c:c09::242]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 903F21782 for ; Wed, 10 Aug 2016 14:05:20 +0000 (UTC) (envelope-from hoomanfazaeli@gmail.com) Received: by mail-wm0-x242.google.com with SMTP id q128so9749147wma.1 for ; Wed, 10 Aug 2016 07:05:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:subject :content-transfer-encoding; bh=BQb45WPu05JKKJPmKMj2YoG3s5cu+2l13BfO9O55908=; b=TdqNIVM3T9GFVBUbsdr+2+6sYEa9AP/kt1z44KoEo9AXbnwk4UVok8qAp0a+ULm7x3 dHIgp5rIP4Bl6sNHavh7QvDzESyIVr9Q1MLo3RqNUeCfDUnDQT6D/0wuJo2U1Ha9RWlx UHnu7xea4J4vGkELyUPel8xAqWEu9UJH03c8tP3SxPfnL2RmDvqxs++XK0kZWroCtHqa G9791aTpsQCyZk/2xzjyO4QVjWmEL3hjMZNqafn8IPDfMRffhucXl6AMtLe4VeSp/gRe 
rOvgwsFxZdX2grxNzs6ob0dglQFJmEhMzRYnmbBKNnV8I3UQCOTOuB4PqG8a251Wow3/ ENuA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :subject:content-transfer-encoding; bh=BQb45WPu05JKKJPmKMj2YoG3s5cu+2l13BfO9O55908=; b=OyyJg0nSGRr0wdqM+K1u3bOs8xpyGjx8iE934uZ4ofX63YQw2+IXPondJdLmvBjDQr IT7ul2MIiu0qa1ARc1hovu5hIkTmJc7ydTeynNvlgdFY8ICH1NGqNPATA855k9Z1KUOg iR6jFGgLI6NDzG3WYUQLquylWBRedd48JbDJR6P07DQPZtRPwUENK6mLbgxH+rw4UZyI WFf2QHBnf1Pt01LUfu/b022wrkzSjEL+zz8EQfReDqzpSEw1gtxqI3E1DRiDf1CYQUA0 U+plvTLPubrko9sYVNFgjWe909Hr3kuXJvIbbdGa4w5UVziUGLZH4UJGRCPULLlbE2HU Ta1A== X-Gm-Message-State: AEkoouuSWiTpB8N21wgknpm8Scuj6JLT081gudnG4xn2SbHGOKLlpzZ/PzUQRoMoBLLgag== X-Received: by 10.28.47.199 with SMTP id v190mr3591757wmv.28.1470837918771; Wed, 10 Aug 2016 07:05:18 -0700 (PDT) Received: from [192.168.2.30] ([2.190.216.101]) by smtp.googlemail.com with ESMTPSA id x133sm8584236wmf.16.2016.08.10.07.05.16 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 10 Aug 2016 07:05:18 -0700 (PDT) Message-ID: <57AB349B.2010805@gmail.com> Date: Wed, 10 Aug 2016 18:35:15 +0430 From: Hooman Fazaeli User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130215 Thunderbird/17.0.3 MIME-Version: 1.0 To: FreeBSD Hackers Subject: 9.3-RELEASE panic: spin lock held too long Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2016 14:05:21 -0000 Hi on a 9.3-REL i386 box we have occasional "spin lock held too long" panics. 
System info: ------------- - Intel(R) Core(TM) i5-4440 CPU @ 3.10GHz CPU (4 cores, no hyper-threading) - 4G non-ECC RAM - asterisk-1.8.30.0 from ports - dahdi-kmod26-2.6.1.r10738 from ports - powerd disabled. - Workload: ISDN & SIP call processing. ------------ The panics are either on 'sched lock' or 'turnstile lock' spin locks. PANIC 1 ======= As the trace below shows: 1- input arrives on a UDP socket 2- doselwakeup is called. 3- That wakeup call ends up in sched_add. 4- sched_add grabs the 'sched lock 0' spin lock and, apparently, holds it for too long. 5- The panicking thread makes the same calls as the owner thread but panics because it can't grab the same spin lock. > kgdb /boot/kernel/kernel /var/crash/vmcore.14 ... kernel trap 12 with interrupts disabled spin lock 0xc140a4c0 (sched lock 0) held by 0xc807a2f0 (tid 100045) too long panic: spin lock held too long cpuid = 3 KDB: stack backtrace: #0 0xc0b17eaf at kdb_backtrace+0x4f #1 0xc0adeaef at panic+0x16f #2 0xc0ac9cff at _mtx_lock_spin_failed+0x3f #3 0xc0ac9e75 at _mtx_lock_spin+0x165 #4 0xc0b096c5 at sched_add+0xf5 #5 0xc0b09890 at sched_wakeup+0x70 #6 0xc0ae8968 at setrunnable+0x88 #7 0xc0b2227e at sleepq_resume_thread+0x12e #8 0xc0b22fd3 at sleepq_broadcast+0xd3 #9 0xc0a8c4cd at cv_broadcastpri+0x4d #10 0xc0b2a406 at doselwakeup+0xe6 #11 0xc0b2a4be at selwakeuppri+0xe #12 0xc0a9fa59 at knote_enqueue+0x59 #13 0xc0aa073f at kqueue_register+0x84f #14 0xc0aa09f3 at kern_kevent+0xf3 #15 0xc0aa16ce at sys_kevent+0x19e #16 0xc0fcc8c3 at syscall+0x443 #17 0xc0fb60f1 at Xint0x80_syscall+0x21 Uptime: 7m44s > bt #0 doadump (textdump=1) at pcpu.h:250 #1 0xc0ade835 in kern_reboot (howto=260) at ../../../kern/kern_shutdown.c:454 #2 0xc0adeb32 in panic (fmt=) at ../../../kern/kern_shutdown.c:642 #3 0xc0ac9cff in _mtx_lock_spin_failed (m=0x0) at ../../../kern/kern_mutex.c:515 #4 0xc0ac9e75 in _mtx_lock_spin (m=0xc140a4c0, tid=3384060112, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:557 #5 0xc0b096c5 in sched_add
(td=0xc9b00bc0, flags=0) at ../../../kern/sched_ule.c:1153 #6 0xc0b09890 in sched_wakeup (td=0xc9b00bc0) at ../../../kern/sched_ule.c:1991 #7 0xc0ae8968 in setrunnable (td=0xc9b00bc0) at ../../../kern/kern_synch.c:537 #8 0xc0b2227e in sleepq_resume_thread (sq=0xc869fd40, td=0xc9b00bc0, pri=104) at ../../../kern/subr_sleepqueue.c:763 #9 0xc0b22fd3 in sleepq_broadcast (wchan=0xc95741e4, flags=1, pri=104, queue=0) at ../../../kern/subr_sleepqueue.c:865 #10 0xc0a8c4cd in cv_broadcastpri (cvp=0xc95741e4, pri=104) at ../../../kern/kern_condvar.c:448 #11 0xc0b2a406 in doselwakeup (sip=0xc963faac, pri=104) at ../../../kern/sys_generic.c:1683 #12 0xc0b2a4be in selwakeuppri (sip=0xc963faac, pri=104) at ../../../kern/sys_generic.c:1651 #13 0xc0a9fa59 in knote_enqueue (kn=) at ../../../kern/kern_event.c:1786 #14 0xc0aa073f in kqueue_register (kq=0xc963fa80, kev=0xf0e07b20, td=0xc9b4a8d0, waitok=1) at ../../../kern/kern_event.c:1154 #15 0xc0aa09f3 in kern_kevent (td=0xc9b4a8d0, fd=152, nchanges=2, nevents=0, k_ops=0xf0e07c20, timeout=0x0) at ../../../kern/kern_event.c:850 #16 0xc0aa16ce in sys_kevent (td=0xc9b4a8d0, uap=0xf0e07ccc) at ../../../kern/kern_event.c:771 #17 0xc0fcc8c3 in syscall (frame=0xf0e07d08) at subr_syscall.c:135 #18 0xc0fb60f1 in Xint0x80_syscall () at ../../../i386/i386/exception.s:270 #19 0x00000033 in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) tid 100045 [Switching to thread 34 (Thread 100045)]#0 cpustop_handler () at ../../../i386/i386/mp_machdep.c:1496 1496 ../../../i386/i386/mp_machdep.c: No such file or directory. 
in ../../../i386/i386/mp_machdep.c (kgdb) bt #0 cpustop_handler () at ../../../i386/i386/mp_machdep.c:1496 #1 0xc0fc1805 in ipi_nmi_handler () at ../../../i386/i386/mp_machdep.c:1478 #2 0xc0fccf38 in trap (frame=0xe1be9620) at ../../../i386/i386/trap.c:227 #3 0xc0fb605c in calltrap () at ../../../i386/i386/exception.s:170 #4 0xc0fddf45 in DELAY (n=1) at ../../../x86/isa/clock.c:283 #5 0xc0ac9e6c in _mtx_lock_spin (m=0xc140b1c0, tid=3355943664, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:555 #6 0xc0b096c5 in sched_add (td=0xc7de0000, flags=4) at ../../../kern/sched_ule.c:1153 #7 0xc0aae008 in intr_event_schedule_thread (ie=) at ../../../kern/kern_intr.c:921 #8 0xc0aaf8fd in swi_sched (cookie=0xc7d95300, flags=) at ../../../kern/kern_intr.c:1174 #9 0xc0af5b53 in callout_tick () at ../../../kern/kern_timeout.c:361 #10 0xc0a8c3e1 in hardclock_cnt (cnt=1, usermode=0) at ../../../kern/kern_clock.c:554 #11 0xc0fd403f in handleevents (now=0xe1be9870, fake=0) at ../../../kern/kern_clocksource.c:215 #12 0xc0fd5d3f in timercb (et=0xc148c860, arg=0x0) at ../../../kern/kern_clocksource.c:390 #13 0xc0fe731f in lapic_handle_timer (frame=0xe1be98b4) at ../../../x86/x86/local_apic.c:818 #14 0xc0fb6600 in Xtimerint () at apic_vector.s:108 #15 0xc0aca0df in _mtx_lock_sleep (m=0xc95741d0, tid=3355943664, opts=, file=0x0, line=0) at ../../../kern/kern_mutex.c:400 #16 0xc0b2a3e9 in doselwakeup (sip=0xc9a491f4, pri=104) at ../../../kern/sys_generic.c:1681 #17 0xc0b2a4be in selwakeuppri (sip=0xc9a491f4, pri=104) at ../../../kern/sys_generic.c:1651 #18 0xc0b4886f in sowakeup (so=0xc9a491a0, sb=0xc9a491f4) at ../../../kern/uipc_sockbuf.c:182 #19 0xc0c9680d in udp_append (inp=0xc87bb0fc, ip=0xc8ead80e, n=0xc8ebf500, off=28, udp_in=0xe1be9a7c) at ../../../netinet/udp_usrreq.c:330 #20 0xc0c98ad6 in udp_input (m=0xc8ebf500, off=20) at ../../../netinet/udp_usrreq.c:616 #21 0xc0c0d5e7 in ip_input (m=0xc8ebf500) at ../../../netinet/ip_input.c:760 #22 0xc0bae68f in netisr_dispatch_src 
(proto=1, source=0, m=0xc8ebf500) at ../../../net/netisr.c:1013 #23 0xc0bae930 in netisr_dispatch (proto=1, m=0xc8ebf500) at ../../../net/netisr.c:1104 #24 0xc0ba50c1 in ether_demux (ifp=0xc807f000, m=0xc8ebf500) at ../../../net/if_ethersubr.c:943 #25 0xc0ba551f in ether_nh_input (m=0xc8ebf500) at ../../../net/if_ethersubr.c:762 #26 0xc0bae68f in netisr_dispatch_src (proto=9, source=0, m=0xc8ebf500) at ../../../net/netisr.c:1013 #27 0xc0bae930 in netisr_dispatch (proto=9, m=0xc8ebf500) at ../../../net/netisr.c:1104 #28 0xc0ba4c09 in ether_input (ifp=0xc807f000, m=0xc8ebf500) at ../../../net/if_ethersubr.c:803 #29 0xc06930c2 in igb_rxeof (que=0xc8078180, count=99, done=0x0) at ../../../dev/e1000/if_igb.c:4735 #30 0xc0693328 in igb_msix_que (arg=0xc8078180) at ../../../dev/e1000/if_igb.c:1601 #31 0xc0aae18b in intr_event_execute_handlers (p=0xc7d9e5b0, ie=0xc8077a00) at ../../../kern/kern_intr.c:1272 #32 0xc0aaf990 in ithread_loop (arg=0xc80a43e0) at ../../../kern/kern_intr.c:1285 #33 0xc0aaa96f in fork_exit (callout=0xc0aaf910 , arg=0xc80a43e0, frame=0xe1be9d08) at ../../../kern/kern_fork.c:996 #34 0xc0fb6104 in fork_trampoline () at ../../../i386/i386/exception.s:279 PANIC 2 ======= > kgdb /boot/kernel/kernel /var/crash/vmcore.15 ... ... 
Unread portion of the kernel message buffer: spin lock 0xc9976800 (turnstile lock) held by 0xc7da12f0 (tid 100005) too long panic: spin lock held too long cpuid = 2 KDB: stack backtrace: #0 0xc0b17eaf at kdb_backtrace+0x4f #1 0xc0adeaef at panic+0x16f #2 0xc0ac9cff at _mtx_lock_spin_failed+0x3f #3 0xc0ac9e75 at _mtx_lock_spin+0x165 #4 0xc0b26f87 at turnstile_lookup+0x87 #5 0xc0ac9c77 at _mtx_unlock_sleep+0x47 #6 0xc0b2a443 at doselwakeup+0x123 #7 0xc0b2a4be at selwakeuppri+0xe #8 0xc0a9fa59 at knote_enqueue+0x59 #9 0xc0aa073f at kqueue_register+0x84f #10 0xc0aa09f3 at kern_kevent+0xf3 #11 0xc0aa16ce at sys_kevent+0x19e #12 0xc0fcc8c3 at syscall+0x443 #13 0xc0fb60f1 at Xint0x80_syscall+0x21 Uptime: 2h6m20s Physical memory: 3486 MB Dumping 162 MB: 147 131 115 99 83 67 51 35 19 3 ... ... (kgdb) bt #0 doadump (textdump=1) at pcpu.h:250 #1 0xc0ade835 in kern_reboot (howto=260) at ../../../kern/kern_shutdown.c:454 #2 0xc0adeb32 in panic (fmt=) at ../../../kern/kern_shutdown.c:642 #3 0xc0ac9cff in _mtx_lock_spin_failed (m=0x0) at ../../../kern/kern_mutex.c:515 #4 0xc0ac9e75 in _mtx_lock_spin (m=0xc9976800, tid=3383440864, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:557 #5 0xc0b26f87 in turnstile_lookup (lock=0xc9639550) at ../../../kern/subr_turnstile.c:600 #6 0xc0ac9c77 in _mtx_unlock_sleep (m=0xc9639550, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:712 #7 0xc0b2a443 in doselwakeup (sip=0xc9ef702c, pri=104) at ../../../kern/sys_generic.c:1684 #8 0xc0b2a4be in selwakeuppri (sip=0xc9ef702c, pri=104) at ../../../kern/sys_generic.c:1651 #9 0xc0a9fa59 in knote_enqueue (kn=) at ../../../kern/kern_event.c:1786 #10 0xc0aa073f in kqueue_register (kq=0xc9ef7000, kev=0xf0fc8b20, td=0xc9ab35e0, waitok=1) at ../../../kern/kern_event.c:1154 #11 0xc0aa09f3 in kern_kevent (td=0xc9ab35e0, fd=284, nchanges=2, nevents=0, k_ops=0xf0fc8c20, timeout=0x0) at ../../../kern/kern_event.c:850 #12 0xc0aa16ce in sys_kevent (td=0xc9ab35e0, uap=0xf0fc8ccc) at 
../../../kern/kern_event.c:771 #13 0xc0fcc8c3 in syscall (frame=0xf0fc8d08) at subr_syscall.c:135 #14 0xc0fb60f1 in Xint0x80_syscall () at ../../../i386/i386/exception.s:270 #15 0x00000033 in ?? () (kgdb) tid 100005 [Switching to thread 18 (Thread 100005)]#0 sched_switch (td=0xc7da12f0, newtd=0xc9ab35e0, flags=259) at ../../../kern/sched_ule.c:1904 1904 ../../../kern/sched_ule.c: No such file or directory. in ../../../kern/sched_ule.c (kgdb) bt #0 sched_switch (td=0xc7da12f0, newtd=0xc9ab35e0, flags=259) at ../../../kern/sched_ule.c:1904 #1 0xc0ae8b43 in mi_switch (flags=259, newtd=0x0) at ../../../kern/kern_synch.c:485 #2 0xc0b28214 in turnstile_wait (ts=0xc9976800, owner=0xc9ab35e0, queue=) at ../../../kern/subr_turnstile.c:753 #3 0xc0aca177 in _mtx_lock_sleep (m=0xc9639550, tid=3352957680, opts=, file=0x0, line=0) at ../../../kern/kern_mutex.c:472 #4 0xc0b2a3e9 in doselwakeup (sip=0xc94cd0e0, pri=-1) at ../../../kern/sys_generic.c:1681 #5 0xc0b2a4d0 in selwakeup (sip=0xc94cd0e0) at ../../../kern/sys_generic.c:1642 #6 0xc9151789 in dahdi_wake_up_interruptible (sel=0xc94cd0e0) at /usr/ports/misc/dahdi-kmod26/work/dahdi-freebsd-2.6.1-r10738/bsd-kmod/dahdi/../../drivers/dahdi/dahdi-base.c:227 #7 0xc915946f in __dahdi_transmit_chunk (chan=0xc94cd000, buf=0xc85efba8 "ÕÕÕUUUÕÕ", '˙' ) at /usr/ports/misc/dahdi-kmod26/work/dahdi-freebsd-2.6.1-r10738/bsd-kmod/dahdi/../../drivers/dahdi/dahdi-base.c:7970 #8 0xc915fea7 in _dahdi_transmit (span=0xc858d03c) at /usr/ports/misc/dahdi-kmod26/work/dahdi-freebsd-2.6.1-r10738/bsd-kmod/dahdi/../../drivers/dahdi/dahdi-base.c:9527 #9 0xc941d0e4 in __transmit_span (ts=0xc858d000) at /usr/ports/misc/dahdi-kmod26/work/dahdi-freebsd-2.6.1-r10738/bsd-kmod/wct4xxp/../../drivers/dahdi/wct4xxp/base.c:3139 #10 0xc941d084 in t4_prep_gen2 (wc=0xc8612400) at /usr/ports/misc/dahdi-kmod26/work/dahdi-freebsd-2.6.1-r10738/bsd-kmod/wct4xxp/../../drivers/dahdi/wct4xxp/base.c:3170 #11 0xc941fd41 in _t4_interrupt_gen2 (dev_id=0xc8612400) at 
/usr/ports/misc/dahdi-kmod26/work/dahdi-freebsd-2.6.1-r10738/bsd-kmod/wct4xxp/../../drivers/dahdi/wct4xxp/base.c:4233 #12 0xc0aaef07 in intr_event_handle (ie=0xc7d9c800, frame=0xc7bc4bec) at ../../../kern/kern_intr.c:1446 #13 0xc0fe42d9 in intr_execute_handlers (isrc=0xc7d8de84, frame=0xc7bc4bec) at ../../../x86/x86/intr_machdep.c:266 #14 0xc0fe7376 in lapic_handle_intr (vector=50, frame=0xc7bc4bec) at ../../../x86/x86/local_apic.c:780 #15 0xc0fb6455 in Xapic_isr1 () at apic_vector.s:89 #16 0xc0fbc16b in spinlock_exit () at cpufunc.h:373 #17 0xc0fd540c in cpu_activeclock () at ../../../kern/kern_clocksource.c:825 #18 0xc0fbee72 in cpu_idle (busy=0) at ../../../i386/i386/machdep.c:1386 #19 0xc0b08c06 in sched_idletd (dummy=0x0) at ../../../kern/sched_ule.c:2609 #20 0xc0aaa96f in fork_exit (callout=0xc0b088a0 , arg=0x0, frame=0xc7bc4d08) at ../../../kern/kern_fork.c:996 #21 0xc0fb6104 in fork_trampoline () at ../../../i386/i386/exception.s:279 ------------ I have read all the past relevant posts related to my problem. However, the suggested workarounds either don't work or are irrelevant to my problem. Please let me know if more information is needed. Thanks in advance.
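A note for reading the traces above: kgdb prints the tid= argument of _mtx_lock_spin and _mtx_lock_sleep as a decimal integer, but that argument is the calling thread's pointer cast to an integer, so converting it to hex lets it be matched against the td= values and the "held by" addresses in the panic messages. A minimal sketch of that conversion follows; the dictionary labels and the helper name as_pointer are illustrative only, and the numbers are copied from the traces.

```python
# Decimal tid= values copied from the kgdb frames in the traces above.
# The labels are illustrative, not part of any kgdb output.
TID_ARGS = {
    "panic 1, panicking thread (_mtx_lock_spin)": 3384060112,
    "panic 1, lock owner (_mtx_lock_spin)": 3355943664,
    "panic 2, panicking thread (_mtx_lock_spin)": 3383440864,
    "panic 2, lock owner (_mtx_lock_sleep)": 3352957680,
}


def as_pointer(value: int) -> str:
    """Format an integer as a 32-bit hex pointer, the way kgdb shows i386 addresses."""
    return f"0x{value & 0xFFFFFFFF:08x}"


if __name__ == "__main__":
    for label, tid in TID_ARGS.items():
        print(f"{label}: {as_pointer(tid)}")
```

The converted values line up with addresses already present in the report: 0xc9b4a8d0 and 0xc9ab35e0 are the td= values in the two kern_kevent frames, while 0xc807a2f0 and 0xc7da12f0 are the "held by" threads named in the two panic messages.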
-- Best regards Hooman Fazaeli From owner-freebsd-hackers@freebsd.org Wed Aug 10 14:20:00 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C27A0BB43DB for ; Wed, 10 Aug 2016 14:20:00 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2D4501B36 for ; Wed, 10 Aug 2016 14:19:59 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u7AEJmVI046101 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Wed, 10 Aug 2016 17:19:48 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u7AEJmVI046101 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u7AEJmXl046100; Wed, 10 Aug 2016 17:19:48 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 10 Aug 2016 17:19:48 +0300 From: Konstantin Belousov To: Hooman Fazaeli Cc: FreeBSD Hackers Subject: Re: 9.3-RELEASE panic: spin lock held too long Message-ID: <20160810141948.GP83214@kib.kiev.ua> References: <57AB349B.2010805@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <57AB349B.2010805@gmail.com> User-Agent: Mutt/1.6.1 (2016-04-27) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list 
List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2016 14:20:00 -0000 On Wed, Aug 10, 2016 at 06:35:15PM +0430, Hooman Fazaeli wrote: > > Hi > > on a 9.3-REL i386 box we have occasional "spin lock held too long" panics. > > System info: > ------------- > - Intel(R) Core(TM) i5-4440 CPU @ 3.10GHz CPU (4 cores, no hyper-threading) > - 4G non-ECC RAM > - asterisk-1.8.30.0 from ports > - dahdi-kmod26-2.6.1.r10738 from ports > - powerd disabled. > - Workload: ISDN & SIP call processing. > ------------ > > The panics are either on 'sched lock' or 'turnstile lock' spin locks. > > PANIC 1 > ======= > As the trace below shows: > > 1- input arrives on a UDP socket > 2- doselwakeup is called. > 3- That wakeup call ends up in sched_add. > 4- sched_add grabs the 'sched lock 0' spin lock and, apparently, holds it for too long. > 5- The panicking thread makes the same calls as the owner thread but panics because > it can't grab the same spin lock. > > > kgdb /boot/kernel/kernel /var/crash/vmcore.14 > ...
> kernel trap 12 with interrupts disabled > spin lock 0xc140a4c0 (sched lock 0) held by 0xc807a2f0 (tid 100045) too long > panic: spin lock held too long > cpuid = 3 > KDB: stack backtrace: > #0 0xc0b17eaf at kdb_backtrace+0x4f > #1 0xc0adeaef at panic+0x16f > #2 0xc0ac9cff at _mtx_lock_spin_failed+0x3f > #3 0xc0ac9e75 at _mtx_lock_spin+0x165 > #4 0xc0b096c5 at sched_add+0xf5 > #5 0xc0b09890 at sched_wakeup+0x70 > #6 0xc0ae8968 at setrunnable+0x88 > #7 0xc0b2227e at sleepq_resume_thread+0x12e > #8 0xc0b22fd3 at sleepq_broadcast+0xd3 > #9 0xc0a8c4cd at cv_broadcastpri+0x4d > #10 0xc0b2a406 at doselwakeup+0xe6 > #11 0xc0b2a4be at selwakeuppri+0xe > #12 0xc0a9fa59 at knote_enqueue+0x59 > #13 0xc0aa073f at kqueue_register+0x84f > #14 0xc0aa09f3 at kern_kevent+0xf3 > #15 0xc0aa16ce at sys_kevent+0x19e > #16 0xc0fcc8c3 at syscall+0x443 > #17 0xc0fb60f1 at Xint0x80_syscall+0x21 > Uptime: 7m44s > > > bt > #0 doadump (textdump=1) at pcpu.h:250 > #1 0xc0ade835 in kern_reboot (howto=260) at ../../../kern/kern_shutdown.c:454 > #2 0xc0adeb32 in panic (fmt=) at ../../../kern/kern_shutdown.c:642 > #3 0xc0ac9cff in _mtx_lock_spin_failed (m=0x0) at ../../../kern/kern_mutex.c:515 > #4 0xc0ac9e75 in _mtx_lock_spin (m=0xc140a4c0, tid=3384060112, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:557 > #5 0xc0b096c5 in sched_add (td=0xc9b00bc0, flags=0) at ../../../kern/sched_ule.c:1153 > #6 0xc0b09890 in sched_wakeup (td=0xc9b00bc0) at ../../../kern/sched_ule.c:1991 > #7 0xc0ae8968 in setrunnable (td=0xc9b00bc0) at ../../../kern/kern_synch.c:537 > #8 0xc0b2227e in sleepq_resume_thread (sq=0xc869fd40, td=0xc9b00bc0, pri=104) at ../../../kern/subr_sleepqueue.c:763 > #9 0xc0b22fd3 in sleepq_broadcast (wchan=0xc95741e4, flags=1, pri=104, queue=0) at ../../../kern/subr_sleepqueue.c:865 > #10 0xc0a8c4cd in cv_broadcastpri (cvp=0xc95741e4, pri=104) at ../../../kern/kern_condvar.c:448 > #11 0xc0b2a406 in doselwakeup (sip=0xc963faac, pri=104) at ../../../kern/sys_generic.c:1683 > #12 
0xc0b2a4be in selwakeuppri (sip=0xc963faac, pri=104) at ../../../kern/sys_generic.c:1651 > #13 0xc0a9fa59 in knote_enqueue (kn=) at ../../../kern/kern_event.c:1786 > #14 0xc0aa073f in kqueue_register (kq=0xc963fa80, kev=0xf0e07b20, td=0xc9b4a8d0, waitok=1) at ../../../kern/kern_event.c:1154 > #15 0xc0aa09f3 in kern_kevent (td=0xc9b4a8d0, fd=152, nchanges=2, nevents=0, k_ops=0xf0e07c20, timeout=0x0) at ../../../kern/kern_event.c:850 > #16 0xc0aa16ce in sys_kevent (td=0xc9b4a8d0, uap=0xf0e07ccc) at ../../../kern/kern_event.c:771 > #17 0xc0fcc8c3 in syscall (frame=0xf0e07d08) at subr_syscall.c:135 > #18 0xc0fb60f1 in Xint0x80_syscall () at ../../../i386/i386/exception.s:270 > #19 0x00000033 in ?? () > Previous frame inner to this frame (corrupt stack?) > > (kgdb) tid 100045 > [Switching to thread 34 (Thread 100045)]#0 cpustop_handler () at ../../../i386/i386/mp_machdep.c:1496 > 1496 ../../../i386/i386/mp_machdep.c: No such file or directory. > in ../../../i386/i386/mp_machdep.c > > (kgdb) bt > #0 cpustop_handler () at ../../../i386/i386/mp_machdep.c:1496 > #1 0xc0fc1805 in ipi_nmi_handler () at ../../../i386/i386/mp_machdep.c:1478 > #2 0xc0fccf38 in trap (frame=0xe1be9620) at ../../../i386/i386/trap.c:227 > #3 0xc0fb605c in calltrap () at ../../../i386/i386/exception.s:170 > #4 0xc0fddf45 in DELAY (n=1) at ../../../x86/isa/clock.c:283 > #5 0xc0ac9e6c in _mtx_lock_spin (m=0xc140b1c0, tid=3355943664, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:555 > #6 0xc0b096c5 in sched_add (td=0xc7de0000, flags=4) at ../../../kern/sched_ule.c:1153 > #7 0xc0aae008 in intr_event_schedule_thread (ie=) at ../../../kern/kern_intr.c:921 > #8 0xc0aaf8fd in swi_sched (cookie=0xc7d95300, flags=) at ../../../kern/kern_intr.c:1174 > #9 0xc0af5b53 in callout_tick () at ../../../kern/kern_timeout.c:361 > #10 0xc0a8c3e1 in hardclock_cnt (cnt=1, usermode=0) at ../../../kern/kern_clock.c:554 > #11 0xc0fd403f in handleevents (now=0xe1be9870, fake=0) at ../../../kern/kern_clocksource.c:215 
> #12 0xc0fd5d3f in timercb (et=0xc148c860, arg=0x0) at ../../../kern/kern_clocksource.c:390 > #13 0xc0fe731f in lapic_handle_timer (frame=0xe1be98b4) at ../../../x86/x86/local_apic.c:818 > #14 0xc0fb6600 in Xtimerint () at apic_vector.s:108 > #15 0xc0aca0df in _mtx_lock_sleep (m=0xc95741d0, tid=3355943664, opts=, file=0x0, line=0) at ../../../kern/kern_mutex.c:400 > #16 0xc0b2a3e9 in doselwakeup (sip=0xc9a491f4, pri=104) at ../../../kern/sys_generic.c:1681 > #17 0xc0b2a4be in selwakeuppri (sip=0xc9a491f4, pri=104) at ../../../kern/sys_generic.c:1651 > #18 0xc0b4886f in sowakeup (so=0xc9a491a0, sb=0xc9a491f4) at ../../../kern/uipc_sockbuf.c:182 > #19 0xc0c9680d in udp_append (inp=0xc87bb0fc, ip=0xc8ead80e, n=0xc8ebf500, off=28, udp_in=0xe1be9a7c) at ../../../netinet/udp_usrreq.c:330 > #20 0xc0c98ad6 in udp_input (m=0xc8ebf500, off=20) at ../../../netinet/udp_usrreq.c:616 > #21 0xc0c0d5e7 in ip_input (m=0xc8ebf500) at ../../../netinet/ip_input.c:760 > #22 0xc0bae68f in netisr_dispatch_src (proto=1, source=0, m=0xc8ebf500) at ../../../net/netisr.c:1013 > #23 0xc0bae930 in netisr_dispatch (proto=1, m=0xc8ebf500) at ../../../net/netisr.c:1104 > #24 0xc0ba50c1 in ether_demux (ifp=0xc807f000, m=0xc8ebf500) at ../../../net/if_ethersubr.c:943 > #25 0xc0ba551f in ether_nh_input (m=0xc8ebf500) at ../../../net/if_ethersubr.c:762 > #26 0xc0bae68f in netisr_dispatch_src (proto=9, source=0, m=0xc8ebf500) at ../../../net/netisr.c:1013 > #27 0xc0bae930 in netisr_dispatch (proto=9, m=0xc8ebf500) at ../../../net/netisr.c:1104 > #28 0xc0ba4c09 in ether_input (ifp=0xc807f000, m=0xc8ebf500) at ../../../net/if_ethersubr.c:803 > #29 0xc06930c2 in igb_rxeof (que=0xc8078180, count=99, done=0x0) at ../../../dev/e1000/if_igb.c:4735 > #30 0xc0693328 in igb_msix_que (arg=0xc8078180) at ../../../dev/e1000/if_igb.c:1601 > #31 0xc0aae18b in intr_event_execute_handlers (p=0xc7d9e5b0, ie=0xc8077a00) at ../../../kern/kern_intr.c:1272 > #32 0xc0aaf990 in ithread_loop (arg=0xc80a43e0) at 
../../../kern/kern_intr.c:1285 > #33 0xc0aaa96f in fork_exit (callout=0xc0aaf910 , arg=0xc80a43e0, frame=0xe1be9d08) at ../../../kern/kern_fork.c:996 > #34 0xc0fb6104 in fork_trampoline () at ../../../i386/i386/exception.s:279

I am not convinced that thread 100045 is the owner of the spin lock. The mtx_lock_spin()->DELAY() frames on its stack mean that the thread is trying to obtain a spin lock but has not succeeded yet. Or did you mean something else?

You can find the current owner by printing the mutex contents from kgdb. Look at the mtx_lock field: it is a pointer to the struct thread that owns the spin lock, modulo some flags in the low bits.

From owner-freebsd-hackers@freebsd.org Wed Aug 10 15:20:16 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4ABD3BB50C9 for ; Wed, 10 Aug 2016 15:20:16 +0000 (UTC) (envelope-from hoomanfazaeli@gmail.com) Received: from mail-wm0-x242.google.com (mail-wm0-x242.google.com [IPv6:2a00:1450:400c:c09::242]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C87D41696 for ; Wed, 10 Aug 2016 15:20:15 +0000 (UTC) (envelope-from hoomanfazaeli@gmail.com) Received: by mail-wm0-x242.google.com with SMTP id q128so10202810wma.1 for ; Wed, 10 Aug 2016 08:20:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-transfer-encoding; bh=5vG+6qZnLABAeaijyaFtdq6tTvF9nzfgNfI1nlxYEyU=; b=pf4Fe/hBFMTWRTBNIl69/2D2yQm3sfdAjnjZcjwfhFbjLOT/prpnaZBFh6IxyweUoT en6ymxSO/W9pvNI4iXW8b4J/GnDArRIS/B9TB3z791ugj15bcTO8GYIGbTXbUfQ3C5zO hNTFDDv67nppkZO1okipwpi4xOnFi8wwDUTA0FkDOee90BDgTbzKMmQpsRN9x9xIycuv 
nXyGDmjFGrE8wGROuiejoYagQLj9Enh/27a4rI2N+2mNHqiE713aQejJE1lw07Jf/dYs uZfDjP64iILaIW9/JhW54iQKCAg86GxFXywbRjx/oPk+sCP98rO7lz0H1Ttf0btZKH6m FG/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :cc:subject:references:in-reply-to:content-transfer-encoding; bh=5vG+6qZnLABAeaijyaFtdq6tTvF9nzfgNfI1nlxYEyU=; b=N/7Lyxp10h1qxLE+/R4bkQc8652qx3JPn6g0l5rpyEUjDaBr3Bbwn5rh4ihQHs6wsd Nhu+okK1sPAGWLAXWBzwGU8sWeGVK0yEaLbAtPdNIdBpg/zO6OtsPsBlu52nuyk1ozty YSWPRdx2fLVKylrtNiEVoph1Yrvrr8NoN1vKX7VTSU3Wzzi/dIL3Oy9kX8PT7JJZ2n+s YRGrEKfSkjJ3l4C8fgj8DTEqr47eM0gVf18YagLpbEXVPKLJXaNcm+u+y5zIxY1f6AwM LHiTL5htlVUZna33G0Tw65/Yk26246206yQDd2a2kkygUMEnYhpZuNW9tMQ25HNVnKjO +1Eg== X-Gm-Message-State: AEkoout1IigN0Y9xVGp9iDg0Nc351s/CSmClSP7gsEwqKIeMK2f79HoS14cFxrUxxWsotg== X-Received: by 10.194.148.19 with SMTP id to19mr5375601wjb.81.1470842414089; Wed, 10 Aug 2016 08:20:14 -0700 (PDT) Received: from [192.168.2.30] ([2.190.216.101]) by smtp.googlemail.com with ESMTPSA id d8sm8895054wmi.0.2016.08.10.08.20.12 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 10 Aug 2016 08:20:13 -0700 (PDT) Message-ID: <57AB462A.2080608@gmail.com> Date: Wed, 10 Aug 2016 19:50:10 +0430 From: Hooman Fazaeli User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130215 Thunderbird/17.0.3 MIME-Version: 1.0 To: Konstantin Belousov CC: FreeBSD Hackers Subject: Re: 9.3-RELEASE panic: spin lock held too long References: <57AB349B.2010805@gmail.com> <20160810141948.GP83214@kib.kiev.ua> In-Reply-To: <20160810141948.GP83214@kib.kiev.ua> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2016 15:20:16 -0000 On 
2016-08-10 18:49, Konstantin Belousov wrote: > On Wed, Aug 10, 2016 at 06:35:15PM +0430, Hooman Fazaeli wrote: >> Hi >> >> on a 9.3-REL i386 box we have occasional "spin lock held too long" panics. >> >> System info: >> ------------- >> - Intel(R) Core(TM) i5-4440 CPU @ 3.10GHz CPU (4 cores, no hyper-threading) >> - 4G non-ECC RAM >> - asterisk-1.8.30.0 from ports >> - dahdi-kmod26-2.6.1.r10738 from ports >> - powerd disabled. >> - Workload: ISDN & SIP call processing. >> ------------ >> >> The panics are either on 'sched lock' or 'turnstile lock' spin locks. >> >> PANIC 1 >> ======= >> As the trace below shows: >> >> 1- input arrives on a UDP socket >> 2- doselwakeup is called. >> 3- That wakeup call ends up in sched_add. >> 4- sched_add grabs the 'sched lock 0' spin lock and apparently holds it for too long. >> 5- The panicking thread makes the same calls as the owner thread but panics because >> it can't grab the same spin lock. >> >> > kgdb /boot/kernel/kernel /var/crash/vmcore.14 >> ... 
>> kernel trap 12 with interrupts disabled >> spin lock 0xc140a4c0 (sched lock 0) held by 0xc807a2f0 (tid 100045) too long >> panic: spin lock held too long >> cpuid = 3 >> KDB: stack backtrace: >> #0 0xc0b17eaf at kdb_backtrace+0x4f >> #1 0xc0adeaef at panic+0x16f >> #2 0xc0ac9cff at _mtx_lock_spin_failed+0x3f >> #3 0xc0ac9e75 at _mtx_lock_spin+0x165 >> #4 0xc0b096c5 at sched_add+0xf5 >> #5 0xc0b09890 at sched_wakeup+0x70 >> #6 0xc0ae8968 at setrunnable+0x88 >> #7 0xc0b2227e at sleepq_resume_thread+0x12e >> #8 0xc0b22fd3 at sleepq_broadcast+0xd3 >> #9 0xc0a8c4cd at cv_broadcastpri+0x4d >> #10 0xc0b2a406 at doselwakeup+0xe6 >> #11 0xc0b2a4be at selwakeuppri+0xe >> #12 0xc0a9fa59 at knote_enqueue+0x59 >> #13 0xc0aa073f at kqueue_register+0x84f >> #14 0xc0aa09f3 at kern_kevent+0xf3 >> #15 0xc0aa16ce at sys_kevent+0x19e >> #16 0xc0fcc8c3 at syscall+0x443 >> #17 0xc0fb60f1 at Xint0x80_syscall+0x21 >> Uptime: 7m44s >> >> > bt >> #0 doadump (textdump=1) at pcpu.h:250 >> #1 0xc0ade835 in kern_reboot (howto=260) at ../../../kern/kern_shutdown.c:454 >> #2 0xc0adeb32 in panic (fmt=) at ../../../kern/kern_shutdown.c:642 >> #3 0xc0ac9cff in _mtx_lock_spin_failed (m=0x0) at ../../../kern/kern_mutex.c:515 >> #4 0xc0ac9e75 in _mtx_lock_spin (m=0xc140a4c0, tid=3384060112, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:557 >> #5 0xc0b096c5 in sched_add (td=0xc9b00bc0, flags=0) at ../../../kern/sched_ule.c:1153 >> #6 0xc0b09890 in sched_wakeup (td=0xc9b00bc0) at ../../../kern/sched_ule.c:1991 >> #7 0xc0ae8968 in setrunnable (td=0xc9b00bc0) at ../../../kern/kern_synch.c:537 >> #8 0xc0b2227e in sleepq_resume_thread (sq=0xc869fd40, td=0xc9b00bc0, pri=104) at ../../../kern/subr_sleepqueue.c:763 >> #9 0xc0b22fd3 in sleepq_broadcast (wchan=0xc95741e4, flags=1, pri=104, queue=0) at ../../../kern/subr_sleepqueue.c:865 >> #10 0xc0a8c4cd in cv_broadcastpri (cvp=0xc95741e4, pri=104) at ../../../kern/kern_condvar.c:448 >> #11 0xc0b2a406 in doselwakeup (sip=0xc963faac, pri=104) at 
../../../kern/sys_generic.c:1683 >> #12 0xc0b2a4be in selwakeuppri (sip=0xc963faac, pri=104) at ../../../kern/sys_generic.c:1651 >> #13 0xc0a9fa59 in knote_enqueue (kn=) at ../../../kern/kern_event.c:1786 >> #14 0xc0aa073f in kqueue_register (kq=0xc963fa80, kev=0xf0e07b20, td=0xc9b4a8d0, waitok=1) at ../../../kern/kern_event.c:1154 >> #15 0xc0aa09f3 in kern_kevent (td=0xc9b4a8d0, fd=152, nchanges=2, nevents=0, k_ops=0xf0e07c20, timeout=0x0) at ../../../kern/kern_event.c:850 >> #16 0xc0aa16ce in sys_kevent (td=0xc9b4a8d0, uap=0xf0e07ccc) at ../../../kern/kern_event.c:771 >> #17 0xc0fcc8c3 in syscall (frame=0xf0e07d08) at subr_syscall.c:135 >> #18 0xc0fb60f1 in Xint0x80_syscall () at ../../../i386/i386/exception.s:270 >> #19 0x00000033 in ?? () >> Previous frame inner to this frame (corrupt stack?) >> >> (kgdb) tid 100045 >> [Switching to thread 34 (Thread 100045)]#0 cpustop_handler () at ../../../i386/i386/mp_machdep.c:1496 >> 1496 ../../../i386/i386/mp_machdep.c: No such file or directory. 
>> in ../../../i386/i386/mp_machdep.c >> >> (kgdb) bt >> #0 cpustop_handler () at ../../../i386/i386/mp_machdep.c:1496 >> #1 0xc0fc1805 in ipi_nmi_handler () at ../../../i386/i386/mp_machdep.c:1478 >> #2 0xc0fccf38 in trap (frame=0xe1be9620) at ../../../i386/i386/trap.c:227 >> #3 0xc0fb605c in calltrap () at ../../../i386/i386/exception.s:170 >> #4 0xc0fddf45 in DELAY (n=1) at ../../../x86/isa/clock.c:283 >> #5 0xc0ac9e6c in _mtx_lock_spin (m=0xc140b1c0, tid=3355943664, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:555 >> #6 0xc0b096c5 in sched_add (td=0xc7de0000, flags=4) at ../../../kern/sched_ule.c:1153 >> #7 0xc0aae008 in intr_event_schedule_thread (ie=) at ../../../kern/kern_intr.c:921 >> #8 0xc0aaf8fd in swi_sched (cookie=0xc7d95300, flags=) at ../../../kern/kern_intr.c:1174 >> #9 0xc0af5b53 in callout_tick () at ../../../kern/kern_timeout.c:361 >> #10 0xc0a8c3e1 in hardclock_cnt (cnt=1, usermode=0) at ../../../kern/kern_clock.c:554 >> #11 0xc0fd403f in handleevents (now=0xe1be9870, fake=0) at ../../../kern/kern_clocksource.c:215 >> #12 0xc0fd5d3f in timercb (et=0xc148c860, arg=0x0) at ../../../kern/kern_clocksource.c:390 >> #13 0xc0fe731f in lapic_handle_timer (frame=0xe1be98b4) at ../../../x86/x86/local_apic.c:818 >> #14 0xc0fb6600 in Xtimerint () at apic_vector.s:108 >> #15 0xc0aca0df in _mtx_lock_sleep (m=0xc95741d0, tid=3355943664, opts=, file=0x0, line=0) at ../../../kern/kern_mutex.c:400 >> #16 0xc0b2a3e9 in doselwakeup (sip=0xc9a491f4, pri=104) at ../../../kern/sys_generic.c:1681 >> #17 0xc0b2a4be in selwakeuppri (sip=0xc9a491f4, pri=104) at ../../../kern/sys_generic.c:1651 >> #18 0xc0b4886f in sowakeup (so=0xc9a491a0, sb=0xc9a491f4) at ../../../kern/uipc_sockbuf.c:182 >> #19 0xc0c9680d in udp_append (inp=0xc87bb0fc, ip=0xc8ead80e, n=0xc8ebf500, off=28, udp_in=0xe1be9a7c) at ../../../netinet/udp_usrreq.c:330 >> #20 0xc0c98ad6 in udp_input (m=0xc8ebf500, off=20) at ../../../netinet/udp_usrreq.c:616 >> #21 0xc0c0d5e7 in ip_input (m=0xc8ebf500) 
at ../../../netinet/ip_input.c:760 >> #22 0xc0bae68f in netisr_dispatch_src (proto=1, source=0, m=0xc8ebf500) at ../../../net/netisr.c:1013 >> #23 0xc0bae930 in netisr_dispatch (proto=1, m=0xc8ebf500) at ../../../net/netisr.c:1104 >> #24 0xc0ba50c1 in ether_demux (ifp=0xc807f000, m=0xc8ebf500) at ../../../net/if_ethersubr.c:943 >> #25 0xc0ba551f in ether_nh_input (m=0xc8ebf500) at ../../../net/if_ethersubr.c:762 >> #26 0xc0bae68f in netisr_dispatch_src (proto=9, source=0, m=0xc8ebf500) at ../../../net/netisr.c:1013 >> #27 0xc0bae930 in netisr_dispatch (proto=9, m=0xc8ebf500) at ../../../net/netisr.c:1104 >> #28 0xc0ba4c09 in ether_input (ifp=0xc807f000, m=0xc8ebf500) at ../../../net/if_ethersubr.c:803 >> #29 0xc06930c2 in igb_rxeof (que=0xc8078180, count=99, done=0x0) at ../../../dev/e1000/if_igb.c:4735 >> #30 0xc0693328 in igb_msix_que (arg=0xc8078180) at ../../../dev/e1000/if_igb.c:1601 >> #31 0xc0aae18b in intr_event_execute_handlers (p=0xc7d9e5b0, ie=0xc8077a00) at ../../../kern/kern_intr.c:1272 >> #32 0xc0aaf990 in ithread_loop (arg=0xc80a43e0) at ../../../kern/kern_intr.c:1285 >> #33 0xc0aaa96f in fork_exit (callout=0xc0aaf910 , arg=0xc80a43e0, frame=0xe1be9d08) at ../../../kern/kern_fork.c:996 >> #34 0xc0fb6104 in fork_trampoline () at ../../../i386/i386/exception.s:279 > I am not convinced that the thread 100045 is the owner of the spin > lock. mtx_lock_spin()->DELAY() frames on the stack mean that the > thread is trying to obtain the spin lock but did not succeeded yet. > Or did you not mean that ? > > You may obtain the current owner by printing the mutex content from kgdb. > Look at the mtx_lock field, it is a pointer to the struct thread owning > the spin lock, modulo some flags in lower bits. > kgdb /boot/kernel/kernel /var/crash/vmcore.14 ... ... 
(kgdb) bt #0 doadump (textdump=1) at pcpu.h:250 #1 0xc0ade835 in kern_reboot (howto=260) at ../../../kern/kern_shutdown.c:454 #2 0xc0adeb32 in panic (fmt=) at ../../../kern/kern_shutdown.c:642 #3 0xc0ac9cff in _mtx_lock_spin_failed (m=0x0) at ../../../kern/kern_mutex.c:515 #4 0xc0ac9e75 in _mtx_lock_spin (m=0xc140a4c0, tid=3384060112, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:557 #5 0xc0b096c5 in sched_add (td=0xc9b00bc0, flags=0) at ../../../kern/sched_ule.c:1153 #6 0xc0b09890 in sched_wakeup (td=0xc9b00bc0) at ../../../kern/sched_ule.c:1991 #7 0xc0ae8968 in setrunnable (td=0xc9b00bc0) at ../../../kern/kern_synch.c:537 #8 0xc0b2227e in sleepq_resume_thread (sq=0xc869fd40, td=0xc9b00bc0, pri=104) at ../../../kern/subr_sleepqueue.c:763 #9 0xc0b22fd3 in sleepq_broadcast (wchan=0xc95741e4, flags=1, pri=104, queue=0) at ../../../kern/subr_sleepqueue.c:865 #10 0xc0a8c4cd in cv_broadcastpri (cvp=0xc95741e4, pri=104) at ../../../kern/kern_condvar.c:448 #11 0xc0b2a406 in doselwakeup (sip=0xc963faac, pri=104) at ../../../kern/sys_generic.c:1683 #12 0xc0b2a4be in selwakeuppri (sip=0xc963faac, pri=104) at ../../../kern/sys_generic.c:1651 #13 0xc0a9fa59 in knote_enqueue (kn=) at ../../../kern/kern_event.c:1786 #14 0xc0aa073f in kqueue_register (kq=0xc963fa80, kev=0xf0e07b20, td=0xc9b4a8d0, waitok=1) at ../../../kern/kern_event.c:1154 #15 0xc0aa09f3 in kern_kevent (td=0xc9b4a8d0, fd=152, nchanges=2, nevents=0, k_ops=0xf0e07c20, timeout=0x0) at ../../../kern/kern_event.c:850 #16 0xc0aa16ce in sys_kevent (td=0xc9b4a8d0, uap=0xf0e07ccc) at ../../../kern/kern_event.c:771 #17 0xc0fcc8c3 in syscall (frame=0xf0e07d08) at subr_syscall.c:135 #18 0xc0fb60f1 in Xint0x80_syscall () at ../../../i386/i386/exception.s:270 #19 0x00000033 in ?? () Previous frame inner to this frame (corrupt stack?) 
(kgdb) up 4
#4 0xc0ac9e75 in _mtx_lock_spin (m=0xc140a4c0, tid=3384060112, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:557
557 ../../../kern/kern_mutex.c: No such file or directory.
in ../../../kern/kern_mutex.c

(kgdb) p *m
$1 = {lock_object = {lo_name = 0xc140ab08 "sched lock 0", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 3355943664}

------------

As you can see, mtx_lock is 3355943664 (0xc807a2f0), the same thread pointer that the panic string reports as the holder (tid 100045).

(kgdb) info threads
...
34 Thread 100045 (PID=12: intr/irq267: igb0:que 0) sched_switch (td=0xc807a2f0, newtd=0xc7da18d0, flags=265) at ../../../kern/sched_ule.c:1904
...

--
Best regards
Hooman Fazaeli

From owner-freebsd-hackers@freebsd.org Wed Aug 10 16:11:45 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 30D0FBB44BB for ; Wed, 10 Aug 2016 16:11:45 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 726BC1E2A for ; Wed, 10 Aug 2016 16:11:44 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u7AGBck0073844 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Wed, 10 Aug 2016 19:11:38 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u7AGBck0073844 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u7AGBbZf073843; Wed, 10 Aug 2016 19:11:37 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 10 Aug 2016 19:11:37 +0300 From: Konstantin Belousov To: Hooman Fazaeli Cc: FreeBSD 
Hackers Subject: Re: 9.3-RELEASE panic: spin lock held too long Message-ID: <20160810161137.GU83214@kib.kiev.ua> References: <57AB349B.2010805@gmail.com> <20160810141948.GP83214@kib.kiev.ua> <57AB462A.2080608@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <57AB462A.2080608@gmail.com> User-Agent: Mutt/1.6.1 (2016-04-27) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2016 16:11:45 -0000 On Wed, Aug 10, 2016 at 07:50:10PM +0430, Hooman Fazaeli wrote: > On 2016-08-10 18:49, Konstantin Belousov wrote: > > On Wed, Aug 10, 2016 at 06:35:15PM +0430, Hooman Fazaeli wrote: > >> Hi > >> > >> on a 9.3-REL i386 box we have occasional "spin lock held too long" panics. > >> > >> System info: > >> ------------- > >> - Intel(R) Core(TM) i5-4440 CPU @ 3.10GHz CPU (4 cores, no hyper-threading) > >> - 4G non-ECC RAM > >> - asterisk-1.8.30.0 from ports > >> - dahdi-kmod26-2.6.1.r10738 from ports > >> - powerd disabled. > >> - Workload: ISDN & SIP call processing. > >> ------------ > >> > >> The panics are either on 'sched lock' or 'turnstile lock' spin locks. > >> > >> PANIC 1 > >> ======= > >> As the trace below shows: > >> > >> 1- input arrives on a UDP socket > >> 2- doselwakeup is called. > >> 3- That wakeup call ends up in sched_add. > >> 4- sched_add grabs the 'sched lock 0' spin lock and apparently holds it for too long. > >> 5- The panicking thread makes the same calls as the owner thread but panics because > >> it can't grab the same spin lock. 
> >> > >> > kgdb /boot/kernel/kernel /var/crash/vmcore.14 > >> ... > >> kernel trap 12 with interrupts disabled > >> spin lock 0xc140a4c0 (sched lock 0) held by 0xc807a2f0 (tid 100045) too long > (kgdb) up 4 > #4 0xc0ac9e75 in _mtx_lock_spin (m=0xc140a4c0, tid=3384060112, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:557 > 557 ../../../kern/kern_mutex.c: No such file or directory. > in ../../../kern/kern_mutex.c > > (kgdb) p *m > $1 = {lock_object = {lo_name = 0xc140ab08 "sched lock 0", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 3355943664} > > ------------ > > As you can see, mtx_lock is 3355943664 (0xc807a2f0), the same thread pointer that the panic string reports as the holder (tid 100045). > > (kgdb) info threads > ... > 34 Thread 100045 (PID=12: intr/irq267: igb0:que 0) sched_switch (td=0xc807a2f0, newtd=0xc7da18d0, flags=265) at ../../../kern/sched_ule.c:1904 > ... >

I see. The other possibility is a spin lock leak. Can you _try_ enabling WITNESS, without the WITNESS_SKIPSPIN option? Then "show alllocks" from the ddb prompt after the panic could reveal the place that originally took the lock.
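
The owner check discussed in this exchange can also be reproduced outside kgdb. The following is only a sketch in Python, not kernel code. It assumes that the low three bits of mtx_lock are flag bits (mask 0x7, which should correspond to MTX_FLAGMASK in sys/mutex.h, but verify that against your own source tree). The helper names mtx_owner and mtx_flags are made up for illustration.

```python
# Assumption: the low three bits of mtx_lock carry flags (mask 0x7).
# Check MTX_FLAGMASK in sys/sys/mutex.h of your tree before relying on this.
MTX_FLAGMASK = 0x7


def mtx_owner(mtx_lock: int) -> int:
    """Return the struct thread pointer encoded in an mtx_lock word."""
    return mtx_lock & ~MTX_FLAGMASK


def mtx_flags(mtx_lock: int) -> int:
    """Return the flag bits stored in the low bits of an mtx_lock word."""
    return mtx_lock & MTX_FLAGMASK


# Values copied from the kgdb session in this thread.
lock_word = 3355943664   # mtx_lock printed by `p *m` for "sched lock 0"
spinner = 3384060112     # tid argument of _mtx_lock_spin in frame #4

print(hex(mtx_owner(lock_word)))  # 0xc807a2f0
print(mtx_flags(lock_word))       # 0
print(hex(spinner))               # 0xc9b4a8d0
```

With the values from the dump, the masked lock word is 0xc807a2f0, the thread pointer named in the panic string, and no flag bits are set. The tid argument in frame #4 is 0xc9b4a8d0, which is the td passed to kern_kevent in the panicking thread's backtrace, so the two threads are indeed distinct.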
From owner-freebsd-hackers@freebsd.org Wed Aug 10 16:58:31 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 511EABB5374 for ; Wed, 10 Aug 2016 16:58:31 +0000 (UTC) (envelope-from rysto32@gmail.com) Received: from mail-io0-x22e.google.com (mail-io0-x22e.google.com [IPv6:2607:f8b0:4001:c06::22e]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1A02A1C8C for ; Wed, 10 Aug 2016 16:58:31 +0000 (UTC) (envelope-from rysto32@gmail.com) Received: by mail-io0-x22e.google.com with SMTP id m101so47200175ioi.2 for ; Wed, 10 Aug 2016 09:58:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=GFBa4K8nsn5zwdQTAdMlZTzslbxJSXNDAmznDdfzhvw=; b=mi/k6eqrqUBkH4zSLv0gtQvh+zbxy0/s1ZmuD30IRB0n/xnw4hAjLbDUU2IjiFATnA yY3K0zm/jSaCbPMTZyQFjbWe+mNs/yGbwfK67q5Uyn+kMUH1QXlWJv7wMKIRaXmPl+WP U1GFk3A23ck6yXKjBRb92P9hH/FcIbZ5qapSQ/a92sli71BBKg1+lcc+cp4i2A27Gt6A tDKONG9DfRh/hbn+lsg6AMSx2rrFX2aqPFuA5W0vXZLwCLHzy8I8bKxgChI5wdH+uW0T PSW9lMX3BQ5R8QFPVlXDFXGIXmyZazpcSPe4zvtv22nDJRkUL4FwLkZyTaNojOWgY/C8 YgFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=GFBa4K8nsn5zwdQTAdMlZTzslbxJSXNDAmznDdfzhvw=; b=HMbwRrDtGyylIlZhfHjaxTpPz0cTrNNi4bMdlrTuNX7StxRD4Aa4nzI+L1ZajcXhYR Yq4oH/ks6tIEwgBMn1e+H2SSgXrR7vt/2t11admTn6KIFTnmPnU2Vb+WKQpgzp6omVUq T26U3a5+RzKfgaMquvh/d7j0U4c0Sbn4FQIHr4RKLCh0bWvLdUjz0YLtq4Cr1x3NuFFP 2U2jl91ztN67T6GV7AK8/RvD28dbHavpcGCuhlrebWqvzQCFq0lZjaBGrH4dhr3PsHOK tKird4hxdwQXOn7+hAI/CkcRQ7mZ1SdMh6wIhG5oV+7TnZDLGGi/9IjnuZDLkqeWgDHd X28Q== 
X-Gm-Message-State: AEkoouuk0HBub5k26ZRIk637xSqQCUyJLcye+xhnwfrk59jOr9xwzXxCz6aq+tYWoswJsMv3UEYgZXrwI6Mr0Q== X-Received: by 10.107.129.97 with SMTP id c94mr6568140iod.102.1470848310519; Wed, 10 Aug 2016 09:58:30 -0700 (PDT) MIME-Version: 1.0 Received: by 10.107.200.71 with HTTP; Wed, 10 Aug 2016 09:58:29 -0700 (PDT) In-Reply-To: <57AB462A.2080608@gmail.com> References: <57AB349B.2010805@gmail.com> <20160810141948.GP83214@kib.kiev.ua> <57AB462A.2080608@gmail.com> From: Ryan Stone Date: Wed, 10 Aug 2016 12:58:29 -0400 Message-ID: Subject: Re: 9.3-RELEASE panic: spin lock held too long To: Hooman Fazaeli Cc: Konstantin Belousov , FreeBSD Hackers Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.22 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2016 16:58:31 -0000 On Wed, Aug 10, 2016 at 11:20 AM, Hooman Fazaeli wrote: > > kgdb /boot/kernel/kernel /var/crash/vmcore.14 > ... > ... 
> (kgdb) bt > #0 doadump (textdump=1) at pcpu.h:250 > #1 0xc0ade835 in kern_reboot (howto=260) at ../../../kern/kern_shutdown.c: > 454 > #2 0xc0adeb32 in panic (fmt=) at > ../../../kern/kern_shutdown.c:642 > #3 0xc0ac9cff in _mtx_lock_spin_failed (m=0x0) at > ../../../kern/kern_mutex.c:515 > #4 0xc0ac9e75 in _mtx_lock_spin (m=0xc140a4c0, tid=3384060112, opts=0, > file=0x0, line=0) at ../../../kern/kern_mutex.c:557 > #5 0xc0b096c5 in sched_add (td=0xc9b00bc0, flags=0) at > ../../../kern/sched_ule.c:1153 > #6 0xc0b09890 in sched_wakeup (td=0xc9b00bc0) at > ../../../kern/sched_ule.c:1991 > #7 0xc0ae8968 in setrunnable (td=0xc9b00bc0) at > ../../../kern/kern_synch.c:537 > #8 0xc0b2227e in sleepq_resume_thread (sq=0xc869fd40, td=0xc9b00bc0, > pri=104) at ../../../kern/subr_sleepqueue.c:763 > #9 0xc0b22fd3 in sleepq_broadcast (wchan=0xc95741e4, flags=1, pri=104, > queue=0) at ../../../kern/subr_sleepqueue.c:865 > #10 0xc0a8c4cd in cv_broadcastpri (cvp=0xc95741e4, pri=104) at > ../../../kern/kern_condvar.c:448 > #11 0xc0b2a406 in doselwakeup (sip=0xc963faac, pri=104) at > ../../../kern/sys_generic.c:1683 > #12 0xc0b2a4be in selwakeuppri (sip=0xc963faac, pri=104) at > ../../../kern/sys_generic.c:1651 > #13 0xc0a9fa59 in knote_enqueue (kn=) at > ../../../kern/kern_event.c:1786 > #14 0xc0aa073f in kqueue_register (kq=0xc963fa80, kev=0xf0e07b20, > td=0xc9b4a8d0, waitok=1) at ../../../kern/kern_event.c:1154 > #15 0xc0aa09f3 in kern_kevent (td=0xc9b4a8d0, fd=152, nchanges=2, > nevents=0, k_ops=0xf0e07c20, timeout=0x0) at ../../../kern/kern_event.c:850 > #16 0xc0aa16ce in sys_kevent (td=0xc9b4a8d0, uap=0xf0e07ccc) at > ../../../kern/kern_event.c:771 > #17 0xc0fcc8c3 in syscall (frame=0xf0e07d08) at subr_syscall.c:135 > #18 0xc0fb60f1 in Xint0x80_syscall () at ../../../i386/i386/exception.s > :270 > #19 0x00000033 in ?? () > Previous frame inner to this frame (corrupt stack?) 
> > (kgdb) up 4 > #4 0xc0ac9e75 in _mtx_lock_spin (m=0xc140a4c0, tid=3384060112, opts=0, > file=0x0, line=0) at ../../../kern/kern_mutex.c:557 > 557 ../../../kern/kern_mutex.c: No such file or directory. > in ../../../kern/kern_mutex.c > > (kgdb) p *m > $1 = {lock_object = {lo_name = 0xc140ab08 "sched lock 0", lo_flags = > 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 3355943664} > > ------------ > > As you see, the mtx_lock is 3355943664 (0xc807a2f0), the same TID reported > in panic string. > > (kgdb) info threads > ... > 34 Thread 100045 (PID=12: intr/irq267: igb0:que 0) sched_switch > (td=0xc807a2f0, newtd=0xc7da18d0, flags=265) at > ../../../kern/sched_ule.c:1904 > ... This sounds somewhat familiar. Is it always 'sched lock 0' that is ultimately leaked? Could you try applying this patch and seeing whether the new KASSERT triggers? https://people.freebsd.org/~rstone/patches/sched_balance_kassert.diff From owner-freebsd-hackers@freebsd.org Wed Aug 10 17:20:51 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E21C5BB5C5E for ; Wed, 10 Aug 2016 17:20:51 +0000 (UTC) (envelope-from hoomanfazaeli@gmail.com) Received: from mail-wm0-x244.google.com (mail-wm0-x244.google.com [IPv6:2a00:1450:400c:c09::244]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 75B5E1C64 for ; Wed, 10 Aug 2016 17:20:51 +0000 (UTC) (envelope-from hoomanfazaeli@gmail.com) Received: by mail-wm0-x244.google.com with SMTP id o80so10837220wme.0 for ; Wed, 10 Aug 2016 10:20:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-transfer-encoding; 
bh=1iclcGhFwoDjB1eGjy3e7jilmyrk3td6bFkRfrAoV0M=; b=nwt5Fdwk9QB3HVMr6ITfgFDP4MTtX/QVaZ4+hkO4o8RaZT4a5oFv9IdU8ljeDtSPZs +876v4/6AlJyduBI8bv/G/tTCE2sFziNE7ZaAtcX+DTZJzy2bhGzCeiCltct5V83CKrt 6Pyp0iEmyztPT9H+/ZQBejTJVMn8ImiL7H8mfHSxpb0qPoyD/CnX5HRwKDQn6ROETR+0 3hpu6levwDNNXT95G8hGM6ULkTos2m9ZWRAO1ad58eSRcuK6hThoeTH8lqoKwctiKGAq 0D7w0N3kp7Z2SipAcV2rTogdF8kyK1GAfCJMw7dDGhJ2ixRsx98T/w6loDiLyDs0T372 uFwg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :cc:subject:references:in-reply-to:content-transfer-encoding; bh=1iclcGhFwoDjB1eGjy3e7jilmyrk3td6bFkRfrAoV0M=; b=MLN1Q5tRZvW2X0BaithXDaEs6mVfbRkvkiS60AHe3Qfhc+xjJrHjf1ESvCXCsjoB3v q7iDAD8Nve4QaGhhtacNS80QP7jjkQCLTnnmWfk1RjB1yG/VQ6jUUPRio3euvmjijQ5v ikb0MAWuAbm6Ag8ZY7HLtHBayFZpDykit4qGnspaoTDhYoB/bmYbXe3mKGtAnmUGOSHD PnIuICrRbcy+z4oDV5dzh3N9DWgCEZ+PTzWO1iD9QGpdySY1fmDs5ftoqVf9CHTJt1t2 X0DBCED438z8l4n2WE/85MpJZwRg2RaTi+q2UO8bE+Qcj36bJaeEGzjFNDJN7mc9LKGj j5bg== X-Gm-Message-State: AEkoousA+Rn2qS0q3NPU2BesNW9zyAlel7B7GETGllsrCKG+UwvY8PsfJDBjqI52m9Id9g== X-Received: by 10.194.35.42 with SMTP id e10mr5065855wjj.107.1470849649915; Wed, 10 Aug 2016 10:20:49 -0700 (PDT) Received: from [192.168.2.30] ([2.190.216.101]) by smtp.googlemail.com with ESMTPSA id m62sm9319733wmm.24.2016.08.10.10.20.48 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 10 Aug 2016 10:20:49 -0700 (PDT) Message-ID: <57AB626C.6010904@gmail.com> Date: Wed, 10 Aug 2016 21:50:44 +0430 From: Hooman Fazaeli User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130215 Thunderbird/17.0.3 MIME-Version: 1.0 To: Konstantin Belousov CC: FreeBSD Hackers Subject: Re: 9.3-RELEASE panic: spin lock held too long References: <57AB349B.2010805@gmail.com> <20160810141948.GP83214@kib.kiev.ua> <57AB462A.2080608@gmail.com> <20160810161137.GU83214@kib.kiev.ua> In-Reply-To: <20160810161137.GU83214@kib.kiev.ua> Content-Type: text/plain; 
charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2016 17:20:52 -0000 On 2016-08-10 20:41, Konstantin Belousov wrote: > On Wed, Aug 10, 2016 at 07:50:10PM +0430, Hooman Fazaeli wrote: >> On 2016-08-10 18:49, Konstantin Belousov wrote: >>> On Wed, Aug 10, 2016 at 06:35:15PM +0430, Hooman Fazaeli wrote: >>>> Hi >>>> >>>> on a 9.3-REL i386 box we have occasional "spin lock held too long" panics. >>>> >>>> System info: >>>> ------------- >>>> - Intel(R) Core(TM) i5-4440 CPU @ 3.10GHz CPU (4 cores, no hyper theading) >>>> - 4G non-ECC RAM >>>> - asterisk-1.8.30.0 from ports >>>> - dahdi-kmod26-2.6.1.r10738 from ports >>>> - powerd disabled. >>>> - Workload: ISDN & SIP call processing. >>>> ------------ >>>> >>>> The panics are either on 'sched lock' or 'turnstile lock' spin locks. >>>> >>>> PANIC 1 >>>> ======= >>>> As below trace shows: >>>> >>>> 1- input arrives on a UDP socket >>>> 2- doselwakeup is called. >>>> 3- That wakeup call ends up in sched_add. >>>> 4- sched_add grabs 'sched lock 0' spin lock, and aparenlty, holds it for a too long time. >>>> 5- The pancing thread does the same calls as owner thread but panics because >>>> it can't grab the the same spin lock. >>>> >>>> > kgdb /boot/kernel/kernel /var/crash/vmcore.14 >>>> ... >>>> kernel trap 12 with interrupts disabled >>>> spin lock 0xc140a4c0 (sched lock 0) held by 0xc807a2f0 (tid 100045) too long >> (kgdb) up 4 >> #4 0xc0ac9e75 in _mtx_lock_spin (m=0xc140a4c0, tid=3384060112, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:557 >> 557 ../../../kern/kern_mutex.c: No such file or directory. 
>> in ../../../kern/kern_mutex.c >> >> (kgdb) p *m >> $1 = {lock_object = {lo_name = 0xc140ab08 "sched lock 0", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 3355943664} >> >> ------------ >> >> As you see, the mtx_lock is 3355943664 (0xc807a2f0), the same TID reported in panic string. >> >> (kgdb) info threads >> ... >> 34 Thread 100045 (PID=12: intr/irq267: igb0:que 0) sched_switch (td=0xc807a2f0, newtd=0xc7da18d0, flags=265) at ../../../kern/sched_ule.c:1904 >> ... >> > I see. What else could be, is the spinlock leak. > Can you _try_ to enable the WITNESS, without WITNESS_SKIPSPIN option. > Then show alllocks from the ddb prompt after the panic could reveal > the place which originally locked it. I will (it may take a while) Thanks. -- Best regards Hooman Fazaeli From owner-freebsd-hackers@freebsd.org Wed Aug 10 17:24:02 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A6D4CBB5DC2 for ; Wed, 10 Aug 2016 17:24:02 +0000 (UTC) (envelope-from hoomanfazaeli@gmail.com) Received: from mail-wm0-x22a.google.com (mail-wm0-x22a.google.com [IPv6:2a00:1450:400c:c09::22a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 25D841EAD for ; Wed, 10 Aug 2016 17:24:02 +0000 (UTC) (envelope-from hoomanfazaeli@gmail.com) Received: by mail-wm0-x22a.google.com with SMTP id q128so107051459wma.1 for ; Wed, 10 Aug 2016 10:24:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to; bh=w9qA69b5PmjssCKhFLpjS1HvCcGMr99t+jHQfU6iVps=; b=0SXx4a/RWAswN8A9ozLjSFhcTImfZfcbq7WhfKlwlUFMQrgVZG5T+ua180cLfwTS50 
W5o50nrOz2msGnTnfMGXE67ZbkHvxL/OS/8dexUc5aPT7pUBX0rDlJqwYoEMDqbrQ+TG 4NFvVH+Ts8wDyDPVH0LU808W0Ukp1WfHqm3f07FKiNqLAnvRjU5IXsUfsxSBUitfMurB xPdhzCJilHw33kTKqiiNDmFrGMzjgQELwdvzIEVa21KxejeSiXSnLcAUTUE0+LOzx+hk dVZ548AfGb7bohp8ZawdrOoUSKbgGmtEAh2hE9EJEdZJtnRPovLVQaJnUJHQFd/4+reZ gaUg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :cc:subject:references:in-reply-to; bh=w9qA69b5PmjssCKhFLpjS1HvCcGMr99t+jHQfU6iVps=; b=Jj2MqbUyPmDMH167hHElsRT72ptHEejxmiI6UT1SJKxn3c6e12j+iIDH4XQj08kd04 hNfWc77mIx43qOjwrSHEFShqdvkMrc4YUKhgtyIskmBYF7SdZ9sJjtu9cRvt57maCxNI 7jh73p4U8vcEGWAiTJPp11ps6nF7k7LkOE28kYKlGgbLW4kFgSO3SLXnfDOqm1x+zybk /OxReX0EdbzDb1nsCkTSrUpRqQUgx3DrrEMdxNPA7JohNa9sWMOITxjpn/W0H9x5ugcw N7zdAHRgR6047fJzBV11IDPy+qAWZcjFuh/gGXg9eEXhvSIymMGJYsRUFvMx3mo9S+tj G4XQ== X-Gm-Message-State: AEkoousuFWE7aOs0nx5wRLJAEPgVcYiqT4TA1GSwNFkGFBz6afsIrcDS8GWSwbISmNKvww== X-Received: by 10.194.104.106 with SMTP id gd10mr5871783wjb.55.1470849840690; Wed, 10 Aug 2016 10:24:00 -0700 (PDT) Received: from [192.168.2.30] ([2.190.216.101]) by smtp.googlemail.com with ESMTPSA id m81sm9362872wmf.1.2016.08.10.10.23.58 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 10 Aug 2016 10:24:00 -0700 (PDT) Message-ID: <57AB632D.4000501@gmail.com> Date: Wed, 10 Aug 2016 21:53:57 +0430 From: Hooman Fazaeli User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130215 Thunderbird/17.0.3 MIME-Version: 1.0 To: Ryan Stone CC: Konstantin Belousov , FreeBSD Hackers Subject: Re: 9.3-RELEASE panic: spin lock held too long References: <57AB349B.2010805@gmail.com> <20160810141948.GP83214@kib.kiev.ua> <57AB462A.2080608@gmail.com> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.22 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical 
Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2016 17:24:02 -0000 On 2016-08-10 21:28, Ryan Stone wrote: > On Wed, Aug 10, 2016 at 11:20 AM, Hooman Fazaeli > wrote: > > > kgdb /boot/kernel/kernel /var/crash/vmcore.14 > ... > ... > (kgdb) bt > #0 doadump (textdump=1) at pcpu.h:250 > #1 0xc0ade835 in kern_reboot (howto=260) at ../../../kern/kern_shutdown.c:454 > #2 0xc0adeb32 in panic (fmt=) at ../../../kern/kern_shutdown.c:642 > #3 0xc0ac9cff in _mtx_lock_spin_failed (m=0x0) at ../../../kern/kern_mutex.c:515 > #4 0xc0ac9e75 in _mtx_lock_spin (m=0xc140a4c0, tid=3384060112, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:557 > #5 0xc0b096c5 in sched_add (td=0xc9b00bc0, flags=0) at ../../../kern/sched_ule.c:1153 > #6 0xc0b09890 in sched_wakeup (td=0xc9b00bc0) at ../../../kern/sched_ule.c:1991 > #7 0xc0ae8968 in setrunnable (td=0xc9b00bc0) at ../../../kern/kern_synch.c:537 > #8 0xc0b2227e in sleepq_resume_thread (sq=0xc869fd40, td=0xc9b00bc0, pri=104) at ../../../kern/subr_sleepqueue.c:763 > #9 0xc0b22fd3 in sleepq_broadcast (wchan=0xc95741e4, flags=1, pri=104, queue=0) at ../../../kern/subr_sleepqueue.c:865 > #10 0xc0a8c4cd in cv_broadcastpri (cvp=0xc95741e4, pri=104) at ../../../kern/kern_condvar.c:448 > #11 0xc0b2a406 in doselwakeup (sip=0xc963faac, pri=104) at ../../../kern/sys_generic.c:1683 > #12 0xc0b2a4be in selwakeuppri (sip=0xc963faac, pri=104) at ../../../kern/sys_generic.c:1651 > #13 0xc0a9fa59 in knote_enqueue (kn=) at ../../../kern/kern_event.c:1786 > #14 0xc0aa073f in kqueue_register (kq=0xc963fa80, kev=0xf0e07b20, td=0xc9b4a8d0, waitok=1) at ../../../kern/kern_event.c:1154 > #15 0xc0aa09f3 in kern_kevent (td=0xc9b4a8d0, fd=152, nchanges=2, nevents=0, k_ops=0xf0e07c20, timeout=0x0) at ../../../kern/kern_event.c:850 > #16 0xc0aa16ce in sys_kevent (td=0xc9b4a8d0, uap=0xf0e07ccc) at ../../../kern/kern_event.c:771 > #17 0xc0fcc8c3 in syscall 
(frame=0xf0e07d08) at subr_syscall.c:135 > #18 0xc0fb60f1 in Xint0x80_syscall () at ../../../i386/i386/exception.s:270 > #19 0x00000033 in ?? () > Previous frame inner to this frame (corrupt stack?) > > (kgdb) up 4 > #4 0xc0ac9e75 in _mtx_lock_spin (m=0xc140a4c0, tid=3384060112, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:557 > 557 ../../../kern/kern_mutex.c: No such file or directory. > in ../../../kern/kern_mutex.c > > (kgdb) p *m > $1 = {lock_object = {lo_name = 0xc140ab08 "sched lock 0", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 3355943664} > > ------------ > > As you see, the mtx_lock is 3355943664 (0xc807a2f0), the same TID reported in panic string. > > (kgdb) info threads > ... > 34 Thread 100045 (PID=12: intr/irq267: igb0:que 0) sched_switch (td=0xc807a2f0, newtd=0xc7da18d0, flags=265) at ../../../kern/sched_ule.c:1904 > ... > > > This sounds somewhat familiar. Is it always 'sched lock 0' that is ultimately leaked? Could you try applying this patch and seeing whether the new KASSERT triggers? > > https://people.freebsd.org/~rstone/patches/sched_balance_kassert.diff > No. I have panics involving 'turnstile lock' (see the original post) and 'sched lock 2' too. 
-- Best regards Hooman Fazaeli From owner-freebsd-hackers@freebsd.org Wed Aug 10 17:40:21 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D41B9BB5023 for ; Wed, 10 Aug 2016 17:40:21 +0000 (UTC) (envelope-from rysto32@gmail.com) Received: from mail-io0-x233.google.com (mail-io0-x233.google.com [IPv6:2607:f8b0:4001:c06::233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9CB3A154F for ; Wed, 10 Aug 2016 17:40:21 +0000 (UTC) (envelope-from rysto32@gmail.com) Received: by mail-io0-x233.google.com with SMTP id b62so48332060iod.3 for ; Wed, 10 Aug 2016 10:40:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=UDm8B3kp7vHNzH/FcFeepkOOvl2n+eawAus7UVbXnx0=; b=cfoUvNEh30LrYm/ZQ0YwMHAGdivcBRLGrHIUak0G2Zo8Kan6mLZ71/p+w43R76jwtX 5xHaDVp8YnSHcgangXwYQvb65CLxIfVHardxpzzI3/It/bVfPvz+0PRrZV/Sd3jb0ULg EaGnOK1QZV6Dmo4/grSgG4tBh3buBHlS8GmZnCvJ+7NgMBqRrG04H+wncIqyyI6YIoZl r+/8X7hVnewlR8VuVJvVvkbPqetnGbibQpe/emBhKwVvIsA4wD9bPljOzJ/84jilvHJ1 7zVtAXDlageM7XGj2msiAQl57CJ+9O+c8AcU/Nyn9IFEXCH1sorZlU67LRRjvyWlb9lU 6nbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=UDm8B3kp7vHNzH/FcFeepkOOvl2n+eawAus7UVbXnx0=; b=IUKigQ+e/lkrDu+RJMwHONUdvWrC96R/1xDyPVkECs+1OsopHajPJiaK5cVOMen0VY t+EeooV5fcT0B/jGOOHPhWaRIZtgNz8aYZKP99TQdKhf2W1c/rqcHgP1afdVmCXdB2u9 TQsEmSrv9Dllg/Wp77XpS/IelGU1L6m56EnhO4TI/15mMTgxVVNgYnC20+xwMIc+eohs fHL9RbLqCI6fEu7Qxm9V3FyZqK1S3LZHPxlfp6fejYM+o46fdCQvPlRNn3ztfeHiAB0I 
31AB4t/6fSDRqKOe0rjuNmlty5NMmGu7hByPhmFqSygFPBbHbYkRgXTd0Xz2hKhTUaLx rDPw== X-Gm-Message-State: AEkoouvCIVQlNJOSGmIawbI4oZd2T3AiZL8p/7YUVdhy+e9O6rkonoggm1K4hhSSZyKdpfB1YhW5FMvIDkpQhQ== X-Received: by 10.107.150.83 with SMTP id y80mr7004984iod.113.1470850821081; Wed, 10 Aug 2016 10:40:21 -0700 (PDT) MIME-Version: 1.0 Received: by 10.107.200.71 with HTTP; Wed, 10 Aug 2016 10:40:20 -0700 (PDT) In-Reply-To: <57AB632D.4000501@gmail.com> References: <57AB349B.2010805@gmail.com> <20160810141948.GP83214@kib.kiev.ua> <57AB462A.2080608@gmail.com> <57AB632D.4000501@gmail.com> From: Ryan Stone Date: Wed, 10 Aug 2016 13:40:20 -0400 Message-ID: Subject: Re: 9.3-RELEASE panic: spin lock held too long To: Hooman Fazaeli Cc: Konstantin Belousov , FreeBSD Hackers Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.22 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2016 17:40:21 -0000 On Wed, Aug 10, 2016 at 1:23 PM, Hooman Fazaeli wrote: > No. I have panics involving 'turnstile lock' (see the original post) and > 'sched lock 2' too. > That doesn't necessarily mean that the root cause isn't due to sched lock 0 being leaked. You'd have to dig into the cores and look at the chain of dependent locks to be sure. Give the patch a try; it should panic quite quickly if it's the issue I am thinking of. 
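When digging into the cores, the lock owner can be cross-checked from the numbers kgdb already printed earlier in this thread. The following is only a sketch: it assumes that the low three bits of mtx_lock are flag bits and that the remaining bits are the owning thread pointer. The name MTX_FLAGMASK and the value 0x7 are assumptions recalled from sys/mutex.h, not something confirmed in this thread.

```python
# Values copied from the kgdb session quoted earlier in this thread.
MTX_FLAGMASK = 0x7  # assumption: low three bits of mtx_lock are flag bits


def mtx_owner(mtx_lock):
    """Return the owning thread pointer encoded in a mtx_lock word."""
    return mtx_lock & ~MTX_FLAGMASK


owner = mtx_owner(3355943664)  # mtx_lock printed by "p *m"
waiter = 3384060112            # tid argument of _mtx_lock_spin in frame #4

print(hex(owner))   # 0xc807a2f0, the thread named in the panic string
print(hex(waiter))  # 0xc9b4a8d0, the td of the thread that gave up waiting
```

The first value matches "held by 0xc807a2f0 (tid 100045)" in the panic string; the second matches the td shown for kern_kevent and sys_kevent in the backtrace, i.e. the thread that panicked while spinning.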
From owner-freebsd-hackers@freebsd.org Wed Aug 10 23:13:29 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4324EBB5D61 for ; Wed, 10 Aug 2016 23:13:29 +0000 (UTC) (envelope-from hoomanfazaeli@gmail.com) Received: from mail-wm0-x230.google.com (mail-wm0-x230.google.com [IPv6:2a00:1450:400c:c09::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C42D611DE for ; Wed, 10 Aug 2016 23:13:28 +0000 (UTC) (envelope-from hoomanfazaeli@gmail.com) Received: by mail-wm0-x230.google.com with SMTP id q128so121130469wma.1 for ; Wed, 10 Aug 2016 16:13:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to; bh=yRRqnDLqHjo/jgDCBJ3HS0xA1SW8N24DoKwWhYG2T8A=; b=YNNevA7jyMHxjvWse30Ac5rCl9EPKvzf6x+yk1Uu/MX8qOKplyl4TbDpknTKshSmkU D+TQDQ469ZY4sHuCNxChmvRDgFuJkAKR1lZS9VIldwsW6EZWyfVHuljNr8yxweROcfvD mjZMJyHNoqM5aQHUiF2krTO4QzjZ5SOJLX0+pm4XoawXlHE5TZj+2BRipsqp+oa2fyPA ErcVgnQymS25hjTwBcuQ96lGnlA0eQCq8bwBnimO7lqCiwB293T8KMAYp65cQ/MkLQMa mHcO+e4rY2e11EgWHx6Qd3IBb7jSxQjF0rez6CA34I9A8GrYkOtQ3TIw/F0Zg+LT4e/i YWNA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :cc:subject:references:in-reply-to; bh=yRRqnDLqHjo/jgDCBJ3HS0xA1SW8N24DoKwWhYG2T8A=; b=GVZZxxwG2cEWAhheBlHEAcIJkoO25+8GGav00aKyABZd545BMrzoAX504qFHovixJB XUZMc1hPelcpqPFxBbaO+/RNWdYENnyKD3Vz6x9rna23rLzMDzURVHvigUhYy1vTbHnM LqkVfXqwOuxD21hjSJNO9oi4jsxxkX7s5YVxnPXX3RDEDKpidMtk3WlcJUgf0n2izHCr Z6LUgl33lH42jc3qez3vwHPeOEq8NzYZsNEXg3j5quvr6rhuB8IutBWGl085g5aUUjGH 
Ax0ZhYI7z0Wd7w55qkSv1ayur49kvCRMGIY8S2ipRTNXehXUn5BqPhnpRlZgt/zpQU3R fEJw== X-Gm-Message-State: AEkooutI0gqIqaRySZL0OwrUlPcRK/GgdLxubfwrBo8CeTWZonW+e8FyV08mUDiYLjWEFQ== X-Received: by 10.194.190.232 with SMTP id gt8mr6280705wjc.141.1470870805863; Wed, 10 Aug 2016 16:13:25 -0700 (PDT) Received: from [192.168.2.30] ([2.190.216.101]) by smtp.googlemail.com with ESMTPSA id cw7sm11347486wjb.38.2016.08.10.16.13.23 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 10 Aug 2016 16:13:25 -0700 (PDT) Message-ID: <57ABB512.4030503@gmail.com> Date: Thu, 11 Aug 2016 03:43:22 +0430 From: Hooman Fazaeli User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130215 Thunderbird/17.0.3 MIME-Version: 1.0 To: Ryan Stone CC: Konstantin Belousov , FreeBSD Hackers Subject: Re: 9.3-RELEASE panic: spin lock held too long References: <57AB349B.2010805@gmail.com> <20160810141948.GP83214@kib.kiev.ua> <57AB462A.2080608@gmail.com> <57AB632D.4000501@gmail.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.22 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2016 23:13:29 -0000 On 2016-08-10 22:10, Ryan Stone wrote: > On Wed, Aug 10, 2016 at 1:23 PM, Hooman Fazaeli > wrote: > > No. I have panics involving 'turnstile lock' (see the original post) and 'sched lock 2' too. > > > That doesn't necessarily mean that the root cause isn't due to sched lock 0 being leaked. You'd have to dig into the cores and look at the chain of dependent locks to be sure. Give the patch a > try; it should panic quite quickly if it's the issue I am thinking of. Sure, I will. BTW, what do you exactly mean by lock leaking? Is there a list for the possible causes of 'spin lock held too long' panics? 
I mean, what sorts of coding bugs may cause a thread to hold a spin lock for a long time? Such a list would give me a starting point for diagnostics. And how long is 'too long'? What is the justification behind the few million for() loop iterations that _mtx_lock_spin waits to grab a spin lock? Thanks. -- Best regards Hooman Fazaeli From owner-freebsd-hackers@freebsd.org Thu Aug 11 08:18:07 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7EF0ABB59AF for ; Thu, 11 Aug 2016 08:18:07 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1614415CE for ; Thu, 11 Aug 2016 08:18:06 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u7B8I21L017390 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 11 Aug 2016 11:18:02 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u7B8I21L017390 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u7B8I2Pk017389; Thu, 11 Aug 2016 11:18:02 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 11 Aug 2016 11:18:02 +0300 From: Konstantin Belousov To: Hooman Fazaeli Cc: Ryan Stone , FreeBSD Hackers Subject: Re: 9.3-RELEASE panic: spin lock held too long Message-ID: <20160811081802.GF83214@kib.kiev.ua> References: <57AB349B.2010805@gmail.com> <20160810141948.GP83214@kib.kiev.ua> <57AB462A.2080608@gmail.com> <57AB632D.4000501@gmail.com> <57ABB512.4030503@gmail.com> MIME-Version: 1.0 Content-Type:
text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <57ABB512.4030503@gmail.com> User-Agent: Mutt/1.6.1 (2016-04-27) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,FREEMAIL_REPLY,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Aug 2016 08:18:07 -0000 On Thu, Aug 11, 2016 at 03:43:22AM +0430, Hooman Fazaeli wrote: > On 2016-08-10 22:10, Ryan Stone wrote: > > On Wed, Aug 10, 2016 at 1:23 PM, Hooman Fazaeli > wrote: > > > > No. I have panics involving 'turnstile lock' (see the original post) and 'sched lock 2' too. > > > > > > That doesn't necessarily mean that the root cause isn't due to sched lock 0 being leaked. You'd have to dig into the cores and look at the chain of dependent locks to be sure. Give the patch a > > try; it should panic quite quickly if it's the issue I am thinking of. > > Sure, I will. > BTW, what do you exactly mean by lock leaking? > > Is there a list for the possible causes of 'spin lock held too long' panics? > I mean, what sorts of coding bugs may cause a thread to hold a spin lock for > a long time? Such a list would provide me an starting point for diagnostics. It is impossible to provide the complete list. Possible causes are: - already mentioned lock leak; - lock recursion (sometimes); - something which delays execution of the protected region, which takes the spinlock for otherwise legitimate reasons and period, eg. infinite or too aggressive looping, e.g. 
due to a deadlock with spinlocks; an NMI with a run-away handler; a core that failed and stopped executing; an SMI or hypervisor taking control of the given CPU away from the OS, while allowing other threads on other CPUs to run and notice that; and so on. > > And, How much long is 'too long'? What is the justification behind > the few million for() loop iterations that _mtx_lock_spin waits > to grab a spin lock? This is purely based on real-life experience with the hardware. If faster CPUs with slower inter-core communication facilities ever appear, the constant might need an adjustment. It is fine for the currently fastest hardware, and by design is OK for anything slower. From owner-freebsd-hackers@freebsd.org Thu Aug 11 17:54:11 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 63BF5BB5981 for ; Thu, 11 Aug 2016 17:54:11 +0000 (UTC) (envelope-from david@catwhisker.org) Received: from mailman.ysv.freebsd.org (unknown [127.0.1.3]) by mx1.freebsd.org (Postfix) with ESMTP id 4DD961D89 for ; Thu, 11 Aug 2016 17:54:11 +0000 (UTC) (envelope-from david@catwhisker.org) Received: by mailman.ysv.freebsd.org (Postfix) id 4D2D7BB5980; Thu, 11 Aug 2016 17:54:11 +0000 (UTC) Delivered-To: hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4CD21BB597F for ; Thu, 11 Aug 2016 17:54:11 +0000 (UTC) (envelope-from david@catwhisker.org) Received: from albert.catwhisker.org (mx.catwhisker.org [198.144.209.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2168A1D88 for ; Thu, 11 Aug 2016 17:54:10 +0000 (UTC) (envelope-from david@catwhisker.org) Received: from albert.catwhisker.org (localhost [127.0.0.1]) by albert.catwhisker.org
(8.15.2/8.15.2) with ESMTP id u7BHs9fe033417 for ; Thu, 11 Aug 2016 17:54:09 GMT (envelope-from david@albert.catwhisker.org) Received: (from david@localhost) by albert.catwhisker.org (8.15.2/8.15.2/Submit) id u7BHs9Gt033416 for hackers@freebsd.org; Thu, 11 Aug 2016 10:54:09 -0700 (PDT) (envelope-from david) Date: Thu, 11 Aug 2016 10:54:09 -0700 From: David Wolfskill To: hackers@freebsd.org Subject: "ipmi0: KCS..." whines Message-ID: <20160811175409.GW1112@albert.catwhisker.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="hn090KliNom6wizA" Content-Disposition: inline User-Agent: Mutt/1.6.1 (2016-04-27) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Aug 2016 17:54:11 -0000 --hn090KliNom6wizA Content-Type: multipart/mixed; boundary="3xvobRwI6W2FSfVC" Content-Disposition: inline --3xvobRwI6W2FSfVC Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable I'm trying to figure out how to cope with some messages: ... kernel: ipmi0: KCS: Failed to read completion code ... kernel: ipmi0: KCS error: ff that recently started showing up on some machines at work (which run something based on FreeBSD). As the messages started showing up shortly after a colleague committed some code that would issue "ipmitool dcmi power reading" periodically, I tried issuing the command myself... to no (initial) effect. I then tried doing it in a loop: foreach n (`jot 10`) ipmitool dcmi power reading end Nothing. But then, 10 isn't a very big number, so I tried again with 32 -- and got a few.
That said, I have seen no evidence of data corruption or mis-reporting that coincides with the messages being issued: at this point, the messages seem to merely be annoying, and there's nothing I can do to stop getting them (except to not issue "ipmitool dcmi power reading", perhaps). Also, I have no real evidence that whatever is causing the messages to be issued "in real life" is actually related to my "artificial" recreation of the symptoms. I poked around a bit, and found that the messages are issued from src/sys/dev/ipmi/ipmi_kcs.c, where device_printf(9) is used to emit them. (Eventually, that gets around to using printf(9), which is why I was unable to merely use syslog.conf to stop spamming the console.) As a proof-of-concept, I cobbled up the attached patch (with significant help from a colleague). It seems to do as intended: suppress the messages for a "normal" boot, but emit them for a verbose boot. I'm not at all happy with that approach: it's much closer to "treating the symptom" than to "fixing the root cause." On the other hand, it's not clear that emitting a message about which I can do nothing useful is all that ... useful. :-} But ... none of the machines to which I have access and have the requisite hardware are running stock FreeBSD. Is anyone able to reproduce this on a stock FreeBSD machine? Better: anyone have a better idea how to address the issue? Thanks! Peace, david -- David H. Wolfskill david@catwhisker.org Those who would murder in the name of God or prophet are blasphemous cowards. See http://www.catwhisker.org/~david/publickey.gpg for my public key.
--3xvobRwI6W2FSfVC Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="ipmi_kcs.c.diff" Content-Transfer-Encoding: quoted-printable
diff --git a/sys/dev/ipmi/ipmi_kcs.c b/sys/dev/ipmi/ipmi_kcs.c
--- a/sys/dev/ipmi/ipmi_kcs.c
+++ b/sys/dev/ipmi/ipmi_kcs.c
@@ -148,7 +148,7 @@
 
 		/* Read error status */
 		data = INB(sc, KCS_DATA);
-		if (data != 0)
+		if (data != 0 && (data != 0xff || bootverbose))
 			device_printf(sc->ipmi_dev, "KCS error: %02x\n",
 			    data);
 
@@ -414,8 +414,10 @@
 
 	/* Next we read the completion code. */
 	if (kcs_read_byte(sc, &req->ir_compcode) != 1) {
-		device_printf(sc->ipmi_dev,
-		    "KCS: Failed to read completion code\n");
+		if (bootverbose) {
+			device_printf(sc->ipmi_dev,
+			    "KCS: Failed to read completion code\n");
+		}
 		goto fail;
 	}
 #ifdef KCS_DEBUG
--3xvobRwI6W2FSfVC-- --hn090KliNom6wizA Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQF8BAEBCgBmBQJXrLvBXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRDQ0I3Q0VGOTE3QTgwMUY0MzA2NEQ3N0Ix NTM5Q0M0MEEwNDlFRTE3AAoJEBU5zECgSe4XS4UH/A48/9XlKrqqtHK+okEr6FJ0 8Lx1mPYHLRE3+qX9XEjBawJT0P9wLLWPcL6jDaWJvTsyLjif2bGlMmsDNgTxsHy+ siHzJdusVA/GgWtOUtekH9QtMpKszmTPks0wEpheRoGx+HMLZIS3QewM+Sbw9dL5 X7o81vE4Kksh5GBSqrwuNSYbk1kaCl3QjlVWsdcnSafQtM5cWaBfel/lOUeIqmjn zJIcE1ZVUR7vaHEZDBvBAhJtamv6WQSGiAvsdDSu8s8F3ZTyOv/T1MQeVavFmUHM heXHX98wc+8NziTVYdbwNA2tyfbvI0ffKwvJsxwoaJpesZzsIaAkMlvSVdygf40= =0YL2 -----END PGP SIGNATURE----- --hn090KliNom6wizA-- From owner-freebsd-hackers@freebsd.org Fri Aug 12 06:35:37 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B4BC8BB7A58 for ; Fri, 12 Aug 2016 06:35:37 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.116.210]) (using TLSv1 with cipher
DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7188216D4; Fri, 12 Aug 2016 06:35:37 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from bach.cs.huji.ac.il ([132.65.81.13]) by kabab.cs.huji.ac.il with esmtp id 1bY63y-000PGa-Ig; Fri, 12 Aug 2016 09:35:26 +0300 From: Daniel Braniss Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Subject: autofs/special_hosts issue Date: Fri, 12 Aug 2016 09:35:26 +0300 Message-Id: <174CC64D-D9C2-46B7-A21C-D9A95423D137@cs.huji.ac.il> Cc: =?utf-8?Q?Edward_Tomasz_Napiera=C5=82a?= To: "freebsd-hackers@freebsd.org" Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) X-Mailer: Apple Mail (2.3124) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Aug 2016 06:35:37 -0000 hi, show mount -e serverhost … /z/a/b Everyone /z/a Everyone … but /net/serverhost/z/a/b is empty. is there a workaround?
cheers, danny From owner-freebsd-hackers@freebsd.org Fri Aug 12 11:27:59 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D13E3BB7262 for ; Fri, 12 Aug 2016 11:27:59 +0000 (UTC) (envelope-from etnapierala@gmail.com) Received: from mail-wm0-x233.google.com (mail-wm0-x233.google.com [IPv6:2a00:1450:400c:c09::233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 5F67B19BA for ; Fri, 12 Aug 2016 11:27:59 +0000 (UTC) (envelope-from etnapierala@gmail.com) Received: by mail-wm0-x233.google.com with SMTP id f65so22580101wmi.0 for ; Fri, 12 Aug 2016 04:27:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:date:from:to:cc:subject:message-id:mail-followup-to :references:mime-version:content-disposition :content-transfer-encoding:in-reply-to:user-agent; bh=KqvQkRDXaRcBt+RGTJK9QiZ3KAS3x4CYub9Hd6zyCYk=; b=RZaqLCu+xzTQVhbmuu+zIDg2bCGM1qqPjesfTwQPyYKqiPmNLyIld7F7adPsrJHgQw B8i0zpheVPrI0xqBe4v1gofLzxbOqxs/IWqQTp0YPVWv/ZWhMqK9ZJ1dENOnLuw1JP5J 0WjF08XWbpaBBBOd5pV4vo0N/Gnr7eMy61kii9ndQq4AYosbA75hBTA0vnVVioN0FmnJ 5RHj6RZe/29PlFECXCS6iG/bcT9vDPy3PDCycoYdeX38cOYK848hX2gD6HhtteHj227y LOjINA+sfc1hrX/auQMdas0qKWllFzeXKjG6gXBF2aatkVUCt3XLkc1DkcvhKDpj9fw1 VwpQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :mail-followup-to:references:mime-version:content-disposition :content-transfer-encoding:in-reply-to:user-agent; bh=KqvQkRDXaRcBt+RGTJK9QiZ3KAS3x4CYub9Hd6zyCYk=; b=HnFeXLwQLn0CMceykycZi2tqlGgNlGSnFKkb+EcrLQrII3Qx7iVMybXnkt9mXAUp/1 g5Iibzms8Oo9U3H7eaYVTzklJfSglMhA213cRaanLxUeBAuV5DrMRTzWAdyE9Yzm1ihj 
MLM3gv5X2D9arNYFi1dJNQ8JGrlRA92pNvKLAyl7PPoTeJ4VwU+2jDaPAChHEWOfXmUx 3xebdJ8FUeW1XbSAiniQjT+CpnW5aNUfCysuCx7/Az44jj3yftCplgW9TqpG3Qdvo28J IAaAqTrufk1Srn8/V+XZ9wFaNB2ifE9Puceq7sGlblZ2EPUqRP/WKwgjov6VdavAwMPh 3bKQ== X-Gm-Message-State: AEkoouun5shhJN8jR4rLVnhy/Mfg/ZOHPUezB6TrspvjfJR72mM1pJaQ/E6BWAKRjM4SbA== X-Received: by 10.194.175.38 with SMTP id bx6mr16984936wjc.47.1471001277385; Fri, 12 Aug 2016 04:27:57 -0700 (PDT) Received: from brick (abvv96.neoplus.adsl.tpnet.pl. [83.8.219.96]) by smtp.gmail.com with ESMTPSA id e10sm7118231wjc.21.2016.08.12.04.27.55 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 12 Aug 2016 04:27:56 -0700 (PDT) Sender: =?UTF-8?Q?Edward_Tomasz_Napiera=C5=82a?= Date: Fri, 12 Aug 2016 13:27:53 +0200 From: Edward Tomasz =?utf-8?Q?Napiera=C5=82a?= To: Daniel Braniss Cc: "freebsd-hackers@freebsd.org" Subject: Re: autofs/special_hosts issue Message-ID: <20160812112753.GA28317@brick> Mail-Followup-To: Daniel Braniss , "freebsd-hackers@freebsd.org" References: <174CC64D-D9C2-46B7-A21C-D9A95423D137@cs.huji.ac.il> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <174CC64D-D9C2-46B7-A21C-D9A95423D137@cs.huji.ac.il> User-Agent: Mutt/1.6.1 (2016-04-27) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Aug 2016 11:27:59 -0000 On 0812T0935, Daniel Braniss wrote: > hi, > show mount -e serverhost > … > /z/a/b Everyone > /z/a Everyone > … > > but /net/serverhost/z/a/b is empty. > > is there a workaround? Kind of - you can either change the server's exported layout, or use a static map instead of -hosts and manually map that layout onto something "flat" (without nested mounts) on your side.
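The "flat" static map suggested above could be generated from the export list roughly as follows. This is a hedged sketch, not a tested configuration: the helper name flat_map and the key naming scheme (slashes replaced by underscores) are made up for illustration, and the output assumes the usual "key location" map line format with an NFS location written as host:/path.

```python
def flat_map(server, exports):
    """Build static-map lines with one top-level key per export."""
    lines = []
    for path in sorted(exports):
        # Hypothetical naming scheme: /z/a/b becomes the key z_a_b,
        # so no mount point ends up nested inside another one.
        key = path.strip("/").replace("/", "_")
        lines.append(f"{key}\t{server}:{path}")
    return lines


# Export list taken from the showmount output quoted above.
for line in flat_map("serverhost", ["/z/a/b", "/z/a"]):
    print(line)
```

The printed lines would go into a map file that is referenced from /etc/auto_master in place of the -hosts entry; the map file name and the mount point are left to the administrator.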
From owner-freebsd-hackers@freebsd.org Fri Aug 12 17:35:17 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 72FBDBB8439 for ; Fri, 12 Aug 2016 17:35:17 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: from mail-yb0-x22b.google.com (mail-yb0-x22b.google.com [IPv6:2607:f8b0:4002:c09::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3473C12D7 for ; Fri, 12 Aug 2016 17:35:17 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: by mail-yb0-x22b.google.com with SMTP id d10so5577402ybi.1 for ; Fri, 12 Aug 2016 10:35:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to; bh=1VirmK7KnlgpHZDS+Ar6YCwVFU5VUXjoBRlcG7hRxB8=; b=s7si5rbwx73cThmDGuJldWq5zGUbjCyEsTb9ynASvQ59YOvsLeMh9JBsT2vmeyrtzZ +98t/iAfNEa/YejA6rhcJHjSfIRXjz9iNsq4GVcKJ7ZlMWV4PNukfOhIMl/2tKeGSIDy j09D8ahJTyrbKNDsDFW4pDvchnWAj1m3tlFNwqdTKjN6kfsTScaY6mPnzAX7X2dG/0GL IWSmkz/05Kga/2EvKY6ycnssSdii3OQwIxbkOFFffg5Z1dKgV+TYAHYUNu+28/xD/JYf orH7ozExwOIPTdPI1MRgwYI/eTtD0it6NOcgW93XhUzDBqe1qtRE5dpZrhtRcVLwsazC rH2w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=1VirmK7KnlgpHZDS+Ar6YCwVFU5VUXjoBRlcG7hRxB8=; b=GZLP70C3btBBcsDWnETbjsAbGxSyfvkkx9RS9SmCF8R7u/EvB6maEznKRAbgp0Y4TU DQhFXzEG7fv9vb2XuHjVZjU9v9Aj8jGJSZMxIuw0YTEJtoAIj9DpK33BjmGzjM1kpp8i b/TOYRLxNYs1NU5WNxZLawd+qnhQbgky0OnVdGTKh9wB0Z3XUDTVklj5B+xVs0wTKKvp xnA2AKQIIyCCInJ86MOar4wApZfW6AH0Y8G8NEDXKuCrtEHAYQ9YAxZiPF6KrwuSweOh PZHpwF1CgtxpubnWFlRTW9uKeW7ejeFxz1byhL561aV3zTRU3uSkLBFSfhcdNg6wCVOR Oalg== X-Gm-Message-State: 
AEkoouuOQuTrI8Y10MG+1PIJfBDlLLu5blwTRBoe6Z8n4gIYaiXUdN8sRsPHqv0jYJZPptYwjoSPBbnbFhFOgg== X-Received: by 10.37.118.68 with SMTP id r65mr10768312ybc.3.1471023316041; Fri, 12 Aug 2016 10:35:16 -0700 (PDT) MIME-Version: 1.0 Received: by 10.37.161.37 with HTTP; Fri, 12 Aug 2016 10:35:15 -0700 (PDT) From: Zaphod Beeblebrox Date: Fri, 12 Aug 2016 13:35:15 -0400 Message-ID: Subject: Panic not dumping to USB. To: FreeBSD Hackers Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.22 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Aug 2016 17:35:17 -0000 My swap spaces are too small to hold some of the panic's. Either that, or this is also happening to them. I added a 64G USB drive to the machine for the express purpose of crash dumps. I formatted it with GPT and a label (so that it wouldn't get mixed up with SCSI drives). The first partition has the type freebsd-swap and the label 'crash' ... so /dev/gpt/crash is my configured crash dump location. 
Luckily I have a serial console, so this is what's happening at crash-time:

panic: dva_get_dsize_sync(): bad DVA 1573890:1587590144
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff80e55fed at kdb_backtrace+0x8d
#1 0xffffffff80df76af at vpanic+0x1df
#2 0xffffffff80df74d0 at vpanic+0
#3 0xffffffff822b8b01 at dva_get_dsize_sync+0xb1
#4 0xffffffff822b8d9e at bp_get_dsize+0x12e
#5 0xffffffff8225a786 at dmu_tx_count_free+0x2f6
#6 0xffffffff8225a334 at dmu_tx_hold_free+0x384
#7 0xffffffff8223e9aa at dmu_free_long_range_impl+0xfa
#8 0xffffffff8223e850 at dmu_free_long_range+0x60
#9 0xffffffff82312dfe at zfs_rmnode+0xbe
#10 0xffffffff822ed36f at zfs_zinactive+0xef
#11 0xffffffff82339da7 at zfs_freebsd_reclaim+0x87
#12 0xffffffff81619894 at VOP_RECLAIM_APV+0x174
#13 0xffffffff80f15e39 at VOP_RECLAIM+0x39
#14 0xffffffff80f0ff99 at vgonel+0x3e9
#15 0xffffffff80f105e1 at vrecycle+0xb1
#16 0xffffffff8233614e at zfs_inactive+0x1ae
#17 0xffffffff82339d14 at zfs_freebsd_inactive+0x34
Uptime: 1d5h2m19s
Dumping 10472 out of 49114 MB:..1%..11%SESC[0mESC[2;30;40mESC[0mESC[1m^@ESC[01;0

... obviously, that last bit after 11% is the console rebooting. Note that dumping begins after Uptime. Is that normal? In this case, the 10472 would fit on the 16G swap, but I'm not sure what the problem is. It definitely fits on the 64G USB stick.
From owner-freebsd-hackers@freebsd.org Fri Aug 12 18:34:52 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2DCEEBB73AE for ; Fri, 12 Aug 2016 18:34:52 +0000 (UTC) (envelope-from cse.cem@gmail.com) Received: from mail-io0-f179.google.com (mail-io0-f179.google.com [209.85.223.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C16C615FC for ; Fri, 12 Aug 2016 18:34:51 +0000 (UTC) (envelope-from cse.cem@gmail.com) Received: by mail-io0-f179.google.com with SMTP id b62so32233072iod.3 for ; Fri, 12 Aug 2016 11:34:51 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:reply-to:in-reply-to:references :from:date:message-id:subject:to:cc; bh=3qGfGOglMRtKy5i9D5gCL3G4U55m70j12SJDo9JWqsk=; b=RuWjO9/S2cYkeLkBu3HvqwM44VZ3v27Niz4UbSdgp/B3urnXuFsSEye9QoKwYdYOLQ TLinZOZmOgSojdncNStiQRIuRMSDoxtFY3quzNEs8YiVBR8Mzy8Y644eKjIO4J/oUxkg jS8Br792wYSZqv0i8Vh8iazmMYnehg7wYx27a3ERTmfE3K0e0v1vkPVKKR1EbImWrEaW zg4woHaIYnuZNNkowO3p4dJYAQrdPIYNEG5as/pC74ITbwe5BTmkak206z3LBuPa/9Cc JsMLmDBOd3h37cSU9drNuquRKeF7bE8JZePzg4lj3L6XYmzWBW3uk6GDAGTGzknrYG+R PsBA== X-Gm-Message-State: AEkoouuqo41IIKYE21+dykcyGEvg90Avl46xzeP0fyrTEwIOGtNYNf9tSEKpOKCsNIh8Sg== X-Received: by 10.107.129.152 with SMTP id l24mr20005891ioi.179.1471026884672; Fri, 12 Aug 2016 11:34:44 -0700 (PDT) Received: from mail-it0-f48.google.com (mail-it0-f48.google.com. 
[209.85.214.48]) by smtp.gmail.com with ESMTPSA id o74sm3947380ioe.37.2016.08.12.11.34.44 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 12 Aug 2016 11:34:44 -0700 (PDT) Received: by mail-it0-f48.google.com with SMTP id x130so18500417ite.1 for ; Fri, 12 Aug 2016 11:34:44 -0700 (PDT) X-Received: by 10.36.29.15 with SMTP id 15mr264402itj.97.1471026884165; Fri, 12 Aug 2016 11:34:44 -0700 (PDT) MIME-Version: 1.0 Reply-To: cem@freebsd.org Received: by 10.36.220.129 with HTTP; Fri, 12 Aug 2016 11:34:43 -0700 (PDT) In-Reply-To: References: From: Conrad Meyer Date: Fri, 12 Aug 2016 11:34:43 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: Panic not dumping to USB. To: Zaphod Beeblebrox Cc: FreeBSD Hackers Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Aug 2016 18:34:52 -0000 Hi Zaphod, I don't think USB disks support polled I/O, which is needed for dump devices. As far as I know only classic disk devices (ATA, SCSI) work. Best, Conrad On Fri, Aug 12, 2016 at 10:35 AM, Zaphod Beeblebrox wrote: > My swap spaces are too small to hold some of the panic's. Either that, or > this is also happening to them. I added a 64G USB drive to the machine for > the express purpose of crash dumps. I formatted it with GPT and a label > (so that it wouldn't get mixed up with SCSI drives). The first partition > has the type freebsd-swap and the label 'crash' ... so /dev/gpt/crash is my > configured crash dump location. 
> > Luckily I have a serial console, so this is what's happening at crash-time: > > panic: dva_get_dsize_sync(): bad DVA 1573890:1587590144 > cpuid = 0 > KDB: stack backtrace: > #0 0xffffffff80e55fed at kdb_backtrace+0x8d > #1 0xffffffff80df76af at vpanic+0x1df > #2 0xffffffff80df74d0 at vpanic+0 > #3 0xffffffff822b8b01 at dva_get_dsize_sync+0xb1 > #4 0xffffffff822b8d9e at bp_get_dsize+0x12e > #5 0xffffffff8225a786 at dmu_tx_count_free+0x2f6 > #6 0xffffffff8225a334 at dmu_tx_hold_free+0x384 > #7 0xffffffff8223e9aa at dmu_free_long_range_impl+0xfa > #8 0xffffffff8223e850 at dmu_free_long_range+0x60 > #9 0xffffffff82312dfe at zfs_rmnode+0xbe > #10 0xffffffff822ed36f at zfs_zinactive+0xef > #11 0xffffffff82339da7 at zfs_freebsd_reclaim+0x87 > #12 0xffffffff81619894 at VOP_RECLAIM_APV+0x174 > #13 0xffffffff80f15e39 at VOP_RECLAIM+0x39 > #14 0xffffffff80f0ff99 at vgonel+0x3e9 > #15 0xffffffff80f105e1 at vrecycle+0xb1 > #16 0xffffffff8233614e at zfs_inactive+0x1ae > #17 0xffffffff82339d14 at zfs_freebsd_inactive+0x34 > Uptime: 1d5h2m19s > Dumping 10472 out of 49114 > MB:..1%..11%SESC[0mESC[2;30;40mESC[0mESC[1m^@ESC[01;0 > > ... obviously, that last bit after 11% is the console rebooting. Note that > dumping begins after Uptime. Is that normal? In this case, the 10472 > would fit on the 16G swap, but I'm not sure what the problem is. It > definitely fits on the 64G USB stick. 
> _______________________________________________ > freebsd-hackers@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" From owner-freebsd-hackers@freebsd.org Fri Aug 12 18:41:48 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4BCD7BB76A9 for ; Fri, 12 Aug 2016 18:41:48 +0000 (UTC) (envelope-from alfred@freebsd.org) Received: from elvis.mu.org (elvis.mu.org [IPv6:2001:470:1f05:b76::196]) by mx1.freebsd.org (Postfix) with ESMTP id 3AAB01E0E for ; Fri, 12 Aug 2016 18:41:48 +0000 (UTC) (envelope-from alfred@freebsd.org) Received: from AlfredMacbookAir.local (unknown [IPv6:2601:645:8003:a4d6:d15b:b8f7:2960:5efc]) by elvis.mu.org (Postfix) with ESMTPSA id 2754E346DE30; Fri, 12 Aug 2016 11:41:42 -0700 (PDT) Subject: Re: Panic not dumping to USB. To: Zaphod Beeblebrox , FreeBSD Hackers References: From: Alfred Perlstein Organization: FreeBSD Message-ID: <483f7abd-d19f-ce22-79b8-d5d5501666d0@freebsd.org> Date: Fri, 12 Aug 2016 11:41:41 -0700 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Thunderbird/45.1.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Aug 2016 18:41:48 -0000 On 8/12/16 10:35 AM, Zaphod Beeblebrox wrote: > My swap spaces are too small to hold some of the panic's. Either that, or > this is also happening to them. I added a 64G USB drive to the machine for > the express purpose of crash dumps. 
I formatted it with GPT and a label > (so that it wouldn't get mixed up with SCSI drives). The first partition > has the type freebsd-swap and the label 'crash' ... so /dev/gpt/crash is my > configured crash dump location. > > Luckily I have a serial console, so this is what's happening at crash-time: > > panic: dva_get_dsize_sync(): bad DVA 1573890:1587590144 > cpuid = 0 > KDB: stack backtrace: > #0 0xffffffff80e55fed at kdb_backtrace+0x8d > #1 0xffffffff80df76af at vpanic+0x1df > #2 0xffffffff80df74d0 at vpanic+0 > #3 0xffffffff822b8b01 at dva_get_dsize_sync+0xb1 > #4 0xffffffff822b8d9e at bp_get_dsize+0x12e > #5 0xffffffff8225a786 at dmu_tx_count_free+0x2f6 > #6 0xffffffff8225a334 at dmu_tx_hold_free+0x384 > #7 0xffffffff8223e9aa at dmu_free_long_range_impl+0xfa > #8 0xffffffff8223e850 at dmu_free_long_range+0x60 > #9 0xffffffff82312dfe at zfs_rmnode+0xbe > #10 0xffffffff822ed36f at zfs_zinactive+0xef > #11 0xffffffff82339da7 at zfs_freebsd_reclaim+0x87 > #12 0xffffffff81619894 at VOP_RECLAIM_APV+0x174 > #13 0xffffffff80f15e39 at VOP_RECLAIM+0x39 > #14 0xffffffff80f0ff99 at vgonel+0x3e9 > #15 0xffffffff80f105e1 at vrecycle+0xb1 > #16 0xffffffff8233614e at zfs_inactive+0x1ae > #17 0xffffffff82339d14 at zfs_freebsd_inactive+0x34 > Uptime: 1d5h2m19s > Dumping 10472 out of 49114 > MB:..1%..11%SESC[0mESC[2;30;40mESC[0mESC[1m^@ESC[01;0 > > ... obviously, that last bit after 11% is the console rebooting. Note that > dumping begins after Uptime. Is that normal? In this case, the 10472 > would fit on the 16G swap, but I'm not sure what the problem is. It > definitely fits on the 64G USB stick. > Last I checked there is not enough of the system running during crash to dump to USB. You may have more luck attaching an actual disk, or you might be able to use "textdumps" to get more information if your swap is too small. 
-Alfred From owner-freebsd-hackers@freebsd.org Fri Aug 12 20:32:04 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9B224BB791A for ; Fri, 12 Aug 2016 20:32:04 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7ABED1DB9; Fri, 12 Aug 2016 20:32:04 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from ralph.baldwin.cx (c-73-231-226-104.hsd1.ca.comcast.net [73.231.226.104]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 61CC7B91E; Fri, 12 Aug 2016 16:32:03 -0400 (EDT) From: John Baldwin To: freebsd-hackers@freebsd.org Cc: "K. Macy" , Andriy Gapon , Samy Bahra , Ryan Stone Subject: Re: How to get better debugging for the kernel. Date: Fri, 12 Aug 2016 12:22:18 -0700 Message-ID: <1551519.RkbAThDAeZ@ralph.baldwin.cx> User-Agent: KMail/4.14.3 (FreeBSD/10.3-STABLE; KDE/4.14.3; amd64; ; ) In-Reply-To: References: <5cc825d5-9ed7-efac-b711-60a8d4b18cc4@FreeBSD.org> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Fri, 12 Aug 2016 16:32:03 -0400 (EDT) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Aug 2016 20:32:04 -0000 On Thursday, August 04, 2016 01:07:39 AM K. Macy wrote: > On Wed, Aug 3, 2016 at 12:53 PM, Andriy Gapon wrote: > > On 03/08/2016 20:14, Ryan Stone wrote: > >> Are you using the kgdb from the base system or from ports(it's a part of > >> devel/gdb)? 
The kgdb in ports is significantly better. If you haven't > >> tried the version from ports already, definitely do that first. > > > > kgdb 7.x from ports is certainly more powerful than the old base kgdb, > > but clang with -O2 optimizations seems to be too much even for it. > > Samy did a good presentation about this issue. I'm hoping I can get > him to put his slides online. Evidently clang is much more simplistic > about how it treats callee-saved registers. In essence clang will > always err on the side of saying "optimized out" even when it has > sufficient state to know otherwise. GCC, on the other hand, will > sometimes incorrectly infer that a value is valid when it is in fact > not. > > I have been building some kernels with clang with dwarf4 enabled (and > thus needed to use kgdb 7.x from ports). Contrary to what I have heard > from some others I have found it to have virtually no added benefit. My understanding is that dwarf4 will not help with C programs like the kernel: the new idioms in dwarf4 are for declaring more complex constructs in C++11, C++14, etc. I have heard that clang does not update debug information during optimization passes, causing it to lose track of variables that are moved during optimization. I have not (yet) tried using gcc as avg@ describes, though I will likely start doing so soon.
-- John Baldwin From owner-freebsd-hackers@freebsd.org Fri Aug 12 20:32:07 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 88E4CBB792C for ; Fri, 12 Aug 2016 20:32:07 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 611BF1DDE for ; Fri, 12 Aug 2016 20:32:07 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from ralph.baldwin.cx (c-73-231-226-104.hsd1.ca.comcast.net [73.231.226.104]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 4BD4DB915; Fri, 12 Aug 2016 16:32:06 -0400 (EDT) From: John Baldwin To: freebsd-hackers@freebsd.org Cc: David Wolfskill Subject: Re: "ipmi0: KCS..." whines Date: Fri, 12 Aug 2016 11:54:38 -0700 Message-ID: <2855524.PakqtZoDR6@ralph.baldwin.cx> User-Agent: KMail/4.14.3 (FreeBSD/10.3-STABLE; KDE/4.14.3; amd64; ; ) In-Reply-To: <20160811175409.GW1112@albert.catwhisker.org> References: <20160811175409.GW1112@albert.catwhisker.org> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Fri, 12 Aug 2016 16:32:06 -0400 (EDT) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Aug 2016 20:32:07 -0000 On Thursday, August 11, 2016 10:54:09 AM David Wolfskill wrote: > I'm trying to figure out how to cope with some messages: > > ... kernel: ipmi0: KCS: Failed to read completion code > ... 
kernel: ipmi0: KCS error: ff > > that recently started showing up on some machines at work (which run > something based on FreeBSD). > > As the messages started showing up shortly after a colleague committed > some code that would issue "ipmitool dcmi power reading" periodically, I > tried issuing the command myself... to no (initial) effect. > > I then tried doing it in a loop: > > foreach n (`jot 10`) > ipmitool dcmi power reading > end > > Nothing. But then, 10 isn't a very big number, so I tried again with 32 > -- and got a few. > > That said, I have seen no evidence of data corruption or mis-reporting > that coincides with the messages being issued: at this point, the > messages seem to merely be annoying, and there's nothing I can do to > stop getting them (except to not issue "ipmitool dcmi power reading", > perhaps). > > Also, I have no real evidence that whatever is causing the messages to > be issued "in real life" is actually related to my "artificial" recreation > of the symptoms. > > I poked around a bit, and found that the messages are issued from > src/sys/dev/ipmi/ipmi_kcs.c, where device_printf(9) is used to emit > them. (Eventually, that gets around to using printf(9), which is why I > was unable to merely use syslog.conf to stop spamming the console.) > > As a proof-of-concept, I cobbled up the attached patch (with significant > help from a colleague). It seems to do as intended: suppress the > messages for a "normal" boot, but emit them for a verbose boot. > > I'm not at all happy with that approach: it's way too far on the > "treating the symptom" side than "fixing the root cause." On the other > hand, it's not clear that emitting a message about which I can do > nothing useful is all that ... useful. :-} > > But ... none of the machines to which I have access and have the > requisite hardware are running stock FreeBSD. > > Is anyone able to reproduce this on a stock FreeBSD machine?
> > Better: anyone have a better idea how to address the issue? So the issue is probably that the BMC on your box is sometimes slow in responding. The completion code is the third byte of the reply we wait to read after sending a request to the BMC via KCS. However, the first two bytes just echo back the request ID and command we asked for, so it may be that the BMC echoes those back right away without waiting for whatever work it needs to do to handle the request to complete, but doesn't send the completion code (the status of the request) until the request is fully processed. The driver is complaining that the BMC didn't respond with the completion code before its timeout expired. The default timeout is MAX_TIMEOUT in sys/dev/ipmi/ipmivars.h, which corresponds to 6 seconds. It may be that occasionally some "background" task runs in the BMC OS that delays responses to commands. It could also be that whatever work the BMC has to do to read this specific value is actually timing out or having issues in the hardware, etc. You could try increasing the timeout in MAX_TIMEOUT (just increase '6' to however many seconds you want to tolerate), but keep in mind that the CPU sits and spins polling for a reply (so the cure may be worse than the disease). You might also try polling this sensor less often. We could maybe use ppsratecheck() to rate limit the errors, but that's sort of papering over the problem that the BMC is timing out the request. A larger option is to modify the IPMI driver to support interrupt-driven operation (and not just polled), in which case a longer timeout might not hurt so much (you at least wouldn't be spinning on the CPU for N seconds).
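To make the rate-limiting idea above concrete, here is a rough userland sketch in Python. It is not the kernel's ppsratecheck(9) code, only the general notion of allowing at most N messages per one-second window; the names RateLimiter and report_kcs_error are made up for this illustration, and the clock is injectable purely so the logic can be exercised without waiting.

```python
import time


class RateLimiter:
    """Allow at most max_per_sec events per fixed one-second window.

    Hypothetical userland sketch; it only mirrors the general idea of
    limiting events per second, not the kernel's actual code.
    """

    def __init__(self, max_per_sec, clock=time.monotonic):
        self.max_per_sec = max_per_sec
        self.clock = clock          # injectable so the logic can be tested
        self.window_start = None    # start time of the current window
        self.count = 0              # events seen in the current window

    def allow(self):
        """Return True if the caller may emit a message now."""
        now = self.clock()
        if self.window_start is None or now - self.window_start >= 1.0:
            # A new one-second window begins; reset the counter.
            self.window_start = now
            self.count = 0
        self.count += 1
        return self.count <= self.max_per_sec


def report_kcs_error(limiter, message, out=print):
    """Emit message only if the limiter permits it; return whether it did."""
    if limiter.allow():
        out(message)
        return True
    return False
```

With a limit of 1, a burst of identical "KCS error" messages within the same second would produce a single console line. As noted above, this only hides the symptom; the BMC would still be timing out.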
-- John Baldwin From owner-freebsd-hackers@freebsd.org Fri Aug 12 20:32:05 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id EADEFBB791F for ; Fri, 12 Aug 2016 20:32:05 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CCD301DC2; Fri, 12 Aug 2016 20:32:05 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from ralph.baldwin.cx (c-73-231-226-104.hsd1.ca.comcast.net [73.231.226.104]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id D86EEB960; Fri, 12 Aug 2016 16:32:04 -0400 (EDT) From: John Baldwin To: freebsd-hackers@freebsd.org Cc: Andriy Gapon Subject: Re: on BIOS problems with disks larger than 2 TB Date: Fri, 12 Aug 2016 12:18:32 -0700 Message-ID: <490347865.SvN7iQoFWI@ralph.baldwin.cx> User-Agent: KMail/4.14.3 (FreeBSD/10.3-STABLE; KDE/4.14.3; amd64; ; ) In-Reply-To: <6cec427b-4df1-50f0-3014-a96e5f8210f5@FreeBSD.org> References: <6cec427b-4df1-50f0-3014-a96e5f8210f5@FreeBSD.org> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Fri, 12 Aug 2016 16:32:05 -0400 (EDT) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Aug 2016 20:32:06 -0000 On Tuesday, August 02, 2016 04:35:23 PM Andriy Gapon wrote: > > There are some BIOSes out there that do not properly support disks > larger than 2TB and cause boot problems if there is any data required > for boot at offsets larger 
than 2 TB (TiB, rather). > > The most typical victim is the ZFS boot if a boot pool includes disk > areas beyond 2TB, because a kernel, or zfsloader or any configuration > files required by the loader may end up in those "inaccessible" areas. > > It's obvious why 2TiB is a magic value here: > 2^32 * 512 = 2^41 = 2 * 2^40 > So the problem seems to happen when an LBA is treated as a 32-bit > integer (unsigned). > > I happen to own one of affected systems and I have done some more > investigation. As far as I can see, the only actual problem in my case > is that a disk size in 512-byte sectors is reported modulo 2^32 by INT 13h > AH=48h. If I "fix up" the parameter, then everything else (i.e. actual > data reads) seems to work just fine after that. > > I suspect that a large subclass of other problematic systems may have > exactly the same problem. > > Does anyone have an idea about how we could auto-detect and > auto-correct that problem? > Would that be worth the trouble at all? Given the gradual de-orbiting > of BIOS systems. Hmm, I'm not sure how easy it is to handle this case (i.e. how do you know if an LBA beyond the size is really legit due to truncation vs coming from corrupted metadata). Related is that tsoome's bcache stuff wants to know where the end of the disk is (to avoid reading off the end), so just ignoring the size is not easy.
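The 2 TiB arithmetic quoted above, and the effect of a sector count reported modulo 2^32, can be checked with a few lines of Python. This is only an illustration: the 3 TB disk size is an assumed example, not a value from this thread, and reported_sectors is a made-up helper name.

```python
SECTOR_SIZE = 512          # bytes per sector, as in the quoted message
LBA_LIMIT = 2 ** 32        # number of sectors addressable with a 32-bit LBA


def reported_sectors(true_sectors):
    """Sector count as a BIOS would report it if it truncates to 32 bits."""
    return true_sectors % LBA_LIMIT


# 2^32 sectors of 512 bytes is exactly 2^41 bytes, i.e. 2 TiB.
limit_bytes = LBA_LIMIT * SECTOR_SIZE

# Assumed example: a disk of 3 * 10^12 bytes (a "3 TB" drive).
true_sectors = 3 * 10 ** 12 // SECTOR_SIZE
bios_sectors = reported_sectors(true_sectors)

print(limit_bytes == 2 ** 41)            # the limit is 2 TiB
print(true_sectors, bios_sectors)        # real vs. truncated sector count
print(bios_sectors * SECTOR_SIZE / 2 ** 30, "GiB visible to the BIOS")
```

The truncated count is a perfectly plausible disk size on its own, which is why a loader cannot tell from that number alone whether an LBA beyond it is legitimate or comes from corrupted metadata.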
-- John Baldwin From owner-freebsd-hackers@freebsd.org Fri Aug 12 20:40:09 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D43B1BB7D9D for ; Fri, 12 Aug 2016 20:40:09 +0000 (UTC) (envelope-from sbahra@backtrace.io) Received: from mail-io0-x229.google.com (mail-io0-x229.google.com [IPv6:2607:f8b0:4001:c06::229]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A7BC1158C for ; Fri, 12 Aug 2016 20:40:09 +0000 (UTC) (envelope-from sbahra@backtrace.io) Received: by mail-io0-x229.google.com with SMTP id 38so35225804iol.0 for ; Fri, 12 Aug 2016 13:40:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=backtrace.io; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=ceeJe4uCEUjiBSfdz8XIEzstBxu5Vl4S9XK7q/zhcnw=; b=b2kCTBdkQ/MnAcUYuOwk76jL2Cr3VDhFRE02sIv345DgLsU3WaSf7fSlsVyVI3s6EF IEeMKqnxDJB3DuMrrAWQVJjApJD7pyIm/KltLfs3HkgmzlinQ12PvsXWfELCgTFxlNTg sSXjWWivgxsgLB1d6Pz2IRyG0zrs4ORKvSm5c= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=ceeJe4uCEUjiBSfdz8XIEzstBxu5Vl4S9XK7q/zhcnw=; b=MpAj3GtH7MEFmO1y/cAwQGTBeIWXA8vRqKqW1zh5f7+9i6wm4cI2oqDej/XbzFC2nr /pctAcYQumI0MD70CrvFf7e1nGW/gih5X0vIoKASCpQo9Vjca6giAevz4uaRmKCnOrbL Uq01XVifPdHnHkp9kC/jZgU9aWT/ikh407EQ3iM2asTCGECYmY0tYqiVG0QBioE4FCLw NmXCOnwVXL/+pLeqt/ipI137qzhC2G+u3AM4fGIMNZAgI6cxGDGkTEBTLqt9614iJCar s2zXG7s1z+2JRfpGvba4Lj3tg+zGd+WTjVLJ8hmT+35704q1iiVz9eykfKBmEXoQswLx masg== X-Gm-Message-State: AEkoouuj9NOkeu42j632yveb207fYwHLgnV/c6sbU0qahm1ABdvUdFfYh2QUXS3n+6U7cvlAwsNaaFLau2/DTqMI X-Received: by 10.107.140.17 with SMTP id 
o17mr19659225iod.69.1471034408636; Fri, 12 Aug 2016 13:40:08 -0700 (PDT) MIME-Version: 1.0 References: <5cc825d5-9ed7-efac-b711-60a8d4b18cc4@FreeBSD.org> <1551519.RkbAThDAeZ@ralph.baldwin.cx> In-Reply-To: <1551519.RkbAThDAeZ@ralph.baldwin.cx> From: Samy Bahra Date: Fri, 12 Aug 2016 20:39:57 +0000 Message-ID: Subject: Re: How to get better debugging for the kernel. To: John Baldwin , freebsd-hackers@freebsd.org Cc: "K. Macy" , Andriy Gapon , Ryan Stone X-Mailman-Approved-At: Fri, 12 Aug 2016 20:48:41 +0000 Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.22 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Aug 2016 20:40:09 -0000 Slides up at: http://backtrace.io/blog/images/bbcon2016-sbahra.pdf On Fri, Aug 12, 2016 at 4:32 PM John Baldwin wrote: > On Thursday, August 04, 2016 01:07:39 AM K. Macy wrote: > > On Wed, Aug 3, 2016 at 12:53 PM, Andriy Gapon wrote: > > > On 03/08/2016 20:14, Ryan Stone wrote: > > >> Are you using the kgdb from the base system or from ports(it's a part > of > > >> devel/gdb)? The kgdb in ports is significantly better. If you > haven't > > >> tried the version from ports already, definitely do that first. > > > > > > kgdb 7.x from ports is certainly more powerful than the old base kgdb, > > > but clang with O2 optimizations seems to be too much even for it. > > > > Samy did a good presentation about this issue. I'm hoping I can get > > him to put his slides on line. Evidently clang is much more simplistic > > about how it treats callee saved registers. In essence clang will > > always err on the side of saying "optimized out" even when it has > > sufficient state to know otherwise. Gcc, on the other hand will > > sometimes incorrectly infer that a value is valid when it is in fact > > not. 
> > > > I have been building some kernels with clang with dwarf4 enabled (and > > thus needed to use kgdb 7.x from ports). Contrary to what I have heard > > from some others I have found it to have virtually no added benefit. > > My understanding is that dwarf4 will not help with C programs like the > kernel, that the new idioms in dwarf4 are for declaring more complex > constructs in C++11, C++14, etc. I have heard that clang does not update > debug information during optimization passes causing it to loose track of > variables that are moved during optimization. I have not (yet) tried > using gcc as avg@ describes though I will likely start doing so soon. > > -- > John Baldwin > From owner-freebsd-hackers@freebsd.org Fri Aug 12 20:59:42 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D606BBB82EE for ; Fri, 12 Aug 2016 20:59:42 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citapm.icyb.net.ua (citapm.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 038351258; Fri, 12 Aug 2016 20:59:41 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citapm.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id XAA21220; Fri, 12 Aug 2016 23:59:33 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1bYJYD-000AhP-Ae; Fri, 12 Aug 2016 23:59:33 +0300 Subject: Re: on BIOS problems with disks larger than 2 TB To: John Baldwin , freebsd-hackers@FreeBSD.org References: <6cec427b-4df1-50f0-3014-a96e5f8210f5@FreeBSD.org> <490347865.SvN7iQoFWI@ralph.baldwin.cx> From: Andriy Gapon Message-ID: Date: Fri, 12 Aug 2016 23:58:12 +0300 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 
In-Reply-To: <490347865.SvN7iQoFWI@ralph.baldwin.cx> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Aug 2016 20:59:42 -0000 On 12/08/2016 22:18, John Baldwin wrote: > Hmm, I'm not sure how easy it is to handle this case (i.e. how do you know > if an LBA beyond the size is really legit due to truncation vs coming from > corrupted metadata). Related is that tsoome's bcache stuff wants to know > where the end of the disk is (to avoid reading off the end), so just > ignoring the size is not easy. One idea that I have in mind but haven't really explored yet is for GPT formatted disks. Basically, if a GPT label hints that the disk size is larger than what BIOS reports, then we could try to read a backup label and if it matches what we expect, then we could adjust the size. 
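A rough Python sketch of the heuristic described above, under stated assumptions: parse the primary GPT header, take its backup-header LBA as a hint for the real disk size, and accept that hint only if the BIOS-reported size looks like the same value truncated to 32 bits. The function names are hypothetical; the field offsets follow the GPT header layout in the UEFI specification. The further step of actually reading the backup header at that LBA and validating it is left out.

```python
import struct

GPT_SIGNATURE = b"EFI PART"


def gpt_alternate_lba(header):
    """Return the backup-header LBA from a primary GPT header, or None.

    header is the raw sector read from LBA 1. MyLBA is at byte offset 24
    and AlternateLBA at offset 32, both little-endian 64-bit integers.
    """
    if len(header) < 40 or header[:8] != GPT_SIGNATURE:
        return None
    my_lba, alternate_lba = struct.unpack_from("<QQ", header, 24)
    if my_lba != 1:
        return None            # not a primary header
    return alternate_lba


def corrected_disk_size(bios_sectors, header):
    """Return a sector count to use, adjusting for 32-bit truncation.

    The backup header normally lives in the last sector, so the GPT implies
    a disk of alternate_lba + 1 sectors. The BIOS value is overridden only
    when it equals that size modulo 2**32; otherwise it is kept unchanged.
    """
    alternate_lba = gpt_alternate_lba(header)
    if alternate_lba is None:
        return bios_sectors
    gpt_sectors = alternate_lba + 1
    if gpt_sectors > bios_sectors and gpt_sectors % 2 ** 32 == bios_sectors:
        return gpt_sectors
    return bios_sectors
```

Requiring the modulo match keeps the check conservative: a GPT that merely claims a larger disk (for example after an image was copied to a different drive) does not override the BIOS value.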
Hmm, I think I recall that a long time ago some BIOSes used to do something similar with MBR :-) -- Andriy Gapon From owner-freebsd-hackers@freebsd.org Fri Aug 12 21:43:42 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 770C5BB805E for ; Fri, 12 Aug 2016 21:43:42 +0000 (UTC) (envelope-from david@catwhisker.org) Received: from mailman.ysv.freebsd.org (unknown [127.0.1.3]) by mx1.freebsd.org (Postfix) with ESMTP id 6258F13BA for ; Fri, 12 Aug 2016 21:43:42 +0000 (UTC) (envelope-from david@catwhisker.org) Received: by mailman.ysv.freebsd.org (Postfix) id 5E313BB805D; Fri, 12 Aug 2016 21:43:42 +0000 (UTC) Delivered-To: hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5DCB2BB805C for ; Fri, 12 Aug 2016 21:43:42 +0000 (UTC) (envelope-from david@catwhisker.org) Received: from albert.catwhisker.org (mx.catwhisker.org [198.144.209.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2FF0E13B9; Fri, 12 Aug 2016 21:43:41 +0000 (UTC) (envelope-from david@catwhisker.org) Received: from albert.catwhisker.org (localhost [127.0.0.1]) by albert.catwhisker.org (8.15.2/8.15.2) with ESMTP id u7CLheuK048316; Fri, 12 Aug 2016 21:43:40 GMT (envelope-from david@albert.catwhisker.org) Received: (from david@localhost) by albert.catwhisker.org (8.15.2/8.15.2/Submit) id u7CLhecv048315; Fri, 12 Aug 2016 14:43:40 -0700 (PDT) (envelope-from david) Date: Fri, 12 Aug 2016 14:43:40 -0700 From: David Wolfskill To: John Baldwin Cc: hackers@freebsd.org Subject: Re: "ipmi0: KCS..." 
whines Message-ID: <20160812214340.GZ1112@albert.catwhisker.org> References: <20160811175409.GW1112@albert.catwhisker.org> <2855524.PakqtZoDR6@ralph.baldwin.cx> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="sBy3cog7RUybpTge" Content-Disposition: inline In-Reply-To: <2855524.PakqtZoDR6@ralph.baldwin.cx> User-Agent: Mutt/1.6.1 (2016-04-27) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Aug 2016 21:43:42 -0000 --sBy3cog7RUybpTge Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Aug 12, 2016 at 11:54:38AM -0700, John Baldwin wrote: > ... > So the issue is probably that the BMC controller on your box is sometimes > slow in responding. The completion code is the third byte of the reply > we wait to read after sending a request to the BMC via KCS. However, the > first two bytes just echo back the request ID and command we asked for, so > it may be that the BMC echoes those back right away without waiting for > whatever work it needs to do to handle the request to complete, but doesn't > send the completion code (the status of the request) until the request is > fully processed. > > The driver is complaining that the BMC didn't respond with the completion > code before its timeout expired. The default timeout is MAX_TIMEOUT in > sys/dev/ipmi/ipmivars.h which corresponds to 6 seconds. It may be that > occasionally some "background" task runs in the BMC OS that delays responses > to handling commands. It could also be that whatever work the BMC has to do > to read this specific value is actually timing out or having issues in the > hardware, etc. I could easily modify the stress-test loop to run "date" after each "ipmitool" invocation.
(Pity we don't seem to have a sub-second format in strftime().) So... I tried the above (interspersing "date" commands while running "ipmitool dcmi power reading" in a loop within script(1)). I did not get a whine at 32 repetitions; I got one at 64. The total elapsed time was no more than 3 seconds (last timestamp - first timestamp difference was 2 seconds). > You could try increasing the timeout in MAX_TIMEOUT (just increase '6' to > however many seconds you want to tolerate), but keep in mind that the CPU > sits and spins polling for a reply (though the cure may be worse than the > disease). You might also try polling this sensor less often. That's one of the "odd things" -- based on the change that was committed (locally) I would expect that we issue the "ipmitool dcmi power reading" command (along with a handful of others) once every 59 seconds. The complete list of such commands (fed to ipmitool via stdin) is: dcmi power reading sensor raw 0x06 0x52 0x07 0x5b 0x01 0x92 raw 0x30 0x70 0x4b 0x00 0x03 exit > We could maybe use ppsratecheck() to rate limit the errors, but that's > sort of papering over the problem that the BMC is timing out the request. Well, in fairness, that's probably doing a slightly less brute force bit of "papering over" than the patch I had provided. :-} > A larger option is to modify the IPMI driver to support interrupt-driven > operation (and not just polled) in which case a longer timeout might not > hurt so much (you at least wouldn't be spinning on the CPU for N seconds). > .... I wouldn't mind testing that, but I don't think I'm up to writing it. Thanks! Peace, david -- David H. Wolfskill david@catwhisker.org Those who would murder in the name of God or prophet are blasphemous cowards. See http://www.catwhisker.org/~david/publickey.gpg for my public key.
--sBy3cog7RUybpTge Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQF8BAEBCgBmBQJXrkMMXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRDQ0I3Q0VGOTE3QTgwMUY0MzA2NEQ3N0Ix NTM5Q0M0MEEwNDlFRTE3AAoJEBU5zECgSe4XfvYH/jYhYC8o/NPzFrkTkHAZ2W1E kqrJitnkPUSnqU5zuSW/usKCYvrWh9YGBeBTv1TsGzzYoAsCi8kRUqMAF/oJiFRd vC1CBAvnUVqXkHvX1Nes8THnML0HtMW6VAiyx8to+oFshs2VKXJqI1iq5geFH8el QaqIBuvBd0zu6DGCszmQxMq0VT3ls3qhgmUN/x1asBZ44X60h+n71taiEjvFzzRf BqZPminCQcmPZx9CdNxIOu/jx+8r1W5hBAuc80r2DSkUS4VPBNQPRa4fm7KWvWoj 1VaSrryGgOq/Bb/fYKNWrbh7FBylhNIoD6J1yAaGQ32fuek6clYyayjD0Qqqywo= =EpUB -----END PGP SIGNATURE----- --sBy3cog7RUybpTge-- From owner-freebsd-hackers@freebsd.org Fri Aug 12 22:04:14 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 558E7BB84A7 for ; Fri, 12 Aug 2016 22:04:14 +0000 (UTC) (envelope-from hps@selasky.org) Received: from mail.turbocat.net (mail.turbocat.net [IPv6:2a01:4f8:d16:4514::2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 218DF1F8D; Fri, 12 Aug 2016 22:04:14 +0000 (UTC) (envelope-from hps@selasky.org) Received: from laptop015.home.selasky.org (unknown [62.141.129.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.turbocat.net (Postfix) with ESMTPSA id 534A71FE022; Sat, 13 Aug 2016 00:04:11 +0200 (CEST) Subject: Re: Panic not dumping to USB. 
To: cem@freebsd.org, Zaphod Beeblebrox References: Cc: FreeBSD Hackers From: Hans Petter Selasky Message-ID: <47d14884-29dc-e5df-9d78-2c09e6c3dc0e@selasky.org> Date: Sat, 13 Aug 2016 00:08:36 +0200 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Aug 2016 22:04:14 -0000 On 08/12/16 20:34, Conrad Meyer wrote: > Hi Zaphod, > > I don't think USB disks support polled I/O, which is needed for dump > devices. As far as I know only classic disk devices (ATA, SCSI) work. > USB supports polled mode for USB disks, but the disk must be attached prior to panic. --HPS From owner-freebsd-hackers@freebsd.org Sat Aug 13 04:44:06 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5D21DBB80C5 for ; Sat, 13 Aug 2016 04:44:06 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: from mail-io0-x22c.google.com (mail-io0-x22c.google.com [IPv6:2607:f8b0:4001:c06::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 292ED19AA for ; Sat, 13 Aug 2016 04:44:06 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: by mail-io0-x22c.google.com with SMTP id m101so41617738ioi.2 for ; Fri, 12 Aug 2016 21:44:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20150623.gappssmtp.com; s=20150623; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; 
bh=cN6FwlebUsNPzZvzNCtqy50yBgoM0IMeld+GMr2vaII=; b=x7bZCDDq4PLhT8GO0fEL3PVhLfdacaIboefO5kGuFpm8248/8dp1i81k7FUS1/4Kmd CIxGKr/91a+Yi7MY1iBA7PIY01Vyh60fVCFLPS3Exkw+4hmM/akdXZ9nGm9UomZ5zoxO 9ZEY2z7UrH8HeSV43ISkdtNmM33/4n84clu7ErqkzGN7LV7M8NxBPscyphKM0IZz/kkq W15rGyT9oaubcpjGQD+bJyAXfvxltJoo3JF5zvPs26WTmS7INy1iBI8yLNBUrcimE0US aN72ruNnt406C0UYdC1BDvzFnPHsZxYU+7mDR6HqhdsXlCvBGzY7CLB22kla364H9zmo g5aQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; bh=cN6FwlebUsNPzZvzNCtqy50yBgoM0IMeld+GMr2vaII=; b=ioQaOtFtgdECixcG+BPXn2el/QNXl3Nh+Wv/xhAzpim3t5ARgBFlamcrciycXt6RiG WlnbwOOxZGme4lkddKnJw0Ju2Q19KK4ycUcCYQ4OwhDMya0fNxe05fkIjnWLWwjNKYPp JausmPFfX6KBDCQ/5gz+cgo/fgu5pO7O8h+2O3JQ/P2SKWz/g4RaqkCmoLRcB5Ypmc7r s9Z0ZvPXBRRrEXxQXsY4O/wbApzAmhbQiFlM/3Sza7N7TZchwOAHf/RfZQ1qCMHx3PKy /T06lB0FsLTRl3qAO9TJ71b8/YxA1svRTHgn2pOxNWU2eni71tEExC3RXyHZUCoGW9rk 7wrg== X-Gm-Message-State: AEkoouus90BWwIr7nfOiTd2KtVZpuneex4GAtkxmcV28uuo5fUelwRi9MD/fVFLDe61vA4LwcoSK6GTwlZp+eQ== X-Received: by 10.107.9.39 with SMTP id j39mr21671764ioi.73.1471063445412; Fri, 12 Aug 2016 21:44:05 -0700 (PDT) MIME-Version: 1.0 Sender: wlosh@bsdimp.com Received: by 10.36.65.105 with HTTP; Fri, 12 Aug 2016 21:44:04 -0700 (PDT) X-Originating-IP: [69.53.245.200] In-Reply-To: References: <6cec427b-4df1-50f0-3014-a96e5f8210f5@FreeBSD.org> <490347865.SvN7iQoFWI@ralph.baldwin.cx> From: Warner Losh Date: Fri, 12 Aug 2016 22:44:04 -0600 X-Google-Sender-Auth: Q54Ga8gBRA9km1wK5OhEFjvsWis Message-ID: Subject: Re: on BIOS problems with disks larger than 2 TB To: Andriy Gapon Cc: John Baldwin , "freebsd-hackers@freebsd.org" Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 
13 Aug 2016 04:44:06 -0000 On Fri, Aug 12, 2016 at 2:58 PM, Andriy Gapon wrote: > On 12/08/2016 22:18, John Baldwin wrote: >> Hmm, I'm not sure how easy it is to handle this case (i.e. how do you know >> if an LBA beyond the size is really legit due to truncation vs coming from >> corrupted metadata). Related is that tsoome's bcache stuff wants to know >> where the end of the disk is (to avoid reading off the end), so just >> ignoring the size is not easy. > > One idea that I have in mind but haven't really explored yet is for GPT > formatted disks. Basically, if a GPT label hints that the disk size is larger > than what BIOS reports, then we could try to read a backup label and if it > matches what we expect, then we could adjust the size. > > Hmm, I think I recall that a long time ago some BIOSes used to do something > similar with MBR :-) I think we should just trust the GPT bounds and ignore the actual size of the disk. If this is incorrect, we'll get I/O errors to indicate something is wrong. Doesn't matter what's wrong, at the end of the day, and many different pathologies present as the same error. We should ignore the total size of the disk reported by BIOS routines because they lie to stay compatible with the long-dead hand of the past. The lie is different depending on what the underlying disk technology is, which we have no real visibility into in /boot/loader. For MBR, well, it can't go above 2TB anyway for 512-byte sector size and 4k MBRs are (or at least used to be) quite rare.
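As a rough check of the 2TB figure (an illustration, not part of the original message): MBR partition entries store the starting LBA and the sector count in 32-bit fields, so the addressable size is 2^32 sectors times the sector size. The helper name mbr_max_bytes below is hypothetical.

```python
# MBR partition entries hold the start LBA and sector count in 32-bit fields.
MBR_LBA_BITS = 32


def mbr_max_bytes(sector_size: int) -> int:
    """Largest byte count a 32-bit sector count can describe."""
    return (2 ** MBR_LBA_BITS) * sector_size


for sector_size in (512, 4096):
    tib = mbr_max_bytes(sector_size) / 2 ** 40
    print(f"{sector_size}-byte sectors: {tib:g} TiB")
# prints:
# 512-byte sectors: 2 TiB
# 4096-byte sectors: 16 TiB
```

With 4k sectors the same 32-bit fields would reach 16 TiB, which is why the limit only bites in the common 512-byte case.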
Warner From owner-freebsd-hackers@freebsd.org Sat Aug 13 07:42:41 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 85D75BB89B9 for ; Sat, 13 Aug 2016 07:42:41 +0000 (UTC) (envelope-from joel.bertrand@systella.fr) Received: from rayleigh.systella.fr (newton-ipv6.systella.fr [IPv6:2001:7a8:a8ed:253::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "rayleigh.systella.fr", Issuer "rayleigh.systella.fr" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 0E7971FC1 for ; Sat, 13 Aug 2016 07:42:40 +0000 (UTC) (envelope-from joel.bertrand@systella.fr) Received: from [192.168.10.20] ([192.168.10.20]) (authenticated bits=0) by rayleigh.systella.fr (8.15.2/8.15.2/Debian-4) with ESMTPSA id u7D7gRU2006838 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Sat, 13 Aug 2016 09:42:27 +0200 To: FreeBSD Hackers From: =?UTF-8?Q?BERTRAND_Jo=c3=abl?= Subject: Gimp and PostScript Message-ID: <57AECF63.7070102@systella.fr> Date: Sat, 13 Aug 2016 09:42:27 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:43.0) Gecko/20100101 Firefox/43.0 SeaMonkey/2.40 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: clamav-milter 0.99.2 at rayleigh X-Virus-Status: Clean X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Aug 2016 07:42:41 -0000 Hello, I'm trying to use Gimp with Ghostscript to import and export eps files. I have installed gimp (and all dependencies) and ghostscript 9. Of course, gs runs as expected (in /usr/local/bin) but I'm unable to work on eps files in gimp. 
Both gimp and ghostscript come from binary ports (for FreeBSD 10.3) and I don't understand my mistake. Help is welcome to fix this issue. Best regards, JB From owner-freebsd-hackers@freebsd.org Sat Aug 13 15:55:04 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C5958BB83FB for ; Sat, 13 Aug 2016 15:55:04 +0000 (UTC) (envelope-from gordon.w.ross@gmail.com) Received: from mail-io0-x231.google.com (mail-io0-x231.google.com [IPv6:2607:f8b0:4001:c06::231]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8ED881C2C for ; Sat, 13 Aug 2016 15:55:04 +0000 (UTC) (envelope-from gordon.w.ross@gmail.com) Received: by mail-io0-x231.google.com with SMTP id m101so49246810ioi.2 for ; Sat, 13 Aug 2016 08:55:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=os4MGWpgWl5WjDQD2nHew4z9ToSFNBI0aiQcMZ+oakA=; b=e+HHM7LQhqS8Zn9wEsedwICcmZ/Hu/CWi/V+kO/8QTVAENSaMC3/gr9Fs+ej78saFK u6WW+But6feLHHBKspBqT9w6CB0YALGmJ2c5QmUCJsfvTAWZBwNDG6VqNk6hI2lE/48y MwzqGsnkIuJcG7QHKIxod/y/lDULOD7TCstXVMD2sewPyABBzOZ7KiefqB4dPYoopyAj 8OcNYL4VWHgdthYi9g5p7sQKbaqHhpcawslQDCyGFs6Nbc93chhqCe+lACTR/AwuBmmU I+BGaozLhWNFKLPi382PPe0ohS93zt45LNYG6T/HAByo+lxNol6mQiQBlL78Vutrt+to 6GlQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=os4MGWpgWl5WjDQD2nHew4z9ToSFNBI0aiQcMZ+oakA=; b=ANlc43LiX7EEqqRb+Zfc/Wy4FBt9vOybZcB7A7yG8JGq3LgrZUo6SN3xKP2AUjWcfz wEF18M99FvsANxcrjKkcJ+RstTb2zU8weP38jI6NHdphJkHFwTMny1BjKqcX/XlNOrY9 
h0Tag0flwzh8G4xIcaoq7Zf0tv+MTKdomzephrvd7vfGZRc4O+qDEE3UhDshUUsmXo9W pepseC0P7l+SxkdAY33ccyOSaC3YFhi/D86RrGR/0mWJ0luDql3+HhPZ/NEVm0SfguED +vMQA9eiiac9b6CPDlmto59bBQ573ju9141FpF/7wKOLvPxsHj1cN9rc3bdUsiiOewj4 ghgg== X-Gm-Message-State: AEkoouuqi9fV0ZhnRCAANX3YxbQjcnLmZ1nYbAj1/FI7N1PLzqaClIJrFxGaWGYsQN3JafhXH2OUvcN0uihZYQ== X-Received: by 10.107.146.195 with SMTP id u186mr23613305iod.112.1471103703716; Sat, 13 Aug 2016 08:55:03 -0700 (PDT) MIME-Version: 1.0 Received: by 10.36.17.79 with HTTP; Sat, 13 Aug 2016 08:55:03 -0700 (PDT) In-Reply-To: References: <5cc825d5-9ed7-efac-b711-60a8d4b18cc4@FreeBSD.org> <1551519.RkbAThDAeZ@ralph.baldwin.cx> From: Gordon Ross Date: Sat, 13 Aug 2016 11:55:03 -0400 Message-ID: Subject: Re: How to get better debugging for the kernel. To: freebsd-hackers@freebsd.org Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Aug 2016 15:55:04 -0000 I heard a rumor someone might be working on a port of illumos "mdb". Anyone know if that's true, and how far along it went? We depend heavily upon this tool for production support; so much that I'm not sure how we'd live without it. On Fri, Aug 12, 2016 at 4:39 PM, Samy Bahra wrote: > Slides up at: http://backtrace.io/blog/images/bbcon2016-sbahra.pdf > > On Fri, Aug 12, 2016 at 4:32 PM John Baldwin wrote: > >> On Thursday, August 04, 2016 01:07:39 AM K. Macy wrote: >> > On Wed, Aug 3, 2016 at 12:53 PM, Andriy Gapon wrote: >> > > On 03/08/2016 20:14, Ryan Stone wrote: >> > >> Are you using the kgdb from the base system or from ports(it's a part >> of >> > >> devel/gdb)? The kgdb in ports is significantly better. If you >> haven't >> > >> tried the version from ports already, definitely do that first. 
>> > > >> > > kgdb 7.x from ports is certainly more powerful than the old base kgdb, >> > > but clang with O2 optimizations seems to be too much even for it. >> > >> > Samy did a good presentation about this issue. I'm hoping I can get >> > him to put his slides on line. Evidently clang is much more simplistic >> > about how it treats callee saved registers. In essence clang will >> > always err on the side of saying "optimized out" even when it has >> > sufficient state to know otherwise. Gcc, on the other hand will >> > sometimes incorrectly infer that a value is valid when it is in fact >> > not. >> > >> > I have been building some kernels with clang with dwarf4 enabled (and >> > thus needed to use kgdb 7.x from ports). Contrary to what I have heard >> > from some others I have found it to have virtually no added benefit. >> >> My understanding is that dwarf4 will not help with C programs like the >> kernel, that the new idioms in dwarf4 are for declaring more complex >> constructs in C++11, C++14, etc. I have heard that clang does not update >> debug information during optimization passes causing it to loose track of >> variables that are moved during optimization. I have not (yet) tried >> using gcc as avg@ describes though I will likely start doing so soon. 
>> >> -- >> John Baldwin >> > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" From owner-freebsd-hackers@freebsd.org Sat Aug 13 17:22:18 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 945B0BB8A7E for ; Sat, 13 Aug 2016 17:22:18 +0000 (UTC) (envelope-from pfg@FreeBSD.org) Received: from nm21.bullet.mail.bf1.yahoo.com (nm21.bullet.mail.bf1.yahoo.com [98.139.212.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4FCB11EB1 for ; Sat, 13 Aug 2016 17:22:17 +0000 (UTC) (envelope-from pfg@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s2048; t=1471108936; bh=5imIJcDAjvnuspcHhJsmbDulOk2D+DUaNl+k81/kVTc=; h=From:Subject:To:Cc:Date:From:Subject; b=GsPBDS7avPz4+DvlKMZ94pwGb9M7cQzDyl7MK0ws/ULWInfimUvRk/RT6G6VDzOcxKnfj4GpWsReYr8OC1QWzXTV7jyBkkyCcM7KPkDHxmzyI51ef+nVnCPGYFMHlFY34DbYI5njsnV9gSUZyxJ13iPF2wPdjGhjzlg/RDf6mZ5zMU4nVf7vT1OhOsC2a2z2JdSRkyhZBW6Pe1WQae431fzU6aY3TZ5rc4aC6w1LxncqWT/Gu9US2y4rGUJaHZVZE7OT8vLX0KAUt40NXnsI+R0v3ke3qmCeJ06xApaMi4Ba/cw917HQA155YLNXUZdBBjnyT3wo8rSkVQlcyBTbwQ== Received: from [66.196.81.172] by nm21.bullet.mail.bf1.yahoo.com with NNFMP; 13 Aug 2016 17:22:16 -0000 Received: from [98.139.211.207] by tm18.bullet.mail.bf1.yahoo.com with NNFMP; 13 Aug 2016 17:22:16 -0000 Received: from [127.0.0.1] by smtp216.mail.bf1.yahoo.com with NNFMP; 13 Aug 2016 17:22:16 -0000 X-Yahoo-Newman-Id: 923567.6808.bm@smtp216.mail.bf1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: LJjdjpEVM1nMed7RHTWAxTJDs_DlXLiXMZgHzLJcC8b26Go pZu62brv_VdHvn4jOeMwGp6_tDSFSIX8sUntbKyD59Iwu8Z_iyc0F_Mzs_li 
wl5XjqJ2lZKX4b91F1qvqbB8dP_khZFs_NxTswCsb4hkMwNeScQ9ALia7LuP ErWm_lSG578kDa.fOBYdpEO05lH0pJtbkDIv90ftII7Nm9fr96O5.UhgwQLC tnBrjhcPzPkKgfu4NPT4SPna_FwmZuwqxkIwFxXtrTVz30Rol.Xs34P9nCVM TXYuE_5qtTG4REDAK2s8BD5Xf9es4j7vWQXFT83sC7_aPgzNVUYuaF7yGAg3 xpLu3xri_ZU_O.VBkztLGywAx0DU7j7TfRmGr.SFCSvfoPTNLET43x0Hd63i 4yUneMZJCsLb7Huf3scY0Kqa6iG5_JDdIngfXr14tBArwHBvnQRpCXnrGVVy 0YOwwJy8g.QjjbE2YWgxt4yr55tP6iYBJiywHLi4eq_pIdKSjcz6x1a1Ebfc oUKlySN4HPxfFSvXUyJAQP6AnWvsn4vK6dlGw8VW9SFq6.w-- X-Yahoo-SMTP: xcjD0guswBAZaPPIbxpWwLcp9Unf From: Pedro Giffuni Subject: Re: How to get better debugging for the kernel. To: FreeBSD Hackers Cc: Gordon Ross Message-ID: <7b58509a-556b-0784-56c5-00378a1fc5e2@FreeBSD.org> Date: Sat, 13 Aug 2016 12:22:15 -0500 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Aug 2016 17:22:18 -0000 Gordon Ross wrote: > I heard a rumor someone might be working on a port of illumos "mdb". > Anyone know if that's true, and how far along it went? > FWIW, I heard the same rumor. > We depend heavily upon this tool for production support; > so much that I'm not sure how we'd live without it. There is interest, and I have seen some attempt at work on it however it is not something likely to bear much fruit because mdb knows nothing about dwarf; AFAICT it knows only about stabs, which we don't use at all. Also, Oracle did make some Dtrace enhancements in upstream gdb that would be good to port. Pedro. 
From owner-freebsd-hackers@freebsd.org Sat Aug 13 23:49:53 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C9BCCBB9385 for ; Sat, 13 Aug 2016 23:49:53 +0000 (UTC) (envelope-from gordon.w.ross@gmail.com) Received: from mail-it0-x229.google.com (mail-it0-x229.google.com [IPv6:2607:f8b0:4001:c0b::229]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 95B871B78; Sat, 13 Aug 2016 23:49:53 +0000 (UTC) (envelope-from gordon.w.ross@gmail.com) Received: by mail-it0-x229.google.com with SMTP id f6so17360350ith.0; Sat, 13 Aug 2016 16:49:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=/dhsu4Psq+SVeJNhtoGTLKcbOsXZUPclkKEMmlcDOFk=; b=JqXEltuGrqpu2NgCNbXBbdUG/V2UWBRs2EAxdNr9TdFJVyBD/TLnku3tdaEHrGmE/y ZdBPhJ+Rkum49jQKs8U+w6h7XQ13DYim7+pWkbi0UKnlkQ+f/GDqHNzbwW8XU8v3v1W9 fAjxt5COqw5qnAruq33anjPGt2h8kn5laeyzwSkJCdHJusrIpyFXNGmxgoF7p8rYKRWy OGX6KHpo/lPh7KgvLe8vvvRDkW45/DYjixnySm0+QhTjDEdQ+bjVhmEZ9X/grzx9OB2+ j/1kEytk7QrsrRLAAvCpuc/zFTJEcMtUC1Muj3n/1vL5QsgxrkfkCZhW1CcxeJs+b7yy wzqA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=/dhsu4Psq+SVeJNhtoGTLKcbOsXZUPclkKEMmlcDOFk=; b=ZiijTfN24xbmskMJ3RBw+WctxjBNdljWYPj0DF1ueh7tFHst8+RrGpr41Zgccv6wTT AgU5Uokd82D8j5lFw6enLAd8ONHvY/h0t+lI3CauxPuf5p7IZS1DqVKxPNxC8iw9KpNt 42wUC8j0YUTzkl2g6zUzbcaFPzKKEFf+1P6NYr4koRbArcsiaZPeUfsDhF3nITjM0Go5 2IRwj/T8nwCa45+OWr5DlaVmzkZEy3TpGTaNCK3LOmnSZdbfRPwfreo09MCZMCJopP8S uTU2xBRDd3iMS1b5gekh179BFPTzuceYd6Vrn2hqGME52u5q5O+bi35EeG/AG0tfPuDB i8eA== 
X-Gm-Message-State: AEkoouvUyBOMmYEGJukfoJ4SVHp4hPxDPNgvK80S2sS4zqesx0lL53jA0RW7SAD+DM3JtqwjbIEwwjW/P2I5Hg== X-Received: by 10.36.79.9 with SMTP id c9mr6727174itb.28.1471132192971; Sat, 13 Aug 2016 16:49:52 -0700 (PDT) MIME-Version: 1.0 Received: by 10.36.17.79 with HTTP; Sat, 13 Aug 2016 16:49:52 -0700 (PDT) In-Reply-To: <7b58509a-556b-0784-56c5-00378a1fc5e2@FreeBSD.org> References: <7b58509a-556b-0784-56c5-00378a1fc5e2@FreeBSD.org> From: Gordon Ross Date: Sat, 13 Aug 2016 19:49:52 -0400 Message-ID: Subject: Re: How to get better debugging for the kernel. To: Pedro Giffuni Cc: FreeBSD Hackers Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Aug 2016 23:49:53 -0000 Actually, mdb primarily uses Compact ANSI-C Type Format (CTF), which is in sections added to binaries by tools that convert from either dwarf or stabs. I've assumed FreeBSD already has CTF since dtrace normally uses it too... On Sat, Aug 13, 2016 at 1:22 PM, Pedro Giffuni wrote: > Gordon Ross wrote: >> >> I heard a rumor someone might be working on a port of illumos "mdb". >> Anyone know if that's true, and how far along it went? >> > > FWIW, I heard the same rumor. > >> We depend heavily upon this tool for production support; >> so much that I'm not sure how we'd live without it. > > > There is interest, and I have seen some attempt at work on it however > it is not something likely to bear much fruit because mdb knows nothing > about dwarf; AFAICT it knows only about stabs, which we don't use > at all. > > Also, Oracle did make some Dtrace enhancements in upstream gdb that > would be good to port. > > Pedro.