From: "rrs (Randall Stewart)"
To: freebsd-net@freebsd.org
Date: Wed, 18 Feb 2015 11:37:59 +0000
Subject: [Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests).
Message-ID: <953272367a10765054ccf4e0b5d231ba@localhost.localdomain>

rrs added a comment.

I have thought long and hard about this. I don't think it's a bug, but to know for sure I will need to add some instrumentation.

I suspect what is happening is that a tremendous number of callouts all come due at the same time. The three backtraces trying to stop or reset a callout are just unlucky in that they don't get the lock while the callout code works through its loop of:

    CC_LOCK(cc)
    while there is more on the list
        prepare callout
        CC_UNLOCK(cc)
        call_callout_function
        CC_LOCK(cc)
    done
    CC_UNLOCK(cc)

The spin-mtx has (from what I can see) no awareness of the fact that you might have lost several bids to get the lock. It just crashes if it spins for too long and cannot get the lock.

The previous problem, which I could reproduce, is fixed: the callout temp-list was corrupt and pointing to itself, so the soft clock looped forever.
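
For readers who want to see the locking pattern in compilable form, here is a minimal userspace sketch of the loop described above. It is not the kernel code: a pthread mutex stands in for the spin mutex, and every name in it (fake_callout, fake_cc, drain_due, bump) is hypothetical. It only shows that the lock is dropped and retaken once per callout, which is where a thread trying to stop or reset a callout has to win its bid.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the FreeBSD source. */
struct fake_callout {
    void (*func)(void *);
    void *arg;
    struct fake_callout *next;
};

struct fake_cc {
    pthread_mutex_t lock;        /* plays the role of the callout lock */
    struct fake_callout *head;   /* callouts that have come due */
    unsigned long lock_grabs;    /* times the drain loop took the lock */
};

/* Example handler: bumps the int that arg points to. */
static void
bump(void *arg)
{
    (*(int *)arg)++;
}

/* Run every due callout, dropping the lock around each handler call. */
static int
drain_due(struct fake_cc *cc)
{
    int ran = 0;

    pthread_mutex_lock(&cc->lock);             /* CC_LOCK(cc) */
    cc->lock_grabs++;
    while (cc->head != NULL) {                 /* more on the list */
        struct fake_callout *c = cc->head;

        assert(c->func != NULL);
        cc->head = c->next;                    /* prepare callout */
        c->next = NULL;
        pthread_mutex_unlock(&cc->lock);       /* CC_UNLOCK(cc) */
        c->func(c->arg);                       /* call the callout function */
        ran++;
        pthread_mutex_lock(&cc->lock);         /* CC_LOCK(cc) */
        cc->lock_grabs++;
    }
    pthread_mutex_unlock(&cc->lock);           /* CC_UNLOCK(cc) */
    return ran;
}
```

With N callouts due, the loop takes the lock N + 1 times, so a waiter competes against the loop on every iteration rather than once per batch.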

Hiren/Sbruno:

Let me make a special patch that adds some counts to the cpu_cc structure so that we can find out:

1) For both callout loops, how many callouts the last call had
2) For both callout loops, what the maximum ever seen was

This will give us a hint as to whether I am correct. I have also asked jhb for his thoughts on this by email.

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, sbruno, imp, adrian, hselasky
Cc: julian, hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
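
The two counts proposed above (last pass and maximum ever seen, for each of the two callout loops) could be kept roughly as follows. This is only an illustrative sketch, not the actual patch: the structure and field names (loop_stats, cc_stats, record_pass) are assumptions, and in the kernel the update would happen while the callout lock is held.

```c
#include <assert.h>

/* Hypothetical counters; names are illustrative, not taken from the patch. */
struct loop_stats {
    unsigned int last;   /* callouts handled by the most recent pass */
    unsigned int max;    /* largest pass seen so far */
};

struct cc_stats {
    struct loop_stats loop[2];   /* one entry per callout loop */
};

/* Record how many callouts one pass of loop `which` handled. */
static void
record_pass(struct cc_stats *st, int which, unsigned int handled)
{
    assert(which == 0 || which == 1);
    st->loop[which].last = handled;
    if (handled > st->loop[which].max)
        st->loop[which].max = handled;
}
```

A very large max next to the backtraces would support the theory that the waiters simply lost too many bids for the lock.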