From: Randall Stewart
To: freebsd-current@freebsd.org
Subject: Callout lockups and spin-lock held to long panic..
Date: Tue, 27 Jan 2015 10:43:25 -0800
Message-Id: <8334666F-AE31-4298-A6D7-11453A22DF41@freebsd.org>

All:

I just wanted to send a note to let folks know that I have finally gotten to the bottom of the crashes Sean Bruno has been seeing, and will shortly have a fix committed for them.

The problem was related to two callout_reset() calls being run while a migration was happening and the callout in question was executing (or waiting to execute).
The twin callout resets would in the end each remove the entry from the linked list (so it was removed twice), thus corrupting the list. The callout code would then run holding the CC_lock, spinning forever as it walked the linked list, which caused the crash.

I was able to reproduce this in a branch here at Netflix, so I can show that the fix I have actually fixes the issue.

It will be a couple more days of proving things out, followed by hopefully getting interested reviewers to review the patch, and from there I can commit it to head.

Best wishes
R

------------------------------
Randall Stewart
803-317-4952 (cell)
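
To illustrate the failure mode described above, here is a small user-space sketch of how removing the same entry from a linked list twice can leave a cycle that a walker never escapes. It is not the code in kern_timeout.c and does not reproduce the real interleaving; the struct, the function names, and the intervening "reschedule" step are all assumptions made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a callout entry; not the kernel's struct callout. */
struct entry {
    struct entry *next;    /* next entry in the list, or NULL */
    struct entry **pprev;  /* address of the pointer that points at this entry */
};

/* Insert e at the head of the list. */
static void list_insert_head(struct entry **head, struct entry *e)
{
    e->next = *head;
    if (*head != NULL)
        (*head)->pprev = &e->next;
    *head = e;
    e->pprev = head;
}

/* Unlink e using whatever pointers it currently holds; nothing checks
 * that e is still on a list, so a second call acts on stale pointers. */
static void list_remove(struct entry *e)
{
    if (e->next != NULL)
        e->next->pprev = e->pprev;
    *e->pprev = e->next;
}

/* Walk at most `limit` entries. Returns the number of entries seen,
 * or -1 if the walk did not reach the end (the list has a cycle). */
static int list_walk(struct entry *head, int limit)
{
    int n = 0;
    for (struct entry *e = head; e != NULL; e = e->next) {
        if (++n > limit)
            return -1;
    }
    return n;
}

/* Build head -> p -> x -> n, remove x, reschedule n to the head,
 * and optionally remove x a second time. Returns the walk result. */
static int demo(int remove_twice)
{
    struct entry *head = NULL;
    struct entry p = {0}, x = {0}, n = {0};

    list_insert_head(&head, &n);
    list_insert_head(&head, &x);
    list_insert_head(&head, &p);
    assert(list_walk(head, 16) == 3);

    list_remove(&x);              /* first removal: head -> p -> n */
    list_remove(&n);              /* n is rescheduled ... */
    list_insert_head(&head, &n);  /* ... to the front: head -> n -> p */

    if (remove_twice)
        list_remove(&x);          /* stale removal writes p.next = n */

    return list_walk(head, 16);
}
```

In this sketch demo(0) walks two entries and stops, while demo(1) leaves n and p pointing at each other, so only the walk limit ends the traversal. A walker without such a limit, running with a spin lock held, would never finish.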