Subject: Re: mpr(4) SAS3008 Repeated Crashing
To: Borja Marcos
Cc: Scott Long, FreeBSD-scsi
From: Steven Hartland
Date: Fri, 4 Mar 2016 11:07:48 +0000

On 04/03/2016 10:58, Borja Marcos wrote:
>> On 04 Mar 2016, at 10:16, Steven Hartland wrote:
>>
>> Its very rare but we've also seen this type of behaviour from a failing Intel CPU. There was no other indication the CPU had an issue, which one might expect, so just wanted to make you aware of the possibility.
>>
>> That said the most common cause of this we've seen, when its not a common disk or disks, is a bad backplane or cabling to the backplane.
> Now I’m really curious!
>
> How did you determine that it was the CPU? And what kind of issue was it causing? Noise in the power rails? Interference?

After a month or so of fixing mfi so that it recovered from all bad events and prevented all the various kernel panics, the machine stayed running long enough to log an MCA, which pointed to a failing CPU cache.

We were lucky it was CPU #2, so we disabled all cores for that CPU in /boot/loader.conf and all the issues disappeared. We replaced the CPU and have had no more issues.

We were in the same situation as you: two machines with identical configs, one of which was constantly panicking in mfi while the other was rock solid.

Regards
Steve