From: Harry Schmalzbauer
Date: Thu, 01 Jun 2017 20:30:39 +0200
To: Stephen Mcconnell
CC: freebsd-scsi@freebsd.org, Scott Long
Subject: Re: sporadic CAM (all devices) outage on 11-stable, mps(4), ahci(4) and bhyve(8) involved.
 [Was: Re: mps(4) blocks panic-reboot]

Regarding Stephen Mcconnell's message of 01.06.2017 20:12 (localtime):
>> -----Original Message-----
>> From: Harry Schmalzbauer [mailto:freebsd@omnilan.de]
>> Sent: Thursday, June 01, 2017 12:03 PM
>> To: Stephen Mcconnell
>> Cc: freebsd-scsi@freebsd.org; Scott Long
>> Subject: Re: mps(4) blocks panic-reboot
>>
>> Regarding Stephen Mcconnell's message of 01.06.2017 19:36 (localtime):
>>> Can you try the attached patch and let me know how it goes? I didn't
>>> test it, but since you know how, it might be easier this way. This was
>>> diff'd from the latest mps files in stable/11, which I recently updated
>>> (today).
>>
>> Thanks a lot, I noticed the highly appreciated MFC!
>> Things are cooking... There were sysdecode userland changes, so I need to
>> build world as well before my rollout system provides the update for this
>> machine; it will be ready in an hour.
>>
>> Since I have an expert's attention, I'd like to ask another mps(4)-related
>> question:
>>
>> I had unionfs deadlocks.
>> (I'm aware of the broken status of unionfs, and since I'm not able to fix
>> it myself at the moment, I already replaced it with nullfs where possible,
>> true for the following event.)
>>
>> Since this machine has a memory disk as rootfs (and 5 SSDs via mps(4) for
>> bootpool and a separate syspool, where /var e.g. lives), I guess the
>> deadlock is responsible for the simultaneous disappearance of all
>> mps(4)-attached drives.
>>
>> Is that plausible? (Meaning, does the mps(4) driver depend on the
>> filesystem subsystem?)
>>
>> Or do you have any idea what else could lead to the disappearance of all
>> drives simultaneously? Other ATA drives, attached via on-board ahci (C203),
>> were not affected!
>> Unfortunately, I haven't been able to record any kernel messages when that
>> happened (3 times as far as I remember, no occurrence since abandoning
>> unionfs yet).
>
> This doesn't seem like an mps driver problem to me, but maybe someone else
> here can help more than I can. I can't think of anything that might be
> causing your drives to disappear. It would help if you could get some kernel
> logs when this happens.

Thanks, I should have searched beforehand...
I got two things wrong: at least once, SATA drives attached via ahci(4) were
also affected, and I did note some kernel messages. Please see this post:
https://lists.freebsd.org/pipermail/freebsd-scsi/2016-December/007216.html

Sorry, I thought it was longer ago and had not been discussed on scsi@ at all...

At that time unionfs was involved, which later led to complete deadlocks on
different setups with completely different applications.
But I think that deadlock is one possible root cause of the problems these
setups had in common.

So if an expert can tell me either "nope, disappearing drives can't be
related to (union)fs deadlocks" or the opposite, I'd be deeply grateful.

-harry