Date:      Thu, 01 Jun 2017 20:30:39 +0200
From:      Harry Schmalzbauer <freebsd@omnilan.de>
To:        Stephen Mcconnell <stephen.mcconnell@broadcom.com>
Cc:        freebsd-scsi@freebsd.org, Scott Long <scottl@freebsd.org>
Subject:   Re: sporadic CAM (all devices) outage on 11-stable, mps(4), ahci(4) and bhyve(8) involved. [Was: Re: mps(4) blocks panic-reboot]
Message-ID:  <59305D4F.40707@omnilan.de>
In-Reply-To: <d48587b45e608cd519155d19567d03af@mail.gmail.com>
References:  <592FDE8C.1090609@omnilan.de> <12a36df9eff99c77ec621987efbe75fe@mail.gmail.com> <ff9342e2e1eb541f347d9f683cfc8214@mail.gmail.com> <59303484.1040609@omnilan.de> <e6fe7cc17fb1302caf2122eaa11d10ba@mail.gmail.com> <593056E9.6000807@omnilan.de> <d48587b45e608cd519155d19567d03af@mail.gmail.com>

Regarding Stephen Mcconnell's message of 01.06.2017 20:12 (localtime):
>> -----Original Message-----
>> From: Harry Schmalzbauer [mailto:freebsd@omnilan.de]
>> Sent: Thursday, June 01, 2017 12:03 PM
>> To: Stephen Mcconnell
>> Cc: freebsd-scsi@freebsd.org; Scott Long
>> Subject: Re: mps(4) blocks panic-reboot
>>
>> Regarding Stephen Mcconnell's message of 01.06.2017 19:36 (localtime):
>>> Can you try the attached patch and let me know how it goes? I didn't
>>> test it, but since you know how, it might be easier this way. This was
>>> diff'd from the latest mps files in stable/11, which I recently updated
>>> (today).
>>
>> Thanks a lot, I noticed the highly appreciated MFC!
>> Things are cooking... There were sysdecode userland changes, so I also need
>> to build world before my rollout system provides the update for this
>> machine; it will be ready in an hour.
>>
>> Since I have an expert's attention, I'd like to ask another mps(4)-related
>> question:
>>
>> I had unionfs deadlocks. (I'm aware of the broken status of unionfs, and
>> since I'm not able to fix it myself at the moment, I already replaced it
>> with nullfs where possible, true for the following event.)
>>
>> Since this machine has a memory disk as rootfs (and 5 SSDs via mps(4) for
>> bootpool and a separate syspool, where e.g. /var lives), I guess the
>> deadlock is responsible for the simultaneous disappearance of all
>> mps(4)-attached drives.
>>
>> Is that plausible? (Meaning, does the mps(4) driver depend on the
>> filesystem subsystem?)
>>
>> Or do you have any idea what else could lead to the disappearance of all
>> drives simultaneously? Other ATA drives, attached via on-board ahci (C203),
>> were not affected!
>> Unfortunately, I haven't been able to record any kernel messages when that
>> happened (3 times as far as I remember, no occurrence since abandoning
>> unionfs yet).
> 
> This doesn't seem like an mps driver problem to me, but maybe someone else
> here can help more than I can. I can't think of anything that might be
> causing your drives to disappear. It would help if you could get some kernel
> logs when this happens.

Thanks, I should have searched beforehand... Two corrections: at least once,
SATA drives attached via ahci(4) were also affected, and I did note some
kernel messages.

Please see this post:
https://lists.freebsd.org/pipermail/freebsd-scsi/2016-December/007216.html

Sorry, I thought it was longer ago and not discussed at scsi@ at all...

At that time, unionfs was involved, which later led to complete
deadlocks on different setups with completely different applications.
But I think that the deadlock is one possible root of the problems these
setups had in common.

So if an expert can tell me "nope, disappearing drives can't be related
to (union)fs deadlocks", or the opposite, I'd be deeply grateful.

-harry
