Date: Mon, 29 Apr 2013 08:37:51 +0300
From: Alexander Motin <mav@FreeBSD.org>
To: Peter Wemm <peter@wemm.org>
Cc: svn-src-stable@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, svn-src-stable-9@freebsd.org
Subject: Re: svn commit: r249611 - in stable/9/sys/cam: ata scsi
Message-ID: <517E072F.2030006@FreeBSD.org>
In-Reply-To: <CAGE5yCov5PoWHO3pHH_L8JGndOVdW7S2sefFEbtQV2N4Sc3R8A@mail.gmail.com>
References: <201304180944.r3I9i05t093967@svn.freebsd.org>
 <CAGE5yCrdLUmgDOkFy7u8PpwUgtCcD4=kv0UVO79RPMR80mJ1xQ@mail.gmail.com>
 <517AC0BB.4040207@FreeBSD.org>
 <CAGE5yCrHoWUqcCe4=ZfO+27s8WdorfbNPuB7dLkCr24du7LsCA@mail.gmail.com>
 <CAGE5yCov5PoWHO3pHH_L8JGndOVdW7S2sefFEbtQV2N4Sc3R8A@mail.gmail.com>

On 29.04.2013 04:29, Peter Wemm wrote:
> On Sat, Apr 27, 2013 at 5:56 PM, Peter Wemm <peter@wemm.org> wrote:
>> On Fri, Apr 26, 2013 at 11:00 AM, Alexander Motin <mav@freebsd.org> wrote:
>>> On 26.04.2013 19:47, Peter Wemm wrote:
>>>>
>>>> On Thu, Apr 18, 2013 at 2:44 AM, Alexander Motin <mav@freebsd.org> wrote:
>>>>>
>>>>> Author: mav
>>>>> Date: Thu Apr 18 09:44:00 2013
>>>>> New Revision: 249611
>>>>> URL: http://svnweb.freebsd.org/changeset/base/249611
>> [..]
>>>> This breaks a number of machines in the freebsd.org cluster. I have
>>>> to back out both of these changes to get them to reboot.
>>>
>>>
>>> I've made a search though the base system and found only two drivers
>>> affected by this change: mpt and hptmv. I've patched both at head r249849
>>> and going to merge fix to stable/9 tomorrow unless objected. Have you tried
>>> that patch instead of reverting?
>>
>> I'm testing this on ns1.freebsd.org and ns2.freebsd.org as we speak.
>> If the cluster goes dark, that's why :)
>
> The machines have survived multiple reboots with
> r249849. Thank you.
> I do wonder if perhaps "post_sync" isn't the ideal name. Perhaps add
> a "quiesce_hardware" eventhandler chain and make it clear that this is
> the hook for what things like mpt were doing.

It is only one specific case, but I am not sure that mpt is doing the
right thing. According to the commit log, interrupts are disabled to
prevent new incoming target commands. But maybe that could/should be
blocked in some other way.

What worries me more is that I've just rediscovered that post_sync
handlers are called even in case of panic, when no FS sync happens at
all. Is it wise to touch random subsystems in a state where nothing can
be trusted? I recall that earlier I even added checks to the CAM ATA
disk driver so that it does not recurse on a lock in case of panic, but
that IMO is a dirty hack.

-- 
Alexander Motin
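
[Editorial illustration, not part of the original message.] The concern about post_sync handlers running during a panic can be sketched with a small user-space model. Everything in it is hypothetical: handler_chain, chain_register, chain_invoke, naive_quiesce, guarded_quiesce and simulate_shutdown are invented for this sketch and are not FreeBSD kernel interfaces, and the "panicking" flag merely stands in for whatever panic indication a real handler would consult. It shows only the general shape of the problem: one chain invoked on both the clean-shutdown and the panic path, with each handler left to guard itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are FreeBSD kernel APIs. */
#define MAX_HANDLERS 8

typedef void (*shutdown_fn)(void *arg, int panicking);

struct handler_chain {
	shutdown_fn fn[MAX_HANDLERS];
	void *arg[MAX_HANDLERS];
	size_t count;
};

/* Append a handler to the chain; returns -1 if the chain is full. */
static int
chain_register(struct handler_chain *c, shutdown_fn fn, void *arg)
{
	assert(c != NULL && fn != NULL);
	if (c->count >= MAX_HANDLERS)
		return (-1);
	c->fn[c->count] = fn;
	c->arg[c->count] = arg;
	c->count++;
	return (0);
}

/* Call every handler in registration order, panic or not. */
static void
chain_invoke(const struct handler_chain *c, int panicking)
{
	for (size_t i = 0; i < c->count; i++)
		c->fn[i](c->arg[i], panicking);
}

/* Touches its (simulated) device unconditionally, even during a panic. */
static void
naive_quiesce(void *arg, int panicking)
{
	(void)panicking;
	++*(int *)arg;
}

/* Leaves its (simulated) device alone when the system is panicking. */
static void
guarded_quiesce(void *arg, int panicking)
{
	if (panicking)
		return;
	++*(int *)arg;
}

struct touch_counts {
	int naive;	/* device touches made by naive_quiesce */
	int guarded;	/* device touches made by guarded_quiesce */
};

/* Register both handlers on one chain and run it once. */
static struct touch_counts
simulate_shutdown(int panicking)
{
	struct handler_chain post_sync = { .count = 0 };
	struct touch_counts t = { 0, 0 };

	chain_register(&post_sync, naive_quiesce, &t.naive);
	chain_register(&post_sync, guarded_quiesce, &t.guarded);
	chain_invoke(&post_sync, panicking);
	return (t);
}
```

Calling simulate_shutdown(0) models a clean shutdown and simulate_shutdown(1) models the panic path. In this model the per-handler check is the only thing keeping a handler away from its device during a panic, which is the kind of ad hoc guard the message calls a dirty hack; a separate chain that the panic path never invokes would move that decision to one place.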