Date:      Tue, 6 Nov 2012 18:26:13 -0000
From:      "Steven Hartland" <killing@multiplay.co.uk>
To:        "Doug Ambrisko" <ambrisko@ambrisko.com>
Cc:        freebsd-scsi@freebsd.org, freebsd-stable@freebsd.org
Subject:   Re: mfi panic on recused on non-recusive mutex MFI I/O lock
Message-ID:  <6B5B65F4FC854EB8BBC701500096602E@multiplay.co.uk>
References:  <2DC1C56CFFF24FE0B17C34AD21A7DFAA@multiplay.co.uk> <39D16C43C8274CE9B8F23C18459E2FD4@multiplay.co.uk> <20121105212911.GA17904@ambrisko.com> <27169C7FE704495087A093752D15E7B6@multiplay.co.uk> <20121106180152.GA40422@ambrisko.com>

----- Original Message ----- 
From: "Doug Ambrisko" <ambrisko@ambrisko.com>

> On Tue, Nov 06, 2012 at 12:09:42AM -0000, Steven Hartland wrote:
> | Thanks Doug, actually just finished another test run with some more
> | debugging in and I believe I've found the reason for the non-recursive
> | lock and at least some of the queuing issues.
> | 
> | The non-recursive lock is due to the mfi_tbolt_reset calling
> | mfi_process_fw_state_chg_isr with mfi_io_lock held which in turn calls
> | mfi_tbolt_init_MFI_queue which tries to acquire mfi_io_lock hence
> | the problem.
> | 
> | mfi-lock.txt attached I believe fixes this as well as what appears
> | to be an invalid call to mtx_unlock(&sc->mfi_io_lock) in mfi_attach
> | which never acquires the lock as far as I can see, possibly a cut and
> | paste error.
> 
> I don't seem to see the attachment.

Yeah, seems like a mail failure on my side, but I've had some more locking
panics during today's tests anyway, requiring additional fixes. Will update
and post when I'm happy with it.

> | The invalid queue problems seem to stem from the error cases of
> | the calls to mfi_mapcmd, some of which call mfi_release_command which
> | blindly sets cm_flags = 0 and then enqueues it on the free queue. Now
> | depending on the flow of mfi_mapcmd and where the error occurs the
> | command may or may not have been put on the busy queue which is going
> | to cause problems.
> | 
> | Going to investigate this further but that's what my current theory is.

I think I've pretty much nailed the queuing issues; there are quite a few,
it seems, caused by inconsistent handling of calls to mfi_mapcmd, as suspected.
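
A rough userland sketch of the kind of inconsistency described above, under
stated assumptions: the flag names, the cmd_pool structure and all function
names are hypothetical, and the queues are reduced to counters rather than
real linked lists. It shows a release path that blindly clears the flags of a
command that may still be on the busy queue, next to one that checks first.

```c
#include <assert.h>

/* Hypothetical flag bits; not the driver's actual definitions. */
#define CMD_ON_FREE 0x1u
#define CMD_ON_BUSY 0x2u

struct cmd {
    unsigned flags;     /* which queue the command believes it is on */
};

/* Queues reduced to counters to keep the sketch short. */
struct cmd_pool {
    int nfree;
    int nbusy;
};

void enqueue_busy(struct cmd_pool *p, struct cmd *cm)
{
    assert((cm->flags & CMD_ON_BUSY) == 0);
    cm->flags |= CMD_ON_BUSY;
    p->nbusy++;
}

void remove_busy(struct cmd_pool *p, struct cmd *cm)
{
    assert((cm->flags & CMD_ON_BUSY) != 0);
    cm->flags &= ~CMD_ON_BUSY;
    p->nbusy--;
}

/*
 * Problem shape: wipes the flags without looking at them, so a command
 * that was on the busy queue stays counted there while also going onto
 * the free queue.
 */
void release_blind(struct cmd_pool *p, struct cmd *cm)
{
    cm->flags = 0;
    cm->flags |= CMD_ON_FREE;
    p->nfree++;
}

/* Possible fix: take the command off the busy queue first if it is there. */
void release_checked(struct cmd_pool *p, struct cmd *cm)
{
    if ((cm->flags & CMD_ON_BUSY) != 0)
        remove_busy(p, cm);
    cm->flags = CMD_ON_FREE;
    p->nfree++;
}
```

After enqueue_busy followed by release_blind, the pool still counts one busy
command although no command carries the busy flag. With release_checked the
counts stay consistent whether or not the error occurred before the command
reached the busy queue.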

My current outstanding issue is that after adapter reset, commands are left
in the queue causing constant timeouts. Hopefully this should be relatively
easy to track down and fix too.

> | Your patch seems quite extensive, so if you could give me a brief
> | rundown of the changes, that would be most appreciated.
> 
> I'll be doing that in the commit message, which should happen today.

Cool, look forward to it :)

    Regards
    Steve
