Date: Fri, 2 Jun 2017 11:37:05 -0400
From: "Kenneth D. Merry" <ken@FreeBSD.ORG>
To: Harry Schmalzbauer <freebsd@omnilan.de>
Cc: Stephen Mcconnell <stephen.mcconnell@broadcom.com>, freebsd-scsi@freebsd.org, Scott Long <scottl@freebsd.org>
Subject: Re: mps(4) blocks panic-reboot
Message-ID: <20170602153705.GA56018@mithlond.kdm.org>
In-Reply-To: <59315A74.9050506@omnilan.de>
References: <592FDE8C.1090609@omnilan.de>
 <ff9342e2e1eb541f347d9f683cfc8214@mail.gmail.com>
 <59303484.1040609@omnilan.de>
 <e6fe7cc17fb1302caf2122eaa11d10ba@mail.gmail.com>
 <59306503.4010007@omnilan.de>
 <59315A74.9050506@omnilan.de>

On Fri, Jun 02, 2017 at 14:30:44 +0200, Harry Schmalzbauer wrote:
> Regarding Harry Schmalzbauer's message of 01.06.2017 21:03 (localtime):
> > Regarding Stephen Mcconnell's message of 01.06.2017 19:36 (localtime):
> >> Can you try the attached patch and let me know how it goes? I didn't test
> >> it, but since you know how, it might be easier this way. This was diff'd
> >> from the latest mps files in stable/11, which I recently updated (today).
> > Your diff is doing very well on r319447:
> >
> > > …
> > mps0: Sending StopUnit: path (xpt0:mps0:0:6:ffffffff): handle 13
> > mps0: Completing stop unit for (xpt0:mps0:0:6:ffffffff):
> >
> > And, there followed an immediate reset :-)
>
> There's one new problem: Shutting down leads to the probably last panic
> possible:
>
> kernel trap 12 with interrupts disabled
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x20
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0xffffffff805f43ec
> stack pointer           = 0x28:0xfffffe03bc9c3730
> frame pointer           = 0x28:0xfffffe03bc9c3750
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = resume, IOPL = 0
> current process         = 1 (init)
> trap number             = 12
> panic: page fault
> cpuid = 0
> KDB: stack backtrace:
> #0 0xffffffff805df4f7 at kdb_backtrace+0x67
> #1 0xffffffff8059df96 at vpanic+0x186
> #2 0xffffffff8059de03 at panic+0x43
> #3 0xffffffff808a1892 at trap_fatal+0x322
> #4 0xffffffff808a18e9 at trap_pfault+0x49
> #5 0xffffffff808a1126 at trap+0x286
> #6 0xffffffff80887401 at calltrap+0x8
> #7 0xffffffff805800f2 at __mtx_unlock_sleep+0x72
> #8 0xffffffff8029a7dc at xpt_polled_action+0x31c
> #9 0xffffffff80416c2b at mpssas_ir_shutdown+0x51b
> #10 0xffffffff8059db9a at kern_reboot+0x49a
> #11 0xffffffff8059d6f8 at sys_reboot+0x458
> #12 0xffffffff808a23f4 at amd64_syscall+0x6c4
> #13 0xffffffff808876eb at Xfast_syscall+0xfb
>
> (kgdb) list *0xffffffff805f43ec
> 0xffffffff805f43ec is in turnstile_broadcast
> (/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/subr_turnstile.c:837).
> 832
> 833             /*
> 834              * Transfer the blocked list to the pending list.
> 835              */
> 836             mtx_lock_spin(&td_contested_lock);
> 837             TAILQ_CONCAT(&ts->ts_pending, &ts->ts_blocked[queue], td_lockq);
> 838             mtx_unlock_spin(&td_contested_lock);
> 839
> 840             /*
> 841              * Give a turnstile to each thread.  The last thread gets
>
> I haven't looked at the code at all and only very briefly looked at the
> diff, just out of curiosity, like pigs staring at clockworks ;-)
>
> But at least I hope this report does help.

Thanks for testing it!

My guess is that the problem is that xpt_polled_action() releases the device
mutex, but mpssas_SSU_to_SATA_devices() isn't acquiring the mutex.

You could try putting the following around the call to xpt_polled_action():

	mtx_lock(xpt_path_mtx(ccb->ccb_h.path));
	xpt_polled_action(ccb);
	mtx_unlock(xpt_path_mtx(ccb->ccb_h.path));

See if that fixes things.

One other thing to put in there -- after the if (target->stop_at_shutdown) { }
statement, but still inside the for loop, add these two lines:

	xpt_free_path(ccb->ccb_h.path);
	xpt_free_ccb(ccb);

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG
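
For anyone who wants to see the locking contract in isolation, here is a rough userspace sketch. It is not CAM or mps(4) code: fake_path, fake_ccb, fake_polled_action() and stop_one_target() are hypothetical stand-ins, and a pthread error-checking mutex plays the role of the device mutex. It assumes, as guessed above, that the polled routine drops and retakes a lock it expects the caller to hold, and it frees the path and CCB on every pass.

```c
/* Userspace sketch only: pthread stand-ins, not CAM or mps(4) code. */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for PTHREAD_MUTEX_ERRORCHECK */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a CAM path (which owns the lock) and a CCB. */
struct fake_path { pthread_mutex_t mtx; };
struct fake_ccb { struct fake_path *path; int done; };

static int live_allocs;	/* outstanding fake paths plus fake CCBs */

static struct fake_ccb *
fake_ccb_alloc(void)
{
	pthread_mutexattr_t attr;
	struct fake_path *path = malloc(sizeof(*path));
	struct fake_ccb *ccb = malloc(sizeof(*ccb));

	assert(path != NULL && ccb != NULL);
	/* Error-checking type: unlocking an unowned mutex returns EPERM. */
	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&path->mtx, &attr);
	pthread_mutexattr_destroy(&attr);
	ccb->path = path;
	ccb->done = 0;
	live_allocs += 2;
	return (ccb);
}

static void
fake_ccb_free(struct fake_ccb *ccb)
{
	pthread_mutex_destroy(&ccb->path->mtx);
	free(ccb->path);
	free(ccb);
	live_allocs -= 2;
}

/* Assumed contract: drops the caller's lock while polling, then retakes it. */
static int
fake_polled_action(struct fake_ccb *ccb)
{
	int error = pthread_mutex_unlock(&ccb->path->mtx);

	if (error != 0)
		return (error);	/* the caller never took the lock */
	ccb->done = 1;		/* pretend the command completed */
	return (pthread_mutex_lock(&ccb->path->mtx));
}

/* One loop iteration: lock around the polled call, then free both objects. */
static int
stop_one_target(int take_lock)
{
	struct fake_ccb *ccb = fake_ccb_alloc();
	int error;

	if (take_lock)
		pthread_mutex_lock(&ccb->path->mtx);
	error = fake_polled_action(ccb);
	if (take_lock)
		pthread_mutex_unlock(&ccb->path->mtx);
	fake_ccb_free(ccb);	/* mirrors freeing the path and the CCB */
	return (error);
}
```

In this model stop_one_target(1) returns 0, while stop_one_target(0) returns EPERM, which is the polite userspace analogue of unlocking a mutex that was never taken. The kernel has no such safety net, so the real fix still needs testing on the affected hardware.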