Date: Mon, 24 Jul 2017 19:45:01 +0200
From: Mark Martinec <Mark.Martinec+freebsd@ijs.si>
To: freebsd-stable@freebsd.org
Cc: re@freebsd.org
Subject: Re: The 11.1-RC3 can only boot and attach disks in "Safe mode", otherwise gets stuck attaching
Message-ID: <42cc3fffe99f5b7d5deb7d7bf8d071cd@ijs.si>
In-Reply-To: <D466218B-9C65-418E-B9C1-5AE904EA72CA@freebsd.org>
References: <e4acc16980fe65751325333870bf2b68@ijs.si>
 <20170717232434.GB21048@wkstn-mjohnston.west.isilon.com>
 <c8140f430fb2af93a6bc70a3df8cdadc@ijs.si>
 <9b3563aae75aa954d7fe31ffe25e1d29@ijs.si>
 <20170720000325.GB9198@wkstn-mjohnston.west.isilon.com>
 <81295bcacd7c44813de8d346c88cbb65@ijs.si>
 <20170724021504.GA97170@raichu>
 <10649c9070bc419d93ae2a87a511d2ba@ijs.si>
 <c9b444f1-cb74-8402-4033-0d6161739e8f@multiplay.co.uk>
 <D466218B-9C65-418E-B9C1-5AE904EA72CA@freebsd.org>
2017-07-24 18:25, Ken Merry wrote:
> It is possible that the change I MFCed today (r321207 in head, r321415
> in stable/11) is related, but Mark will have to boot his machine with
> the fix to see if it makes any difference.
>
> What happened in my case on one particular machine (not on most
> machines in our lab running the same code) was that mps_wait_command()
> / mpr_wait_command() would not wait the full 60 seconds for a write to
> the DPM table (Driver Persistent Mapping) table in the controller.
> So, it reported that there was a timeout.
> [...]
> Eliminating bogus timeouts will eliminate most all of the sources of
> those panics anyway.

Took r321415 from stable/11 and applied it to 11.1-RC3 - and it
makes no difference to booting: still hangs attempting to attach
da0, with a spinning CPU (according to fan speed).

Booting in safe mode, or with EARLY_AP_STARTUP disabled avoids
the problem.

> There is a secondary bug that is still in the mps(4) / mpr(4) drivers
> when a timeout does happen — the error recovery code in the
> wait_command() routine reinitializes the controller, which clears out
> all the commands. When the wait_command() routine returns, the
> command passed in has been freed, but the caller doesn’t know that.
> So the caller (it happens in a number of places) dereferences a
> pointer to freed memory and the kernel panics.
>
> I’m planning to fix that bug, too, if slm@ doesn’t get to it first,
> I’ve just had other bugs to fix first.

No panics in my case, just hangs.

  Mark