Date: Tue, 17 Dec 2019 20:29:57 +0000
From: bugzilla-noreply@freebsd.org
To: scsi@FreeBSD.org
Subject: [Bug 219857] panic in scsi_cd code
Message-ID: <bug-219857-5313-ZmBKLeodrY@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-219857-5313@https.bugs.freebsd.org/bugzilla/>
References: <bug-219857-5313@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219857

--- Comment #33 from commit-hook@freebsd.org ---
A commit references this bug:

Author: ken
Date: Tue Dec 17 20:29:47 UTC 2019
New revision: 355862
URL: https://svnweb.freebsd.org/changeset/base/355862

Log:
  MFC r355299:

  ------------------------------------------------------------------------
  r355299 | ken | 2019-12-02 14:57:39 -0500 (Mon, 02 Dec 2019) | 52 lines

  Fix a hang introduced in r351599.

  My changes in 351599 (kindly committed by avg) made the cd(4) media check
  asynchronous to avoid a sleep while holding a mutex.

  There was a difficult-to-reproduce bug with those changes that caused a
  hang on boot on some single-processor machines/VMs.

  Leandro Lupori managed to reproduce the bug, diagnose it, and supplied
  a patch! Here is his analysis, from the PR:

  ======
  I was able to reproduce the problem described in comment#14.

  Actually, I wasn't trying to reproduce it, I just started seeing it a few
  weeks ago, in CURRENT.

  I can reproduce it consistently, by using QEMU to run a PowerPC64 VM with
  a single core/thread (-smp 1).

  It happens only when there is no media in the emulated CD-ROM, a device
  that QEMU adds by default, unless -nodefaults is specified on the command
  line.

  I've debugged it and this is what I've found:

  1- After the CD probe is successful, GEOM will try to open the device,
     which will end up calling cdcheckmedia(), which sets the CD state to
     CD_STATE_MEDIA_PREVENT.

  2- Next, scsi_prevent() is executed and succeeds, the CD_FLAG_DISC_LOCKED
     flag is set and the CD state moves to CD_STATE_MEDIA_SIZE.

  3- Next, scsi_read_capacity() is executed and fails, the state is set to
     CD_STATE_MEDIA_ALLOW, cdmediaprobedone() is called and wakes up
     cdcheckmedia().

  4- Then, when cdstart() is invoked to process CD_STATE_MEDIA_ALLOW, it
     first checks if CD_FLAG_DISC_LOCKED is set, and if so skips directly
     to the CD_STATE_MEDIA_SIZE state. This will repeat the steps of
     bullet 3, entering an infinite MEDIA_SIZE command loop.

  When there is at least one other core/thread, the GEOM thread that
  performed the initial cdopen() will get scheduled again, closing the CD
  device, which will call cdprevent(PR_ALLOW), which clears the
  CD_FLAG_DISC_LOCKED flag and breaks the loop.

  So, apparently, the problem is CD_STATE_MEDIA_ALLOW being skipped when
  CD_FLAG_DISC_LOCKED is set. If I understand correctly, in this case, the
  state should be advanced to CD_STATE_MEDIA_SIZE only when the current
  state is CD_STATE_MEDIA_PREVENT.
  =====
  ------------------------------------------------------------------------

  PR:		kern/219857
  Submitted by:	Leandro Lupori <leandro.lupori@gmail.com>

Changes:
_U  stable/12/
  stable/12/sys/cam/scsi/scsi_cd.c

-- 
You are receiving this mail because:
You are on the CC list for the bug.