From owner-svn-src-all@freebsd.org Tue Dec 17 20:29:47 2019 Return-Path: Delivered-To: svn-src-all@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id E0AAB1C921E; Tue, 17 Dec 2019 20:29:47 +0000 (UTC) (envelope-from ken@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 47cqWH5fNQz3FyM; Tue, 17 Dec 2019 20:29:47 +0000 (UTC) (envelope-from ken@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id BD2AA1B4A4; Tue, 17 Dec 2019 20:29:47 +0000 (UTC) (envelope-from ken@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id xBHKTlq6007702; Tue, 17 Dec 2019 20:29:47 GMT (envelope-from ken@FreeBSD.org) Received: (from ken@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id xBHKTlpE007701; Tue, 17 Dec 2019 20:29:47 GMT (envelope-from ken@FreeBSD.org) Message-Id: <201912172029.xBHKTlpE007701@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: ken set sender to ken@FreeBSD.org using -f From: "Kenneth D. Merry" Date: Tue, 17 Dec 2019 20:29:47 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-12@freebsd.org Subject: svn commit: r355862 - stable/12/sys/cam/scsi X-SVN-Group: stable-12 X-SVN-Commit-Author: ken X-SVN-Commit-Paths: stable/12/sys/cam/scsi X-SVN-Commit-Revision: 355862 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 Dec 2019 20:29:47 -0000 Author: ken Date: Tue Dec 17 20:29:47 2019 New Revision: 355862 URL: https://svnweb.freebsd.org/changeset/base/355862 Log: MFC r355299: ------------------------------------------------------------------------ r355299 | ken | 2019-12-02 14:57:39 -0500 (Mon, 02 Dec 2019) | 52 lines Fix a hang introduced in r351599. My changes in 351599 (kindly committed by avg) made the cd(4) media check asynchronous to avoid a sleep while holding a mutex. There was a difficult to reproduce bug with those changes that caused a hang on boot on some single processor machines/VMs. Leandro Lupori managed to reproduce the bug, diagnose it, and supplied a patch! Here is his analysis, from the PR: ====== I was able to reproduce the problem described in comment#14. Actually, I wasn't trying to reproduce it, I just started seeing it a few weeks ago, in CURRENT. I can reproduce it consistently, by using QEMU to run a PowerPC64 VM with a single core/thread (-smp 1). It happens only when there is no media in the emulated CD-ROM, a device that QEMU adds by default, unless -nodefaults is specified in command line. I've debugged it and this is what I've found: 1- After the CD probe is successful, GEOM will try to open the device, which will end up calling cdcheckmedia(), that sets CD state to CD_STATE_MEDIA_PREVENT. 2- Next, scsi_prevent() is executed and succeeds, the CD_FLAG_DISC_LOCKED flag is set and CD state moves to CD_STATE_MEDIA_SIZE. 3- Next, scsi_read_capacity() is executed and fails, state is set to CD_STATE_MEDIA_ALLOW, cdmediaprobedone() is called and wakes up cdcheckmedia(). 4- Then, when cdstart() is invoked to process CD_STATE_MEDIA_ALLOW, it first checks if CD_FLAG_DISC_LOCKED is set, and if so skips directly to CD_STATE_MEDIA_SIZE state. This will repeat the steps of bullet 3, entering an infinite MEDIA_SIZE command loop. When there is a least another core/thread, the GEOM thread that performed the initial cdopen() will get scheduled again, closing the CD device, that will call cdprevent(PR_ALLOW) that clears the CD_FLAG_DISC_LOCKED flag and breaks the loop. So, apparently, the problem is CD_STATE_MEDIA_ALLOW being skipped when CD_FLAG_DISC_LOCKED is set. If I understand correctly, in this case, the state should be advanced to CD_STATE_MEDIA size only when the current state is CD_STATE_MEDIA_PREVENT. ===== ------------------------------------------------------------------------ PR: kern/219857 Submitted by: Leandro Lupori Modified: stable/12/sys/cam/scsi/scsi_cd.c Directory Properties: stable/12/ (props changed) Modified: stable/12/sys/cam/scsi/scsi_cd.c ============================================================================== --- stable/12/sys/cam/scsi/scsi_cd.c Tue Dec 17 19:01:09 2019 (r355861) +++ stable/12/sys/cam/scsi/scsi_cd.c Tue Dec 17 20:29:47 2019 (r355862) @@ -1030,7 +1030,8 @@ cdstart(struct cam_periph *periph, union ccb *start_cc * If the CD is already locked, we don't need to do this. * Move on to the capacity check. */ - if ((softc->flags & CD_FLAG_DISC_LOCKED) != 0) { + if (softc->state == CD_STATE_MEDIA_PREVENT + && (softc->flags & CD_FLAG_DISC_LOCKED) != 0) { softc->state = CD_STATE_MEDIA_SIZE; xpt_release_ccb(start_ccb); xpt_schedule(periph, CAM_PRIORITY_NORMAL);