From owner-freebsd-scsi@freebsd.org Tue Mar 13 22:46:05 2018 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9BB6AF573FA; Tue, 13 Mar 2018 22:46:05 +0000 (UTC) (envelope-from dexuan.bsd@gmail.com) Received: from mail-lf0-x231.google.com (mail-lf0-x231.google.com [IPv6:2a00:1450:4010:c07::231]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id DEC177A4DB; Tue, 13 Mar 2018 22:46:04 +0000 (UTC) (envelope-from dexuan.bsd@gmail.com) Received: by mail-lf0-x231.google.com with SMTP id l191-v6so1897794lfe.1; Tue, 13 Mar 2018 15:46:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=FQFMc5Ss1ANsk9ngvYYLiCreKBElog3p0Yf4XycdV7I=; b=DY7uojdWxVV4Y+dnCt+T4Jqtrgl6FsnaBx3ipULu6dX0Yt//W0kYon43ngkJJ2aIqP lEEYN5pdWeWwtQOtqaOElybP3Uyp2I4wkdA5ZvvfMS1A2gzSPGVGk+pmWPMi/TDMVbUM 4tfEILF0YAKihK64LKruxqc9XWuopUjpsDoTGvNofTixi0aCyn1u+JHr18cHeGuSlArx qOuQXT8akms0dL2F6gksMsChnjsdxn36TBZo2uwxRNYs+lqGdzJFkXtRIFsMCY93UnDq KHA+PP3+Qtp27uBwoguNmhSuP8/KMJtVVs2U4u8ENDxUKHuc6W7K1V7K0WKduD6qL7OU UvKA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=FQFMc5Ss1ANsk9ngvYYLiCreKBElog3p0Yf4XycdV7I=; b=PVhQHt+gxSHRnUl1PNtmCMxa+ytYkRrcTPh/NvpHyAfuTgOt2lTXDU3xP0ZA55FqWQ FxJM1M+2o+N34SPY8+hSXTXhZlEGrD0cOftT3XIw26CO05J9G4MYxQyHQSkFHcuhlflu MIUYaqhvDKD/1DnpNaHK6eDVgwD3VOWUp1beZRIvO3o7ppjtnaoxV9oc6rvQuOwT6HI3 5MLKfzSWAmuiwViJg7TASsUdq9RgYpc+Irq4ZDpEFFjBAl6hglLPs3PTUtPFaVwW8aGr GyvjRwZjzLo9QMshUBPUSmDtiXy964rk88f5RlgVbBDpuPBUBxjS9lRZreW/TwZ4ypZf VKFA== X-Gm-Message-State: AElRT7GgqmT2zxuKMzNP80c8pkX0bzcod73vcqgGN175d/HKAxaTPbe2 uyqbl8LS3l0xqXTASYe5KaVu5+jvBQ/PFdf6LSUtMQ== X-Google-Smtp-Source: AG47ELtgvsAvspnG/BezU+DuJLqv+by3wQbjDW+ak4iHs1u/48kv+8rxfbtk1iDPFWkCc+tjFbCYAKbqQIhebAoFzSA= X-Received: by 2002:a19:e9cc:: with SMTP id j73-v6mr1509483lfk.99.1520981163279; Tue, 13 Mar 2018 15:46:03 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a19:d8e8:0:0:0:0:0 with HTTP; Tue, 13 Mar 2018 15:46:02 -0700 (PDT) In-Reply-To: References: From: Dexuan-BSD Cui Date: Tue, 13 Mar 2018 15:46:02 -0700 Message-ID: Subject: Re: A strange SCSI-disk-detached issue for FreeBSD VM on Hyper-V/Azure To: freebsd-scsi@freebsd.org, freebsd-virtualization@freebsd.org, Scott Long , scottl@freebsd.org, Warner Losh , jhibbits@freebsd.org, avg@freebsd.org, asomers@freebsd.org, cem@freebsd.org, mav@freebsd.org Cc: Dexuan Cui Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Mailman-Approved-At: Wed, 14 Mar 2018 00:25:47 +0000 X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Mar 2018 22:46:05 -0000 I reported a bug for this issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D226583 I'm working on a fix. -- Dexuan On Mon, Jan 29, 2018 at 7:44 PM, Dexuan-BSD Cui wrot= e: > With the the recent Azure host upgrade, some FreeBSD VMs can't > properly detect HDD-backed disks. > The issue happens to the latest 12-CURRENT too. > > da0 (0:0:0:0=EF=BC=89is also backed by HDD, but somehow is not affected. > da1 (1:0:1:0) it not affected either, only the extra disks, e,g. da2 > (3:0:0:0=EF=BC=89, are affected. > > These are the logs of my VM, which has da0, da1 and da2. These 3 disks > are on 3 different disk controllers. > > > [root@decui-d4-repro ~]# dmesg -c > /dev/null > [root@decui-d4-repro ~]# camcontrol rescan all > Re-scan of bus 0 was successful > Re-scan of bus 1 was successful > Re-scan of bus 2 was successful > Re-scan of bus 3 was successful > Re-scan of bus 4 was successful > Re-scan of bus 5 was successful > > [root@decui-d4-repro ~]# dmesg > (aprobe0:ata0:0:0:0): Disable ATA disks on simulated ATA controller (0x71= 118086) > (aprobe0:ata0:0:1:0): Disable ATA disks on simulated ATA controller (0x71= 118086) > (probe1:(probe0:blkvsc0:0:1:0): Down reving Protocol Version from 4 to 2? > blkvsc0:0:0:0): storvsc inquiry (36) [0 0 4 2 1f ... ] > (probe1:blkvsc0:0:1:0): storvsc inquiry (36) [7f 0 5 2 1f ... ] > (probe0:blkvsc0:0:0:0): storvsc scsi_status =3D 2, srb_status =3D 6 > (probe0:(probe1:blkvsc0:0:0:0): storvsc inquiry (5) [0 0 0 5 0 ... ] > (probe0:blkvsc0:0:1:1): Down reving Protocol Version from 4 to 2? > blkvsc0:0:0:0): storvsc inquiry (6) [0 83 0 1c 1 ... ] > (probe1:blkvsc0:0:1:1): invalid LUN 1 for op: INQUIRY > (probe0:blkvsc0:0:0:1): invalid LUN 1 for op: INQUIRY > (probe0:blkvsc1:0:0:0): Down reving Protocol Version from 4 to 2? > (probe0:blkvsc1:0:0:0): storvsc inquiry (36) [7f 0 5 2 1f ... ] > (probe1:(probe0:blkvsc1:0:1:0): storvsc inquiry (36) [0 0 5 2 1f ... ] > (probe1:blkvsc1:0:0:1): Down reving Protocol Version from 4 to 2? > blkvsc1:0:1:0): storvsc scsi_status =3D 2, srb_status =3D 6 > (probe1:blkvsc1:0:1:0): storvsc inquiry (5) [0 0 0 8 0 ... ] > (probe0:blkvsc1:0:0:1): invalid LUN 1 for op: INQUIRY > (probe1:blkvsc1:0:1:0): storvsc inquiry (6) [0 83 0 30 1 ... ] > (probe0:storvsc2:0:0:0): Down reving Protocol Version from 4 to 2? > (probe1:(probe0:storvsc2:0:1:0): Down reving Protocol Version from 4 to 2= ? > storvsc2:0:0:0): storvsc inquiry (36) [7f 0 5 2 1f ... ] > (probe1:(probe0:storvsc2:0:1:0): storvsc inquiry (36) [7f 0 5 2 1f ... ] > (probe1:storvsc2:0:0:1): Down reving Protocol Version from 4 to 2? > (probe0:storvsc2:0:1:1): Down reving Protocol Version from 4 to 2? > (probe1:storvsc2:0:0:1): invalid LUN 1 for op: INQUIRY > storvsc2:0:1:1): invalid LUN 1 for op: INQUIRY > (probe0:storvsc3:0:0:0): Down reving Protocol Version from 4 to 2? > (probe1:(probe0:storvsc3:0:1:0): Down reving Protocol Version from 4 to 2= ? > storvsc3:0:0:0): storvsc inquiry (36) [0 0 4 2 1f ... ] > (probe1:(probe0:storvsc3:0:1:0): storvsc inquiry (36) [7f 0 5 2 1f ... ] > storvsc3:0:0:0): storvsc scsi_status =3D 2, srb_status =3D 6 > (probe0:(probe1:storvsc3:0:0:0): storvsc inquiry (5) [0 0 0 5 0 ... ] > (probe0:storvsc3:0:1:1): Down reving Protocol Version from 4 to 2? > storvsc3:0:0:0): storvsc inquiry (6) [0 83 0 1c 1 ... ] > (probe1:storvsc3:0:1:1): invalid LUN 1 for op: INQUIRY > GEOM: new disk da2 > (probe0:pass3 at storvsc3 bus 0 scbus5 target 0 lun 0 > storvsc3:0:0:1): invalid LUN 1 for op: INQUIRY > pass3: Fixed Direct Access SPC-2 SCSI device > pass3: 300.000MB/s transfers > pass3: Command Queueing enabled > pass3 at storvsc3 bus 0 scbus5 target 0 lun 0 > pass3: detached > (da2:storvsc3:0:0:0): fatal error, could not acquire reference count > (pass3:(da2:storvsc3:0:storvsc3:0:0:0): storvsc inquiry (5) [0 b1 0 3c 0 = ... ] > 0:0): Periph destroyed > da2: Delete methods: > Opened disk da2 -> 6 > g_access(918): provider da2 has error > g_access(918): provider da2 has error > g_access(918): provider da2 has error > g_access(918): provider da2 has error > > > According to the log, first it looks da2 is indeed detected, but then > it's detached immediately: > > pass3: detached > > The above line comes from xpt_denounce_periph_sbuf(), which is called by: > > cam_periph_invalidate() at cam_periph_invalidate+0xe7/frame 0xfffffe00004= eb7f0 > xpt_async_process_dev() at xpt_async_process_dev+0x162/frame 0xfffffe0000= 4eb840 > xptdevicetraverse() at xptdevicetraverse+0xa3/frame 0xfffffe00004eb890 > xpttargettraverse() at xpttargettraverse+0x6a/frame 0xfffffe00004eb8d0 > xpt_async_process() at xpt_async_process+0x114/frame 0xfffffe00004eb9e0 > xpt_done_process() at xpt_done_process+0x38a/frame 0xfffffe00004eba20 > xpt_done_td() at xpt_done_td+0x136/frame 0xfffffe00004eba70 > fork_exit() at fork_exit+0x84/frame 0xfffffe00004ebab0 > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00004ebab > > > I'm trying to figure out why CAM detaches da2 -- it looks something is wr= ong? > > Currently, with the below hack, da2 can be recognized and > reading/writing da2 works fine: > > --- a/sys/cam/cam_periph.c > +++ b/sys/cam/cam_periph.c > @@ -1942,8 +1942,10 @@ cam_periph_error(union ccb *ccb, cam_flags camflag= s, > * Let peripheral drivers know that this > * device has gone away. > */ > +#if 0 > xpt_async(AC_LOST_DEVICE, newpath, NULL); > xpt_free_path(newpath); > +#endif > } > } > > And I'm further debugging. I appreciate any suggestion. > > Thanks! > > -- Dexuan