Date: Fri, 30 Jun 2017 21:22:08 +0200
From: Harry Schmalzbauer <freebsd@omnilan.de>
To: Alexander Motin <mav@freebsd.org>, scsi@freebsd.org
Subject: Re: bhyve ahcich0: Timeout on slot 0 port 0, , regression with stable/11->releng/11.1
Message-ID: <5956A4E0.3030108@omnilan.de>
In-Reply-To: <5956A3DF.8060109@omnilan.de>
References: <5956A3DF.8060109@omnilan.de>
Regarding Harry Schmalzbauer's message of 30.06.2017 21:17 (localtime):
> Hello,
>
> on releng/11.1 I noticed a severe performance degradation during file
> unlinking in a FreeBSD guest. The host was running a quite recent stable/11
> before.
>
> On the host, the VM is started with
> ahci,hd:/dev/adaN
>
> The guest attaches:
> ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
> ada0: <BHYVE SATA DISK 001> ACS-2 ATA SATA 3.x device
>
> The guest has very high Sys-load during unlinking (50-75%@2 cores).
> Also, the host logs these errors:

Sorry, it is obviously the guest that logs these errors, not the host.

> ahcich0: Timeout on slot 0 port 0
> ahcich0: is 00000008 cs 00000000 ss 00000000 rs fffb7fff tfd 50 serr
> 00000000 cmd 0001cd17
> ahcich0: ... waiting for slots fffb7ffe
> ahcich0: Timeout on slot 10 port 0
> ahcich0: is 00000008 cs 00000000 ss 00000000 rs fffb7fff tfd 50 serr
> 00000000 cmd 0001cd17
> ahcich0: ... waiting for slots fffb7bfe
> ahcich0: Timeout on slot 14 port 0
> ahcich0: is 00000008 cs 00000000 ss 00000000 rs fffb7fff tfd 50 serr
> 00000000 cmd 0001cd17
> ahcich0: ... waiting for slots fffb3bfe
> ahcich0: Timeout on slot 17 port 0
> …
> ahcich0: is 00000008 cs 00000000 ss 00000000 rs fffbffff tfd 50 serr
> 00000000 cmd 0001c617
> ahcich0: ... waiting for slots 00018000
> ahcich0: Timeout on slot 15 port 0
> ahcich0: is 00000008 cs 00000000 ss 00000000 rs fffbffff tfd 50 serr
> 00000000 cmd 0001c617
> ahcich0: ... waiting for slots 00010000
> ahcich0: Timeout on slot 16 port 0
> ahcich0: is 00000008 cs 00000000 ss 00000000 rs fffbffff tfd 50 serr
> 00000000 cmd 0001c617
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 00 e8 30 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 40 e8 30 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 80 e8 30 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> …
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 c0 ff 44 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 00 00 45 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 40 00 45 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 80 00 45 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): SEND_FPDMA_QUEUED DATA SET MANAGEMENT. ACB: 64 01
> 00 00 00 40 00 00 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 c0 00 45 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> …
> And so on.
>
> I have always seen a performance penalty using ahci instead of virtio-blk,
> most likely due to TRIM support, but never noticed such a huge difference:
> deleting an obj tree takes <1min with virtio-blk and usually took about 8
> minutes with ahci on stable/11.
> Now (releng/11.1) it takes >20min (not yet finished) and I get a great
> many of these errors.
>
> Can someone (mav?) interpret the command errors and tell whether this could
> be a new problem due to recent MFCs?
> I will bisect stable/11 revisions to see where it starts if nobody has a
> quick idea about the cause.
>
> Thanks,
>
> -harry
> _______________________________________________
> freebsd-scsi@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"
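
As an aid for reading the log lines quoted above, here is a minimal Python sketch. It is not part of the original report and does not diagnose the regression. It turns a "waiting for slots" hex mask into slot numbers and pulls the LBA and sector count out of a WRITE_FPDMA_QUEUED ACB dump. The function names are hypothetical. The ACB byte order (command, features, lba_low, lba_mid, lba_high, device, lba_low_exp, lba_mid_exp, lba_high_exp, features_exp, sector_count, sector_count_exp) is an assumption about how FreeBSD CAM prints the command block. Reading the features bytes as the transfer length is likewise an assumption, based on the usual NCQ convention.

```python
def slots_in_mask(mask_hex):
    """Return the AHCI command slot numbers whose bits are set in a hex mask."""
    mask = int(mask_hex, 16)
    return [slot for slot in range(32) if (mask >> slot) & 1]


def decode_fpdma_acb(acb):
    """Decode a 12-byte ACB dump of an NCQ read/write (assumed byte layout).

    Assumed order: command, features, lba_low, lba_mid, lba_high, device,
    lba_low_exp, lba_mid_exp, lba_high_exp, features_exp, sector_count,
    sector_count_exp. For FPDMA commands the features bytes are taken to
    carry the sector count (a value of 0 would mean 65536; not handled here).
    """
    b = [int(x, 16) for x in acb.split()]
    if len(b) != 12:
        raise ValueError("expected 12 ACB bytes, got %d" % len(b))
    (command, features, lba_low, lba_mid, lba_high, _device,
     lba_low_exp, lba_mid_exp, lba_high_exp, features_exp,
     _sector_count, _sector_count_exp) = b
    # 48-bit LBA: low three bytes first, then the three "exp" bytes.
    lba = (lba_low | lba_mid << 8 | lba_high << 16 |
           lba_low_exp << 24 | lba_mid_exp << 32 | lba_high_exp << 40)
    sectors = features | features_exp << 8
    return {"command": command, "lba": lba, "sectors": sectors}


if __name__ == "__main__":
    # Masks taken from the "waiting for slots" lines in the log.
    for mask in ("fffb7ffe", "fffb7bfe", "00018000", "00010000"):
        print(mask, slots_in_mask(mask))
    # First two WRITE_FPDMA_QUEUED command blocks from the log.
    for acb in ("61 40 00 e8 30 40 04 00 00 00 00 00",
                "61 40 40 e8 30 40 04 00 00 00 00 00"):
        print(decode_fpdma_acb(acb))
```

Under these assumptions, the consecutive WRITE_FPDMA_QUEUED entries decode to 64-sector writes at LBAs that advance by 64, and the masks show slot 10 dropping out between the first two "waiting for slots" lines, matching the "Timeout on slot 10" message between them.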