Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 30 Jun 2017 21:22:08 +0200
From:      Harry Schmalzbauer <freebsd@omnilan.de>
To:        Alexander Motin <mav@freebsd.org>, scsi@freebsd.org
Subject:   Re: bhyve ahcich0: Timeout on slot 0 port 0, , regression with stable/11->releng/11.1
Message-ID:  <5956A4E0.3030108@omnilan.de>
In-Reply-To: <5956A3DF.8060109@omnilan.de>
References:  <5956A3DF.8060109@omnilan.de>

next in thread | previous in thread | raw e-mail | index | archive | help
 Bezüglich Harry Schmalzbauer's Nachricht vom 30.06.2017 21:17 (localtime):
>  Hello,
>
> on releng/11.1 I noticed a severe performace degradation during file
> unlinking in a FreeBSD guest. Host was running quiet recent stable/11
> before.
>
> On the host, the vm is started with
> ahci,hd:/dev/adaN
>
> The guest attaches:
> ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
> ada0: <BHYVE SATA DISK 001> ACS-2 ATA SATA 3.x device
>
> The geust has very high Sys-load during unlinking (50-75%@2 cores).
> Also, the host logs these errors:

Sorry, not the host logs these errors, but the guest, obviously.


> ahcich0: Timeout on slot 0 port 0
> ahcich0: is 00000008 cs 00000000 ss 00000000 rs fffb7fff tfd 50 serr
> 00000000 cmd 0001cd17
> ahcich0: ... waiting for slots fffb7ffe
> ahcich0: Timeout on slot 10 port 0
> ahcich0: is 00000008 cs 00000000 ss 00000000 rs fffb7fff tfd 50 serr
> 00000000 cmd 0001cd17
> ahcich0: ... waiting for slots fffb7bfe
> ahcich0: Timeout on slot 14 port 0
> ahcich0: is 00000008 cs 00000000 ss 00000000 rs fffb7fff tfd 50 serr
> 00000000 cmd 0001cd17
> ahcich0: ... waiting for slots fffb3bfe
> ahcich0: Timeout on slot 17 port 0
> …
> ahcich0: is 00000008 cs 00000000 ss 00000000 rs fffbffff tfd 50 serr
> 00000000 cmd 0001c617
> ahcich0: ... waiting for slots 00018000
> ahcich0: Timeout on slot 15 port 0
> ahcich0: is 00000008 cs 00000000 ss 00000000 rs fffbffff tfd 50 serr
> 00000000 cmd 0001c617
> ahcich0: ... waiting for slots 00010000
> ahcich0: Timeout on slot 16 port 0
> ahcich0: is 00000008 cs 00000000 ss 00000000 rs fffbffff tfd 50 serr
> 00000000 cmd 0001c617
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 00 e8 30 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 40 e8 30 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 80 e8 30 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> …
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 c0 ff 44 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 00 00 45 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 40 00 45 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 80 00 45 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): SEND_FPDMA_QUEUED DATA SET MANAGEMENT. ACB: 64 01
> 00 00 00 40 00 00 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 c0 00 45 40 04 00
> 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> …
> And so on.
>
> I always saw performance penalty using ahci instead of virtio-blk, most
> likely due to TRIM support, but never noticed such a huge difference:
> obj-tree deleting takes <1min with virtio-blk and usually took about 8
> minutes with ahci on stable/11.
> Now (releng/11.1) it takes >20min (not yet finished) and I get really
> lots of these errors.
>
> Can someone (mav?) interpret the command errors and tell if it could be
> a new problem due to recent MFCs?
> Will bisect stable/11 revisions to see where it starts if nobody has a
> quick idea about the cause.
>
> Thanks,
>
> -harry
> _______________________________________________
> freebsd-scsi@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5956A4E0.3030108>