Date: Mon, 27 May 2013 13:58:44 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: arch@freebsd.org, amd64@freebsd.org
Subject: x86 IOMMU support (DMAR)
Message-ID: <20130527105844.GC3047@kib.kiev.ua>
For the past several months I have worked (and continue to work) on a driver for Intel VT-d for FreeBSD. VT-d is marketed as an I/O virtualization technology, but in essence it is a DMA address remapping engine, i.e. an advanced and improved I/O MMU of the kind also found on other big-iron machines, e.g. PowerPC or SPARC. See the Intel document titled 'Intel Virtualization Technology for Directed I/O Architecture Specification' and the chipset datasheets for a description of the facility.

The development was greatly facilitated by Jim Harris from Intel, who provided me with access to the Sandy Bridge and Ivy Bridge north bridge documentation. John Baldwin patiently educated me about newbus and helped develop the hooks required for integration with the existing code.

The core hardware element of VT-d is the DMA remap unit, referred to as DMAR both in the documentation and in the source code. Besides DMA remapping, VT-d also allows remapping of MSI/MSI-X interrupt messages. FreeBSD could use that functionality for interrupt rebalancing, instead of reprogramming the MSI registers of the PCI devices, but this part is not (yet) implemented.

In the FreeBSD architecture, DMAR fits naturally as a busdma engine, making it possible to eliminate bounce-page copying. Another great benefit of using DMAR is the improvement in reliability and security, since DMA transfers are only allowed to the memory areas explicitly designated by the device driver as buffers. As noted by Jim Harris, this security angle could find a use in the NTB driver.

The existing busdma code for x86 was split into a generic interface, kept in busdma_machdep.c, and the bouncing implementation in busdma_bounce.c. The DMAR-based implementation, which calls into the DMAR core, is located in busdma_dmar.c.
There is no KPI provided yet to manage DMARs, but I plan to implement a proper interface after discussing the needs of bhyve.

I tried to support both i386 and amd64, but on i386 the limited KVA, together with the busdma interface's requirement of never sleeping in driver calls, makes some of the IOMMU's promises less strict. For instance, to unload a map, the code needs to transiently map the DMAR page table pages, which requires sleepable allocation of sf buffers. As a result, map unload on i386 is done asynchronously in taskqueue context, which makes it possible for a buggy device driver or piece of hardware to perform transfers to freed pages for some time after the unload. This problem is not present on amd64. For the same busdma KPI reason, I cannot use queued invalidation on either i386 or amd64.

At the moment the code makes the relation between device contexts and domains 1:1, which is fine for busdma. To support PCI pass-through into virtual machines, the relation should be changed to N:1 contexts to domains; this is planned but not yet done.

The overall state of the code is that I can boot multiuser over the network from if_igb(4) or if_bce(4), and can use ahci(4)- and ata(4)-attached disks without corrupting UFS volumes. uhci(4) has known issues due to the RMRR mappings being established too late. Extensive testing of the already-written code has not been done yet.

Plans include:
- providing an external KPI for VMM consumers
- supporting ATS
- making it possible to select busdma_dmar or busdma_bounce for individual PCI functions
- stabilization work

Also, by converting the ISA DMA implementation to use the busdma KPI, it should be possible to make floppies work reliably again!

It is known that an IOMMU adds overhead due to the mapping and unmapping performed for each I/O. DMAR implementations usually have some errata, and PCIe devices sometimes do not completely follow the specification, causing misbehaviour with remapping enabled.
For this reason I do not plan to enable the IOMMU by default, and intend to provide a way to route individual PCI devices to the bouncing busdma implementation.

http://people.freebsd.org/~kib/misc/DMAR.1.patch