Message-ID: <20181029122922.7b2a9b0c@t450s.home>
Date:   Mon, 29 Oct 2018 12:29:22 -0600
From:   Alex Williamson <alex.williamson@...hat.com>
To:     Jason Wang <jasowang@...hat.com>
Cc:     Simon Guo <wei.guo.simon@...ux.alibaba.com>,
        Eric Auger <eric.auger@...hat.com>,
        qixuan.wu@...ux.alibaba.com, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org, Peter Xu <peterx@...hat.com>
Subject: Re: Can VFIO pin only a specific region of guest mem when use pass
 through devices?

On Mon, 29 Oct 2018 17:14:46 +0800
Jason Wang <jasowang@...hat.com> wrote:

> On 2018/10/29 10:42 AM, Simon Guo wrote:
> > Hi,
> >
> > I am using network device passthrough with qemu on x86 (-device vfio-pci,host=0000:xx:yy.z)
> > and “intel_iommu=on” on the host kernel command line. The whole guest memory appears to be
> > pinned (vfio_pin_pages()), judging by the RES memory column in “top”. I understand this is
> > because the device can DMA to any guest memory address, so that memory cannot be swapped.
> >
> > However, can we pin just the range of address space allowed by the iommu group of that device,
> > instead of pinning the whole address space? I do notice some code like vtd_host_dma_iommu().
> > Maybe there is already some way to enable that?
> >
> > Sorry if I missed some basics. I searched but have had no luck finding the answer yet. Please
> > let me know if this has already been discussed.
> >
> > Any other suggestions would also be appreciated. For example, could we modify the guest network
> > card driver to allocate only from a specific memory region (zone), and have qemu advise the
> > guest kernel so that only that memory region (zone) is pinned?
> >
> > Thanks,
> > - Simon  
> 
> 
> One possible method is to enable an IOMMU in the VM.

Right, making use of a virtual IOMMU in the VM is really the only way
to bound the DMA to some subset of guest memory.  However, vIOMMU usage
by the guest is optional on x86, and even if the guest does use it, it
might enable passthrough mode.  That puts you back at the problem that
all guest memory is pinned, with the additional problem that it might
also be accounted for once per assigned device and may hit locked
memory limits.  Also, the DMA mapping and unmapping path with a vIOMMU
is very slow, so performance of the device in the guest will be abysmal
unless the use case is limited to very static mappings, such as
userspace use within the guest for nested assignment or perhaps DPDK
use cases.
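
To make the accounting concern above concrete, here is a rough
back-of-the-envelope sketch in Python.  It is a toy model, not VFIO
code: the function names, the mode names, and the accounting rules are
all assumptions made for illustration (in particular, the "once per
device" rule is the worst case described above, not a statement of what
any given kernel or QEMU version does).

```python
# Toy accounting model. Nothing here calls VFIO; all names are hypothetical.
GIB = 1 << 30
MIB = 1 << 20


def pinned_bytes(guest_ram, devices, mode, mapped_per_device=0):
    """Estimate host memory pinned (and accounted as locked) for a guest.

    mode is one of:
      "no-viommu"          - assumption: all assigned devices share one set
                             of mappings that covers all guest RAM
      "viommu-passthrough" - assumption (worst case): all guest RAM is
                             accounted once per assigned device
      "viommu-translated"  - only the guest's active DMA mappings are
                             pinned, mapped_per_device bytes per device
    """
    if devices == 0:
        return 0
    if mode == "no-viommu":
        return guest_ram
    if mode == "viommu-passthrough":
        return guest_ram * devices
    if mode == "viommu-translated":
        # A device cannot have more than all of guest RAM mapped.
        return min(mapped_per_device, guest_ram) * devices
    raise ValueError(f"unknown mode: {mode}")


def exceeds_limit(required, memlock_limit):
    """True if the estimate would not fit under a locked memory limit."""
    return required > memlock_limit


# Example: 8 GiB guest, 2 assigned devices, hypothetical 9 GiB limit.
for mode in ("no-viommu", "viommu-passthrough", "viommu-translated"):
    need = pinned_bytes(8 * GIB, 2, mode, mapped_per_device=64 * MIB)
    print(mode, need // MIB, "MiB", "over limit" if exceeds_limit(need, 9 * GIB) else "ok")
```

The point of the model is only that a limit sized for "guest RAM plus
some slack" can be exceeded as soon as the same RAM is counted once per
device, while the translated case stays small only as long as the
guest's mappings do.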

Modifying the guest to only use a portion of memory for DMA sounds like
quite an intrusive option.  There are certainly IOMMU models where the
IOMMU provides a fixed IOVA range, but creating dynamic mappings within
that range doesn't really solve anything, given that it simply returns
us to a vIOMMU with slow mapping.  A window with a fixed identity
mapping used as a DMA zone seems plausible, but again, it is also pretty
intrusive to the guest, and possibly to the drivers as well.  Host IOMMU
page faulting can also help reduce the pinned memory footprint, but of
course requires hardware support and lots of new code paths, many of
which are already being discussed for things like Scalable IOV and SVA.
Thanks,

Alex
