Message-ID: <20201104093337.ge3qtlfhkjjkx4ax@steredhat>
Date: Wed, 4 Nov 2020 10:33:37 +0100
From: Stefano Garzarella <sgarzare@...hat.com>
To: Peter Xu <peterx@...hat.com>
Cc: Jason Wang <jasowang@...hat.com>, mst@...hat.com,
netdev@...r.kernel.org, Stefan Hajnoczi <stefanha@...hat.com>,
kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] vhost/vsock: add IOTLB API support
On Tue, Nov 03, 2020 at 02:46:13PM -0500, Peter Xu wrote:
>On Tue, Nov 03, 2020 at 05:04:23PM +0800, Jason Wang wrote:
>>
>> On 2020/11/3 1:11 AM, Stefano Garzarella wrote:
>> > On Fri, Oct 30, 2020 at 07:44:43PM +0800, Jason Wang wrote:
>> > >
>> > > On 2020/10/30 6:54 PM, Stefano Garzarella wrote:
>> > > > On Fri, Oct 30, 2020 at 06:02:18PM +0800, Jason Wang wrote:
>> > > > >
>> > > > > On 2020/10/30 1:43 AM, Stefano Garzarella wrote:
>> > > > > > This patch enables the IOTLB API support for vhost-vsock devices,
>> > > > > > allowing the userspace to emulate an IOMMU for the guest.
>> > > > > >
>> > > > > > These changes follow vhost-net; in detail, this patch:
>> > > > > > - exposes the VIRTIO_F_ACCESS_PLATFORM feature and initializes
>> > > > > >   the IOTLB device if the feature is acked
>> > > > > > - implements the VHOST_GET_BACKEND_FEATURES and
>> > > > > >   VHOST_SET_BACKEND_FEATURES ioctls
>> > > > > > - calls vq_meta_prefetch() before vq processing to prefetch the
>> > > > > >   vq metadata addresses into the IOTLB
>> > > > > > - provides .read_iter, .write_iter, and .poll callbacks for the
>> > > > > >   chardev; they are used by the userspace to exchange IOTLB
>> > > > > >   messages (see the sketch after the quoted commit message)
>> > > > > >
>> > > > > > This patch was tested with QEMU, with a patch [1] applied to fix a
>> > > > > > simple issue:
>> > > > > > $ qemu -M q35,accel=kvm,kernel-irqchip=split \
>> > > > > > -drive file=fedora.qcow2,format=qcow2,if=virtio \
>> > > > > > -device intel-iommu,intremap=on \
>> > > > > > -device vhost-vsock-pci,guest-cid=3,iommu_platform=on
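The chardev callbacks listed in the commit message are, following the
vhost-net pattern it cites, thin wrappers around the generic vhost
helpers. A minimal sketch modeled on vhost-net's equivalents, not taken
from the patch itself, so names and details may differ:

    /* Modeled on vhost-net's chardev callbacks; the actual vhost-vsock
     * patch may differ in detail. */
    static ssize_t vhost_vsock_chr_read_iter(struct kiocb *iocb,
                                             struct iov_iter *to)
    {
            struct file *file = iocb->ki_filp;
            struct vhost_vsock *vsock = file->private_data;
            int noblock = file->f_flags & O_NONBLOCK;

            /* Hand pending IOTLB miss messages to userspace. */
            return vhost_chr_read_iter(&vsock->dev, to, noblock);
    }

    static ssize_t vhost_vsock_chr_write_iter(struct kiocb *iocb,
                                              struct iov_iter *from)
    {
            struct file *file = iocb->ki_filp;
            struct vhost_vsock *vsock = file->private_data;

            /* Accept IOTLB update/invalidate messages from userspace. */
            return vhost_chr_write_iter(&vsock->dev, from);
    }

    static __poll_t vhost_vsock_chr_poll(struct file *file, poll_table *wait)
    {
            struct vhost_vsock *vsock = file->private_data;

            /* Wake userspace when a miss message is queued. */
            return vhost_chr_poll(file, &vsock->dev, wait);
    }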
>> > > > >
>> > > > >
>> > > > > Patch looks good, but a question:
>> > > > >
>> > > > > It looks to me that you don't enable ATS, which means vhost
>> > > > > won't get any invalidation requests. Or did I miss anything?
>> > > > >
>> > > >
>> > > > You're right, I didn't see invalidation requests, only misses and
>> > > > updates.
>> > > > Now I have tried enabling 'ats' and 'device-iotlb', but I still
>> > > > don't see any invalidations.
>> > > >
>> > > > How can I test it? (Sorry, but I don't have much experience with
>> > > > vIOMMU yet.)
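For context on the message types: misses, updates, and invalidations all
travel over the chardev as the uapi struct vhost_iotlb_msg, wrapped in
struct vhost_msg_v2 once VHOST_BACKEND_F_IOTLB_MSG_V2 is acked through
VHOST_SET_BACKEND_FEATURES. A minimal userspace sketch of answering a
miss, roughly what QEMU's vhost layer does; error handling and the real
IOVA-to-host-address lookup are elided, and host_uaddr/size are
placeholder values:

    #include <linux/vhost.h>   /* struct vhost_msg_v2, VHOST_IOTLB_* */
    #include <unistd.h>

    /* host_uaddr: hypothetical result of the vIOMMU translation lookup
     * for the missed IOVA. */
    static void service_iotlb_miss(int vhost_fd, __u64 host_uaddr)
    {
            struct vhost_msg_v2 msg;

            if (read(vhost_fd, &msg, sizeof(msg)) != sizeof(msg))
                    return;
            if (msg.type != VHOST_IOTLB_MSG_V2 ||
                msg.iotlb.type != VHOST_IOTLB_MISS)
                    return;

            /* Reply with the translation for msg.iotlb.iova. */
            msg.iotlb.type = VHOST_IOTLB_UPDATE;
            msg.iotlb.uaddr = host_uaddr;
            msg.iotlb.size = 4096;            /* placeholder mapping size */
            msg.iotlb.perm = VHOST_ACCESS_RW;
            write(vhost_fd, &msg, sizeof(msg));
    }

VHOST_IOTLB_INVALIDATE messages flow the same way when the guest unmaps,
which is what the 'ats'/'device-iotlb' options are meant to make visible
here.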
>> > >
>> > >
>> > > I guess it's because of the batched unmap. Maybe you can try using
>> > > "intel_iommu=strict" on the guest kernel command line to see if it
>> > > works.
>> > >
>> > > Btw, make sure the QEMU contains the patch [1]. Otherwise ATS won't
>> > > be enabled for recent Linux kernels in the guest.
>> >
>> > The problem was my kernel; it was built with a tiny configuration.
>> > Using the Fedora stock kernel I can see the 'invalidate' requests, but
>> > I also hit the following issues.
>> >
>> > Do they ring any bells for you?
>> >
>> > $ ./qemu -m 4G -smp 4 -M q35,accel=kvm,kernel-irqchip=split \
>> > -drive file=fedora.qcow2,format=qcow2,if=virtio \
>> > -device intel-iommu,intremap=on,device-iotlb=on \
>> > -device vhost-vsock-pci,guest-cid=6,iommu_platform=on,ats=on,id=v1
>> >
>> > qemu-system-x86_64: vtd_iova_to_slpte: detected IOVA overflow
>> > (iova=0x1d40000030c0)
>>
>>
>> It's a hint that the IOVA exceeds the AW (address width). It might be
>> worth checking whether the missed IOVA reported from the IOTLB is legal.
>
>Yeah. By default the QEMU vIOMMU should only support a 39-bit width for
>the guest IOVA address space. To extend it, we can use:
>
> -device intel-iommu,aw-bits=48
>
>So we'll enable the 4-level IOMMU page table.
>
>Here the IOVA is obviously wider than that, so it'll be interesting to know
>why the guest driver allocated that IOVA, since the driver should somehow
>know that it is beyond what's supported (the guest IOMMU driver should be
>able to probe the vIOMMU capability for this width information too).
>
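As a quick arithmetic check of why that IOVA trips the default width but
would fit under aw-bits=48 (standalone userspace C, nothing vhost-specific):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
            uint64_t iova = 0x1d40000030c0ULL; /* IOVA from the QEMU error above */

            /* Limits implied by the vIOMMU address width (aw-bits). */
            printf("fits 39-bit space: %s\n", iova < (1ULL << 39) ? "yes" : "no");
            printf("fits 48-bit space: %s\n", iova < (1ULL << 48) ? "yes" : "no");
            return 0;
    }

This prints "no" then "yes": 0x1d40000030c0 is a ~45-bit address, beyond
the 39-bit default but within reach of the 4-level (48-bit) page table.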
Peter, Jason, thanks for the hints!
I'll try to understand what is going on in the guest driver.
Stefano