[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <028ed448-a948-79d9-f224-c325029b17ab@redhat.com>
Date: Tue, 21 Jan 2020 16:35:27 +0800
From: Jason Wang <jasowang@...hat.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: Jason Gunthorpe <jgg@...lanox.com>,
Shahaf Shuler <shahafs@...lanox.com>,
Rob Miller <rob.miller@...adcom.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"virtualization@...ts.linux-foundation.org"
<virtualization@...ts.linux-foundation.org>,
Netdev <netdev@...r.kernel.org>,
"Bie, Tiwei" <tiwei.bie@...el.com>,
"maxime.coquelin@...hat.com" <maxime.coquelin@...hat.com>,
"Liang, Cunming" <cunming.liang@...el.com>,
"Wang, Zhihong" <zhihong.wang@...el.com>,
"Wang, Xiao W" <xiao.w.wang@...el.com>,
"haotian.wang@...ive.com" <haotian.wang@...ive.com>,
"Zhu, Lingshan" <lingshan.zhu@...el.com>,
"eperezma@...hat.com" <eperezma@...hat.com>,
"lulu@...hat.com" <lulu@...hat.com>,
Parav Pandit <parav@...lanox.com>,
"Tian, Kevin" <kevin.tian@...el.com>,
"stefanha@...hat.com" <stefanha@...hat.com>,
"rdunlap@...radead.org" <rdunlap@...radead.org>,
"hch@...radead.org" <hch@...radead.org>,
Ariel Adam <aadam@...hat.com>, Jiri Pirko <jiri@...lanox.com>,
"hanand@...inx.com" <hanand@...inx.com>,
"mhabets@...arflare.com" <mhabets@...arflare.com>
Subject: Re: [PATCH 3/5] vDPA: introduce vDPA bus
On 2020/1/21 下午4:15, Michael S. Tsirkin wrote:
> On Tue, Jan 21, 2020 at 04:00:38PM +0800, Jason Wang wrote:
>> On 2020/1/21 下午1:47, Michael S. Tsirkin wrote:
>>> On Tue, Jan 21, 2020 at 12:00:57PM +0800, Jason Wang wrote:
>>>> On 2020/1/21 上午1:49, Jason Gunthorpe wrote:
>>>>> On Mon, Jan 20, 2020 at 04:43:53PM +0800, Jason Wang wrote:
>>>>>> This is similar to the design of platform IOMMU part of vhost-vdpa. We
>>>>>> decide to send diffs to platform IOMMU there. If it's ok to do that in
>>>>>> driver, we can replace set_map with incremental API like map()/unmap().
>>>>>>
>>>>>> Then driver need to maintain rbtree itself.
>>>>> I think we really need to see two modes, one where there is a fixed
>>>>> translation without dynamic vIOMMU driven changes and one that
>>>>> supports vIOMMU.
>>>> I think in this case, you meant the method proposed by Shahaf that sends
>>>> diffs of "fixed translation" to device?
>>>>
>>>> It would be kind of tricky to deal with the following case for example:
>>>>
>>>> old map [4G, 16G) new map [4G, 8G)
>>>>
>>>> If we do
>>>>
>>>> 1) flush [4G, 16G)
>>>> 2) add [4G, 8G)
>>>>
>>>> There could be a window between 1) and 2).
>>>>
>>>> It requires the IOMMU that can do
>>>>
>>>> 1) remove [8G, 16G)
>>>> 2) flush [8G, 16G)
>>>> 3) change [4G, 8G)
>>>>
>>>> ....
>>> Basically what I had in mind is something like qemu memory api
>>>
>>> 0. begin
>>> 1. remove [8G, 16G)
>>> 2. add [4G, 8G)
>>> 3. commit
>>
>> This sounds more flexible e.g driver may choose to implement static mapping
>> one through commit. But a question here, it looks to me this still requires
>> the DMA to be synced with at least commit here. Otherwise device may get DMA
>> fault? Or device is expected to be paused DMA during begin?
>>
>> Thanks
> For example, commit might switch one set of tables for another,
> without need to pause DMA.
Yes, I think that works but need confirmation from Shahaf or Jason.
Thanks
>
>>> Anyway, I'm fine with a one-shot API for now, we can
>>> improve it later.
>>>
>>>>> There are different optimization goals in the drivers for these two
>>>>> configurations.
>>>>>
>>>>>>> If the first one, then I think memory hotplug is a heavy flow
>>>>>>> regardless. Do you think the extra cycles for the tree traverse
>>>>>>> will be visible in any way?
>>>>>> I think if the driver can pause the DMA during the time for setting up new
>>>>>> mapping, it should be fine.
>>>>> This is very tricky for any driver if the mapping change hits the
>>>>> virtio rings. :(
>>>>>
>>>>> Even a IOMMU using driver is going to have problems with that..
>>>>>
>>>>> Jason
>>>> Or I wonder whether ATS/PRI can help here. E.g during I/O page fault,
>>>> driver/device can wait for the new mapping to be set and then replay the
>>>> DMA.
>>>>
>>>> Thanks
>>>>
Powered by blists - more mailing lists