Message-ID: <AM0PR0502MB37956A8D6690B190EEA713A5C30D0@AM0PR0502MB3795.eurprd05.prod.outlook.com>
Date: Tue, 21 Jan 2020 11:09:38 +0000
From: Shahaf Shuler <shahafs@...lanox.com>
To: Jason Wang <jasowang@...hat.com>,
"Michael S. Tsirkin" <mst@...hat.com>
CC: Jason Gunthorpe <jgg@...lanox.com>,
Rob Miller <rob.miller@...adcom.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"virtualization@...ts.linux-foundation.org"
<virtualization@...ts.linux-foundation.org>,
Netdev <netdev@...r.kernel.org>,
"Bie, Tiwei" <tiwei.bie@...el.com>,
"maxime.coquelin@...hat.com" <maxime.coquelin@...hat.com>,
"Liang, Cunming" <cunming.liang@...el.com>,
"Wang, Zhihong" <zhihong.wang@...el.com>,
"Wang, Xiao W" <xiao.w.wang@...el.com>,
"haotian.wang@...ive.com" <haotian.wang@...ive.com>,
"Zhu, Lingshan" <lingshan.zhu@...el.com>,
"eperezma@...hat.com" <eperezma@...hat.com>,
"lulu@...hat.com" <lulu@...hat.com>,
Parav Pandit <parav@...lanox.com>,
"Tian, Kevin" <kevin.tian@...el.com>,
"stefanha@...hat.com" <stefanha@...hat.com>,
"rdunlap@...radead.org" <rdunlap@...radead.org>,
"hch@...radead.org" <hch@...radead.org>,
Ariel Adam <aadam@...hat.com>, Jiri Pirko <jiri@...lanox.com>,
"hanand@...inx.com" <hanand@...inx.com>,
"mhabets@...arflare.com" <mhabets@...arflare.com>
Subject: RE: [PATCH 3/5] vDPA: introduce vDPA bus
Tuesday, January 21, 2020 10:35 AM, Jason Wang:
> Subject: Re: [PATCH 3/5] vDPA: introduce vDPA bus
>
>
> On 2020/1/21 4:15 PM, Michael S. Tsirkin wrote:
> > On Tue, Jan 21, 2020 at 04:00:38PM +0800, Jason Wang wrote:
> >> On 2020/1/21 1:47 PM, Michael S. Tsirkin wrote:
> >>> On Tue, Jan 21, 2020 at 12:00:57PM +0800, Jason Wang wrote:
> >>>> On 2020/1/21 1:49 AM, Jason Gunthorpe wrote:
> >>>>> On Mon, Jan 20, 2020 at 04:43:53PM +0800, Jason Wang wrote:
> >>>>>> This is similar to the design of platform IOMMU part of
> >>>>>> vhost-vdpa. We decide to send diffs to platform IOMMU there. If
> >>>>>> it's ok to do that in driver, we can replace set_map with incremental
> >>>>>> API like map()/unmap().
> >>>>>>
> >>>>>> Then driver need to maintain rbtree itself.
> >>>>> I think we really need to see two modes, one where there is a
> >>>>> fixed translation without dynamic vIOMMU driven changes and one
> >>>>> that supports vIOMMU.
> >>>> I think in this case, you meant the method proposed by Shahaf that
> >>>> sends diffs of "fixed translation" to device?
> >>>>
> >>>> It would be kind of tricky to deal with the following case for example:
> >>>>
> >>>> old map [4G, 16G) new map [4G, 8G)
> >>>>
> >>>> If we do
> >>>>
> >>>> 1) flush [4G, 16G)
> >>>> 2) add [4G, 8G)
> >>>>
> >>>> There could be a window between 1) and 2).
> >>>>
> >>>> It requires the IOMMU that can do
> >>>>
> >>>> 1) remove [8G, 16G)
> >>>> 2) flush [8G, 16G)
> >>>> 3) change [4G, 8G)
> >>>>
> >>>> ....
> >>> Basically what I had in mind is something like qemu memory api
> >>>
> >>> 0. begin
> >>> 1. remove [8G, 16G)
> >>> 2. add [4G, 8G)
> >>> 3. commit
> >>
> >> This sounds more flexible e.g driver may choose to implement static
> >> mapping one through commit. But a question here, it looks to me this
> >> still requires the DMA to be synced with at least commit here.
> >> Otherwise device may get DMA fault? Or device is expected to be paused
> >> DMA during begin?
> >>
> >> Thanks
> > For example, commit might switch one set of tables for another,
> > without need to pause DMA.
>
>
> Yes, I think that works but need confirmation from Shahaf or Jason.
From my side, as I wrote, I would like to see the suggested function prototypes along with a definition of what the driver is expected to do when each one is called.
It is not 100% clear to me what the outcome of remove/flush/change/commit should be.
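
For illustration only, here is one possible reading of the begin/remove/add/commit sequence quoted above, as a user-space C sketch. Every name in it (map_begin, map_remove, map_add, map_commit, struct map_txn_dev) is hypothetical and is not taken from the posted patch or from any kernel API. It assumes the device can hold two translation tables and switch between them on commit, so the active table is never modified while DMA may be in flight.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GiB     (1ULL << 30)
#define MAP_MAX 16

/* Half-open IOVA range [start, end). */
struct range { uint64_t start, end; };

struct map_table {
	struct range r[MAP_MAX];
	size_t n;
};

/* Hypothetical device state: two tables, one active, one shadow. */
struct map_txn_dev {
	struct map_table tables[2];
	int active;   /* index of the table the device translates with */
	int in_txn;   /* non-zero between begin and commit */
};

static struct map_table *shadow(struct map_txn_dev *d)
{
	return &d->tables[d->active ^ 1];
}

/* begin: copy the active table into the shadow; the active one is untouched. */
int map_begin(struct map_txn_dev *d)
{
	if (d->in_txn)
		return -1;
	*shadow(d) = d->tables[d->active];
	d->in_txn = 1;
	return 0;
}

/* add: append a range to the shadow table only (no overlap check here). */
int map_add(struct map_txn_dev *d, uint64_t start, uint64_t end)
{
	struct map_table *t = shadow(d);

	if (!d->in_txn || start >= end || t->n == MAP_MAX)
		return -1;
	t->r[t->n++] = (struct range){ start, end };
	return 0;
}

/* remove: cut [start, end) out of the shadow table, splitting ranges if needed. */
int map_remove(struct map_txn_dev *d, uint64_t start, uint64_t end)
{
	struct map_table *t = shadow(d);
	struct map_table out = { .n = 0 };

	if (!d->in_txn || start >= end)
		return -1;
	for (size_t i = 0; i < t->n; i++) {
		struct range cur = t->r[i], keep[2];
		size_t k = 0;

		if (cur.end <= start || cur.start >= end) {
			keep[k++] = cur;                      /* no overlap */
		} else {
			if (cur.start < start)                /* left remainder */
				keep[k++] = (struct range){ cur.start, start };
			if (cur.end > end)                    /* right remainder */
				keep[k++] = (struct range){ end, cur.end };
		}
		if (out.n + k > MAP_MAX)
			return -1;
		for (size_t j = 0; j < k; j++)
			out.r[out.n++] = keep[j];
	}
	*t = out;
	return 0;
}

/* commit: switch the device to the shadow table in a single step. */
int map_commit(struct map_txn_dev *d)
{
	if (!d->in_txn)
		return -1;
	d->active ^= 1;
	d->in_txn = 0;
	return 0;
}

/* Returns 1 if addr is covered by the table the device currently uses. */
int map_translates(const struct map_txn_dev *d, uint64_t addr)
{
	const struct map_table *t = &d->tables[d->active];

	for (size_t i = 0; i < t->n; i++)
		if (addr >= t->r[i].start && addr < t->r[i].end)
			return 1;
	return 0;
}
```

Under these assumptions, shrinking [4G, 16G) to [4G, 8G) is begin, remove [8G, 16G), commit. An address such as 12G stays translatable until commit returns, so there is no window in which [4G, 8G) is unmapped. Whether real hardware can switch tables this way, and what a driver must guarantee for DMA already in flight, is exactly the part that still needs to be defined.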
>
> Thanks
>
>
>
> >
> >>> Anyway, I'm fine with a one-shot API for now, we can improve it
> >>> later.
> >>>
> >>>>> There are different optimization goals in the drivers for these
> >>>>> two configurations.
> >>>>>
> >>>>>>> If the first one, then I think memory hotplug is a heavy flow
> >>>>>>> regardless. Do you think the extra cycles for the tree traverse
> >>>>>>> will be visible in any way?
> >>>>>> I think if the driver can pause the DMA during the time for
> >>>>>> setting up new mapping, it should be fine.
> >>>>> This is very tricky for any driver if the mapping change hits the
> >>>>> virtio rings. :(
> >>>>>
> >>>>> Even a IOMMU using driver is going to have problems with that..
> >>>>>
> >>>>> Jason
> >>>> Or I wonder whether ATS/PRI can help here. E.g during I/O page
> >>>> fault, driver/device can wait for the new mapping to be set and
> >>>> then replay the DMA.
> >>>>
> >>>> Thanks
> >>>>