[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e1b5461f-15a5-6647-6473-80b3e8defc48@oracle.com>
Date: Mon, 21 Aug 2023 15:31:28 -0700
From: Si-Wei Liu <si-wei.liu@...cle.com>
To: Eugenio Perez Martin <eperezma@...hat.com>
Cc: Jason Wang <jasowang@...hat.com>, gal@...dia.com,
linux-kernel@...r.kernel.org, mst@...hat.com,
virtualization@...ts.linux-foundation.org,
xuanzhuo@...ux.alibaba.com
Subject: Re: [PATCH RFC 1/4] vdpa: introduce .reset_map operation callback
On 8/17/2023 8:28 AM, Eugenio Perez Martin wrote:
> On Thu, Aug 17, 2023 at 2:05 AM Si-Wei Liu <si-wei.liu@...cle.com> wrote:
>>
>>
>> On 8/15/2023 6:55 PM, Jason Wang wrote:
>>> On Wed, Aug 16, 2023 at 3:49 AM Si-Wei Liu <si-wei.liu@...cle.com> wrote:
>>>>
>>>> On 8/14/2023 7:21 PM, Jason Wang wrote:
>>>>> On Tue, Aug 15, 2023 at 9:46 AM Si-Wei Liu <si-wei.liu@...cle.com> wrote:
>>>>>> Signed-off-by: Si-Wei Liu <si-wei.liu@...cle.com>
>>>>>> ---
>>>>>> include/linux/vdpa.h | 7 +++++++
>>>>>> 1 file changed, 7 insertions(+)
>>>>>>
>>>>>> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
>>>>>> index db1b0ea..3a3878d 100644
>>>>>> --- a/include/linux/vdpa.h
>>>>>> +++ b/include/linux/vdpa.h
>>>>>> @@ -314,6 +314,12 @@ struct vdpa_map_file {
>>>>>> * @iova: iova to be unmapped
>>>>>> * @size: size of the area
>>>>>> * Returns integer: success (0) or error (< 0)
>>>>>> + * @reset_map: Reset device memory mapping (optional)
>>>>>> + * Needed for device that using device
>>>>>> + * specific DMA translation (on-chip IOMMU)
>>>>> This exposes the device internal to the upper layer which is not optimal.
>>>> Not sure what does it mean by "device internal", but this op callback
>>>> just follows existing convention to describe what vdpa parent this API
>>>> targets.
>>> I meant the bus tries to hide the differences among vendors. So it
>>> needs to hide on-chip IOMMU stuff to the upper layer.
>>>
>>> We can expose two dimensional IO mappings models but it looks like
>>> over engineering for this issue. More below.
>>>
>>>> * @set_map: Set device memory mapping (optional)
>>>> * Needed for device that using device
>>>> * specific DMA translation (on-chip IOMMU)
>>>> :
>>>> :
>>>> * @dma_map: Map an area of PA to IOVA (optional)
>>>> * Needed for device that using device
>>>> * specific DMA translation (on-chip IOMMU)
>>>> * and preferring incremental map.
>>>> :
>>>> :
>>>> * @dma_unmap: Unmap an area of IOVA (optional but
>>>> * must be implemented with dma_map)
>>>> * Needed for device that using device
>>>> * specific DMA translation (on-chip IOMMU)
>>>> * and preferring incremental unmap.
>>>>
>>>>
>>>>> Btw, what's the difference between this and a simple
>>>>>
>>>>> set_map(NULL)?
>>>> I don't think parent drivers support this today - they can accept
>>>> non-NULL iotlb containing empty map entry, but not a NULL iotlb. The
>>>> behavior is undefined or it even causes panic when a NULL iotlb is
>>>> passed in.
>>> We can do this simple change if it can work.
>> If we go with setting up 1:1 DMA mapping at virtio-vdpa .probe() and
>> tearing it down at .release(), perhaps set_map(NULL) is not sufficient.
>>>> Further this doesn't work with .dma_map parent drivers.
>>> Probably, but I'd remove dma_map as it doesn't have any real users
>>> except for the simulator.
>> OK, at a point there was suggestion to get this incremental API extended
>> to support batching to be in par with or even replace .set_map, not sure
>> if it's too soon to conclude. But I'm okay with the removal if need be.
> Yes, I think the right move in the long run is to delegate the
> batching to the parent driver. This allows drivers like mlx to add
> memory (like hotplugged memory) without the need of tearing down all
> the old maps.
Nods.
>
> Having said that, maybe we can work on top if we need to remove
> .dma_map for now.
I guess for that sake I would keep .dma_map unless there's strong
objection against it.
Thanks,
-Siwei
>
>>>> The reason why a new op is needed or better is because it allows
>>>> userspace to tell apart different reset behavior from the older kernel
>>>> (via the F_IOTLB_PERSIST feature bit in patch 4), while this behavior
>>>> could vary between parent drivers.
>>> I'm ok with a new feature flag, but we need to first seek a way to
>>> reuse the existing API.
>> A feature flag is needed anyway. I'm fine with reusing but guess I'd
>> want to converge on the direction first.
>>
>> Thanks,
>> -Siwei
>>> Thanks
>>>
>>>> Regards,
>>>> -Siwei
>>>>
>>>>> Thanks
>>>>>
>>>>>> + * @vdev: vdpa device
>>>>>> + * @asid: address space identifier
>>>>>> + * Returns integer: success (0) or error (< 0)
>>>>>> * @get_vq_dma_dev: Get the dma device for a specific
>>>>>> * virtqueue (optional)
>>>>>> * @vdev: vdpa device
>>>>>> @@ -390,6 +396,7 @@ struct vdpa_config_ops {
>>>>>> u64 iova, u64 size, u64 pa, u32 perm, void *opaque);
>>>>>> int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
>>>>>> u64 iova, u64 size);
>>>>>> + int (*reset_map)(struct vdpa_device *vdev, unsigned int asid);
>>>>>> int (*set_group_asid)(struct vdpa_device *vdev, unsigned int group,
>>>>>> unsigned int asid);
>>>>>> struct device *(*get_vq_dma_dev)(struct vdpa_device *vdev, u16 idx);
>>>>>> --
>>>>>> 1.8.3.1
>>>>>>
Powered by blists - more mailing lists