Message-ID: <8bf37fba-8c7e-4b9f-9864-4ca2a6c5c657@amd.com>
Date: Thu, 3 Apr 2025 16:37:53 +0800
From: Zhu Lingshan <lingshan.zhu@....com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: David Woodhouse <dwmw2@...radead.org>, virtio-comment@...ts.linux.dev,
hch@...radead.org, Claire Chang <tientzu@...omium.org>,
linux-devicetree <devicetree@...r.kernel.org>,
Rob Herring <robh+dt@...nel.org>, Jörg Roedel
<joro@...tes.org>, iommu@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org, graf@...zon.de
Subject: Re: [RFC PATCH 3/3] transport-pci: Add SWIOTLB bounce buffer
capability
On 4/3/2025 4:16 PM, Michael S. Tsirkin wrote:
> On Thu, Apr 03, 2025 at 04:12:20PM +0800, Zhu Lingshan wrote:
>> On 4/3/2025 3:37 PM, Michael S. Tsirkin wrote:
>>> On Thu, Apr 03, 2025 at 03:36:04PM +0800, Zhu Lingshan wrote:
>>>> On 4/3/2025 3:27 PM, Michael S. Tsirkin wrote:
>>>>> On Wed, Apr 02, 2025 at 12:04:47PM +0100, David Woodhouse wrote:
>>>>>> From: David Woodhouse <dwmw@...zon.co.uk>
>>>>>>
>>>>>> Add a VIRTIO_PCI_CAP_SWIOTLB capability which advertises a SWIOTLB bounce
>>>>>> buffer similar to the existing `restricted-dma-pool` device-tree feature.
>>>>>>
>>>>>> The difference is that this is per-device; each device needs to have its
>>>>>> own. Perhaps we should add a UUID to the capability, and have a way for
>>>>>> a device to not *provide* its own buffer, but just to reference the UUID
>>>>>> of a buffer elsewhere?
>>>>>>
>>>>>> Signed-off-by: David Woodhouse <dwmw@...zon.co.uk>
>>>>>> ---
>>>>>> transport-pci.tex | 33 +++++++++++++++++++++++++++++++++
>>>>>> 1 file changed, 33 insertions(+)
>>>>>>
>>>>>> diff --git a/transport-pci.tex b/transport-pci.tex
>>>>>> index a5c6719..23e0d57 100644
>>>>>> --- a/transport-pci.tex
>>>>>> +++ b/transport-pci.tex
>>>>>> @@ -129,6 +129,7 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>>>>>> \item ISR Status
>>>>>> \item Device-specific configuration (optional)
>>>>>> \item PCI configuration access
>>>>>> +\item SWIOTLB bounce buffer
>>>>>> \end{itemize}
>>>>>>
>>>>>> Each structure can be mapped by a Base Address register (BAR) belonging to
>>>>>> @@ -188,6 +189,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>>>>>> #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
>>>>>> /* Vendor-specific data */
>>>>>> #define VIRTIO_PCI_CAP_VENDOR_CFG 9
>>>>>> +/* Software IOTLB bounce buffer */
>>>>>> +#define VIRTIO_PCI_CAP_SWIOTLB 10
>>>>>> \end{lstlisting}
>>>>>>
>>>>>> Any other value is reserved for future use.
>>>>>> @@ -744,6 +747,36 @@ \subsubsection{Vendor data capability}\label{sec:Virtio
>>>>>> The driver MUST qualify the \field{vendor_id} before
>>>>>> interpreting or writing into the Vendor data capability.
>>>>>>
>>>>>> +\subsubsection{Software IOTLB bounce buffer capability}\label{sec:Virtio
>>>>>> +Transport Options / Virtio Over PCI Bus / PCI Device Layout /
>>>>>> +Software IOTLB bounce buffer capability}
>>>>>> +
>>>>>> +The optional Software IOTLB bounce buffer capability allows the
>>>>>> +device to provide a memory region which can be used by the driver
>>>>>> +for bounce buffering. This allows a device on the PCI
>>>>>> +transport to operate without DMA access to system memory addresses.
>>>>>> +
>>>>>> +The Software IOTLB region is referenced by the
>>>>>> +VIRTIO_PCI_CAP_SWIOTLB capability. Bus addresses within the referenced
>>>>>> +range are not subject to the requirements of the VIRTIO_F_ORDER_PLATFORM
>>>>>> +feature, if negotiated.
>>>>>> +
>>>>>> +\devicenormative{\paragraph}{Software IOTLB bounce buffer capability}{Virtio
>>>>>> +Transport Options / Virtio Over PCI Bus / PCI Device Layout /
>>>>>> +Software IOTLB bounce buffer capability}
>>>>>> +
>>>>>> +Devices which present the Software IOTLB bounce buffer capability
>>>>>> +SHOULD also offer the VIRTIO_F_SWIOTLB feature.
>>>>>> +
>>>>>> +\drivernormative{\paragraph}{Software IOTLB bounce buffer capability}{Virtio
>>>>>> +Transport Options / Virtio Over PCI Bus / PCI Device Layout /
>>>>>> +Software IOTLB bounce buffer capability}
>>>>>> +
>>>>>> +The driver SHOULD use the offered buffer in preference to passing system
>>>>>> +memory addresses to the device. If the driver accepts the VIRTIO_F_SWIOTLB
>>>>>> +feature, then the driver MUST use the offered buffer and never pass system
>>>>>> +memory addresses to the device.
>>>>>> +
>>>>>> \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / PCI configuration access capability}
>>>>>>
>>>>>> The VIRTIO_PCI_CAP_PCI_CFG capability
>>>>>> --
>>>>>> 2.49.0
>>>>>>
>>>>> So on the PCI option. The normal mapping (ioremap) for BAR is uncached. If done
>>>>> like this, performance will suffer. But if you do normal WB, since device
>>>> and this can possibly even cause TLB thrashing, which is a worse case.
>>>>
>>>> Thanks
>>>> Zhu Lingshan
>>> Hmm which TLB? I don't get it.
>> CPU TLB, because a device-side bounce buffer design requires mapping
>> device memory into the CPU address space so that the CPU can help copy
>> data, which causes more frequent TLB switches.
> Lost me here. It's mapped, why switch?
Because the number of TLB entries is quite limited.
But never mind, this is not a key topic of this discussion.
Thanks
Zhu Lingshan
>
>> TLB thrashing will occur when many devices do DMA through
>> device-side bounce buffers, or with scattered DMA.
> Yea I don't think this idea even works. Each device can only use
> its own swiotlb.
>
>> If the bounce buffer resides in the hypervisor, for example QEMU,
>> then the TLB switches happen during QEMU process context switches,
>> which already occur all the time.
>>
>> Thanks
>> Zhu Lingshan
>>>>> accesses do not go on the bus, they do not get synchronized with driver
>>>>> writes and there's really no way to synchronize them.
>>>>>
>>>>> First, this needs to be addressed.
>>>>>
>>>>> In this age of accelerators for everything, building pci based
>>>>> interfaces that can't be efficiently accelerated seems shortsighted ...
>>>>>
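
For reference, the driver requirements in the quoted patch can be read as a small decision function. The following is only a rough sketch under stated assumptions, not code from the patch or from any driver: the names `dma_policy`, `swiotlb_dma_policy` and `in_swiotlb_range` are hypothetical, and it assumes VIRTIO_F_SWIOTLB is only negotiated when the capability is present. The value of the feature bit is not given in this patch, so the sketch takes plain booleans instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Capability type taken from the quoted patch. */
#define VIRTIO_PCI_CAP_SWIOTLB 10

/* Hypothetical names, for illustration only. */
enum dma_policy {
	DMA_SYSTEM_MEMORY,    /* no capability: system memory addresses are passed */
	DMA_BOUNCE_PREFERRED, /* capability only: driver SHOULD use the offered buffer */
	DMA_BOUNCE_REQUIRED,  /* VIRTIO_F_SWIOTLB accepted: driver MUST use the buffer */
};

/*
 * Map the two inputs from the patch text to a policy.
 * Assumption: the feature is never negotiated without the capability.
 */
static inline enum dma_policy swiotlb_dma_policy(bool cap_present,
						 bool feature_negotiated)
{
	if (!cap_present)
		return DMA_SYSTEM_MEMORY;
	if (feature_negotiated)
		return DMA_BOUNCE_REQUIRED;
	return DMA_BOUNCE_PREFERRED;
}

/*
 * Check that [addr, addr + len) lies entirely inside the region
 * [base, base + size) referenced by the capability. Written so that
 * no intermediate sum can overflow.
 */
static inline bool in_swiotlb_range(uint64_t addr, uint64_t len,
				    uint64_t base, uint64_t size)
{
	return addr >= base && len <= size && addr - base <= size - len;
}
```

With DMA_BOUNCE_REQUIRED, every address handed to the device would have to pass `in_swiotlb_range()`, which means the CPU copies every buffer through the mapped region. That is where the BAR mapping type (uncached versus write-back) discussed above matters.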