[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d256a967-f50e-2e19-1985-aa9cfc0e8b18@suse.com>
Date: Thu, 16 Mar 2023 15:20:52 +0100
From: Juergen Gross <jgross@...e.com>
To: Alex Deucher <alexdeucher@...il.com>
Cc: Jan Beulich <jbeulich@...e.com>,
Stefano Stabellini <sstabellini@...nel.org>,
Honglei Huang <honglei1.huang@....com>,
amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
linux-kernel@...r.kernel.org,
Stewart Hildebrand <Stewart.Hildebrand@....com>,
Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>,
Huang Rui <ray.huang@....com>,
Chen Jiqian <Jiqian.Chen@....com>,
Xenia Ragiadakou <burzalodowa@...il.com>,
Alex Deucher <alexander.deucher@....com>,
xen-devel@...ts.xenproject.org,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Julia Zhang <julia.zhang@....com>,
Christian König <christian.koenig@....com>,
Roger Pau Monné <roger.pau@...rix.com>
Subject: Re: [RFC PATCH 1/5] x86/xen: disable swiotlb for xen pvh
On 16.03.23 14:53, Alex Deucher wrote:
> On Thu, Mar 16, 2023 at 9:48 AM Juergen Gross <jgross@...e.com> wrote:
>>
>> On 16.03.23 14:45, Alex Deucher wrote:
>>> On Thu, Mar 16, 2023 at 3:50 AM Jan Beulich <jbeulich@...e.com> wrote:
>>>>
>>>> On 16.03.2023 00:25, Stefano Stabellini wrote:
>>>>> On Wed, 15 Mar 2023, Jan Beulich wrote:
>>>>>> On 15.03.2023 01:52, Stefano Stabellini wrote:
>>>>>>> On Mon, 13 Mar 2023, Jan Beulich wrote:
>>>>>>>> On 12.03.2023 13:01, Huang Rui wrote:
>>>>>>>>> Xen PVH is the paravirtualized mode and takes advantage of hardware
>>>>>>>>> virtualization support when possible. It will using the hardware IOMMU
>>>>>>>>> support instead of xen-swiotlb, so disable swiotlb if current domain is
>>>>>>>>> Xen PVH.
>>>>>>>>
>>>>>>>> But the kernel has no way (yet) to drive the IOMMU, so how can it get
>>>>>>>> away without resorting to swiotlb in certain cases (like I/O to an
>>>>>>>> address-restricted device)?
>>>>>>>
>>>>>>> I think Ray meant that, thanks to the IOMMU setup by Xen, there is no
>>>>>>> need for swiotlb-xen in Dom0. Address translations are done by the IOMMU
>>>>>>> so we can use guest physical addresses instead of machine addresses for
>>>>>>> DMA. This is a similar case to Dom0 on ARM when the IOMMU is available
>>>>>>> (see include/xen/arm/swiotlb-xen.h:xen_swiotlb_detect, the corresponding
>>>>>>> case is XENFEAT_not_direct_mapped).
>>>>>>
>>>>>> But how does Xen using an IOMMU help with, as said, address-restricted
>>>>>> devices? They may still need e.g. a 32-bit address to be programmed in,
>>>>>> and if the kernel has memory beyond the 4G boundary not all I/O buffers
>>>>>> may fulfill this requirement.
>>>>>
>>>>> In short, it is going to work as long as Linux has guest physical
>>>>> addresses (not machine addresses, those could be anything) lower than
>>>>> 4GB.
>>>>>
>>>>> If the address-restricted device does DMA via an IOMMU, then the device
>>>>> gets programmed by Linux using its guest physical addresses (not machine
>>>>> addresses).
>>>>>
>>>>> The 32-bit restriction would be applied by Linux to its choice of guest
>>>>> physical address to use to program the device, the same way it does on
>>>>> native. The device would be fine as it always uses Linux-provided <4GB
>>>>> addresses. After the IOMMU translation (pagetable setup by Xen), we
>>>>> could get any address, including >4GB addresses, and that is expected to
>>>>> work.
>>>>
>>>> I understand that's the "normal" way of working. But whatever the swiotlb
>>>> is used for in baremetal Linux, that would similarly require its use in
>>>> PVH (or HVM) aiui. So unconditionally disabling it in PVH would look to
>>>> me like an incomplete attempt to disable its use altogether on x86. What
>>>> difference of PVH vs baremetal am I missing here?
>>>
>>> swiotlb is not usable for GPUs even on bare metal. They often have
>>> hundreds or megs or even gigs of memory mapped on the device at any
>>> given time. Also, AMD GPUs support 44-48 bit DMA masks (depending on
>>> the chip family).
>>
>> But the swiotlb isn't per device, but system global.
>
> Sure, but if the swiotlb is in use, then you can't really use the GPU.
> So you get to pick one.
The swiotlb is used only for buffers which are not within the DMA mask of a
device (see dma_direct_map_page()). So an AMD GPU supporting a 44 bit DMA mask
won't use the swiotlb unless you have a buffer above guest physical address of
16TB (so basically never).
Disabling swiotlb in such a guest would OTOH mean, that a device with only
32 bit DMA mask passed through to this guest couldn't work with buffers
above 4GB.
I don't think this is acceptable.
Juergen
Download attachment "OpenPGP_0xB0DE9DD628BF132F.asc" of type "application/pgp-keys" (3099 bytes)
Download attachment "OpenPGP_signature" of type "application/pgp-signature" (496 bytes)
Powered by blists - more mailing lists