Date:   Fri, 08 Jan 2021 10:09:10 -0800
To:     Sai Prakash Ranjan
Cc:     Will Deacon, Rob Clark, Jordan Crouse, Joerg Roedel,
        Akhil P Oommen, Robin Murphy
Subject: Re: [PATCH] iommu/io-pgtable-arm: Allow non-coherent masters to use
 system cache

On 2021-01-07 21:47, Sai Prakash Ranjan wrote:
> On 2021-01-07 22:27, wrote:
>> On 2021-01-06 03:56, Will Deacon wrote:
>>> On Thu, Dec 24, 2020 at 12:10:07PM +0530, Sai Prakash Ranjan wrote:
>>>> commit ecd7274fb4cd ("iommu: Remove unused IOMMU_SYS_CACHE_ONLY 
>>>> flag")
>>>> removed unused IOMMU_SYS_CACHE_ONLY prot flag and along with it went
>>>> the memory type setting required for the non-coherent masters to use
>>>> system cache. Now that system cache support for GPU is added, we 
>>>> will
>>>> need to mark the memory as normal sys-cached for GPU to use system 
>>>> cache.
>>>> Without this, the system cache lines are not allocated for GPU. We 
>>>> use
>>>> the IO_PGTABLE_QUIRK_ARM_OUTER_WBWA quirk instead of a page 
>>>> protection
>>>> flag as the flag cannot be exposed via DMA api because of no in-tree
>>>> users.
>>>> Signed-off-by: Sai Prakash Ranjan <>
>>>> ---
>>>>  drivers/iommu/io-pgtable-arm.c | 3 +++
>>>>  1 file changed, 3 insertions(+)
>>>> diff --git a/drivers/iommu/io-pgtable-arm.c 
>>>> b/drivers/iommu/io-pgtable-arm.c
>>>> index 7c9ea9d7874a..3fb7de8304a2 100644
>>>> --- a/drivers/iommu/io-pgtable-arm.c
>>>> +++ b/drivers/iommu/io-pgtable-arm.c
>>>> @@ -415,6 +415,9 @@ static arm_lpae_iopte 
>>>> arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
>>>>  		else if (prot & IOMMU_CACHE)
>>>> +		else if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA)
>>>>  	}
>> While this approach of enabling system cache globally for both page
>> tables and other buffers
>> works for the GPU usecase, this isn't ideal for other clients that use
>> system cache. For example,
>> video clients only want to cache a subset of their buffers in the
>> system cache, due to the sizing constraint
>> imposed by how much of the system cache they can use. So, it would be
>> ideal to have
>> a way of expressing the desire to use the system cache on a per-buffer
>> basis. Additionally,
>> our video clients use the DMA layer, and since the requirement is for
>> caching in the system cache
>> to be a per buffer attribute, it seems like we would have to have a
>> DMA attribute to express
>> this on a per-buffer basis.
> I did bring this up initially [1], also where is this video client
> in upstream? AFAIK, only system cache user in upstream is GPU.
> We cannot add any DMA attribute unless there is any user upstream
Right, there wouldn't be an upstream user, which would be problematic,
but I was thinking of arranging things so that when video or any of our
other clients that use this attribute on a per-buffer basis upstream
their code, it's not too much of a stretch to add the support.
> as per [2], so when the support for such a client is added, wouldn't
> ((data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA) || PROT_FLAG)
> work?
I don't think that will work, because we currently have clients who use
the system cache as follows:
- cache only page tables in the system cache
- cache only data buffers in the system cache
- cache both page tables and all buffers in the system cache
- cache both page tables and some buffers in the system cache

The approach you're suggesting doesn't allow for the last case, as 
caching the
page tables in the system cache involves setting 
so we will end up losing the flexibility to cache some data buffers in 
the system cache.
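
To illustrate the problem, here is a rough sketch of the combined check.
The macro and function names are made-up stand-ins for this email, not
kernel definitions: QUIRK_OUTER_WBWA models the page table quirk, and
PROT_SYS_CACHE is a hypothetical per-buffer prot flag that does not exist
upstream.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only, not kernel definitions. */
#define QUIRK_OUTER_WBWA (1u << 0) /* models IO_PGTABLE_QUIRK_ARM_OUTER_WBWA */
#define PROT_SYS_CACHE   (1u << 0) /* hypothetical per-buffer prot flag */

/* The combined check suggested above: page table quirk OR per-buffer flag. */
static bool buffer_is_sys_cached(unsigned int quirks, unsigned int prot)
{
	return (quirks & QUIRK_OUTER_WBWA) || (prot & PROT_SYS_CACHE);
}
```

Once the quirk is set, buffer_is_sys_cached() returns true for every
mapping, including those that never asked for PROT_SYS_CACHE, so there is
no way to cache only some buffers.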

Ideally, the page table quirk would drive the settings for the TCR, and
the prot flag would drive the PTE for the mapping. This mirrors what is
done today, where the page table walker being dma-coherent is handled
separately from buffers, which are mapped as cacheable based on
IOMMU_CACHE. Thoughts?
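
For concreteness, a rough sketch of that split might look like the
following. Everything here is an assumption made for illustration: the
enum, the helper names, and PROT_SYS_CACHE are hypothetical, and
QUIRK_OUTER_WBWA and PROT_CACHE only model the real quirk and
IOMMU_CACHE. It is not the io-pgtable-arm code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only, not kernel definitions. */
#define QUIRK_OUTER_WBWA (1u << 0) /* models IO_PGTABLE_QUIRK_ARM_OUTER_WBWA */
#define PROT_CACHE       (1u << 0) /* models IOMMU_CACHE */
#define PROT_SYS_CACHE   (1u << 1) /* hypothetical per-buffer prot flag */

enum mem_attr {
	ATTR_NON_CACHEABLE,
	ATTR_CACHEABLE,
	ATTR_SYS_CACHED,
};

/* The quirk only decides whether page table walks use the system cache (TCR). */
static bool walks_use_sys_cache(unsigned int quirks)
{
	return (quirks & QUIRK_OUTER_WBWA) != 0;
}

/* The prot flags alone decide the per-mapping PTE attribute. */
static enum mem_attr pte_attr(unsigned int prot)
{
	if (prot & PROT_CACHE)
		return ATTR_CACHEABLE;
	if (prot & PROT_SYS_CACHE)
		return ATTR_SYS_CACHED;
	return ATTR_NON_CACHEABLE;
}
```

With the two decisions kept independent, a client can set the quirk for
its page tables and still pass PROT_SYS_CACHE for only the buffers it
wants in the system cache, which covers the fourth case in the list
above.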

> [1]
> [2] 
> Thanks,
> Sai
