Date:   Mon, 11 Jan 2021 10:08:08 +0530
From:   Sai Prakash Ranjan <>
Cc:     Will Deacon <>, Rob Clark <>,
        Jordan Crouse <>, Joerg Roedel <>,
        Akhil P Oommen <>,
        Robin Murphy <>,
Subject: Re: [PATCH] iommu/io-pgtable-arm: Allow non-coherent masters to use
 system cache

On 2021-01-08 23:39, wrote:
> On 2021-01-07 21:47, Sai Prakash Ranjan wrote:
>> On 2021-01-07 22:27, wrote:
>>> On 2021-01-06 03:56, Will Deacon wrote:
>>>> On Thu, Dec 24, 2020 at 12:10:07PM +0530, Sai Prakash Ranjan wrote:
>>>>> commit ecd7274fb4cd ("iommu: Remove unused IOMMU_SYS_CACHE_ONLY flag")
>>>>> removed the unused IOMMU_SYS_CACHE_ONLY prot flag and along with it
>>>>> went the memory type setting required for non-coherent masters to use
>>>>> the system cache. Now that system cache support for GPU is added, we
>>>>> will need to mark the memory as normal sys-cached for GPU to use the
>>>>> system cache. Without this, system cache lines are not allocated for
>>>>> GPU. We use the IO_PGTABLE_QUIRK_ARM_OUTER_WBWA quirk instead of a
>>>>> page protection flag as the flag cannot be exposed via the DMA API
>>>>> because there are no in-tree users.
>>>>> Signed-off-by: Sai Prakash Ranjan <>
>>>>> ---
>>>>>  drivers/iommu/io-pgtable-arm.c | 3 +++
>>>>>  1 file changed, 3 insertions(+)
>>>>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>>>>> index 7c9ea9d7874a..3fb7de8304a2 100644
>>>>> --- a/drivers/iommu/io-pgtable-arm.c
>>>>> +++ b/drivers/iommu/io-pgtable-arm.c
>>>>> @@ -415,6 +415,9 @@ static arm_lpae_iopte
>>>>> arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
>>>>>  		else if (prot & IOMMU_CACHE)
>>>>>  			pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
>>>>>  				<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
>>>>> +		else if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA)
>>>>> +			pte |= (ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE
>>>>> +				<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
>>>>>  	}
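For reference, the attribute selection the hunk above adds can be sketched as a standalone C model. This is a simplified sketch, not kernel code: the flag values and index numbers below are placeholders, only the names and the else-if ordering mirror arm_lpae_prot_to_pte().

```c
#include <assert.h>

/* Stand-in flag/quirk values; the real kernel encodings differ. */
#define IOMMU_MMIO                       (1 << 0)
#define IOMMU_CACHE                      (1 << 1)
#define IO_PGTABLE_QUIRK_ARM_OUTER_WBWA  (1 << 2)

enum mair_idx {
	IDX_NC,         /* default: non-cacheable */
	IDX_DEV,        /* device memory */
	IDX_CACHE,      /* normal, inner/outer cacheable */
	IDX_INC_OCACHE, /* inner non-cacheable, outer WBWA (system cache) */
};

/* Mirrors the else-if chain: IOMMU_CACHE still wins, and the quirk only
 * kicks in for otherwise non-cacheable mappings, steering them to the
 * outer write-back/write-allocate (system cache) attribute. */
static enum mair_idx pick_attr_idx(unsigned long quirks, int prot)
{
	if (prot & IOMMU_MMIO)
		return IDX_DEV;
	else if (prot & IOMMU_CACHE)
		return IDX_CACHE;
	else if (quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA)
		return IDX_INC_OCACHE;
	return IDX_NC;
}
```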
>>> While this approach of enabling the system cache globally for both
>>> page tables and other buffers works for the GPU use case, it isn't
>>> ideal for other clients that use the system cache. For example, video
>>> clients only want to cache a subset of their buffers in the system
>>> cache, due to the sizing constraint imposed by how much of the system
>>> cache they can use. So it would be ideal to have a way of expressing
>>> the desire to use the system cache on a per-buffer basis.
>>> Additionally, our video clients use the DMA layer, and since the
>>> requirement is for caching in the system cache to be a per-buffer
>>> attribute, it seems like we would have to have a DMA attribute to
>>> express this on a per-buffer basis.
>> I did bring this up initially [1]. Also, where is this video client
>> upstream? AFAIK, the only system cache user upstream is the GPU.
>> We cannot add any DMA attribute unless there is a user upstream
> Right, there wouldn't be an upstream user, which would be problematic,
> but I was thinking of having it so that when video or any of our other
> clients that use this attribute on a per-buffer basis upstream their
> code, it's not too much of a stretch to add the support.


>> as per [2], so when the support for such a client is added, wouldn't
>> ((data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA) || 
>> work?
> I don't think that will work, because we currently have clients who
> use the system cache as follows:
> - cache only page tables in the system cache
> - cache only data buffers in the system cache
> - cache both page tables and all buffers in the system cache
> - cache both page tables and some buffers in the system cache
> The approach you're suggesting doesn't allow for the last case, as
> caching the page tables in the system cache involves setting
> so we will end up losing the flexibility to cache some data buffers in
> the system cache.
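The limitation described in the reply above can be seen in a toy model (names are illustrative, not kernel code): when a domain-wide quirk is the only knob, every non-IOMMU_CACHE mapping in that domain inherits the system-cache attribute, so there is no way to keep page tables in the system cache while leaving a particular data buffer out of it.

```c
#include <assert.h>

#define IOMMU_CACHE       (1 << 0)
#define QUIRK_OUTER_WBWA  (1 << 1) /* stand-in for IO_PGTABLE_QUIRK_ARM_OUTER_WBWA */

enum attr { ATTR_NC, ATTR_CACHE, ATTR_SYSCACHE };

/* Quirk-driven model: the only inputs are the domain-wide quirk and the
 * per-mapping IOMMU_CACHE flag -- there is no per-buffer system-cache
 * opt-in or opt-out. */
static enum attr map_attr(unsigned long domain_quirks, int prot)
{
	if (prot & IOMMU_CACHE)
		return ATTR_CACHE;
	if (domain_quirks & QUIRK_OUTER_WBWA)
		return ATTR_SYSCACHE;
	return ATTR_NC;
}
```

A client that sets the quirk to get its page tables into the system cache can no longer map a buffer as plain non-cacheable: map_attr(QUIRK_OUTER_WBWA, 0) already yields the system-cache attribute for every such buffer.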

Ah yes, you are right, I believe Jordan mentioned the same [1].


> Ideally, the page table quirk would drive the settings for the TCR,
> and the prot flag drives the PTE for the mapping, as is done with the
> page table walker being dma-coherent, while buffers are mapped as
> cacheable based on IOMMU_CACHE. Thoughts?

Right, mixing the two is not correct. Will's suggestion for a new prot
flag sounds good to me, I will work on that.
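One way that split could look, as a hedged sketch: the quirk configures only the walker side (TCR), while the PTE attribute is chosen purely from per-buffer prot flags. IOMMU_SYS_CACHE below is a hypothetical flag name invented for this sketch, not an existing kernel define.

```c
#include <assert.h>
#include <stdbool.h>

#define IOMMU_CACHE      (1 << 0)
#define IOMMU_SYS_CACHE  (1 << 1) /* hypothetical per-buffer prot flag */

enum attr { ATTR_NC, ATTR_CACHE, ATTR_SYSCACHE };

struct domain_cfg {
	bool walker_outer_wbwa; /* quirk: TCR attrs for page-table walks only */
};

/* The PTE attribute depends only on the per-buffer prot flags,
 * independently of how the walker (page tables) is configured. */
static enum attr pte_attr(int prot)
{
	if (prot & IOMMU_CACHE)
		return ATTR_CACHE;
	if (prot & IOMMU_SYS_CACHE)
		return ATTR_SYSCACHE;
	return ATTR_NC;
}
```

With this split the fourth usage case above works: set walker_outer_wbwa so the page tables allocate in the system cache, pass the per-buffer flag only on the buffers that should do the same, and leave the rest non-cacheable.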


QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of Code Aurora Forum, hosted by The Linux Foundation
