linux-kernel - Re: [PATCH] iommu/io-pgtable-arm: Allow non-coherent masters to use system cache

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <73b1957d0898a937e5e88c1a469352ea@codeaurora.org>
Date:   Mon, 11 Jan 2021 10:08:08 +0530
From:   Sai Prakash Ranjan <saiprakash.ranjan@...eaurora.org>
To:     isaacm@...eaurora.org
Cc:     Will Deacon <will@...nel.org>, Rob Clark <robdclark@...il.com>,
        Jordan Crouse <jcrouse@...eaurora.org>,
        linux-arm-msm@...r.kernel.org, Joerg Roedel <joro@...tes.org>,
        linux-kernel@...r.kernel.org,
        Akhil P Oommen <akhilpo@...eaurora.org>,
        iommu@...ts.linux-foundation.org,
        Robin Murphy <robin.murphy@....com>,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH] iommu/io-pgtable-arm: Allow non-coherent masters to use
 system cache

On 2021-01-08 23:39, isaacm@...eaurora.org wrote:
> On 2021-01-07 21:47, Sai Prakash Ranjan wrote:
>> On 2021-01-07 22:27, isaacm@...eaurora.org wrote:
>>> On 2021-01-06 03:56, Will Deacon wrote:
>>>> On Thu, Dec 24, 2020 at 12:10:07PM +0530, Sai Prakash Ranjan wrote:
>>>>> commit ecd7274fb4cd ("iommu: Remove unused IOMMU_SYS_CACHE_ONLY 
>>>>> flag")
>>>>> removed unused IOMMU_SYS_CACHE_ONLY prot flag and along with it 
>>>>> went
>>>>> the memory type setting required for the non-coherent masters to 
>>>>> use
>>>>> system cache. Now that system cache support for GPU is added, we 
>>>>> will
>>>>> need to mark the memory as normal sys-cached for GPU to use system 
>>>>> cache.
>>>>> Without this, the system cache lines are not allocated for GPU. We 
>>>>> use
>>>>> the IO_PGTABLE_QUIRK_ARM_OUTER_WBWA quirk instead of a page 
>>>>> protection
>>>>> flag as the flag cannot be exposed via DMA api because of no 
>>>>> in-tree
>>>>> users.
>>>>> 
>>>>> Signed-off-by: Sai Prakash Ranjan 
>>>>> <saiprakash.ranjan@...eaurora.org>
>>>>> ---
>>>>>  drivers/iommu/io-pgtable-arm.c | 3 +++
>>>>>  1 file changed, 3 insertions(+)
>>>>> 
>>>>> diff --git a/drivers/iommu/io-pgtable-arm.c 
>>>>> b/drivers/iommu/io-pgtable-arm.c
>>>>> index 7c9ea9d7874a..3fb7de8304a2 100644
>>>>> --- a/drivers/iommu/io-pgtable-arm.c
>>>>> +++ b/drivers/iommu/io-pgtable-arm.c
>>>>> @@ -415,6 +415,9 @@ static arm_lpae_iopte 
>>>>> arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
>>>>>  		else if (prot & IOMMU_CACHE)
>>>>>  			pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
>>>>>  				<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
>>>>> +		else if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA)
>>>>> +			pte |= (ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE
>>>>> +				<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
>>>>>  	}
>>>> 
>>> While this approach of enabling system cache globally for both page
>>> tables and other buffers
>>> works for the GPU usecase, this isn't ideal for other clients that 
>>> use
>>> system cache. For example,
>>> video clients only want to cache a subset of their buffers in the
>>> system cache, due to the sizing constraint
>>> imposed by how much of the system cache they can use. So, it would be
>>> ideal to have
>>> a way of expressing the desire to use the system cache on a 
>>> per-buffer
>>> basis. Additionally,
>>> our video clients use the DMA layer, and since the requirement is for
>>> caching in the system cache
>>> to be a per buffer attribute, it seems like we would have to have a
>>> DMA attribute to express
>>> this on a per-buffer basis.
>>> 
>> 
>> I did bring this up initially [1], also where is this video client
>> in upstream? AFAIK, only system cache user in upstream is GPU.
>> We cannot add any DMA attribute unless there is any user upstream
> Right, there wouldn't be an upstream user, which would be problematic,
> but I was thinking of having it so that when video or any of our other
> clients that use this attribute on a per buffer basis upstreams their
> code, it's not too much of a stretch to add the support.

Agreed.

>> as per [2], so when the support for such a client is added, wouldn't
>> ((data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA) || 
>> PROT_FLAG)
>> work?
> I don't think that will work, because we currently have clients who use 
> the
> system cache as follows:
> -cache only page tables in the system cache
> -cache only data buffers in the system cache
> -cache both page tables and all buffers in the system cache
> -cache both page tables and some buffers in the system cache
> 
> The approach you're suggesting doesn't allow for the last case, as 
> caching the
> page tables in the system cache involves setting
> IO_PGTABLE_QUIRK_ARM_OUTER_WBWA,
> so we will end up losing the flexibility to cache some data buffers in
> the system cache.
> 

Ah yes, you are right, I believe Jordan mentioned the same [1].

[1] 
https://lore.kernel.org/lkml/20200709161352.GC21059@jcrouse1-lnx.qualcomm.com/

> Ideally, the page table quirk would drive the settings for the TCR,
> and the prot flag
> drives the PTE for the mapping, as is done with the page table walker
> being dma-coherent,
> while buffers are mapped as cacheable based on IOMMU_CACHE. Thoughts?
> 

Right, mixing the two is not correct. Will's suggestion for a new prot
flag sounds good to me, I will work on that.

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member
of Code Aurora Forum, hosted by The Linux Foundation