Message-ID: <73b1957d0898a937e5e88c1a469352ea@codeaurora.org>
Date: Mon, 11 Jan 2021 10:08:08 +0530
From: Sai Prakash Ranjan <saiprakash.ranjan@...eaurora.org>
To: isaacm@...eaurora.org
Cc: Will Deacon <will@...nel.org>, Rob Clark <robdclark@...il.com>,
Jordan Crouse <jcrouse@...eaurora.org>,
linux-arm-msm@...r.kernel.org, Joerg Roedel <joro@...tes.org>,
linux-kernel@...r.kernel.org,
Akhil P Oommen <akhilpo@...eaurora.org>,
iommu@...ts.linux-foundation.org,
Robin Murphy <robin.murphy@....com>,
linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH] iommu/io-pgtable-arm: Allow non-coherent masters to use system cache
On 2021-01-08 23:39, isaacm@...eaurora.org wrote:
> On 2021-01-07 21:47, Sai Prakash Ranjan wrote:
>> On 2021-01-07 22:27, isaacm@...eaurora.org wrote:
>>> On 2021-01-06 03:56, Will Deacon wrote:
>>>> On Thu, Dec 24, 2020 at 12:10:07PM +0530, Sai Prakash Ranjan wrote:
>>>>> commit ecd7274fb4cd ("iommu: Remove unused IOMMU_SYS_CACHE_ONLY flag")
>>>>> removed unused IOMMU_SYS_CACHE_ONLY prot flag and along with it went
>>>>> the memory type setting required for the non-coherent masters to use
>>>>> system cache. Now that system cache support for GPU is added, we will
>>>>> need to mark the memory as normal sys-cached for GPU to use system cache.
>>>>> Without this, the system cache lines are not allocated for GPU. We use
>>>>> the IO_PGTABLE_QUIRK_ARM_OUTER_WBWA quirk instead of a page protection
>>>>> flag as the flag cannot be exposed via DMA api because of no in-tree
>>>>> users.
>>>>>
>>>>> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@...eaurora.org>
>>>>> ---
>>>>> drivers/iommu/io-pgtable-arm.c | 3 +++
>>>>> 1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>>>>> index 7c9ea9d7874a..3fb7de8304a2 100644
>>>>> --- a/drivers/iommu/io-pgtable-arm.c
>>>>> +++ b/drivers/iommu/io-pgtable-arm.c
>>>>> @@ -415,6 +415,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
>>>>>  		else if (prot & IOMMU_CACHE)
>>>>>  			pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
>>>>>  				<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
>>>>> +		else if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA)
>>>>> +			pte |= (ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE
>>>>> +				<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
>>>>>  	}
>>>>
>>> While this approach of enabling system cache globally for both page
>>> tables and other buffers works for the GPU usecase, this isn't ideal
>>> for other clients that use system cache. For example, video clients
>>> only want to cache a subset of their buffers in the system cache, due
>>> to the sizing constraint imposed by how much of the system cache they
>>> can use. So, it would be ideal to have a way of expressing the desire
>>> to use the system cache on a per-buffer basis. Additionally, our video
>>> clients use the DMA layer, and since the requirement is for caching in
>>> the system cache to be a per buffer attribute, it seems like we would
>>> have to have a DMA attribute to express this on a per-buffer basis.
>>>
>>
>> I did bring this up initially [1], also where is this video client
>> in upstream? AFAIK, only system cache user in upstream is GPU.
>> We cannot add any DMA attribute unless there is any user upstream
> Right, there wouldn't be an upstream user, which would be problematic,
> but I was thinking of having it so that when video or any of our other
> clients that use this attribute on a per buffer basis upstreams their
> code, it's not too much of a stretch to add the support.
Agreed.
>> as per [2], so when the support for such a client is added, wouldn't
>> ((data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA) || PROT_FLAG)
>> work?
> I don't think that will work, because we currently have clients who use
> the system cache as follows:
> - cache only page tables in the system cache
> - cache only data buffers in the system cache
> - cache both page tables and all buffers in the system cache
> - cache both page tables and some buffers in the system cache
>
> The approach you're suggesting doesn't allow for the last case, as
> caching the page tables in the system cache involves setting
> IO_PGTABLE_QUIRK_ARM_OUTER_WBWA, so we will end up losing the
> flexibility to cache some data buffers in the system cache.
>
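To make the fourth case concrete, here is a minimal C sketch of the objection above. It is not kernel code: the EX_* macros and ex_* functions are hypothetical stand-ins, the bit values are arbitrary, and EX_PROT_SYS_CACHE models the PROT_FLAG placeholder from the quoted expression. It only shows that once the quirk is set for page tables, an OR-based check marks every buffer as system-cached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; bit values are not the kernel's. */
#define EX_QUIRK_OUTER_WBWA (1u << 0) /* models IO_PGTABLE_QUIRK_ARM_OUTER_WBWA */
#define EX_PROT_SYS_CACHE   (1u << 1) /* models the PROT_FLAG placeholder */

/* Combined check: quirk OR prot flag decides for a data buffer. */
static inline bool ex_buf_cached_combined(uint32_t quirks, uint32_t prot)
{
	return (quirks & EX_QUIRK_OUTER_WBWA) || (prot & EX_PROT_SYS_CACHE);
}

/* Split check: the quirk is ignored; only the prot flag decides. */
static inline bool ex_buf_cached_split(uint32_t quirks, uint32_t prot)
{
	(void)quirks;
	return (prot & EX_PROT_SYS_CACHE) != 0;
}

/* Case 4: page tables cached (quirk set), one buffer opts in, one does not. */
static inline void ex_case4_demo(void)
{
	uint32_t quirks = EX_QUIRK_OUTER_WBWA;

	/* Buffer without the flag: cached anyway under the combined check. */
	assert(ex_buf_cached_combined(quirks, 0));
	/* Same buffer under the split check: stays out of the system cache. */
	assert(!ex_buf_cached_split(quirks, 0));
	/* Buffer with the flag: cached under the split check. */
	assert(ex_buf_cached_split(quirks, EX_PROT_SYS_CACHE));
}
```

With the combined check there is no prot value that keeps a buffer out of the system cache while the quirk is set, which is the lost flexibility described above.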
Ah yes, you are right, I believe Jordan mentioned the same [1].

[1] https://lore.kernel.org/lkml/20200709161352.GC21059@jcrouse1-lnx.qualcomm.com/
> Ideally, the page table quirk would drive the settings for the TCR, and
> the prot flag drives the PTE for the mapping, as is done with the page
> table walker being dma-coherent, while buffers are mapped as cacheable
> based on IOMMU_CACHE. Thoughts?
>
Right, mixing the two is not correct. Will's suggestion for a new prot
flag sounds good to me, I will work on that.
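
A rough C sketch of that separation, under stated assumptions: the SK_* names, bit values and enum below are hypothetical and do not come from the kernel or from any posted patch, and SK_PROT_SYS_CACHE stands in for the not-yet-written prot flag. The point is only that the table-walk setting reads the quirk and nothing else, while the per-mapping attribute reads prot and nothing else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define SK_QUIRK_OUTER_WBWA (1u << 0) /* stands in for IO_PGTABLE_QUIRK_ARM_OUTER_WBWA */
#define SK_PROT_CACHE       (1u << 0) /* stands in for IOMMU_CACHE */
#define SK_PROT_SYS_CACHE   (1u << 1) /* hypothetical new prot flag */

/* Memory-attribute choices for a mapping; names are illustrative. */
enum sk_attr { SK_ATTR_NC, SK_ATTR_CACHE, SK_ATTR_INC_OCACHE };

/* Table-walk (TCR-side) cacheability depends only on the quirk. */
static inline bool sk_walk_uses_sys_cache(uint32_t quirks)
{
	return (quirks & SK_QUIRK_OUTER_WBWA) != 0;
}

/* Per-mapping (PTE-side) attribute depends only on prot. */
static inline enum sk_attr sk_prot_to_attr(uint32_t prot)
{
	if (prot & SK_PROT_CACHE)
		return SK_ATTR_CACHE;
	if (prot & SK_PROT_SYS_CACHE)
		return SK_ATTR_INC_OCACHE;
	return SK_ATTR_NC;
}

/* Page tables in the system cache, buffers choosing individually. */
static inline void sk_demo(void)
{
	assert(sk_walk_uses_sys_cache(SK_QUIRK_OUTER_WBWA));
	assert(sk_prot_to_attr(SK_PROT_SYS_CACHE) == SK_ATTR_INC_OCACHE);
	assert(sk_prot_to_attr(0) == SK_ATTR_NC);
}
```

The ordering of the checks in sk_prot_to_attr() mirrors the if/else chain in the quoted hunk, where IOMMU_CACHE is tested first; whether a real prot flag should take precedence over it is left open here.
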
Thanks,
Sai
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation