[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <7f2d853b9d6cf5b4781810ce86dc8ceb@codeaurora.org>
Date: Wed, 23 Jun 2021 19:13:01 +0530
From: Sai Prakash Ranjan <saiprakash.ranjan@...eaurora.org>
To: Robin Murphy <robin.murphy@....com>
Cc: Will Deacon <will@...nel.org>, Joerg Roedel <joro@...tes.org>,
Thierry Reding <treding@...dia.com>,
linux-arm-msm@...r.kernel.org,
Douglas Anderson <dianders@...omium.org>,
linux-kernel@...r.kernel.org, iommu@...ts.linux-foundation.org,
linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCHv2 1/3] iommu/io-pgtable: Add a quirk to use
tlb_flush_all() for partial walk flush
Hi Robin,
On 2021-06-23 00:07, Robin Murphy wrote:
> On 2021-06-22 15:27, Sai Prakash Ranjan wrote:
>> Hi Robin,
>>
>> On 2021-06-22 17:41, Robin Murphy wrote:
>>> On 2021-06-22 08:11, Sai Prakash Ranjan wrote:
>>>> Hi Robin,
>>>>
>>>> On 2021-06-21 21:15, Robin Murphy wrote:
>>>>> On 2021-06-18 03:51, Sai Prakash Ranjan wrote:
>>>>>> Add a quirk IO_PGTABLE_QUIRK_TLB_INV_ALL to invalidate entire
>>>>>> context
>>>>>> with tlb_flush_all() callback in partial walk flush to improve
>>>>>> unmap
>>>>>> performance on select few platforms where the cost of
>>>>>> over-invalidation
>>>>>> is less than the unmap latency.
>>>>>
>>>>> I still think this doesn't belong anywhere near io-pgtable at all.
>>>>> It's a driver-internal decision how exactly it implements a
>>>>> non-leaf
>>>>> invalidation, and that may be more complex than a predetermined
>>>>> boolean decision. For example, I've just realised for SMMUv3 we
>>>>> can't
>>>>> invalidate multiple levels of table at once with a range command,
>>>>> since if we assume the whole thing is mapped at worst-case page
>>>>> granularity we may fail to invalidate any parts which are mapped as
>>>>> intermediate-level blocks. If invalidating a 1GB region (with 4KB
>>>>> granule) means having to fall back to 256K non-range commands, we
>>>>> may
>>>>> not want to invalidate by VA then, even though doing so for a 2MB
>>>>> region is still optimal.
>>>>>
>>>>> It's also quite feasible that drivers might want to do this for
>>>>> leaf
>>>>> invalidations too - if you don't like issuing 512 commands to
>>>>> invalidate 2MB, do you like issuing 511 commands to invalidate
>>>>> 2044KB?
>>>>> - and at that point the logic really has to be in the driver
>>>>> anyway.
>>>>>
>>>>
>>>> Ok I will move this to tlb_flush_walk() functions in the drivers. In
>>>> the previous
>>>> v1 thread, you suggested to make the choice in
>>>> iommu_get_dma_strict() test,
>>>> I assume you meant the test in iommu_dma_init_domain() with a flag
>>>> or was it
>>>> the leaf driver(ex:arm-smmu.c) test of iommu_get_dma_strict() in
>>>> init_domain?
>>>
>>> Yes, I meant literally inside the same condition where we currently
>>> set "pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;" in
>>> arm_smmu_init_domain_context().
>>>
>>
>> Ok got it, thanks.
>>
>>>> I am still a bit confused on where this flag would be? Should this
>>>> be a part
>>>> of struct iommu_domain?
>>>
>>> Well, if you were to rewrite the config with an alternative set of
>>> flush_ops at that point it would be implicit. For a flag, probably
>>> either in arm_smmu_domain or arm_smmu_impl. Maybe a flag would be
>>> less
>>> useful than generalising straight to a "maximum number of by-VA
>>> invalidations it's worth sending individually" threshold value?
>>
>> But then we would still need some flag to make this implementation
>> specific (qcom specific for now) and this threshold would just be
>> another condition although it would have been useful if this was
>> generic enough.
>
> Well, for that approach I assume we could do something like
> special-case 0, or if it's a mutable per-domain value maybe just
> initialise it to SIZE_MAX or whatever such that it would never be
> reached in practice. Whichever way, it was meant to be implied that
> anything at the domain level would still be subject to final
> adjustment by the init_context hook.
>
Ok that should work, so I went ahead with another set of flush_ops
and posted out v3.
Thanks,
Sai
>
>>> It's clear to me what overall shape and separation of responsibility
>>> is
>>> most logical, but beyond that I don't have a particularly strong
>>> opinion on the exact implementation; I've just been chucking ideas
>>> around :)
>>>
>>
>> Your ideas are very informative and useful :)
>>
>> Thanks,
>> Sai
>>
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member
of Code Aurora Forum, hosted by The Linux Foundation
Powered by blists - more mailing lists