Message-ID: <2b093b93-7fab-be35-59d8-4463c199719a@arm.com>
Date: Tue, 22 Jun 2021 13:11:12 +0100
From: Robin Murphy <robin.murphy@....com>
To: Sai Prakash Ranjan <saiprakash.ranjan@...eaurora.org>
Cc: Will Deacon <will@...nel.org>, Joerg Roedel <joro@...tes.org>,
Thierry Reding <treding@...dia.com>,
linux-arm-msm@...r.kernel.org,
Douglas Anderson <dianders@...omium.org>,
linux-kernel@...r.kernel.org, iommu@...ts.linux-foundation.org,
linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCHv2 1/3] iommu/io-pgtable: Add a quirk to use
tlb_flush_all() for partial walk flush
On 2021-06-22 08:11, Sai Prakash Ranjan wrote:
> Hi Robin,
>
> On 2021-06-21 21:15, Robin Murphy wrote:
>> On 2021-06-18 03:51, Sai Prakash Ranjan wrote:
>>> Add a quirk IO_PGTABLE_QUIRK_TLB_INV_ALL to invalidate entire context
>>> with tlb_flush_all() callback in partial walk flush to improve unmap
>>> performance on select few platforms where the cost of over-invalidation
>>> is less than the unmap latency.
>>
>> I still think this doesn't belong anywhere near io-pgtable at all.
>> It's a driver-internal decision how exactly it implements a non-leaf
>> invalidation, and that may be more complex than a predetermined
>> boolean decision. For example, I've just realised for SMMUv3 we can't
>> invalidate multiple levels of table at once with a range command,
>> since if we assume the whole thing is mapped at worst-case page
>> granularity we may fail to invalidate any parts which are mapped as
>> intermediate-level blocks. If invalidating a 1GB region (with 4KB
>> granule) means having to fall back to 256K non-range commands, we may
>> not want to invalidate by VA then, even though doing so for a 2MB
>> region is still optimal.
>>
>> It's also quite feasible that drivers might want to do this for leaf
>> invalidations too - if you don't like issuing 512 commands to
>> invalidate 2MB, do you like issuing 511 commands to invalidate 2044KB?
>> - and at that point the logic really has to be in the driver anyway.
>>
>
> Ok I will move this to tlb_flush_walk() functions in the drivers. In the
> previous v1 thread, you suggested to make the choice in
> iommu_get_dma_strict() test, I assume you meant the test in
> iommu_dma_init_domain() with a flag or was it the leaf
> driver(ex:arm-smmu.c) test of iommu_get_dma_strict() in init_domain?

Yes, I meant literally inside the same condition where we currently set
"pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;" in
arm_smmu_init_domain_context().
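
As a rough illustration of that placement, here is a userspace mock rather than real kernel code. Every name prefixed with mock_ is hypothetical, as is the impl_prefers_inv_all parameter; the only thing taken from the thread is that the decision sits inside the same non-strict condition that sets the quirk. It assumes the driver picks a different set of flush callbacks at that point instead of carrying a new io-pgtable quirk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a table of TLB flush callbacks; not the kernel's real struct. */
struct mock_flush_ops {
	const char *name;
};

static const struct mock_flush_ops mock_ops_by_va = { "by-va" };
static const struct mock_flush_ops mock_ops_inv_all = { "inv-all" };

/* Hypothetical quirk bit standing in for IO_PGTABLE_QUIRK_NON_STRICT. */
#define MOCK_QUIRK_NON_STRICT (1UL << 0)

/* Stand-in for the page-table config the driver fills in at domain init. */
struct mock_pgtbl_cfg {
	unsigned long quirks;
	const struct mock_flush_ops *tlb;
};

/*
 * Mock of the domain-init step: the over-invalidation choice is made in
 * the same branch that enables non-strict mode, so io-pgtable itself
 * never needs to know about it.
 */
static void mock_init_domain_context(struct mock_pgtbl_cfg *cfg,
				     bool dma_strict,
				     bool impl_prefers_inv_all)
{
	assert(cfg != NULL);

	cfg->quirks = 0;
	cfg->tlb = &mock_ops_by_va;

	if (!dma_strict) {
		cfg->quirks |= MOCK_QUIRK_NON_STRICT;
		if (impl_prefers_inv_all)
			cfg->tlb = &mock_ops_inv_all;
	}
}
```

With this shape a strict domain always keeps the by-VA callbacks, whatever the implementation prefers.
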

> I am still a bit confused on where this flag would be? Should this be a
> part of struct iommu_domain?

Well, if you were to rewrite the config with an alternative set of
flush_ops at that point it would be implicit. For a flag, probably
either in arm_smmu_domain or arm_smmu_impl. Maybe a flag would be less
useful than generalising straight to a "maximum number of by-VA
invalidations it's worth sending individually" threshold value? It's
clear to me what overall shape and separation of responsibility is most
logical, but beyond that I don't have a particularly strong opinion on
the exact implementation; I've just been chucking ideas around :)
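
To make the threshold idea concrete, a minimal standalone sketch follows. The function names and the max_by_va_cmds parameter are made up for illustration and are not an existing driver interface; it also assumes one by-VA command per granule, i.e. no range commands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Number of single-granule by-VA invalidation commands needed to cover
 * `size` bytes, rounding up for a partial trailing granule.
 */
static size_t inv_cmds_needed(size_t size, size_t granule)
{
	assert(granule != 0);
	return (size + granule - 1) / granule;
}

/*
 * Hypothetical policy: fall back to invalidating the whole context once
 * the by-VA command count exceeds a per-implementation threshold.
 */
static bool should_flush_all(size_t size, size_t granule,
			     size_t max_by_va_cmds)
{
	return inv_cmds_needed(size, granule) > max_by_va_cmds;
}
```

With a 4KB granule this gives 512 commands for 2MB, 511 for 2044KB and 256K for 1GB, so a threshold of, say, 512 keeps the 2MB case by-VA and sends the 1GB case to a full invalidation. The same check would apply equally to leaf and non-leaf invalidations.
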
Cheers,
Robin.