Message-ID: <ZO5R5i4n2WI2GnKQ@Asurada-Nvidia>
Date: Tue, 29 Aug 2023 13:15:34 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: Robin Murphy <robin.murphy@....com>
CC: <will@...nel.org>, <jgg@...dia.com>, <joro@...tes.org>,
<jean-philippe@...aro.org>, <apopple@...dia.com>,
<linux-kernel@...r.kernel.org>,
<linux-arm-kernel@...ts.infradead.org>, <iommu@...ts.linux.dev>
Subject: Re: [PATCH 1/3] iommu/io-pgtable-arm: Add nents_per_pgtable in
struct io_pgtable_cfg
On Tue, Aug 29, 2023 at 04:37:00PM +0100, Robin Murphy wrote:
> On 2023-08-22 17:42, Nicolin Chen wrote:
> > On Tue, Aug 22, 2023 at 10:19:21AM +0100, Robin Murphy wrote:
> >
> > > > out_free_data:
> > > > @@ -1071,6 +1073,7 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
> > > > ARM_MALI_LPAE_TTBR_ADRMODE_TABLE;
> > > > if (cfg->coherent_walk)
> > > > cfg->arm_mali_lpae_cfg.transtab |= ARM_MALI_LPAE_TTBR_SHARE_OUTER;
> > > > + cfg->nents_per_pgtable = 1 << data->bits_per_level;
> > >
> > > The result of this highly complex and expensive calculation is clearly
> > > redundant with the existing bits_per_level field, so why do we need to
> > > waste space storing when the driver could simply use bits_per_level?
> >
> > bits_per_level is in the private struct arm_lpae_io_pgtable, while
> > drivers can only access struct io_pgtable_cfg. Are you suggesting
> > to move bits_per_level out of the private struct arm_lpae_io_pgtable
> > to the public struct io_pgtable_cfg?
> >
> > Or am I missing another bits_per_level?
>
> Bleh, apologies, I always confuse myself trying to remember the fiddly
> design of io-pgtable data. However, I think this then ends up proving
> the opposite point - the number of pages per table only happens to be a
> fixed constant for certain formats like LPAE, but does not necessarily
> generalise. For instance for a single v7s config it would be 1024 or 256
> or 16 depending on what has actually been unmapped.
>
> The mechanism as proposed implicitly assumes LPAE format, so I still
> think we're better off making that assumption explicit. And at that
> point arm-smmu-v3 can then freely admit it already knows the number is
> simply 1/8th of the domain page size.
Hmm, I am not getting that "1/8th" part, would you mind elaborating?
Also, what we need is really just an arbitrary number for max_tlbi_ops.
And I think it could be independent of the page size, i.e. either a
4K pgsize or a 64K pgsize could use the same max_tlbi_ops number,
because what eventually impacts the latency is the number of loops
of building/issuing the commands.
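To put rough, purely illustrative numbers on that: without
ARM_SMMU_FEAT_RANGE_INV the driver builds one TLBI command per
granule, so unmapping 32MB at a 4K pgsize means 32M / 4K = 8192
commands, while unmapping 512MB at a 64K pgsize also means 8192
commands. The cost a threshold needs to bound is that command
count, not the pgsize itself.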
So, given your point above that nents_per_pgtable isn't as general
as what the MMU tlbflush code relies on, perhaps we could just
decouple max_tlbi_ops from the pgtable and pgsize, and instead
define something like this in the SMMUv3 driver:
/*
 * A request for a large number of TLBI commands could result in a big
 * overhead and latency on SMMUs without ARM_SMMU_FEAT_RANGE_INV. Set
 * a threshold on the number of commands, beyond which the driver
 * switches to a single full-range invalidation command instead.
 */
#define MAX_TLBI_OPS 8192
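And, just to illustrate where such a pgsize-independent threshold
would plug in, here is a rough sketch against the existing
arm_smmu_tlb_inv_range_domain() path (not a real hunk; the exact
check and the fallback call are only assumptions for discussion):

static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
					  size_t granule, bool leaf,
					  struct arm_smmu_domain *smmu_domain)
{
	struct arm_smmu_device *smmu = smmu_domain->smmu;

	/*
	 * Without range invalidation support, one TLBI command is
	 * needed per granule, so past the threshold it is cheaper
	 * to issue a single full ASID/VMID invalidation instead.
	 */
	if (!(smmu->features & ARM_SMMU_FEAT_RANGE_INV) &&
	    size >= (size_t)MAX_TLBI_OPS * granule) {
		arm_smmu_tlb_inv_context(smmu_domain);
		return;
	}

	/* ... otherwise build per-granule (or range) commands as today ... */
}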
Any thoughts?
Thanks
Nicolin