Message-ID: <20230830214958.GA30121@willie-the-truck>
Date:   Wed, 30 Aug 2023 22:49:59 +0100
From:   Will Deacon <will@...nel.org>
To:     Nicolin Chen <nicolinc@...dia.com>
Cc:     Robin Murphy <robin.murphy@....com>, jgg@...dia.com,
        joro@...tes.org, jean-philippe@...aro.org, apopple@...dia.com,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        iommu@...ts.linux.dev
Subject: Re: [PATCH 1/3] iommu/io-pgtable-arm: Add nents_per_pgtable in
 struct io_pgtable_cfg

On Tue, Aug 29, 2023 at 03:15:52PM -0700, Nicolin Chen wrote:
> Meanwhile, by re-looking at Will's commit log:
>     arm64: tlbi: Set MAX_TLBI_OPS to PTRS_PER_PTE
> 
>     In order to reduce the possibility of soft lock-ups, we bound the
>     maximum number of TLBI operations performed by a single call to
>     flush_tlb_range() to an arbitrary constant of 1024.
> 
>     Whilst this does the job of avoiding lock-ups, we can actually be a bit
>     smarter by defining this as PTRS_PER_PTE. Due to the structure of our
>     page tables, using PTRS_PER_PTE means that an outer loop calling
>     flush_tlb_range() for entire table entries will end up performing just a
>     single TLBI operation for each entry. As an example, mremap()ing a 1GB
>     range mapped using 4k pages now requires only 512 TLBI operations when
>     moving the page tables as opposed to 262144 operations (512*512) when
>     using the current threshold of 1024.
> 
> I found that I am actually not quite getting the calculation at the
> end, i.e. the comparison between 512 and 262144 operations.
> 
> For a 4K pgsize setup, MAX_TLBI_OPS is set to 512, calculated from
> 4096 / 8. Then, any VA range >= 2MB will trigger a flush_tlb_all().
> By setting the threshold to 1024, the 2MB size bumps up to 4MB, i.e.
> the condition becomes range >= 4MB.
> 
> So, it seems to me that requesting a 1GB invalidation will trigger
> a flush_tlb_all() in either case of having a 2MB or a 4MB threshold?
> 
> I can get that 262144 is the number of 4K pages in a 1GB range, so
> a per-page invalidation would take 262144 operations if there were
> no threshold replacing it with a full address-space invalidation.
> Yet that wasn't the case, since we had a 4MB threshold from the
> arbitrary MAX_TLBI_OPS of 1024?
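
The threshold arithmetic questioned above can be checked with a small sketch. This is an illustrative model only, not kernel code; the constants mirror the 4K-page arm64 configuration discussed in the thread:

```python
# Illustrative arithmetic only -- mirrors the constants discussed in
# the thread, not actual kernel code.
PAGE_SIZE = 4096   # 4K pages
PTE_SIZE = 8       # bytes per page-table entry on arm64

# New threshold: MAX_TLBI_OPS = PTRS_PER_PTE = PAGE_SIZE / PTE_SIZE
max_tlbi_ops_new = PAGE_SIZE // PTE_SIZE   # 512
# Old threshold: the arbitrary constant 1024
max_tlbi_ops_old = 1024

# Per the thread, flush_tlb_range() falls back to a full flush once
# the range covers at least MAX_TLBI_OPS pages.
threshold_bytes_new = max_tlbi_ops_new * PAGE_SIZE   # 2 MiB
threshold_bytes_old = max_tlbi_ops_old * PAGE_SIZE   # 4 MiB

print(threshold_bytes_new // (1024 * 1024), "MiB")   # prints "2 MiB"
print(threshold_bytes_old // (1024 * 1024), "MiB")   # prints "4 MiB"
```

So, as the question notes, a single 1GB invalidation request exceeds both thresholds; the difference only shows up when the range is flushed in smaller batches.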

I think this is because you can't always batch up the entire range as
you'd like, due to things like locking concerns. For example,
move_page_tables() can end up invalidating only 2MiB at a time, which is
too small to trigger the old threshold, and so you end up invalidating
every single pte individually.
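
That batching effect can be modelled with a short sketch. This is a hypothetical model, assuming (as described in the thread) that flush_tlb_range() falls back to a single full flush once the range reaches MAX_TLBI_OPS pages, and that move_page_tables() flushes one PMD-sized (2MiB) batch at a time:

```python
# Hypothetical model of move_page_tables() invalidating one PMD-sized
# (2 MiB) batch per flush_tlb_range() call; not actual kernel code.
PAGE_SIZE = 4096
PTRS_PER_PTE = 512
BATCH = PTRS_PER_PTE * PAGE_SIZE   # 2 MiB per flush_tlb_range() call
TOTAL = 1 << 30                    # 1 GiB being remapped

def tlbis_per_call(range_bytes, max_tlbi_ops):
    """TLBI operations issued by one flush_tlb_range() call."""
    pages = range_bytes // PAGE_SIZE
    # At or above the threshold the kernel punts to one full flush.
    return 1 if pages >= max_tlbi_ops else pages

calls = TOTAL // BATCH   # 512 calls of 2 MiB each

print(calls * tlbis_per_call(BATCH, 512))    # new threshold: 512
print(calls * tlbis_per_call(BATCH, 1024))   # old threshold: 262144
```

With the new PTRS_PER_PTE threshold, each 2MiB batch hits the threshold exactly and collapses to one operation (512 total); with the old threshold of 1024 it falls below and costs 512 per-page TLBIs per batch (512*512 = 262144), matching the numbers in the commit log.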

Will
