lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a485f4af-8196-4b25-9f15-25aebe390904@arm.com>
Date:   Thu, 16 Nov 2023 09:34:56 +0000
From:   Ryan Roberts <ryan.roberts@....com>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        Ard Biesheuvel <ardb@...nel.org>,
        Marc Zyngier <maz@...nel.org>,
        Oliver Upton <oliver.upton@...ux.dev>,
        James Morse <james.morse@....com>,
        Suzuki K Poulose <suzuki.poulose@....com>,
        Zenghui Yu <yuzenghui@...wei.com>,
        Andrey Ryabinin <ryabinin.a.a@...il.com>,
        Alexander Potapenko <glider@...gle.com>,
        Andrey Konovalov <andreyknvl@...il.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Vincenzo Frascino <vincenzo.frascino@....com>,
        Anshuman Khandual <anshuman.khandual@....com>,
        Matthew Wilcox <willy@...radead.org>,
        Yu Zhao <yuzhao@...gle.com>,
        Mark Rutland <mark.rutland@....com>,
        David Hildenbrand <david@...hat.com>,
        Kefeng Wang <wangkefeng.wang@...wei.com>,
        John Hubbard <jhubbard@...dia.com>, Zi Yan <ziy@...dia.com>,
        linux-arm-kernel@...ts.infradead.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 01/14] mm: Batch-copy PTE ranges during fork()

On 15/11/2023 21:37, Andrew Morton wrote:
> On Wed, 15 Nov 2023 16:30:05 +0000 Ryan Roberts <ryan.roberts@....com> wrote:
> 
>> However, the primary motivation for this change is to reduce the number
>> of tlb maintenance operations that the arm64 backend has to perform
>> during fork
> 
> Do you have a feeling for how much performance improved due to this? 

The commit log for patch 13 (the one which implements ptep_set_wrprotects() for
armt64) has performance numbers for a fork() microbenchmark with/without the
optimization:

---8<---

I see huge performance regression when PTE_CONT support was added, then
the regression is mostly fixed with the addition of this change. The
following shows regression relative to before PTE_CONT was enabled
(bigger negative value is bigger regression):

|   cpus |   before opt |   after opt |
|-------:|-------------:|------------:|
|      1 |       -10.4% |       -5.2% |
|      8 |       -15.4% |       -3.5% |
|     16 |       -38.7% |       -3.7% |
|     24 |       -57.0% |       -4.4% |
|     32 |       -65.8% |       -5.4% |

---8<---

Note that's running on Ampere Altra, where TLBI tends to have high cost.

> 
> Are there other architectures which might similarly benefit?  By
> implementing ptep_set_wrprotects(), it appears.  If so, what sort of
> gains might they see?

The rationale for this is to reduce expense for arm64 to manage
contpte-mappings. If other architectures support contpte-mappings then they
could benefit from this API for the same reasons that arm64 benefits. I have a
vague understanding that riscv has a similar concept to the arm64's contiguous
bit, so perhaps they are a future candidate. But I'm not familiar with the
details of the riscv feature so couldn't say whether they would be likely to see
the same level of perf improvement as arm64.

Thanks,
Ryan


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ