[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1952fc7c-fb21-7d0e-661b-afa59b4580e5@redhat.com>
Date: Thu, 7 Oct 2021 19:07:24 +0200
From: David Hildenbrand <david@...hat.com>
To: Nadav Amit <namit@...are.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>, Peter Xu <peterx@...hat.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Andrew Cooper <andrew.cooper3@...rix.com>,
Andy Lutomirski <luto@...nel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Will Deacon <will@...nel.org>, Yu Zhao <yuzhao@...gle.com>,
Nick Piggin <npiggin@...il.com>,
"x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH 2/2] mm/mprotect: do not flush on permission promotion
On 07.10.21 18:16, Nadav Amit wrote:
>
>> On Oct 7, 2021, at 5:13 AM, David Hildenbrand <david@...hat.com> wrote:
>>
>> On 25.09.21 22:54, Nadav Amit wrote:
>>> From: Nadav Amit <namit@...are.com>
>>> Currently, using mprotect() to unprotect a memory region or uffd to
>>> unprotect a memory region causes a TLB flush. At least on x86, as
>>> protection is promoted, no TLB flush is needed.
>>> Add an arch-specific pte_may_need_flush() which tells whether a TLB
>>> flush is needed based on the old PTE and the new one. Implement an x86
>>> pte_may_need_flush().
>>> For x86, PTE protection promotion or changes of software bits does
>>> require a flush, also add logic that considers the dirty-bit. Changes to
>>> the access-bit do not trigger a TLB flush, although architecturally they
>>> should, as Linux considers the access-bit as a hint.
>>
>> Is the added LOC worth the benefit? IOW, do we have some benchmark that really benefits from that?
>
> So you ask whether the added ~10 LOC (net) worth the benefit?
I read "3 files changed, 46 insertions(+), 1 deletion(-)" to optimize
something without proof, so I naturally have to ask. So this is just a
"usually we optimize and show numbers to proof" comment.
>
> Let’s start with the cost of this patch.
>
> If you ask about complexity, I think that it is a rather simple
> patch and documented as needed. Please be more concrete if you
> think otherwise.
It is most certainly added complexity, although documented cleanly.
>
> If you ask about the runtime overhead, my experience is that
> such code, which mostly does bit operations, has negligible cost.
> The execution time of mprotect code, and other similar pieces of
> code, is mostly dominated by walking the page-tables & getting
> the pages (which might require cold or random memory accesses),
> acquiring the locks, and of course the TLB flushes that this
> patch tries to eliminate.
I'm absolutely not concerned about runtime overhead :)
>
> As for the benefit: TLB flush on x86 of a single PTE has an
> overhead of ~200 cycles. If a TLB shootdown is needed, for instance
> on multithreaded applications, this overhead can grow to few
> microseconds or even more, depending on the number of sockets,
> whether the workload runs in a VM (and worse if CPUs are
> overcommitted) and so on.
>
> This overhead is completely unnecessary on many occasions. If
> you run mprotect() to add permissions, or as I noted in my case,
> to do something similar using userfaultfd. Note that the
> potentially unnecessary TLB flush/shootdown takes place while
> you hold the mmap-lock for write in the case of mprotect(),
> thereby potentially preventing other threads from making
> progress during that time.
>
> On my in-development workload it was a considerable overhead
> (I didn’t collect numbers though). Basically, I track dirty
> pages using uffd, and every page-fault that can be easily
> resolved by unprotecting cause a TLB flush/shootdown.
Any numbers would be helpful.
>
> If you want, I will write a microbenchmarks and give you numbers.
> If you look for further optimizations (although you did not indicate
> so), such as doing the TLB batching from do_mprotect_key(),
> (i.e. batching across VMAs), we can discuss it and apply it on
> top of these patches.
I think this patch itself is sufficient if we can show a benefit; I do
wonder if existing benchmarks could already show a benefit, I feel like
they should if this makes a difference. Excessive mprotect() usage
(protect<>unprotect) isn't something unusual.
--
Thanks,
David / dhildenb
Powered by blists - more mailing lists