lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wh8oi0qQtYDFTfm7d1s5C8mG7ig=NfzGWt4zbjXMzcdqQ@mail.gmail.com>
Date:   Thu, 27 Oct 2022 13:31:22 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Nadav Amit <nadav.amit@...il.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Jann Horn <jannh@...gle.com>,
        John Hubbard <jhubbard@...dia.com>, x86@...nel.org,
        willy@...radead.org, Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Andrea Arcangeli <aarcange@...hat.com>,
        kirill.shutemov@...ux.intel.com, jroedel@...e.de, ubizjak@...il.com
Subject: Re: [PATCH 01/13] mm: Update ptep_get_lockless()s comment

On Thu, Oct 27, 2022 at 1:15 PM Nadav Amit <nadav.amit@...il.com> wrote:
>
> I think it might be easier to come up with new rules instead of phrasing the
> existing ones.

I'm ok with that, but I think you are missing a very important issue:
all the cases where we can short-circuit TLB invalidations *entirely*.

You don't mention those at all.

Those optimizations are *very* important. Process exit is one of the
most performance-critical pieces of code in the kernel on some loads,
because a lot of traditional unix loads have a *ton* of small
fork/exec/exit sequences, and the whole "do just one TLB flush" was at
least historically quite a big deal.

So one very big issue here is when zap_page_tables() can end up
skipping TLB flushes entirely, because nobody cares.

And no, the fix is not to turn it into some "just increment a
generation number".

We want to avoid *even that* cost for the whole "we don't actually
need a TLB flush at all, until we actually free the pages".

So there are two levels of tlb flush optimizations

 (a) avoiding them entirely in the first place

 (b) the whole "once you have to flush, keep track of lazy modes and
TLB generations, and flush ranges"

And honestly, I think you ignored (a), and that's where we do exactly
those kinds of "this case doesn't need to flush AT ALL" things.

So when you say

>       The thing I like about this scheme
> the most is that it avoids relying on almost all the OS data-structures
> (e.g., PageAnon()), making it much easier to grasp.

I think it's because you've ignored a big part of the whole issue.

               Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ