lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wj-SqWifNOnZ49zvGP9h06mJ36PDWkcMbfjnt=2gvE+ig@mail.gmail.com>
Date:   Sat, 29 Oct 2022 14:12:15 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Nadav Amit <nadav.amit@...il.com>
Cc:     John Hubbard <jhubbard@...dia.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Jann Horn <jannh@...gle.com>, X86 ML <x86@...nel.org>,
        Matthew Wilcox <willy@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        kernel list <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        jroedel@...e.de, ubizjak@...il.com,
        Alistair Popple <apopple@...dia.com>
Subject: Re: [PATCH 01/13] mm: Update ptep_get_lockless()s comment

On Sat, Oct 29, 2022 at 1:56 PM Nadav Amit <nadav.amit@...il.com> wrote:
>
> On Oct 29, 2022, at 1:15 PM, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
>
> > (b) we could move the "page_remove_rmap()" into the "flush-and-free" path too
> >
> > And (b) would be fairly easy - same model as that dirty bit patch,
> > just a 'do page_remove_rmap too' - except page_remove_rmap() wants the
> > vma as well (and we delay the TLB flush over multiple vma's, so it's
> > not just a "save vma in mmu_gather”).
>
> (b) sounds reasonable and may potentially allow future performance
> improvements (batching, doing stuff without locks).

So the thing is, I think (b) makes sense from a TLB flush standpoint,
but as mentioned, if filesystems _really_ want to feel in control,
none of these locks fundamentally help, because the whole "gup +
set_page_dirty()" situation still exists.

And that fundamentally does not hold any locks between the lookup and
the set_page_dirty(), and fundamentally doesn't stop an rmap from
happening in between, so while the zap_page_range() and dirty TLB case
case be dealt with that way, you don't actually fix any fundamental
issues.

Now, neither does my (c) case on its own, but as John Hubbard alluded
to, *if* we had some sane serialization for a struct page, maybe that
could then be used to at least avoid the issue with "rmap no longer
exists", and make the filesystem handling case a bit better.

IOW, I really do think that not only is (a) the current solution, it's
the *correct* solution.

But it is possible that (c) could then be used as a way to make (a)
more palatable to filesystems, in that at least then there would be
some way to serialize with "set_page_dirty()".

But (b) cannot be used for that - because GUP fundamentally breaks
that rmap association.

It's all nasty.

                Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ