Message-Id: <47678198-C502-47E1-B7C8-8A12352CDA95@gmail.com>
Date:   Sat, 29 Oct 2022 11:05:12 -0700
From:   Nadav Amit <nadav.amit@...il.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Jann Horn <jannh@...gle.com>,
        John Hubbard <jhubbard@...dia.com>, X86 ML <x86@...nel.org>,
        Matthew Wilcox <willy@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        kernel list <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        jroedel@...e.de, ubizjak@...il.com,
        Alistair Popple <apopple@...dia.com>
Subject: Re: [PATCH 01/13] mm: Update ptep_get_lockless()s comment

On Oct 28, 2022, at 5:42 PM, Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> I think the proper fix (or at least _a_ proper fix) would be to
> actually carry the dirty bit along to the __tlb_remove_page() point,
> and actually treat it exactly the same way as the page pointer itself
> - set the page dirty after the TLB flush, the same way we can free the
> page after the TLB flush.
> 
> We could easily hide said dirty bit in the low bits of the
> "batch->pages[]" array or something like that. We'd just have to add
> the 'dirty' argument to __tlb_remove_page_size() and friends.
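
To make sure I read the proposal correctly, here is a minimal userspace sketch of hiding a dirty bit in the low bit of a batched page pointer. Everything in it is hypothetical and only for illustration: struct fake_page stands in for struct page, and tag_page(), tagged_ptr(), tagged_dirty() and model_flush_batch() are made-up names, not existing kernel helpers. It assumes page pointers are at least 2-byte aligned, so bit 0 is free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page; the alignment keeps bit 0 free. */
struct fake_page {
	_Alignas(8) int dirty;
};

#define TAG_DIRTY ((uintptr_t)1)

/* One batch slot: a page pointer with the dirty bit stored in bit 0. */
typedef uintptr_t tagged_page_t;

static inline tagged_page_t tag_page(struct fake_page *page, bool dirty)
{
	uintptr_t v = (uintptr_t)page;

	assert((v & TAG_DIRTY) == 0);	/* assumption: pointer is aligned */
	return v | (dirty ? TAG_DIRTY : 0);
}

static inline struct fake_page *tagged_ptr(tagged_page_t t)
{
	return (struct fake_page *)(t & ~TAG_DIRTY);
}

static inline bool tagged_dirty(tagged_page_t t)
{
	return (t & TAG_DIRTY) != 0;
}

/*
 * Models the point after the TLB flush: only now is the dirty state
 * transferred to the page, right where the page would also be freed.
 */
static void model_flush_batch(tagged_page_t *pages, int nr)
{
	for (int i = 0; i < nr; i++) {
		struct fake_page *page = tagged_ptr(pages[i]);

		if (tagged_dirty(pages[i]))
			page->dirty = 1;
		/* real code would drop the page reference here */
	}
}
```

In this model the zap path would store tag_page(page, pte_was_dirty) in the batch instead of the bare pointer, and the dirtying would happen in model_flush_batch() once the flush is done.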

Thank you for your quick response. I was slow to respond due to jet lag.

Anyhow, I am not sure whether the solution that you propose would work.
Please let me know if my understanding makes sense.

Let’s assume that we do not call set_page_dirty() before we remove the rmap,
but only after the TLB invalidation [*]. Let’s further assume that
shrink_page_list() runs after the page’s rmap is removed and the page is no
longer mapped, but before set_page_dirty() is actually called.

In such a case, shrink_page_list() would consider the page clean, and would
indeed keep the page (since __remove_mapping() would find elevated page
refcount), which appears to give us a chance to mark the page as dirty
later.

However, IIUC, in this case shrink_page_list() might still call
filemap_release_folio() and release the buffers, so calling set_page_dirty()
afterwards - after the actual TLB invalidation took place - would fail.
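
The ordering I am worried about can be written down as a toy model. This is only a sketch of my understanding, not kernel code: struct page_model and the model_*() functions are hypothetical, and the model simply assumes (per my IIUC above) that reclaim releases the buffers of a clean, unmapped page and that a later attempt to dirty the page without buffers fails.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified view of the page state that matters here. */
struct page_model {
	int  refcount;		/* the mmu_gather batch holds one reference */
	bool mapped;		/* rmap still present */
	bool dirty;		/* PG_dirty-like flag */
	bool has_buffers;	/* filesystem buffers attached */
};

/* Zap path, step 1: rmap is removed; dirtying is deferred past the flush. */
static void model_zap_remove_rmap(struct page_model *p)
{
	p->mapped = false;
}

/* Reclaim path. Returns true if the page would be freed. */
static bool model_shrink_page(struct page_model *p)
{
	if (p->mapped || p->dirty)
		return false;

	/* Assumption: a clean, unmapped page gets its buffers released. */
	p->has_buffers = false;

	/* Elevated refcount: removing the mapping fails, the page is kept. */
	return p->refcount <= 1;
}

/* Zap path, step 2 (after the TLB flush). Returns false on failure. */
static bool model_set_page_dirty(struct page_model *p)
{
	if (!p->has_buffers)
		return false;	/* assumption: nothing left to mark dirty */
	p->dirty = true;
	return true;
}
```

Running the three steps in the order model_zap_remove_rmap(), model_shrink_page(), model_set_page_dirty() keeps the page alive because of the elevated refcount, but the final dirtying step fails in this model because the buffers are already gone.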

> Your idea of "do the page_remove_rmap() late instead" would also work,
> but the reason I think just squirrelling away the dirty bit is the
> "proper" fix is that it would get rid of the whole need for
> 'force_flush' in this area entirely. So we'd not only fix that race
> you noticed, we'd actually do so and reduce the number of TLB flushes
> too.

I’m all for reducing the number of TLB flushes, and your solution does sound
better in general. I proposed what I considered to be the path of least
resistance (i.e., least chance of breaking something). I can do what you
proposed, but I am not sure how to deal with the buffers being removed.

One more note: This issue, I think, also affects migrate_vma_collect_pmd().
Alistair recently addressed an issue there, but in my prior feedback to him
I missed this issue.


[*] Note that, for our scenario, it would be pretty much the same if we also
called set_page_dirty() before removing the rmap, but the page was cleaned
before the TLB invalidation had been performed.
