lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJuCfpHrRAQBPD68PFr0wcbV4fYTEYgxFj0AeeOn6W1sszw9xw@mail.gmail.com>
Date:   Tue, 17 Jan 2023 13:05:15 -0800
From:   Suren Baghdasaryan <surenb@...gle.com>
To:     Jann Horn <jannh@...gle.com>
Cc:     Michal Hocko <mhocko@...e.com>, akpm@...ux-foundation.org,
        michel@...pinasse.org, jglisse@...gle.com, vbabka@...e.cz,
        hannes@...xchg.org, mgorman@...hsingularity.net, dave@...olabs.net,
        willy@...radead.org, liam.howlett@...cle.com, peterz@...radead.org,
        ldufour@...ux.ibm.com, laurent.dufour@...ibm.com,
        paulmck@...nel.org, luto@...nel.org, songliubraving@...com,
        peterx@...hat.com, david@...hat.com, dhowells@...hat.com,
        hughd@...gle.com, bigeasy@...utronix.de, kent.overstreet@...ux.dev,
        punit.agrawal@...edance.com, lstoakes@...il.com,
        peterjung1337@...il.com, rientjes@...gle.com,
        axelrasmussen@...gle.com, joelaf@...gle.com, minchan@...gle.com,
        shakeelb@...gle.com, tatashin@...gle.com, edumazet@...gle.com,
        gthelen@...gle.com, gurua@...gle.com, arjunroy@...gle.com,
        soheil@...gle.com, hughlynch@...gle.com, leewalsh@...gle.com,
        posk@...gle.com, linux-mm@...ck.org,
        linux-arm-kernel@...ts.infradead.org,
        linuxppc-dev@...ts.ozlabs.org, x86@...nel.org,
        linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH 18/41] mm/khugepaged: write-lock VMA while collapsing a
 huge page

On Tue, Jan 17, 2023 at 12:28 PM Jann Horn <jannh@...gle.com> wrote:
>
> On Tue, Jan 17, 2023 at 4:25 PM Michal Hocko <mhocko@...e.com> wrote:
> > On Mon 09-01-23 12:53:13, Suren Baghdasaryan wrote:
> > > Protect VMA from concurrent page fault handler while collapsing a huge
> > > page. Page fault handler needs a stable PMD to use PTL and relies on
> > > per-VMA lock to prevent concurrent PMD changes. pmdp_collapse_flush(),
> > > set_huge_pmd() and collapse_and_free_pmd() can modify a PMD, which will
> > > not be detected by a page fault handler without proper locking.
> >
> > I am struggling with this changelog. Maybe because my recollection of
> > the THP collapsing subtleties is weak. But aren't you just trying to say
> > that the current #PF handling and THP collapsing need to be mutually
> > exclusive currently so in order to keep that assumption you have mark
> > the vma write locked?
> >
> > Also it is not really clear to me how that handles other vmas which can
> > share the same thp?
>
> It's not about the hugepage itself, it's about how the THP collapse
> operation frees page tables.
>
> Before this series, page tables can be walked under any one of the
> mmap lock, the mapping lock, and the anon_vma lock; so when khugepaged
> unlinks and frees page tables, it must ensure that all of those either
> are locked or don't exist. This series adds a fourth lock under which
> page tables can be traversed, and so khugepaged must also lock out that one.
>
> There is a codepath in khugepaged that iterates through all mappings
> of a file to zap page tables (retract_page_tables()), which locks each
> visited mm with mmap_write_trylock() and now also does
> vma_write_lock().
>
>
> I think one aspect of this patch that might cause trouble later on, if
> support for non-anonymous VMAs is added, is that retract_page_tables()
> now does vma_write_lock() while holding the mapping lock; the page
> fault handling path would probably take the locks the other way
> around, leading to a deadlock? So the vma_write_lock() in
> retract_page_tables() might have to become a trylock later on.
>
> Related: Please add the new VMA lock to the big lock ordering comments
> at the top of mm/rmap.c. (And maybe later mm/filemap.c, if/when you
> add file VMA support.)

Thanks for the clarifications and the warning. I'll add appropriate
comments and will take this deadlocking scenario into account when
later implementing support for file-backed page faults.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ