linux-kernel - Re: [RFC 2/6] mm/migrate_pages: split unmap_and_move() to _unmap() and

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHbLzkqw3G5oN8YsPYPzRjYa3WMfSoaOL9Pp54GhOKsGGcXWeg@mail.gmail.com>
Date:   Tue, 27 Sep 2022 13:56:05 -0700
From:   Yang Shi <shy828301@...il.com>
To:     "Huang, Ying" <ying.huang@...el.com>
Cc:     Alistair Popple <apopple@...dia.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Zi Yan <ziy@...dia.com>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        Oscar Salvador <osalvador@...e.de>,
        Matthew Wilcox <willy@...radead.org>
Subject: Re: [RFC 2/6] mm/migrate_pages: split unmap_and_move() to _unmap()
 and _move()

On Mon, Sep 26, 2022 at 6:52 PM Huang, Ying <ying.huang@...el.com> wrote:
>
> Alistair Popple <apopple@...dia.com> writes:
>
> > Yang Shi <shy828301@...il.com> writes:
> >
> >> On Mon, Sep 26, 2022 at 2:37 AM Alistair Popple <apopple@...dia.com> wrote:
> >>>
> >>>
> >>> Huang Ying <ying.huang@...el.com> writes:
> >>>
> >>> > This is a preparation patch to batch the page unmapping and moving
> >>> > for the normal pages and THP.
> >>> >
> >>> > In this patch, unmap_and_move() is split to migrate_page_unmap() and
> >>> > migrate_page_move().  So, we can batch _unmap() and _move() in
> >>> > different loops later.  To pass some information between unmap and
> >>> > move, the original unused newpage->mapping and newpage->private are
> >>> > used.
> >>>
> >>> This looks like it could cause a deadlock between two threads migrating
> >>> the same pages if force == true && mode != MIGRATE_ASYNC as
> >>> migrate_page_unmap() will call lock_page() while holding the lock on
> >>> other pages in the list. Therefore the two threads could deadlock if the
> >>> pages are in a different order.
> >>
> >> It seems unlikely to me since the page has to be isolated from lru
> >> before migration. The isolating from lru is atomic, so the two threads
> >> unlikely see the same pages on both lists.
> >
> > Oh thanks! That is a good point and I agree since lru isolation is
> > atomic the two threads won't see the same pages. migrate_vma_setup()
> > does LRU isolation after locking the page which is why the potential
> > exists there. We could potentially switch that around but given
> > ZONE_DEVICE pages aren't on an lru it wouldn't help much.
> >
> >> But there might be other cases which may incur deadlock, for example,
> >> filesystem writeback IIUC. Some filesystems may lock a bunch of pages
> >> then write them back in a batch. The same pages may be on the
> >> migration list and they are also dirty and seen by writeback. I'm not
> >> sure whether I miss something that could prevent such a deadlock from
> >> happening.
> >
> > I'm not overly familiar with that area but I would assume any filesystem
> > code doing this would already have to deal with deadlock potential.
>
> Thank you very much for pointing this out.  I think the deadlock is a
> real issue.  Anyway, we shouldn't forbid other places in kernel to lock
> 2 pages at the same time.
>
> The simplest solution is to batch page migration only if mode ==
> MIGRATE_ASYNC.  Then we may consider to fall back to non-batch mode if
> mode != MIGRATE_ASYNC and trylock page fails.

Seems like so...

>
> Best Regards,
> Huang, Ying
>
> [snip]