Message-ID: <CAHbLzkqRyav0fZ5gzaKbkTfGBxkQXTpu0NJz-A9j7UaHhVBxEQ@mail.gmail.com>
Date:   Tue, 27 Sep 2022 13:57:47 -0700
From:   Yang Shi <shy828301@...il.com>
To:     John Hubbard <jhubbard@...dia.com>
Cc:     "Huang, Ying" <ying.huang@...el.com>,
        Alistair Popple <apopple@...dia.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Zi Yan <ziy@...dia.com>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        Oscar Salvador <osalvador@...e.de>,
        Matthew Wilcox <willy@...radead.org>
Subject: Re: [RFC 2/6] mm/migrate_pages: split unmap_and_move() to _unmap()
 and _move()

On Tue, Sep 27, 2022 at 1:35 PM John Hubbard <jhubbard@...dia.com> wrote:
>
> On 9/26/22 18:51, Huang, Ying wrote:
> >>> But there might be other cases which may incur deadlock, for example,
> >>> filesystem writeback IIUC. Some filesystems may lock a bunch of pages
> >>> then write them back in a batch. The same pages may be on the
> >>> migration list and they are also dirty and seen by writeback. I'm not
> >>> sure whether I miss something that could prevent such a deadlock from
> >>> happening.
> >>
> >> I'm not overly familiar with that area but I would assume any filesystem
> >> code doing this would already have to deal with deadlock potential.
> >
> > Thank you very much for pointing this out.  I think the deadlock is a
> > real issue.  Anyway, we shouldn't forbid other places in the kernel
> > from locking 2 pages at the same time.
> >
>
> I also agree that we cannot make any rules such as "do not lock > 1 page
> at the same time, elsewhere in the kernel", because it is already
> happening, for example in page-writeback.c, which locks PAGEVEC_SIZE
> (15) pages per batch [1].
>
> The only deadlock prevention convention that I see is the convention of
> locking the pages in order of ascending address. That only helps if
> everything does it that way, and migrate code definitely does not.
> However...I thought that up until now, at least, the migrate code relied
> on trylock (which can fail, and so migration can fail, too), to avoid
> deadlock. Is that changing somehow? I didn't see it.

The trylock is used only by async mode, which tries to avoid blocking.
Sync mode takes the blocking page lock. The current implementation
migrates one page at a time, so the blocking lock is not a problem there.
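
To make the two conventions discussed here concrete, below is a minimal
user-space sketch in Python. It is only a model, not kernel code: the
Page class and the helpers trylock_batch(), lock_batch_ordered() and
unlock_batch() are hypothetical names, and threading.Lock stands in for
the page lock. It shows why a trylock batch cannot deadlock (it backs
out instead of waiting) and what "lock in ascending order" looks like
for blocking locks.

```python
import threading


class Page:
    """Hypothetical stand-in for a page: an index plus a lock."""

    def __init__(self, index):
        self.index = index
        self.lock = threading.Lock()


def trylock_batch(pages):
    """Async-style: take every lock without blocking, or back out.

    Returns True with all locks held, or False with none held.
    """
    held = []
    for page in pages:
        if not page.lock.acquire(blocking=False):
            # Someone else holds this page; drop what we took so far
            # instead of waiting, so no lock-order cycle can form.
            for h in reversed(held):
                h.lock.release()
            return False
        held.append(page)
    return True


def lock_batch_ordered(pages):
    """Sync-style: block, but always in ascending index order.

    This only prevents deadlock if every other locker uses the
    same order.
    """
    ordered = sorted(pages, key=lambda p: p.index)
    for page in ordered:
        page.lock.acquire()
    return ordered


def unlock_batch(pages):
    """Release the lock of every page in the given list."""
    for page in pages:
        page.lock.release()
```

In this model a failed trylock_batch() corresponds to a failed
migration attempt, which the caller may retry later.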

>
>
> [1] https://elixir.bootlin.com/linux/latest/source/mm/page-writeback.c#L2296
>
> thanks,
>
> --
> John Hubbard
> NVIDIA
>
> > The simplest solution is to batch page migration only if mode ==
> > MIGRATE_ASYNC.  Then we may consider falling back to non-batch mode if
> > mode != MIGRATE_ASYNC and trylock page fails.
> >
>
>
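
The fallback idea quoted above could look roughly like the following
user-space sketch in Python. It is an illustration under assumptions,
not the proposed patch: MigrateMode, migrate_batch() and the move
callback are hypothetical names, and each page is modeled as a plain
threading.Lock. Pages whose trylock succeeds are handled as a batch.
Pages whose trylock fails are reported as failed in async mode, and in
sync mode are retried one at a time with a blocking lock after the
batch locks have been dropped.

```python
import threading
from enum import Enum


class MigrateMode(Enum):
    """Hypothetical model of the migration modes named in the thread."""
    ASYNC = 0
    SYNC = 1


def migrate_batch(page_locks, mode, move):
    """Migrate pages in a batch, falling back to one-at-a-time.

    page_locks: list of threading.Lock, one per page (model only).
    move: callback taking a page index, called with that lock held.
    Returns the indices of pages that were not migrated.
    """
    batch, deferred = [], []
    for i, lock in enumerate(page_locks):
        if lock.acquire(blocking=False):
            batch.append(i)
        else:
            deferred.append(i)

    # Batched part: all of these locks were taken without waiting.
    for i in batch:
        move(i)
    for i in batch:
        page_locks[i].release()

    if mode is MigrateMode.ASYNC:
        # Async mode never blocks; the caller may retry these later.
        return deferred

    # Sync fallback: block on one page at a time while holding no
    # other page lock, as in non-batched migration.
    for i in deferred:
        with page_locks[i]:
            move(i)
    return []
```

The point of dropping the batch locks before the blocking step is that
the sync path then never waits for one page while holding others.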
