Message-ID: <87pmfgjnpj.fsf@nvdebian.thelocal>
Date:   Wed, 28 Sep 2022 10:59:08 +1000
From:   Alistair Popple <apopple@...dia.com>
To:     Yang Shi <shy828301@...il.com>
Cc:     John Hubbard <jhubbard@...dia.com>,
        "Huang, Ying" <ying.huang@...el.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Zi Yan <ziy@...dia.com>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        Oscar Salvador <osalvador@...e.de>,
        Matthew Wilcox <willy@...radead.org>
Subject: Re: [RFC 2/6] mm/migrate_pages: split unmap_and_move() to _unmap()
 and _move()


Yang Shi <shy828301@...il.com> writes:

> On Tue, Sep 27, 2022 at 1:35 PM John Hubbard <jhubbard@...dia.com> wrote:
>>
>> On 9/26/22 18:51, Huang, Ying wrote:
>> >>> But there might be other cases which may incur deadlock, for example,
>> >>> filesystem writeback IIUC. Some filesystems may lock a bunch of pages
>> >>> then write them back in a batch. The same pages may be on the
>> >>> migration list and they are also dirty and seen by writeback. I'm not
>> >>> sure whether I miss something that could prevent such a deadlock from
>> >>> happening.
>> >>
>> >> I'm not overly familiar with that area but I would assume any filesystem
>> >> code doing this would already have to deal with deadlock potential.
>> >
>> > Thank you very much for pointing this out.  I think the deadlock is a
>> > real issue.  Anyway, we shouldn't forbid other places in kernel to lock
>> > 2 pages at the same time.
>> >
>>
>> I also agree that we cannot make any rules such as "do not lock > 1 page
>> at the same time, elsewhere in the kernel", because it is already
>> happening, for example in page-writeback.c, which locks PAGEVEC_SIZE
>> (15) pages per batch [1].

That's not really the case though. The inner loop of write_cache_pages()
only ever holds one page lock at a time: each page is unlocked, either
directly via the unlock_page() on L2338 (those gotos are amazing) or
indirectly via (*writepage)() on L2359, before the next one is locked.

So there's no deadlock potential there because unlocking any previously
locked page(s) doesn't depend on obtaining the lock for another page.
Unless I've missed something?
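
To illustrate the pattern, here is a rough userspace sketch. It is not
kernel code: struct fake_page and writeback_batch() are made-up names,
and a pthread mutex stands in for the page lock. It only shows that a
lock/write/unlock loop never holds more than one lock at once:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_PAGES 15 /* mirrors the PAGEVEC_SIZE value quoted above */

/* Hypothetical stand-in for struct page: a lock and a dirty flag. */
struct fake_page {
	pthread_mutex_t lock;
	int dirty;
};

static void fake_pages_init(struct fake_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		pthread_mutex_init(&pages[i].lock, NULL);
		pages[i].dirty = 1;
	}
}

/*
 * Lock a page, write it back (or skip it), unlock it, and only then move
 * on to the next one. Returns the largest number of locks that were held
 * at the same time.
 */
static int writeback_batch(struct fake_page *pages, size_t n)
{
	int held = 0, max_held = 0;

	assert(pages != NULL);
	for (size_t i = 0; i < n; i++) {
		pthread_mutex_lock(&pages[i].lock);
		if (++held > max_held)
			max_held = held;

		if (pages[i].dirty)
			pages[i].dirty = 0; /* stands in for (*writepage)() */

		pthread_mutex_unlock(&pages[i].lock);
		held--;
	}
	return max_held;
}
```

Because the previous lock is always dropped before the next one is
taken, there is no point at which this loop waits for a lock while
holding another.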

>> The only deadlock prevention convention that I see is the convention of
>> locking the pages in order of ascending address. That only helps if
>> everything does it that way, and migrate code definitely does not.
>> However...I thought that up until now, at least, the migrate code relied
>> on trylock (which can fail, and so migration can fail, too), to avoid
>> deadlock. Is that changing somehow, I didn't see it?
>
> The trylock is used by async mode which does try to avoid blocking.
> But sync mode does use lock. The current implementation of migration
> does migrate one page at a time, so it is not a problem.
>
>>
>>
>> [1] https://elixir.bootlin.com/linux/latest/source/mm/page-writeback.c#L2296
>>
>> thanks,
>>
>> --
>> John Hubbard
>> NVIDIA
>>
>> > The simplest solution is to batch page migration only if mode ==
>> > MIGRATE_ASYNC.  Then we may consider to fall back to non-batch mode if
>> > mode != MIGRATE_ASYNC and trylock page fails.
>> >
>>
>>
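
For reference, the fallback idea quoted above could look roughly like
the sketch below. This is a userspace analogy under stated assumptions,
not the proposed patch: the fake_mode/fake_path enums and all function
names are invented for illustration, pthread mutexes stand in for page
locks, and the actual unmap/move work is left as comments. The batch is
only ever taken with trylock; a blocking lock is only used when no other
lock is held.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum fake_mode { FAKE_ASYNC, FAKE_SYNC };
enum fake_path { PATH_BATCHED, PATH_ONE_BY_ONE, PATH_FAILED };

/* Try to take every lock without blocking; on failure drop all of them. */
static int trylock_batch(pthread_mutex_t *locks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (pthread_mutex_trylock(&locks[i]) != 0) {
			while (i > 0)
				pthread_mutex_unlock(&locks[--i]);
			return 0;
		}
	}
	return 1;
}

static void unlock_batch(pthread_mutex_t *locks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		pthread_mutex_unlock(&locks[i]);
}

static enum fake_path migrate_sketch(pthread_mutex_t *locks, size_t n,
				     enum fake_mode mode)
{
	assert(locks != NULL);

	if (trylock_batch(locks, n)) {
		/* ... unmap and move the whole batch here ... */
		unlock_batch(locks, n);
		return PATH_BATCHED;
	}

	/* Async callers must not block, so they just give up. */
	if (mode == FAKE_ASYNC)
		return PATH_FAILED;

	/* Sync fallback: block on one lock at a time, holding no other. */
	for (size_t i = 0; i < n; i++) {
		pthread_mutex_lock(&locks[i]);
		/* ... unmap and move this single page here ... */
		pthread_mutex_unlock(&locks[i]);
	}
	return PATH_ONE_BY_ONE;
}
```

The trade-off is that a contended batch does the trylock pass for
nothing before falling back to the one-at-a-time path.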
