Message-ID: <e6bf05cb-044c-47a9-3c65-e41b1e42b702@suse.cz>
Date:   Wed, 2 Sep 2020 19:51:45 +0200
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Pavel Tatashin <pasha.tatashin@...een.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linux-mm <linux-mm@...ck.org>
Subject: Re: [PATCH] mm/memory_hotplug: drain per-cpu pages again during
 memory offline

On 9/2/20 5:13 PM, Michal Hocko wrote:
> On Wed 02-09-20 16:55:05, Vlastimil Babka wrote:
>> On 9/2/20 4:26 PM, Pavel Tatashin wrote:
>> > On Wed, Sep 2, 2020 at 10:08 AM Michal Hocko <mhocko@...e.com> wrote:
>> >>
>> >> >
>> >> > Thread#1 - continue
>> >> >          free_unref_page_commit
>> >> >            migratetype = get_pcppage_migratetype(page);
>> >> >               // get old migration type
>> >> >            list_add(&page->lru, &pcp->lists[migratetype]);
>> >> >               // add new page to already drained pcp list
>> >> >
>> >> > Thread#2
>> >> > Never drains pcp again, and therefore gets stuck in the loop.
>> >> >
>> >> > The fix is to try to drain per-cpu lists again after
>> >> > check_pages_isolated_cb() fails.
>> >>
>> >> But this means that the page is not isolated and so it could be reused
>> >> for something else. No?
>> > 
>> > The page is in a movable zone, has zero references, and the section is
>> > isolated (i.e. set_pageblock_migratetype(page, MIGRATE_ISOLATE);) is
>> > set. The page should be offlinable, but it is lost in a pcp list as
>> > that list is never drained again after the first failure to migrate
>> > all pages in the range.
>> 
>> Yeah. To answer Michal's "it could be reused for something else" - yes, somebody
>> could allocate it from the pcplist before we do the extra drain. But then it
>> becomes "visible again" and the loop in __offline_pages() should catch it by
>> scan_movable_pages() - do_migrate_range(). And this time the pageblock is
>> already marked as isolated, so the page (freed by migration) won't end up on the
>> pcplist again.
> 
> So the page block is marked MIGRATE_ISOLATE but the allocation itself
> could be used for non migrateable objects. Or does anything prevent that
> from happening?

In a movable zone, the allocation should not be used for non-migratable
objects. Conversely, if the zone was not ZONE_MOVABLE, the offlining could fail
regardless of this race (and analogously for migrating away from CMA pageblocks).

> We really do depend on isolation to not allow reuse when offlining.

This is not really different from the case where the page on the pcplist was
allocated just a moment before the offlining, and thus the isolation, started.
We ultimately rely on being able to migrate any allocated pages away during the
isolation. This "freeing to pcplists" race doesn't fundamentally change anything
in this regard. We just have to guarantee that pages on pcplists will eventually
be flushed, to make forward progress, and there was a bug in this aspect.
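
To illustrate the forward-progress argument, here is a minimal userspace sketch
of the retry loop. It is a toy model, not kernel code: every toy_* name, the
page states, and the redrain_on_failure flag are made up for illustration, and
the real logic lives in __offline_pages(). It only shows that a page which lands
on an already drained pcplist stays invisible to the isolation check unless the
pcplists are drained again after the check fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES 4

/* Toy model only: where a free page of the offlined range currently sits. */
enum toy_page_state { TOY_ON_PCPLIST, TOY_ON_ISOLATED_FREELIST };

struct toy_range {
	enum toy_page_state state[TOY_NR_PAGES];
};

/* Models a pcplist drain: every page moves to the isolated free list. */
static void toy_drain_pcplists(struct toy_range *r)
{
	for (size_t i = 0; i < TOY_NR_PAGES; i++)
		r->state[i] = TOY_ON_ISOLATED_FREELIST;
}

/* Models the isolation check: succeeds only if no page hides on a pcplist. */
static bool toy_range_isolated(const struct toy_range *r)
{
	for (size_t i = 0; i < TOY_NR_PAGES; i++)
		if (r->state[i] != TOY_ON_ISOLATED_FREELIST)
			return false;
	return true;
}

/* Models the racy free: a page freed with a stale migratetype is put
 * on the pcplist after that list was already drained. */
static void toy_racy_free(struct toy_range *r, size_t idx)
{
	assert(idx < TOY_NR_PAGES);
	r->state[idx] = TOY_ON_PCPLIST;
}

/*
 * Returns the iteration at which the isolation check succeeds,
 * or -1 if it never succeeds within max_iters (the "stuck" case).
 */
int toy_offline(bool redrain_on_failure, int max_iters)
{
	struct toy_range r;

	for (size_t i = 0; i < TOY_NR_PAGES; i++)
		r.state[i] = TOY_ON_PCPLIST;

	toy_drain_pcplists(&r);	/* the single initial drain */
	toy_racy_free(&r, 0);	/* race: page 0 reappears on a pcplist */

	for (int iter = 1; iter <= max_iters; iter++) {
		if (toy_range_isolated(&r))
			return iter;
		if (redrain_on_failure)
			toy_drain_pcplists(&r);	/* the extra drain after a failed check */
	}
	return -1;
}
```

In this model, toy_offline(false, 100) returns -1 (the loop never sees page 0),
while toy_offline(true, 100) returns 2 (one failed check, one extra drain, then
success).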
