Date:   Thu, 30 Mar 2017 09:55:32 +0200
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Joonsoo Kim <js1304@...il.com>,
        Andrea Arcangeli <aarcange@...hat.com>
Cc:     Joonsoo Kim <iamjoonsoo.kim@....com>,
        Michal Hocko <mhocko@...nel.org>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Linux Memory Management List <linux-mm@...ck.org>,
        Mel Gorman <mgorman@...e.de>, Xishi Qiu <qiuxishi@...wei.com>,
        Toshi Kani <toshi.kani@....com>, xieyisheng1@...wei.com,
        slaoub@...il.com, Zhang Zhen <zhenzhang.zhang@...wei.com>,
        Reza Arbab <arbab@...ux.vnet.ibm.com>,
        Yasuaki Ishimatsu <yasu.isimatu@...il.com>,
        Tang Chen <tangchen@...fujitsu.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        David Rientjes <rientjes@...gle.com>,
        Daniel Kiper <daniel.kiper@...cle.com>,
        Igor Mammedov <imammedo@...hat.com>,
        Andi Kleen <ak@...ux.intel.com>
Subject: Re: ZONE_NORMAL vs. ZONE_MOVABLE

On 03/20/2017 07:33 AM, Joonsoo Kim wrote:
>> The fact sticky movable pageblocks aren't ideal for CMA doesn't mean
>> they're not ideal for memory hotunplug though.
>>
>> With CMA there's no point in having the sticky movable pageblocks
>> scattered around and it's purely a misfeature to use sticky movable
>> pageblocks because you need the whole CMA area contiguous hence a
>> ZONE_CMA is ideal.
> No. CMA ranges could be registered many times for each devices and they
> could be scattered due to device's H/W limitation. So, current implementation
> in kernel, MIGRATE_CMA pageblocks, are scattered sometimes.
> 
>> As opposed with memory hotplug the sticky movable pageblocks would
>> allow the kernel to satisfy the current /sys API and they would
>> provide no downside unlike in the CMA case where the size of the
>> allocation is unknown.
> No, same downside also exists in this case. Downside is not related to the case
> that device uses that range. It is related to VM management to this range and
> problems are the same. For example, with sticky movable pageblock, we need to
> subtract number of freepages in sticky movable pageblock when watermark is
> checked for non-movable allocation and it causes some problems.

Agreed. Right now for CMA we have to account NR_FREE_CMA_PAGES (the
number of free pages within MIGRATE_CMA pageblocks), which brings in all
those hooks and other trouble to keep the accounting precise (there used
to be various races in there). This goes against the rest of the page
grouping by mobility design, which, for performance reasons, wasn't
meant to be precise (e.g. when you change a pageblock's type and move
its pages between freelists, any pcpu cached pages are left on their
previous type's list).

We also can't ignore this accounting, because the watermark check could
then pass for e.g. an UNMOVABLE allocation, which would proceed to find
that the only free pages available are within MIGRATE_CMA (or
sticky-movable) pageblocks, which it's not allowed to fall back to. If
we only then went reclaiming, the zone balance checks would also
consider the zone balanced, even though unmovable allocations would
still not be possible.

Even with this extra accounting, things are not perfect, because reclaim
doesn't guarantee freeing the pages in the right pageblocks, so we can
easily overreclaim. That's mainly why I agreed that ZONE_CMA should be
better than the current implementation, and I'm skeptical about the
sticky-movable pageblock idea. Note the conversion to node-lru reclaim
has changed things somewhat, as we can't reclaim a single zone anymore,
but the accounting troubles remain.
