linux-kernel - Re: [PATCH] mm: sort freelist by rank number

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ebea485c-7cce-3a4a-2ac4-7a608efe2844@suse.cz>
Date:   Mon, 3 Aug 2020 17:45:55 +0200
From:   Vlastimil Babka <vbabka@...e.cz>
To:     David Hildenbrand <david@...hat.com>, pullip.cho@...sung.com,
        akpm@...ux-foundation.org
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        hyesoo.yu@...sung.com, janghyuck.kim@...sung.com
Subject: Re: [PATCH] mm: sort freelist by rank number

On 8/3/20 9:57 AM, David Hildenbrand wrote:
> On 03.08.20 08:10, pullip.cho@...sung.com wrote:
>> From: Cho KyongHo <pullip.cho@...sung.com>
>> 
>> LPDDR5 introduces rank switch delay. If three successive DRAM accesses
>> happens and the first and the second ones access one rank and the last
>> access happens on the other rank, the latency of the last access will
>> be longer than the second one.
>> To address this panelty, we can sort the freelist so that a specific
>> rank is allocated prior to another rank. We expect the page allocator
>> can allocate the pages from the same rank successively with this
>> change. It will hopefully improves the proportion of the consecutive
>> memory accesses to the same rank.
> 
> This certainly needs performance numbers to justify ... and I am sorry,
> "hopefully improves" is not a valid justification :)
> 
> I can imagine that this works well initially, when there hasn't been a
> lot of memory fragmentation going on. But quickly after your system is
> under stress, I doubt this will be very useful. Proof me wrong. ;)

Agreed. The implementation of __preferred_rank() seems to be very simple and
optimistic.
I think these systems could perhaps better behave as NUMA with (interleaved)
nodes for each rank, then you immediately have all the mempolicies support etc
to achieve what you need? Of course there's some cost as well, but not the costs
of adding hacks to page allocator core?