linux-kernel - reply: [PATCHv5] mm: skip CMA pages when they are not available

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <a74760bd1d81467db2a03b77d3aef7d3@BJMBX01.spreadtrum.com>
Date: Tue, 13 Aug 2024 09:58:32 +0000
From: 黄朝阳 (Zhaoyang Huang)
	<zhaoyang.huang@...soc.com>
To: Breno Leitao <leitao@...ian.org>
CC: Andrew Morton <akpm@...ux-foundation.org>,
        Matthew Wilcox
	<willy@...radead.org>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Minchan Kim
	<minchan@...nel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Zhaoyang Huang
	<huangzhaoyang@...il.com>,
        王科 (Ke Wang)
	<Ke.Wang@...soc.com>,
        "usamaarif642@...il.com" <usamaarif642@...il.com>,
        "riel@...riel.com" <riel@...riel.com>,
        "hannes@...xchg.org"
	<hannes@...xchg.org>,
        "nphamcs@...il.com" <nphamcs@...il.com>
Subject: reply: [PATCHv5] mm: skip CMA pages when they are not available

>
>On Wed, May 31, 2023 at 10:51:01AM +0800, zhaoyang.huang wrote:
>> From: Zhaoyang Huang <zhaoyang.huang@...soc.com>
>>
>> This patch fixes unproductive reclaiming of CMA pages by skipping them
>> when they are not available for current context. It is arise from
>> bellowing OOM issue, which caused by large proportion of MIGRATE_CMA
>pages among free pages.
>
>Hello,
>
>I've been looking into a problem with high memory pressure causing OOMs in
>some of our workloads, and it seems that this change may have introduced lock
>contention when there is high memory pressure.
>
>I've collected some metrics for my specific workload that suggest this change
>has increased the lruvec->lru_lock waittime-max by 500x and the
>waittime-avg by 20x.
>
>Experiment
>==========
>
>The experiment involved 100 hosts, each with 64GB of memory and a single
>Xeon 8321HC CPU. The experiment ran for over 80 hours.
>
>Half of the hosts (50) were configured with the patch reverted and lock stat
>enabled, while the other half was run against the upstream version.
>All machines had hugetlb_cma=6G set as a command-line argument.
>
>In this context, "upstream" refers to kernel release 6.9 with some minor
>changes that should not impact the results.
>
>Workload
>========
>
>The workload is a Java based application that fully utilized the memory, in fact,
>the JVM runs with `-Xms50735m -Xmx50735m` arguments.
>
>Results:
>=======
>
>A few values from lockstat:
>
>                  waittime-max   waittime-total  waittime-avg
>holdtime-max
>6.9:                    242889      15618873933           715
>17485
>6.9-with-revert:           487        688563299            34
>464
>
>The full data could be seen at:
>https://docs.google.com/spreadsheets/d/1Dl-8ImlE4OZrfKjbyWAIWWuQtgD3f
>wEEl9INaZQZ4e8/edit?usp=sharing
>
>Possible causes:
>================
>
>I've been discussing this with colleagues and we're speculating that the high
>contention might be linked to the fact that CMA regions are now being skipped.
>This could potentially extend the duration of the
>isolate_lru_folios() 'while' loop, resulting in increased pressure on the lock.
>
>However, I want to emphasize that I'm not an expert in this area and I am
>simply sharing the data I collected.
Could you please try below patch which could be helpful

https://lore.kernel.org/linux-mm/CAOUHufa7OBtNHKMhfu8wOOE4f0w3b0_2KzzV7-hrc9rVL8e=iw@mail.gmail.com/