Message-ID: <o3gxx5uipm53gccoccjjdvtvv6gkyx4r7qexzdkg3uqtqc7wsv@yd3rqfsy2bpz>
Date: Tue, 28 Oct 2025 12:54:29 -0700
From: Shakeel Butt <shakeel.butt@...ux.dev>
To: Akinobu Mita <akinobu.mita@...il.com>
Cc: linux-kernel@...r.kernel.org, linux-cxl@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: oom-killer not invoked on systems with multiple memory-tiers
Hi Akinobu,
On Wed, Oct 22, 2025 at 10:57:35PM +0900, Akinobu Mita wrote:
> On systems with multiple memory-tiers consisting of DRAM and CXL memory,
> the OOM killer is not invoked properly.
>
> Here's the command to reproduce:
>
> $ stress-ng --oomable -v --memrate 20 --memrate-bytes 10G \
> --memrate-rd-mbs 1 --memrate-wr-mbs 1
>
> The total memory usage is the number of workers (--memrate) multiplied
> by the buffer size per worker (--memrate-bytes), so adjust these so
> that the total exceeds the combined size of the installed DRAM and
> CXL memory.
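>
> For example, a quick sizing sketch using the numbers from the command
> above (adjust for your machine):
>
> 20 workers x 10 GiB = 200 GiB of buffers in total
>
> $ grep MemTotal /proc/meminfo  # total must exceed this (DRAM + onlined CXL)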
>
> If swap is disabled, you can usually expect the OOM killer to terminate
> the stress-ng process when memory usage approaches the installed memory size.
>
> However, if multiple memory-tiers exist (multiple
> /sys/devices/virtual/memory_tiering/memory_tier<N> directories exist),
> and /sys/kernel/mm/numa/demotion_enabled is true and
> /sys/kernel/mm/lru_gen/min_ttl_ms is 0, the OOM killer will not be invoked
> and the system will become inoperable.
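>
> For example, the triggering configuration can be confirmed with:
>
> $ ls -d /sys/devices/virtual/memory_tiering/memory_tier*
> $ cat /sys/kernel/mm/numa/demotion_enabled
> $ cat /sys/kernel/mm/lru_gen/min_ttl_ms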
>
> If /sys/kernel/mm/numa/demotion_enabled is false, or if demotion_enabled
> is true but /sys/kernel/mm/lru_gen/min_ttl_ms is set to a non-zero value
> such as 1000, the OOM killer will be invoked properly.
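>
> For example, either of the following settings makes the OOM killer
> fire again as described above:
>
> $ echo false > /sys/kernel/mm/numa/demotion_enabled
> $ echo 1000 > /sys/kernel/mm/lru_gen/min_ttl_ms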
>
> This issue can be reproduced using NUMA emulation even on systems with
> only DRAM. However, to configure multiple memory-tiers using fake nodes,
> you must apply the attached patch.
>
> You can create two fake memory-tiers by booting a single-node system with
> the following boot options:
>
> numa=fake=2
> numa_emulation.default_dram=1,0
> numa_emulation.read_latency=100,1000
> numa_emulation.write_latency=100,1000
> numa_emulation.read_bandwidth=100000,10000
> numa_emulation.write_bandwidth=100000,10000
>
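> After booting with these options (and with the attached patch
> applied), the two fake nodes and tiers should be visible via, e.g.:
>
> $ numactl -H
> $ ls -d /sys/devices/virtual/memory_tiering/memory_tier*
>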
Thanks for the report. Can you try to repro this with the traditional
LRU, i.e. not MGLRU? I just want to see if this is an MGLRU-only issue
or something more general.
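
For reference, MGLRU can usually be toggled at runtime without a
reboot (assuming a kernel built with CONFIG_LRU_GEN=y):

$ echo n > /sys/kernel/mm/lru_gen/enabled   # disable MGLRU
$ echo y > /sys/kernel/mm/lru_gen/enabled   # re-enable it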