Message-ID: <87cy673rk7.fsf@linux.dev>
Date: Tue, 28 Oct 2025 09:43:36 -0700
From: Roman Gushchin <roman.gushchin@...ux.dev>
To: bot+bpf-ci@...nel.org
Cc: akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
ast@...nel.org, surenb@...gle.com, mhocko@...nel.org,
shakeel.butt@...ux.dev, hannes@...xchg.org, andrii@...nel.org,
inwardvessel@...il.com, linux-mm@...ck.org, cgroups@...r.kernel.org,
bpf@...r.kernel.org, martin.lau@...nel.org, song@...nel.org,
memxor@...il.com, tj@...nel.org, daniel@...earbox.net, eddyz87@...il.com,
yonghong.song@...ux.dev, clm@...a.com, ihor.solodrai@...ux.dev
Subject: Re: [PATCH v2 13/23] mm: introduce bpf_out_of_memory() BPF kfunc

bot+bpf-ci@...nel.org writes:
> ```
> commit bce57b9b96e05731a14c7e10a364fdd9b733cda2
> Author: Roman Gushchin <roman.gushchin@...ux.dev>
>
> mm: introduce bpf_out_of_memory() BPF kfunc
>
> This patch introduces bpf_out_of_memory() BPF kfunc to allow BPF programs
> to declare OOM events and trigger the kernel OOM handling mechanism. The
> function accepts a nullable memcg pointer for system-wide or memcg-scoped
> OOMs.
>
>> diff --git a/include/linux/oom.h b/include/linux/oom.h
>> index 721087952..3cbdcd013 100644
>> --- a/include/linux/oom.h
>> +++ b/include/linux/oom.h
>> @@ -21,6 +21,11 @@ enum oom_constraint {
>> CONSTRAINT_MEMCG,
>> };
>>
>> +enum bpf_oom_flags {
>> + BPF_OOM_FLAGS_WAIT_ON_OOM_LOCK = 1 << 0,
>> + BPF_OOM_FLAGS_LAST = 1 << 1,
>> +};
>> +
>> /*
>> * Details of the page allocation that triggered the oom killer that are used to
>> * determine what should be killed.
>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>> index 3c86cd755..d7fca4bf5 100644
>> --- a/mm/oom_kill.c
>> +++ b/mm/oom_kill.c
>
> [ ... ]
>
>> +/**
>> + * bpf_out_of_memory - declare Out Of Memory state and invoke OOM killer
>> + * @memcg__nullable: memcg or NULL for system-wide OOMs
>> + * @order: order of page which wasn't allocated
>> + * @flags: flags
>> + * @constraint_text__nullable: custom constraint description for the OOM report
>> + *
>> + * Declares the Out Of Memory state and invokes the OOM killer.
>> + *
>> + * OOM handlers are synchronized using the oom_lock mutex. If wait_on_oom_lock
>> + * is true, the function will wait on it. Otherwise it bails out with -EBUSY
>> + * if oom_lock is contended.
>> + *
>> + * Generally it's advised to pass wait_on_oom_lock=false for global OOMs
>> + * and wait_on_oom_lock=true for memcg-scoped OOMs.
>> + *
>> + * Returns 1 if the forward progress was achieved and some memory was freed.
>> + * Returns a negative value if an error occurred.
>> + */
>> +__bpf_kfunc int bpf_out_of_memory(struct mem_cgroup *memcg__nullable,
>> + int order, u64 flags)
>> +{
>> + struct oom_control oc = {
>> + .memcg = memcg__nullable,
>> + .order = order,
>> + };
>
> Can bpf_out_of_memory() work correctly for system-wide OOMs with
> uninitialized gfp_mask?
>
> Looking at out_of_memory():
>
> if (!(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc))
> return true;
>
> With gfp_mask=0 and memcg=NULL (system-wide OOM), this evaluates to:
> !(0 & __GFP_FS) && !false = true, causing an early return without
> invoking the OOM killer. The comment in out_of_memory() explains this
> check is about allocation context: "The OOM killer does not compensate
> for IO-less reclaim."
>
> All other callers of out_of_memory() initialize gfp_mask:
> - mm/page_alloc.c: .gfp_mask = gfp_mask
> - mm/memcontrol.c: .gfp_mask = gfp_mask
>
> For memcg OOMs, the code works because is_memcg_oom(oc) is true,
> bypassing the check. But for system-wide OOMs (NULL memcg), won't this
> cause the function to return immediately without killing any process?

This is a good catch! The oom_control initializer must set
.gfp_mask = GFP_KERNEL.
Fixed.
Thanks!
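
For readers following along, here is a minimal userspace sketch of the early-return check discussed above. It is not the kernel code: the `mock_` types, the helper names, and the flag values (`MOCK_GFP_FS`, `MOCK_GFP_KERNEL`) are illustrative assumptions chosen only so that the mock "kernel" mask includes the FS bit, as the real GFP_KERNEL includes __GFP_FS.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins; the numeric values are assumptions, not the kernel's. */
#define MOCK_GFP_FS     0x1u
#define MOCK_GFP_KERNEL (MOCK_GFP_FS | 0x2u)

/* Hypothetical placeholder for struct mem_cgroup. */
struct mock_memcg {
	int id;
};

/* Simplified model of struct oom_control: only the fields discussed here. */
struct mock_oom_control {
	struct mock_memcg *memcg;	/* NULL means a system-wide OOM */
	unsigned int gfp_mask;		/* zero if left out of the initializer */
	int order;
};

/* Models is_memcg_oom(): true when the OOM is scoped to a memcg. */
static bool mock_is_memcg_oom(const struct mock_oom_control *oc)
{
	return oc->memcg != NULL;
}

/*
 * Models the check quoted from out_of_memory(): returns true when the
 * function would return early without invoking the OOM killer.
 */
static bool mock_skips_oom_killer(const struct mock_oom_control *oc)
{
	return !(oc->gfp_mask & MOCK_GFP_FS) && !mock_is_memcg_oom(oc);
}
```

With `.memcg = NULL` and no `.gfp_mask` in the designated initializer, `mock_skips_oom_killer()` returns true, which is the silent early return the review describes. Setting `.gfp_mask = MOCK_GFP_KERNEL`, or passing a non-NULL memcg, makes it return false.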