Message-ID: <jzzdeczuyraup2zrspl6b74muf3bly2a3acejfftcldfmz4ekk@s5mcbeim34my>
Date: Mon, 25 Aug 2025 12:41:07 -0700
From: Shakeel Butt <shakeel.butt@...ux.dev>
To: zhongjinji <zhongjinji@...or.com>
Cc: mhocko@...e.com, rientjes@...gle.com, akpm@...ux-foundation.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org, tglx@...utronix.de,
liam.howlett@...cle.com, lorenzo.stoakes@...cle.com, liulu.liu@...or.com,
feng.han@...or.com, cgroups@...r.kernel.org
Subject: Re: [PATCH v5 1/2] mm/oom_kill: Do not delay oom reaper when the
victim is frozen
+cgroups
On Mon, Aug 25, 2025 at 09:38:54PM +0800, zhongjinji wrote:
> The OOM reaper can quickly reap a process's memory when the system
> encounters OOM, helping the system recover. If the victim process is
> frozen and cannot be unfrozen in time, the reaper delayed by two seconds
> will cause the system to fail to recover quickly from the OOM state.
>
> When an OOM occurs, if the victim is not unfrozen, delaying the OOM reaper
> will keep the system in a bad state for two seconds. Before scheduling the
> oom_reaper task, check whether the victim is in a frozen state. If the
> victim is frozen, do not delay the OOM reaper.
>
> Signed-off-by: zhongjinji <zhongjinji@...or.com>
> ---
> mm/oom_kill.c | 40 +++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 39 insertions(+), 1 deletion(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 25923cfec9c6..4b4d73b1e00d 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -683,6 +683,41 @@ static void wake_oom_reaper(struct timer_list *timer)
>  	wake_up(&oom_reaper_wait);
> }
>
> +/*
> + * When the victim is frozen, the OOM reaper should not be delayed, because
> + * if the victim cannot be unfrozen promptly, it may block the system from
> + * quickly recovering from the OOM state.
> + */
> +static bool should_delay_oom_reap(struct task_struct *tsk)
> +{
> +	struct mm_struct *mm = tsk->mm;
> +	struct task_struct *p;
> +	bool ret;
> +
On cgroup v2, shouldn't READ_ONCE(tsk->frozen) be enough instead of the
mm check and the checks inside for_each_process()?
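
To illustrate the question above, here is a minimal userspace sketch of the
simplified check. It is only a model, not kernel code: struct task_model, its
frozen field and READ_ONCE_BOOL() are hypothetical stand-ins for task_struct,
tsk->frozen and READ_ONCE(). Whether this single check is actually sufficient
is the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's READ_ONCE(): one volatile load. */
#define READ_ONCE_BOOL(x) (*(const volatile bool *)&(x))

/* Hypothetical, stripped-down model of task_struct (assumption). */
struct task_model {
	bool frozen;	/* models tsk->frozen */
};

/*
 * Delay the OOM reaper only when the victim is not frozen.
 * No mm check and no walk over other processes sharing the mm.
 */
static bool should_delay_oom_reap(struct task_model *tsk)
{
	assert(tsk != NULL);
	return !READ_ONCE_BOOL(tsk->frozen);
}
```

In this model a frozen victim gets no delay, so the reaper timer would fire
immediately, while a running victim keeps the usual OOM_REAPER_DELAY.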
> +	if (!mm)
> +		return true;
> +
> +	if (!frozen(tsk))
> +		return true;
> +
> +	if (atomic_read(&mm->mm_users) <= 1)
> +		return false;
> +
> +	rcu_read_lock();
> +	for_each_process(p) {
> +		if (!process_shares_mm(p, mm))
> +			continue;
> +		if (same_thread_group(tsk, p))
> +			continue;
> +		ret = !frozen(p);
> +		if (ret)
> +			break;
> +	}
> +	rcu_read_unlock();
> +
> +	return ret;
> +}
> +
> /*
> * Give the OOM victim time to exit naturally before invoking the oom_reaping.
> * The timers timeout is arbitrary... the longer it is, the longer the worst
> @@ -694,13 +729,16 @@ static void wake_oom_reaper(struct timer_list *timer)
> #define OOM_REAPER_DELAY (2*HZ)
> static void queue_oom_reaper(struct task_struct *tsk)
> {
> +	bool delay;
> +
>  	/* mm is already queued? */
>  	if (test_and_set_bit(MMF_OOM_REAP_QUEUED, &tsk->signal->oom_mm->flags))
>  		return;
>
>  	get_task_struct(tsk);
> +	delay = should_delay_oom_reap(tsk);
>  	timer_setup(&tsk->oom_reaper_timer, wake_oom_reaper, 0);
> -	tsk->oom_reaper_timer.expires = jiffies + OOM_REAPER_DELAY;
> +	tsk->oom_reaper_timer.expires = jiffies + (delay ? OOM_REAPER_DELAY : 0);
>  	add_timer(&tsk->oom_reaper_timer);
> }
>
> --
> 2.17.1
>