Message-ID: <345a04ad-cf25-4af5-802a-bc8826d37b19@redhat.com>
Date: Wed, 18 Jun 2025 13:54:20 +0200
From: David Hildenbrand <david@...hat.com>
To: Zihuan Zhang <zhangzihuan@...inos.cn>, Michal Hocko <mhocko@...e.com>
Cc: Peter Zijlstra <peterz@...radead.org>, rafael@...nel.org,
 len.brown@...el.com, pavel@...nel.org, kees@...nel.org, mingo@...hat.com,
 juri.lelli@...hat.com, vincent.guittot@...aro.org, dietmar.eggemann@....com,
 rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
 vschneid@...hat.com, akpm@...ux-foundation.org, lorenzo.stoakes@...cle.com,
 Liam.Howlett@...cle.com, vbabka@...e.cz, rppt@...nel.org, surenb@...gle.com,
 linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [RFC PATCH] PM: Optionally block user fork during freeze to
 improve performance

On 18.06.25 13:30, Zihuan Zhang wrote:
> Hi David,
> 
> On 2025/6/16 15:45, David Hildenbrand wrote:
>>
>>>> [...]
>>> In our test scenario, although new processes can indeed be created
>>> during the usleep_range() intervals between freeze iterations, it’s
>>> actually difficult to make the freezer fail outright. This is because
>>> user processes are forcibly frozen: when they return to user space and
>>> check for pending signals, they enter try_to_freeze() and transition
>>> into the refrigerator.
>>>
>>> However, since the scheduler is fair by design, it gives both newly
>>> forked tasks and yet-to-be-frozen tasks a chance to run. This
>>> competition for CPU time can slightly delay the overall freeze process.
>>> While this typically doesn’t lead to failure, it does cause more retries
>>> than necessary, especially under CPU pressure.
>>
>> I think that goes back to my original comment: why are we even
>> allowing fork children to run at all when we are currently freezing
>> all tasks?
>>
>> I would imagine that try_to_freeze_tasks() should force any new
>> processes (forked children) to start in the frozen state directly and
>> not get scheduled in the first place.
>>
> Thanks again for your comments and suggestion.
> 
> We understand the motivation behind your idea: ideally, newly forked
> tasks during freezing should either be immediately frozen or prevented
> from running at all, to avoid unnecessary retries and delays. That makes
> perfect sense.
> 
> However, implementing this seems non-trivial under the current freezer
> model, as it relies on voluntary transitions and lacks a mechanism to
> block forked children from being scheduled.
> 
> Any insights or pointers would be greatly appreciated.

I'm afraid I can't provide too much guidance on scheduler logic.

Apparently we have this freezer_active global that forces existing 
tasks to go through freezing_slow_path().

There, we perform multiple checks, including "pm_freezing && !(p->flags 
& PF_KTHREAD)".

I would have thought that we would want fork()/clone() children created 
while freezing to also result in freezing_slow_path()==true, and to stop 
them from getting scheduled in the first place.
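
To make the check concrete, here is a rough user-space model of it. This 
is only a sketch, not kernel code: struct task, model_fork() and the 
PF_KTHREAD value are made up for the example, plain bools stand in for 
the real globals, and only the one condition quoted above is modelled 
(the real freezing_slow_path() performs further checks).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit for the example; not the value from the kernel headers. */
#define PF_KTHREAD 0x1u

/* Hypothetical stand-in for struct task_struct: only the flags matter here. */
struct task {
	unsigned int flags;
};

/* Plain bools modelling the globals mentioned in the mail. */
bool freezer_active;
bool pm_freezing;

/* Models only the "pm_freezing && !(p->flags & PF_KTHREAD)" check. */
bool freezing_slow_path(const struct task *p)
{
	if (pm_freezing && !(p->flags & PF_KTHREAD))
		return true;
	return false;
}

/* Fast path: tasks only reach the slow path while freezer_active is set. */
bool freezing(const struct task *p)
{
	if (!freezer_active)
		return false;
	return freezing_slow_path(p);
}

/* Assumption for the model: a forked child simply copies the parent's flags. */
struct task model_fork(const struct task *parent)
{
	assert(parent != NULL);
	struct task child = { .flags = parent->flags };
	return child;
}
```

In this model, once freezer_active and pm_freezing are both set, 
freezing() is true for a child returned by model_fork() from a user task 
and false for a task with PF_KTHREAD. The model says nothing about 
keeping such a child off the runqueue, which is the part I would look 
into.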

Again, no scheduler expert, but that's something I would look into.

-- 
Cheers,

David / dhildenb

