Message-ID: <tencent_46288A6EC6007C3C980C22DB856268526209@qq.com>
Date: Wed,  1 Oct 2025 09:15:27 +0800
From: Han Guangjiang <gj.han@...mail.com>
To: peterz@...radead.org
Cc: bsegall@...gle.com,
	dietmar.eggemann@....com,
	fanggeng@...iang.com,
	gj.han@...mail.com,
	hanguangjiang@...iang.com,
	juri.lelli@...hat.com,
	linux-kernel@...r.kernel.org,
	mgorman@...e.de,
	mingo@...hat.com,
	rostedt@...dmis.org,
	vincent.guittot@...aro.org,
	vschneid@...hat.com,
	yangchen11@...iang.com
Subject: Re: [PATCH] sched/fair: Fix DELAY_DEQUEUE issue related to cgroup throttling

>> From: Han Guangjiang <hanguangjiang@...iang.com>
>>
>> When both CPU cgroup and memory cgroup are enabled with parent cgroup
>> resource limits much smaller than child cgroup's, the system frequently
>> hangs with NULL pointer dereference:
>>
> Is this the same issue as here:
>
>   https://lore.kernel.org/all/105ae6f1-f629-4fe7-9644-4242c3bed035@amd.com/T/#u
>
>   ?

Yes. Judging by the changes in that patch, I believe this is the same
issue. When dequeue_entities() is executed on a delay_dequeued task while
the cgroup is being throttled, it returns early and skips the
__block_task() call for the task. This leaves p->on_rq and se->on_rq
inconsistent.

When PI or a scheduler switch then occurs, the second dequeue_entities()
call assumes the task is still queued in the CFS scheduler, when in fact
it no longer is.
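
To make the sequence above concrete, here is a minimal user-space sketch.
It is not kernel code: struct toy_task, struct toy_se and every toy_*
function are hypothetical stand-ins, and the only thing modelled is the
early return that skips the __block_task() step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-ins for sched_entity/task_struct. */
struct toy_se {
    bool on_rq;          /* models se->on_rq */
    bool sched_delayed;  /* models a delay-dequeued entity */
};

struct toy_task {
    bool on_rq;          /* models p->on_rq */
    struct toy_se se;
};

/* Stand-in for __block_task(): clears the task-level flag. */
void toy_block_task(struct toy_task *p)
{
    p->on_rq = false;
}

/*
 * Stand-in for dequeue_entities() on a task. Returns true when the
 * dequeue ran to completion, false when it bailed out because the
 * (modelled) cfs_rq is throttled.
 */
bool toy_dequeue_entities(struct toy_task *p, bool cfs_rq_throttled)
{
    assert(p);

    bool task_delayed = p->se.sched_delayed;

    /* The entity itself is taken off its queue first. */
    p->se.on_rq = false;
    p->se.sched_delayed = false;

    if (cfs_rq_throttled)
        return false;    /* early return: toy_block_task() never runs */

    if (task_delayed)
        toy_block_task(p);

    return true;
}

/* The invariant that the early return breaks in this model. */
bool toy_state_consistent(const struct toy_task *p)
{
    return p->on_rq == p->se.on_rq;
}
```

Calling toy_dequeue_entities() with cfs_rq_throttled set on a delayed
task leaves the task-level flag set while the entity-level flag is
clear, which is the mismatch described above; without throttling both
flags end up clear.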

By the way, I have a question about the hrtick_update() in
dequeue_entities(). Should it be changed to:

dequeue_entities()
{
    ...
    if (p) {
        hrtick_update(rq);
    }
    ...
}

And remove hrtick_update() from dequeue_task_fair()?
As it stands, hrtick_update() is executed twice in this process for
delay-dequeued tasks.

Also, should the return type of dequeue_entities() be changed to
match dequeue_task_fair(), where true means the task was actually
removed from the queue, and false means it was delay dequeued?

Thanks,
Han Guangjiang

