Message-ID: <20251117185550.365156-1-kprateek.nayak@amd.com>
Date: Mon, 17 Nov 2025 18:55:45 +0000
From: K Prateek Nayak <kprateek.nayak@....com>
To: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...hat.com>, Vincent Guittot
	<vincent.guittot@...aro.org>, John Stultz <jstultz@...gle.com>, "Johannes
 Weiner" <hannes@...xchg.org>, Suren Baghdasaryan <surenb@...gle.com>,
	<linux-kernel@...r.kernel.org>
CC: Dietmar Eggemann <dietmar.eggemann@....com>, Steven Rostedt
	<rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman
	<mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, K Prateek Nayak
	<kprateek.nayak@....com>
Subject: [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution

When booting into a kernel with CONFIG_SCHED_PROXY_EXEC and CONFIG_PSI,
an inconsistent task state warning was noticed soon after boot,
similar to:

    psi: inconsistent task state! task=... cpu=... psi_flags=4 clear=0 set=4

On analysis, the following sequence of events was found to be the cause
of the splat:

o Blocked task is retained on the runqueue.
o psi_sched_switch() sees task_on_rq_queued() and retains the runnable
  signals for the task.
o Task blocks later via proxy_deactivate() but psi_dequeue() doesn't
  adjust the PSI flags since DEQUEUE_SLEEP is set, expecting
  psi_sched_switch() to fix the signals.
o The blocked task is woken up with the PSI state still reflecting that
  the task is runnable (TSK_RUNNING) leading to the splat.
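
The sequence above can be replayed in a small user-space model. This is
only a sketch: the flag values are assumed to match
include/linux/psi_types.h in recent kernels, and struct model_task,
model_psi_flags_change() and model_replay_first_splat() are hypothetical
names that only imitate the consistency check behind the warning, not
the kernel code itself.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed to mirror the kernel's PSI task state bits. */
#define TSK_RUNNING	(1u << 2)
#define TSK_ONCPU	(1u << 4)

/* Hypothetical stand-in for the PSI state kept per task. */
struct model_task {
	unsigned int psi_flags;
};

/*
 * Apply a flag change. The change is inconsistent if a bit to be set
 * is already set, or a bit to be cleared is not currently set.
 * Returns true when the change was consistent.
 */
static bool model_psi_flags_change(struct model_task *t,
				   unsigned int clear, unsigned int set)
{
	bool consistent = !(t->psi_flags & set) &&
			  (t->psi_flags & clear) == clear;

	if (!consistent)
		printf("psi: inconsistent task state! psi_flags=%x clear=%x set=%x\n",
		       t->psi_flags, clear, set);

	t->psi_flags &= ~clear;
	t->psi_flags |= set;
	return consistent;
}

/* Replay the reported sequence; returns true if the wakeup trips the check. */
static bool model_replay_first_splat(void)
{
	struct model_task t = { .psi_flags = 0 };

	/* Task becomes runnable. */
	model_psi_flags_change(&t, 0, TSK_RUNNING);

	/*
	 * Task blocks but is retained on the runqueue: the switch-out path
	 * sees it queued and leaves TSK_RUNNING alone. The later
	 * deactivation with DEQUEUE_SLEEP changes nothing either.
	 */

	/* Wakeup tries to set TSK_RUNNING, which is still set. */
	return !model_psi_flags_change(&t, 0, TSK_RUNNING);
}
```

Calling model_replay_first_splat() prints a line with psi_flags=4
clear=0 set=4, the same values as in the warning above.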


Simply tracking proxy_deactivate() is not enough since the task's
blocked_on relationship can be cleared remotely without acquiring the
runqueue lock, which can force a blocked task to run before a wakeup:
pick_next_task() picks the blocked donor and, since the blocked_on
relationship was cleared remotely, task_is_blocked() returns false,
leading to the task being run on the CPU.

If the task blocks again before it is woken up, psi_sched_switch() will
try to clear the runnable signals (TSK_RUNNING) unconditionally leading
to a different splat similar to:

    psi: inconsistent task state! task=... cpu=... psi_flags=10 clear=14 set=0
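
The flag values in these warnings are printed in hexadecimal without a
prefix, so psi_flags=10 is 0x10 and clear=14 is 0x14. A small decoder
makes the second splat easier to read. This is a sketch: the bit
assignments are assumed to match include/linux/psi_types.h in recent
kernels, and model_decode_psi_flags() is a hypothetical helper, not a
kernel function.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct model_flag_name {
	unsigned int bit;
	const char *name;
};

/* Assumed bit layout of the PSI task state flags. */
static const struct model_flag_name model_psi_flag_names[] = {
	{ 1u << 0, "TSK_IOWAIT" },
	{ 1u << 1, "TSK_MEMSTALL" },
	{ 1u << 2, "TSK_RUNNING" },
	{ 1u << 3, "TSK_MEMSTALL_RUNNING" },
	{ 1u << 4, "TSK_ONCPU" },
};

/* Write a '|'-separated list of the flag names set in @flags into @buf. */
static void model_decode_psi_flags(unsigned int flags, char *buf, size_t len)
{
	size_t i;

	if (len == 0)
		return;
	buf[0] = '\0';

	for (i = 0; i < sizeof(model_psi_flag_names) /
			sizeof(model_psi_flag_names[0]); i++) {
		size_t used = strlen(buf);

		if (!(flags & model_psi_flag_names[i].bit))
			continue;
		/* snprintf() truncates safely if @buf is too small. */
		snprintf(buf + used, len - used, "%s%s",
			 used ? "|" : "", model_psi_flag_names[i].name);
	}

	if (buf[0] == '\0')
		snprintf(buf, len, "0");
}
```

Under these assumptions, 0x10 decodes to TSK_ONCPU and 0x14 to
TSK_RUNNING|TSK_ONCPU: the switch-out path asked to clear TSK_RUNNING
from a task that only had TSK_ONCPU set.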


To get around this, track the complete lifecycle of a blocked donor,
from the point its deactivation is delayed up to the wakeup. When in
blocked/donor state, PSI will consider these tasks similar to delayed
tasks - blocked but migratable.

When ttwu_runnable() finally wakes up the task, or if the donor is
deactivated via proxy_deactivate(), the proxy indicator is cleared to
show that the task is either fully blocked or fully runnable now.
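
The lifecycle described above can be illustrated with a toy user-space
state model. It is not the code from this series: the blocked_donor
field and all model_*() helpers are hypothetical, and it assumes that
"similar to delayed tasks" means a retained donor is accounted as
sleeping when it is switched out.

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's TSK_RUNNING bit. */
#define MODEL_TSK_RUNNING	(1u << 2)

struct model_task {
	unsigned int psi_flags;
	bool on_rq;		/* still queued on the runqueue */
	bool blocked_donor;	/* hypothetical proxy indicator */
};

/* Task blocks but stays on the runqueue as a donor. */
static void model_block_retained(struct model_task *t)
{
	t->blocked_donor = true;
}

/* Switch-out: a dequeued task or a blocked donor counts as sleeping. */
static void model_sched_switch_out(struct model_task *t)
{
	if (!t->on_rq || t->blocked_donor)
		t->psi_flags &= ~MODEL_TSK_RUNNING;
}

/* Donor is taken off the runqueue: it is now fully blocked. */
static void model_proxy_deactivate(struct model_task *t)
{
	t->on_rq = false;
	t->blocked_donor = false;
}

/*
 * Wakeup: the task is fully runnable again. Returns false if
 * TSK_RUNNING was already set, which would be the inconsistent state.
 */
static bool model_wakeup(struct model_task *t)
{
	bool consistent = !(t->psi_flags & MODEL_TSK_RUNNING);

	t->blocked_donor = false;
	t->on_rq = true;
	t->psi_flags |= MODEL_TSK_RUNNING;
	return consistent;
}
```

In this model a task that goes through model_block_retained(),
model_sched_switch_out() and then either model_proxy_deactivate() or
model_wakeup() always reaches the wakeup with TSK_RUNNING clear and the
indicator reset.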

Patches 1 and 2 are cleanups to make life slightly easier when auditing
the implementation and inspecting the debug logs. Patches 3 to 5 implement
the tracking of donor states and a couple of fixes on top.

Series was tested on top of tip:sched/core for a while running
sched-messaging without observing any inconsistent task state warning
and should apply cleanly on top of:

    git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core

at commit 33cf66d88306 ("sched/fair: Proportional newidle balance").

---
K Prateek Nayak (5):
  sched/psi: Make psi stubs consistent for !CONFIG_PSI
  sched/psi: Prepend "0x" to format specifiers when printing PSI flags
  sched/core: Track blocked tasks retained on rq for proxy
  sched/core: Block proxy task on pick when blocked_on is cleared before
    wakeup
  sched/psi: Fix PSI signals of blocked tasks retained for proxy

 include/linux/sched.h |  4 +++
 kernel/sched/core.c   | 59 +++++++++++++++++++++++++++++++++++++++++--
 kernel/sched/psi.c    |  4 +--
 kernel/sched/sched.h  |  2 ++
 kernel/sched/stats.h  |  6 ++---
 5 files changed, 68 insertions(+), 7 deletions(-)


base-commit: 33cf66d88306663d16e4759e9d24766b0aaa2e17
-- 
2.34.1

