Message-ID: <20250728135022.255578-1-gmonaco@redhat.com>
Date: Mon, 28 Jul 2025 15:50:12 +0200
From: Gabriele Monaco <gmonaco@...hat.com>
To: linux-kernel@...r.kernel.org,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
Nam Cao <namcao@...utronix.de>
Cc: Gabriele Monaco <gmonaco@...hat.com>,
Tomas Glozar <tglozar@...hat.com>,
Juri Lelli <jlelli@...hat.com>,
Clark Williams <williams@...hat.com>,
John Kacur <jkacur@...hat.com>
Subject: [PATCH v5 0/9] rv: Add monitors to validate task switch
This series adds three new monitors to the sched collection and replaces
two existing monitors with an extended one:
{tss,sncid} => sts:
Not only prove that switches occur in scheduling context and that
scheduling needs interrupts disabled, but also that each call to the
scheduler disables interrupts to (optionally) switch.
nrp (NEW):
* preemption requires need_resched to be set, and any switch clears it
(includes a non-optimal workaround for /nested/ preemptions)
sssw (NEW):
* suspension requires setting the task to sleepable and, after the
switch occurs, the task requires a wakeup to come back to runnable
opid (NEW):
* waking and need-resched operations occur with interrupts and
preemption disabled or in IRQ without explicitly disabling preemption
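As an illustration of how such a rule can be written down as a
deterministic automaton, here is a minimal user-space sketch of the sssw
idea (set sleepable, then suspend, then wakeup). The state names, event
names and transition table are simplified assumptions made up for this
example; they are not the model shipped in sssw.dot.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified states and events (not the real sssw model). */
enum sssw_state { RUNNABLE, SLEEPABLE, SLEEPING, STATE_MAX };
enum sssw_event { SET_SLEEPABLE, SWITCH_SUSPEND, WAKEUP, EVENT_MAX };

#define INVALID STATE_MAX

/* transition[state][event]: next state, or INVALID if not allowed. */
static const enum sssw_state transition[STATE_MAX][EVENT_MAX] = {
	/*               set_sleepable  switch_suspend  wakeup   */
	[RUNNABLE]  = { SLEEPABLE,     INVALID,        RUNNABLE },
	[SLEEPABLE] = { SLEEPABLE,     SLEEPING,       RUNNABLE },
	[SLEEPING]  = { INVALID,       INVALID,        RUNNABLE },
};

/* Apply one event; on a violation return false and keep the state. */
static bool sssw_handle_event(enum sssw_state *state, enum sssw_event event)
{
	assert(*state < STATE_MAX && event < EVENT_MAX);

	enum sssw_state next = transition[*state][event];

	if (next == INVALID)
		return false;
	*state = next;
	return true;
}
```

In this toy table a suspension is only accepted from SLEEPABLE, so a
wakeup that lands between set_sleepable and the switch makes the
following switch_suspend a violation.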
The series also includes some minor cleanup patches (1-4), tracepoint
changes (6) and preparatory fixes (5) covering some corner cases.
The series is currently based on the tracing rv/for-next tree.
Patch 1 adds da_handle_start_run_event_ to per-task monitors as well
Patch 2 removes a trailing whitespace from the rv tracepoint string
Patch 3 fixes an out-of-bounds memory access in DA tracepoints
Patch 4 adjusts monitors to have minimised Kconfig dependencies
Patch 5 detects race conditions when rv monitors run concurrently and
retries applying the events
Patch 6 adds the need_resched tracepoint and removes unused arguments
from the schedule entry/exit tracepoints
Patch 7 adds the sts monitor to replace tss and sncid
Patch 8 adds the nrp and sssw monitors
Patch 9 adds the opid monitor
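To give an idea of what "detect a race and retry" can look like, here is
a minimal user-space sketch using C11 atomics. It is not the code in
da_monitor.h: the toy automaton, the names (da_apply_event, DA_RACE,
MAX_RETRIES) and the retry limit are all assumptions made up for this
example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical sizes and retry limit, chosen only for the example. */
enum { STATE_MAX = 2, EVENT_MAX = 2, INVALID = -1, MAX_RETRIES = 3 };

/* Toy automaton: event 0 moves 0 -> 1, event 1 moves 1 -> 0. */
static const int next_state[STATE_MAX][EVENT_MAX] = {
	{ 1, INVALID },
	{ INVALID, 0 },
};

enum da_result { DA_OK, DA_VIOLATION, DA_RACE };

/*
 * Compute the next state from a snapshot and publish it only if nobody
 * changed the state in the meantime; otherwise retry from the new value.
 */
static enum da_result da_apply_event(atomic_int *state, int event)
{
	assert(event >= 0 && event < EVENT_MAX);

	int cur = atomic_load(state);

	for (int i = 0; i < MAX_RETRIES; i++) {
		assert(cur >= 0 && cur < STATE_MAX);

		int next = next_state[cur][event];

		if (next == INVALID)
			return DA_VIOLATION;
		/* On failure, cur is updated to the concurrently stored value. */
		if (atomic_compare_exchange_strong(state, &cur, next))
			return DA_OK;
	}
	/* Out of retries: reported separately from a model violation. */
	return DA_RACE;
}
```

Returning a distinct value when the retries are exhausted keeps a lost
race from being counted as a violation of the model.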
NOTES
The nrp and sssw monitors include workarounds for racy conditions:
* A task going to sleep must first set its state to sleepable, but when
a task sleeps on an rtlock, the set_sleepable and wakeup events race
and we don't always see them in the right order:
5d..2. 107.488369: event: 639: sleepable x set_sleepable -> sleepable
4d..5. 107.488369: event: 639: sleepable x wakeup -> running (final)
5d..3. 107.488385: error: 639: switch_suspend not expected in the state running
  wakeup()                  set_state()
    state=RUNNING
                              trace_set_state()
    trace_wakeup()
                              state=SLEEPING
I added a special event (switch_block) but there may be a better way.
Taking a pi_lock in rtlock_slowlock_locked when setting the state to
TASK_RTLOCK_WAIT avoids this race, although this is likely not
something we want to do.
* I consider as preemption any scheduling with preempt==true and assume
this can happen only if need_resched is set.
In practice, however, we may see a preemption where the flag
is not set. This can happen in one specific condition:
  need_resched
  preempt_schedule()
      preempt_schedule_irq()
          __schedule()
  !need_resched
  __schedule()
We start a standard preemption (e.g. from preempt_enable when the flag
is set), an interrupt occurs before we schedule and, on its exit path,
it schedules, which clears the need_resched flag.
When the preempted task runs again, we continue the standard
preemption started earlier, although the flag is no longer set.
I added a workaround to allow the model not to fail in this condition,
by allowing a preemption without need_resched if an interrupt is
received. This workaround might also introduce false negatives.
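The preemption workaround can be pictured with a small user-space
sketch. This is not the nrp automaton from nrp.dot: the struct, the
function names and the "one flag-less preemption per recorded interrupt"
policy are assumptions made up to illustrate the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-task bookkeeping for the example. */
struct nrp_task {
	bool need_resched;	/* set by a need-resched event */
	bool irq_seen;		/* an interrupt was received and not yet used */
};

static void nrp_set_need_resched(struct nrp_task *t)
{
	assert(t != NULL);
	t->need_resched = true;
}

static void nrp_irq_entry(struct nrp_task *t)
{
	assert(t != NULL);
	t->irq_seen = true;
}

/* Returns false if this switch violates the rule. */
static bool nrp_switch(struct nrp_task *t, bool preempt)
{
	bool ok = true;

	assert(t != NULL);
	if (preempt && !t->need_resched) {
		/* Workaround: tolerate a flag-less preemption after an IRQ. */
		ok = t->irq_seen;
		t->irq_seen = false;
	}
	t->need_resched = false;	/* any switch clears the flag */
	return ok;
}
```

With this policy the nested sequence above passes: the first switch
consumes need_resched and the resumed one consumes the recorded
interrupt. An unrelated flag-less preemption that happens after any
interrupt would pass too, which is where false negatives can come from.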
Changes since V4:
* Drop already applied patches for the tools/ directory
* Avoid using smp_processor_id() from tracepoint context
Changes since V3 [1]:
* Fix condition for lines shorter than 100 columns in dot2c.
* Fix Kconfig tooltip for container monitors in rvgen.
* Improve condition not to skip pid-0 in userspace tool.
* Rearrange monitors to reduce needed tracepoints and arguments.
* Separately track errors when DAs run out of retries due to races.
* Fix issue in opid with multiple handlers in the same interrupt.
* Cleanup patches by removing, squashing and reordering.
Changes since RFC2:
* Arrange commits to prevent failed build while bisecting.
* Prevent dot2k-generated files from reaching the column limit. (Nam Cao)
* Rearrange and simplify da_monitor retry on racing events.
* Improve nrp monitor to handle /nested/ preemption on IRQ.
* Added minor patches (6-10).
* Cleanup and rearrange order.
Changes since RFC [2]:
* Remove wakeup tracepoint in try_to_block_task and use a different
flavour of sched_set_state
* Split the large srs monitor into two separate monitors for preemption
and sleep. These no longer have a concept of running task, they just
enforce the requirements for the different types of sched out.
* Restore the snroc monitor to describe the relationship between
generic sched out and sched in.
* Add opid monitor.
* Fix some build errors and cleanup.
[1] - https://lore.kernel.org/lkml/20250715071434.22508-1-gmonaco@redhat.com
[2] - https://lore.kernel.org/lkml/20250404084512.98552-11-gmonaco@redhat.com
Gabriele Monaco (9):
rv: Add da_handle_start_run_event_ to per-task monitors
rv: Remove trailing whitespace from tracepoint string
rv: Use strings in da monitors tracepoints
rv: Adjust monitor dependencies
rv: Retry when da monitor detects race conditions
sched: Adapt sched tracepoints for RV task model
rv: Replace tss and sncid monitors with more complete sts
rv: Add nrp and sssw per-task monitors
rv: Add opid per-cpu monitor
Documentation/trace/rv/monitor_sched.rst | 307 +++++++++++++++---
include/linux/rv.h | 3 +-
include/linux/sched.h | 7 +-
include/rv/da_monitor.h | 131 +++++---
include/trace/events/sched.h | 12 +-
kernel/sched/core.c | 13 +-
kernel/trace/rv/Kconfig | 11 +-
kernel/trace/rv/Makefile | 6 +-
kernel/trace/rv/monitors/{tss => nrp}/Kconfig | 12 +-
kernel/trace/rv/monitors/nrp/nrp.c | 138 ++++++++
kernel/trace/rv/monitors/nrp/nrp.h | 75 +++++
kernel/trace/rv/monitors/nrp/nrp_trace.h | 15 +
kernel/trace/rv/monitors/opid/Kconfig | 19 ++
kernel/trace/rv/monitors/opid/opid.c | 168 ++++++++++
kernel/trace/rv/monitors/opid/opid.h | 104 ++++++
.../sncid_trace.h => opid/opid_trace.h} | 8 +-
kernel/trace/rv/monitors/sched/Kconfig | 1 +
kernel/trace/rv/monitors/sco/sco.c | 4 +-
kernel/trace/rv/monitors/scpd/Kconfig | 2 +-
kernel/trace/rv/monitors/scpd/scpd.c | 4 +-
kernel/trace/rv/monitors/sncid/sncid.c | 95 ------
kernel/trace/rv/monitors/sncid/sncid.h | 49 ---
kernel/trace/rv/monitors/snep/Kconfig | 2 +-
kernel/trace/rv/monitors/snep/snep.c | 4 +-
.../trace/rv/monitors/{sncid => sssw}/Kconfig | 10 +-
kernel/trace/rv/monitors/sssw/sssw.c | 116 +++++++
kernel/trace/rv/monitors/sssw/sssw.h | 105 ++++++
kernel/trace/rv/monitors/sssw/sssw_trace.h | 15 +
kernel/trace/rv/monitors/sts/Kconfig | 19 ++
kernel/trace/rv/monitors/sts/sts.c | 156 +++++++++
kernel/trace/rv/monitors/sts/sts.h | 117 +++++++
.../{tss/tss_trace.h => sts/sts_trace.h} | 8 +-
kernel/trace/rv/monitors/tss/tss.c | 90 -----
kernel/trace/rv/monitors/tss/tss.h | 47 ---
kernel/trace/rv/monitors/wip/Kconfig | 2 +-
kernel/trace/rv/rv_trace.h | 114 ++++---
tools/verification/models/sched/nrp.dot | 29 ++
tools/verification/models/sched/opid.dot | 35 ++
tools/verification/models/sched/sncid.dot | 18 -
tools/verification/models/sched/sssw.dot | 30 ++
tools/verification/models/sched/sts.dot | 38 +++
tools/verification/models/sched/tss.dot | 18 -
42 files changed, 1665 insertions(+), 492 deletions(-)
rename kernel/trace/rv/monitors/{tss => nrp}/Kconfig (51%)
create mode 100644 kernel/trace/rv/monitors/nrp/nrp.c
create mode 100644 kernel/trace/rv/monitors/nrp/nrp.h
create mode 100644 kernel/trace/rv/monitors/nrp/nrp_trace.h
create mode 100644 kernel/trace/rv/monitors/opid/Kconfig
create mode 100644 kernel/trace/rv/monitors/opid/opid.c
create mode 100644 kernel/trace/rv/monitors/opid/opid.h
rename kernel/trace/rv/monitors/{sncid/sncid_trace.h => opid/opid_trace.h} (66%)
delete mode 100644 kernel/trace/rv/monitors/sncid/sncid.c
delete mode 100644 kernel/trace/rv/monitors/sncid/sncid.h
rename kernel/trace/rv/monitors/{sncid => sssw}/Kconfig (58%)
create mode 100644 kernel/trace/rv/monitors/sssw/sssw.c
create mode 100644 kernel/trace/rv/monitors/sssw/sssw.h
create mode 100644 kernel/trace/rv/monitors/sssw/sssw_trace.h
create mode 100644 kernel/trace/rv/monitors/sts/Kconfig
create mode 100644 kernel/trace/rv/monitors/sts/sts.c
create mode 100644 kernel/trace/rv/monitors/sts/sts.h
rename kernel/trace/rv/monitors/{tss/tss_trace.h => sts/sts_trace.h} (67%)
delete mode 100644 kernel/trace/rv/monitors/tss/tss.c
delete mode 100644 kernel/trace/rv/monitors/tss/tss.h
create mode 100644 tools/verification/models/sched/nrp.dot
create mode 100644 tools/verification/models/sched/opid.dot
delete mode 100644 tools/verification/models/sched/sncid.dot
create mode 100644 tools/verification/models/sched/sssw.dot
create mode 100644 tools/verification/models/sched/sts.dot
delete mode 100644 tools/verification/models/sched/tss.dot
base-commit: b8a7fba39cd49eab343bfe561d85bb5dc57541af
--
2.50.1