Message-ID: <eed6ff19-a944-4e4c-96e4-0f44e888c71d@kzalloc.com>
Date: Wed, 29 Oct 2025 07:53:20 +0900
From: Yunseong Kim <ysk@...lloc.com>
To: Gabriele Monaco <gmonaco@...hat.com>, Nam Cao <nam.cao@...aro.org>
Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Tomas Glozar <tglozar@...hat.com>, Shung-Hsi Yu <shung-hsi.yu@...e.com>,
Byungchul Park <byungchul@...com>, syzkaller@...glegroups.com,
linux-rt-devel@...ts.linux.dev, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [Question] Detecting Sleep-in-Atomic Context in PREEMPT_RT via RV
(Runtime Verification) monitor rtapp:sleep
Hi Gabriele,
On 10/27/25 9:20 PM, Gabriele Monaco wrote:
> On Mon, 2025-10-27 at 15:54 +0900, Yunseong Kim wrote:
>> Hi Nam,
>>
>> I've been very interested in RV (Runtime Verification) to proactively detect
>> "sleep in atomic" scenarios on PREEMPT_RT kernels. Specifically, I'm looking
>> for ways to find cases where sleeping spinlocks or memory allocations are used
>> within preemption-disabled or irq-disabled contexts. While searching for
>> solutions, I discovered the RV subsystem.
>>
>
> Hi Yunseong,
>
> I'm sure Nam can be more specific on this, but let me add my 2 cents here.
Thank you so much for your detailed response! It cleared up many of the
questions I had.
> The sleep monitor doesn't really do what you want; its violations are real-time
> tasks (typically userspace tasks with RR/FIFO policies) sleeping in a way that
> might incur latencies, for instance using non-PI locks or imprecise sleep.
So that’s the role of rtapp:sleep you mentioned. Thank you again for
clarifying it.
> What you need here is to validate kernel code, RV was actually designed for
> that, but there's currently no monitor that does what you want.
This sounds like a valuable chance to contribute to RV!
> The closest thing I can think of is monitors like scpd and snep in the sched
> collection [1]. Those, however, won't catch what you need, because they focus on
> the preemption tracepoints and on schedule, which work fine in your scenario too.
>
> We could add similar monitors to catch what you want though:
>
>                         |
>                         |
>                         v
>                 +-----------------+
>                 |   cant_sleep    | <+
>                 +-----------------+  |
>                   |                  |
>                   | preempt_enable   | preempt_disable
>                   v                  |
>     kmalloc      +-----------------+  |
>    lock_acquire  |                 |  |
>   +------------ |    can_sleep    | -+
>   |             |                 |
>   +-----------> |                 |
>                 +-----------------+
>
> which would become slightly more complicated if we consider irq enable/disable
> too. This is a deterministic automaton representation (see [1] for examples);
> you could use an LTL monitor like sleep as well, I assume (that would need a
> per-CPU monitor type, which is not merged yet for LTL).
>
> This is simplified but you can of course put conditions on what kind of
> allocations and locks you're interested in.
If the goal is to detect this state before __might_resched() reports it under
CONFIG_DEBUG_ATOMIC_SLEEP (i.e., before an actual context switch occurs), I am
considering whether Deterministic Automata (.dot/DA) or Linear Temporal Logic
(.ltl/LTL) would be more appropriate for modeling this check. I'm also wondering
whether I need to build a comprehensive table of all sleepable functions on the
PREEMPT_RT kernel for this purpose.
If this check is necessary, I’m planning to try the following verification:
RULE = always ((IN_ATOMIC or IRQS_DISABLED) imply not CALLS_RT_SLEEPER)
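To make the atoms concrete, here is a minimal sketch of how I imagine
evaluating them in kernel code (the atom names are mine; if I read the LTL
monitor skeletons correctly, a generated monitor would instead update its
atoms from the preempt/irq tracepoints via ltl_atom_update() rather than
polling):

#include <linux/preempt.h>
#include <linux/irqflags.h>

/* Hypothetical atom evaluation for the rule above. */
static inline bool atom_in_atomic(void)
{
	return preempt_count() != 0;	/* preempt depth or softirq/hardirq/NMI bits */
}

static inline bool atom_irqs_disabled(void)
{
	return irqs_disabled();		/* local IRQ state on this CPU */
}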
I’m also planning to have the RV monitor kernel module track sleepable
functions, including the sleeping spinlocks and memory allocations that must
not be called while preemption or IRQs are disabled on PREEMPT_RT.
As a starting point, I’m considering the following functions (a rough probing
sketch follows the list):
// Mutex & Semaphore (or Lockdep's 'lock_acquire' for lock cases)
"mutex_lock",
"mutex_lock_interruptible",
"mutex_lock_killable",
"down_interruptible",
"down_killable",
"rwsem_down_read_failed",
"rwsem_down_write_failed",
"ww_mutex_lock",
"rt_spin_lock",
"rt_read_lock",
"rt_write_lock",
// ...or just "lock_acquire" on a LOCKDEP-enabled kernel.
// Sleep & schedule
"msleep",
"ssleep",
"usleep_range",
"wait_for_completion",
"schedule",
"cond_resched",
// User-space memory access
"copy_from_user",
"copy_to_user",
"__get_user_asm",
"__put_user_asm",
// Memory allocation
"__vmalloc",
"__kmalloc"
> Now this specific case would require lockdep for the definition of lock_acquire
> tracepoints. So I'm not sure how useful this monitor would be since lockdep is
> going to complain too. You could use contention tracepoints to catch exactly
> when sleep is going to occur and not /potential/ failures.
I’ll look further into this lockdep-related part as well.
> I only gave this a quick thought; there may be better models/events fitting
> your use case, but I hope you get the idea.
>
> [1] - https://docs.kernel.org/trace/rv/monitor_sched.html#monitor-scpd
Thank you for providing a diagram and references that make it easier to
understand!
>> Here are my questions:
>>
>> 1. Does the rtapp:sleep monitor proactively detect scenarios that
>> could lead to sleeping in atomic context, perhaps before
>> CONFIG_DEBUG_ATOMIC_SLEEP (enabled) would trigger at the actual point of
>> sleeping?
>
> I guess I answered this already, but TL;DR no, you'd need a dedicated monitor.
>
>> 2. Is there a way to enable this monitor (e.g., rtapp:sleep)
>> immediately as soon as the RV subsystem is loaded during boot time?
>> (How can I make it turn on by default?)
>
> Currently not, but you could probably use any sort of startup script to turn it
> on soon enough.
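Understood, I'll go with a startup script then. Something like this should do
once tracefs is mounted (the nested rtapp path is my assumption and may differ
by kernel version):

echo 1 > /sys/kernel/tracing/rv/monitors/rtapp/sleep/enable
echo printk > /sys/kernel/tracing/rv/monitors/rtapp/sleep/reactors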
>
>> 3. When a "violation detected" message occurs at runtime, is it
>> possible to get a call stack of the location that triggered the
>> violation? The panic reactor provides a full stack, but I'm
>> wondering if this is also possible with the printk reactor.
>
> You can use ftrace and rely on error tracepoints instead of reactors. Each RV
> violation triggers a tracepoint (e.g. error_sleep) and you can print a call
> stack there. E.g.:
>
> echo stacktrace > /sys/kernel/tracing/events/rv/error_sleep/trigger
>
> Here I use sleep as an example, but all monitors have their own error events
> (e.g. error_wwnr, error_snep, etc.).
>
> Does this all look useful in your scenario?
Thank you once again for your thorough explanation. Many of the questions
I initially had have now been resolved!
> Gabriele
Best regards,
Yunseong Kim