Message-ID: <84ecxgit04.fsf@jogness.linutronix.de>
Date: Fri, 25 Apr 2025 09:51:47 +0206
From: John Ogness <john.ogness@...utronix.de>
To: Nam Cao <namcao@...utronix.de>, Gabriele Monaco <gmonaco@...hat.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
 linux-trace-kernel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 20/22] rv: Add rtapp_sleep monitor

On 2025-04-25, Nam Cao <namcao@...utronix.de> wrote:
> On Thu, Apr 24, 2025 at 03:55:34PM +0200, Gabriele Monaco wrote:
>> I've been playing with these monitors, code-wise they look good.
>> I tested a bit and they seem to work without many surprises by doing
>> something as simple as:
>>
>> perf stat -e rv:error_sleep stress-ng --cpu-sched 1 -t 10s
>> -- shows several errors --
>
> This one is a monitor's bug.
>
> The monitor mistakenly sees the task getting woken up, *then* sees it going
> to sleep.
>
> This is due to trace_sched_switch() being called with a stale 'prev_state'.
> 'prev_state' is read at the beginning of __schedule(), but
> trace_sched_switch() is invoked a bit later. Therefore if task->__state is
> changed in between, 'prev_state' is not the value of task->__state.
>
> The monitor checks (prev_state & TASK_INTERRUPTIBLE) to determine if the
> task is going to sleep. This can be incorrect due to the race above. The
> monitor sees the task going to sleep, but actually it is just preempted.
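
For reference, the race described above can be modelled in user space. This
is only a sketch under stated assumptions, not kernel code:
simulated_schedule(), monitor_sees_sleep() and struct task are hypothetical
stand-ins, and the wakeup is simulated by a flag rather than by a second CPU.
Only the TASK_RUNNING and TASK_INTERRUPTIBLE values mirror the kernel's.

```c
#include <stdbool.h>

/* Same values as the kernel's task state bits. */
#define TASK_RUNNING        0x0
#define TASK_INTERRUPTIBLE  0x1

/* Hypothetical stand-in for task_struct; 'state' models task->__state. */
struct task {
	unsigned int state;
};

/* Hypothetical stand-in for the monitor's check on the tracepoint argument. */
static bool monitor_sees_sleep(unsigned int prev_state)
{
	return (prev_state & TASK_INTERRUPTIBLE) != 0;
}

/*
 * Models the ordering described above: the state is snapshotted early,
 * a wakeup may change the live state, and only then is the snapshot
 * handed to the tracepoint consumer.
 * Returns what the monitor concludes from the snapshot.
 */
static bool simulated_schedule(struct task *t, bool wake_in_between)
{
	unsigned int prev_state = t->state;	/* read at the start */

	if (wake_in_between)
		t->state = TASK_RUNNING;	/* simulated concurrent wakeup */

	/* The consumer only sees the snapshot, not the live state. */
	return monitor_sees_sleep(prev_state);
}
```

Calling simulated_schedule() with wake_in_between set on a task whose state
is TASK_INTERRUPTIBLE returns true even though the task's state is
TASK_RUNNING afterwards, which is the mismatch in question.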

If I understand this correctly, trace_sched_switch() is reporting
accurate state transition information, but by the time it is reported
that state may have already changed (in which case another
trace_sched_switch() occurs later).

So in this example, the task did go to sleep. Why do you think it was
preempted instead?

John Ogness