Message-ID: <20250805154515.CchJtec3@linutronix.de>
Date: Tue, 5 Aug 2025 17:45:15 +0200
From: Nam Cao <namcao@...utronix.de>
To: Gabriele Monaco <gmonaco@...hat.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
linux-trace-kernel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/5] rv: Add rts monitor
On Tue, Aug 05, 2025 at 02:22:17PM +0200, Nam Cao wrote:
> On Tue, Aug 05, 2025 at 10:40:30AM +0200, Gabriele Monaco wrote:
> > Hello Nam,
> >
> > I just built and booted up the monitor in a VM (virtme-ng), the
> > configuration has preemptirq tracepoints and all monitors so far (as we
> > have seen earlier, it doesn't build if rtapp monitors are not there
> > because of the circular dependency in the tracepoints).
> >
> > All I did was to enable the monitor and printk reactor, but I get a
> > whole lot of errors (as in, I need to quit the VM for it to stop):
> >
> > [ 1537.699834] rv: rts: 7: violation detected
> > [ 1537.699930] rv: rts: 3: violation detected
> > [ 1537.701827] rv: rts: 6: violation detected
> > [ 1537.704894] rv: rts: 0: violation detected
> > [ 1537.704925] rv: rts: 0: violation detected
> > [ 1537.704988] rv: rts: 3: violation detected
> > [ 1537.705019] rv: rts: 3: violation detected
> > [ 1537.705998] rv: rts: 0: violation detected
> > [ 1537.706024] rv: rts: 0: violation detected
> > [ 1537.709875] rv: rts: 6: violation detected
> > [ 1537.709921] rv: rts: 6: violation detected
> > [ 1537.711241] rv: rts: 6: violation detected
> >
> > Curiously enough, I only see those CPUs (0, 3, 6 and 7).
> > Other runs have different CPUs but always a small subset (e.g. 10-15,
> > 6-7 only 2).
> > It doesn't always occur but enabling/disabling the monitor might help
> > triggering it.
> >
> > Any idea what is happening?
There are two issues:
- When the monitor is disabled and then re-enabled, the number of queued
tasks is not reset. The monitor may then mistakenly think there are
queued RT tasks when there are none.
- The enqueue tracepoint is registered before the dequeue tracepoint.
Therefore there may be an enqueue followed by a dequeue, but the monitor
misses the latter because its probe is not attached yet.
The first issue can be fixed by resetting the queued task count when the
monitor is enabled.
For the second issue, LTL monitors need something similar to
da_monitor_enabled_##name(void). But a quick workaround is reordering the
tracepoint registrations.
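Roughly, I am thinking of something like the untested sketch below, just to
illustrate the idea (the rts_enabled flag and rts_monitor_enabled() are made
up names for illustration):

/*
 * Sketch of an "enabled" gate for the LTL monitor, analogous to
 * da_monitor_enabled_##name(). Each tracepoint handler would bail out
 * early while this returns false, and enable_rts() would set the flag
 * only after the last rv_attach_trace_probe() call.
 */
static bool rts_enabled __read_mostly;

static bool rts_monitor_enabled(void)
{
	/* Global RV switch, same check the DA monitors do first. */
	if (unlikely(!rv_monitoring_on()))
		return false;

	/* Per-monitor flag, set once all tracepoints are attached. */
	return READ_ONCE(rts_enabled);
}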
So with the below diff, I no longer see the issue.
Thanks again for noticing this!
Nam
diff --git a/kernel/trace/rv/monitors/rts/rts.c b/kernel/trace/rv/monitors/rts/rts.c
index 473004b673c5..3ddbf09db0dd 100644
--- a/kernel/trace/rv/monitors/rts/rts.c
+++ b/kernel/trace/rv/monitors/rts/rts.c
@@ -81,14 +81,21 @@ static void handle_sched_switch(void *data, bool preempt, struct task_struct *pr
 
 static int enable_rts(void)
 {
+	unsigned int cpu;
 	int retval;
 
 	retval = ltl_monitor_init();
 	if (retval)
 		return retval;
 
-	rv_attach_trace_probe("rts", enqueue_task_rt_tp, handle_enqueue_task_rt);
+	for_each_possible_cpu(cpu) {
+		unsigned int *queued = per_cpu_ptr(&nr_queued, cpu);
+
+		*queued = 0;
+	}
+
 	rv_attach_trace_probe("rts", dequeue_task_rt_tp, handle_dequeue_task_rt);
+	rv_attach_trace_probe("rts", enqueue_task_rt_tp, handle_enqueue_task_rt);
 	rv_attach_trace_probe("rts", sched_switch, handle_sched_switch);
 
 	return 0;