Message-ID: <c152e0f5f431739eb2aa5fb78ef6e17521f3826b.camel@redhat.com>
Date: Wed, 16 Apr 2025 15:05:12 +0200
From: Gabriele Monaco <gmonaco@...hat.com>
To: Nam Cao <namcao@...utronix.de>
Cc: Steven Rostedt <rostedt@...dmis.org>,
linux-trace-kernel@...r.kernel.org, linux-kernel@...r.kernel.org,
john.ogness@...utronix.de
Subject: Re: [PATCH v3 13/22] rv: Add support for LTL monitors
On Wed, 2025-04-16 at 13:56 +0200, Nam Cao wrote:
> On Wed, Apr 16, 2025 at 11:34:53AM +0200, Gabriele Monaco wrote:
> > On Wed, 2025-04-16 at 08:51 +0200, Nam Cao wrote:
> > > #endif /* CONFIG_DA_MON_EVENTS_ID */
> > > +#if CONFIG_LTL_MON_EVENTS_ID
> > > +TRACE_EVENT(event_ltl_monitor_id,
> > > +
> > > +	TP_PROTO(struct task_struct *task, char *states, char *atoms, char *next),
> > > +
> > > +	TP_ARGS(task, states, atoms, next),
> > > +
> > > +	TP_STRUCT__entry(
> > > +		__string(comm, task->comm)
> > > +		__field(pid_t, pid)
> > > +		__string(states, states)
> > > +		__string(atoms, atoms)
> > > +		__string(next, next)
> > > +	),
> > > +
> > > +	TP_fast_assign(
> > > +		__assign_str(comm);
> > > +		__entry->pid = task->pid;
> > > +		__assign_str(states);
> > > +		__assign_str(atoms);
> > > +		__assign_str(next);
> > > +	),
> > > +
> > > +	TP_printk("%s[%d]: (%s) x (%s) -> (%s)", __get_str(comm), __entry->pid,
> > > +		  __get_str(states), __get_str(atoms), __get_str(next))
> > > +);
> > > +
> > > +TRACE_EVENT(error_ltl_monitor_id,
> > > +
> > > +	TP_PROTO(struct task_struct *task),
> > > +
> > > +	TP_ARGS(task),
> > > +
> > > +	TP_STRUCT__entry(
> > > +		__string(comm, task->comm)
> > > +		__field(pid_t, pid)
> > > +	),
> > > +
> > > +	TP_fast_assign(
> > > +		__assign_str(comm);
> > > +		__entry->pid = task->pid;
> > > +	),
> > > +
> > > +	TP_printk("%s[%d]: violation detected", __get_str(comm),
> >
> > In your workflow you're probably using events and errors together,
> > but wouldn't it help to print the atoms together with the violation
> > detected? At least to give a clue about the error in case the user
> > doesn't want to see the entire trace (which might be needed for a
> > full debug though).
> >
> > The same could be said for reactors: the user doesn't have much
> > information to infer what went wrong.
>
> Actually my intention for the "event" tracepoints is only for
> debugging the monitor itself. I don't want to bother users with the
> Büchi automaton, because that's an implementation detail.
>
> The "error" tracepoints should be enough for identifying problems
> with realtime applications, because errors from the monitors are
> unambiguous:
>
> - pagefault monitor: error means an RT task is raising page faults
> - sleep monitor: error means an RT task is delayed unboundedly
>
> That and a stacktrace (e.g. from perf) is enough to understand the
> problem. That was all I needed to identify problems with pipewire
> using the monitors.
>
> In the future, we may have other monitors whose warnings are
> ambiguous, and a more detailed error message will be necessary. But
> for now, I think we can keep it simple.
>
Mmh alright, makes sense, you can ignore it then.
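
For reference, in case a more detailed error message does become
necessary later, something along these lines is roughly what I had in
mind. This is an untested sketch, assuming the monitor can pass the
same atoms string it already builds for event_ltl_monitor_id:

TRACE_EVENT(error_ltl_monitor_id,

	TP_PROTO(struct task_struct *task, char *atoms),

	TP_ARGS(task, atoms),

	TP_STRUCT__entry(
		__string(comm, task->comm)
		__field(pid_t, pid)
		__string(atoms, atoms)
	),

	TP_fast_assign(
		__assign_str(comm);
		__entry->pid = task->pid;
		__assign_str(atoms);
	),

	/* also print the current atom values next to the violation */
	TP_printk("%s[%d]: violation detected (%s)", __get_str(comm),
		  __entry->pid, __get_str(atoms))
);
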
> > > +def abbreviate_atoms(atoms: list[str]) -> list[str]:
> > > +    abbrs = list()
> > > +    for atom in atoms:
> > > +        size = 1
> > > +        while True:
> > > +            abbr = atom[:size]
> > > +            if sum(a.startswith(abbr) for a in atoms) == 1:
> > > +                break
> > > +            size += 1
> > > +        abbrs.append(abbr.lower())
> > > +    return abbrs
> >
> > I get this is just a matter of preference, so feel free to ignore
> > my suggestion.
> > This abbreviation algorithm doesn't work too well with atoms
> > starting with the same substring and can produce some unexpected
> > results:
> >
> > LTL_BLOCK_ON_RT_MUTEX: b,
> > LTL_KERNEL_THREAD: ke,
> > LTL_KTHREAD_SHOULD_STOP: kt,
> > LTL_NANOSLEEP: n,
> > LTL_PI_FUTEX: p,
> > LTL_RT: r,
> > LTL_SLEEP: s,
> > LTL_TASK_IS_MIGRATION: task_is_m,
> > LTL_TASK_IS_RCU: task_is_r,
> > LTL_WAKE: wa,
> > LTL_WOKEN_BY_EQUAL_OR_HIGHER_PRIO: woken_by_e,
> > LTL_WOKEN_BY_HARDIRQ: woken_by_h,
> > LTL_WOKEN_BY_NMI: woken_by_n,
> >
> > "woken_by_*" and "task_is_*" atom can get unnecessarily long and
> > while reading "kt" I might think about kernel_thread.
> >
> > I was thinking about something like:
> >
> > LTL_BLOCK_ON_RT_MUTEX: b_o_r_m
> > LTL_KERNEL_THREAD: k_t
> > LTL_KTHREAD_SHOULD_STOP: k_s_s
> > LTL_NANOSLEEP: n
> > LTL_PI_FUTEX: p_f
> > LTL_RT: r
> > LTL_SLEEP: s
> > LTL_TASK_IS_MIGRATION: t_i_m
> > LTL_TASK_IS_RCU: t_i_r
> > LTL_WAKE: w
> > LTL_WOKEN_BY_EQUAL_OR_HIGHER_PRIO: w_b_e_o_h_p
> > LTL_WOKEN_BY_HARDIRQ: w_b_h
> > LTL_WOKEN_BY_NMI: w_b_n
> >
> > or even
> >
> > LTL_BLOCK_ON_RT_MUTEX: b_m
> > LTL_KERNEL_THREAD: k_t
> > LTL_KTHREAD_SHOULD_STOP: k_s_s
> > LTL_NANOSLEEP: n
> > LTL_PI_FUTEX: p_f
> > LTL_RT: r
> > LTL_SLEEP: s
> > LTL_TASK_IS_MIGRATION: t_m
> > LTL_TASK_IS_RCU: t_r
> > LTL_WAKE: w
> > LTL_WOKEN_BY_EQUAL_OR_HIGHER_PRIO: w_e_h_p
> > LTL_WOKEN_BY_HARDIRQ: w_h
> > LTL_WOKEN_BY_NMI: w_n
> >
> > I used the following code to come up with this:
> >
> > def abbreviate_atoms(atoms: list[str]) -> list[str]:
> >     # completely arbitrary..
> >     skip = ["is", "by", "or", "and"]
> >     def abbr(n, s):
> >         return '_'.join(word[:n] for word in s.lower().split('_')
> >                         if word not in skip)
> >     for n in range(1, 32):
> >         abbrs = [abbr(n, a) for a in atoms]
> >         if len(abbrs) == len(set(abbrs)):
> >             return abbrs
> >
> > Which could even be tuned to use 2 letters per block instead of 1
> > (improving readability by a lot, actually):
> > 'bl_on_rt_mu', 'ke_th', 'kt_sh_st', 'na', 'pi_fu', 'rt', 'sl',
> > 'ta_mi', 'ta_rc', 'wa', 'wo_eq_hi_pr', 'wo_ha', 'wo_nm'
> >
> > What do you think?
>
> Yes, this would be very nice. But as mentioned, this is mainly for
> debugging the monitors, not for end users. Therefore I don't want to
> spend too much energy on it.
>
> Let me see how "nice" we can make it without spending too many hours.
> Thanks for the suggestion!
>
Sure, do as you see fit.
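
Just in case it's useful later, the 2-letters-per-block variant I
mentioned is basically the same loop starting at 2 instead of 1. An
untested sketch, assuming (as in the outputs above) that the atoms are
passed without the common LTL_ prefix:

def abbreviate_atoms(atoms: list[str]) -> list[str]:
    # filler words that carry no information in the abbreviation
    skip = ["is", "by", "or", "and"]

    def abbr(n, s):
        # first n letters of each remaining word, joined by '_'
        return '_'.join(word[:n] for word in s.lower().split('_')
                        if word not in skip)

    # start at 2 letters per word, grow until all abbreviations are unique
    for n in range(2, 32):
        abbrs = [abbr(n, a) for a in atoms]
        if len(abbrs) == len(set(abbrs)):
            return abbrs
    # fall back to full lowercased names if nothing shorter is unique
    return [a.lower() for a in atoms]

Skipping the filler words is what keeps 'wo_eq_hi_pr' from growing even
longer.
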
From my point of view, the monitors (19-20/22) and the synthesis look good.
Reviewed-by: Gabriele Monaco <gmonaco@...hat.com>
Thanks,
Gabriele