Message-ID: <c152e0f5f431739eb2aa5fb78ef6e17521f3826b.camel@redhat.com>
Date: Wed, 16 Apr 2025 15:05:12 +0200
From: Gabriele Monaco <gmonaco@...hat.com>
To: Nam Cao <namcao@...utronix.de>
Cc: Steven Rostedt <rostedt@...dmis.org>,
linux-trace-kernel@...r.kernel.org, linux-kernel@...r.kernel.org,
john.ogness@...utronix.de
Subject: Re: [PATCH v3 13/22] rv: Add support for LTL monitors
On Wed, 2025-04-16 at 13:56 +0200, Nam Cao wrote:
> On Wed, Apr 16, 2025 at 11:34:53AM +0200, Gabriele Monaco wrote:
> > On Wed, 2025-04-16 at 08:51 +0200, Nam Cao wrote:
> > > #endif /* CONFIG_DA_MON_EVENTS_ID */
> > > +#if CONFIG_LTL_MON_EVENTS_ID
> > > +TRACE_EVENT(event_ltl_monitor_id,
> > > +
> > > +	TP_PROTO(struct task_struct *task, char *states, char *atoms, char *next),
> > > +
> > > +	TP_ARGS(task, states, atoms, next),
> > > +
> > > +	TP_STRUCT__entry(
> > > +		__string(comm, task->comm)
> > > +		__field(pid_t, pid)
> > > +		__string(states, states)
> > > +		__string(atoms, atoms)
> > > +		__string(next, next)
> > > +	),
> > > +
> > > +	TP_fast_assign(
> > > +		__assign_str(comm);
> > > +		__entry->pid = task->pid;
> > > +		__assign_str(states);
> > > +		__assign_str(atoms);
> > > +		__assign_str(next);
> > > +	),
> > > +
> > > +	TP_printk("%s[%d]: (%s) x (%s) -> (%s)", __get_str(comm), __entry->pid,
> > > +		  __get_str(states), __get_str(atoms), __get_str(next))
> > > +);
> > > +
> > > +TRACE_EVENT(error_ltl_monitor_id,
> > > +
> > > +	TP_PROTO(struct task_struct *task),
> > > +
> > > +	TP_ARGS(task),
> > > +
> > > +	TP_STRUCT__entry(
> > > +		__string(comm, task->comm)
> > > +		__field(pid_t, pid)
> > > +	),
> > > +
> > > +	TP_fast_assign(
> > > +		__assign_str(comm);
> > > +		__entry->pid = task->pid;
> > > +	),
> > > +
> > > +	TP_printk("%s[%d]: violation detected", __get_str(comm),
> >
> > In your workflow you're probably using events and errors together,
> > but wouldn't it help to print the atoms together with the violation
> > detected? At least to give a clue about the error in case the user
> > doesn't want to see the entire trace (which might be needed for a
> > full debug though).
> >
> > The same could be said for reactors: the user doesn't have much
> > information to infer what went wrong.
>
> Actually my intention for the "event" tracepoints is only for
> debugging the monitor itself. I don't want to bother users with the
> Büchi automaton, because that's an implementation detail.
>
> The "error" tracepoints should be enough for identifying problems
> with realtime applications, because errors from the monitors are
> unambiguous:
>
> - pagefault monitor: error means an RT task is raising page faults
> - sleep monitor: error means an RT task is delayed unboundedly
>
> That and a stacktrace (e.g. from perf) is enough to understand the
> problem. That was all I needed to identify problems with pipewire
> using the monitors.
>
> In the future, we may have other monitors whose warnings are
> ambiguous, and a more detailed error message will be necessary. But
> for now, I think we can keep it simple.
>
Mmh alright, makes sense, you can ignore it then.
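
For reference, in case a more detailed error message does become
necessary later, something along these lines is roughly what I had in
mind. This is an untested sketch, assuming the monitor can pass the
same atoms string it already builds for event_ltl_monitor_id:

TRACE_EVENT(error_ltl_monitor_id,

	TP_PROTO(struct task_struct *task, char *atoms),

	TP_ARGS(task, atoms),

	TP_STRUCT__entry(
		__string(comm, task->comm)
		__field(pid_t, pid)
		__string(atoms, atoms)
	),

	TP_fast_assign(
		__assign_str(comm);
		__entry->pid = task->pid;
		__assign_str(atoms);
	),

	/* also print the current atom values next to the violation */
	TP_printk("%s[%d]: violation detected (%s)", __get_str(comm),
		  __entry->pid, __get_str(atoms))
);
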
> > > +def abbreviate_atoms(atoms: list[str]) -> list[str]:
> > > +    abbrs = list()
> > > +    for atom in atoms:
> > > +        size = 1
> > > +        while True:
> > > +            abbr = atom[:size]
> > > +            if sum(a.startswith(abbr) for a in atoms) == 1:
> > > +                break
> > > +            size += 1
> > > +        abbrs.append(abbr.lower())
> > > +    return abbrs
> >
> > I get this is just a matter of preference, so feel free to ignore
> > my suggestion.
> > This abbreviation algorithm doesn't work too well with atoms
> > starting with the same substring and can produce some unexpected
> > results:
> >
> > LTL_BLOCK_ON_RT_MUTEX: b,
> > LTL_KERNEL_THREAD: ke,
> > LTL_KTHREAD_SHOULD_STOP: kt,
> > LTL_NANOSLEEP: n,
> > LTL_PI_FUTEX: p,
> > LTL_RT: r,
> > LTL_SLEEP: s,
> > LTL_TASK_IS_MIGRATION: task_is_m,
> > LTL_TASK_IS_RCU: task_is_r,
> > LTL_WAKE: wa,
> > LTL_WOKEN_BY_EQUAL_OR_HIGHER_PRIO: woken_by_e,
> > LTL_WOKEN_BY_HARDIRQ: woken_by_h,
> > LTL_WOKEN_BY_NMI: woken_by_n,
> >
> > "woken_by_*" and "task_is_*" atom can get unnecessarily long and
> > while reading "kt" I might think about kernel_thread.
> >
> > I was thinking about something like:
> >
> > LTL_BLOCK_ON_RT_MUTEX: b_o_r_m
> > LTL_KERNEL_THREAD: k_t
> > LTL_KTHREAD_SHOULD_STOP: k_s_s
> > LTL_NANOSLEEP: n
> > LTL_PI_FUTEX: p_f
> > LTL_RT: r
> > LTL_SLEEP: s
> > LTL_TASK_IS_MIGRATION: t_i_m
> > LTL_TASK_IS_RCU: t_i_r
> > LTL_WAKE: w
> > LTL_WOKEN_BY_EQUAL_OR_HIGHER_PRIO: w_b_e_o_h_p
> > LTL_WOKEN_BY_HARDIRQ: w_b_h
> > LTL_WOKEN_BY_NMI: w_b_n
> >
> > or even
> >
> > LTL_BLOCK_ON_RT_MUTEX: b_m
> > LTL_KERNEL_THREAD: k_t
> > LTL_KTHREAD_SHOULD_STOP: k_s_s
> > LTL_NANOSLEEP: n
> > LTL_PI_FUTEX: p_f
> > LTL_RT: r
> > LTL_SLEEP: s
> > LTL_TASK_IS_MIGRATION: t_m
> > LTL_TASK_IS_RCU: t_r
> > LTL_WAKE: w
> > LTL_WOKEN_BY_EQUAL_OR_HIGHER_PRIO: w_e_h_p
> > LTL_WOKEN_BY_HARDIRQ: w_h
> > LTL_WOKEN_BY_NMI: w_n
> >
> > I used the following code to come up with this:
> >
> > def abbreviate_atoms(atoms: list[str]) -> list[str]:
> >     # completely arbitrary..
> >     skip = ["is", "by", "or", "and"]
> >     def abbr(n, s):
> >         return '_'.join(word[:n] for word in s.lower().split('_')
> >                         if word not in skip)
> >     for n in range(1, 32):
> >         abbrs = [abbr(n, a) for a in atoms]
> >         if len(abbrs) == len(set(abbrs)):
> >             return abbrs
> >
> > Which could even be tuned to use 2 letters per block instead of 1
> > (improving readability by a lot, actually):
> > 'bl_on_rt_mu', 'ke_th', 'kt_sh_st', 'na', 'pi_fu', 'rt', 'sl',
> > 'ta_mi', 'ta_rc', 'wa', 'wo_eq_hi_pr', 'wo_ha', 'wo_nm'
> >
> > What do you think?
>
> Yes, this would be very nice. But as mentioned, this is mainly for
> debugging the monitors, not for end users. Therefore I don't want to
> spend too much energy on it.
>
> Let me see how "nice" we can make it without spending too many hours.
> Thanks for the suggestion!
>
Sure, do as you see fit.
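
Just in case it's useful later, the 2-letters-per-block variant I
mentioned is basically the same loop starting at 2 instead of 1. An
untested sketch, assuming (as in the outputs above) that the atoms are
passed without the common LTL_ prefix:

def abbreviate_atoms(atoms: list[str]) -> list[str]:
    # filler words that carry no information in the abbreviation
    skip = ["is", "by", "or", "and"]

    def abbr(n, s):
        # first n letters of each remaining word, joined by '_'
        return '_'.join(word[:n] for word in s.lower().split('_')
                        if word not in skip)

    # start at 2 letters per word, grow until all abbreviations are unique
    for n in range(2, 32):
        abbrs = [abbr(n, a) for a in atoms]
        if len(abbrs) == len(set(abbrs)):
            return abbrs
    # fall back to full lowercased names if nothing shorter is unique
    return [a.lower() for a in atoms]

Skipping the filler words is what keeps 'wo_eq_hi_pr' from growing even
longer.
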
From my point of view, the monitors (19-20/22) and the synthesis look good.
Reviewed-by: Gabriele Monaco <gmonaco@...hat.com>
Thanks,
Gabriele