linux-kernel - Re: [RFC PATCH 08/17] rv: Add Hybrid Automata monitor type

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fd9dc21acd4fac68a3948ded59cf9235f0e4bb7f.camel@redhat.com>
Date: Mon, 25 Aug 2025 09:48:23 +0200
From: Gabriele Monaco <gmonaco@...hat.com>
To: Nam Cao <namcao@...utronix.de>
Cc: linux-kernel@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>, 
 Masami Hiramatsu <mhiramat@...nel.org>, linux-trace-kernel@...r.kernel.org,
 Tomas Glozar <tglozar@...hat.com>,  Juri Lelli <jlelli@...hat.com>, Clark
 Williams <williams@...hat.com>, John Kacur <jkacur@...hat.com>
Subject: Re: [RFC PATCH 08/17] rv: Add Hybrid Automata monitor type

On Thu, 2025-08-21 at 14:18 +0200, Nam Cao wrote:
> On Thu, Aug 14, 2025 at 05:08:00PM +0200, Gabriele Monaco wrote:
> > Deterministic automata define which events are allowed in every
> > state,
> > but cannot define more sophisticated constraint taking into account
> > the
> > system's environment (e.g. time or other states not producing
> > events).
> > 
> > Add the Hybrid Automata monitor type as an extension of
> > Deterministic
> > automata where each state transition is validating a constraint on
> > a finite number of environment variables.
> > Hybrid automata can be used to implement timed automata, where the
> > environment variables are clocks.
> > 
> > Also implement the necessary functionality to handle clock
> > constraints (ns or jiffy granularity) on state and events.
> > 
> > Signed-off-by: Gabriele Monaco <gmonaco@...hat.com>
> 
> So you have figured out how to deal with the time problem. Cool!
> 
> I'm curious, instead of a new monitor type, would the entire thing be
> simpler if these new features are added as extension to DA monitor
> instead?
> 
> The existing "pure DA" monitors would just not use the constraint and
> timer stuffs and would behave same as before.
> 
> Just an idea, I'm not sure how it would look like. But I think we
> might reduce some line count.

Mmh, that might save some lines, especially the *_hooks() macros.
The few functions that are now duplicated would end up together with a
condition, though.

I'm however not too fond of forcing any DA user to allocate space for a
timer. Imagine a custom kernel for an embedded device trying to squeeze
some RV monitors in tasks and ending up requiring 64 bytes per monitor
instead of 8.
If this doesn't look confusing to you, I'd prefer them separate at
least there.

> > +/*
> > + * ha_monitor_reset_all_stored - reset all environment variables
> > in the monitor
> > + */
> > +static inline void ha_monitor_reset_all_stored(struct ha_monitor
> > *ha_mon)
> > +{
> > +	for (int i = 0; i < ENV_MAX_STORED; i++)
> > +		smp_store_mb(ha_mon->env_store[i],
> > ENV_INVALID_VALUE);
> 
> Why is memory barrier needed here?

Right, those are not needed, will remove.

> I think checkpatch.pl should complain about this? Please add a
> comment explaining the purpose of this memory barrier.
> 
> The same applied for the other memory barriers.
> 
> > +}
> > +
> > +/*
> > + * ha_monitor_init_env - setup timer and reset all environment
> > + *
> > + * Called from a hook in the DA start functions, it supplies the
> > da_mon
> > + * corresponding to the current ha_mon.
> > + * Not all hybrid automata require the timer, still set it for
> > simplicity.
> > + */
> > +static inline void ha_monitor_init_env(struct da_monitor *da_mon)
> > +{
> > +	struct ha_monitor *ha_mon = to_ha_monitor(da_mon);
> > +
> > +	ha_monitor_reset_all_stored(ha_mon);
> > +	if (unlikely(!ha_mon->timer.base))
> > +		hrtimer_setup(&ha_mon->timer,
> > ha_monitor_timer_callback,
> > +			      CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> > +}
> > +
> > +/*
> > + * ha_monitor_reset_env - stop timer and reset all environment
> > + *
> > + * Called from a hook in the DA reset functions, it supplies the
> > da_mon
> > + * corresponding to the current ha_mon.
> > + * Not all hybrid automata require the timer, still clear it for
> > simplicity.
> > + */
> > +static inline void ha_monitor_reset_env(struct da_monitor *da_mon)
> > +{
> > +	struct ha_monitor *ha_mon = to_ha_monitor(da_mon);
> > +
> > +	ha_monitor_reset_all_stored(ha_mon);
> > +	/* Initialisation resets the monitor before initialising
> > the timer */
> > +	if (likely(ha_mon->timer.base))
> > +		ha_cancel_timer(ha_mon);
> > +}
> 
> Looking at hrtimer->timer.base seems quite hacky. It seems that this
> could be broken due to a future hrtimer's refactoring.
> 
> Instead, how about moving hrtimer_setup() into monitor's enabling
> function?
> Then you can always hrtimer_cancel() here.

I didn't really like that either, it just seemed the easiest option at
the time. But yours is a good point..

Timers are per monitor and the enabling function is to register the
monitor functionality, so it won't run for all tasks (and some aren't
there yet).

The right spot for this is the start function (where it's currently
happening), the main issue is that resets are running at times when the
monitor is not started, so the timer may not be there when we attempt
to cancel it.

I'll check a bit more if we really need those resets, and in the worst
case use something like da_monitoring() as a condition there.
Then I'd remove the same condition in ha_monitor_init_env(), which
basically would setup the timer every time the monitor starts, that
doesn't seem an issue to me.

> 
> > +/*
> > + * ha_cancel_timer - Cancel the timer and return whether it
> > expired
> > + *
> > + * Return true if the timer was cancelled after expiration but
> > before the
> > + * callback could run.
> > + */
> > +static inline bool ha_cancel_timer(struct ha_monitor *ha_mon)
> > +{
> > +	ktime_t remaining;
> > +
> > +	if (!hrtimer_active(&ha_mon->timer))
> > +		return false;
> > +	remaining = hrtimer_get_remaining(&ha_mon->timer);
> > +	if (hrtimer_try_to_cancel(&ha_mon->timer) == 1 &&
> > remaining < 0)
> > +		return true;
> > +	return false;
> > +}
> 
> This is still racy. The timer could expire after
> hrtimer_get_remaining(), but before hrtimer_try_to_cancel()

I believe in that case hrtimer_try_to_cancel would return -1 (callback
is executing) or 0 if the callback is done.
I guess that should be serialised by the hrtimers locks, but I'll
double check if there can indeed be a race.

> Is it really important that we need to care about the tiny window
> "after expiration but before the callback could run"? What if we just
> use hrtimer_cancel() here?

As Juri pointed out, on PREEMPT_RT kernels if the timer isn't marked as
_HARD, the window won't even be tiny.

That was a (blind) attempt to reduce the overhead of using timers, I'm
going to run more tests to understand if that really makes sense. Not
using _HARD and having the timer not serviced, would in fact end up
just like not using timers at all, which is what the monitor could do
in the first place.

Thanks,
Gabriele