linux-kernel - Re: [RFC PATCH 08/17] rv: Add Hybrid Automata monitor type

Message-ID: <762f7d52bf75475d3ec2587a8e370e4fb2a5ae6a.camel@redhat.com>
Date: Tue, 19 Aug 2025 11:48:01 +0200
From: Gabriele Monaco <gmonaco@...hat.com>
To: Juri Lelli <juri.lelli@...hat.com>
Cc: linux-kernel@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>, 
 Masami Hiramatsu <mhiramat@...nel.org>, linux-trace-kernel@...r.kernel.org,
 Nam Cao <namcao@...utronix.de>, Tomas Glozar <tglozar@...hat.com>, Juri
 Lelli <jlelli@...hat.com>, Clark Williams <williams@...hat.com>,  John
 Kacur <jkacur@...hat.com>
Subject: Re: [RFC PATCH 08/17] rv: Add Hybrid Automata monitor type



On Tue, 2025-08-19 at 11:18 +0200, Juri Lelli wrote:
> Hi!
> 
> On 14/08/25 17:08, Gabriele Monaco wrote:
> 
> ...
> 
> > +/*
> > + * ha_monitor_init_env - setup timer and reset all environment
> > + *
> > + * Called from a hook in the DA start functions, it supplies the da_mon
> > + * corresponding to the current ha_mon.
> > + * Not all hybrid automata require the timer, still set it for simplicity.
> > + */
> > +static inline void ha_monitor_init_env(struct da_monitor *da_mon)
> > +{
> > +	struct ha_monitor *ha_mon = to_ha_monitor(da_mon);
> > +
> > +	ha_monitor_reset_all_stored(ha_mon);
> > +	if (unlikely(!ha_mon->timer.base))
> > +		hrtimer_setup(&ha_mon->timer, ha_monitor_timer_callback,
> > +			      CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> > +}
> 
> ...
> 
> > +/*
> > + * Helper functions to handle the monitor timer.
> > + * Not all monitors require a timer, in such case the timer will be set up but
> > + * never armed.
> > + * Timers start since the last reset of the supplied env or from now if env is
> > + * not an environment variable. If env was not initialised no timer starts.
> > + * Timers can expire on any CPU unless the monitor is per-cpu,
> > + * where we assume every event occurs on the local CPU.
> > + */
> > +static inline void ha_start_timer_ns(struct ha_monitor *ha_mon, enum envs env,
> > +				     u64 expire)
> > +{
> > +	int mode = HRTIMER_MODE_REL;
> > +	u64 passed = 0;
> > +
> > +	if (env >= 0 && env < ENV_MAX_STORED) {
> > +		if (ha_monitor_env_invalid(ha_mon, env))
> > +			return;
> > +		passed = ha_get_env(ha_mon, env);
> > +	}
> > +	if (RV_MON_TYPE == RV_MON_PER_CPU)
> > +		mode |= HRTIMER_MODE_PINNED;
> > +	hrtimer_start(&ha_mon->timer, ns_to_ktime(expire - passed), mode);
> > +}
> 
> Also, my only concern with the usage of per-task timers is that
> reprogramming adds overhead, so I wonder if this gets noticeable when
> running some kind of performance-sensitive workload in production (as
> it was reported for dl-server). Did you test such a case?

That's a good point, I need to check the actual overhead.

One thing to note is that this timer is used only for state constraints.
One could write roughly the same monitor like this:

  +------------------------------------------+
  |                 enqueued                 |
  +------------------------------------------+
    |
    | sched_switch_in;clk < threshold_jiffies
    v

or like this:

  +------------------------------------------+
  |                 enqueued                 |
  |         clk < threshold_jiffies          |
  +------------------------------------------+
    |
    | sched_switch_in
    v

The first won't fail as soon as the threshold passes, but will
eventually fail when the sched_switch_in event occurs. It won't use a
timer at all (well, mostly: some calls are still made to keep the code
general, and I could improve that if it matters).

Depending on the monitor, the first option could be a lower-overhead
yet valid alternative to the second, provided sched_switch_in is
guaranteed to come eventually and reaction latency isn't an issue.
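
To make the difference concrete, here is a rough user-space model of the
two variants. It is only a sketch, not the RV code: struct enq_model and
all the function names are made up for illustration, and time is a plain
integer rather than jiffies or ns. Both variants agree on whether the
constraint is violated, they only differ in when the violation is seen.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the "enqueued" state, not the RV data structures. */
struct enq_model {
	uint64_t enqueued_at;	/* time the state was entered (clk reset) */
	uint64_t threshold;	/* maximum time allowed in the state */
};

/* Guard on the sched_switch_in edge: evaluated only when the event fires. */
static bool guard_holds(const struct enq_model *m, uint64_t now)
{
	return now - m->enqueued_at < m->threshold;
}

/* State constraint: a timer would be armed to fire at this absolute time. */
static uint64_t invariant_deadline(const struct enq_model *m)
{
	return m->enqueued_at + m->threshold;
}

/*
 * Time at which each variant reports a violation, given that
 * sched_switch_in occurs at switch_in_at. UINT64_MAX means "never".
 */
static uint64_t guard_detects_at(const struct enq_model *m,
				 uint64_t switch_in_at)
{
	return guard_holds(m, switch_in_at) ? UINT64_MAX : switch_in_at;
}

static uint64_t invariant_detects_at(const struct enq_model *m,
				     uint64_t switch_in_at)
{
	uint64_t deadline = invariant_deadline(m);

	/* Leaving the state before the deadline cancels the timer. */
	return switch_in_at < deadline ? UINT64_MAX : deadline;
}
```

With enqueued_at = 100 and threshold = 50, a sched_switch_in at 400 is
reported at 400 by the guard variant and at 150 by the state constraint
variant. The gap between the two is the reaction latency mentioned above.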

> Does this also need to be _HARD on RT for the monitor to work?

That might be something we want configurable, actually. I assume the
more aggressive the timer is, the more overhead it will have on the
system.
Some monitors might be fine with a bit of latency.

For example in the deadline case, I believe, the monitor is not
supposed to fix anything, but merely report violations. So we don't
really care about reacting on time, only about reacting at all.

I'm going to assess the overhead and see how to offer some more
configurability.
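
As a rough idea of what such a knob could look like, here is a
user-space sketch of the mode selection. Nothing here is in the patch:
the SKETCH_MODE_* bits are local stand-ins for the kernel's
HRTIMER_MODE_REL, HRTIMER_MODE_PINNED and HRTIMER_MODE_HARD flags, and
struct sketch_timer_cfg with its hard_expiry field is a hypothetical
per-monitor option.

```c
#include <stdbool.h>

/* Stand-in flag bits for this sketch only, not the kernel definitions. */
enum sketch_mode_bits {
	SKETCH_MODE_REL    = 0x1,	/* expiry is relative to now */
	SKETCH_MODE_PINNED = 0x2,	/* keep the timer on the local CPU */
	SKETCH_MODE_HARD   = 0x8,	/* expire in hard interrupt context */
};

/* Hypothetical per-monitor configuration. */
struct sketch_timer_cfg {
	bool per_cpu;		/* monitor is per-CPU */
	bool hard_expiry;	/* monitor cannot tolerate expiry latency */
};

/* Build the timer mode from the monitor configuration. */
static unsigned int sketch_pick_mode(const struct sketch_timer_cfg *cfg)
{
	unsigned int mode = SKETCH_MODE_REL;

	if (cfg->per_cpu)
		mode |= SKETCH_MODE_PINNED;
	if (cfg->hard_expiry)
		mode |= SKETCH_MODE_HARD;
	return mode;
}
```

A monitor that only reports violations would leave hard_expiry unset
and accept some expiry latency, while one that needs a timely reaction
would set it and pay the extra cost.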

Thanks,
Gabriele