Message-ID: <d907bcaa640a44ecb739b3253df49e16bcd4e38d.camel@redhat.com>
Date: Tue, 19 Aug 2025 12:53:29 +0200
From: Gabriele Monaco <gmonaco@...hat.com>
To: Juri Lelli <juri.lelli@...hat.com>
Cc: linux-kernel@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>, 
 Masami Hiramatsu <mhiramat@...nel.org>, linux-trace-kernel@...r.kernel.org,
 Nam Cao <namcao@...utronix.de>, Tomas Glozar <tglozar@...hat.com>, Juri
 Lelli <jlelli@...hat.com>, Clark Williams <williams@...hat.com>,  John
 Kacur <jkacur@...hat.com>
Subject: Re: [RFC PATCH 08/17] rv: Add Hybrid Automata monitor type



On Tue, 2025-08-19 at 12:08 +0200, Juri Lelli wrote:
> On 19/08/25 11:48, Gabriele Monaco wrote:
> > That's a good point, I need to check the actual overhead..
> > 
> > One thing to note is that this timer is used only on state
> > constraints,
> > one could write roughly the same monitor like this:
> > 
> >   +------------------------------------------+
> >   |                 enqueued                 |
> >   +------------------------------------------+
> >     |
> >     | sched_switch_in;clk < threshold_jiffies
> >     v
> > 
> > or like this:
> > 
> >   +------------------------------------------+
> >   |                 enqueued                 |
> >   |         clk < threshold_jiffies          |
> >   +------------------------------------------+
> >     |
> >     | sched_switch_in
> >     v
> > 
> > the first won't fail as soon as the threshold passes, but will
> > eventually fail when the sched_switch_in event occurs. This won't
> > use a timer at all (well, mostly, some calls are still made to keep
> > the code general, I could improve that if it matters).
> > 
> > Depending on the monitor, the first option could be a lower
> > overhead yet valid alternative to the second, if it's guaranteed
> > sched_switch_in will eventually come and reaction latency isn't an
> > issue.
> 
> Right, as in the first example you have in the docs. I was thinking
> it would be cool to possibly replace the hung task monitor with this
> one, but again we would need to check for overhead, as the definition
> that does expect a switch_in to eventually happen wouldn't work in
> this case.

Yeah if the overhead is really high that might be an option. Although
the monitor might become a bit pointless then: if a task starves
forever, no error will be reported.

If that's a real issue, I might look at other places to check the
constraints (the tick, perhaps).
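
To make the trade-off between the two formulations concrete, here is a
toy user-space model (not kernel code, and not how the RV monitors are
implemented). The function names and the use of plain integers as
timestamps are assumptions made for illustration only. It shows when
each variant would report a violation, including the starvation case
where sched_switch_in never arrives:

```python
# Toy model only: times are abstract integers, names are hypothetical.

def guard_variant(enqueue_time, switch_in_time, threshold):
    """Constraint on the edge: checked only when sched_switch_in occurs.

    Returns the time at which a violation is reported, or None.
    """
    if switch_in_time is None:
        # The task starves forever: no event, so nothing checks the clock.
        return None
    if switch_in_time - enqueue_time < threshold:
        return None  # guard holds, the transition is valid
    return switch_in_time  # violation reported late, at the event


def invariant_variant(enqueue_time, switch_in_time, threshold):
    """Constraint on the state: a timer armed on entry fires at the threshold.

    Returns the time at which a violation is reported, or None.
    """
    deadline = enqueue_time + threshold
    if switch_in_time is not None and switch_in_time < deadline:
        return None  # left the state in time, the timer is cancelled
    return deadline  # timer expires, violation reported at the deadline
```

With a threshold of 10, a switch-in at time 15 is reported at 15 by the
guard variant and at 10 by the invariant variant. With no switch-in at
all, only the invariant variant reports anything.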

> > > Does this also need to be _HARD on RT for the monitor to work?
> > 
> > That might be something we want configurable actually.. I assume
> > the more aggressive the timer is, the more overhead it will have on
> > the system.
> > Some monitors might be fine with a bit of latency.
> 
> It might not only be about latency, as if the callback timer is not
> serviced in case of starvation (if it's not hard) then the monitor
> won't probably react and we won't be able to rely on it.

I think I hit that in some conditions and changed ha_cancel_timer()
to handle this case.

After leaving a state that armed a timer, we always cancel the timer
(to avoid it expiring outside that state). At that point, if the timer
had already expired but its callback didn't run, the monitor fails.

Again, if the monitor never leaves the state, we'd never report a
failure, but I'm not sure how common that is.
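
The check on cancellation can be sketched as the toy model below. This
is not the actual ha_cancel_timer() code; the class, its methods and
the integer timestamps are hypothetical and only mirror the behaviour
described above:

```python
# Toy model only: not the kernel implementation, all names are hypothetical.

class StateTimer:
    def __init__(self):
        self.deadline = None       # None means the timer is not armed
        self.callback_ran = False

    def arm(self, now, timeout):
        """Called when entering a state that has a clock constraint."""
        self.deadline = now + timeout
        self.callback_ran = False

    def fire(self):
        """Timer callback; under starvation it may never get to run."""
        self.callback_ran = True

    def cancel(self, now):
        """Called when leaving the state.

        Returns True if the monitor should fail here, that is, the
        deadline has passed but the callback never ran.
        """
        if self.deadline is None:
            return False
        missed = now >= self.deadline and not self.callback_ran
        self.deadline = None
        return missed
```

In this model, cancel() returns True only when the deadline passed
without the callback running; if the callback did run, the failure is
assumed to have been reported from there already.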

Thanks,
Gabriele

