linux-kernel - Re: [PATCH v3 12/17] sched: Adapt sched tracepoints for RV task model


Message-ID: <1da3336a234fddcea9dc91f5ef9943e7ccecc07e.camel@redhat.com>
Date: Wed, 16 Jul 2025 18:14:40 +0200
From: Gabriele Monaco <gmonaco@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
 Steven Rostedt <rostedt@...dmis.org>, Masami Hiramatsu <mhiramat@...nel.org>,
 linux-trace-kernel@...r.kernel.org, Nam Cao <namcao@...utronix.de>,
 Tomas Glozar <tglozar@...hat.com>, Juri Lelli <jlelli@...hat.com>,
 Clark Williams <williams@...hat.com>, John Kacur <jkacur@...hat.com>
Subject: Re: [PATCH v3 12/17] sched: Adapt sched tracepoints for RV task
 model

On Wed, 2025-07-16 at 17:31 +0200, Peter Zijlstra wrote:
> On Wed, Jul 16, 2025 at 04:38:36PM +0200, Gabriele Monaco wrote:
> 
> > So as you said, we can still reconstruct what happened from the
> > trace, but the model suddenly needs more states and more events.
> 
> So given a sequence like:
> 
>   trace_sched_enter_tp();
>   { trace_irq_disable();
>     **irq_entry();**
>     **irq_exit();**
>     trace_irq_enable(); } * Ni
>   trace_irq_disable();
>   { trace_sched_switch(); } * Nj
>   trace_irq_enable();
>   { trace_irq_disable();
>     **irq_entry();**
>     **irq_exit();**
>     trace_irq_enable(); } * Nk
>   trace_sched_exit_tp();
> 
> It becomes somewhat hard to figure out which exact IRQ disabled
> section the switch did not happen in (Nj == 0).
> 
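
To make the ambiguity in the sequence above concrete, here is a minimal
user-space sketch in C. It is not kernel code: the event enum, struct summary
and summarize() are hypothetical names invented for illustration, and the trace
is just an array of events named after the tracepoints in the sequence. The
function counts IRQ-disabled sections between sched_enter and sched_exit and
reports which one held the switch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event identifiers, named after the tracepoints quoted above. */
enum event {
	EV_SCHED_ENTER,
	EV_SCHED_EXIT,
	EV_IRQ_DISABLE,
	EV_IRQ_ENABLE,
	EV_SCHED_SWITCH,
};

struct summary {
	int sections;       /* IRQ-disabled sections seen between enter and exit */
	int switch_section; /* 1-based index of the section with the switch, 0 if none */
};

/* Walk a recorded trace and summarize the scheduling call it describes. */
static struct summary summarize(const enum event *trace, size_t len)
{
	struct summary s = { 0, 0 };
	int in_sched = 0, irqs_off = 0;

	assert(trace != NULL || len == 0);

	for (size_t i = 0; i < len; i++) {
		switch (trace[i]) {
		case EV_SCHED_ENTER:
			in_sched = 1;
			break;
		case EV_SCHED_EXIT:
			in_sched = 0;
			break;
		case EV_IRQ_DISABLE:
			if (in_sched) {
				irqs_off = 1;
				s.sections++;
			}
			break;
		case EV_IRQ_ENABLE:
			irqs_off = 0;
			break;
		case EV_SCHED_SWITCH:
			if (in_sched && irqs_off)
				s.switch_section = s.sections;
			break;
		}
	}
	return s;
}
```

With a switch in the trace, switch_section identifies the section that belongs
to the scheduler. With Nj == 0 the function only sees a run of identical
irq_disable/irq_enable pairs and returns switch_section == 0: nothing in these
events alone says which pair was the scheduler's own section and which came
from interrupts.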
> > If we could directly tell whether interrupts were disabled manually
> > or from an actual interrupt, that wouldn't be necessary, for
> > instance (as in the original model by Daniel).
> 
> Hmm.. we do indeed appear to trace the IRQ state before adding
> HARDIRQ_OFFSET to preempt_count(). Yes, that complicates things a
> little.
> 
> So... it *might* be possible to lift lockdep_hardirq_enter() to
> before we start tracing. But then you're stuck to running with
> lockdep enabled -- I'm thinking that's not ideal, given those other
> patches you sent.
> 
> I'm going to go on holidays soon, but I've made a note to see if we
> can lift setting HARDIRQ_OFFSET before we start tracing. IIRC the
> current order is because setting HARDIRQ_OFFSET is using
> preempt_count_add() which can be instrumented itself.
> 

Yeah I wondered if that was something perhaps required by RCU or
something else (some calls are in the way). NMIs have it set during the
tracepoints, for instance.

Thanks again and enjoy your holiday!

Gabriele

> But we could use __preempt_count_add() instead, then we lose the
> tracing from setting HARDIRQ_OFFSET, but I don't think that is a
> problem. We already get the latency from the IRQ tracepoints after
> all.
> 
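
The ordering question discussed here can be modelled in user space. The sketch
below is an assumption-laden toy, not the kernel's entry code: count_model
stands in for preempt_count(), raw_count_add() and traced_count_add() stand in
for __preempt_count_add() and preempt_count_add(), and HARDIRQ_OFFSET_MODEL is
an illustrative constant. It only shows what a tracepoint would observe
depending on whether the offset is added before or after tracing.

```c
#include <assert.h>
#include <stdbool.h>

#define HARDIRQ_OFFSET_MODEL (1u << 16) /* illustrative value for this model */

static unsigned int count_model; /* stand-in for preempt_count() */
static int traced_adds;          /* how often the instrumented add ran */

/* Models __preempt_count_add(): plain add, no instrumentation. */
static void raw_count_add(unsigned int val)
{
	count_model += val;
}

/* Models preempt_count_add(): the add itself can be traced. */
static void traced_count_add(unsigned int val)
{
	traced_adds++;
	raw_count_add(val);
}

static bool in_hardirq_model(void)
{
	return (count_model & HARDIRQ_OFFSET_MODEL) != 0;
}

/* Order described in the thread: trace the IRQ state, then add the offset. */
static bool entry_trace_then_offset(void)
{
	bool seen = in_hardirq_model(); /* what the tracepoint would observe */

	traced_count_add(HARDIRQ_OFFSET_MODEL);
	return seen;
}

/* Proposed order: add the offset with the raw helper, then trace. */
static bool entry_offset_then_trace(void)
{
	raw_count_add(HARDIRQ_OFFSET_MODEL);
	return in_hardirq_model();
}

static void irq_exit_model(void)
{
	assert(count_model >= HARDIRQ_OFFSET_MODEL);
	count_model -= HARDIRQ_OFFSET_MODEL;
}
```

In this model the first ordering reports "not in hardirq" to the tracepoint,
while the second reports "in hardirq" and never goes through the instrumented
add, which mirrors the trade-off described in the quoted text.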
> > I get your point why we don't really need the additional
> > tracepoint, but some arguments giving more context come almost
> > for free.
> 
> Right. So please always try and justify adding tracepoints.