Date:   Wed, 02 Mar 2022 11:46:11 +0100
From:   Nicolas Saenz Julienne <nsaenzju@...hat.com>
To:     paulmck@...nel.org
Cc:     rostedt@...dmis.org, bristot@...nel.org, mingo@...hat.com,
        linux-kernel@...r.kernel.org, mtosatti@...hat.com
Subject: Re: [PATCH] tracing/osnoise: Force quiescent states while tracing

On Tue, 2022-03-01 at 09:56 -0800, Paul E. McKenney wrote:
> On Tue, Mar 01, 2022 at 11:00:08AM +0100, Nicolas Saenz Julienne wrote:
> > On Mon, 2022-02-28 at 14:11 -0800, Paul E. McKenney wrote:
> > > On Mon, Feb 28, 2022 at 03:14:23PM +0100, Nicolas Saenz Julienne wrote:
> > > > At the moment running osnoise on an isolated CPU and a PREEMPT_RCU
> > > > kernel might have the side effect of extending grace periods too much.
> > > > This will eventually entice RCU to schedule a task on the isolated CPU
> > > > to end the overly extended grace period, adding unwarranted noise to the
> > > > CPU being traced in the process.
> 
> Ah, I misread the above paragraph.  Apologies!
> 
> Nevertheless, could you please add something explicit to the effect that
> RCU is completing grace periods as required?

Yes, of course.

[...]
> > > o	At about 30 milliseconds into the grace period, RCU forces an
> > > 	explicit context switch on the wayward CPU.  This should get
> > > 	the CPU's attention even in CONFIG_PREEMPT=y kernels.
> > > 
> > > So what is happening for you instead?
> > 
> > Well, that's exactly what I'm seeing, but it doesn't play well with osnoise.
> 
> Whew!!!  ;-)
> 
> > Here's a simplified view of what the tracer does:
> > 
> > 	time1 = get_time();
> > 	while(1) {
> > 		time2 = get_time();
> > 		if (time2 - time1 > threshold)
> > 			trace_noise();
> > 		cond_resched();
> > 		time1 = time2;
> > 	}
> > 
> > This is pinned to a specific CPU, and in the most extreme cases is expected to
> > take 100% of CPU time. Eventually, some SMI, NMI/interrupt, or process
> > execution will trigger the threshold, and osnoise will provide some nice traces
> > explaining what happened.
> > 
> > RCU forcing a context switch on the wayward CPU is introducing unwarranted
> > noise as it's triggered by the fact we're measuring and wouldn't happen
> > otherwise.
> > 
> > If this were user-space, we'd be in an EQS, which would make this problem go
> > away. An option would be mimicking this behaviour (assuming irq entry/exit code
> > did the right thing):
> > 
> > 	rcu_eqs_enter(); <--
> > 	time1 = get_time();
> > 	while(1) {
> > 		time2 = get_time();
> > 		if (time2 - time1 > threshold)
> > 			trace_noise();
> > 		rcu_eqs_exit(); <--
> > 		cond_resched();
> > 		rcu_eqs_enter(); <--
> > 		time1 = time2;
> > 	}
> > 
> > But given the tight loop, this isn't much different from what I'm proposing at
> > the moment, is it? rcu_momentary_dyntick_idle() just emulates a really fast
> > EQS entry/exit.
> 
> And that is in fact exactly what rcu_momentary_dyntick_idle() was
> intended for:
> 
> Acked-by: Paul E. McKenney <paulmck@...nel.org>

Thanks!

-- 
Nicolás Sáenz
