lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 29 Jul 2019 22:15:06 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Eiichi Tsukata <devel@...ukata.com>
Cc:     joel@...lfernandes.org, paulmck@...ux.vnet.ibm.com,
        tglx@...utronix.de, peterz@...radead.org, mingo@...hat.com,
        fweisbec@...il.com, luto@...capital.net,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] tracing: Prevent RCU EQS breakage in preemptirq events

On Tue, 30 Jul 2019 11:00:42 +0900
Eiichi Tsukata <devel@...ukata.com> wrote:

> Thanks for comments.
> 
> On 2019/07/30 0:21, Steven Rostedt wrote:
> > On Mon, 29 Jul 2019 10:07:34 +0900
> > Eiichi Tsukata <devel@...ukata.com> wrote:
> >   
> >> If context tracking is enabled, causing page fault in preemptirq
> >> irq_enable or irq_disable events triggers the following RCU EQS warning.
> >>
> >> Reproducer:
> >>
> >>   // CONFIG_PREEMPTIRQ_EVENTS=y
> >>   // CONFIG_CONTEXT_TRACKING=y
> >>   // CONFIG_RCU_EQS_DEBUG=y
> >>   # echo 1 > events/preemptirq/irq_disable/enable
> >>   # echo 1 > options/userstacktrace  
> > 
> > So the problem is only with userstacktrace enabled?  
> 
> It can happen when tracing code causes page fault in preemptirq events.
> For example, the following perf command also hit the warning:
> 
>   # perf record -e 'preemptirq:irq_enable' -g ls

Again,

That's not a irq trace event issue, that's a stack trace issue.

> 
> 
> >>  
> >>  __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
> >>  {
> >> +	enum ctx_state prev_state;
> >> +
> >>  	if (this_cpu_read(tracing_irq_cpu)) {
> >> -		if (!in_nmi())
> >> +		if (!in_nmi()) {  
> > 
> > This is a very high fast path (for tracing irqs off and such). Instead
> > of adding a check here for a case that is seldom used (userstacktrace
> > and tracing irqs on/off). Move this to surround the userstack trace
> > code.
> > 
> > -- Steve  
> 
> If the problem was only with userstacktrace, it will be reasonable to
> surround only the userstack unwinder. But the situation is similar to
> the previous "tracing vs CR2" case. As Peter taught me in
> https://lore.kernel.org/lkml/20190708074823.GV3402@hirez.programming.kicks-ass.net/
> there are some other codes likely to to user access.
> So I surround preemptirq events earlier.

I disagree. The issue is with the attached callbacks that call
something (a stack unwinder) that can fault.

This is called whenever irqs are disabled. I say we surround the
culprit (the stack unwinder or stack trace) and not punish the irq
enable/disable events.

So NAK on this patch.

-- Steve

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ