linux-kernel - Re: [BUG] stack tracing causes: kernel/module.c:271 module_assert_mutex_or

Message-ID: <20170405145425.357937b8@gandalf.local.home>
Date:   Wed, 5 Apr 2017 14:54:25 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:     LKML <linux-kernel@...r.kernel.org>
Subject: Re: [BUG] stack tracing causes: kernel/module.c:271
 module_assert_mutex_or_preempt

On Wed, 5 Apr 2017 10:59:25 -0700
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> wrote:

> > Note, this has nothing to do with trace_rcu_dyntick(). It's the
> > function tracer tracing inside RCU, calling the stack tracer to record
> > a new stack if it sees it's larger than any stack before. All I need is
> > a way to tell the stack tracer to not record a stack if it is in this
> > RCU critical section.
> > 
> > If you can add a "in_rcu_critical_section()" function, that the stack
> > tracer can test, and simply exit out like it does if in_nmi() is set,
> > that would work too. Below is my current work around.  
> 
> Except that the rcu_irq_enter() would already have triggered the bug
> that was (allegedly) fixed by my earlier patch.  So, yes, the check for
> rcu_is_watching() would work around this bug, but the hope is that
> with my earlier fix, this workaround would not be needed.

Note, if I had an "in_rcu_critical_section()" I wouldn't need to call
rcu_irq_enter(). I could fall out before that. My current workaround
does the check anyway; even though it breaks things, it would hopefully
fix them again, as it calls rcu_irq_exit() immediately.

What I would have is:

	if (in_rcu_critical_section())
		goto out;

	rcu_irq_enter();

which would probably be the easiest fix.
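
A minimal userspace sketch of that ordering, not kernel code. Here
in_rcu_critical_section() is the hypothetical helper proposed above, and
rcu_irq_enter()/rcu_irq_exit() are stubbed as counters purely so the
control flow can be compiled and exercised; check_stack_mock() and the
state variables are illustrative names as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for kernel state; all hypothetical, for illustration only. */
static bool in_rcu_section;       /* what the proposed helper would report */
static int  rcu_irq_enter_calls;  /* counts calls to the stubbed enter */
static int  rcu_irq_exit_calls;   /* counts calls to the stubbed exit */
static int  stacks_recorded;      /* counts "new max stack" recordings */

static bool in_rcu_critical_section(void) { return in_rcu_section; }
static void rcu_irq_enter(void) { rcu_irq_enter_calls++; }
static void rcu_irq_exit(void) { rcu_irq_exit_calls++; }

/* Mock of the stack tracer's "record a new max stack" path. */
static void check_stack_mock(void)
{
	if (in_rcu_critical_section())
		goto out;	/* bail out before touching RCU at all */

	rcu_irq_enter();
	stacks_recorded++;	/* a real tracer would save the stack here */
	rcu_irq_exit();
out:
	return;
}

/* Exercises both paths: inside and outside the RCU critical section. */
static void self_check(void)
{
	in_rcu_section = true;
	check_stack_mock();
	assert(rcu_irq_enter_calls == 0 && stacks_recorded == 0);

	in_rcu_section = false;
	check_stack_mock();
	assert(rcu_irq_enter_calls == 1 && rcu_irq_exit_calls == 1);
	assert(stacks_recorded == 1);
}
```

The point of the early exit is that the stubbed rcu_irq_enter() is never
reached while the flag is set, which is the property the real helper
would need to provide.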

> 
> So could you please test my earlier patch?

I could, but it wouldn't tell me anything immediately. It's a hard race
to hit. I could never hit it when I tried, but it would appear to
hit immediately when testing other things :-p

Remember, it only triggers when a new max stack size is hit. The bug
happens when that new max stack size is in the rcu critical section.

I guess I could force it to trigger by inserting a call in your code
that clears the max stack size.
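
As a rough illustration of why clearing the max would force the path,
here is a small userspace mock. All names in it (max_stack_size,
record_if_new_max, reset_max_stack) are hypothetical and only model the
"record on new max" condition described above, not the actual tracer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Largest stack depth seen so far (hypothetical stand-in). */
static size_t max_stack_size;

/* Returns true only when this_size is a new maximum and gets recorded. */
static bool record_if_new_max(size_t this_size)
{
	if (this_size <= max_stack_size)
		return false;	/* common case: nothing to record */

	max_stack_size = this_size;
	return true;		/* rare case: the path where the bug lives */
}

/* Clearing the max makes the next nonzero depth take the recording path. */
static void reset_max_stack(void)
{
	max_stack_size = 0;
}

static void self_check(void)
{
	bool first  = record_if_new_max(512);	/* new max: recorded */
	bool second = record_if_new_max(256);	/* smaller: skipped */

	reset_max_stack();
	bool third  = record_if_new_max(256);	/* recorded again after reset */

	assert(first && !second && third);
}
```

Calling something like reset_max_stack() from inside the RCU code would
make the very next traced function there look like a new max, so the
race would no longer depend on luck.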

-- Steve

> 
> This patch does not conflict with anything on -rcu, so you could
> carry it if that helps.
>