linux-kernel - Re: cgroup trace events acquire sleeping locks

Message-ID: <20180709163010.257a08a0@gandalf.local.home>
Date:   Mon, 9 Jul 2018 16:30:10 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:     tglx@...utronix.de, Clark Williams <williams@...hat.com>,
        linux-rt-users@...r.kernel.org, linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: cgroup trace events acquire sleeping locks

On Mon, 9 Jul 2018 22:22:15 +0200
Sebastian Andrzej Siewior <bigeasy@...utronix.de> wrote:

> On 2018-07-09 15:01:54 [-0400], Steven Rostedt wrote:
> > > which is the trace_cgroup_rmdir() trace event in cgroup_rmdir(). The
> > > trace event invokes cgroup_path() which acquires a spin_lock_t and this
> > > is invoked within a preempt_disable()ed section.   
> > 
> > Correct. And I wish no trace event took spin locks.  
> 
> Is there an easy way to detect this? I mean instead of hitting the trace
> event with debug enabled and reviewing each of them.

Hmm, good question. I could possibly make all the tracepoint code into
its own section. And then look to see if any spin locks exist in them.
That wouldn't be trivial to implement though.

> 
> > > It says "Preemption disabled at" migrate_enable() but this is not true.
> > > A printk() just before the lock reports preempt_count() of two and
> > > sometimes one. I *think*
> > > - one is from rcu_read_lock_sched_notrace() in __DO_TRACE()
> > > - the second is from preempt_disable_notrace() in ring_buffer_lock_reserve()
> > > 
> > > I would prefer not to turn kernfs_rename_lock into raw_spin_lock_t. We
> > > had a similar problem with an i915 trace event which eventually vanished
> > > (and before I just disabled it).
> > > 
> > > So how likely is it that we can use rcu_read_lock() in __DO_TRACE()?
> > 
> > Not very.  
> 
> Is there a reason for this? I don't think this is documented. I changed
> it to the "normal" RCU read section and it appeared to work :)
> 

Well, there are tracepoints in the RCU code itself. Not sure how they would
react to that change.

> > > And how likely is it that the preempt_disable() in
> > > ring_buffer_lock_reserve() could be avoided while the trace event is
> > > invoked?
> > 
> > Even less likely. The design of the ring buffer is based on not being
> > able to be preempted.  
> 
> I was expecting this.
> 
> > > I guess nothing of this is easy peasy. Any suggestions?
> > >   
> > 
> > > One solution, albeit not so pretty, is to move the grabbing of the
> > > path outside the trace event. But this should work.
> 
> okay, wasn't aware of the trace_cgroup_##type##_enabled() magic. Yes,
> this should work. Do you mind posting this upstream?

Sure.
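
For readers following the thread, here is a rough userspace sketch of the
idea discussed above: check whether the event is enabled, resolve the path
while still preemptible, and only then fire the event. It is not the kernel
patch. Every name in it (preempt_depth, sleeping_lock_acquire(),
mock_cgroup_path(), mock_trace_event(), trace_rmdir_with_path()) is a
hypothetical stand-in, and the "lock" merely records whether it was taken
while the mock preempt counter was non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* All names here are hypothetical userspace stand-ins, not kernel APIs. */
#define PATH_LEN 64

static int  preempt_depth;            /* mock of preempt_count() */
static bool event_enabled;            /* mock of trace_cgroup_rmdir_enabled() */
static int  atomic_violations;        /* sleeping lock taken with preemption off */
static char path_buf[PATH_LEN];       /* path is built here before the event fires */
static char last_event[PATH_LEN + 16]; /* what the mock ring buffer recorded */

/* Mock of a lock that may sleep on RT: illegal while preemption is disabled. */
static void sleeping_lock_acquire(void)
{
	if (preempt_depth > 0)
		atomic_violations++;
}

static void sleeping_lock_release(void)
{
}

/* Mock of cgroup_path(): takes a sleeping lock internally to build the path. */
static void mock_cgroup_path(const char *name, char *buf, size_t len)
{
	assert(buf && len > 0);
	sleeping_lock_acquire();
	snprintf(buf, len, "/%s", name);
	sleeping_lock_release();
}

/* Mock of the trace event body: runs with preemption disabled, takes no lock. */
static void mock_trace_event(const char *path)
{
	preempt_depth++;
	snprintf(last_event, sizeof(last_event), "rmdir path=%s", path);
	preempt_depth--;
}

/* The pattern: do the path lookup outside the event, only if it is enabled. */
static void trace_rmdir_with_path(const char *name)
{
	if (!event_enabled)
		return;                 /* no path work when the event is off */

	sleeping_lock_acquire();        /* protects path_buf, still preemptible */
	mock_cgroup_path(name, path_buf, sizeof(path_buf));
	mock_trace_event(path_buf);     /* event only copies a ready-made string */
	sleeping_lock_release();
}
```

The point of the ordering is that both lock acquisitions happen before the
mock preempt counter is raised, so atomic_violations stays at zero, and the
enabled check keeps the extra path lookup off the fast path when nobody is
tracing.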

-- Steve