linux-kernel - Re: [PATCH RT 3.18] ring-buffer: Mark irq_work as HARD

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150416111041.66043164@gandalf.local.home>
Date:	Thu, 16 Apr 2015 11:10:41 -0400
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Jan Kiszka <jan.kiszka@...mens.com>
Cc:	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	RT <linux-rt-users@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RT 3.18] ring-buffer: Mark irq_work as HARD_IRQ to
 prevent deadlocks

On Thu, 16 Apr 2015 16:28:58 +0200
Jan Kiszka <jan.kiszka@...mens.com> wrote:

> On 2015-04-16 16:26, Sebastian Andrzej Siewior wrote:
> > On 04/16/2015 04:06 PM, Jan Kiszka wrote:
> >> ftrace may trigger rb_wakeups while holding pi_lock which will also be
> >> requested via trace_...->...->ring_buffer_unlock_commit->...->
> >> irq_work_queue->raise_softirq->try_to_wake_up. This quickly causes
> >> deadlocks when trying to use ftrace under -rt.
> >>
> >> Resolve this by marking the ring buffer's irq_work as HARD_IRQ.
> >>
> >> Signed-off-by: Jan Kiszka <jan.kiszka@...mens.com>
> >> ---
> >>
> >> I'm not yet sure if this doesn't push work into hard-irq context that
> >> is better not done there on -rt.
> > 
> > everything should be done in the soft-irq.
> > 
> >>
> >> I'm also not sure if there aren't more such cases, given that -rt turns
> >> the default irq_work wakeup policy around. But maybe we are lucky.
> > 
> > The only thing that is getting done in the hardirq is the FULL_NO_HZ
> > thingy. I would be _very_ glad if we could keep it that way.

tracing is special, even more so than NO_HZ_FULL, as it also traces
that as well (and even RCU). Tracing the kernel is like a debugger.
Ideally, it would not be part of the kernel, but just an external
observer. Without special hardware that is not the case, so we try to
be outside the main system as much as possible.

> 
> Then - to my current understanding - we need an NMI-safe trigger for
> soft-irq work. Is there anything like this existing already? Or can we
> still use the IPI-based kick without actually doing the work in hard-irq
> context?
> 

The reason why it uses irq_work() is because a simple wakeup can
deadlock the system if called by the tracing infrastructure (as we see
raise_softirq() does too).

But yeah, there's no real need to have the ring buffer irq work
handler run from hardirq context. The only requirement is that you can
not do the raise from the irq_work_queue call. If you want to have the
hardirq work handle do the raise softirq, that's fine. Perhaps that's
the solution? Have all irq_work_queue() always trigger the hard irq, but
the hard irq may just raise a softirq or it will call the handler
directly if IRQ_WORK_HARD_IRQ is set.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/