linux-kernel - Re: [PATCH PREEMPT RT] rt-mutex: fix deadlock in device mapper

Message-ID: <alpine.LRH.2.02.1711151023150.5420@file01.intranet.prod.int.rdu2.redhat.com>
Date:   Wed, 15 Nov 2017 12:08:32 -0500 (EST)
From:   Mikulas Patocka <mpatocka@...hat.com>
To:     Sebastian Siewior <bigeasy@...utronix.de>
cc:     Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH PREEMPT RT] rt-mutex: fix deadlock in device mapper



On Tue, 14 Nov 2017, Sebastian Siewior wrote:

> [ minimize CC ]
> 
> On 2017-11-13 12:56:53 [-0500], Mikulas Patocka wrote:
> > Hi
> > 
> > I'm submitting this patch for the CONFIG_PREEMPT_RT patch. It fixes 
> > deadlocks in device mapper when real time preemption is used.
> 
> I ran into an EXT4 deadlock which is fixed by your patch. I think we had
> earlier issues which were duct-taped via
>    fs-jbd2-pull-your-plug-when-waiting-for-space.patch
> 
> What I observed from that deadlock I meant to debug is that I had one
> task owning a bit_spinlock (buffer lock) and wanting W i_data_sem and
> another task owning W i_data_sem and waiting for the same bit_spinlock.
> Here it was wb_writeback() vs vfs_truncate() (to keep it short).

So, send the stack traces to the VFS maintainers. This could deadlock on a 
non-RT kernel too.
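
As a rough illustration of the lock cycle quoted above, here is a small
user-space Python sketch (not kernel code). The helper find_cycle and the
two dictionaries are made up for illustration, and which of the two tasks
holds which lock is an assumption, since the report does not say:

```python
# Toy wait-for graph for the quoted scenario; not kernel code.
# holds[task] = lock the task owns, wants[task] = lock it is blocked on.
# ASSUMPTION: the assignment of locks to tasks below is illustrative only.
holds = {"wb_writeback": "buffer_lock", "vfs_truncate": "i_data_sem"}
wants = {"wb_writeback": "i_data_sem", "vfs_truncate": "buffer_lock"}


def find_cycle(holds, wants):
    """Return the list of tasks forming a wait-for cycle, or None."""
    # Map each lock to the task that currently owns it.
    owner = {lock: task for task, lock in holds.items()}
    for start in holds:
        seen = []
        task = start
        # Follow "task waits for lock owned by next task" edges.
        while task is not None and task not in seen:
            seen.append(task)
            task = owner.get(wants.get(task))
        if task is not None:
            # We came back to a task already on the path: a cycle.
            return seen[seen.index(task):]
    return None


cycle = find_cycle(holds, wants)
```

Each task waits for a lock owned by the other, so neither can proceed. 
Nothing in this cycle depends on the lock implementation, which is why the 
same ordering could deadlock without RT.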

> Could you please explain how this locking is supposed to work

The scenario for the deadlock is explained here:
https://www.redhat.com/archives/dm-devel/2014-May/msg00089.html
It was fixed by commit d67a5f4b5947aba4bfe9a80a2b86079c215ca755.

> and why RT deadlocks while !RT does not? Or does !RT rely on the flush 
> in sched_submit_work() which is often skipped in RT because most locks 
> are rtmutex based?

Yes - that's the reason. The non-rt kernel uses rt mutexes very rarely 
(they are only used in kernel/rcu/tree.h, include/linux/i2c.h and 
kernel/futex.c).

If the non-RT kernel used rt mutexes in the I/O stack, the deadlock would 
also happen there.
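
A toy model of that difference, as a hedged user-space Python sketch (not 
kernel code). Task, block_on_lock, the flush_on_block flag and "bio1" are 
invented for illustration. They only mimic the idea discussed above: the 
ordinary blocking path flushes a task's queued I/O before it sleeps, while 
the rtmutex-based path often skips that flush:

```python
# Toy model only: every name below is hypothetical, not a kernel API.
class Task:
    def __init__(self):
        self.queued_bios = []  # I/O queued by this task, not yet submitted
        self.submitted = []    # I/O that actually reached the device

    def queue_bio(self, bio):
        self.queued_bios.append(bio)

    def block_on_lock(self, flush_on_block):
        # flush_on_block=True models the non-RT path: queued I/O is
        # flushed before the task goes to sleep.
        # flush_on_block=False models blocking on an rtmutex-based lock,
        # where the flush is skipped.
        if flush_on_block:
            self.submitted.extend(self.queued_bios)
            self.queued_bios.clear()


def deadlocks(flush_on_block):
    """True if the lock holder can never make progress in this model."""
    waiter = Task()
    waiter.queue_bio("bio1")
    waiter.block_on_lock(flush_on_block)  # waiter now sleeps on the lock
    # ASSUMPTION: the lock holder releases the lock only once "bio1"
    # has been submitted and completed.
    return "bio1" not in waiter.submitted
```

In this model deadlocks(True) is False and deadlocks(False) is True: the 
sleeping task keeps the I/O that the lock holder is waiting for.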

> Because if that is
> the case then we might get the deadlock upstream too, once we get an
> rtmutex somewhere in VFS (since I doubt it would be possible with a
> futex based test case).
> 
> > Mikulas
> 
> Sebastian

Mikulas