Message-ID: <alpine.LRH.2.02.1711211107550.12859@file01.intranet.prod.int.rdu2.redhat.com>
Date:   Tue, 21 Nov 2017 11:11:59 -0500 (EST)
From:   Mikulas Patocka <mpatocka@...hat.com>
To:     Mike Galbraith <efault@....de>
cc:     Thomas Gleixner <tglx@...utronix.de>,
        Sebastian Siewior <bigeasy@...utronix.de>,
        linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        linux-rt-users@...r.kernel.org
Subject: Re: [PATCH PREEMPT RT] rt-mutex: fix deadlock in device mapper



On Tue, 21 Nov 2017, Mike Galbraith wrote:

> On Tue, 2017-11-21 at 09:37 +0100, Thomas Gleixner wrote:
> > On Tue, 21 Nov 2017, Mike Galbraith wrote:
> > > On Mon, 2017-11-20 at 16:33 -0500, Mikulas Patocka wrote:
> > > > 
> > > > Is there some specific scenario where you need to call 
> > > > blk_schedule_flush_plug from rt_spin_lock_fastlock?
> > > 
> > > Excellent question.  What's the difference between not getting IO
> > > started because you meet a mutex with an rt_mutex under the hood, and
> > > not getting IO started because you meet a spinlock with an rt_mutex
> > > under the hood?  If just doing the mutex side puts this thing back to
> > > sleep, I'm happy.
> > 
> > Think about it from the mainline POV.
> > 
> > The spinlock cannot ever go to schedule and therefore cannot create a
> > situation which requires an unplug. The RT substitution of the spinlock
> > with a rtmutex based sleeping spinlock should not change that at all.
> > 
> > A regular mutex/rwsem etc. can and will unplug when the lock is contended
> > and the caller blocks. The RT conversion of these locks to rtmutex based
> > variants creates the problem: Unplug cannot be called when the task has
> > pi_blocked_on set because the unplug path might contend on yet another
> > lock. So unplugging in the slow path before setting pi_blocked_on is the
> > right thing to do.
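
(For reference, the mainline behaviour described above is the
sched_submit_work() check in kernel/sched/core.c, which looks roughly
like this, modulo kernel version:

	static inline void sched_submit_work(struct task_struct *tsk)
	{
		if (!tsk->state || tsk_is_pi_blocked(tsk))
			return;
		/*
		 * If we are going to sleep and we have plugged IO
		 * queued, make sure to submit it to avoid deadlocks.
		 */
		if (blk_needs_flush_plug(tsk))
			blk_schedule_flush_plug(tsk);
	}

so once pi_blocked_on is set, the scheduler no longer flushes the plug
on the task's behalf.)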
> 
> Sure.  What alarms me about IO deadlocks reappearing after all this
> time is that at the time I met them, I needed every last bit of that
> patchlet I showed to kill them, whether that should have been the case
> or not.  'course that tree contained roughly a zillion patches..
> 
> Whatever, time will tell if I'm properly alarmed, or merely paranoid :)
> 
> 	-Mike

So, drop the spinlock unplugging and leave only the mutex unplugging, 
reproduce the deadlock, and send the stack traces.
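
Something of the following shape (just a sketch, the exact fast-path
signatures in the -rt tree differ between releases): keep the unplug on
the mutex/rwsem substitution, before the slow path can set
pi_blocked_on, and drop it from the spinlock substitution entirely:

	/* spinlock substitute: no unplug at all */
	static inline void rt_spin_lock_fastlock(struct rt_mutex *lock,
				void (*slowfn)(struct rt_mutex *lock))
	{
		if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
			return;
		slowfn(lock);
	}

	/* mutex substitute: flush plugged IO before we can pi-block */
	static inline int rt_mutex_fastlock(struct rt_mutex *lock, int state,
				int (*slowfn)(struct rt_mutex *lock, int state))
	{
		if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
			return 0;

		if (blk_needs_flush_plug(current))
			blk_schedule_flush_plug(current);

		return slowfn(lock, state);
	}

If the deadlock still reproduces with only the mutex side unplugging,
the stack traces should show what the stuck task is actually blocked on.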

Mikulas
