lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 5 Oct 2017 17:23:20 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Daniel Vetter <daniel.vetter@...ll.ch>
cc:     Intel Graphics Development <intel-gfx@...ts.freedesktop.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Chris Wilson <chris@...is-wilson.co.uk>,
        Tvrtko Ursulin <tvrtko.ursulin@...el.com>,
        Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Sasha Levin <alexander.levin@...izon.com>,
        Daniel Vetter <daniel.vetter@...el.com>
Subject: Re: [PATCH] drm/i915: Preallocate mmu notifier to unbreak cpu hotplug
 deadlock

On Thu, 5 Oct 2017, Daniel Vetter wrote:

> 4.14-rc1 gained the fancy new cross-release support in lockdep, which
> seems to have uncovered a few more rules about what is allowed and
> isn't.
> 
> This one here seems to indicate that allocating a work-queue while
> holding mmap_sem is a no-go, so let's try to preallocate it.
> 
> Of course another way to break this chain would be somewhere in the
> cpu hotplug code, since this isn't the only trace we're finding now
> which goes through msr_create_device.

That's an interesting multi chain circular dependency which is related to
devfs.

Now the MSR device is not the only one which is creating that
dependency. We have CPUID and MCE as well. That's what a quick search in
x86 revealed. No idea whether there are more of those interesting bits and
pieces.

To fix it on the hotplug side we'd have to introduce extra state space
which is handled outside the cpuhotplug_rwsem region, but inside of the
cpu_maps_update_begin()/end() region, which has a nasty pile of
implications vs. the state registration/deregistration as this stuff can be
built as modules. So we'd need a complete set of new interfaces and
handling routines with some explicit restrictions on those state callbacks.

I rather prefer not to go there unless its unavoidable, which brings me to
the obvious question about the stop_machine() usage in the graphics code.

void i915_gem_set_wedged(struct drm_i915_private *dev_priv)
{
        stop_machine(__i915_gem_set_wedged_BKL, dev_priv, NULL);
}

The function name is telling. The machine is wedged and stop_machine()
might make it even more wedged when looking at this splat :)

The called function name is interesting as well. Is that _BKL postfix a
leftover of the BKL removal a couple of years ago?

Aside of that, is it really required to use stomp_machine() for this
synchronization? We certainly have less intrusive mechansisms than that.

Thanks,

	tglx







Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ