lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181213095814.GC21184@phenom.ffwll.local>
Date:   Thu, 13 Dec 2018 10:58:14 +0100
From:   Daniel Vetter <daniel@...ll.ch>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     Daniel Vetter <daniel.vetter@...ll.ch>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        ramalingam.c@...el.com,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Daniel Vetter <daniel.vetter@...el.com>
Subject: Re: [PATCH] drivers/base: use a worker for sysfs unbind

On Thu, Dec 13, 2018 at 10:38:14AM +0100, Rafael J. Wysocki wrote:
> On Mon, Dec 10, 2018 at 9:47 AM Daniel Vetter <daniel.vetter@...ll.ch> wrote:
> >
> > Drivers might want to remove some sysfs files, which needs the same
> > locks and ends up angering lockdep. Relevant snippet of the stack
> > trace:
> >
> >   kernfs_remove_by_name_ns+0x3b/0x80
> >   bus_remove_driver+0x92/0xa0
> >   acpi_video_unregister+0x24/0x40
> >   i915_driver_unload+0x42/0x130 [i915]
> >   i915_pci_remove+0x19/0x30 [i915]
> >   pci_device_remove+0x36/0xb0
> >   device_release_driver_internal+0x185/0x250
> >   unbind_store+0xaf/0x180
> >   kernfs_fop_write+0x104/0x190
> 
> Is the acpi_bus_unregister_driver() in acpi_video_unregister() the
> source of the lockdep unhappiness?

Yeah I guess I cut out too much of the lockdep splat. It complains about
kernfs_fop_write and kernfs_remove_by_name_ns acquiring the same lock
class. It's ofc not the same lock, so no real deadlock. Getting the
device_release_driver outside of the callchain under kernfs_fop_write,
which this patch does, "fixes" it. For "fixes" = shut up lockdep.

Other options:
- Anotate the recursion with the usual lockdep annotations. Potentially
  results in lockdep not catching real deadlocks (you can still have other
  loops closing the deadlock, maybe through some subsystem/bus lock).

- Rewrite kernfs_fop_write to drop the lock (optionally, for callbacks
  that know what they're doing), which should be fine if we refcount
  everything properly (bus, driver & device).

- Also note that probably the same bug exists on the bind sysfs interface,
  but we don't use that, so I don't care :-)

- Most of these issues are never visible in normal usage, since normally
  driver bind/unbind is done from a kthread or model_load/unload, neither
  of which is running in the context of that kernfs mutex kernfs_fop_write
  holds. That's why I think the task work is the best solution, since it
  changes the locking context of the unbind sysfs to match the locking
  context of module unload and hotunplug. Unfortunately that trick doesn't
  work for the bind sysfs file, since that way we can't thread the errno
  value back to userspace.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ