lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180430211754.GB54182@dtor-ws>
Date:   Mon, 30 Apr 2018 14:17:54 -0700
From:   Dmitry Torokhov <dmitry.torokhov@...il.com>
To:     Daniel Vetter <daniel.vetter@...ll.ch>
Cc:     Intel Graphics Development <intel-gfx@...ts.freedesktop.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Daniel Vetter <daniel.vetter@...el.com>,
        Benjamin Tissoires <benjamin.tissoires@...hat.com>,
        Arvind Yadav <arvind.yadav.cs@...il.com>,
        Stephen Lyons <slysven@...ginmedia.com>,
        linux-input@...r.kernel.org
Subject: Re: [PATCH] input/psmouse: Don't hold the mutex while calling
 ->disconnect

Hi Daniel,

On Mon, Apr 30, 2018 at 09:56:49PM +0200, Daniel Vetter wrote:
> At least trackpoint_disconnect wants to remove some sysfs files, and
> we can't remove sysfs files while holding psmouse_mutex:
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.16.0-rc5-g613eb885b69e-drmtip_1+ #1 Tainted: G     U
> ------------------------------------------------------
> kworker/0:3/102 is trying to acquire lock:
>  (kn->count#130){++++}, at: [<000000009679748b>] kernfs_remove_by_name_ns+0x3b/0x80
> 
> but task is already holding lock:
>  (psmouse_mutex){+.+.}, at: [<0000000014f44bcc>] psmouse_disconnect+0x62/0x160
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (psmouse_mutex){+.+.}:
>        psmouse_attr_set_helper+0x28/0x140
>        kernfs_fop_write+0xfe/0x180
>        __vfs_write+0x1e/0x130
>        vfs_write+0xbd/0x1b0
>        SyS_write+0x40/0xa0
>        do_syscall_64+0x65/0x1a0
>        entry_SYSCALL_64_after_hwframe+0x42/0xb7
> 
> -> #0 (kn->count#130){++++}:
>        __kernfs_remove+0x243/0x2b0
>        kernfs_remove_by_name_ns+0x3b/0x80
>        remove_files.isra.0+0x2b/0x60
>        sysfs_remove_group+0x38/0x80
>        sysfs_remove_groups+0x24/0x40
>        trackpoint_disconnect+0x2c/0x50
>        psmouse_disconnect+0x8f/0x160
>        serio_disconnect_driver+0x28/0x40
>        serio_driver_remove+0xc/0x10
>        device_release_driver_internal+0x15b/0x230
>        serio_handle_event+0x1c8/0x260
>        process_one_work+0x215/0x620
>        worker_thread+0x48/0x3a0
>        kthread+0xfb/0x130
>        ret_from_fork+0x3a/0x50
> 
> other info that might help us debug this:
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(psmouse_mutex);
>                                lock(kn->count#130);
>                                lock(psmouse_mutex);
>   lock(kn->count#130);
> 
>  *** DEADLOCK ***
> 
> 6 locks held by kworker/0:3/102:
>  #0:  ((wq_completion)"events_long"){+.+.}, at: [<000000002e408bfa>] process_one_work+0x191/0x620
>  #1:  (serio_event_work){+.+.}, at: [<000000002e408bfa>] process_one_work+0x191/0x620
>  #2:  (serio_mutex){+.+.}, at: [<00000000c8a49847>] serio_handle_event+0x23/0x260
>  #3:  (&dev->mutex){....}, at: [<00000000b55eee75>] device_release_driver_internal+0x2f/0x230
>  #4:  (&serio->drv_mutex){+.+.}, at: [<000000009719f997>] serio_disconnect_driver+0x16/0x40
>  #5:  (psmouse_mutex){+.+.}, at: [<0000000014f44bcc>] psmouse_disconnect+0x62/0x160
> 
> stack backtrace:
> CPU: 0 PID: 102 Comm: kworker/0:3 Tainted: G     U           4.16.0-rc5-g613eb885b69e-drmtip_1+ #1
> Hardware name: LENOVO 74591P0/74591P0, BIOS 6DET28WW (1.05 ) 07/30/2008
> Workqueue: events_long serio_handle_event
> Call Trace:
>  dump_stack+0x5f/0x86
>  print_circular_bug.isra.18+0x1d0/0x2c0
>  __lock_acquire+0x14ae/0x1b60
>  ? kernfs_remove_by_name_ns+0x20/0x80
>  ? lock_acquire+0xaf/0x200
>  lock_acquire+0xaf/0x200
>  ? kernfs_remove_by_name_ns+0x3b/0x80
>  __kernfs_remove+0x243/0x2b0
>  ? kernfs_remove_by_name_ns+0x3b/0x80
>  ? kernfs_name_hash+0xd/0x70
>  ? kernfs_find_ns+0x7e/0x100
>  kernfs_remove_by_name_ns+0x3b/0x80
>  remove_files.isra.0+0x2b/0x60
>  sysfs_remove_group+0x38/0x80
>  sysfs_remove_groups+0x24/0x40
>  trackpoint_disconnect+0x2c/0x50
>  psmouse_disconnect+0x8f/0x160
>  serio_disconnect_driver+0x28/0x40
>  serio_driver_remove+0xc/0x10
>  device_release_driver_internal+0x15b/0x230
>  serio_handle_event+0x1c8/0x260
>  process_one_work+0x215/0x620
>  worker_thread+0x48/0x3a0
>  ? _raw_spin_unlock_irqrestore+0x4c/0x60
>  kthread+0xfb/0x130
>  ? process_one_work+0x620/0x620
>  ? _kthread_create_on_node+0x30/0x30
>  ret_from_fork+0x3a/0x50
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@...el.com>
> Cc: Dmitry Torokhov <dmitry.torokhov@...il.com>
> Cc: Benjamin Tissoires <benjamin.tissoires@...hat.com>
> Cc: Daniel Vetter <daniel.vetter@...ll.ch>
> Cc: Arvind Yadav <arvind.yadav.cs@...il.com>
> Cc: Stephen Lyons <slysven@...ginmedia.com>
> Cc: linux-input@...r.kernel.org
> ---
>  drivers/input/mouse/psmouse-base.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/input/mouse/psmouse-base.c b/drivers/input/mouse/psmouse-base.c
> index 8900c3166ebf..06ccd8e22f3c 100644
> --- a/drivers/input/mouse/psmouse-base.c
> +++ b/drivers/input/mouse/psmouse-base.c
> @@ -1484,8 +1484,10 @@ static void psmouse_disconnect(struct serio *serio)
>  		psmouse_deactivate(parent);
>  	}
>  
> +	mutex_unlock(&psmouse_mutex);
>  	if (psmouse->disconnect)
>  		psmouse->disconnect(psmouse);
> +	mutex_lock(&psmouse_mutex);

Why do you think it is proper to drop this mutex? It is introduced for a
reason.

I think the trace you are seeing is due to:

commit 988cd7afb3f37598891ca70b4c6eb914c338c46a
Author: Tejun Heo <tj@...nel.org>
Date:   Mon Feb 3 14:02:58 2014 -0500

    kernfs: remove kernfs_addrm_cxt

where we started taking kernfs_mutex when adding/removing sysfs files.
Ultimately I think we may have to change switching protocol to a work
item to be done asynchronously from writing to sysfs attribute.

Thanks.

-- 
Dmitry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ