linux-kernel - Re: [Bisected] Regression: cpu stuck in gvfsd-fuse, can't shutdown

Message-ID: <20141112112643.GA30821@ulmo.nvidia.com>
Date:	Wed, 12 Nov 2014 12:26:49 +0100
From:	Thierry Reding <thierry.reding@...il.com>
To:	Giedrius Statkevicius <giedriuswork@...il.com>
Cc:	Greg KH <gregkh@...uxfoundation.org>, martink@...teo.de,
	linux-kernel@...r.kernel.org
Subject: Re: [Bisected] Regression: cpu stuck in gvfsd-fuse, can't shutdown

On Tue, Nov 11, 2014 at 11:44:26PM +0200, Giedrius Statkevicius wrote:
> On 2014.11.11 23:05, Greg KH wrote:
> > 
> > If you revert this patch, does things go back to "normal" for you?
> 
> Originally I've only tested where the HEAD was
> 32eca22180804f71b06b63fd29b72f58be8b3c47 versus
> 32eca22180804f71b06b63fd29b72f58be8b3c47~1 but now I recompiled and
> tested a vanilla 3.18.0-rc4-next-20141111 on which this issue occurs and
> then tried a version with that particular patch reverted and then no
> lockups happen.

I've run into this same issue with sshfs:

[   49.231095] BUG: spinlock bad magic on CPU#1, sshfs/180
[   49.239078]  lock: fuse_miscdevice+0x0/0x24, .magic: c09ce64c, .owner: /0, .owner_cpu: -1065526976
[   49.248551] CPU: 1 PID: 180 Comm: sshfs Not tainted 3.18.0-rc4-next-20141111-00275-g3eeaa958e58c-dirty #2654
[   49.258443] [<c00161f8>] (unwind_backtrace) from [<c0011a88>] (show_stack+0x10/0x14)
[   49.266269] [<c0011a88>] (show_stack) from [<c07b50b4>] (dump_stack+0x98/0xd8)
[   49.273618] [<c07b50b4>] (dump_stack) from [<c0068670>] (do_raw_spin_lock+0x1a4/0x1a8)
[   49.281621] [<c0068670>] (do_raw_spin_lock) from [<c022a0f0>] (fuse_dev_release+0x1c/0x68)
[   49.289900] [<c022a0f0>] (fuse_dev_release) from [<c00f5078>] (__fput+0x80/0x1c8)
[   49.297470] [<c00f5078>] (__fput) from [<c003fc38>] (task_work_run+0xb4/0xec)
[   49.304700] [<c003fc38>] (task_work_run) from [<c001140c>] (do_work_pending+0xa0/0xc0)
[   49.312712] [<c001140c>] (do_work_pending) from [<c000e5e0>] (work_pending+0xc/0x20)
[   49.701449] BUG: spinlock lockup suspected on CPU#1, sshfs/180
[   49.707327]  lock: fuse_miscdevice+0x0/0x24, .magic: c09ce64c, .owner: /0, .owner_cpu: -1065526976
[   49.716341] CPU: 1 PID: 180 Comm: sshfs Not tainted 3.18.0-rc4-next-20141111-00275-g3eeaa958e58c-dirty #2654
[   49.726238] [<c00161f8>] (unwind_backtrace) from [<c0011a88>] (show_stack+0x10/0x14)
[   49.734051] [<c0011a88>] (show_stack) from [<c07b50b4>] (dump_stack+0x98/0xd8)
[   49.741293] [<c07b50b4>] (dump_stack) from [<c00685c8>] (do_raw_spin_lock+0xfc/0x1a8)
[   49.749178] [<c00685c8>] (do_raw_spin_lock) from [<c022a0f0>] (fuse_dev_release+0x1c/0x68)
[   49.757508] [<c022a0f0>] (fuse_dev_release) from [<c00f5078>] (__fput+0x80/0x1c8)
[   49.765058] [<c00f5078>] (__fput) from [<c003fc38>] (task_work_run+0xb4/0xec)
[   49.772264] [<c003fc38>] (task_work_run) from [<c001140c>] (do_work_pending+0xa0/0xc0)
[   49.780197] [<c001140c>] (do_work_pending) from [<c000e5e0>] (work_pending+0xc/0x20)

Reverting 32eca2218080 ("misc: always assign miscdevice to
file->private_data in open()") fixes the issue for me.

Looking at the stacktrace and correlating it with the code, what happens
is that fuse_fill_super() checks that file->private_data hasn't been set
yet and errors out otherwise. With the above commit, misc_open() always
stores the miscdevice pointer there, so that check now fails.

The BUG ensuing from that comes from the fact that the error cleanup
path assumes that if file->private_data is set, it will be a struct
fuse_conn *, so it's not a surprise that fuse_dev_release() will fail
as above.
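
To make the sequence above concrete, here is a minimal userspace sketch of
it. This is not kernel code: every model_* name, the structure layouts and
the magic value are hypothetical stand-ins, and the real fuse_conn and
miscdevice definitions differ. It only shows how a pointer stored by the
misc core ends up being read as a connection by the release path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the real kernel structures look different. */
#define MODEL_CONN_MAGIC 0xdead4eadu    /* illustrative lock magic */

struct model_miscdevice { int minor; const char *name; };
struct model_fuse_conn  { unsigned int lock_magic; int connected; };
struct model_file       { void *private_data; };

static struct model_miscdevice model_fuse_miscdevice = { 229, "fuse" };

/* Models misc_open() after 32eca2218080: always store the miscdevice. */
static void model_misc_open(struct model_file *file)
{
    assert(file != NULL);
    file->private_data = &model_fuse_miscdevice;
}

/* Models the check in fuse_fill_super(): refuse if already set. */
static int model_fill_super(struct model_file *file,
                            struct model_fuse_conn *fc)
{
    if (file->private_data != NULL)
        return -1;              /* the mount attempt fails here */
    file->private_data = fc;
    return 0;
}

/*
 * Models fuse_dev_release(): any non-NULL private_data is assumed to be
 * a connection. Returns 1 if the lock magic matches, 0 if the object is
 * something else ("bad magic"), -1 if there is nothing to release.
 */
static int model_dev_release(const struct model_file *file)
{
    unsigned int magic;

    if (file->private_data == NULL)
        return -1;
    /* Read the first word the way the lock code would look at it. */
    memcpy(&magic, file->private_data, sizeof(magic));
    return magic == MODEL_CONN_MAGIC;
}
```

Calling model_misc_open(), then model_fill_super(), then
model_dev_release() on the same file gives a failed mount followed by a
magic mismatch, which is the shape of the log above.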

The root of the issue is that the assumption in the above commit, that
drivers will always overwrite ->private_data, does not hold, at least in
the case of FUSE.
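
As a rough illustration of that assumption, the following userspace sketch
compares three hypothetical drivers behind a misc core that always stores
its own pointer. All sketch_* names are made up for this example. The
third hook, which clears the pointer again, is only one conceivable way
for a driver to keep relying on NULL; it is not something proposed in this
thread.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_file { void *private_data; };

typedef int (*sketch_open_fn)(struct sketch_file *);

static int sketch_misc_token;     /* stands in for the struct miscdevice */
static int sketch_driver_state;   /* stands in for per-driver data */

/* Driver A: overwrites private_data in its open hook (the assumed case). */
static int sketch_open_overwrite(struct sketch_file *f)
{
    f->private_data = &sketch_driver_state;
    return 0;
}

/* Driver C (hypothetical): resets private_data so later code sees NULL. */
static int sketch_open_reset(struct sketch_file *f)
{
    f->private_data = NULL;
    return 0;
}

/*
 * Misc core: unconditionally store the miscdevice, then call the driver's
 * open hook if it has one. Driver B is modelled by passing NULL here.
 */
static int sketch_misc_open(struct sketch_file *f, sketch_open_fn drv_open)
{
    assert(f != NULL);
    f->private_data = &sketch_misc_token;
    return drv_open ? drv_open(f) : 0;
}

/* True if the driver is still looking at the misc core's pointer. */
static int sketch_sees_misc_pointer(const struct sketch_file *f)
{
    return f->private_data == &sketch_misc_token;
}
```

Only the driver without an overwriting hook is left holding the misc
core's pointer, so any later "is it still NULL?" check in that driver
fails.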

Thierry
