Message-ID: <8bb82057-1fdf-cb99-0549-2a1a27600d15@kernel.dk>
Date:   Tue, 15 Dec 2020 09:18:00 -0700
From:   Jens Axboe <axboe@...nel.dk>
To:     Xiaoguang Wang <xiaoguang.wang@...ux.alibaba.com>,
        Pavel Begunkov <asml.silence@...il.com>,
        Nadav Amit <nadav.amit@...il.com>
Cc:     linux-fsdevel@...r.kernel.org, io-uring@...r.kernel.org,
        LKML <linux-kernel@...r.kernel.org>,
        Alexander Viro <viro@...iv.linux.org.uk>
Subject: Re: Lockdep warning on io_file_data_ref_zero() with 5.10-rc5

On 12/14/20 11:58 PM, Xiaoguang Wang wrote:
> hi,
> 
>> On 11/28/20 5:13 PM, Pavel Begunkov wrote:
>>> On 28/11/2020 23:59, Nadav Amit wrote:
>>>> Hello Pavel,
>>>>
>>>> I got the following lockdep splat while rebasing my work on 5.10-rc5 on the
>>>> kernel (based on 5.10-rc5+).
>>>>
>>>> I did not actually confirm that the problem is triggered without my changes,
>>>> as my iouring workload requires some kernel changes (not iouring changes),
>>>> yet IMHO it seems pretty clear that this is a result of your commit
>>>> e297822b20e7f ("io_uring: order refnode recycling"), that acquires a lock in
>>>> io_file_data_ref_zero() inside a softirq context.
>>>
>>> Yeah, that's true. It was already reported by syzkaller and fixed by Jens, but
>>> queued for 5.11. Thanks for letting us know anyway!
>>>
>>> https://lore.kernel.org/io-uring/948d2d3b-5f36-034d-28e6-7490343a5b59@kernel.dk/T/#t
>>>
>>>
>>> Jens, I think it's for the best to add it for 5.10, at least so that lockdep
>>> doesn't complain.
>>
>> Yeah, maybe, though it's "just" a lockdep issue; it can't trigger any
>> deadlocks. I'd rather just keep it in 5.11 and ensure it goes to stable.
>> This isn't new in this series.
> Sorry, I'm not familiar with the lockdep implementation, so I wonder why you
> say it can't trigger any deadlocks. Looking at the syzbot report, it seems
> that a deadlock may happen.

Because the only time the lock is actually grabbed in bh context is when
the refcount has dropped to zero and the node is no longer in use. The
classic deadlock happens when regular use takes the lock from both
contexts, so you can get:

CPU0			CPU1
grab_lock()
			bh context, grab_lock()

deadlock. But this simply cannot happen here, as by the time we get to
grabbing it from bh context, there can by definition be no other users
of it left (or new ones).
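
To make the pattern concrete, here is a minimal sketch assuming a simplified
refcounted node with hypothetical names (node_update()/node_release()); it is
not the actual io_uring code, just the shape of the argument above:

#include <linux/spinlock.h>
#include <linux/refcount.h>

struct node {
	spinlock_t lock;
	refcount_t refs;
};

/* Process context: callers always hold a reference to the node. */
static void node_update(struct node *n)
{
	spin_lock(&n->lock);	/* plain spin_lock(), bh left enabled */
	/* ... modify the node ... */
	spin_unlock(&n->lock);
}

/* Softirq (bh) context: runs only once the refcount has hit zero. */
static void node_release(struct node *n)
{
	spin_lock(&n->lock);	/* safe: the last reference is gone, so no
				 * process-context user can still hold or
				 * take this lock */
	/* ... recycle/free the node ... */
	spin_unlock(&n->lock);
}

/*
 * The CPU0/CPU1 deadlock above would need node_update() to be interrupted
 * by a softirq calling node_release() on the same node while the lock is
 * held.  That can't happen here: node_update() callers hold a reference,
 * and node_release() only runs after the refcount has dropped to zero.
 */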

-- 
Jens Axboe
