Message-ID: <20201207060746.GT11935@casper.infradead.org>
Date:   Mon, 7 Dec 2020 06:07:46 +0000
From:   Matthew Wilcox <willy@...radead.org>
To:     Naresh Kamboju <naresh.kamboju@...aro.org>
Cc:     linux-stable <stable@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>,
        linux-fsdevel@...r.kernel.org, rcu@...r.kernel.org,
        lkft-triage@...ts.linaro.org,
        Alexander Viro <viro@...iv.linux.org.uk>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Sasha Levin <sashal@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Miquel Raynal <miquel.raynal@...tlin.com>,
        Mark Rutland <mark.rutland@....com>
Subject: Re: WARNING: bad unlock balance detected! - mkfs.ext4/426 is trying
 to release lock (rcu_read_lock)

On Mon, Dec 07, 2020 at 11:17:29AM +0530, Naresh Kamboju wrote:
> While running "mkfs -t ext4" on an arm64 Juno-r2 device with an SSD drive
> connected, the following kernel warning was reported on the stable-rc
> 5.9.13-rc1 kernel.
> 
> Steps to reproduce:
> ------------------
> # boot arm64 Juno-r2 device with stable-rc 5.9.13-rc1.
> # Connect SSD drive
> # Format the file system ext4 type
>  mkfs -t ext4 <SSD-drive>
> # you will notice this warning

Does it happen easily?  Can you bisect?

> Crash log:
> --------------
> Writing superblocks and filesystem accounting information:   0/895
> [   86.131095]
> [   86.132592] =====================================
> [   86.137300] WARNING: bad unlock balance detected!
> [   86.142012] 5.9.13-rc1 #1 Not tainted
> [   86.145675] -------------------------------------
> [   86.150384] mkfs.ext4/426 is trying to release lock (rcu_read_lock) at:
> [   86.157020] [<ffff80001063478c>] blk_queue_exit+0xcc/0x1b0
> [   86.162511] but there are no more locks to release!

This really doesn't make much sense.  blk_queue_exit() in 5.9.12 does:

        percpu_ref_put(&q->q_usage_counter);
(literally, that's the entire function)

percpu_ref_put() is percpu_ref_put_many(ref, 1), which does:

        rcu_read_lock();

        if (__ref_is_percpu(ref, &percpu_count))
                this_cpu_sub(*percpu_count, nr);
        else if (unlikely(atomic_long_sub_and_test(nr, &ref->count)))
                ref->release(ref);

        rcu_read_unlock();

Unless ->release() has an unbalanced rcu_read_unlock(), there definitely
is a lock to release!  Some archaeology says that ->release is
blk_queue_usage_counter_release(), which calls
        wake_up_all(&q->mq_freeze_wq);

which doesn't appear to use RCU at all.  So this trace makes no sense,
and all I can do is ask you to bisect it.