linux-kernel - Re: BUG_ON in rcu_sync


Message-ID: <20160914125835.GA6673@redhat.com>
Date:   Wed, 14 Sep 2016 14:58:36 +0200
From:   Oleg Nesterov <oleg@...hat.com>
To:     Nikolay Borisov <kernel@...p.com>
Cc:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        linux-kernel@...r.kernel.org
Subject: Re: BUG_ON in rcu_sync_func triggered

On 09/14, Nikolay Borisov wrote:
>
> [  557.006656]  [<ffffffff81307a9b>] dump_stack+0x6b/0xa0
> [  557.012737]  [<ffffffff81054a85>] warn_slowpath_common+0x95/0xe0
> [  557.019781]  [<ffffffff81054aea>] warn_slowpath_null+0x1a/0x20
> [  557.026645]  [<ffffffff810ab9a8>] rcu_sync_enter+0x148/0x1a0
> [  557.033309]  [<ffffffff8109c9be>] percpu_down_write+0x1e/0xf0
> [  557.040074]  [<ffffffff81315683>] ? call_rwsem_down_write_failed+0x13/0x20
> [  557.048092]  [<ffffffff811a868b>] freeze_super+0xab/0x1b0
> [  557.054456]  [<ffffffff811b7c0d>] do_vfs_ioctl+0x29d/0x560
> [  557.060920]  [<ffffffff811aae7e>] ? SYSC_newfstat+0x2e/0x40
> [  557.067480]  [<ffffffff811b7f62>] SyS_ioctl+0x92/0xa0
> [  557.073465]  [<ffffffff8163c357>] entry_SYSCALL_64_fastpath+0x12/0x6a
> [  557.081015] ---[ end trace fc087420ac1d8f16 ]---
> [  557.086507] XXX: ffff880473326b08 gp=2 cnt=-1 cb=1
> [  557.092326] rbd: rbd19: added with size 0x500000000
>
> This is: if (WARN_ON(rsp->gp_count < 0)) xxx(rsp);

Thanks a lot. This is what I wanted to see. However, I can't understand why
you did not hit the similar WARN_ON(rsp->gp_count <= 0) in rcu_sync_exit()
before that.

OK, in any case this doesn't look like a bug in rcu/sync.c. Could you please
try the fix below? I'm not sure it will help; perhaps there is something else...
There is no need to revert the previous debugging patch.

Thanks,

Oleg.


diff --git a/fs/super.c b/fs/super.c
index d78b984..a90bdff 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1344,7 +1344,9 @@ int thaw_super(struct super_block *sb)
 	int error;
 
 	down_write(&sb->s_umount);
-	if (sb->s_writers.frozen == SB_UNFROZEN) {
+	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {
+		if (sb->s_writers.frozen != SB_UNFROZEN)
+			pr_crit("THAW: hit the race: %d\n", sb->s_writers.frozen);
 		up_write(&sb->s_umount);
 		return -EINVAL;
 	}
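
To make the intent of the changed check easier to follow, here is a minimal userspace sketch of the race it appears to guard against: a thaw that arrives while a freeze has already set the state to SB_FREEZE_WRITE but has not yet taken the per-level write locks. This is a toy model under stated assumptions, not kernel code. `struct toy_sb`, the `held[]` counters and all `toy_*` functions are hypothetical names made up for illustration; only the SB_* state names come from the kernel. The real freeze path takes the levels one at a time, which the sketch collapses into two steps.

```c
#include <errno.h>

/* Freeze states, in the order used by include/linux/fs.h. */
enum {
	SB_UNFROZEN = 0,
	SB_FREEZE_WRITE,
	SB_FREEZE_PAGEFAULT,
	SB_FREEZE_FS,
	SB_FREEZE_COMPLETE,
};

#define TOY_LEVELS 3	/* write, pagefault, fs */

/* Hypothetical stand-in for struct super_block; not a kernel type. */
struct toy_sb {
	int frozen;
	int held[TOY_LEVELS];	/* +1 per modelled down_write, -1 per up_write */
};

/* First half of a freeze: the state is published, no level is held yet. */
void toy_freeze_begin(struct toy_sb *sb)
{
	sb->frozen = SB_FREEZE_WRITE;
}

/* Second half: take every level and mark the freeze complete. */
void toy_freeze_finish(struct toy_sb *sb)
{
	for (int i = 0; i < TOY_LEVELS; i++)
		sb->held[i]++;
	sb->frozen = SB_FREEZE_COMPLETE;
}

/* strict == 0 models the old check, strict != 0 the patched one. */
int toy_thaw(struct toy_sb *sb, int strict)
{
	int reject = strict ? sb->frozen != SB_FREEZE_COMPLETE
			    : sb->frozen == SB_UNFROZEN;

	if (reject)
		return -EINVAL;

	/* Release every level, whether or not it was actually taken. */
	for (int i = 0; i < TOY_LEVELS; i++)
		sb->held[i]--;
	sb->frozen = SB_UNFROZEN;
	return 0;
}

/* A thaw lands between the two halves of a freeze; return held[0] afterwards. */
int toy_race_count(int strict)
{
	struct toy_sb sb = { .frozen = SB_UNFROZEN };

	toy_freeze_begin(&sb);
	toy_thaw(&sb, strict);
	return sb.held[0];
}
```

In this model, toy_race_count(0) leaves the counter negative, which is the same shape as the cnt=-1 in the debug output quoted above, while toy_race_count(1) leaves it at zero because the thaw is refused with -EINVAL. Whether this is the only path to the negative count on the reporter's system is not something the sketch can show.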