Message-ID: <CAJFSNy4EZzL7+aWC8xD63rTgcQ3OaBokNB_scpzRDRA53sukEA@mail.gmail.com>
Date: Fri, 23 Sep 2016 16:35:04 +0300
From: Nikolay Borisov <kernel@...p.com>
To: Oleg Nesterov <oleg@...hat.com>
Cc: LKML <linux-kernel@...r.kernel.org>
Subject: Re: BUG_ON in rcu_sync_func triggered
On Wed, Sep 14, 2016 at 3:58 PM, Oleg Nesterov <oleg@...hat.com> wrote:
> On 09/14, Nikolay Borisov wrote:
>>
>> [ 557.006656] [<ffffffff81307a9b>] dump_stack+0x6b/0xa0
>> [ 557.012737] [<ffffffff81054a85>] warn_slowpath_common+0x95/0xe0
>> [ 557.019781] [<ffffffff81054aea>] warn_slowpath_null+0x1a/0x20
>> [ 557.026645] [<ffffffff810ab9a8>] rcu_sync_enter+0x148/0x1a0
>> [ 557.033309] [<ffffffff8109c9be>] percpu_down_write+0x1e/0xf0
>> [ 557.040074] [<ffffffff81315683>] ? call_rwsem_down_write_failed+0x13/0x20
>> [ 557.048092] [<ffffffff811a868b>] freeze_super+0xab/0x1b0
>> [ 557.054456] [<ffffffff811b7c0d>] do_vfs_ioctl+0x29d/0x560
>> [ 557.060920] [<ffffffff811aae7e>] ? SYSC_newfstat+0x2e/0x40
>> [ 557.067480] [<ffffffff811b7f62>] SyS_ioctl+0x92/0xa0
>> [ 557.073465] [<ffffffff8163c357>] entry_SYSCALL_64_fastpath+0x12/0x6a
>> [ 557.081015] ---[ end trace fc087420ac1d8f16 ]---
>> [ 557.086507] XXX: ffff880473326b08 gp=2 cnt=-1 cb=1
>> [ 557.092326] rbd: rbd19: added with size 0x500000000
>>
>> This is: if (WARN_ON(rsp->gp_count < 0)) xxx(rsp);
>
> Thanks a lot. This is what I wanted to see. However, I can't understand why
> you did not hit the similar WARN_ON(rsp->gp_count <= 0) in rcu_sync_exit()
> before that.
>
> OK, in any case this doesn't look as a bug in rcu/sync.c, could you please
> try the fix below? Not sure it will help, perhaps there is something else...
> No need to revert the previous debugging patch.
>
> Thanks,
>
> Oleg.
>
>
> diff --git a/fs/super.c b/fs/super.c
> index d78b984..a90bdff 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1344,7 +1344,9 @@ int thaw_super(struct super_block *sb)
>  	int error;
>
>  	down_write(&sb->s_umount);
> -	if (sb->s_writers.frozen == SB_UNFROZEN) {
> +	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {
> +		if (sb->s_writers.frozen != SB_UNFROZEN)
> +			pr_crit("THAW: hit the race: %d\n", sb->s_writers.frozen);
>  		up_write(&sb->s_umount);
>  		return -EINVAL;
>  	}
>
Sorry for the silence, I was away on holiday. With this patch applied
I could not reproduce the issue, and the pr_crit never triggered
either. Have you had any moments of epiphany re. this issue? Should
some FS people be involved in the discussion?
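
For anyone joining the thread late, below is a small user-space model of
what the changed check in thaw_super() does. It is only a sketch, not
kernel code: the enum values are assumed to match include/linux/fs.h, and
thaw_check_old(), thaw_check_new(), thaw_hits_race_window() and
thaw_check_selftest() are hypothetical names made up for illustration.
The intermediate states are presumably what a thaw racing with a
freeze_super() still in progress would observe.

```c
#include <assert.h>
#include <errno.h>

/* Freeze levels, assumed to mirror the enum in include/linux/fs.h. */
enum sb_freeze_state {
	SB_UNFROZEN = 0,
	SB_FREEZE_WRITE = 1,
	SB_FREEZE_PAGEFAULT = 2,
	SB_FREEZE_FS = 3,
	SB_FREEZE_COMPLETE = 4,
};

/* Check before the patch: only a fully unfrozen sb is rejected. */
int thaw_check_old(int frozen)
{
	return frozen == SB_UNFROZEN ? -EINVAL : 0;
}

/* Check with the patch: anything short of SB_FREEZE_COMPLETE is rejected. */
int thaw_check_new(int frozen)
{
	return frozen != SB_FREEZE_COMPLETE ? -EINVAL : 0;
}

/* Hypothetical helper: would this state make the debugging pr_crit fire? */
int thaw_hits_race_window(int frozen)
{
	return frozen != SB_FREEZE_COMPLETE && frozen != SB_UNFROZEN;
}

/* Walk all five states and confirm where the two checks differ. */
void thaw_check_selftest(void)
{
	int s;

	for (s = SB_UNFROZEN; s <= SB_FREEZE_COMPLETE; s++) {
		int differs = thaw_check_old(s) != thaw_check_new(s);

		/* The checks disagree exactly on the intermediate states. */
		assert(differs == thaw_hits_race_window(s));
	}
}
```

In this model the two checks disagree only for SB_FREEZE_WRITE,
SB_FREEZE_PAGEFAULT and SB_FREEZE_FS: the old check lets a thaw proceed
there, the new one returns -EINVAL.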