Message-ID: <72a3b3f4-1b74-6c03-9d04-ac4bb721a55a@windriver.com>
Date:   Mon, 17 May 2021 09:55:30 +0800
From:   "Xu, Yanfei" <yanfei.xu@...driver.com>
To:     paulmck@...nel.org
Cc:     josh@...htriplett.org, rostedt@...dmis.org,
        mathieu.desnoyers@...icios.com, jiangshanlai@...il.com,
        joel@...lfernandes.org, rcu@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] rcu: fix a deadlock caused by not release
 rcu_node->lock



On 5/17/21 6:58 AM, Paul E. McKenney wrote:
> On Sun, May 16, 2021 at 05:50:10PM +0800, yanfei.xu@...driver.com wrote:
>> From: Yanfei Xu <yanfei.xu@...driver.com>
>>
>> rcu_node->lock isn't released in rcu_print_task_stall() if the rcu_node
>> doesn't contain any tasks blocking the GP. However, this rcu_node->lock
>> is acquired again shortly afterwards in rcu_dump_cpu_stacks() when
>> ndetected is non-zero. As a result, the CPU hangs in this deadlock.
>>
>> Fixes: c583bcb8f5ed ("rcu: Don't invoke try_invoke_on_locked_down_task() with irqs disabled")
>> Signed-off-by: Yanfei Xu <yanfei.xu@...driver.com>
> 
> Also a good catch, thank you!  Queued for further review and testing,
> wordsmithed as shown below.  The rcutorture scripts have been known to
> work on ARM in the past, and might still do so.  (I test on x86.)
> 
> As always, please check to make sure that I didn't mess something up.
> 

Looks good to me, thanks!

Regards,
Yanfei

>                                                          Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> commit e0a9b77f245ae4fe1537120fd5319bf9e091618e
> Author: Yanfei Xu <yanfei.xu@...driver.com>
> Date:   Sun May 16 17:50:10 2021 +0800
> 
>      rcu: Fix stall-warning deadlock due to non-release of rcu_node ->lock
> 
>      If rcu_print_task_stall() is invoked on an rcu_node structure that does
>      not contain any tasks blocking the current grace period, it takes an
>      early exit that fails to release that rcu_node structure's lock.  This
>      results in a self-deadlock, which is detected by lockdep.
> 
>      To reproduce this bug:
> 
>      tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 3 --trust-make --configs "TREE03" --kconfig "CONFIG_PROVE_LOCKING=y" --bootargs "rcutorture.stall_cpu=30 rcutorture.stall_cpu_block=1 rcutorture.fwd_progress=0 rcutorture.test_boost=0"
> 
>      This will also result in other complaints, including RCU's scheduler
>      hook complaining about blocking rather than preemption and an rcutorture
>      writer stall.
> 
>      Only a partial RCU CPU stall warning message will be printed because of
>      the self-deadlock.
> 
>      This commit therefore releases the lock on the rcu_print_task_stall()
>      function's early exit path.
> 
>      Fixes: c583bcb8f5ed ("rcu: Don't invoke try_invoke_on_locked_down_task() with irqs disabled")
>      Signed-off-by: Yanfei Xu <yanfei.xu@...driver.com>
>      Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
> 
> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index a10ea1f1f81f..d574e3bbd929 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -267,8 +267,10 @@ static int rcu_print_task_stall(struct rcu_node *rnp, unsigned long flags)
>          struct task_struct *ts[8];
> 
>          lockdep_assert_irqs_disabled();
> -       if (!rcu_preempt_blocked_readers_cgp(rnp))
> +       if (!rcu_preempt_blocked_readers_cgp(rnp)) {
> +               raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
>                  return 0;
> +       }
>          pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
>                 rnp->level, rnp->grplo, rnp->grphi);
>          t = list_entry(rnp->gp_tasks->prev,
> 
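
For anyone reading this thread later, the bug pattern in the commit log
above can be sketched in user space. This is only an illustration, not
kernel code: struct node, the blocked_tasks field, and every function
name below are hypothetical stand-ins, and a pthread mutex takes the
place of the rcu_node ->lock. The assumption is the contract described
in the commit log: the callee is entered with the lock held and must
release it on every return path, including the early exit.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rcu_node; not the kernel definition. */
struct node {
    pthread_mutex_t lock;
    int blocked_tasks;  /* stand-in for rcu_preempt_blocked_readers_cgp() */
};

/* Buggy shape: entered with n->lock held, early exit leaves it held. */
static int print_task_stall_buggy(struct node *n)
{
    int ndetected;

    if (!n->blocked_tasks)
        return 0;                       /* lock is leaked here */
    ndetected = n->blocked_tasks;
    pthread_mutex_unlock(&n->lock);
    return ndetected;
}

/* Fixed shape: every return path releases n->lock. */
static int print_task_stall_fixed(struct node *n)
{
    int ndetected;

    if (!n->blocked_tasks) {
        pthread_mutex_unlock(&n->lock); /* release on the early exit too */
        return 0;
    }
    ndetected = n->blocked_tasks;
    pthread_mutex_unlock(&n->lock);
    return ndetected;
}

/* Probe with trylock so that a leaked lock is reported, not waited on. */
static bool lock_is_free(struct node *n)
{
    if (pthread_mutex_trylock(&n->lock) != 0)
        return false;
    pthread_mutex_unlock(&n->lock);
    return true;
}

/* Take the lock, call fn, and report whether fn released the lock. */
static bool releases_lock(int (*fn)(struct node *), int blocked_tasks)
{
    struct node n;
    bool free_after;

    pthread_mutex_init(&n.lock, NULL);
    n.blocked_tasks = blocked_tasks;

    pthread_mutex_lock(&n.lock);
    fn(&n);
    free_after = lock_is_free(&n);
    if (!free_after)
        pthread_mutex_unlock(&n.lock);  /* clean up after the buggy path */
    pthread_mutex_destroy(&n.lock);
    return free_after;
}
```

Calling releases_lock(print_task_stall_buggy, 0) reports the leaked
lock without hanging, because the probe uses trylock. In the kernel a
later caller would instead spin on the already-held lock with
interrupts disabled, which is the self-deadlock that lockdep reports.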
