Message-ID: <PH0PR11MB5880F450C2DDD04D4C76F814DA099@PH0PR11MB5880.namprd11.prod.outlook.com>
Date:   Tue, 8 Mar 2022 07:37:24 +0000
From:   "Zhang, Qiang1" <qiang1.zhang@...el.com>
To:     Frederic Weisbecker <frederic@...nel.org>
CC:     "paulmck@...nel.org" <paulmck@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Neeraj Upadhyay <quic_neeraju@...cinc.com>,
        Uladzislau Rezki <uladzislau.rezki@...y.com>,
        Boqun Feng <boqun.feng@...il.com>
Subject: RE: [PATCH] rcu/nocb: Clear rdp offloaded flags when rcuop/rcuog
 kthreads spawn failed


On Mon, Feb 28, 2022 at 05:36:29PM +0800, Zqiang wrote:
> When CONFIG_RCU_NOCB_CPU is enabled and 'rcu_nocbs' is set, the rcuop
> and rcuog kthreads are created. However, creation of the rcuop or rcuog
> kthreads may fail; if it does, clear the rdp offloaded flags.
> 
> Signed-off-by: Zqiang <qiang1.zhang@...el.com>
> ---
>  kernel/rcu/tree_nocb.h | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 46694e13398a..94b279147954 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -1246,7 +1246,7 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
>  				"rcuog/%d", rdp_gp->cpu);
>  		if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo GP kthread, OOM is now expected behavior\n", __func__)) {
>  			mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex);
> -			return;
> +			goto end;
>  		}
>  		WRITE_ONCE(rdp_gp->nocb_gp_kthread, t);
>  		if (kthread_prio)
> @@ -1258,12 +1258,22 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
>  	t = kthread_run(rcu_nocb_cb_kthread, rdp,
>  			"rcuo%c/%d", rcu_state.abbr, cpu);
>  	if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
> -		return;
> +		goto end;
>  
>  	if (kthread_prio)
>  		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
>  	WRITE_ONCE(rdp->nocb_cb_kthread, t);
>  	WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
> +	return;
> +end:
> +	if (cpumask_test_cpu(cpu, rcu_nocb_mask)) {
> +		rcu_segcblist_offload(&rdp->cblist, false);
> +		rcu_segcblist_clear_flags(&rdp->cblist,
> +				SEGCBLIST_KTHREAD_CB | SEGCBLIST_KTHREAD_GP);
> +		rcu_segcblist_clear_flags(&rdp->cblist, SEGCBLIST_LOCKING);
> +		rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_RCU_CORE);
> +	}
>>
>>Thank you. The consequences are indeed bad otherwise, because the target is considered offloaded but nothing actually handles the callbacks.
>>
>>A few issues though:
>>
>>* The rdp_gp kthread may be running concurrently. If it's iterating this rdp and
>>  the SEGCBLIST_LOCKING flag is cleared in the middle, rcu_nocb_unlock() won't
>>  release the lock (among many other possible issues).
>>
>>* We should clear the CPU from rcu_nocb_mask or we won't be able to
>>  re-offload it later.
>>
>>* We should then delete the rdp from the group list:
>>
>>     list_del_rcu(&rdp->nocb_entry_rdp);
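
To make the first issue above concrete, here is a minimal single-threaded
userspace sketch of a lock whose lock and unlock paths are both gated on a
flag. It is only a model: MODEL_LOCKING, struct model_cblist and the model_*
functions are hypothetical names standing in for SEGCBLIST_LOCKING, the
segcblist and rcu_nocb_lock()/rcu_nocb_unlock(); none of this is the kernel
code, and a plain bool stands in for the real spinlock.

```c
#include <stdbool.h>

/* Hypothetical stand-in for SEGCBLIST_LOCKING (value is arbitrary). */
#define MODEL_LOCKING 0x1u

/* Hypothetical stand-in for the callback list: flags plus a "lock". */
struct model_cblist {
	unsigned int flags;
	bool held;	/* models whether the nocb lock is currently held */
};

/* Acquire only if the list is marked as requiring locking. */
static void model_nocb_lock(struct model_cblist *cl)
{
	if (cl->flags & MODEL_LOCKING)
		cl->held = true;
}

/* Release only if the list is (still) marked as requiring locking. */
static void model_nocb_unlock(struct model_cblist *cl)
{
	if (cl->flags & MODEL_LOCKING)
		cl->held = false;
}

/*
 * Run lock ... unlock, optionally clearing the flag in between, as the
 * failure path could do while another context is inside the critical
 * section. Returns true if the lock is still held afterwards.
 */
bool model_lock_leaked(bool clear_flag_midway)
{
	struct model_cblist cl = { .flags = MODEL_LOCKING, .held = false };

	model_nocb_lock(&cl);
	if (clear_flag_midway)
		cl.flags &= ~MODEL_LOCKING;
	model_nocb_unlock(&cl);

	return cl.held;
}
```

In this model, model_lock_leaked(false) returns false, while
model_lock_leaked(true) returns true: once the flag is cleared inside the
critical section, the unlock path skips the release and the lock stays held.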
>>
>>So ideally we should call rcu_nocb_rdp_deoffload(). But then bear in mind:
>>
>>1) We must lock rcu_state.barrier_mutex and take the hotplug read lock. But
>>   since we are called from rcutree_prepare_cpu(), we may already be holding
>>   the hotplug write lock.
>>
>>   Therefore we first need to invert the locking dependency order between
>>   rcu_state.barrier_mutex and hotplug lock and then just lock the barrier_mutex
>>   before calling rcu_nocb_rdp_deoffload() from our failure path.
>>   
>>
>>2) On rcu_nocb_rdp_deoffload(), handle non-existing nocb_gp and/or nocb_cb
>>   kthreads. Make sure we are holding nocb_gp_kthread_mutex.

Sorry for the late reply. Is the nocb_gp_kthread_mutex really necessary?
CPU online/offline is a serialized operation, and it is protected by cpus_write_lock().

Thanks
Zqiang

>>
>>I'm going to take your patch and adapt it along those lines.
>>
>>Thanks!
