Message-ID: <PH0PR11MB5880A25078A7D4DBAB1267C2DAC59@PH0PR11MB5880.namprd11.prod.outlook.com>
Date: Fri, 6 May 2022 00:40:09 +0000
From: "Zhang, Qiang1" <qiang1.zhang@...el.com>
To: "paulmck@...nel.org" <paulmck@...nel.org>
CC: "frederic@...nel.org" <frederic@...nel.org>,
"rcu@...r.kernel.org" <rcu@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] rcu: Add rnp->cbovldmask check in
rcutree_migrate_callbacks()
On Thu, May 05, 2022 at 11:52:36PM +0800, Zqiang wrote:
> Currently, the rnp's cbovldmask is set in call_rcu(). When a CPU goes
> offline, the outgoing CPU's callbacks are migrated to a target CPU,
> which may leave my_rdp overloaded. If it is overloaded but there is no
> call_rcu() invocation on the target CPU for a long time, the rnp's
> cbovldmask is not set in time. In order to fix this situation, add a
> check_cb_ovld_locked() call in rcutree_migrate_callbacks() to help the
> CPU reach quiescent states more quickly.
>
> Signed-off-by: Zqiang <qiang1.zhang@...el.com>
>Doesn't this get set right at the end of the current grace period?
>Given that there is a callback overload, there should be a grace period in progress.
>
>See this code in rcu_gp_cleanup():
>
> 	if (rcu_is_leaf_node(rnp))
> 		for_each_leaf_node_cpu_mask(rnp, cpu, rnp->cbovldmask) {
> 			rdp = per_cpu_ptr(&rcu_data, cpu);
> 			check_cb_ovld_locked(rdp, rnp);
> 		}
>
>So what am I missing here? Or are you planning to remove the above code?
We check the overloaded rdp only at the end of the current grace period,
and that loop covers only CPUs whose bits are already set in
rnp->cbovldmask. If my_rdp becomes overloaded because of the callbacks
migrated to it, but my_rdp's bit in my_rdp->mynode's cbovldmask is still
clear, that overload may not be noticed at the end of the current grace
period.

I have another question: why don't we call check_cb_ovld_locked() when the
rdp's n_cbs decreases? For example, we could call check_cb_ovld_locked()
in rcu_do_batch() rather than at the end of the grace period.
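
(For reference, a rough sketch of check_cb_ovld_locked() as it reads in
kernel/rcu/tree.c around this time; details may differ slightly:)

	static void check_cb_ovld_locked(struct rcu_data *rdp, struct rcu_node *rnp)
	{
		raw_lockdep_assert_held_rcu_node(rnp);
		if (qovld_calc <= 0)
			return;	/* Early boot wildcard value. */
		if (rcu_segcblist_n_cbs(&rdp->cblist) >= qovld_calc)
			WRITE_ONCE(rnp->cbovldmask, rnp->cbovldmask | rdp->grpmask);
		else
			WRITE_ONCE(rnp->cbovldmask, rnp->cbovldmask & ~rdp->grpmask);
	}

So a bit in rnp->cbovldmask is set or cleared only when some path actually
calls this function for the corresponding rdp.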
>
>If so, wouldn't you also need to clear the indication for the CPU that
>is going offline, being careful to handle the case where the two CPUs
>have different leaf rcu_node structures?
Yes, the CPU going offline also needs its cbovldmask bit cleared.
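
(A hypothetical sketch of that clearing, reusing the rdp/my_rnp names
already in rcutree_migrate_callbacks(); this is not part of the posted
patch, and real code would need care with nested rcu_node lock ordering:)

	/*
	 * Hypothetical, not in the posted patch: the outgoing CPU's
	 * cblist has just been emptied and disabled, so calling
	 * check_cb_ovld_locked() clears its ->cbovldmask bit.  Its
	 * leaf rcu_node may differ from the target CPU's my_rnp.
	 */
	struct rcu_node *rnp = rdp->mynode;

	if (rnp != my_rnp)
		raw_spin_lock_rcu_node(rnp);	/* irqs remain disabled. */
	check_cb_ovld_locked(rdp, rnp);
	if (rnp != my_rnp)
		raw_spin_unlock_rcu_node(rnp);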
Thanks,
Zqiang
>
> Thanx, Paul
>
> ---
> kernel/rcu/tree.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 9dc4c4e82db6..bcc5876c9753 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -4577,6 +4577,7 @@ void rcutree_migrate_callbacks(int cpu)
>  	needwake = needwake || rcu_advance_cbs(my_rnp, my_rdp);
>  	rcu_segcblist_disable(&rdp->cblist);
>  	WARN_ON_ONCE(rcu_segcblist_empty(&my_rdp->cblist) !=
>  		     !rcu_segcblist_n_cbs(&my_rdp->cblist));
> +	check_cb_ovld_locked(my_rdp, my_rnp);
>  	if (rcu_rdp_is_offloaded(my_rdp)) {
>  		raw_spin_unlock_rcu_node(my_rnp); /* irqs remain disabled. */
>  		__call_rcu_nocb_wake(my_rdp, true, flags);
> --
> 2.25.1
>