linux-kernel - RE: [PATCH] rcu-tasks: Make rude RCU-Tasks work well with CPU hotplug

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <PH0PR11MB58808A0169E7318EF248443ADA109@PH0PR11MB5880.namprd11.prod.outlook.com>
Date:   Sun, 27 Nov 2022 09:48:14 +0000
From:   "Zhang, Qiang1" <qiang1.zhang@...el.com>
To:     Joel Fernandes <joel@...lfernandes.org>
CC:     Neeraj Upadhyay <quic_neeraju@...cinc.com>,
        "paulmck@...nel.org" <paulmck@...nel.org>,
        "frederic@...nel.org" <frederic@...nel.org>,
        "rcu@...r.kernel.org" <rcu@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] rcu-tasks: Make rude RCU-Tasks work well with CPU hotplug


> On Nov 26, 2022, at 12:52 AM, Zhang, Qiang1 <qiang1.zhang@...el.com> wrote:
> 
> 
>> 
>> Hi Zqiang,
>> 
>> On 11/25/2022 9:24 PM, Zqiang wrote:
>> Currently, for the case of num_online_cpus() <= 1, return directly, 
>> indicates the end of current grace period and then release old data.
>> it's not accurate, for SMP system, when num_online_cpus() is equal 
>> one, maybe another cpu that in offline process(after invoke
>> __cpu_disable()) is still in the rude RCU-Tasks critical section 
>> holding the old data, this lead to memory corruption.
>> 
>> 
>> 
>> Was this race seen in your testing? For the outgoing CPU, once that 
>> CPU marks itself offline (and decrements __num_online_cpus), do we 
>> have tracing active on that CPU, and synchronize_rcu_tasks_rude() not 
>> waiting for it could potentially lead to memory corruption?
> 
> Hi Neeraj
> 
> Indeed, I didn't see race in the actual production environment, Maybe 
> my commit information description is not accurate enough, like the 
> scene I described with joel.
> 
> If in cpuhp_invoke_callback, some callback is in rude rcu-tasks read 
> ctrical section, and still holding old data, but in this time, 
> synchronize_rcu_tasks_rude() not waiting, and release old data.
> 
> Suppose the system has two cpus
> 
>    CPU0                                                                     CPU1
>                         cpu_stopper_thread
>                                                                                  take_cpu_down
>                            __cpu_disable
>                            dec __num_online_cpus 
> rcu_tasks_rude_wait_gp                                      cpuhp_invoke_callback  
>    num_online_cpus() == 1
>        return;
> 
> when __num_online_cpus == 1, the CPU1 not completely offline.
>
>Agreed with yours and Neeraj assessment.
>
>> 
>> As per my understanding, given that outgoing/incoming CPU 
>> decrements/increments the __num_online_cpus value, and 
>> num_online_cpus() is a plain read, problem could happen when the 
>> incoming CPU updates the __num_online_cpus  value, however, 
>> rcu_tasks_rude_wait_gp()'s
>> num_online_cpus() call didn't observe the increment. So,
>> cpus_read_lock/unlock() seems to be required to handle this case.
> 
> Yes, the same problem will be encountered when going online, due to 
> access  __num_online_cpus  that is not protected by 
> cpus_read_lock/unlock() in rcu_tasks_rude_wait_gp().
> 
> Do I need to change the commit information to send v2?
>
>I think so. If you could add the CPU sequence diagram you mentioned, that would be great.
>
>Also I suggest add more details of which specific parts of the hotplug process (the ones in stop machine only) are susceptible to the issue. That is, only those hotplug callbacks that are in  stop machine which may have trampolines prematurely freed from another cpu, right?

Yes, your describe is correct, I will resend, I will resend.

Thanks
Zqiang

>
>Thanks!
>
>  - Joel
>
>
>
> 
> Thanks
> Zqiang
> 
>> 
>> 
>> Thanks
>> Neeraj
>> 
>> Therefore, this commit add cpus_read_lock/unlock() before executing 
>> num_online_cpus().
>> 
>> Signed-off-by: Zqiang <qiang1.zhang@...el.com>
>> ---
>>  kernel/rcu/tasks.h | 20 ++++++++++++++++++--
>>  1 file changed, 18 insertions(+), 2 deletions(-)
>> 
>> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 
>> 4a991311be9b..08e72c6462d8 100644
>> --- a/kernel/rcu/tasks.h
>> +++ b/kernel/rcu/tasks.h
>> @@ -1033,14 +1033,30 @@ static void rcu_tasks_be_rude(struct 
>> work_struct *work)  {  }
>> 
>> +static DEFINE_PER_CPU(struct work_struct, rude_work);
>> +
>>  // Wait for one rude RCU-tasks grace period.
>>  static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)  {
>> +    int cpu;
>> +    struct work_struct *work;
>> +
>> +    cpus_read_lock();
>>      if (num_online_cpus() <= 1)
>> -        return;    // Fastpath for only one CPU.
>> +        goto end;// Fastpath for only one CPU.
>> 
>>      rtp->n_ipis += cpumask_weight(cpu_online_mask) > -    schedule_on_each_cpu(rcu_tasks_be_rude);
>> +    for_each_online_cpu(cpu) {
>> +        work = per_cpu_ptr(&rude_work, cpu);
>> +        INIT_WORK(work, rcu_tasks_be_rude);
>> +        schedule_work_on(cpu, work);
>> +    }
>> +
>> +    for_each_online_cpu(cpu)
>> +        flush_work(per_cpu_ptr(&rude_work, cpu));
>> +
>> +end:
>> +    cpus_read_unlock();
>>  }
>> 
>>  void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func);