lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c947326e-26bc-1e86-60c0-cbe939b0002d@oracle.com>
Date:   Wed, 5 Jan 2022 15:28:26 -0800
From:   Dongli Zhang <dongli.zhang@...cle.com>
To:     Tim Chen <tim.c.chen@...ux.intel.com>, x86@...nel.org
Cc:     tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
        dave.hansen@...ux.intel.com, hpa@...or.com, peterz@...radead.org,
        rafael.j.wysocki@...el.com, vkuznets@...hat.com,
        alison.schofield@...el.com, boris.ostrovsky@...cle.com,
        joe.jin@...cle.com, longpeng2@...wei.com, bigeasy@...utronix.de,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH RESEND 1/1] x86/smpboot: check cpu_initialized_mask first
 after returning from schedule()

Hi Tim,

On 1/5/22 11:19 AM, Tim Chen wrote:
> On Thu, 2021-12-23 at 13:03 -0800, Dongli Zhang wrote:
>> To online a new CPU, the master CPU signals the secondary and waits
>> for at
>> most 10 seconds until cpu_initialized_mask is set by the secondary
>> CPU.
>> There is a case that the master CPU fails the online when it takes
>> 10+
>> seconds for schedule() to return (although the cpu_initialized_mask
>> is
>> already set by the secondary CPU).
>>
>> 1. The master CPU signals the secondary CPU in do_boot_cpu().
>>
>> 2. As the cpu_initialized_mask is still not set, the master CPU
>> reschedules
>> via schedule().
>>
>> 3. Suppose it takes 10+ seconds until schedule() returns due to
>> performance
>> issue. The secondary CPU sets the cpu_initialized_mask during those
>> 10+
>> seconds.
> 
> The patch seems reasonable. But do you know what other task got run
> and caused such long delay (>10 sec), preventing switch
> back to in the master CPU?  It seems like there could be some problem
> causing the long delay. It is better if we understand the root cause
> of that.

So far I do not have a consistent repro to verify which kernel function (or
kernel thread) is making the trouble.

However, it is not necessary for other kernel task to take >10 sec to reproduce
the issue, e.g.:

1. The master CPU signals the secondary CPU.
2. Due to the issue at KVM or hardware level (let's assume there is an issue),
the secondary CPU is waken up at the 8th second.
3. The schedule() returns after 3 seconds (due to the 3-second delay by another
kernel task).
4. The 3+8=11 seconds are more than 10 seconds.

Therefore, we should first check whether the secondary CPU is waken up, instead
of timeout. Otherwise, the secondary CPU will be put into an non-recoverable
state, until the OS reboots,

Thank you very much!

Dongli Zhang

> 
> Tim
> 
>>
>> 4. Once the schedule() at the master CPU returns, although the
>> cpu_initialized_mask is set, the time_before(jiffies, timeout) fails.
>> As a
>> result, the master CPU regards this as failure.
>>
>> [   51.983296] smpboot: do_boot_cpu failed(-1) to wakeup CPU#4
>>
>> 5. Since the secondary CPU is stuck at state CPU_UP_PREPARE, any
>> further
>> online operation will fail by cpu_check_up_prepare(), unless the
>> below
>> patch set is applied.
>>
>> https://urldefense.com/v3/__https://lore.kernel.org/all/20211206152034.2150770-1-bigeasy@linutronix.de/__;!!ACWV5N9M2RV99hQ!d7NY8MgLj7W5ZGWS_0HHsvE52WNh2WJbJwLNBJYLGzGIY9BzKg-PUYiIYMmrwud36Ys$ 
>>
>> This issue is resolved by always first checking whether the secondary
>> CPU
>> has set cpu_initialized_mask.
>>
>> Cc: Longpeng(Mike) <longpeng2@...wei.com>
>> Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
>> Cc: Joe Jin <joe.jin@...cle.com>
>> Signed-off-by: Dongli Zhang <dongli.zhang@...cle.com>
>> ---
>> To resend by Cc linux-kernel@...r.kernel.org as well.
>>
>>  arch/x86/kernel/smpboot.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>> index 617012f4619f..faad4fcf67eb 100644
>> --- a/arch/x86/kernel/smpboot.c
>> +++ b/arch/x86/kernel/smpboot.c
>> @@ -1145,7 +1145,7 @@ static int do_boot_cpu(int apicid, int cpu,
>> struct task_struct *idle,
>>  		 */
>>  		boot_error = -1;
>>  		timeout = jiffies + 10*HZ;
>> -		while (time_before(jiffies, timeout)) {
>> +		while (true) {
>>  			if (cpumask_test_cpu(cpu,
>> cpu_initialized_mask)) {
>>  				/*
>>  				 * Tell AP to proceed with
>> initialization
>> @@ -1154,6 +1154,10 @@ static int do_boot_cpu(int apicid, int cpu,
>> struct task_struct *idle,
>>  				boot_error = 0;
>>  				break;
>>  			}
>> +
>> +			if (time_after_eq(jiffies, timeout))
>> +				break;
>> +
>>  			schedule();
>>  		}
>>  	}
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ