linux-kernel - Re: [PATCH v4 2/2] CPU hotplug, stop-machine: Plug race-window that leads to "IPI-to-offline-CPU"

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <53746492.20701@linux.vnet.ibm.com>
Date:	Thu, 15 May 2014 12:24:10 +0530
From:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
To:	Frederic Weisbecker <fweisbec@...il.com>
CC:	Tejun Heo <tj@...nel.org>, peterz@...radead.org,
	tglx@...utronix.de, mingo@...nel.org, rusty@...tcorp.com.au,
	akpm@...ux-foundation.org, hch@...radead.org, mgorman@...e.de,
	riel@...hat.com, bp@...e.de, rostedt@...dmis.org,
	mgalbraith@...e.de, ego@...ux.vnet.ibm.com,
	paulmck@...ux.vnet.ibm.com, oleg@...hat.com, rjw@...ysocki.net,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 2/2] CPU hotplug, stop-machine: Plug race-window that
 leads to "IPI-to-offline-CPU"

On 05/13/2014 09:27 PM, Frederic Weisbecker wrote:
> On Tue, May 13, 2014 at 02:32:00PM +0530, Srivatsa S. Bhat wrote:
>>
>>  kernel/stop_machine.c |   39 ++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 34 insertions(+), 5 deletions(-)
>>
>> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
>> index 01fbae5..288f7fe 100644
>> --- a/kernel/stop_machine.c
>> +++ b/kernel/stop_machine.c
>> @@ -130,8 +130,10 @@ enum multi_stop_state {
>>  	MULTI_STOP_NONE,
>>  	/* Awaiting everyone to be scheduled. */
>>  	MULTI_STOP_PREPARE,
>> -	/* Disable interrupts. */
>> -	MULTI_STOP_DISABLE_IRQ,
>> +	/* Disable interrupts on CPUs not in ->active_cpus mask. */
>> +	MULTI_STOP_DISABLE_IRQ_INACTIVE,
>> +	/* Disable interrupts on CPUs in ->active_cpus mask. */
>> +	MULTI_STOP_DISABLE_IRQ_ACTIVE,
>>  	/* Run the function */
>>  	MULTI_STOP_RUN,
>>  	/* Exit */
>> @@ -189,12 +191,39 @@ static int multi_cpu_stop(void *data)
>>  	do {
>>  		/* Chill out and ensure we re-read multi_stop_state. */
>>  		cpu_relax();
>> +
>> +		/*
>> +		 * We use 2 separate stages to disable interrupts, namely
>> +		 * _INACTIVE and _ACTIVE, to ensure that the inactive CPUs
>> +		 * disable their interrupts first, followed by the active CPUs.
>> +		 *
>> +		 * This is done to avoid a race in the CPU offline path, which
>> +		 * can lead to receiving IPIs on the outgoing CPU *after* it
>> +		 * has gone offline.
>> +		 *
>> +		 * During CPU offline, we don't want the other CPUs to send
>> +		 * IPIs to the active_cpu (the outgoing CPU) *after* it has
>> +		 * disabled interrupts (because, then it will notice the IPIs
>> +		 * only after it has gone offline). We can prevent this by
>> +		 * making the other CPUs disable their interrupts first - that
>> +		 * way, they will run the stop-machine code with interrupts
>> +		 * disabled, and hence won't send IPIs after that point.
>> +		 */
>> +
>>  		if (msdata->state != curstate) {
>>  			curstate = msdata->state;
>>  			switch (curstate) {
>> -			case MULTI_STOP_DISABLE_IRQ:
>> -				local_irq_disable();
>> -				hard_irq_disable();
>> +			case MULTI_STOP_DISABLE_IRQ_INACTIVE:
>> +				if (!is_active) {
>> +					local_irq_disable();
>> +					hard_irq_disable();
>> +				}
>> +				break;
>> +			case MULTI_STOP_DISABLE_IRQ_ACTIVE:
>> +				if (is_active) {
>> +					local_irq_disable();
>> +					hard_irq_disable();
> 
> I have no idea about possible IPI latencies due to hardware. But are we sure that a stop
> machine transition state is enough to make sure we get a pending IPI? Shouldn't we have
> some sort of IPI flush in between, like polling on call_single_queue?
> 

That might not be actually required, but the concept of flushing out
all pending work before going offline (irrespective of whether the
outgoing CPU got the corresponding IPIs in time or not) sounds like
a good idea, because we can guarantee that any late IPIs landing on
the CPU (and thus generating the warning) will be completely harmless.

We can empty the call_single_queue after disabling interrupts in
the _ACTIVE stage. That way, we can guarantee that all pending IPI
functions have been executed by the outgoing CPU (even if the
corresponding IPIs come a bit later). Also, no new IPI work can be
assigned to that CPU beyond the _INACTIVE stage.

Thanks for the suggestion, Frederic!

Regards,
Srivatsa S. Bhat

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/