Message-ID: <20140510030635.GC22539@mtj.dyndns.org>
Date:	Fri, 9 May 2014 23:06:35 -0400
From:	Tejun Heo <tj@...nel.org>
To:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>, peterz@...radead.org,
	tglx@...utronix.de, mingo@...nel.org, rusty@...tcorp.com.au,
	fweisbec@...il.com, hch@...radead.org, mgorman@...e.de,
	riel@...hat.com, bp@...e.de, rostedt@...dmis.org,
	mgalbraith@...e.de, ego@...ux.vnet.ibm.com,
	paulmck@...ux.vnet.ibm.com, oleg@...hat.com, rjw@...ysocki.net,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/2] CPU hotplug, stop-machine: Plug race-window that
 leads to "IPI-to-offline-CPU"

On Wed, May 07, 2014 at 03:31:51AM +0530, Srivatsa S. Bhat wrote:
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index 01fbae5..7abb361 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -165,12 +165,13 @@ static void ack_state(struct multi_stop_data *msdata)
>  		set_state(msdata, msdata->state + 1);
>  }
>  
> +

Why add a blank line here?

>  /* This is the cpu_stop function which stops the CPU. */
>  static int multi_cpu_stop(void *data)
>  {
>  	struct multi_stop_data *msdata = data;
>  	enum multi_stop_state curstate = MULTI_STOP_NONE;
> -	int cpu = smp_processor_id(), err = 0;
> +	int cpu = smp_processor_id(), num_active_cpus, err = 0;

	TYPE var0 = INIT0, var1, var2 = INIT2;

looks kinda weird.  Maybe collect the initialized ones on one side, or
move the uninitialized one out into a separate declaration?

Also, isn't nr_active_cpus the more common way of naming it?
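
As a rough sketch of the layout suggested above (illustration only, not
part of the patch): fake_processor_id() and declaration_layout_demo()
are hypothetical names made up for this example, with the former
standing in for smp_processor_id() so the snippet builds outside the
kernel.

```c
/* Hypothetical stand-in for smp_processor_id(); not the kernel helper. */
static int fake_processor_id(void)
{
	return 0;
}

/* Hypothetical function that only demonstrates the declaration layout. */
static int declaration_layout_demo(int weight)
{
	int cpu = fake_processor_id(), err = 0;	/* initialized ones together */
	int nr_active_cpus;			/* uninitialized one on its own */

	nr_active_cpus = weight;
	return cpu + err + nr_active_cpus;
}
```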

>  	unsigned long flags;
>  	bool is_active;
>  
> @@ -180,15 +181,38 @@ static int multi_cpu_stop(void *data)
>  	 */
>  	local_save_flags(flags);
>  
> -	if (!msdata->active_cpus)
> +	if (!msdata->active_cpus) {
>  		is_active = cpu == cpumask_first(cpu_online_mask);
> -	else
> +		num_active_cpus = 1;
> +	} else {
>  		is_active = cpumask_test_cpu(cpu, msdata->active_cpus);
> +		num_active_cpus = cpumask_weight(msdata->active_cpus);
> +	}
>  
>  	/* Simple state machine */
>  	do {
>  		/* Chill out and ensure we re-read multi_stop_state. */
>  		cpu_relax();
> +
> +		/*
> +		 * In the case of CPU offline, we don't want the other CPUs to
> +		 * send IPIs to the active_cpu (the one going offline) after it
> +		 * has entered the _DISABLE_IRQ state (because, then it will
> +		 * notice the IPIs only after it goes offline). So ensure that
> +		 * the active_cpu always follows the others while entering
> +		 * each subsequent state in this state-machine.
> +		 *
> +		 * msdata->thread_ack tracks the number of CPUs that are yet to
> +		 * move to the next state, during each transition. So make the
> +		 * active_cpu(s) wait until ->thread_ack indicates that the
> +		 * active_cpus are the only ones left to complete the transition.
> +		 */
> +		if (is_active) {
> +			/* Wait until all the non-active threads ack the state */
> +			while (atomic_read(&msdata->thread_ack) > num_active_cpus)
> +				cpu_relax();
> +		}

Wouldn't it be cleaner to split this out into a separate stage, so
that there are two DISABLE_IRQ stages, something like
MULTI_STOP_DISABLE_IRQ_INACTIVE and MULTI_STOP_DISABLE_IRQ_ACTIVE?
The above adds an ad-hoc mechanism on top of the existing mechanism,
which is built to sequence similar things anyway.
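
To make the two-stage idea concrete, here is a minimal sketch, not a
tested patch.  The two DISABLE_IRQ names are the ones suggested above;
the remaining state names are assumed to match the existing enum, and
should_disable_irq() is a hypothetical helper invented for this
example.  The point is that the existing state sequencing already
guarantees every CPU has acked one state before any CPU enters the
next, so the active CPU would disable IRQs only after all the others
have done so.

```c
#include <stdbool.h>

/* Hypothetical state list: the single DISABLE_IRQ stage split in two. */
enum multi_stop_state {
	MULTI_STOP_NONE,
	MULTI_STOP_PREPARE,			/* assumed existing state */
	MULTI_STOP_DISABLE_IRQ_INACTIVE,	/* non-active CPUs disable IRQs */
	MULTI_STOP_DISABLE_IRQ_ACTIVE,		/* active CPU(s) follow */
	MULTI_STOP_RUN,				/* assumed existing state */
	MULTI_STOP_EXIT,			/* assumed existing state */
};

/*
 * Hypothetical helper: returns true if a CPU should disable its
 * interrupts when it observes @state.  Non-active CPUs do so in the
 * first DISABLE_IRQ stage, active CPUs in the second.
 */
static bool should_disable_irq(enum multi_stop_state state, bool is_active)
{
	switch (state) {
	case MULTI_STOP_DISABLE_IRQ_INACTIVE:
		return !is_active;
	case MULTI_STOP_DISABLE_IRQ_ACTIVE:
		return is_active;
	default:
		return false;
	}
}
```

With something like this, the is_active busy-wait on ->thread_ack and
the extra active-CPU count would presumably no longer be needed.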

Thanks.

-- 
tejun