Date:	Mon, 16 Feb 2009 20:41:32 +0100
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Nick Piggin <npiggin@...e.de>,
	Jens Axboe <jens.axboe@...cle.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...e.hu>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Steven Rostedt <rostedt@...dmis.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/4] generic-smp: remove single ipi fallback for
 smp_call_function_many()

On Mon, 2009-02-16 at 20:10 +0100, Oleg Nesterov wrote:
> On 02/16, Peter Zijlstra wrote:
> >
> > @@ -347,31 +384,32 @@ void smp_call_function_many(const struct
> >  		return;
> >  	}
> >
> > -	data = kmalloc(sizeof(*data) + cpumask_size(), GFP_ATOMIC);
> > -	if (unlikely(!data)) {
> > -		/* Slow path. */
> > -		for_each_online_cpu(cpu) {
> > -			if (cpu == smp_processor_id())
> > -				continue;
> > -			if (cpumask_test_cpu(cpu, mask))
> > -				smp_call_function_single(cpu, func, info, wait);
> > -		}
> > -		return;
> > +	data = kmalloc(sizeof(*data), GFP_ATOMIC);
> > +	if (data)
> > +		data->csd.flags = CSD_FLAG_ALLOC;
> > +	else {
> > +		data = &per_cpu(cfd_data, me);
> > +		/*
> > +		 * We need to wait for all previous users to go away.
> > +		 */
> > +		while (data->csd.flags & CSD_FLAG_LOCK)
> > +			cpu_relax();
> > +		data->csd.flags = CSD_FLAG_LOCK;
> >  	}
> >
> >  	spin_lock_init(&data->lock);
> > -	data->csd.flags = CSD_FLAG_ALLOC;
> >  	if (wait)
> >  		data->csd.flags |= CSD_FLAG_WAIT;
> >  	data->csd.func = func;
> >  	data->csd.info = info;
> > -	cpumask_and(to_cpumask(data->cpumask_bits), mask, cpu_online_mask);
> > -	cpumask_clear_cpu(smp_processor_id(), to_cpumask(data->cpumask_bits));
> > -	data->refs = cpumask_weight(to_cpumask(data->cpumask_bits));
> > -
> > -	spin_lock_irqsave(&call_function_lock, flags);
> > -	list_add_tail_rcu(&data->csd.list, &call_function_queue);
> > -	spin_unlock_irqrestore(&call_function_lock, flags);
> > +	cpumask_and(&data->cpumask, mask, cpu_online_mask);
> > +	cpumask_clear_cpu(smp_processor_id(), &data->cpumask);
> 
> (perhaps it makes sense to use "me" instead of smp_processor_id())

Ah, missed one it seems, thanks ;-)

> > +	data->refs = cpumask_weight(&data->cpumask);
> > +
> > +	spin_lock_irqsave(&call_function.lock, flags);
> > +	call_function.counter++;
> > +	list_add_tail_rcu(&data->csd.list, &call_function.queue);
> > +	spin_unlock_irqrestore(&call_function.lock, flags);
> 
> What if the initialization above leaks into the critical section?
> 
> I mean, generic_smp_call_function_interrupt() running on another CPU
> can see the result of list_add_tail_rcu() and cpumask_and(data->cpumask)
> but not (say) "data->refs = ...".

Ho humm, nice :-)

So the best fix would be to put that data initialization under data->lock.
This would be a bug in the original code too.
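
Something like this, I suppose (untested sketch, just to show where I
mean the initialization to go -- whether holding data->lock here is by
itself enough ordering against the list_add_tail_rcu() below still wants
a careful look):

	spin_lock_init(&data->lock);

	/*
	 * Initialize the element while holding data->lock, so that any
	 * cpu that finds it on the queue and takes data->lock in
	 * generic_smp_call_function_interrupt() is guaranteed to see
	 * ->refs, ->cpumask etc. fully set up.
	 */
	spin_lock(&data->lock);
	if (wait)
		data->csd.flags |= CSD_FLAG_WAIT;
	data->csd.func = func;
	data->csd.info = info;
	cpumask_and(&data->cpumask, mask, cpu_online_mask);
	cpumask_clear_cpu(me, &data->cpumask);
	data->refs = cpumask_weight(&data->cpumask);
	spin_unlock(&data->lock);

	spin_lock_irqsave(&call_function.lock, flags);
	call_function.counter++;
	list_add_tail_rcu(&data->csd.list, &call_function.queue);
	spin_unlock_irqrestore(&call_function.lock, flags);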

> Actually I don't understand the counter/free_list logic.
> 
> 	generic_smp_call_function_interrupt:
> 
> 			/*
> 			 * When the global queue is empty, it's guaranteed that no cpu
> 			 * is still observing any entry on the free_list, therefore
> 			 * we can go ahead and unlock them.
> 			 */
> 			if (!--call_function.counter)
> 				list_splice_init(&call_function.free_list, &free_list);
> 
> I can't see why "its guaranteed that no cpu ...". Let's suppose CPU 0
> "hangs" for some reason in generic_smp_call_function_interrupt() right
> before "if (!cpumask_test_cpu(cpu, data->cpumask))" test. Then it is
> possible that another CPU removes the single entry (which doesn't have
> CPU 0 in data->cpumask) from call_function.queue.

Then call_function.counter wouldn't be 0, and we would not release the
list entries.

> Now, if that entry is re-added, we can have a problem if CPU 0 sees
> the partly initialized entry.

Right, so the scenario is that a cpu observes the list entry after we do
list_del_rcu() -- most likely a cpu not in the original mask, traversing
the list for the next entry -- so we have to wait for some quiescent state
to ensure nobody is still referencing it.

We cannot use regular RCU, because its quiescent state takes forever to
happen; therefore this implements a simple counting RCU for the queue
domain only.

When the list is empty, there's nobody seeing any elements, ergo we can
release the entries and re-use them.
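
In code the scheme is roughly this (a simplified sketch, not the exact
patch; in particular the free_list linkage member on the data element is
a made-up name here):

	/* smp_call_function_many(): adding an element */
	spin_lock_irqsave(&call_function.lock, flags);
	call_function.counter++;	/* one more live entry on the queue */
	list_add_tail_rcu(&data->csd.list, &call_function.queue);
	spin_unlock_irqrestore(&call_function.lock, flags);

	/* generic_smp_call_function_interrupt(): retiring an element */
	spin_lock(&call_function.lock);
	list_del_rcu(&data->csd.list);
	/*
	 * Other cpus may still be walking the queue through this
	 * element's ->next pointer, so park it rather than reuse it.
	 */
	list_add(&data->free_list, &call_function.free_list);
	/*
	 * Once the queue has gone empty, nobody can still be looking at
	 * anything parked above, so all parked entries can be unlocked
	 * and reused.
	 */
	if (!--call_function.counter)
		list_splice_init(&call_function.free_list, &free_list);
	spin_unlock(&call_function.lock);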

> But why do we need counter/free_list at all?
> Can't we do the following:
> 
> 	- initialize call_function_data.lock at boot time
> 
> 	- change smp_call_function_many() to initialize cfd_data
> 	  under ->lock
> 
> 	- change generic_smp_call_function_interrupt() to do
> 
> 		list_for_each_entry_rcu(data) {
> 
> 			if (!cpumask_test_cpu(cpu, data->cpumask))
> 				continue;
> 
> 			spin_lock(&data->lock);
> 			if (!cpumask_test_cpu(cpu, data->cpumask)) {
> 				spin_unlock(&data->lock);
> 				continue;
> 			}
> 
> 			cpumask_clear_cpu(cpu, data->cpumask);
> 			refs = --data->refs;
> 
> 			func = data->func;
> 			info = data->info;
> 			spin_unlock(&data->lock);
> 
> 			func(info);
> 
> 			if (refs)
> 				continue;
> 
> 			...
> 		}
> 
> Afaics, it is OK if smp_call_function_many() sees the "unneeded" entry on
> the list; the only thing we must ensure is that we can't "misunderstand"
> this entry.
> 
> No?

Hmm, could that not lead to loops and/or missed entries in the
list iteration?

The safer approach seemed to be to wait until we are sure nobody still
observes the entry before re-using it.

> 
> Off-topic question: it looks like smp_call_function_single() must not be
> called from an interrupt/bh handler, right? But the comment says nothing.
> 
> And,
> 	smp_call_function_single:
> 
> 		/* Can deadlock when called with interrupts disabled */
> 		WARN_ON(irqs_disabled());
> 	
> Just curious, we can only deadlock if wait = T, right?

Right.
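
Only with wait do we spin, with interrupts off, for the other cpu to
process our IPI -- and that cpu may well be doing the exact same thing to
us. To spell it out (purely illustrative, not code from the patch):

	/*
	 *   CPU 0                              CPU 1
	 *   local_irq_disable();               local_irq_disable();
	 *   smp_call_function_single(1,        smp_call_function_single(0,
	 *                func, NULL, 1);                    func, NULL, 1);
	 *     queues csd, sends IPI,             queues csd, sends IPI,
	 *     spins until CPU 1 runs func        spins until CPU 0 runs func
	 *
	 * Neither cpu ever services the other's IPI because interrupts
	 * are off on both, so both spin forever.  With wait == 0 the
	 * calls return right after queueing and nothing spins.
	 */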
