lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 16 Feb 2009 21:55:01 +0100
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Nick Piggin <npiggin@...e.de>,
	Jens Axboe <jens.axboe@...cle.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...e.hu>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Steven Rostedt <rostedt@...dmis.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/4] generic-smp: remove single ipi fallback for
 smp_call_function_many()

On Mon, 2009-02-16 at 21:30 +0100, Oleg Nesterov wrote:
> On 02/16, Peter Zijlstra wrote:
> >
> > On Mon, 2009-02-16 at 20:10 +0100, Oleg Nesterov wrote:
> > > Actually I don't understand the counter/free_list logic.
> > >
> > > 	generic_smp_call_function_interrupt:
> > >
> > > 			/*
> > > 			 * When the global queue is empty, its guaranteed that no cpu
> > > 			 * is still observing any entry on the free_list, therefore
> > > 			 * we can go ahead and unlock them.
> > > 			 */
> > > 			if (!--call_function.counter)
> > > 				list_splice_init(&call_function.free_list, &free_list);
> > >
> > > I can't see why "its guaranteed that no cpu ...". Let's suppose CPU 0
> > > "hangs" for some reason in generic_smp_call_function_interrupt() right
> > > before "if (!cpumask_test_cpu(cpu, data->cpumask))" test. Then it is
> > > possible that another CPU removes the single entry (which doesn't have
> > > CPU 0 in data->cpumask) from call_function.queue.
> >
> > Then call_function.counter wouldn't be 0, and we would not release the
> > list entries.
> 
> Why it wouldn't be 0? IOW, do you mean that the spurious CALL_FUNCTION_VECTOR
> is not possible?

Ah, I think I see what you mean, you're right. I should not be counting
the number of entries in the queue, for when the last one gets removed
does not correlate to another observing entries.

> > > Now, if that entry is re-added, we can have a problem if CPU 0 sees
> > > the partly initialized entry.
> >
> > Right, so the scenario is that a cpu observes the list entry after we do
> > list_del_rcu() -- most likely a cpu not in the original mask, taversing
> > the list for a next entry -- we have to wait for some quiescent state to
> > ensure nobody is still referencing it.
> 
> Yes I see. But if we change generic_smp_call_function_interrupt() to
> re-check cpumask_test_cpu(cpu, data->cpumask) under data->lock then
> we don't have to wait for quiescent state, afaics. And we have to
> take this lock anyway.

Suppose entries A,B,C,D are queued, and some cpu1 is traversing the list
to execute C and is currently at B waiting for data->lock. Concurrently
cpu2 has completed B and removes it from the list, when cpu3 takes B and
inserts it at the end.

So we end up with A->C->D->B, and cpu1 will find its at the end of the
list and stop, never having seen C, the one it was meant to execute.

One possible solution would be to always queue new entries at the head,
but that would be unfair wrt function execution -- do we care?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ