Message-ID: <1296530136.7862.22.camel@marge.simson.net>
Date:	Tue, 01 Feb 2011 04:15:36 +0100
From:	Mike Galbraith <efault@....de>
To:	Milton Miller <miltonm@....com>
Cc:	Peter Zijlstra <peterz@...radead.org>, akpm@...ux-foundation.org,
	Anton Blanchard <anton@...ba.org>,
	xiaoguangrong@...fujitsu.com, mingo@...e.hu, jaxboe@...ionio.com,
	npiggin@...il.com, rusty@...tcorp.com.au,
	torvalds@...ux-foundation.org, paulmck@...ux.vnet.ibm.com,
	benh@...nel.crashing.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] smp_call_function_many: handle concurrent clearing of
 mask

On Mon, 2011-01-31 at 14:26 -0600, Milton Miller wrote:
> On Mon, 31 Jan 2011 about 08:21:22 +0100,  Mike Galbraith wrote:
> > Wondering if a final sanity check makes sense.  I've got a perma-spin
> > bug where comment apparently happened.  Another CPU's diddle the mask
> > IPI may make this CPU do horrible things to itself as it's setting up to
> > IPI others with that mask.
> > 
> > ---
> >  kernel/smp.c |    3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > Index: linux-2.6.38.git/kernel/smp.c
> > ===================================================================
> > --- linux-2.6.38.git.orig/kernel/smp.c
> > +++ linux-2.6.38.git/kernel/smp.c
> > @@ -490,6 +490,9 @@ void smp_call_function_many(const struct
> >  	cpumask_and(data->cpumask, mask, cpu_online_mask);
> >  	cpumask_clear_cpu(this_cpu, data->cpumask);
> >  
> > +	/* Did you pass me a mask that can be changed/emptied under me? */
> > +	BUG_ON(cpumask_empty(data->cpumask));
> > +
> 
> I was thinking of this as "the ipi cpumask was cleared", but I realize now
> you are saying the caller passed in a cpumask, but between the cpu_first/
> cpu_next calls above and the cpumask_and another cpu cleared all the cpus?
> 
> I could see how that could happen on say a mask of cpus that might have a
> translation context, or cpus that need a push to complete an rcu window.
> Instead of the BUG_ON, we can handle the mask being cleared.
> 
> The arch code to send the IPI must handle an empty mask, as the other
> cpus are racing to clear their bit while it's trying to send the IPI.
> In fact that expected race is the cause of the x86 warning in bz 23042
> https://bugzilla.kernel.org/show_bug.cgi?id=23042  that Andrew pointed
> out.
> 
> 
> How about this [untested] patch?
> 
> Mike Galbraith reported finding a lockup where apparently the passed in
> cpumask was cleared on other cpu(s) while this cpu was preparing its
> smp_call_function_many block.   Detect this race and unlock the call
> data block.  Note: arch_send_call_function_ipi_mask must still handle an
> empty mask because the element is globally visible before it is called.
> And obviously there are no guarantees to which cpus are notified if the
> mask is changed during the call.

Yes, that would work.  In my case, it was passed mm_cpumask(mm).  What
is unclear is whether the mask at call time was what the programmer needed
action on, i.e. a mask changing under the call may be intolerable
information loss/gain.

	-Mike
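
To make the race discussed above concrete, here is a minimal userspace
sketch, not the kernel code and not Milton's patch.  It assumes a toy
64-bit word as the cpumask, and every toy_* name is hypothetical.  The
two mask arguments stand for the same caller-supplied mask as seen at two
different times: once by the early scan for another CPU, and once by the
later cpumask_and().  Instead of hitting BUG_ON() on an empty result, the
function reports the race so the caller could release its call data block.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model: one bit per CPU, at most 64 CPUs. */
#define TOY_NR_CPUS 64

enum toy_result {
	TOY_NOTHING_TO_DO,	/* early scan found no other CPU to signal */
	TOY_RACED_EMPTY,	/* mask emptied between the scan and the AND */
	TOY_SEND_IPI		/* *targets holds at least one CPU */
};

/*
 * mask_at_scan: caller's mask as seen by the early scan.
 * mask_at_and:  the same mask as seen later by the AND step.
 * online:       stand-in for cpu_online_mask.
 */
static enum toy_result toy_call_many(uint64_t mask_at_scan,
				     uint64_t mask_at_and,
				     uint64_t online,
				     unsigned int this_cpu,
				     uint64_t *targets)
{
	uint64_t self;

	assert(this_cpu < TOY_NR_CPUS);
	self = UINT64_C(1) << this_cpu;
	*targets = 0;

	/* Early scan: is there any online CPU other than this one? */
	if ((mask_at_scan & online & ~self) == 0)
		return TOY_NOTHING_TO_DO;

	/* Later read: models cpumask_and() plus cpumask_clear_cpu(). */
	*targets = mask_at_and & online & ~self;

	/* Handle the race rather than BUG_ON(): caller unlocks and returns. */
	if (*targets == 0)
		return TOY_RACED_EMPTY;

	return TOY_SEND_IPI;
}
```

With CPUs 0-3 online and this_cpu 0, passing 0x6 for both reads yields
TOY_SEND_IPI with targets 0x6, while 0x6 at the scan and 0x0 at the AND
yields TOY_RACED_EMPTY.  Passing 0x6 then 0x2 still sends, but only to
CPU 1, which is the information loss/gain question raised above: the
sketch cannot tell whether the caller needed action on the earlier mask.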

