Date:	Mon, 28 Jan 2013 09:25:55 +0000
From:	"Jan Beulich" <JBeulich@...e.com>
To:	"Ingo Molnar" <mingo@...nel.org>,
	"Linus Torvalds" <torvalds@...ux-foundation.org>
Cc:	"Milton Miller" <miltonm@....com>,
	"Wang YanQing" <udknight@...il.com>,
	"Mike Galbraith" <efault@....de>,
	"Peter Zijlstra" <peterz@...radead.org>,
	"Thomas Gleixner" <tglx@...utronix.de>,
	"Andrew Morton" <akpm@...ux-foundation.org>,
	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>,
	<mina86@...a86.org>,
	"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
	"stable" <stable@...r.kernel.org>
Subject: Re: [PATCH]smp: Fix send func call IPI to empty cpu mask

>>> On 27.01.13 at 16:50, Ingo Molnar <mingo@...nel.org> wrote:

> * Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> 
>> On Fri, Jan 25, 2013 at 11:53 PM, Wang YanQing <udknight@...il.com> wrote:
>> > I get below warning every day with 3.7,
>> > one or two times per day.
>> >
>> > [ 2235.186027] WARNING: at /mnt/sda7/kernel/linux/arch/x86/kernel/apic/ipi.c:109 default_send_IPI_mask_logical+0x2f/0xb8()
>> > [ 2235.186030] Hardware name: Aspire 4741
>> > [ 2235.186032] empty IPI mask
>> > [ 2235.186079]  [<c1015cbc>] native_send_call_func_ipi+0x4f/0x57
>> > [ 2235.186087]  [<c1053453>] smp_call_function_many+0x191/0x1a9
>> > [ 2235.186097]  [<c101e074>] native_flush_tlb_others+0x21/0x24
>> > [ 2235.186101]  [<c101e0da>] flush_tlb_page+0x63/0x89
>> > [ 2235.186105]  [<c101d360>] ptep_set_access_flags+0x20/0x26
>> > [ 2235.186111]  [<c108fadd>] do_wp_page+0x234/0x502
>> > [ 2235.186121]  [<c1090825>] handle_pte_fault+0x50d/0x54c
>> > [ 2235.186148]  [<c1090934>] handle_mm_fault+0xd0/0xe2
>> > [ 2235.186153]  [<c12dd143>] __do_page_fault+0x411/0x42d
>> > [ 2235.186166]  [<c12dd167>] do_page_fault+0x8/0xa
>> > [ 2235.186170]  [<c12db31a>] error_code+0x5a/0x60
>> >
>> > This patch fix it.
>> >
>> > This patch also fix some system hang problem:
>> > If the data->cpumask been cleared after pass
>> >
>> >         if (WARN_ONCE(!mask, "empty IPI mask"))
>> >                 return;
>> > then the problem 83d349f3 fix will happen again.
>> 
>> Hmm. We have very consciously tried to avoid the extra copy, although
>> I'm not entirely sure why (it might possibly hurt on the MAXSMP
>> configuration).
>> 
>> See for example commit 723aae25d5cd ("smp_call_function_many: handle
>> concurrent clearing of mask") which fixed another version of this
>> problem.
>> 
>> But I do agree that it looks like the copy is required, simply because
>> - as you say - once we've done the "list_add_rcu()" to add it to the
>> queue, we can have (another) IPI to the target CPU that can now see it
>> and clear the mask.
>> 
>> So by the time we get to actually send the IPI, the mask might have
>> been cleared by another IPI. So I do agree that your patch seems
>> correct, but I really really want to run it by other people.
>> 
>> Guys? Original patch on lkml. The other possible fix might be 
>> to take the &call_function.lock earlier in 
>> generic_smp_call_function_interrupt(), so that we can never 
>> clear the bit while somebody is adding entries to the list... 
>> But I think it very much tries to avoid that on purpose right 
>> now, with only the last CPU responding to that IPI taking the 
>> lock.
>> 
>> So copying the IPI mask seems to be the reasonable approach. 
>> Comments?
> 
> Agreed, looks correct to me as well - I've queued the fix up in 
> tip:x86/urgent.

But the patch is obviously incomplete for the CPUMASK_OFFSTACK
case, as the newly added cpumask_ipi member never gets
its bit array allocated.
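
To illustrate the concern, here is a minimal userspace sketch, not kernel code and not the actual patch. It assumes a model in which, as with CPUMASK_OFFSTACK, the mask type is a bare pointer whose bit array must be allocated separately. The names call_data_model, mask_var_t, alloc_mask, call_data_init, call_data_destroy and snapshot_ipi_mask are hypothetical; only the cpumask_ipi member name comes from the discussion above. In the kernel the allocation would presumably go through the existing alloc_cpumask_var() family rather than calloc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace model; sizes and names are illustrative only. */
#define NR_CPUS_MODEL 64
#define MASK_WORDS    ((NR_CPUS_MODEL + 63) / 64)

/*
 * Models the off-stack case: the mask variable is only a pointer,
 * so declaring a new member does not give it any storage.
 */
typedef unsigned long long *mask_var_t;

struct call_data_model {
	mask_var_t cpumask;     /* shared mask; responding CPUs clear bits here */
	mask_var_t cpumask_ipi; /* private copy used when sending the IPI */
};

static bool alloc_mask(mask_var_t *m)
{
	*m = calloc(MASK_WORDS, sizeof(**m));
	return *m != NULL;
}

/* Both members need their bit arrays allocated, not just the original one. */
static bool call_data_init(struct call_data_model *d)
{
	d->cpumask = NULL;
	d->cpumask_ipi = NULL;

	if (!alloc_mask(&d->cpumask))
		return false;

	/* This is the step the model assumes was missing for the new member. */
	if (!alloc_mask(&d->cpumask_ipi)) {
		free(d->cpumask);
		d->cpumask = NULL;
		return false;
	}
	return true;
}

static void call_data_destroy(struct call_data_model *d)
{
	free(d->cpumask);
	free(d->cpumask_ipi);
	d->cpumask = NULL;
	d->cpumask_ipi = NULL;
}

/*
 * Take the snapshot before the entry becomes visible to other CPUs,
 * so that later clearing of d->cpumask cannot empty the IPI mask.
 */
static void snapshot_ipi_mask(struct call_data_model *d)
{
	/* Without the second allocation this would copy through NULL. */
	assert(d->cpumask != NULL && d->cpumask_ipi != NULL);
	memcpy(d->cpumask_ipi, d->cpumask, MASK_WORDS * sizeof(*d->cpumask));
}
```

In this model, skipping the second alloc_mask() call leaves cpumask_ipi as NULL, and the copy in snapshot_ipi_mask() would dereference it.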

Jan

