[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090120082946.GC31473@elte.hu>
Date: Tue, 20 Jan 2009 09:29:46 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Li Zefan <lizf@...fujitsu.com>
Cc: Jaswinder Singh Rajput <jaswinder@...nel.org>,
Nick Piggin <npiggin@...e.de>,
Rusty Russell <rusty@...tcorp.com.au>,
Mike Travis <travis@....com>,
LKML <linux-kernel@...r.kernel.org>,
Suresh Siddha <suresh.b.siddha@...el.com>,
"Pallipadi, Venkatesh" <venkatesh.pallipadi@...el.com>
Subject: Re: [BUG] kernel BUG at arch/x86/kernel/tlb_32.c:130!
* Li Zefan <lizf@...fujitsu.com> wrote:
> Ingo Molnar wrote:
> > * Ingo Molnar <mingo@...e.hu> wrote:
> >
> >> * Li Zefan <lizf@...fujitsu.com> wrote:
> >>
> >>> I was using mmotm 2009-01-16-16-18, and I ran into this BUG,
> >>> the line is:
> >>> BUG_ON(cpumask_empty(cpumask));
> >>>
> >>> I suspect it is caused by:
> >>>
> >>> commit 4595f9620cda8a1e973588e743cf5f8436dd20c6
> >>> Author: Rusty Russell <rusty@...tcorp.com.au>
> >>> Date: Sat Jan 10 21:58:09 2009 -0800
> >>>
> >>> x86: change flush_tlb_others to take a const struct cpumask
> >>>
> >>> Impact: reduce stack usage, use new cpumask API.
> >> Jaswinder reported a similar crash.
> >>
> >> Mike, Rusty, what's going on with this commit? Why does this code:
> >>
> >> + if (cpumask_any_but(&mm->cpu_vm_mask, smp_processor_id()) < nr_cpu_ids)
> >> + flush_tlb_others(&mm->cpu_vm_mask, mm, TLB_FLUSH_ALL);
> >>
> >> Assume that mm->cpu_vm_mask wont change? TLB flushes go async and the
> >> MM's schedulability is not locked during that. I.e. mm->cpu_vm_mask can
> >> change under you while the TLB flush IPIs are flying around - while when
> >> the cpumask was passed on-stack this wouldnt happen.
> >
> > okay, a testsystem of mine just triggered this crash too.
> >
> > Li Zefan, Jaswinder, does the patch below fix it for you?
> >
>
> I'll test it, but I have to run for several hours to confirm it, since
> the bug is not easy to trigger. :)
yes. It triggered on an old 32-bit dual-socket HyperThreading system for
me and that is because HyperThreading is very good at triggering narrow
SMP races. (which this one certainly is)
(I'd expect Nehalem to trigger this TLB flushing race more easily too,
where HyperThreading made a comeback.)
Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists