Message-ID: <20090604211744.GB9213@us.ibm.com>
Date: Thu, 4 Jun 2009 14:17:45 -0700
From: Gary Hade <garyhade@...ibm.com>
To: Gary Hade <garyhade@...ibm.com>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>, mingo@...e.hu,
mingo@...hat.com, linux-kernel@...r.kernel.org, tglx@...utronix.de,
hpa@...or.com, x86@...nel.org, yinghai@...nel.org, lcm@...ibm.com
Subject: Re: [RESEND] [PATCH v2] [BUGFIX] x86/x86_64: fix CPU offlining
triggered "active" device IRQ interruption
On Thu, Jun 04, 2009 at 01:04:37PM -0700, Gary Hade wrote:
> On Wed, Jun 03, 2009 at 02:13:23PM -0700, Eric W. Biederman wrote:
> > Gary Hade <garyhade@...ibm.com> writes:
> >
> > > Correct, after the fix was applied my testing did _not_ show
> > > the lockups that you are referring to. I wonder if there is a
> > > chance that the root cause of those old failures and the root
> > > cause of the issue that my fix addresses are the same?
> > >
> > > Can you provide the test case that demonstrated the old failure
> > > cases so I can try it on our systems? Also, do you recall what
> > > mainline version demonstrated the old failure cases?
> >
> > The irq migration had already been moved to interrupt context by the
> > time I started working on it.  And I managed to verify that there were
> > indeed problems with moving it out of interrupt context before my code
> > was merged.
> >
> > So if you want to reproduce it, reduce your irq migration to the essentials.
> > Set IRQ_MOVE_PCNTXT, and always migrate the irqs from process context
> > immediately.
> >
> > Then rapidly migrate an irq that fires at a high rate from one cpu to
> > another.
> >
> > Right now you are insulated from most of the failures because you still
> > don't have IRQ_MOVE_PCNTXT. So you are only really testing your new code
> > in the cpu hotunplug path.
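
For reference, here is a minimal userspace sketch of the rapid-migration
step described above.  It is only an illustration, not code from this
thread: the IRQ number and the CPU0/CPU1 masks are placeholders that
would have to match a high-rate interrupt on the actual test machine,
and the kernel side would still need IRQ_MOVE_PCNTXT forced on so the
affinity change is applied immediately from process context.

/*
 * Illustrative sketch only: bounce a high-rate IRQ between CPU0 and
 * CPU1 by rewriting /proc/irq/<N>/smp_affinity as fast as possible.
 * The default IRQ number is a placeholder; run as root on the test box.
 */
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	const char *irq = (argc > 1) ? argv[1] : "30";	/* placeholder IRQ */
	const char *masks[2] = { "1", "2" };		/* CPU0, CPU1 bitmasks */
	char path[64];
	unsigned long i;

	snprintf(path, sizeof(path), "/proc/irq/%s/smp_affinity", irq);
	for (i = 0; ; i++) {
		FILE *f = fopen(path, "w");

		if (!f) {
			perror(path);
			return 1;
		}
		fputs(masks[i & 1], f);		/* alternate CPU0 <-> CPU1 */
		fclose(f);
		usleep(1000);			/* ~1000 migrations per second */
	}
	return 0;
}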
>
> OK, I'm confused.
>
> It sounds like you want me to force IRQ_MOVE_PCNTXT so that I can
> test in a configuration that you say is already broken. Why
> in the heck would this config, where you expect lockups without
> the fix, be a productive environment in which to test the fix?
Sorry, I did not say this well. Trying again:
Why would this config, where you already expect lockups
for reasons that you say are not addressed by the fix, be
a productive environment in which to test the fix?
Gary
--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@...ibm.com
http://www.ibm.com/linux/ltc