Message-Id: <200901302230.15418.rjw@sisk.pl>
Date: Fri, 30 Jan 2009 22:30:14 +0100
From: "Rafael J. Wysocki" <rjw@...k.pl>
To: Ingo Molnar <mingo@...e.hu>
Cc: Frédéric Weisbecker <fweisbec@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Maciej Rutecki <maciej.rutecki@...il.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible

On Friday 30 January 2009, Ingo Molnar wrote:
>
> * Rafael J. Wysocki <rjw@...k.pl> wrote:
>
> > On Thursday 29 January 2009, Ingo Molnar wrote:
> > >
> > > * Rafael J. Wysocki <rjw@...k.pl> wrote:
> > >
> > > > On Tuesday 27 January 2009, Ingo Molnar wrote:
> > > > >
> > > > > * Rafael J. Wysocki <rjw@...k.pl> wrote:
> > > > >
> > > > > > > In fact whatever check you put in it's _always_ going to be
> > > > > > > fundamentally more fragile than direct instrumentation: you cannot
> > > > > > > possibly check all possible places that enable interrupts. (they could
> > > > > > > be disabling interrupts as a _restore_irqs() sequence for example)
> > > > > >
> > > > > > In this particular case, I'm not really interested in that. What I'm
> > > > > > interested in is which driver's ->suspend_late() or ->resume_early() (or
> > > > > > the equivalents for sysdevs) has enabled interrupts, which is quite easy
> > > > > > to check directly.
> > > > >
> > > > > But this is exactly what it does - without any need for debug checks
> > > > > spread around!
> > > > >
> > > > > You'll get a _full stack dump_ from the very driver that is enabling
> > > > > > interrupts! You don't get a trace - you get a stack dump of the very place
> > > > > that is buggy. It does not get any better than that.
> > > >
> > > > I'm not going to argue.
> > > >
> > > > Nevertheless, IMO something like the patch below should be sufficient to catch
> > > > these bugs.
> > > >
> > > > Thanks,
> > > > Rafael
> > > >
> > > >
> > > > ---
> > > > drivers/base/power/main.c | 12 ++++++++++++
> > > > drivers/base/sys.c | 21 ++++++++++++++++-----
> > > > include/linux/pm.h | 18 ++++++++++++++++++
> > > > 3 files changed, 46 insertions(+), 5 deletions(-)
> > >
> > > hm, so now you sprinkle debug checks all around the code, instead of
> > > putting in a single pair of:
> > >
> > > force_irqs_off_start();
> > > ...
> > > force_irqs_off_end();
> >
> > And what debug options exactly would that require to be set to work?
>
> hm, if you worry about that aspect: we could make it seamlessly enabled if
> PM_DEBUG is enabled.

That would be useful, but OTOH I'd rather not have PM_DEBUG select multiple
tracing options. Perhaps it's better to add PM_CHECK_IRQS or something similar
and make that depend on PM_DEBUG and whatever else is necessary.

> > > which would catch everything that your checks would catch - and it
> > > would catch more.
> >
> > Except that the checks trigger in specific places, so if a check
> > triggers you know precisely where the bug happened regardless of what
> > garbage is in the call trace.
>
> This argument is 100% mystery to me. Do you really not see the quality
> difference between a stack trace generated _right at the buggy piece of
> code_ and a warning later on that might (or might not) trigger?
>
> Especially considering that your approach won't catch such bugs:
>
> ...
> spin_unlock_irq();
> ...
> spin_lock_irq();
> ...
>
> Or such bugs:
>
> local_irq_enable();
> ...
> local_irq_disable();
>
> Or such bugs:
>
> spin_lock_irqsave(&lock1, flags);
> ...
> spin_lock_irqsave(&lock2, flags);
> ...
> spin_unlock_irq(&lock2); /* accidental bug */
> ...
> spin_unlock_irqrestore(&lock1, flags);

I didn't think about that.

I see the value of having this kind of thing trigger a warning, but I also see
value in having some checks in the code, independent of any extra debug
options.

Thanks,
Rafael
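
The two approaches discussed in this thread can be compared with a small
userspace sketch. Nothing below is kernel code: every sim_* and drv_* name is
hypothetical, a plain boolean stands in for the CPU interrupt state, and the
"forced-off window" only approximates what a force_irqs_off_start() /
force_irqs_off_end() pair might do. It assumes a check placed after each
callback on one side, and a warning raised at the enable site on the other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-ins for IRQ state; these are NOT the kernel primitives. */
static bool sim_irqs_on;         /* false means "interrupts disabled" */
static bool sim_forced_off;      /* true while inside the forced-off window */
static int sim_enable_warnings;  /* warnings raised directly at enable sites */

/* Instrumented enable: warns at the call site if the window is active. */
static void sim_irq_enable(const char *caller)
{
    if (sim_forced_off) {
        sim_enable_warnings++;
        fprintf(stderr, "WARN: %s enabled IRQs in forced-off window\n", caller);
    }
    sim_irqs_on = true;
}

static void sim_irq_disable(void)
{
    sim_irqs_on = false;
}

typedef void (*sim_late_cb)(void);

/* Hypothetical driver that returns with interrupts left enabled. */
static void drv_leaves_enabled(void)
{
    sim_irq_enable("drv_leaves_enabled");
}

/* Hypothetical driver that enables and then disables again before returning. */
static void drv_enable_disable_pair(void)
{
    sim_irq_enable("drv_enable_disable_pair");
    sim_irq_disable();
}

/* Hypothetical well-behaved driver. */
static void drv_ok(void)
{
}

/* Check placed after the callback: true if it returned with IRQs on. */
static bool sim_check_after(sim_late_cb cb, const char *name)
{
    assert(cb != NULL);
    sim_irqs_on = false;
    cb();
    if (sim_irqs_on) {
        fprintf(stderr, "WARN: %s returned with IRQs enabled\n", name);
        sim_irqs_on = false;  /* restore the expected state */
        return true;
    }
    return false;
}

/* Forced-off window: returns how many enable-site warnings the callback hit. */
static int sim_run_forced_off(sim_late_cb cb)
{
    assert(cb != NULL);
    int before = sim_enable_warnings;

    sim_irqs_on = false;
    sim_forced_off = true;
    cb();
    sim_forced_off = false;
    sim_irqs_on = false;
    return sim_enable_warnings - before;
}
```

In this sketch, sim_check_after() names the offending callback when it returns
with interrupts on, but reports nothing for drv_enable_disable_pair(), while
sim_run_forced_off() flags that case at the enable call itself. Unlike the
post-callback check, the window only works when the enable path is
instrumented, which is the extra-debug-option dependency raised above.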