linux-kernel - Re: [Linux 2.6.29-rc2] BUG: using smp_processor

Message-Id: <200901302230.15418.rjw@sisk.pl>
Date:	Fri, 30 Jan 2009 22:30:14 +0100
From:	"Rafael J. Wysocki" <rjw@...k.pl>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Frédéric Weisbecker <fweisbec@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Maciej Rutecki <maciej.rutecki@...il.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible

On Friday 30 January 2009, Ingo Molnar wrote:
> 
> * Rafael J. Wysocki <rjw@...k.pl> wrote:
> 
> > On Thursday 29 January 2009, Ingo Molnar wrote:
> > > 
> > > * Rafael J. Wysocki <rjw@...k.pl> wrote:
> > > 
> > > > On Tuesday 27 January 2009, Ingo Molnar wrote:
> > > > > 
> > > > > * Rafael J. Wysocki <rjw@...k.pl> wrote:
> > > > > 
> > > > > > > In fact whatever check you put in it's _always_ going to be 
> > > > > > > fundamentally more fragile than direct instrumentation: you cannot 
> > > > > > > possibly check all possible places that enable interrupts. (they could 
> > > > > > > be disabling interrupts as a _restore_irqs() sequence for example)
> > > > > > 
> > > > > > In this particular case, I'm not really interested in that.  What I'm 
> > > > > > interested in is which driver's ->suspend_late() or ->resume_early() (or 
> > > > > > the equivalents for sysdevs) has enabled interrupts, which is quite easy 
> > > > > > to check directly.
> > > > > 
> > > > > But this is exactly what it does - without any need for debug checks 
> > > > > spread around!
> > > > > 
> > > > > You'll get a _full stack dump_ from the very driver that is enabling 
> > > > > > interrupts! You don't get a trace - you get a stack dump of the very place 
> > > > > that is buggy. It does not get any better than that.
> > > > 
> > > > I'm not going to argue.
> > > > 
> > > > Nevertheless, IMO something like the patch below should be sufficient to catch
> > > > these bugs.
> > > > 
> > > > Thanks,
> > > > Rafael
> > > > 
> > > > 
> > > > ---
> > > >  drivers/base/power/main.c |   12 ++++++++++++
> > > >  drivers/base/sys.c        |   21 ++++++++++++++++-----
> > > >  include/linux/pm.h        |   18 ++++++++++++++++++
> > > >  3 files changed, 46 insertions(+), 5 deletions(-)
> > > 
> > > hm, so now you sprinkle debug checks all around the code, instead of 
> > > putting in a single pair of:
> > > 
> > >     force_irqs_off_start();
> > >     ...
> > >     force_irqs_off_end();
> > 
> > And what debug options exactly would that require to be set to work?
> 
> hm, if you worry about that aspect: we could make it seamlessly enabled if 
> PM_DEBUG is enabled.

That would be useful, but OTOH I'd rather not have PM_DEBUG select multiple
tracing options.  Perhaps it's better to add PM_CHECK_IRQS or something similar
and make that depend on PM_DEBUG and whatever else is necessary.
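
To illustrate the idea of a separately gated check, here is a minimal userspace
sketch.  CONFIG_PM_CHECK_IRQS is only the option name suggested above, and
sim_irqs_enabled, pm_irq_warnings and pm_check_irqs() are hypothetical names
made up for this example; none of them is existing kernel code.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical option from the discussion; remove to compile the check out. */
#define CONFIG_PM_CHECK_IRQS 1

static bool sim_irqs_enabled;	/* userspace stand-in for the CPU IRQ state */
static int pm_irq_warnings;	/* number of warnings issued so far */

#ifdef CONFIG_PM_CHECK_IRQS
/* Warn if interrupts are found enabled after the named callback returned. */
static void pm_check_irqs(const char *where)
{
	if (sim_irqs_enabled) {
		pm_irq_warnings++;
		fprintf(stderr, "PM: interrupts enabled after %s\n", where);
	}
}
#else
/* With the option unset the check costs nothing. */
static inline void pm_check_irqs(const char *where) { (void)where; }
#endif
```

A caller would invoke pm_check_irqs("->suspend_late()") right after the
callback returns, so the check depends on this one option only and on no
tracing infrastructure.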

> > > which would catch everything that your checks would catch - and it 
> > > would catch more.
> > 
> > Except that the checks trigger in specific places, so if a check 
> > triggers you know precisely where the bug happened regardless of what 
> > garbage is in the call trace.
> 
> This argument is 100% mystery to me. Do you really not see the quality 
> difference between a stack trace generated _right at the buggy piece of 
> code_ and a warning later on that might (or might not) trigger?
> 
> Especially considering that your approach won't catch such bugs:
> 
>    ...
>    spin_unlock_irq();
>    ...
>    spin_lock_irq();
>    ...
> 
> Or such bugs:
> 
>    local_irq_enable();
>    ...
>    local_irq_disable();
> 
> Or such bugs:
> 
>    spin_lock_irqsave(&lock1, flags);
>    ...
>            spin_lock_irqsave(&lock2, flags);
>            ...
>            spin_unlock_irq(&lock2);          /* accidental bug */
>    ...
>    spin_unlock_irqrestore(&lock1, flags);

I didn't think about that.

I see value in having this kind of thing trigger a warning, but I also see
value in having some checks in the code, independent of any extra debug
options.
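
The difference between the two approaches can be shown with a small userspace
model.  This is only a sketch under stated assumptions: the interrupt state is
a plain flag, and sim_irq_enable(), sim_forced_off, hook_hits,
buggy_suspend_late() and run_callback() are hypothetical names invented for the
example, not kernel interfaces.

```c
#include <stdbool.h>

static bool sim_irqs_on;	/* simulated CPU interrupt-enable flag */
static bool sim_forced_off;	/* true inside the region that must keep IRQs off */
static int hook_hits;		/* violations seen at the enabling site itself */

/* Instrumented enable: notices the violation at the moment it happens. */
static void sim_irq_enable(void)
{
	if (sim_forced_off)
		hook_hits++;	/* a real kernel would dump the stack here */
	sim_irqs_on = true;
}

static void sim_irq_disable(void)
{
	sim_irqs_on = false;
}

/* Hypothetical buggy callback: enables IRQs briefly, then disables them. */
static int buggy_suspend_late(void)
{
	sim_irq_enable();
	sim_irq_disable();
	return 0;
}

/* Run a callback under both schemes; return 1 if the exit check fired. */
static int run_callback(int (*cb)(void))
{
	sim_irqs_on = false;
	sim_forced_off = true;
	cb();
	sim_forced_off = false;
	return sim_irqs_on ? 1 : 0;	/* check made only after the callback */
}
```

In this model run_callback(buggy_suspend_late) returns 0, because the flag is
off again by the time the exit check looks at it, while hook_hits goes up by
one.  The exit check still needs no instrumentation of the enable path, which
is why it works without any extra debug options.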

Thanks,
Rafael