linux-kernel - Re: WARNING: at kernel/sched.c:4440 sub_preempt

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <200901152153.42159.nickpiggin@yahoo.com.au>
Date:	Thu, 15 Jan 2009 21:53:41 +1100
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Michal Hocko <mhocko@...e.cz>, LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: WARNING: at kernel/sched.c:4440 sub_preempt_count+0x81/0x95()

On Thursday 15 January 2009 21:00:13 Ingo Molnar wrote:
> * Nick Piggin <nickpiggin@...oo.com.au> wrote:
> > On Tuesday 13 January 2009 23:34:25 Ingo Molnar wrote:
> > > * Michal Hocko <mhocko@...e.cz> wrote:
> > > > Hi,
> > > > I am not sure whether someone has already reported this, but
> > > > I can see the following early boot WARNING with the 2.6.29-rc1 kernel
> > > > in the serial console output:
> > > >
> > > > ------------[ cut here ]------------
> > > > WARNING: at kernel/sched.c:4440 sub_preempt_count+0x81/0x95()
> > >
> > > That one should be fixed in tip/master and it is in the to-Linus queue
> > > of fixes:
> > >
> > >   http://people.redhat.com/mingo/tip.git/README
> > >
> > > it's this commit:
> > >
> > >   01e3eb8: Revert "sched: improve preempt debugging"
> > >
> > > if you want to cherry-pick the fix.
> >
> > OK, but I still don't think this is the actual problem, but there is
> > something amiss in the init code being exposed by it.
>
> the warnings triggered after a softirq, and there's already preempt-leak
> checks in the softirq code - so we can exclude that.
>
> a hardirq might have leaked a preempt count - but that would have quite
> bad effects [with quick atomic check asserts in schedule()], wouldnt it?
> So i tend to think that this is a false positive.
>
> One problem i can think of (and which i outlined in the revert commit log)
> is that if a hardirq hits this window in lock_kernel():
>
> void __lockfunc lock_kernel(void)
> {
>         int depth = current->lock_depth+1;
>                                            <-------------- HERE

current->lock_depth is not yet modified at this point.

>         if (likely(!depth))
>                 __lock_kernel();

And here we increment preempt_count when taking the BKL, but we
has not yet modified current->lock_depth, so preempt debugging
will proceed with no impact of my patch at this point, regardless
of what interrupts are taken.

So there is a race window, but it is to err on the safe side and
not trigger a false report.

>         current->lock_depth = depth;
> }

At this point, the BKL addition to preempt_count will be taken into
account in underflow checking of subsequent preempt counters.


> then we have kernel_locked() already true (it checks lock_depth), but the
> preempt count is not elevated yet via __lock_kernel(). So if we return
> from the hardirq [and run into softirqs that end with a preempt_enable() -
> a pure hardirq exit has no preempt debug check] we'll incorrectly think
> that there's a preempt leak going on.

I don't think so. And anyway BKL is taken very early in boot and not
released for a long time, so I don't think it is possible for these
kinds of races to trigger here anyway.

So I really think it still is a bug or some other misunderstanding we
have. I'd prefer to try to fix it or at least discover why it is happening.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/