linux-kernel - Re: 2.6.39-rc4+: Kernel leaking memory during FS scanning, regression?


Message-ID: <BANLkTi=Ad2DUQ2Lr-Q5Y+eYxKMyz04fL2g@mail.gmail.com>
Date:	Wed, 27 Apr 2011 16:28:09 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Bruno Prémont <bonbons@...ux-vserver.org>,
	Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Mike Frysinger <vapier.adi@...il.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	linux-fsdevel@...r.kernel.org,
	"Paul E. McKenney" <paul.mckenney@...aro.org>,
	Pekka Enberg <penberg@...nel.org>
Subject: Re: 2.6.39-rc4+: Kernel leaking memory during FS scanning, regression?

On Wed, Apr 27, 2011 at 3:32 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
>
> Well that's going to paper over the problem at hand possibly. I really
> don't see why that thing would run for more than 950ms in a row even
> if there is a large number of callbacks pending.

Stop with this bogosity already, guys.

We _know_ it didn't run continuously for 950ms. That number is totally
made up. There's not enough work for it to run that long, but more
importantly, the thread has zero CPU time. There is _zero_ reason to
believe that it runs for long periods.

There is some scheduler bug: probably rt_time hasn't been
initialized at all, or the runtime we compare against is zero, or the
calculations are just wrong.

The 950ms didn't happen. Stop harping on it. It almost certainly
simply doesn't exist.

Since that

       if (rt_rq->rt_time > runtime) {
               rt_rq->rt_throttled = 1;
+               printk_once(KERN_WARNING "sched: RT throttling activated\n");

test triggers, we know that either 'runtime' or 'rt_time' is just
bogus. Make the printk print out the values, and maybe that gives some
hints.
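
As a rough illustration of that suggestion, here is a userspace sketch (not
kernel code) of the same comparison that reports both values the first time it
fires. The struct rt_rq_model, the function rt_throttle_check, the 'warned'
flag standing in for printk_once(), and the message format are all made up for
this example; only the 'rt_time > runtime' test and the rt_throttled
assignment come from the quoted patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the rt_rq fields the check touches. */
struct rt_rq_model {
	uint64_t rt_time;	/* accumulated RT runtime, assumed to be in ns */
	int rt_throttled;	/* set once the check fires */
	int warned;		/* emulates printk_once(): report only the first time */
};

/*
 * Model of the quoted test. Returns 1 if this call finds the queue over
 * its budget, 0 otherwise. On the first trigger, both compared values are
 * formatted into msg so a bogus rt_time or runtime becomes visible.
 * On every other call msg is left as an empty string.
 */
static int rt_throttle_check(struct rt_rq_model *rt_rq, uint64_t runtime,
			     char *msg, size_t len)
{
	assert(msg != NULL && len > 0);
	msg[0] = '\0';

	if (rt_rq->rt_time > runtime) {
		rt_rq->rt_throttled = 1;
		if (!rt_rq->warned) {
			rt_rq->warned = 1;
			snprintf(msg, len,
				 "sched: RT throttling activated: rt_time=%" PRIu64
				 " runtime=%" PRIu64 "\n",
				 rt_rq->rt_time, runtime);
		}
		return 1;
	}
	return 0;
}
```

With this model, calling rt_throttle_check() with a runtime of 0 throttles on
any nonzero rt_time, which is one way the test could trigger without the
thread ever having run for long.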

But in the meantime, I'd suggest looking for the places that
initialize or calculate those values, and just assume that some of
them are buggy.

> And then I don't have an explanation for the hosed CPU accounting and
> why that thing does not get another 950ms RT time when the 50ms
> throttling break is over.

Again, don't even bother talking about "another 950ms". It didn't
happen in the first place, there's no "another" there either.

                      Linus