Message-ID: <BANLkTin3UG=xF1VQOtdEDOnShoMQwQ7gFg@mail.gmail.com>
Date:	Tue, 26 Apr 2011 10:12:39 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Bruno Prémont <bonbons@...ux-vserver.org>,
	Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	paulmck@...ux.vnet.ibm.com, Mike Frysinger <vapier.adi@...il.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	linux-fsdevel@...r.kernel.org,
	"Paul E. McKenney" <paul.mckenney@...aro.org>,
	Pekka Enberg <penberg@...nel.org>
Subject: Re: 2.6.39-rc4+: Kernel leaking memory during FS scanning, regression?

On Tue, Apr 26, 2011 at 9:38 AM, Bruno Prémont
<bonbons@...ux-vserver.org> wrote:
>
> Here it comes:
>
> rcu_kthread (when build processes are STOPped):
> [  836.050003] rcu_kthread     R running   7324     6      2 0x00000000
> [  836.050003]  dd473f28 00000046 5a000240 dd65207c dd407360 dd651d40 0000035c dd473ed8
> [  836.050003]  c10bf8a2 c14d63d8 dd65207c dd473f28 dd445040 dd445040 dd473eec c10be848
> [  836.050003]  dd651d40 dd407360 ddfdca00 dd473f14 c10bfde2 00000000 00000001 000007b6
> [  836.050003] Call Trace:
> [  836.050003]  [<c10bf8a2>] ? check_object+0x92/0x210
> [  836.050003]  [<c10be848>] ? init_object+0x38/0x70
> [  836.050003]  [<c10bfde2>] ? free_debug_processing+0x112/0x1f0
> [  836.050003]  [<c103d9fd>] ? lock_timer_base+0x2d/0x70
> [  836.050003]  [<c13c8ec7>] schedule_timeout+0x137/0x280

Hmm.

I'm adding Ingo and Peter to the cc, because this whole "rcu_kthread
is running, but never actually running" is starting to smell like a
scheduler issue.

Peter/Ingo: RCUTINY seems to be broken for Bruno. During any kind of
heavy workload, at some point rcu_kthread simply stops making any
progress. It's constantly in the runnable state, but it doesn't
actually use any CPU time and it's not processing the RCU callbacks.
So the RCU memory freeing isn't happening, and slabs just build up
until the machine dies.

And it really is RCUTINY, because the thing doesn't happen with the
regular tree-RCU.

This is without CONFIG_RCU_BOOST_PRIO, so we basically have

        struct sched_param sp;

        rcu_kthread_task = kthread_run(rcu_kthread, NULL, "rcu_kthread");
        sp.sched_priority = RCU_BOOST_PRIO;
        sched_setscheduler_nocheck(rcu_kthread_task, SCHED_FIFO, &sp);

where RCU_BOOST_PRIO is 1 for the non-boost case.

Is that so low that even the idle thread will take priority? It's a UP
config with PREEMPT_VOLUNTARY. So pretty much _all_ the stars are
aligned for odd scheduling behavior.

Other users of SCHED_FIFO tend to set the priority really high (e.g.
"MAX_RT_PRIO-1" is clearly the default one - softirq's, watchdog), but
"1" is not unheard of either (touchscreen/ucb1400_ts and
mmc/core/sdio_irq), and there are some other random choices out there.
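
For reference, here is a minimal userspace sketch of where "1" sits in
the SCHED_FIFO range. It is only an illustration, not the kernel code
above: the helper name is_lowest_fifo_prio() is hypothetical, and it
assumes the userspace view of the priority scale (as reported by
sched_get_priority_min/max) matches what the kthread is given. On Linux
that range is 1..99, so "1" is the lowest real-time priority, but by
the documented policy any SCHED_FIFO task should still run ahead of
normal (non-realtime) tasks and the idle task.

```c
#include <sched.h>

/*
 * Hypothetical helper, not taken from the kernel tree.
 * Returns 1 if `prio` is the lowest valid SCHED_FIFO priority on this
 * system, 0 if it is not, and -1 if the range could not be queried.
 */
int is_lowest_fifo_prio(int prio)
{
        int min = sched_get_priority_min(SCHED_FIFO);

        if (min < 0)
                return -1;      /* query failed; errno is set */

        return prio == min;
}

/*
 * Number of distinct SCHED_FIFO priority levels the system reports,
 * or -1 if the range could not be queried.
 */
int fifo_prio_levels(void)
{
        int min = sched_get_priority_min(SCHED_FIFO);
        int max = sched_get_priority_max(SCHED_FIFO);

        if (min < 0 || max < 0)
                return -1;

        return max - min + 1;
}
```

Calling is_lowest_fifo_prio(1) on a Linux box should report that 1 is
the bottom of the real-time range, which is a different thing from
being below the idle thread.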

Any ideas?

                             Linus