Message-ID: <20080629065556.GA20398@elte.hu>
Date:	Sun, 29 Jun 2008 08:55:56 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Dmitry Adamushko <dmitry.adamushko@...il.com>
Cc:	Heiko Carstens <heiko.carstens@...ibm.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Avi Kivity <avi@...ranet.com>, linux-kernel@...r.kernel.org
Subject: Re: [BUG] CFS vs cpu hotplug


* Dmitry Adamushko <dmitry.adamushko@...il.com> wrote:

> Hello,
> 
> it seems to be related to migrate_dead_tasks().
> 
> Firstly I added traces to see all tasks being migrated with 
> migrate_live_tasks() and migrate_dead_tasks(). On my setup the problem 
> pops up (the one with "se == NULL" in the loop of 
> pick_next_task_fair()) shortly after the traces indicate that some task 
> has been migrated with migrate_dead_tasks(). btw., I can reproduce it 
> much faster now with just a plain cpu down/up loop.
> 
> [disclaimer] Well, unless I'm really missing something important at 
> this late hour [/disclaimer] pick_next_task() is not something 
> appropriate for migrate_dead_tasks() :-)
> 
> the following change seems to eliminate the problem on my setup 
> (although, I kept it running only for a few minutes to get a few 
> messages indicating migrate_dead_tasks() does move tasks and the 
> system is still ok)
> 
> [ quick hack ]
> 
> @@ -5887,6 +5907,7 @@ static void migrate_dead_tasks(unsigned int dead_cpu)
>                 next = pick_next_task(rq, rq->curr);
>                 if (!next)
>                         break;
> +               next->sched_class->put_prev_task(rq, next);
>                 migrate_dead(dead_cpu, next);
> 

thanks Dmitry - i've applied this chunk to tip/master and 
tip/sched/urgent, for more testing.
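
One way to see why the missing put_prev_task() call could end in "se == NULL" is a toy two-level model (a group entity on the top queue, tasks on the group's queue). This is only a sketch under assumptions: every toy_* name is hypothetical, the arrays stand in for the real rbtrees, and it is not the kernel's code. It assumes that picking an entity takes it out of the tree and marks it current, and that put_prev puts it back.

```c
#include <assert.h>

/* Toy model only: hypothetical names, NOT kernel code. */
#define TOY_MAX 4

struct toy_rq {
	int nr_running;        /* entities accounted to this queue */
	int in_tree[TOY_MAX];  /* 1 if entity i sits in the "tree" */
	int curr;              /* entity marked current, or -1 */
};

static void toy_init(struct toy_rq *q, int n)
{
	q->nr_running = n;
	q->curr = -1;
	for (int i = 0; i < TOY_MAX; i++)
		q->in_tree[i] = i < n;
}

/* Assumed pick behaviour: remove from the tree, mark as current. */
static int toy_pick(struct toy_rq *q)
{
	for (int i = 0; i < TOY_MAX; i++) {
		if (q->in_tree[i]) {
			q->in_tree[i] = 0;
			q->curr = i;
			return i;
		}
	}
	return -1; /* tree is empty: the "se == NULL" analogue */
}

/* Assumed put_prev behaviour: current entity goes back into the tree. */
static void toy_put_prev(struct toy_rq *q)
{
	if (q->curr >= 0) {
		q->in_tree[q->curr] = 1;
		q->curr = -1;
	}
}

/* Dequeue: a current entity is not in the tree, so only drop the mark. */
static void toy_dequeue(struct toy_rq *q, int i)
{
	assert(q->nr_running > 0);
	if (q->curr == i)
		q->curr = -1;
	else
		q->in_tree[i] = 0;
	q->nr_running--;
}

/*
 * Drain a group holding two tasks, the way a migrate loop would.
 * Returns the number of tasks moved, or -1 if a queue claims to have
 * runnable entities while its tree is empty.
 */
static int toy_simulate(int use_put_prev)
{
	struct toy_rq top, grp;
	int moved = 0;

	toy_init(&top, 1); /* one group entity on the top queue */
	toy_init(&grp, 2); /* two tasks inside the group */

	while (top.nr_running) {
		int g = toy_pick(&top);
		if (g < 0)
			return -1;
		int t = toy_pick(&grp);
		if (t < 0)
			return -1;
		if (use_put_prev) {
			toy_put_prev(&grp);
			toy_put_prev(&top);
		}
		toy_dequeue(&grp, t);
		if (!grp.nr_running)
			toy_dequeue(&top, g);
		moved++;
	}
	return moved;
}
```

In this model toy_simulate(0) fails on the second pass: the group entity is still accounted on the top queue but was never put back into its tree. toy_simulate(1) drains both tasks cleanly.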

if this turns out to be the final and full fix today, would you mind 
submitting the rest of your checks as well? It seems like a rather 
sensible set of sanity checks. They could go under CONFIG_SCHED_DEBUG 
or a new (default-off) config option.

it would also be _very_ nice to have a built-in cpu hotplug tester in 
the kernel, a la CONFIG_RCU_TORTURE_TEST=y. There's already sample code 
in kernel/tracing/ of how to initiate hotplug events from within the 
kernel.
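
Until such a built-in tester exists, the plain cpu down/up loop from the quoted report can be approximated from userspace. The following is a minimal sketch, not the in-kernel tester suggested above: it assumes the sysfs control file <root>/cpuN/online (with root normally /sys/devices/system/cpu), and the function names cpu_set_online() and cpu_hotplug_loop() are made up for illustration.

```c
#include <stdio.h>

/*
 * Write "0" or "1" to <root>/cpu<N>/online.
 * Returns 0 on success, -1 on any error (bad path, no permission, ...).
 */
static int cpu_set_online(const char *root, int cpu, int online)
{
	char path[256];
	FILE *f;
	int n = snprintf(path, sizeof(path), "%s/cpu%d/online", root, cpu);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(online ? "1" : "0", f) == EOF) {
		fclose(f);
		return -1;
	}
	/* The write may only be flushed (and rejected) at fclose(). */
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Take the cpu down and bring it back up 'iterations' times.
 * Returns the number of completed iterations, or -1 on the first error.
 */
static int cpu_hotplug_loop(const char *root, int cpu, int iterations)
{
	for (int i = 0; i < iterations; i++) {
		if (cpu_set_online(root, cpu, 0) != 0)
			return -1;
		if (cpu_set_online(root, cpu, 1) != 0)
			return -1;
	}
	return iterations;
}
```

Run as root, a call such as cpu_hotplug_loop("/sys/devices/system/cpu", 1, 1000) would cycle cpu1 a thousand times; the root directory is a parameter so the loop can be pointed at a scratch directory while experimenting.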

	Ingo