Message-Id: <1247068556.9777.58.camel@twins>
Date:	Wed, 08 Jul 2009 17:55:56 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Lucas De Marchi <lucas.de.marchi@...il.com>
Cc:	Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org
Subject: Re: possible migration bug with hotplug cpu

On Wed, 2009-07-08 at 17:48 +0200, Lucas De Marchi wrote:
> I was doing some analysis of the number of migrations in my application and
> I think there's a bug in this accounting or, even worse, in the migration
> mechanism when used together with cpu hotplug.
> 
> I turned off all CPUs except one using the hotplug mechanism, after which I
> launched my application, which has 8 threads. Before they finish they print the
> file /proc/<tid>/sched. I have only 1 online CPU and there are ~ 200
> migrations per thread. The function set_task_cpu is responsible for updating
> the migrations counter and is called by 9 other functions. With some tests I
> discovered that 95% of these migrations come from try_to_wake_up and the other
> 5% from pull_task and __migrate_task.
> 
> Looking at try_to_wake_up:
> 
> ....
> 	cpu = task_cpu(p);
> 	orig_cpu = cpu;
> 	this_cpu = smp_processor_id();
> 
> #ifdef CONFIG_SMP
> 	if (unlikely(task_running(rq, p)))
> 		goto out_activate;
> 
> 	cpu = p->sched_class->select_task_rq(p, sync);  //<<<<===
> 	if (cpu != orig_cpu) {                          //<<<<===
> 		set_task_cpu(p, cpu);
> ....
> 
> p->sched_class->select_task_rq(p, sync) is returning a cpu different from
> task_cpu(p) even if I have only 1 online CPU. In my tests this behavior is
> similar for rt and normal tasks. For RT, the only possible problem could be in
> find_lowest_rq, but I'm still trying to find out why. Since you have more
> experience with this code, I'd appreciate it if you could take a look.
> 
> Is there any obscure reason why this behavior could be right?

If the task last ran on a now unplugged cpu this would be correct. Is
this indeed what happens?
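
One way to check this from userspace is to sample the migration counter
more than once. A single move away from an unplugged cpu should
presumably bump the counter once, not ~200 times. The following is a
minimal sketch, assuming the kernel exposes a "se.nr_migrations" line in
/proc/<tid>/sched (this needs scheduler debug support). The helper names
parse_nr_migrations and read_nr_migrations are hypothetical and are not
part of any kernel or libc interface.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: extract the value of the "se.nr_migrations" line
 * from the text of /proc/<tid>/sched. Longer keys that merely start with
 * the same prefix (e.g. "se.nr_migrations_cold") are skipped.
 * Returns -1 if the field is not present.
 */
long parse_nr_migrations(const char *text)
{
    const char *key = "se.nr_migrations";
    size_t klen = strlen(key);
    const char *p = text;

    while ((p = strstr(p, key)) != NULL) {
        const char *after = p + klen;

        /* Accept only an exact key: next char must be blank or ':' */
        if (*after == ' ' || *after == '\t' || *after == ':') {
            const char *colon = strchr(after, ':');

            if (!colon)
                return -1;
            /* strtol skips the padding spaces after the colon */
            return strtol(colon + 1, NULL, 10);
        }
        p = after;
    }
    return -1;
}

/*
 * Hypothetical helper: read a sched file such as /proc/self/sched or
 * /proc/<pid>/task/<tid>/sched and return its migration count,
 * or -1 on any error.
 */
long read_nr_migrations(const char *path)
{
    char buf[8192];
    size_t n;
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return parse_nr_migrations(buf);
}
```

Calling read_nr_migrations() on the same thread's sched file right after
the thread starts and again just before it exits gives the number of
migrations accounted in between. If that number keeps growing while only
one cpu is online, the unplugged-cpu explanation alone would not seem to
cover it.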