lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200803072236.26256.rjw@sisk.pl>
Date:	Fri, 7 Mar 2008 22:36:25 +0100
From:	"Rafael J. Wysocki" <rjw@...k.pl>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	"Dmitry Adamushko" <dmitry.adamushko@...il.com>, ego@...ibm.com,
	mingo@...e.hu, oleg@...sign.ru, yi.y.yang@...el.com,
	linux-kernel@...r.kernel.org, tglx@...utronix.de
Subject: Re: [BUG 2.6.25-rc3] scheduler/hotplug: some processes are dealocked when cpu is set to offline

On Friday, 7 of March 2008, Andrew Morton wrote:
> On Fri, 7 Mar 2008 14:02:20 +0100
> "Dmitry Adamushko" <dmitry.adamushko@...il.com> wrote:
> 
> > Hi,
> > 
> > 'watchdog' is of SCHED_FIFO class. The standard load-balancer doesn't
> > move RT tasks between cpus anymore and there is a special mechanism in
> > scher_rt.c instead (I think, it's .25 material).
> > 
> > So I wonder, whether __migrate_task() is still capable of properly
> > moving a RT task to another CPU (e.g. for the case when it's in
> > TASK_RUNNING state) without breaking something in the rt migration
> > mechanism (or whatever else) that would leave us with a runqueue in
> > the 'inconsistent' state...
> > (I've taken a quick look at the relevant code so can't confirm it yet)
> > 
> > maybe it'd be faster if somebody could do a quick test now with the
> > following line commented out in kernel/softlockup.c :: watchdog()
> > 
> > -         sched_setscheduler(current, SCHED_FIFO, &param);
> > 
> 
> Yup, thanks.  This:
> 
> 
>  kernel/softirq.c      |    2 +-
>  kernel/softlockup.c   |    2 +-
>  kernel/stop_machine.c |    2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff -puN kernel/softlockup.c~a kernel/softlockup.c
> --- a/kernel/softlockup.c~a
> +++ a/kernel/softlockup.c
> @@ -211,7 +211,7 @@ static int watchdog(void *__bind_cpu)
>  	struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
>  	int this_cpu = (long)__bind_cpu;
>  
> -	sched_setscheduler(current, SCHED_FIFO, &param);
> +//	sched_setscheduler(current, SCHED_FIFO, &param);
>  
>  	/* initialize timestamp */
>  	touch_softlockup_watchdog();
> diff -puN kernel/stop_machine.c~a kernel/stop_machine.c
> --- a/kernel/stop_machine.c~a
> +++ a/kernel/stop_machine.c
> @@ -188,7 +188,7 @@ struct task_struct *__stop_machine_run(i
>  		struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
>  
>  		/* One high-prio thread per cpu.  We'll do this one. */
> -		sched_setscheduler(p, SCHED_FIFO, &param);
> +//		sched_setscheduler(p, SCHED_FIFO, &param);
>  		kthread_bind(p, cpu);
>  		wake_up_process(p);
>  		wait_for_completion(&smdata.done);
> diff -puN kernel/softirq.c~a kernel/softirq.c
> --- a/kernel/softirq.c~a
> +++ a/kernel/softirq.c
> @@ -622,7 +622,7 @@ static int __cpuinit cpu_callback(struct
>  
>  		p = per_cpu(ksoftirqd, hotcpu);
>  		per_cpu(ksoftirqd, hotcpu) = NULL;
> -		sched_setscheduler(p, SCHED_FIFO, &param);
> +//		sched_setscheduler(p, SCHED_FIFO, &param);
>  		kthread_stop(p);
>  		takeover_tasklets(hotcpu);
>  		break;
> _
> 
> fixes the wont-power-off regression.
> 
> But 2.6.24 runs the watchdog threads SCHED_FIFO too.  Are you saying that
> it's the migration code which changed? 

Well, that would be a problem for suspend/hibernation if there were SCHED_FIFO
non-freezable tasks bound to the nonboot CPUs.  I'm not aware of any, but ...

Also, it should be possible to just offline one or more CPUs using the sysfs
interface at any time and what happens if there are any RT tasks bound to these
CPUs at that time?  That would be the $subject problem, IMHO.

Anyone who made the change affecting __migrate_task() so it's unable to migrate
RT tasks any more should have fixed the CPU hotplug as well.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ