Date:	Mon, 26 Feb 2007 14:25:15 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Michal Piotrowski <michal.k.k.piotrowski@...il.com>
Cc:	tglx@...utronix.de,
	Michal Piotrowski <michal.k.k.piotrowski@...glemail.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: [patch] sched: fix SMT scheduler bug


* Michal Piotrowski <michal.k.k.piotrowski@...il.com> wrote:

> Ok, I think that this is a SMT scheduler problem.
> 
> System works fine for fifteen hours.

hmmm. I think it's this piece of smt-nice code in sched.c:

        if (dependent_sleeper(cpu, rq, next))
                next = rq->idle;

this leaves this HT CPU idle if the sibling CPU is running a task 
that is more important. And dependent_sleeper() doesn't delay kernel 
threads such as ksoftirqd:

        /* kernel/rt threads do not participate in dependent sleeping */
        if (!p->mm || rt_task(p))
                return 0;

but ... what if the highest-prio thread is a user-space task /and/ there 
is ksoftirqd also queued, which is now delayed? We schedule the idle 
task instead of processing softirqs. Ouch!
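
To make the scenario concrete, here is a rough user-space sketch of the check, not the real sched.c code. Everything prefixed toy_ is a hypothetical stand-in: the real dependent_sleeper() also looks at timeslices and per-sibling state, which is reduced here to a single priority comparison. The second picker uses the rq->nr_running == 1 condition from the patch later in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-ins for task_struct and the runqueue. */
struct toy_task {
	const char *name;
	bool has_mm;	/* false for kernel threads such as ksoftirqd */
	bool rt;	/* real-time task */
	int prio;	/* lower value = more important */
};

struct toy_rq {
	struct toy_task *idle;
	unsigned long nr_running;	/* runnable tasks queued on this CPU */
	int sibling_prio;		/* prio of the task on the HT sibling */
	bool sibling_busy;
};

/* Assumption: "more important sibling task" is modelled as a plain prio compare. */
static bool toy_dependent_sleeper(const struct toy_rq *rq,
				  const struct toy_task *p)
{
	/* kernel/rt threads do not participate in dependent sleeping */
	if (!p->has_mm || p->rt)
		return false;
	return rq->sibling_busy && rq->sibling_prio < p->prio;
}

/* Unpatched check: only 'next' is examined, other queued tasks are ignored. */
static struct toy_task *toy_pick_old(struct toy_rq *rq, struct toy_task *next)
{
	assert(rq->nr_running >= 1);
	if (toy_dependent_sleeper(rq, next))
		next = rq->idle;
	return next;
}

/* Patched check: go idle only if 'next' is the sole runnable task. */
static struct toy_task *toy_pick_patched(struct toy_rq *rq,
					 struct toy_task *next)
{
	assert(rq->nr_running >= 1);
	if (rq->nr_running == 1 && toy_dependent_sleeper(rq, next))
		next = rq->idle;
	return next;
}
```

With a user-space task as next, a more important task on the sibling and nr_running == 2 (say, ksoftirqd queued behind it), toy_pick_old() returns the idle task while toy_pick_patched() keeps next, so the CPU does not go idle with runnable work pending.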

To test this theory, could you turn the SMT scheduler back on but also 
apply the patch below. Does it still work fine?

	Ingo

---------------------->
Subject: [patch] sched: fix SMT scheduler bug
From: Ingo Molnar <mingo@...e.hu>

the SMT scheduler incorrectly skips kernel threads even if they
are runnable (but they are preempted by a higher-prio user-space
task which got SMT-delayed by an even higher-priority task
running on a sibling CPU).

fix this for now by only doing the SMT-nice optimization if
the to-be-delayed task is the only runnable task. (This should
cover most of the real-life cases anyway.)

this bug has been in the SMT scheduler since 2.6.17 or so, but has
only been noticed now by the active check in the dynticks code.

Signed-off-by: Ingo Molnar <mingo@...e.hu>
---
 kernel/sched.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -3537,7 +3537,7 @@ need_resched_nonpreemptible:
 		}
 	}
 	next->sleep_type = SLEEP_NORMAL;
-	if (dependent_sleeper(cpu, rq, next))
+	if (rq->nr_running == 1 && dependent_sleeper(cpu, rq, next))
 		next = rq->idle;
 switch_tasks:
 	if (next == rq->idle)
