linux-kernel - Re: [patch] CFS scheduler, -v12

Message-ID: <465CBD35.2000109@bigpond.net.au>
Date:	Wed, 30 May 2007 09:54:29 +1000
From:	Peter Williams <pwil3058@...pond.net.au>
To:	"Siddha, Suresh B" <suresh.b.siddha@...el.com>
CC:	Ingo Molnar <mingo@...e.hu>, colpatch@...ibm.com,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Con Kolivas <kernel@...ivas.org>,
	Christoph Lameter <clameter@....com>,
	Dmitry Adamushko <dmitry.adamushko@...il.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [patch] CFS scheduler, -v12

Siddha, Suresh B wrote:
> On Thu, May 24, 2007 at 04:23:19PM -0700, Peter Williams wrote:
>> Siddha, Suresh B wrote:
>>> On Thu, May 24, 2007 at 12:43:58AM -0700, Peter Williams wrote:
>>>> Further testing indicates that CONFIG_SCHED_MC is not implicated and
>>>> it's CONFIG_SCHED_SMT that's causing the problem.  This rules out the
>>>> code in find_busiest_group() as it is common to both macros.
>>>>
>>>> I think this makes the scheduling domain parameter values the most
>>>> likely cause of the problem.  I'm not very familiar with this code so
>>>> I've added those who've modified this code in the last year or so to
>>>> the address of this e-mail.
>>> What platform is this? I remember you mentioned it's a 2 CPU box. Is it
>>> dual core or dual package or one with HT?
>> It's a single CPU HT box i.e. 2 virtual CPUs.  "cat /proc/cpuinfo"
>> produces:
> 
> Peter, I tried on a similar box and couldn't reproduce this problem
> with x86_64

Mine's a 32 bit machine.

> 2.6.22-rc3 kernel

I haven't tried rc3 yet.

> and using defconfig (which has SCHED_SMT turned on).
> I am using top and just the spinners.  I don't have gkrellm running; is that
> required to reproduce the issue?

Not necessarily.  But you may need to do a number of trials as sheer 
chance plays a part.

> 
> I tried a number of times, in runlevels 3 and 5 (with top running
> in an xterm in the case of runlevel 5).

I've always done it in run level 5 using gnome-terminal.  I take 10
consecutive trials without seeing the problem as an indication of its
absence, but will cut that short if I see a 3/1 split which quickly
recovers (see below).

> 
> In runlevel 5, occasionally for one refresh of the top screen, I see three
> spinners on one cpu and one spinner on the other (with X or some other app
> also on the cpu with one spinner). But it balances nicely on the
> immediate next refresh of the top screen.

Yes, that (the fact that it recovers quickly) confirms that the problem
isn't present on your system.  If load balancing occurs while tasks
other than the spinners are actually running, a 1/3 split for the
spinners is a reasonable outcome, so seeing the occasional 1/3 split is
OK, but it should return to 2/2 as soon as the other tasks sleep.
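
For anyone who would rather sample the split than read it off top, a
minimal sketch follows.  It assumes the spinners' PIDs are known and
that a line from /proc/<pid>/stat has been read for each; field 39 of
that file is the CPU the task last ran on.  The function names
cpu_from_stat_line and spinner_split are made up for illustration and
are not part of any tool used in this thread.

```python
from collections import Counter


def cpu_from_stat_line(line):
    """Return the CPU a task last ran on, given one /proc/<pid>/stat line."""
    # Field 2 (comm) is parenthesised and may contain spaces, so split
    # on the last ')' rather than on whitespace alone.
    rest = line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 39 (processor) is rest[36].
    return int(rest[36])


def spinner_split(cpus, ncpus=2):
    """Count spinners per CPU, largest first, e.g. [0, 0, 1, 0] -> (3, 1)."""
    counts = Counter(cpus)
    return tuple(sorted((counts.get(c, 0) for c in range(ncpus)), reverse=True))
```

Sampling this at about top's refresh interval would show whether a
(3, 1) reading goes back to (2, 2) on the next sample or stays put.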

When I'm doing my tests (for the various combinations of macros) I 
always count a case where I see a 3/1 split that quickly recovers as 
proof that this problem isn't present for that case and cease testing.
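
The stopping rule described above can be written down as a small
sketch.  The outcome labels ("balanced", "recovered", "stuck") and the
function name assess are hypothetical, and treating a 3/1 split that
does not recover as the problem itself is an assumption drawn from the
discussion rather than something stated explicitly.

```python
def assess(trials, needed=10):
    """Classify a sequence of per-trial outcomes.

    Each outcome is one of (labels are illustrative only):
      "balanced"  - spinners stayed 2/2 for the whole trial
      "recovered" - a 3/1 split appeared and quickly returned to 2/2
      "stuck"     - a 3/1 split appeared and did not recover (assumed
                    to be the problem under investigation)
    """
    for n, outcome in enumerate(trials, start=1):
        if outcome == "stuck":
            return "present"
        if outcome == "recovered":
            # A quickly recovering 3/1 split is taken as proof of
            # absence, so testing stops early.
            return "absent"
        if n >= needed:
            # Enough consecutive clean trials to call it absent.
            return "absent"
    return "inconclusive"
```

For example, assess(["balanced", "recovered"]) stops after the second
trial, while four clean trials on their own remain inconclusive.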

> 
> I tried with various refresh rates of top too. Do you see the issue
> at runlevel 3 too?

I haven't tried that.

Do your spinners ever relinquish the CPU voluntarily?

Peter
-- 
Peter Williams                                   pwil3058@...pond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce