linux-kernel - Re: [RFC PATCH 0/2] sched: simplify the select_task_rq

Message-ID: <1358750523.4994.55.camel@marge.simpson.net>
Date:	Mon, 21 Jan 2013 07:42:03 +0100
From:	Mike Galbraith <bitbucket@...ine.de>
To:	Michael Wang <wangyun@...ux.vnet.ibm.com>
Cc:	linux-kernel@...r.kernel.org, mingo@...hat.com,
	peterz@...radead.org, mingo@...nel.org, a.p.zijlstra@...llo.nl
Subject: Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()

On Mon, 2013-01-21 at 13:07 +0800, Michael Wang wrote:

> That seems like the default one, could you please show me the numbers in
> your datapoint file?

Yup, I do not touch the workfile.  Datapoints is what you see in the
tabulated result...

1
1
1
5
5
5
10
10
10
...

so it does three consecutive runs at each load level.  I quiesce the
box, set the governor to performance, run

    echo 250 32000 32 4096 > /proc/sys/kernel/sem

then ./multitask -nl -f, and point it at ./datapoints.

> I'm not familiar with this benchmark, but I'd like to have a try on my
> server, to make sure whether it is a generic issue.

One thing I didn't like about your changes is that you don't ask
wake_affine() if it's ok to pull cross node or not, which I thought might
induce imbalance, but twiddling that didn't fix up the collapse, pretty
much leaving only the balance path.

> >> And I'm confusing about how those new parameter value was figured out
> >> and how could them help solve the possible issue?
> > 
> > Oh, that's easy.  I set sched_min_granularity_ns such that last_buddy
> > kicks in when a third task arrives on a runqueue, and set
> > sched_wakeup_granularity_ns near minimum that still allows wakeup
> > preemption to occur.  Combined effect is reduced over-scheduling.
> 
> That sounds very hard, to catch the timing, whatever, it could be an
> important clue for analysis.

(Play with the knobs with a bunch of different loads, I think you'll
find that those settings work well)
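
A rough sketch of the arithmetic behind the sched_min_granularity_ns
choice, for anyone following along.  It assumes (from a reading of
kernel/sched/fair.c of this era, not from anything stated in this thread)
that buddies are only set once cfs_rq->nr_running reaches
sched_nr_latency, and that sched_nr_latency is
DIV_ROUND_UP(sched_latency_ns, sched_min_granularity_ns).  The function
names and the numbers in the usage note are illustrative only; they are
not the values used on the test box.

```c
#include <assert.h>

/* Round-up integer division, mirroring the kernel's DIV_ROUND_UP() macro. */
static unsigned long div_round_up(unsigned long n, unsigned long d)
{
    assert(d > 0);
    return (n + d - 1) / d;
}

/*
 * Assumed model, not kernel code: buddy marking only happens once
 * nr_running >= nr_latency, where
 * nr_latency = DIV_ROUND_UP(latency_ns, min_gran_ns).
 * Returns 1 if the threshold is reached, 0 otherwise.
 */
static int buddy_threshold_reached(unsigned long latency_ns,
                                   unsigned long min_gran_ns,
                                   unsigned int nr_running)
{
    return nr_running >= div_round_up(latency_ns, min_gran_ns);
}
```

Under that model, a hypothetical latency of 24000000 ns with a minimum
granularity of 8000000 ns gives a threshold of 3, so the check first
passes when the third task is on the runqueue.  With a 6000000 ns /
750000 ns pair the threshold is 8, and three tasks are not enough.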

> >> Do you have any idea about which part in this patch set may cause the issue?
> > 
> > Nope, I'm as puzzled by that as you are.  When the box had 40 cores,
> > both virgin and patched showed over-scheduling effects, but not like
> > this.  With 20 cores, symptoms changed in a most puzzling way, and I
> > don't see how you'd be directly responsible.
> 
> Hmm...
> 
> > 
> >> One change by designed is that, for old logical, if it's a wake up and
> >> we found affine sd, the select func will never go into the balance path,
> >> but the new logical will, in some cases, do you think this could be a
> >> problem?
> > 
> > Since it's the high load end, where looking for an idle core is most
> > likely to be a waste of time, it makes sense that entering the balance
> > path would hurt _some_, it isn't free.. except for twiddling preemption
> > knobs making the collapse just go away.  We're still going to enter that
> > path if all cores are busy, no matter how I twiddle those knobs.
> 
> May be we could try change this back to the old way later, after the aim
> 7 test on my server.

Yeah, something funny is going on.  I'd like select_idle_sibling() to
just go away, that task be integrated into one and only one short and
sweet balance path.  I don't see why find_idlest* needs to continue
traversal after seeing a zero.  It should be just fine to say gee, we're
done.  Hohum, so much for pure test and report, twiddle twiddle tweak,
bend spindle mutilate ;-) 
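
To make the "done when we see a zero" idea concrete, here is a minimal
userspace sketch.  It is not the kernel's find_idlest_group() or
find_idlest_cpu(); the function name, the flat array of per-CPU loads and
the visited counter are all hypothetical stand-ins, used only to show the
early exit.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a find_idlest*-style scan: 'load' holds one
 * load value per candidate CPU (0 means idle).  Returns the index of the
 * least loaded candidate and, if 'visited' is non-NULL, stores how many
 * candidates were examined.
 */
static size_t find_idlest_early_exit(const unsigned long *load, size_t n,
                                     size_t *visited)
{
    size_t best = 0;
    size_t i;

    assert(load != NULL && n > 0);

    for (i = 0; i < n; i++) {
        if (load[i] < load[best])
            best = i;
        if (load[best] == 0) {
            /* Nothing can beat an idle candidate, so stop here. */
            i++;    /* count the current candidate as examined */
            break;
        }
    }

    if (visited)
        *visited = i;
    return best;
}
```

With loads {7, 3, 0, 5, 0} the scan returns index 2 after examining three
candidates instead of five.  When no candidate is idle it degrades to a
full traversal, which is the same cost as not having the early exit.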
   
-Mike
