Message-ID: <1358750523.4994.55.camel@marge.simpson.net>
Date: Mon, 21 Jan 2013 07:42:03 +0100
From: Mike Galbraith <bitbucket@...ine.de>
To: Michael Wang <wangyun@...ux.vnet.ibm.com>
Cc: linux-kernel@...r.kernel.org, mingo@...hat.com,
peterz@...radead.org, mingo@...nel.org, a.p.zijlstra@...llo.nl
Subject: Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()
On Mon, 2013-01-21 at 13:07 +0800, Michael Wang wrote:
> That seems like the default one; could you please show me the numbers in
> your datapoints file?
Yup, I do not touch the workfile. Datapoints is what you see in the
tabulated result...
1
1
1
5
5
5
10
10
10
...
so it does three consecutive runs at each load level. I quiesce the
box, set the governor to performance, echo 250 32000 32 4096 > /proc/sys/kernel/sem,
then run ./multitask -nl -f and point it at ./datapoints.
> I'm not familiar with this benchmark, but I'd like to try it on my
> server to see whether it is a generic issue.
One thing I didn't like about your changes is that you don't ask
wake_affine() whether it's OK to pull cross-node or not, which I thought
might induce imbalance, but twiddling that didn't fix up the collapse,
pretty much leaving only the balance path.
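
Roughly the shape of the gate I have in mind, as a standalone toy;
node_of() and wake_affine_ok() below are made-up stand-ins for
illustration, not the kernel's API:

#include <stdio.h>
#include <stdbool.h>

/* Toy topology: pretend 4 CPUs per node. */
static int node_of(int cpu)
{
        return cpu / 4;
}

/* Stand-in for a wake_affine()-style load check; always says no here. */
static bool wake_affine_ok(int prev_cpu, int waker_cpu)
{
        (void)prev_cpu;
        (void)waker_cpu;
        return false;
}

/*
 * Only pull the wakee toward the waking CPU when it stays on-node, or
 * when the affinity check approves the cross-node move; otherwise leave
 * it where it last ran.
 */
static int pick_wake_cpu(int waker_cpu, int prev_cpu)
{
        if (node_of(waker_cpu) == node_of(prev_cpu))
                return waker_cpu;
        if (wake_affine_ok(prev_cpu, waker_cpu))
                return waker_cpu;
        return prev_cpu;
}

int main(void)
{
        /* cpu 0 and cpu 6 sit on different toy nodes -> stay on cpu 6 */
        printf("run wakee on cpu %d\n", pick_wake_cpu(0, 6));
        return 0;
}
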
> >> And I'm confused about how those new parameter values were figured out,
> >> and how they could help solve the possible issue.
> >
> > Oh, that's easy. I set sched_min_granularity_ns such that last_buddy
> > kicks in when a third task arrives on a runqueue, and set
> > sched_wakeup_granularity_ns near the minimum that still allows wakeup
> > preemption to occur. The combined effect is reduced over-scheduling.
>
> Catching that timing sounds very hard; in any case, it could be an
> important clue for the analysis.
(Play with the knobs under a bunch of different loads; I think you'll
find that those settings work well.)
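
For the arithmetic behind that: sched_nr_latency is (roughly)
DIV_ROUND_UP(sched_latency_ns, sched_min_granularity_ns), and
check_preempt_wakeup() only arms last_buddy once nr_running reaches that
count, so a min granularity of about a third of the latency target means
last_buddy kicks in when the third task arrives. A quick userspace check
of that relation (the knob values below are only examples, not the exact
settings used here):

#include <stdio.h>

/* Same rounding the kernel's DIV_ROUND_UP() does. */
static unsigned long div_round_up(unsigned long n, unsigned long d)
{
        return (n + d - 1) / d;
}

int main(void)
{
        /* Example values only. */
        unsigned long latency_ns  = 24000000;   /* sched_latency_ns         */
        unsigned long min_gran_ns =  8000000;   /* sched_min_granularity_ns */

        /* last_buddy is only set once cfs_rq->nr_running >= this number. */
        printf("sched_nr_latency = %lu\n",
               div_round_up(latency_ns, min_gran_ns));          /* prints 3 */
        return 0;
}
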
> >> Do you have any idea which part of this patch set may cause the issue?
> >
> > Nope, I'm as puzzled by that as you are. When the box had 40 cores,
> > both virgin and patched showed over-scheduling effects, but not like
> > this. With 20 cores, symptoms changed in a most puzzling way, and I
> > don't see how you'd be directly responsible.
>
> Hmm...
>
> >
> >> One change by design is that, with the old logic, if it's a wakeup and
> >> we found an affine sd, the select func will never go into the balance path,
> >> but the new logic will in some cases; do you think this could be a
> >> problem?
> >
> > Since it's the high load end, where looking for an idle core is most
> > likely to be a waste of time, it makes sense that entering the balance
> > path would hurt _some_, it isn't free... except that twiddling preemption
> > knobs makes the collapse just go away. We're still going to enter that
> > path if all cores are busy, no matter how I twiddle those knobs.
>
> Maybe we could try changing this back to the old way later, after the
> AIM7 test on my server.
Yeah, something funny is going on. I'd like select_idle_sibling() to
just go away, with that task integrated into one and only one short and
sweet balance path. I don't see why find_idlest* needs to continue
traversal after seeing a zero. It should be just fine to say gee, we're
done. Hohum, so much for pure test and report, twiddle twiddle tweak,
bend spindle mutilate ;-)
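
Something like this toy scan is what I mean by bailing out early; it's a
simplified stand-in over an array of per-CPU loads, not the kernel's
find_idlest_cpu():

#include <stdio.h>

/* Scan candidate CPUs for the least-loaded one, but a load of zero can't
 * be beaten, so stop right there. */
static int find_idlest(const unsigned long *load, int nr_cpus)
{
        unsigned long min_load = (unsigned long)-1;
        int idlest = -1;

        for (int cpu = 0; cpu < nr_cpus; cpu++) {
                if (load[cpu] == 0)
                        return cpu;             /* gee, we're done */
                if (load[cpu] < min_load) {
                        min_load = load[cpu];
                        idlest = cpu;
                }
        }
        return idlest;
}

int main(void)
{
        unsigned long load[] = { 37, 12, 0, 9 };

        printf("idlest cpu: %d\n", find_idlest(load, 4));       /* -> 2 */
        return 0;
}
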
-Mike