[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.64.0707281224100.22772@tongli.jf.intel.com>
Date: Sat, 28 Jul 2007 12:38:23 -0700 (PDT)
From: Tong Li <tong.n.li@...el.com>
To: Chris Snook <csnook@...hat.com>
cc: "Bill Huey (hui)" <billh@...ppy.monkey.org>,
Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org
Subject: Re: [RFC] scheduler: improve SMP fairness in CFS
On Fri, 27 Jul 2007, Chris Snook wrote:
> Bill Huey (hui) wrote:
>> You have to consider the target for this kind of code. There are
>> applications
>> where you need something that falls within a constant error bound.
>> According
>> to the numbers, the current CFS rebalancing logic doesn't achieve that to
>> any degree of rigor. So CFS is ok for SCHED_OTHER, but not for anything
>> more
>> strict than that.
>
> I've said from the beginning that I think that anyone who desperately needs
> perfect fairness should be explicitly enforcing it with the aid of realtime
> priorities. The problem is that configuring and tuning a realtime
> application is a pain, and people want to be able to approximate this
> behavior without doing a whole lot of dirty work themselves. I believe that
> CFS can and should be enhanced to ensure SMP-fairness over potentially short,
> user-configurable intervals, even for SCHED_OTHER. I do not, however,
> believe that we should take it to the extreme of wasting CPU cycles on
> migrations that will not improve performance for *any* task, just to avoid
> letting some tasks get ahead of others. We should be as fair as possible but
> no fairer. If we've already made it as fair as possible, we should account
> for the margin of error and correct for it the next time we rebalance. We
> should not burn the surplus just to get rid of it.
Proportional-share scheduling actually has one of its roots in real-time
and having a p-fair scheduler is essential for real-time apps (soft
real-time).
>
> On a non-NUMA box with single-socket, non-SMT processors, a constant error
> bound is fine. Once we add SMT, go multi-core, go NUMA, and add
> inter-chassis interconnects on top of that, we need to multiply this error
> bound at each stage in the hierarchy, or else we'll end up wasting CPU cycles
> on migrations that actually hurt the processes they're supposed to be
> helping, and hurt everyone else even more. I believe we should enforce an
> error bound that is proportional to migration cost.
>
I think we are actually in agreement. When I say constant bound, it can
certainly be a constant that's determined based on inputs from the memory
hierarchy. The point is that it needs to be a constant independent of
things like # of tasks.
> But this patch is only relevant to SCHED_OTHER. The realtime scheduler
> doesn't have a concept of fairness, just priorities. That why each realtime
> priority level has its own separate runqueue. Realtime schedulers are
> supposed to be dumb as a post, so they cannot heuristically decide to do
> anything other than precisely what you configured them to do, and so they
> don't get in the way when you're context switching a million times a second.
Are you referring to hard real-time? As I said, an infrastructure that
enables p-fair scheduling, EDF, or things alike is the foundation for
real-time. I designed DWRR, however, with a target of non-RT apps,
although I was hoping the research results might be applicable to RT.
tong
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists