Message-ID: <4537238A.7060106@yahoo.com.au>
Date:	Thu, 19 Oct 2006 17:04:42 +1000
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Paul Jackson <pj@....com>
CC:	holt@....com, suresh.b.siddha@...el.com, dino@...ibm.com,
	menage@...gle.com, Simon.Derr@...l.net,
	linux-kernel@...r.kernel.org, mbligh@...gle.com,
	rohitseth@...gle.com, dipankar@...ibm.com
Subject: Re: exclusive cpusets broken with cpu hotplug
Paul Jackson wrote:
> Nick wrote:
> 
>>I don't understand why you think the "implicit" (as in, not directly user
>>controlled?) linkage is wrong.
> 
> 
> Twice now I've given the following specific example.  I am not yet
> confident that I have it right, and welcome feedback.
Sorry, I skimmed over that.
> 
> However, Suresh has apparently agreed with my conclusion that one
> can use the current linkage between cpu_exclusive cpusets and sched
> domains to get unexpected and perhaps undesirable sched domain setups.
> 
> What's your take on this example:
> 
> 
>>Example:
>>
>>    As best as I can tell (which is not very far ;), if some hapless
>>    user does the following:
>>
>>	    /dev/cpuset		cpu_exclusive == 1; cpus == 0-7
>>	    /dev/cpuset/a	cpu_exclusive == 1; cpus == 0-3
>>	    /dev/cpuset/b	cpu_exclusive == 1; cpus == 4-7
>>
>>    and then runs a big job in the top cpuset (/dev/cpuset), then that
>>    big job will not load balance correctly, with whatever threads
>>    in the big job that got stuck on cpus 0-3 isolated from whatever
>>    threads got stuck on cpus 4-7.
>>
>>Is this correct?
> 
> 
> If I have concluded incorrectly what happens in the above example
> (good chance) then please educate me on how this stuff works.
So that depends on what cpusets asks for. If, when setting up a and
b, it asks to partition the domains, then yes, the parent cpuset gets
broken.
> I should warn you that I have demonstrated a remarkable resistance
> to being educatible on this subject ;).
Don't worry about the whole sched-domains implementation; just
consider that partitioning the domains creates a hard partition
among the system's CPUs (the upside being that within the partitions,
balancing works pretty nicely).
So in your above example, cpusets should only ask for a single
partition covering CPUs 0-7.
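
As a toy illustration of the hard-partition point (a Python model, not
kernel code; the function name `balance_targets` and the set-based
layout are assumptions made up for this sketch), a task can only be
balanced onto allowed CPUs that sit in the same partition as the CPU
it is currently on:

```python
def balance_targets(current_cpu, allowed_cpus, partitions):
    """Return the CPUs a task could be migrated to by load balancing.

    Toy model (assumption): balancing never crosses a partition
    boundary, so only allowed CPUs inside the partition that holds
    the task's current CPU are reachable.
    """
    for part in partitions:
        if current_cpu in part:
            return set(allowed_cpus) & set(part)
    raise ValueError(f"cpu {current_cpu} is not in any partition")


# The example from the thread: a big job in the top cpuset, CPUs 0-7.
root = set(range(8))

# Partitioning along cpusets a (0-3) and b (4-7) strands the job's threads.
split = [set(range(0, 4)), set(range(4, 8))]
print(sorted(balance_targets(1, root, split)))   # [0, 1, 2, 3]

# Asking for one partition of 0-7 lets the job balance everywhere.
single = [set(range(8))]
print(sorted(balance_targets(1, root, single)))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

With the split layout, a thread that starts on CPU 1 can never reach
CPUs 4-7, which is the breakage described in the quoted example.
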
If you wanted to get fancy and detect that there are no jobs in the
root cpuset, then you could make the two smaller partitions, and revert
back to the one bigger one if something gets assigned to it.
But that's all a matter of how you want cpusets to manage it; I really
don't think a user should control this (we simply shouldn't allow
situations where we put a partition in the middle of a cpuset).
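
One way to picture that policy is the rough sketch below. It is a
Python model only: `choose_partitions` and its arguments are
hypothetical names, and falling back to a single partition when the
children overlap or stray outside the root is an assumption of the
sketch, not something cpusets is known to do.

```python
def choose_partitions(root_cpus, child_cpusets, root_has_tasks):
    """Pick the partitions cpusets would ask the scheduler for.

    Rule sketched here: never put a partition boundary inside a cpuset
    that has tasks. If the root cpuset has tasks, ask for one partition
    spanning all its CPUs; otherwise the children may become partitions.
    """
    root = frozenset(root_cpus)
    children = [frozenset(c) for c in child_cpusets]

    if root_has_tasks or not children:
        return [root]

    covered = set()
    for child in children:
        # Assumption: overlapping or out-of-range children are not
        # partitioned; fall back to the single big partition.
        if not child <= root or covered & child:
            return [root]
        covered |= child

    parts = list(children)
    leftover = root - covered
    if leftover:
        # CPUs in no child cpuset form their own partition.
        parts.append(frozenset(leftover))
    return parts


# Big job running in the root cpuset: one partition of 0-7.
print(choose_partitions(range(8), [range(0, 4), range(4, 8)], True))

# Root cpuset empty: the two smaller partitions are safe.
print(choose_partitions(range(8), [range(0, 4), range(4, 8)], False))
```

Re-running the function whenever a task is attached to or leaves the
root cpuset gives the "revert back to the one bigger one" behaviour.
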
Thanks,
Nick
-- 
SUSE Labs, Novell Inc.