lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-Id: <200709301335.00441.nickpiggin@yahoo.com.au> Date: Sun, 30 Sep 2007 13:34:59 +1000 From: Nick Piggin <nickpiggin@...oo.com.au> To: Paul Jackson <pj@....com> Cc: akpm@...ux-foundation.org, menage@...gle.com, linux-kernel@...r.kernel.org, dino@...ibm.com, cpw@....com, mingo@...e.hu Subject: Re: [PATCH] cpuset and sched domains: sched_load_balance flag On Monday 01 October 2007 04:07, Paul Jackson wrote: > Nick wrote: > > The user should just be able to specify exactly the partitioning of > > tasks required, and cpusets should ask the scheduler to do the best > > job of load balancing possible. > > If the cpusets which have 'sched_load_balance' enabled are disjoint > (their 'cpus' cpus_allowed masks don't overlap) then you get exactly > what you're asking for. In that case there is exactly one sched domain > for the 'cpus' allowed by each cpuset that has sched_load_balanced > enabled. But you could do that just by having the current cpuset scheme able to properly partition the system. You can't (easily) do this now because you have so many tasks in the root cpuset that it is impossible to know whether or not you actually want to load balance them. You would do this by creating partitioning cpusets which carve up the root cpuset (basically -- have multiple roots). > But there is another case in which one does not want what you ask for. > > That case involves the situation where one is running a third part > batch scheduler on part of ones big system, and doing other stuff > (perhaps Ingo's realtime stuff) on another part of the system. In this case the admin would simply not partition the system (they would retain a single root cpuset). Neither approach is really fundamentally more or less powerful than the other, but what I object to in yours is adding these flags which don't allow the admin to specify what they want, but to specify how they want it done. Moreover, sched_load_balance doesn't really sound like a good name for asking for a partition. It's more like you're just asking to have better load balancing over that set, which you could equally achieve by adding a second set of sched domains (and the global domains could keep globally balancing). Basically: the admin doesn't know best when it comes to how the scheduler should work; the admin knows best about how they intend the system to be used. > In that case, the system admin will be advised to turn off > sched_load_balance on the top cpuset. But in that case the system > admin will -not- know from moment to moment what jobs the batch > scheduler is running on the cpus assigned to its control. Only the > batch scheduler knows that. > > The batch scheduler is code that was written by someone else, in > some other company, some other time. That code does not get to > control the overall sched domain partitioning of the entire system. > The batch scheduler gets to say, in affect: > > Here's where I need load balancing to occur, in the normal fashion, > and here's where I don't need it. Rather than require the admin to know the intricate details about how and why the scheduler load balancing gets broken, and when they might or might not need to use this flag, they can just specify what they want to be done, and the kernel can choose the optimal strategy. > In short, you insisting that only a single administrative point of > control determine the systems sched domains. Sometimes that fits > the way the system is managed, and my patch lets you do that. But > sometimes this is a shared responsibility, between a piece of third > party software and the system admin, and my patch allows for that > case as well. > > This is a typical sort of situation that arises from having hierarchical > cpuset definitions, and highlights the reason (and the use case, > involving third party batch schedulers) that I went with a hierarchical > cpuset architecture in the first place. No, I'm insisting that *no* single administrative point of control determines the sched domains. Not directly. The kernel should. cpusets API should be rich enough that the kernel can derive tihs information from what the admin has intended. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists