Message-ID: <20121004171947.GA2088@thinkpad>
Date:	Thu, 4 Oct 2012 19:19:47 +0200
From:	Andrea Righi <andrea@...terlinux.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Paul Menage <paul@...lmenage.org>, Ingo Molnar <mingo@...hat.com>,
	linux-kernel@...r.kernel.org, Paul Turner <pjt@...gle.com>,
	Glauber Costa <glommer@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH RFC 1/3] sched: introduce distinct per-cpu load average

On Thu, Oct 04, 2012 at 02:12:08PM +0200, Peter Zijlstra wrote:
> On Thu, 2012-10-04 at 11:43 +0200, Andrea Righi wrote:
> > 
> > Right, the update must be atomic to have a coherent nr_uninterruptible
> > value. And AFAICS the only way to account a coherent nr_uninterruptible
> > value per-cpu is to go with atomic ops... mmh... I'll think more on
> > this.
> 
> You could stick it in the cpu controller instead of cpuset, add a
> per-cpu nr_uninterruptible counter to struct task_group and update it
> from the enqueue/dequeue paths. Those already are per-cgroup (through
> cfs_rq, which has a tg pointer).
> 
> That would also give you better semantics since it would really be the
> load of the tasks of the cgroup, not whatever happened to run on a
> particular cpu regardless of groups. Then again, it might be 'fun' to
> get the hierarchical semantics right :-)
> 
> OTOH it would also make calculating the load-avg O(nr_cgroups) and since
> we do this from the tick and people are known to create a shitload (on
> the order of 1e3 and upwards) of those this might not actually be a very
> good idea.

That would be an interesting path to explore, although my concern is the
large hosting companies that want to create something like one cpu cgroup
per user. In that case we may have big scalability issues. Maintaining
all the required stats per-cpu seems a more scalable solution to me
(except probably on large SMP systems...).

I wonder if it is worth defining rq->nr_uninterruptible as a pointer to
percpu data rather than converting it to an atomic var... but this would
be even worse for large SMP systems, especially for those that are not
interested in the loadavg feature.
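
To make the trade-off concrete, here is a rough userspace sketch (not
kernel code) of the two accounting schemes: one shared counter updated
with atomic ops, and one slot per CPU that is only summed on read. It
assumes a task can block on one CPU and be woken on another, so a single
per-cpu slot can go negative and only the sum is meaningful. All names
(SKETCH_NR_CPUS, task_sleeps(), task_wakes(), percpu_sum(),
shared_read()) are hypothetical, and the plain array only stands in for
real percpu data.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_NR_CPUS 4	/* hypothetical, stands in for the real CPU count */

/* Option A: one shared counter, every update is an atomic read-modify-write. */
static atomic_long shared_nr_uninterruptible;

/* Option B: one slot per CPU, written only by its owner, summed on read. */
static long percpu_nr_uninterruptible[SKETCH_NR_CPUS];

/* A task enters uninterruptible sleep on @cpu. */
static void task_sleeps(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	atomic_fetch_add(&shared_nr_uninterruptible, 1);
	percpu_nr_uninterruptible[cpu]++;
}

/* A sleeping task is woken up on @cpu (possibly not where it went to sleep). */
static void task_wakes(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	atomic_fetch_sub(&shared_nr_uninterruptible, 1);
	percpu_nr_uninterruptible[cpu]--;
}

/* Option B read side: O(nr_cpus), but no contended cache line on update. */
static long percpu_sum(void)
{
	long sum = 0;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum += percpu_nr_uninterruptible[cpu];
	return sum;
}

/* Option A read side: O(1), paid for with an atomic op on every update. */
static long shared_read(void)
{
	return atomic_load(&shared_nr_uninterruptible);
}
```

If a task sleeps on CPU 0 and wakes on CPU 1, slot 0 holds +1 and slot 1
holds -1 while both totals are 0. That is why a per-cpu value read in
isolation is not coherent without extra work.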

> 
> Also, your patch 2 relies on the load avg function to be additive yet
> you completely fail to mention this and state whether this is so or
> not.

Correct, I'll include a more detailed description in the next version.
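
As a starting point for that description, here is a small userspace
sketch of the additivity question. It assumes the load average is the
usual fixed-point exponential moving average, with FSHIFT = 11 and
EXP_1 = 1884 as in the kernel headers. The update is linear in real
arithmetic, so the sum of per-cpu averages should match the average of
the summed samples, but the fixed-point rounding at each step is not
linear, so the two can drift apart by a few units of 1/FIXED_1. The
helper names calc_load_sketch() and additivity_error() are hypothetical,
and only two CPUs are modeled.

```c
#include <assert.h>

#define FSHIFT	11			/* bits of fixed-point precision */
#define FIXED_1	(1UL << FSHIFT)		/* 1.0 in fixed point */
#define EXP_1	1884UL			/* 1/exp(5sec/1min) in fixed point */

/* One EMA step: load = load * exp + active * (1 - exp), rounded to nearest. */
static unsigned long calc_load_sketch(unsigned long load, unsigned long exp,
				      unsigned long active)
{
	load *= exp;
	load += active * (FIXED_1 - exp);
	load += 1UL << (FSHIFT - 1);
	return load >> FSHIFT;
}

/*
 * Feed the same task-count samples to two per-cpu averages and to one
 * global average, and return |(avg0 + avg1) - global| in fixed-point units.
 */
static unsigned long additivity_error(const unsigned long *cpu0,
				      const unsigned long *cpu1, int n)
{
	unsigned long avg0 = 0, avg1 = 0, global = 0, sum;

	for (int i = 0; i < n; i++) {
		avg0 = calc_load_sketch(avg0, EXP_1, cpu0[i] * FIXED_1);
		avg1 = calc_load_sketch(avg1, EXP_1, cpu1[i] * FIXED_1);
		global = calc_load_sketch(global, EXP_1,
					  (cpu0[i] + cpu1[i]) * FIXED_1);
	}
	sum = avg0 + avg1;
	return sum > global ? sum - global : global - sum;
}
```

Each step rounds by at most half a unit and old error decays by
EXP_1/FIXED_1, so the error of a single series stays below about 6.25
units. For two per-cpu series plus the global one, the mismatch is then
bounded by roughly 19/2048 of a task, which should be negligible for a
value that is reported with two decimals.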

> 
> Furthermore, please look at PER_CPU() and friends as alternatives to
> [NR_CPUS] arrays.

Will do.

Thanks again for your suggestions.

-Andrea