Date:	Fri, 26 Apr 2013 14:38:27 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Vincent Guittot <vincent.guittot@...aro.org>
Cc:	linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
	linaro-kernel@...ts.linaro.org, mingo@...nel.org,
	linux@....linux.org.uk, pjt@...gle.com, santosh.shilimkar@...com,
	Morten.Rasmussen@....com, chander.kashyap@...aro.org,
	cmetcalf@...era.com, tony.luck@...el.com, alex.shi@...el.com,
	preeti@...ux.vnet.ibm.com, paulmck@...ux.vnet.ibm.com,
	tglx@...utronix.de, len.brown@...el.com, arjan@...ux.intel.com,
	amit.kucheria@...aro.org, corbet@....net, l.majewski@...sung.com
Subject: Re: [PATCH 03/14] sched: pack small tasks

On Thu, Apr 25, 2013 at 07:23:19PM +0200, Vincent Guittot wrote:

> +static bool is_buddy_busy(int cpu)
> +{
> +	struct rq *rq = cpu_rq(cpu);
> +	u32 sum = rq->avg.runnable_avg_sum;
> +	u32 period = rq->avg.runnable_avg_period;
> +
> +	/*
> +	 * If a CPU accesses the runnable_avg_sum and runnable_avg_period
> +	 * fields of its buddy CPU while the latter updates it, it can get the
> +	 * new version of a field and the old version of the other one. This
> +	 * can generate erroneous decisions. We don't want to use a lock
> +	 * mechanism for ensuring the coherency because of the overhead in
> +	 * this critical path.
> +	 * The runnable_avg_period of a runqueue tends to the max value in
> +	 * less than 345ms after plugging a CPU, which implies that we could
> +	 * use the max value instead of reading runnable_avg_period after
> +	 * 345ms. During the starting phase, we must ensure a minimum of
> +	 * coherency between the fields. A simple rule is runnable_avg_sum <=
> +	 * runnable_avg_period.
> +	 */
> +	sum = min(sum, period);
> +
> +	/*
> +	 * A busy buddy is a CPU with a high load or a small load with a lot of
> +	 * running tasks.
> +	 */
> +	return (sum > (period / (rq->nr_running + 2)));
> +}
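
For reference, the threshold this implements as I read it: the buddy counts
as busy once sum/period > 1/(nr_running + 2), i.e. above 50% utilization
with nothing running, 33% with one task, 25% with two, and so on.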


I'm still not sold on the entire nr_running thing and the comment doesn't say
why you're doing it.

I'm fairly sure there's software out there that wakes a gazillion threads at a
time only for a gazillion-1 to go back to sleep immediately. Patterns like that
completely defeat your heuristic.

Does that matter... I don't know.

What happens if you keep this thing simple and only put a cap on utilization
(say 80%) and drop the entire nr_running magic? Have you seen it make an actual
difference or did it just seem like a good (TM) thing to do?
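
To be concrete, something like the below is what I have in mind (untested
sketch; 80% is only the example figure above, pick whatever):

static bool is_buddy_busy(int cpu)
{
	struct rq *rq = cpu_rq(cpu);
	u32 sum = rq->avg.runnable_avg_sum;
	u32 period = rq->avg.runnable_avg_period;

	/* Keep the two fields minimally coherent, as in your version. */
	sum = min(sum, period);

	/* Busy once utilization exceeds ~80% of the period; no nr_running term. */
	return sum * 5 > period * 4;
}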

> +static bool is_light_task(struct task_struct *p)
> +{
> +	/* A light task runs less than 20% on average */
> +	return ((p->se.avg.runnable_avg_sum  * 5) <
> +			(p->se.avg.runnable_avg_period));
> +}

There are superfluous () and ' ' in there. Also why 20%.. seemed like a good
number? Do we want a SCHED_DEBUG sysctl for that?
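
If we do want the knob, an untested sketch of what I mean (the kernel/sysctl.c
wiring is omitted and sysctl_sched_pack_light_pct is a made-up name):

static unsigned int sysctl_sched_pack_light_pct = 20;

static bool is_light_task(struct task_struct *p)
{
	/* Light == runs less than the configured percentage on average. */
	return p->se.avg.runnable_avg_sum * 100 <
	       p->se.avg.runnable_avg_period * sysctl_sched_pack_light_pct;
}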
