Message-ID: <11aaa3a8-e6b9-cf1f-08bb-0f8e1b63942b@linux.intel.com>
Date: Wed, 4 Sep 2019 10:32:05 -0700
From: Tim Chen <tim.c.chen@...ux.intel.com>
To: subhra mazumdar <subhra.mazumdar@...cle.com>,
linux-kernel@...r.kernel.org
Cc: peterz@...radead.org, mingo@...hat.com, tglx@...utronix.de,
steven.sistare@...cle.com, dhaval.giani@...cle.com,
daniel.lezcano@...aro.org, vincent.guittot@...aro.org,
viresh.kumar@...aro.org, mgorman@...hsingularity.net,
parth@...ux.ibm.com, patrick.bellasi@....com
Subject: Re: [RFC PATCH 1/9] sched,cgroup: Add interface for latency-nice
On 8/30/19 10:49 AM, subhra mazumdar wrote:
> Add Cgroup interface for latency-nice. Each CPU Cgroup adds a new file
> "latency-nice" which is shared by all the threads in that Cgroup.
Subhra,
Thanks for posting the patchset. Having a latency-nice hint
is useful beyond idle load balancing. I can think of other
application scenarios, like scheduling batch machine-learning AVX-512
processes alongside latency-sensitive processes. AVX-512 limits the
frequency of the CPU, so it is best to avoid putting latency-sensitive
tasks on the same core as AVX-512 tasks. A latency-nice hint gives the
scheduler a criterion for determining the latency sensitivity of a task
and keeping latency-sensitive tasks away from AVX-512 tasks.
You configure the latency hint on a per-cgroup basis,
but I think not all tasks in a cgroup necessarily have the same
latency sensitivity.
For example, a cgroup can be applied on a per-user basis,
and that user could run different tasks with different latency sensitivities.
We may also need a way to configure latency sensitivity on a per-task basis
instead of on a per-cgroup basis.
Tim
> @@ -631,6 +631,7 @@ struct task_struct {
> int static_prio;
> int normal_prio;
> unsigned int rt_priority;
> + u64 latency_nice;
Does it need to be 64 bits? The max latency-nice value is only 100.
>
> const struct sched_class *sched_class;
> struct sched_entity se;
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 874c427..47969bc 100644
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index b52ed1a..365c928 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -143,6 +143,13 @@ static inline void cpu_load_update_active(struct rq *this_rq) { }
> #define NICE_0_LOAD (1L << NICE_0_LOAD_SHIFT)
>
> /*
> + * Latency-nice default value
> + */
It would be useful to add a comment letting the reader know
that a higher latency-nice number means a task is more
latency tolerant.
Is there a reason for setting the default to the low
value of 5?
It seems we will then default to searching only the
same core for an idle CPU on a smaller system,
as we only search 5% of the CPU span of the target sched domain.
> +#define LATENCY_NICE_DEFAULT 5
> +#define LATENCY_NICE_MIN 1
> +#define LATENCY_NICE_MAX 100
> +