Message-ID: <40b1f087-2411-8fb3-4aad-20791e63d940@redhat.com>
Date: Mon, 18 Jan 2021 14:17:43 +0100
From: Daniel Bristot de Oliveira <bristot@...hat.com>
To: Dietmar Eggemann <dietmar.eggemann@....com>,
linux-kernel@...r.kernel.org
Cc: Marco Perronet <perronet@...-sws.org>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Li Zefan <lizefan@...wei.com>, Tejun Heo <tj@...nel.org>,
Johannes Weiner <hannes@...xchg.org>,
Valentin Schneider <valentin.schneider@....com>,
cgroups@...r.kernel.org
Subject: Re: [PATCH 6/6] sched/deadline: Fixes cpu/rd/dl_bw references for
suspended tasks
On 1/15/21 3:36 PM, Dietmar Eggemann wrote:
> On 12/01/2021 16:53, Daniel Bristot de Oliveira wrote:
>
> [...]
>
>> ----- %< -----
>> #!/bin/bash
>> # Enter on the cgroup directory
>> cd /sys/fs/cgroup/
>>
>> # Check if it is cgroup v2 and enable cpuset
>> if [ -e cgroup.subtree_control ]; then
>> # Enable cpuset controller on cgroup v2
>> echo +cpuset > cgroup.subtree_control
>> fi
>>
>> echo LOG: create an exclusive cpuset and assign CPU 0 to it
>> # Create cpuset groups
>> rmdir dl-group &> /dev/null
>> mkdir dl-group
>>
>> # Restrict the task to CPU 0
>> echo 0 > dl-group/cpuset.mems
>> echo 0 > dl-group/cpuset.cpus
>> echo root > dl-group/cpuset.cpus.partition
>>
>> echo LOG: dispatching a regular task
>> sleep 100 &
>> CPUSET_PID="$!"
>>
>> # let it settle down
>> sleep 1
>>
>> # Assign the second task to the cgroup
>
> There is only one task 'CPUSET_PID' involved here?
Ooops, yep, I will remove the "second" part.
>> echo LOG: moving the second DL task to the cpuset
>> echo "$CPUSET_PID" > dl-group/cgroup.procs 2> /dev/null
>>
>> CPUSET_ALLOWED=`cat /proc/$CPUSET_PID/status | grep Cpus_allowed_list | awk '{print $2}'`
>>
>> chrt -p -d --sched-period 1000000000 --sched-runtime 100000000 0 $CPUSET_PID
>> ACCEPTED=$?
>>
>> if [ $ACCEPTED == 0 ]; then
>> echo PASS: the task became DL
>> else
>> echo FAIL: the task was rejected as DL
>> fi
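BTW, for anyone reproducing this without util-linux: the chrt call above boils
down to a sched_setattr() syscall with SCHED_DEADLINE and the same
runtime/period. A rough C equivalent (sketch, not compile-tested; the deadline
is set equal to the period here because the kernel rejects a zero deadline):

#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef SCHED_DEADLINE
#define SCHED_DEADLINE	6
#endif

/* glibc has no wrapper nor struct for sched_setattr(), so define them here */
struct sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;		/* ns */
	uint64_t sched_deadline;	/* ns */
	uint64_t sched_period;		/* ns */
};

int main(void)
{
	struct sched_attr attr = {
		.size		= sizeof(attr),
		.sched_policy	= SCHED_DEADLINE,
		.sched_runtime	= 100000000ULL,		/* 100 ms */
		.sched_deadline	= 1000000000ULL,	/*   1 s  */
		.sched_period	= 1000000000ULL,	/*   1 s  */
	};

	/* pid 0 == the calling task; chrt -p passes the target PID instead */
	if (syscall(SYS_sched_setattr, 0, &attr, 0)) {
		perror("sched_setattr");
		return 1;
	}

	printf("running as SCHED_DEADLINE\n");
	pause();

	return 0;
}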
>
> [...]
>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 5961a97541c2..3c2775e6869f 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -5905,15 +5905,15 @@ static int __sched_setscheduler(struct task_struct *p,
>> #ifdef CONFIG_SMP
>> if (dl_bandwidth_enabled() && dl_policy(policy) &&
>> !(attr->sched_flags & SCHED_FLAG_SUGOV)) {
>> - cpumask_t *span = rq->rd->span;
>> + struct root_domain *rd = dl_task_rd(p);
>>
>> /*
>> * Don't allow tasks with an affinity mask smaller than
>> * the entire root_domain to become SCHED_DEADLINE. We
>> * will also fail if there's no bandwidth available.
>> */
>> - if (!cpumask_subset(span, p->cpus_ptr) ||
>> - rq->rd->dl_bw.bw == 0) {
>> + if (!cpumask_subset(rd->span, p->cpus_ptr) ||
>> + rd->dl_bw.bw == 0) {
>> retval = -EPERM;
>> goto unlock;
>> }
>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>> index c221e14d5b86..1f6264cb8867 100644
>> --- a/kernel/sched/deadline.c
>> +++ b/kernel/sched/deadline.c
>> @@ -2678,8 +2678,8 @@ int sched_dl_overflow(struct task_struct *p, int policy,
>> u64 period = attr->sched_period ?: attr->sched_deadline;
>> u64 runtime = attr->sched_runtime;
>> u64 new_bw = dl_policy(policy) ? to_ratio(period, runtime) : 0;
>> - int cpus, err = -1, cpu = task_cpu(p);
>> - struct dl_bw *dl_b = dl_bw_of(cpu);
>> + int cpus, err = -1, cpu = dl_task_cpu(p);
>> + struct dl_bw *dl_b = dl_task_root_bw(p);
>> unsigned long cap;
>>
>> if (attr->sched_flags & SCHED_FLAG_SUGOV)
>>
>
> Wouldn't it be sufficient to just introduce dl_task_cpu() and use the
> correct cpu to get rd->span or struct dl_bw in __sched_setscheduler()
> -> sched_dl_overflow()?
While I think that having dl_task_rd() makes the code more intuitive, I have
no problem with not adding it...
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 5961a97541c2..0573f676696a 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5905,7 +5905,8 @@ static int __sched_setscheduler(struct task_struct *p,
> #ifdef CONFIG_SMP
> if (dl_bandwidth_enabled() && dl_policy(policy) &&
> !(attr->sched_flags & SCHED_FLAG_SUGOV)) {
> - cpumask_t *span = rq->rd->span;
> + int cpu = dl_task_cpu(p);
> + cpumask_t *span = cpu_rq(cpu)->rd->span;
>
> /*
> * Don't allow tasks with an affinity mask smaller than
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index c221e14d5b86..308ecaaf3d28 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2678,7 +2678,7 @@ int sched_dl_overflow(struct task_struct *p, int policy,
> u64 period = attr->sched_period ?: attr->sched_deadline;
> u64 runtime = attr->sched_runtime;
> u64 new_bw = dl_policy(policy) ? to_ratio(period, runtime) : 0;
> - int cpus, err = -1, cpu = task_cpu(p);
> + int cpus, err = -1, cpu = dl_task_cpu(p);
> struct dl_bw *dl_b = dl_bw_of(cpu);
> unsigned long cap;
>
This way works for me too.
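Just for completeness, dl_task_cpu() would be something along these lines
(rough sketch, not compile-tested):

/*
 * Sketch: for a blocked (suspended) task, task_cpu() can point to a CPU
 * that is no longer in the task's affinity mask (e.g. the task was moved
 * into an exclusive cpuset while sleeping), so the rd/dl_bw derived from
 * it may belong to the wrong root domain. Fall back to a CPU the task is
 * actually allowed to run on in that case.
 */
static inline int dl_task_cpu(struct task_struct *p)
{
	int cpu = task_cpu(p);

	if (cpumask_test_cpu(cpu, p->cpus_ptr))
		return cpu;

	return cpumask_any(p->cpus_ptr);
}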
Thanks!
-- Daniel