lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <18533ecd-0000-47f7-df24-f4f7ed45862e@gmail.com>
Date: Thu, 28 Nov 2024 20:18:10 +0800
From: Hao Jia <jiahao.kernel@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...hat.com, mingo@...nel.org, juri.lelli@...hat.com,
 vincent.guittot@...aro.org, dietmar.eggemann@....com, rostedt@...dmis.org,
 bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com,
 vschneid@...hat.com, linux-kernel@...r.kernel.org,
 Hao Jia <jiahao1@...iang.com>
Subject: Re: [PATCH] sched/core: Do not migrate ineligible tasks in
 sched_balance_rq()



On 2024/11/28 19:47, Peter Zijlstra wrote:
> On Thu, Nov 28, 2024 at 07:30:37PM +0800, Hao Jia wrote:
>>
>>
>> On 2024/11/28 17:19, Peter Zijlstra wrote:
>>> On Thu, Nov 28, 2024 at 04:48:58PM +0800, Hao Jia wrote:
>>>> From: Hao Jia <jiahao1@...iang.com>
>>>>
>>>> When the PLACE_LAG scheduling feature is enabled, if a task
>>>> is ineligible (lag < 0) on the source cpu runqueue, it will
>>>> also be ineligible when it is migrated to the destination
>>>> cpu runqueue.
>>>>
>>>> Because we will keep the original equivalent lag of
>>>> the task in place_entity(). So if the task was ineligible
>>>> before, it will still be ineligible after migration.
>>>
>>> This is not accurate, it will be eleigible, irrespective of lag, if
>>> there are no other tasks. I think your patch tries to do this, but I'm
>>> fairly sure it got it wrong.
>>
>> Thank you for your reply. The expression in my commit message is inaccurate,
>> and I will correct it in the patch v2. If I understand correctly, a task
>> meeting the following conditions:
>>
>>   sched_feat(PLACE_LAG) && cfs_rq->nr_running && !entity_eligible(cfs_rq,
>> &p->se),
>>
>> will remain ineligible both before and after migration.
>>
>> If I am wrong, please correct me. Thank you!
> 
> Problem is you're checking the wrong nr_running.

Oops, that's my mistake. Thank you very much for pointing it out.

I should be focusing on the nr_running of the destination cfs_rq.

Thanks,
Hao

> 
>>>> @@ -9358,13 +9358,14 @@ static inline int migrate_degrades_locality(struct task_struct *p,
>>>>    static
>>>>    int can_migrate_task(struct task_struct *p, struct lb_env *env)
>>>>    {
>>>> +	struct cfs_rq *cfs_rq = task_cfs_rq(p);
> 
> This is task's current cfs_rq. What you're interested in is destination
> cfs_rq. If the destination is empty, then lag is irrelevant.
> 
> You want something like:
> 
> #if CONFIG_FAIR_GROUP_SCHED
> 	struct task_group *tg = task_group(p);
> 	struct cfs_rq *dst_cfs_rq = tg->cfs_rq[env->dst_cpu];
> #else
> 	struct cfs_rq = &env->dst_rq->cfs_rq;
> #endif
> 
> 
> Also, please add benchmark details that show this actually makes a
> difference.
> 
> Notably we keep rq->cfs_tasks in MRU order; most recently ran task is
> head and balancing takes from the tail, the task longest not ran.
> 
> The task longest not ran should have build up eligibility.

Thank you for your suggestion. It might be as you said, that inefficient 
task migrations are relatively rare in load balancing. I will use some 
benchmarks from MMTests for testing.

Thanks,
Hao

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ