lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <56d9d6cc-021e-4bd5-7254-aa39c9845500@gmail.com>
Date: Wed, 15 Jan 2025 19:55:04 +0800
From: Hao Jia <jiahao.kernel@...il.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: mingo@...hat.com, peterz@...radead.org, mingo@...nel.org,
 juri.lelli@...hat.com, dietmar.eggemann@....com, rostedt@...dmis.org,
 bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com,
 linux-kernel@...r.kernel.org, Hao Jia <jiahao1@...iang.com>
Subject: Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in
 sched_balance_rq()



On 2025/1/15 17:28, Vincent Guittot wrote:
> On Wed, 15 Jan 2025 at 09:55, Hao Jia <jiahao.kernel@...il.com> wrote:
>>
>>
>>
>> On 2025/1/14 16:07, Vincent Guittot wrote:
>>> On Tue, 14 Jan 2025 at 04:18, Hao Jia <jiahao.kernel@...il.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2025/1/14 00:40, Vincent Guittot wrote:
>>>>> On Mon, 13 Jan 2025 at 10:21, Hao Jia <jiahao.kernel@...il.com> wrote:
>>>>>>
>>>>>> Friendly ping...
>>>>>>
>>>>>>
>>>>>> On 2024/12/23 17:14, Hao Jia wrote:
>>>>>>> From: Hao Jia <jiahao1@...iang.com>
>>>>>>>
>>>>>>> When the PLACE_LAG scheduling feature is enabled and
>>>>>>> dst_cfs_rq->nr_queued is greater than 1, if a task is
>>>>>>> ineligible (lag < 0) on the source cpu runqueue, it will
>>>>>>> also be ineligible when it is migrated to the destination
>>>>>>> cpu runqueue. Because we will keep the original equivalent
>>>>>>> lag of the task in place_entity(). So if the task was
>>>>>>> ineligible before, it will still be ineligible after
>>>>>>> migration.
>>>>>>>
>>>>>>> So in sched_balance_rq(), we prioritize migrating eligible
>>>>>>> tasks, and we soft-limit ineligible tasks, allowing them
>>>>>>> to migrate only when nr_balance_failed is non-zero to
>>>>>>> avoid load-balancing trying very hard to balance the load.
>>>>>
>>>>> Could you explain why you think it's better to balance eligible tasks
>>>>> in priority and potentially skip a load balance ?
>>>>
>>>> In place_entity(), we maintain the task's original equivalent lag, even
>>>> if we migrate the task to dst_rq, this does not change its eligibility
>>>> attribute.
>>>
>>> Yes, but you don't answer the question why it's better to select an
>>> eligible task vs a non eligible task.
>>>
>>>>
>>>> When there are multiple tasks on src_rq, and the dst_cpu has some
>>>> runnable tasks, migrating ineligible tasks to dst_rq will not allow them
>>>> to run. Therefore, such task migration is inefficient. We should
>>>
>>> Why is it inefficient ? load balance is about evenly balancing the
>>> number of tasks or the load between CPUs, it never says that the newly
>>> migrated task should run immediately
>>
>>
>> My initial thought is that when we need to migrate some tasks during
>> load balancing, at the current point in time, migrating ineligible tasks
>> to dst_cpu means they definitely cannot run there. Therefore, I prefer
>> to keep them on src_cpu to reduce the overhead of dequeueing and
>> enqueueing ineligible tasks.
> 
> Sorry but I still don't get why it's important and would make a
> difference. They are all runnable but ineligible tasks got more
> runtime than other at that point in time so there is no real
> difference


I adopt a lazy strategy for ineligible tasks. At the current point in 
time, even if we migrate ineligible tasks to the dst CPU, they still 
have to wait on the dst CPU until they become eligible. We do not see 
clear benefits from migrating ineligible tasks, but their dequeueing and 
enqueueing would instead incur overhead.

Let them wait on the src CPU until they become eligible before migrating 
them. this can reduce the number of task migrations.



Thanks,
Hao


> 
>>
>> Migrating eligible tasks to dst_cpu does not guarantee that they will
>> run earlier than on src_cpu. it depends on too many factors.
>>
>>
>>
>>>
>>>> prioritize migrating tasks that can run on dst_rq.
>>>>
>>>> In other words, migrating ineligible tasks is merely moving them to
>>>> another runqueue to wait until they become eligible.
>>>
>>> But I don't get why it's a problem. Migrating an eligible task might
>>> delay its scheduling because of its deadline vs other tasks already
>>> eligible on the dst_rq. Eligible and non eligible tasks are all
>>> runnable, it's just how much they have already run. In addition,
>>> migrating an eligible task will clear its positive vlag with
>>> DELAY_ZERO which is unfair IMO
>>
>>
>> Sorry, I'd like to ask you a question that confuses me: Why does
>> migrating eligible task will clear the positive vlag?
> 
> sorry I mess up everything that only for delayed dequeue task
> 
>>
>> In detach_task(), the ENQUEUE_DELAYED and DEQUEUE_SLEEP flags are not
>> set, and in dequeue_entity(), eligible tasks will not set sched_delayed,
>> so they will be dequeued normally with se->on_rq being 0.
>>
>> Similarly, attach_task() does not set the ENQUEUE_DELAYED and
>> DEQUEUE_SLEEP flags, and since se->on_rq is 0, it will not call
>> requeue_delayed_entity().
>>
>>
>> Thanks,
>> Hao


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ