lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Date:   Thu, 27 Feb 2020 12:45:11 +0800
From:   Aubrey Li <aubrey.intel@...il.com>
To:     Vineeth Remanan Pillai <vpillai@...italocean.com>
Cc:     Aaron Lu <aaron.lwe@...il.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Julien Desfossez <jdesfossez@...italocean.com>,
        Nishanth Aravamudan <naravamudan@...italocean.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Paul Turner <pjt@...gle.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        Dario Faggioli <dfaggioli@...e.com>,
        Frédéric Weisbecker <fweisbec@...il.com>,
        Kees Cook <keescook@...omium.org>,
        Greg Kerr <kerrnel@...gle.com>, Phil Auld <pauld@...hat.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [RFC PATCH v4 00/19] Core scheduling v4

On Thu, Feb 27, 2020 at 5:54 AM Vineeth Remanan Pillai
<vpillai@...italocean.com> wrote:
>
> Hi Aubrey,
>>
>>
>> Thanks for the quick fix. I guess this patch should work for Aaron's case
>> because it almost removed the cookie checking here. Some comments
>> below:
>>
>> + if (rq->core->core_forceidle)
>> +         return true;
>>
>> We check cookie match during load balance to avoid two different
>> cookie tasks on the same core. If one cpu is forced idle, its sibling should
>> be running a non-idle task, not sure why we can return true directly here.
>> And if we need to check cookie, with this forced idle case, why it's special
>> to be picked up?
>
> The idea behind this was: If a core is forced idle, then we have a
> scenario where one task is running without a companion and one task
> is forced to be idle. So if this source task is of the same cookie
> as the forced idle task, then migrating this source task will offer a
> chance for the force idle task to run together with task in subsequent
> schedules. We don't have a framework to check the cookie of forced
> idle tasks and hence this if block is only partial fix.

To mitigate force idle, this helper is trying to prevent unmatched case, not
catch the matched case. If the task's cookie is matched with forced idle
cpu's sibling, it's able to pass the check later.  Comparing to my original
intention, this fix increases unmatched case as it returns immediately and
we don't check cookie later.

>>
>>
>> + if (task_h_load(p) > task_h_load(rq->curr))
>> +         return true;
>> + if (task_util_est(p) > task_util_est(rq->curr))
>> +         return true;
>>
>> These two need to exclude rq->curr == rq->idle case?
>> otherwise idle balance always return true?
>
> rq->curr being NULL can mean that the sibling is idle or forced idle.
> In both the cases, I think it makes sense to migrate a task so that it can
> compete with the other sibling for a chance to run. This function
> can_migrate_task actually only says if this task is eligible and
> later part of the code decides whether it is okay to migrate it
> based on factors like load and util and capacity. So I think its
> fine to declare the task as eligible if the dest core is running
> idle. Does this thinking make sense?
>
The intention here is stopping migration not allowing migration.
task_h_load(idle) is zero, but idle_task's sibling CPU is running
a task with cookieA, the fix returns immediately when
task_h_load(task cookieB) > 0, this puts another unmatched case in.

> On our testing, it did not show much degradation in performance with
> this change. I am reworking the fix by removing the check for
> task_est_util. It doesn't seem to be valid to check for util to migrate
> the task.

I'm not sure the check for task_h_load is exactly wanted here either.
>From my understanding, the problem is because we stopped high
weight task migration because its cookie does not match with the target's,
can we check high load here instead of high weight?

Thanks,
-Aubrey

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ