Message-ID: <f269370f-d72c-08e3-da06-3cd70cdf75f2@redhat.com>
Date:   Fri, 20 Jan 2023 13:10:38 -0500
From:   Waiman Long <longman@...hat.com>
To:     Will Deacon <will@...nel.org>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Valentin Schneider <vschneid@...hat.com>,
        Tejun Heo <tj@...nel.org>, Zefan Li <lizefan.x@...edance.com>,
        Johannes Weiner <hannes@...xchg.org>,
        linux-kernel@...r.kernel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Lai Jiangshan <jiangshanlai@...il.com>, qperret@...gle.com
Subject: Re: [PATCH v10 2/5] sched: Use user_cpus_ptr for saving user provided
 cpumask in sched_setaffinity()

On 1/20/23 12:59, Will Deacon wrote:
> Hey Waiman,
>
> Cheers for the quick reply.
>
> On Tue, Jan 17, 2023 at 01:13:31PM -0500, Waiman Long wrote:
>> On 1/17/23 11:08, Will Deacon wrote:
>>> On Thu, Sep 22, 2022 at 02:00:38PM -0400, Waiman Long wrote:
>>>> The user_cpus_ptr field is added by commit b90ca8badbd1 ("sched:
>>>> Introduce task_struct::user_cpus_ptr to track requested affinity"). It
>>>> is currently used only by arm64 arch due to possible asymmetric CPU
>>>> setup. This patch extends its usage to save user provided cpumask
>>>> when sched_setaffinity() is called for all arches. With this patch
>>>> applied, user_cpus_ptr, once allocated after a successful call to
>>>> sched_setaffinity(), will only be freed when the task exits.
> [...]
>
>>> We've tracked this down as the cause of an arm64 regression in Android and I've
>>> reproduced the issue with mainline.
>>>
>>> Basically, if an arm64 system is booted with "allow_mismatched_32bit_el0" on
>>> the command-line, then the arch code will (amongst other things) call
>>> force_compatible_cpus_allowed_ptr() and relax_compatible_cpus_allowed_ptr()
>>> when exec()'ing a 32-bit or a 64-bit task respectively.
>> IOW, relax_compatible_cpus_allowed_ptr() can be called without a previous
>> force_compatible_cpus_allowed_ptr(). Right?
> In practice, these functions are only called by arm64 during exec. As above,
> exec()'ing a 32-bit task calls force_compatible_cpus_allowed_ptr() and
> exec()'ing a 64-bit task calls relax_compatible_cpus_allowed_ptr(). So
> they don't come in pairs at all; it's just that calling relax_[...] should
> try to restore the affinity mask if it was previously clobbered by
> force_[...].
>
That was what I thought.
>> A possible optimization in this case is to add a bit flag in the task_struct
>> to indicate a previous call to force_compatible_cpus_allowed_ptr(). Without
>> that flag set, relax_compatible_cpus_allowed_ptr() can return immediately.
> How is this an optimisation over a pointer comparison?

The sched_setaffinity() patch repurposed user_cpus_ptr to hold the 
user-requested CPU affinity mask regardless of whether 
force_compatible_cpus_allowed_ptr() has been called. So checking 
user_cpus_ptr no longer works as an indicator of a prior 
force_compatible_cpus_allowed_ptr() call.
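
To make that concrete, here is a tiny userspace model of the ambiguity 
(hypothetical names and a simplified allocation, not actual kernel code):

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for the relevant task_struct field. */
struct task {
	unsigned long *user_cpus_ptr;	/* saved user-requested mask */
};

/*
 * After the sched_setaffinity() patch, a successful call also
 * allocates user_cpus_ptr to remember the user-provided mask...
 */
static void model_sched_setaffinity(struct task *p)
{
	if (!p->user_cpus_ptr)
		p->user_cpus_ptr = calloc(1, sizeof(unsigned long));
}

/*
 * ...so this pointer test can no longer distinguish "affinity was
 * forced by force_compatible_cpus_allowed_ptr()" from "the user
 * merely called sched_setaffinity()".
 */
static int looks_forced(const struct task *p)
{
	return p->user_cpus_ptr != NULL;
}
```

After model_sched_setaffinity() runs, looks_forced() reports true even though 
no forced restriction ever happened; that is the false positive the pointer 
comparison would produce.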


>>> I've had a crack at fixing the code above to restore the old behaviour, and it
>>> seems to work for my basic tests (still pending confirmation from others):
>>>
>>>
>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>> index bb1ee6d7bdde..0d4a11384648 100644
>>> --- a/kernel/sched/core.c
>>> +++ b/kernel/sched/core.c
>>> @@ -3125,17 +3125,16 @@ __sched_setaffinity(struct task_struct *p, struct affinity_context *ctx);
>>>    void relax_compatible_cpus_allowed_ptr(struct task_struct *p)
>>>    {
>>>           struct affinity_context ac = {
>>> -               .new_mask  = task_user_cpus(p),
>>> +               .new_mask  = p->user_cpus_ptr,
>>>                   .flags     = 0,
>>>           };
>>> -       int ret;
>>>           /*
>>>            * Try to restore the old affinity mask with __sched_setaffinity().
>>>            * Cpuset masking will be done there too.
>>>            */
>>> -       ret = __sched_setaffinity(p, &ac);
>>> -       WARN_ON_ONCE(ret);
>>> +       if (ac.new_mask)
>>> +               WARN_ON_ONCE(__sched_setaffinity(p, &ac));
>>>    }
>>>    void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
>>>
>>>
>>> With this change, task_user_cpus() is only used by restrict_cpus_allowed_ptr()
>>> so I'd be inclined to remove it altogether tbh.
>>>
>>> What do you think?
>> The problem here is that force_compatible_cpus_allowed_ptr() can be called
>> without a matching relax_compatible_cpus_allowed_ptr() at the end. So we may
>> end up artificially restricting the number of cpus that can be used when
>> running a 64-bit binary.
> Hmm, is this because an intervening call to sched_setaffinity() could've
> set ->user_cpus_ptr? If so, I'd have thought that would also point to a
> superset of the effective affinity -- is that not the case?
>
>> What do you think about the idea of having a bit flag to track that?
> I'm not hugely happy with that approach because it's adding additional state
> which is only needed for arm64, and only when operating in this funny
> asymmetric mode. I also don't understand how it would interact with the new
> sched_setaffinity() behaviour; would we need to clear the flag when that
> function updates the mask?
The new flag bit will be independent of the sched_setaffinity() call. It 
is set when restrict_cpus_allowed_ptr() is called and cleared in 
relax_compatible_cpus_allowed_ptr() if it was previously set. I will 
post a patch for your evaluation.
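
A minimal userspace sketch of that lifecycle (the flag and function names are 
placeholders for illustration, not the eventual patch):

```c
#include <stdbool.h>

/* Hypothetical per-task state modeling the proposed flag bit. */
struct task {
	bool affinity_forced;	/* set by restrict_cpus_allowed_ptr() */
};

/* Called when exec()'ing a 32-bit task on an asymmetric system. */
static void model_restrict_cpus_allowed(struct task *p)
{
	/* ... narrow the affinity to 32-bit-capable CPUs ... */
	p->affinity_forced = true;
}

/*
 * Called when exec()'ing a 64-bit task. Only tries to restore the
 * wider affinity if a restriction actually happened, so a relax
 * without a prior force is a cheap no-op.
 */
static void model_relax_compatible_cpus_allowed(struct task *p)
{
	if (!p->affinity_forced)
		return;		/* nothing to undo */
	p->affinity_forced = false;
	/* ... restore the wider affinity mask ... */
}
```

With something like this, relax_compatible_cpus_allowed_ptr() can return 
immediately for tasks that were never restricted, without consulting 
user_cpus_ptr at all.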
>
> Since I'm basically trying to re-instate the v6.1 behaviour to fix the arm64
> regression, I'm happy to review/test any proposal you have, but as we get
> closer to the 6.2 release I'm wondering whether it would make more sense to
> revert the sched_setaffinity() changes for now and I can help you with arm64
> review and testing if we bring the changes back for e.g. 6.4.

The purpose of the bit flag is to reinstate the v6.1 behavior.

Cheers,
Longman

>
> Will
>
