linux-kernel - Re: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed

Message-ID: <ada6743c-a212-39ef-f206-fc81ed4492ef@arm.com>
Date:   Tue, 24 Sep 2019 17:12:19 +0100
From:   Valentin Schneider <valentin.schneider@....com>
To:     Dietmar Eggemann <dietmar.eggemann@....com>,
        shikemeng <shikemeng@...wei.com>, mingo@...hat.com,
        peterz@...radead.org
Cc:     linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched: fix migration to invalid cpu in
 __set_cpus_allowed_ptr

On 24/09/2019 15:09, Dietmar Eggemann wrote:
> On 9/23/19 6:06 PM, Valentin Schneider wrote:
>> On 23/09/2019 16:43, Dietmar Eggemann wrote:
>>> I'm not sure that CONFIG_DEBUG_PER_CPU_MAPS=y will help you here.
>>>
>>> __set_cpus_allowed_ptr(...)
>>> {
>>>     ...
>>>     dest_cpu = cpumask_any_and(...)
>>>     ...
>>> }
>>>
>>> With:
>>>
>>> #define cpumask_any_and(mask1, mask2) cpumask_first_and((mask1), (mask2))
>>> #define cpumask_first_and(src1p, src2p) cpumask_next_and(-1, (src1p), (src2p))
>>>
>>> cpumask_next_and() is called with n = -1 and in this case does not
>>> invoke cpumask_check().
>>>
>>
>> It won't warn here because it's still a valid return value, but it should
>> warn in the cpumask_test_cpu() that follows (in is_cpu_allowed()) because
>> it would be passed a value >= nr_cpu_ids. So at the very least this config
>> does catch cpumask_any*() return values being blindly passed to
>> cpumask_test_cpu().
> 
> OK, I see and agree.
> 
> But IMHO, we still don't call cpumask_test_cpu(dest_cpu, ...), right?
> 
> What the patch fixes is that it closes the window between two reads of
> cpu_active_mask in which cpuhp can potentially punch a hole into the
> cpu_active_mask.
> 
> If p is not running or queued and its state is unequal to TASK_WAKING,
> a 'dest_cpu == nr_cpu_ids' goes unnoticed.

In this case we don't need to force it off to another CPU, since that will
get sorted out at its next wakeup. However, the patch still catches that,
since it does an early

if (dest_cpu >= nr_cpu_ids) {
        ret = -EINVAL;
        goto out;
}

and that's regardless of the task's state.
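
To illustrate the point, here is a minimal userspace sketch of that early
check. It is not the kernel code: CPU masks are modelled as plain unsigned
int bitmasks, and NR_CPU_IDS, mask_first_and() and pick_dest_cpu() are
hypothetical names made up for this example. The only behaviour borrowed
from the kernel is the convention that a "first CPU in both masks" lookup
returns a value >= the number of CPU ids when the intersection is empty.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model: CPUs are bits in an unsigned int, 8 possible ids. */
#define NR_CPU_IDS 8u

/*
 * Return the lowest CPU set in both masks, or NR_CPU_IDS if the
 * intersection is empty (modelled on the cpumask_first_and() convention).
 */
static unsigned int mask_first_and(unsigned int a, unsigned int b)
{
	unsigned int both = a & b;
	unsigned int cpu;

	/* The model only works if every CPU id fits in the mask type. */
	assert(NR_CPU_IDS <= 8 * sizeof(unsigned int));

	for (cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (both & (1u << cpu))
			return cpu;

	return NR_CPU_IDS;
}

/*
 * Pick a destination CPU from a single snapshot of the valid mask and
 * reject an out-of-range result early, whatever the task's state is.
 */
static int pick_dest_cpu(unsigned int valid_mask, unsigned int new_mask,
			 unsigned int *dest_cpu)
{
	unsigned int cpu = mask_first_and(valid_mask, new_mask);

	if (cpu >= NR_CPU_IDS)
		return -EINVAL;	/* no usable CPU: bail out before migrating */

	*dest_cpu = cpu;
	return 0;
}
```

With a valid mask of 0x0f and a requested mask of 0x0c, pick_dest_cpu()
selects CPU 2. If the valid mask has shrunk to 0x03 by the time it is read,
the intersection is empty and the call returns -EINVAL instead of handing
an out-of-range CPU number to the migration path.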

> Otherwise we see an 'unable
> to handle kernel paging request' or 'unable to handle page fault for
> address' bug in migration_cpu_stop() or move_queued_task().
> 
> Do I miss something?
> 
> [...]
>