lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 17 Sep 2019 10:41:35 +0530
From:   Pavan Kondeti <pkondeti@...eaurora.org>
To:     KeMeng Shi <shikemeng@...wei.com>
Cc:     mingo@...hat.com, peterz@...radead.org, valentin.schneider@....com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] sched: fix migration to invalid cpu in
 __set_cpus_allowed_ptr

On Mon, Sep 16, 2019 at 06:53:28AM +0000, KeMeng Shi wrote:
> Oops occur when running qemu on arm64:
>  Unable to handle kernel paging request at virtual address ffff000008effe40
>  Internal error: Oops: 96000007 [#1] SMP
>  Process migration/0 (pid: 12, stack limit = 0x00000000084e3736)
>  pstate: 20000085 (nzCv daIf -PAN -UAO)
>  pc : __ll_sc___cmpxchg_case_acq_4+0x4/0x20
>  lr : move_queued_task.isra.21+0x124/0x298
>  ...
>  Call trace:
>   __ll_sc___cmpxchg_case_acq_4+0x4/0x20
>   __migrate_task+0xc8/0xe0
>   migration_cpu_stop+0x170/0x180
>   cpu_stopper_thread+0xec/0x178
>   smpboot_thread_fn+0x1ac/0x1e8
>   kthread+0x134/0x138
>   ret_from_fork+0x10/0x18
> 
> __set_cpus_allowed_ptr will choose an active dest_cpu in affinity mask to
> migrage the process if process is not currently running on any one of the
> CPUs specified in affinity mask. __set_cpus_allowed_ptr will choose an
> invalid dest_cpu (dest_cpu >= nr_cpu_ids, 1024 in my virtual machine) if 
> CPUS in an affinity mask are deactived by cpu_down after cpumask_intersects
> check. cpumask_test_cpu of dest_cpu afterwards is overflow and may pass if
> corresponding bit is coincidentally set. As a consequence, kernel will
> access an invalid rq address associate with the invalid cpu in
> migration_cpu_stop->__migrate_task->move_queued_task and the Oops occurs.
> 
> Process as follows may trigger the Oops:
> 1) A process repeatedly binds itself to cpu0 and cpu1 in turn by calling
> sched_setaffinity.
> 2) A shell script repeatedly "echo 0 > /sys/devices/system/cpu/cpu1/online"
> and "echo 1 > /sys/devices/system/cpu/cpu1/online" in turn.
> 3) Oops appears if the invalid cpu is set in memory after tested cpumask.
> 
> Signed-off-by: KeMeng Shi <shikemeng@...wei.com>
> Reviewed-by: Valentin Schneider <valentin.schneider@....com>
> ---
> Changes in v2:
> -solve format problems in log
> 
>  kernel/sched/core.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 3c7b90bcbe4e..087f4ac30b60 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1656,7 +1656,8 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
>  	if (cpumask_equal(p->cpus_ptr, new_mask))
>  		goto out;
>  
> -	if (!cpumask_intersects(new_mask, cpu_valid_mask)) {
> +	dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
> +	if (dest_cpu >= nr_cpu_ids) {
>  		ret = -EINVAL;
>  		goto out;
>  	}
> @@ -1677,7 +1678,6 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
>  	if (cpumask_test_cpu(task_cpu(p), new_mask))
>  		goto out;
>  
> -	dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
>  	if (task_running(rq, p) || p->state == TASK_WAKING) {
>  		struct migration_arg arg = { p, dest_cpu };
>  		/* Need help from migration thread: drop lock and wait. */
> -- 
> 2.19.1
> 
> 

The cpu_active_mask might have changed in between. Your fix looks good to me.

Thanks,
Pavan

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ