linux-kernel - Re: Which came first, hard kernel lockup or SATA errors?

Date:   Tue, 10 Oct 2017 07:04:19 -0700
From:   Ed Swierk <eswierk@...portsystems.com>
To:     eswierk@...portsystems.com, linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>
Subject: Re: Which came first, hard kernel lockup or SATA errors?

Continuing the conversation with the voices in my head...

On Mon, Oct 9, 2017 at 10:45 PM, Ed Swierk <eswierk@...portsystems.com> wrote:
> Based on the addresses in the stack and registers, here's what I think
> happened.
> 
> On cpu 13:
> 
> - task_numa_fault() calls task_numa_migrate(), which selects the task
>   on cpu 0 as the dst_task.
> - migrate_swap() calls stop_two_cpus(), which acquires the cpu_stopper
>   locks for the dst_cpu (cpu 0, at 0xffff881033fce600) and src_cpu
>   (cpu X, at 0xffff8820341ce600).
> - stop_two_cpus() calls wake_up_process() on the lower-numbered cpu
>   first, which has to be cpu 0.
> - wake_up_process() spins until the cpu 0 task (at 0xffff88102cc8dc00)
>   is no longer on_cpu.
> 
> On cpu 0:
> 
> - pick_next_task_fair() calls idle_balance(). According to the "This
>   is OK" comment, current is on_cpu at this point.
> - idle_balance() calls load_balance() for dst_cpu 0.
> - load_balance() decides to move a task from cpu X, so calls
>   stop_one_cpu_nowait() on cpu X.
> - stop_one_cpu_nowait() spins trying to acquire the cpu_stopper lock
>   for cpu X (at 0xffff8820341ce600).
> 
> So idle_balance() on cpu 0 is stuck waiting for task_numa_fault() to
> move a task to cpu 0, which is blocked on idle_balance() completing.

Also, it appears that task_numa_fault() tries to migrate current, so
the src_cpu X used by task_numa_migrate() is cpu 13 in this case. The
key issue, though, is that both task_numa_migrate() and idle_balance()
are trying to stop the same cpu, regardless of whether it's the cpu
task_numa_migrate() is running on.
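
To make the circular wait explicit, here is a small userspace model of
the scenario described above. It is only a sketch and not kernel code:
the enum names, the two tables and has_wait_cycle() are hypothetical
names invented for illustration, and the "on_cpu" state of the cpu 0
task is modelled as if it were a resource held by cpu 0.

```c
#include <stdbool.h>

/* Hypothetical model: the two cpus involved in the reported lockup. */
enum actor { CPU13, CPU0, NACTORS };

/* Hypothetical model: the things each side holds or spins on. */
enum resource { STOPPER_CPU0, STOPPER_CPUX, CPU0_ON_CPU, NRES };

/* Who holds each resource: cpu 13 took both stopper locks in
 * stop_two_cpus(); cpu 0 "holds" on_cpu until it finishes scheduling. */
const int reported_holder[NRES] = { CPU13, CPU13, CPU0 };

/* What each actor spins on: cpu 13 waits for the cpu 0 task to go
 * off-cpu; cpu 0 waits for the cpu X stopper lock. -1 means "nothing". */
const int reported_waits_for[NACTORS] = { CPU0_ON_CPU, STOPPER_CPUX };

/* Follow "waits for -> held by" edges from start. Returns true if the
 * chain leads back to start within nactors hops, i.e. a deadlock. */
bool has_wait_cycle(const int *holder, const int *waits_for,
                    int nactors, int start)
{
    int a = start;

    for (int hop = 0; hop < nactors; hop++) {
        int r = waits_for[a];
        if (r < 0)
            return false;   /* this actor is not blocked */
        a = holder[r];
        if (a < 0)
            return false;   /* resource is free, wait will end */
        if (a == start)
            return true;    /* came back around: circular wait */
    }
    return false;
}
```

Calling has_wait_cycle(reported_holder, reported_waits_for, NACTORS,
CPU13) follows cpu 13 -> on_cpu -> cpu 0 -> stopper lock X -> cpu 13
and reports a cycle.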

So I'm wondering how this situation could be prevented.

Can task_numa_migrate() avoid picking a dst_task that might itself
try to stop either src_cpu or dst_cpu?

Or, can load_balance() avoid a cpu that might be stopped for migration
(or any other reason), or detect such a conflict and bail out rather
than spinning forever?
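
As a rough illustration of the "bail out rather than spinning forever"
idea, here is a userspace sketch of a bounded lock attempt using C11
atomics. It is an assumption-laden model, not a proposed patch:
struct model_lock, model_trylock(), model_unlock() and lock_or_bail()
are made-up names, and whether load_balance() could actually tolerate
a failed attempt on the cpu_stopper lock is exactly the open question.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock such as the cpu_stopper lock. */
struct model_lock {
    atomic_flag held;
};

/* Single attempt: returns true if we took the lock, false if busy. */
bool model_trylock(struct model_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

void model_unlock(struct model_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Try at most max_spins times, then give up so the caller can back
 * off (e.g. skip this balancing pass) instead of spinning forever. */
bool lock_or_bail(struct model_lock *l, unsigned int max_spins)
{
    for (unsigned int i = 0; i < max_spins; i++) {
        if (model_trylock(l))
            return true;
    }
    return false;
}
```

With a lock initialised as { ATOMIC_FLAG_INIT }, the first
lock_or_bail() succeeds; a second call while the lock is still held
returns false after max_spins attempts, which is the point where the
caller would abandon the operation.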

--Ed