lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 31 Oct 2019 23:15:57 +0100
From:   Valentin Schneider <valentin.schneider@....com>
To:     Ram Muthiah <rammuthiah@...gle.com>,
        Quentin Perret <qperret@...gle.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org, aaron.lwe@...il.com,
        mingo@...nel.org, pauld@...hat.com, jdesfossez@...italocean.com,
        naravamudan@...italocean.com, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, juri.lelli@...hat.com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        kernel-team@...roid.com, john.stultz@...aro.org
Subject: Re: NULL pointer dereference in pick_next_task_fair

On 30/10/2019 23:50, Ram Muthiah wrote:
> Quentin and I were able to create a setup which reproduces the issue.
> 
> Given this, I tried Peter's proposed fix and was still able to reproduce the
> issue unfortunately. Current patch is located here -
> https://android-review.googlesource.com/c/kernel/common/+/1153487
> 
> Our mitigation for this issue on the android-mainline branch has been to
> revert 67692435c411 ("sched: Rework pick_next_task() slow-path").
> https://android-review.googlesource.com/c/kernel/common/+/1152564
> 

Still no cigar, but one thing to note is that we got a similar splat on a
Juno just yesterday, here it is fed through decode_stacktrace.sh:

[   22.930829] Mem abort info:
[   22.935904]   ESR = 0x96000006
[   22.938924]   EC = 0x25: DABT (current EL), IL = 32 bits
[   22.944178]   SET = 0, FnV = 0
[   22.947197]   EA = 0, S1PTW = 0
[   22.950300] Data abort info:
[   22.953145]   ISV = 0, ISS = 0x00000006
[   22.956937]   CM = 0, WnR = 0
[   22.959870] user pgtable: 4k pages, 48-bit VAs, pgdp=00000009f3905000
[   22.966243] [0000000000000040] pgd=00000009efa6e003, pud=00000009f41c3003, pmd=0000000000000000
[   22.974858] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[   22.980369] Modules linked in: tda998x drm_kms_helper drm crct10dif_ce ip_tables x_tables ipv6 nf_defrag_ipv6
[   22.990200] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.4.0-rc2-00001-gaa57157be69f #1
[   22.998036] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Oct 19 2018
[   23.008710] pstate: 60000085 (nZCv daIf -PAN -UAO)
[   23.013457] pc : set_next_entity (kernel/sched/fair.c:4156)
[   23.017511] lr : pick_next_task_fair (kernel/sched/fair.c:6829 (discriminator 1))
[   23.021991] sp : ffff800011a13e10
[   23.025267] x29: ffff800011a13e10 x28: 0000000000000000
[   23.030525] x27: ffff800011a13f10 x26: ffff800010c205ec
[   23.035782] x25: ffff000975cea108 x24: ffff800011789000
[   23.041038] x23: ffff8000113cf000 x22: ffff000975ce9b80
[   23.046294] x21: ffff00097ef78d40 x20: ffff000974e21a00
[   23.051550] x19: 0000000000000000 x18: 0000000000000000
[   23.056806] x17: 0000000000000000 x16: 0000000000000000
[   23.062062] x15: 0000000000000000 x14: 00000000000001ad
[   23.067317] x13: 0000000000000001 x12: 071c71c71c71c71c
[   23.072573] x11: 0000000000000500 x10: ffff00097ef77dc8
[   23.077829] x9 : 0000000000000087 x8 : ffff00097ef77de8
[   23.083085] x7 : 0000000000000003 x6 : 0000000000000000
[   23.088340] x5 : 0000000000000000 x4 : 0000000000000000
[   23.093596] x3 : 000000054f6e7800 x2 : 0000000000000000
[   23.098851] x1 : 0000000000000000 x0 : ffff000974e21a00
[   23.104107] Call trace:
[   23.106527] set_next_entity (kernel/sched/fair.c:4156)
[   23.110236] pick_next_task_fair (kernel/sched/fair.c:6829 (discriminator 1))
[   23.114377] __schedule (kernel/sched/core.c:3920 kernel/sched/core.c:4039)
[   23.117828] schedule_idle (kernel/sched/core.c:4165 (discriminator 1))
[   23.121365] do_idle (./arch/arm64/include/asm/current.h:19 ./arch/arm64/include/asm/preempt.h:31 kernel/sched/idle.c:275)
[   23.124557] cpu_startup_entry (kernel/sched/idle.c:355 (discriminator 1))
[   23.128439] secondary_start_kernel (arch/arm64/kernel/smp.c:262)

The faulty line is the very first se dereference in set_next_entity().
As Peter pointed out on IRC, this happens in the 'simple' path (prev is the
idle task).

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ