Message-ID: <20161018154025.1e686cad@nial.brq.redhat.com>
Date:   Tue, 18 Oct 2016 15:40:25 +0200
From:   Igor Mammedov <imammedo@...hat.com>
To:     linux-kernel@...r.kernel.org
Cc:     mingo@...hat.com, peterz@...radead.org, tglx@...utronix.de,
        efault@....de, torvalds@...ux-foundation.org, imammedo@...hat.com
Subject: regression since 4.8 and newer in select_idle_siblings()

The kernel crashes at runtime due to a NULL pointer dereference in
  select_idle_sibling()
     -> select_idle_cpu()
         ...
         u64 avg_cost = this_sd->avg_scan_cost;
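
For illustration only, the kind of guard that would avoid the dereference
above can be sketched in user-space C. This is a sketch under assumptions,
not a proposed patch: struct sched_domain_mock and select_idle_cpu_sketch()
are hypothetical stand-ins, and only the avg_scan_cost field from the line
quoted above is modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct sched_domain.
 * Only the field read in the crashing line is modelled. */
struct sched_domain_mock {
    uint64_t avg_scan_cost;
};

/*
 * Sketch of a guarded read of this_sd->avg_scan_cost.
 * Returns -1 (meaning "no idle CPU found, skip the scan") when no
 * domain is attached, instead of dereferencing a NULL pointer.
 * Returns 0 and stores the cost in *avg_cost when a domain exists.
 */
int select_idle_cpu_sketch(const struct sched_domain_mock *this_sd,
                           uint64_t *avg_cost)
{
    if (!this_sd)
        return -1;          /* domain pointer may be NULL, e.g. after CPU offline */

    *avg_cost = this_sd->avg_scan_cost;
    return 0;
}
```

Calling select_idle_cpu_sketch(NULL, &cost) returns -1 and leaves cost
untouched, while a non-NULL domain yields 0 and the stored cost.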

The regression bisects to:
  commit 10e2f1acd0106c05229f94c70a344ce3a2c8008b
  Author: Peter Zijlstra <peterz@...radead.org>
  sched/core: Rewrite and improve select_idle_siblings()

To reproduce the crash at runtime, start a VM with:
 qemu-system-x86_64 [-enable-kvm] \
    -smp 4,sockets=2 \
    linux48_disk.img

and offline cpu1 in the guest:
 echo 0 > /sys/devices/system/cpu/cpu1/online

As a result, the guest panics immediately or after a short delay,
from whichever path ends up calling select_idle_sibling().


To reproduce the crash at boot, start a VM with a recent QEMU (2.7 or newer):
 qemu-2.7/qemu-system-x86_64 \
    -smp 1,sockets=2,cores=2,threads=1,maxcpus=4 \
    -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0 \
    -device qemu64-x86_64-cpu,socket-id=1,core-id=1,thread-id=0 \
    -kernel bzImage_v48 [-enable-kvm]


=== one of the panics ===
[    0.688680] BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
[    0.688685] IP: [<ffffffff810de382>] select_idle_sibling+0x172/0x3b0
[    0.688686] PGD 0 
[    0.688687] Oops: 0000 [#1] SMP
[    0.688690] CPU: 0 PID: 109 Comm: kworker/u8:2 Not tainted 4.8.0-rc8+ #675
[    0.688690] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
[    0.688694] Workqueue: events_unbound async_run_entry_fn
[    0.688695] task: ffff88007c258000 task.stack: ffff88007c3b0000
[    0.688697] RIP: 0010:[<ffffffff810de382>]  [<ffffffff810de382>] select_idle_sibling+0x172/0x3b0
[    0.688697] RSP: 0000:ffff88007c3b3bb0  EFLAGS: 00010007
[    0.688698] RAX: 000000000000051b RBX: 0000000000000004 RCX: 0000000000000001
[    0.688699] RDX: 0000000000000040 RSI: 0000000000000004 RDI: ffff88007d00a008
[    0.688699] RBP: ffff88007c3b3c10 R08: 0000000000000000 R09: 0000000000000000
[    0.688700] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000002
[    0.688700] R13: ffff88007d00a008 R14: 0000000000000000 R15: 0000000000000004
[    0.688701] FS:  0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
[    0.688702] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.688703] CR2: 0000000000000078 CR3: 0000000001c06000 CR4: 00000000000006f0
[    0.688705] Stack:
[    0.688707]  0000000000000001 ffff88007c80e480 000000000000a118 ffff88007c282900
[    0.688708]  0000000100000000 0000000000000002 0000000200000000 ffff88007c80e600
[    0.688709]  ffff88007c282900 0000000000018ec0 0000000000000000 0000000000000000
[    0.688710] Call Trace:
[    0.688712]  [<ffffffff810decd7>] select_task_rq_fair+0x717/0x730
[    0.688713]  [<ffffffff810e1ba7>] ? update_curr+0xc7/0x150
[    0.688715]  [<ffffffff810dc33c>] ? __enqueue_entity+0x6c/0x70
[    0.688718]  [<ffffffff810d5224>] try_to_wake_up+0x104/0x390
[    0.688719]  [<ffffffff810d5c15>] wake_up_process+0x15/0x20
[    0.688724]  [<ffffffff8153cc03>] scsi_eh_wakeup+0x33/0xa0
[    0.688725]  [<ffffffff8153ccbc>] scsi_schedule_eh+0x4c/0x60
[    0.688728]  [<ffffffff8156d76f>] ata_std_sched_eh+0x3f/0x60
[    0.688729]  [<ffffffff8156d7c3>] ata_port_schedule_eh+0x13/0x20
[    0.688730]  [<ffffffff815618d4>] __ata_port_probe+0x44/0x60
[    0.688731]  [<ffffffff81565fe0>] ata_port_probe+0x20/0x40
[    0.688732]  [<ffffffff8156602e>] async_port_probe+0x2e/0x60
[    0.688734]  [<ffffffff810cccc9>] async_run_entry_fn+0x39/0x140
[    0.688736]  [<ffffffff810c34d2>] process_one_work+0x152/0x400
[    0.688738]  [<ffffffff810c38a5>] worker_thread+0x125/0x4b0
[    0.688739]  [<ffffffff810c3780>] ? process_one_work+0x400/0x400
[    0.688740]  [<ffffffff810c9cb8>] kthread+0xd8/0xf0
[    0.688744]  [<ffffffff816c4e3f>] ret_from_fork+0x1f/0x40
[    0.688745]  [<ffffffff810c9be0>] ? __kthread_parkme+0x70/0x70
[    0.688757] Code: c7 c0 20 dd 00 00 65 48 03 05 c3 bd f2 7e 4c 8b 30 48 c7 c0 c0 8e 01 00 65 48 03 05 b1 bd f2 7e 48 8b 80 c8 09 00 00 48 c1 e8 09 <49> 39 46 78 0f 87 29 02 00 00 65 8b 3d 9d bd f2 7e e8 b8 c8 ff 
[    0.688758] RIP  [<ffffffff810de382>] select_idle_sibling+0x172/0x3b0
[    0.688759]  RSP <ffff88007c3b3bb0>
[    0.688759] CR2: 0000000000000078
[    0.688762] ---[ end trace f10266de945b1779 ]---
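
The faulting address can be cross-checked against the oops itself. In the
"Code:" line the byte in angle brackets is the first byte of the
instruction at RIP, so the faulting instruction starts with 49 39 46 78.
Below is a minimal sketch of a decoder for that one encoding shape
(REX.W 39 /r disp8, i.e. CMP r/m64, r64 with an 8-bit displacement).
The names decode_cmp_disp8() and struct cmp_mem_operand are hypothetical
helpers for this illustration, and the decoder deliberately rejects
anything else.

```c
#include <stdint.h>

/* Decoded operands of a "cmp %reg, disp8(%base)" instruction. */
struct cmp_mem_operand {
    int reg;      /* register operand number (0 = rax ... 15 = r15) */
    int base;     /* base register of the memory operand */
    int8_t disp;  /* signed 8-bit displacement */
};

/*
 * Decode exactly the 4-byte form: REX.W 39 /r disp8.
 * Returns 0 on success, -1 if the bytes do not have that shape.
 */
int decode_cmp_disp8(const uint8_t *b, struct cmp_mem_operand *out)
{
    uint8_t rex = b[0], opcode = b[1], modrm = b[2];

    if ((rex & 0xf8) != 0x48)   /* REX prefix with W=1 required */
        return -1;
    if (opcode != 0x39)         /* CMP r/m64, r64 */
        return -1;
    if ((modrm >> 6) != 1)      /* mod=01: [base + disp8] */
        return -1;
    if ((modrm & 7) == 4)       /* rm=100 needs a SIB byte; not handled */
        return -1;

    out->reg  = ((modrm >> 3) & 7) | ((rex & 0x04) ? 8 : 0); /* REX.R extends reg */
    out->base = (modrm & 7)        | ((rex & 0x01) ? 8 : 0); /* REX.B extends rm */
    out->disp = (int8_t)b[3];
    return 0;
}
```

For the bytes 49 39 46 78 this gives reg 0 (rax), base 14 (r14) and
displacement 0x78, i.e. cmp %rax,0x78(%r14). With R14 = 0 in the register
dump, the effective address is 0x78, which matches CR2. That is consistent
with reading a field at offset 0x78 through a NULL this_sd pointer.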
