Date:	Mon, 30 Nov 2015 10:20:48 +0800
From:	Wanpeng Li <kernellwp@...il.com>
To:	Luca Abeni <luca.abeni@...tn.it>
Cc:	Juri Lelli <juri.lelli@....com>,
	Wanpeng li <wanpeng.li@...mail.com>,
	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3] sched/deadline: fix earliest_dl.next logic

2015-11-27 20:14 GMT+08:00 Luca Abeni <luca.abeni@...tn.it>:
> Hi all,
>
> I ran some quick tests on this patch (because I was working on something
> related), and it seems to me that it triggers a bug. Here is some
> information:
> - I applied the patch to git master (to be sure that I did not corrupt
>   the patch, I extracted it again with git format-patch, and posted it here:
>   http://disi.unitn.it/~abeni/H-Test/0001-sched-deadline-fix-earliest_dl.next-logic.patch )
> - I tried a simple test program creating 4 periodic threads and scheduling
> them with
>   SCHED_DEADLINE. And I got something like:
> [   29.055688] ------------[ cut here ]------------
> [   29.056563] kernel BUG at /home/luca/Src/Kernel/linux/kernel/sched/deadline.c:1443!
> [   29.056563] invalid opcode: 0000 [#1] SMP
> [   29.056563] Modules linked in:
> [   29.056563] CPU: 0 PID: 1136 Comm: periodic_thread Not tainted 4.4.0-rc2+ #3
> [   29.056563] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
> [   29.056563] task: ffff880006603d80 ti: ffff8800067f4000 task.ti: ffff8800067f4000
> [   29.056563] RIP: 0010:[<ffffffff8109704a>]  [<ffffffff8109704a>] pick_next_pushable_dl_task+0x5a/0x60
> [   29.056563] RSP: 0000:ffff8800067f7e38  EFLAGS: 00010046
> [   29.056563] RAX: ffff880006603d80 RBX: ffff880006a7ac50 RCX: ffff8800067f4000
> [   29.056563] RDX: ffff8800066040d0 RSI: 0000000000000000 RDI: ffff880007a15ac0
> [   29.056563] RBP: ffff8800067f7e38 R08: 0000000000000000 R09: 0000000000000000
> [   29.056563] R10: ffff880006603d80 R11: ffff880006604548 R12: ffff880007a15ac0
> [   29.056563] R13: ffff880006a7a900 R14: ffff8800066042e0 R15: ffffffff81a09cc0
> [   29.056563] FS:  00007f667e179700(0000) GS:ffff880007a00000(0000) knlGS:0000000000000000
> [   29.056563] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [   29.056563] CR2: 00007f71f3c221b8 CR3: 00000000067ae000 CR4: 00000000000006f0
> [   29.056563] Stack:
> [   29.056563]  ffff8800067f7e60 ffffffff81097137 ffff880007a15ac0 ffff880006a7ab20
> [   29.056563]  ffff880006a7a900 ffff8800067f7e90 ffffffff81098ef1 ffff880006603d80
> [   29.056563]  ffff880007a15ac0 ffff880007a15ad8 0000000000000000 ffff8800067f7ee8
> [   29.056563] Call Trace:
> [   29.056563]  [<ffffffff81097137>] dequeue_pushable_dl_task+0x47/0x90
> [   29.056563]  [<ffffffff81098ef1>] pick_next_task_dl+0x81/0x200
> [   29.056563]  [<ffffffff819521ad>] __schedule+0xb3d/0xd00
> [   29.056563]  [<ffffffff819523a7>] schedule+0x37/0x80
> [   29.056563]  [<ffffffff81050d46>] exit_to_usermode_loop+0x39/0x7f
> [   29.056563]  [<ffffffff81001a76>] prepare_exit_to_usermode+0x46/0x50
> [   29.056563]  [<ffffffff819586a5>] retint_user+0x8/0x20
> [   29.056563] Code: 3b 87 18 09 00 00 74 23 83 7a 98 01 7e 1b 83 ba f4 fc ff ff 01 75 10 8b 92 f8 fc ff ff 85 d2 78 04 0f 0b 31 c0 5d c3 0f 0b 0f 0b <0f> 0b 0f 0b 66 90 55 48 89 e5 41 54 49 89 f4 53 48 8b b7 18 09
> [   29.056563] RIP  [<ffffffff8109704a>] pick_next_pushable_dl_task+0x5a/0x60
> [   29.056563]  RSP <ffff8800067f7e38>
> [   29.056563] ---[ end trace 9f49218ec54a0b55 ]---
> - How to reproduce:
>   + Download http://disi.unitn.it/~abeni/H-Test/newcore-3.gz
>   + Start KVM with:
>         qemu-system-x86_64 -initrd newcore-3.gz -machine pc-q35-2.0,accel=kvm -smp cpus=2 -kernel bzImage -serial file:qlog.txt -append "console=ttyS0"
>   + Inside the VM, do:
>         cd /
>         sudo ./periodic_thread
>   + You should see a crash in a few seconds.
>   The "periodic_thread" program just creates 4 periodic threads (with an
>   execution time of about 10ms and a period of about 100ms) scheduled with
>   SCHED_DEADLINE (with a 15ms runtime and a 100ms period). I can provide
>   the source code if needed.
> - The full kernel log is available here:
>   http://disi.unitn.it/~abeni/H-Test/qlog.txt
>
>
> Here is my understanding of the crash:
> - schedule() invokes pick_next_task_dl(), which wants to do a context
>   switch (by selecting for execution a new task "p" that is different
>   from "prev").
> - pick_next_task_dl() invokes put_prev, which puts the "prev" task in the
>   pushable tasks queue (WARNING! "prev" is still the "current" task on
>   this rq, because the scheduler is still running... I think this is the
>   source of the issue).
> - Then, pick_next_task_dl() invokes dequeue_pushable_dl_task() on p, to
>   remove the selected task from the pushable tasks queue...
> - ...But after your patch, dequeue_pushable_dl_task() invokes
>   pick_next_pushable_dl_task(), which sees that the next pushable task is
>   the "current" task (see above). This happens because "prev" has already
>   been inserted in the pushable tasks queue, and can be the next pushable
>   task... but "current" has not been updated yet.
> - The BUG_ON() at line 1443 of deadline.c is just
>   "BUG_ON(task_current(rq, p))".

Thanks for your report and great analysis, Luca; you are right.

>
> Summing up, I think pick_next_pushable_dl_task() cannot be called from
> dequeue_pushable_dl_task() (at least, not without removing or modifying that
> BUG_ON()).

Juri, how about removing this BUG_ON(), just like the rt class?

Regards,
Wanpeng Li
