Message-ID: <c6a673d4-ee16-4458-bf68-8f75d5062984@arm.com>
Date: Tue, 20 Aug 2024 17:43:32 +0100
From: Hongyan Xia <hongyan.xia2@....com>
To: Peter Zijlstra <peterz@...radead.org>, mingo@...hat.com,
 juri.lelli@...hat.com, vincent.guittot@...aro.org, dietmar.eggemann@....com,
 rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
 vschneid@...hat.com, linux-kernel@...r.kernel.org
Cc: kprateek.nayak@....com, wuyun.abel@...edance.com,
 youssefesmat@...omium.org, tglx@...utronix.de, efault@....de
Subject: Re: [PATCH 00/24] Complete EEVDF

Hi Peter,

On 27/07/2024 11:27, Peter Zijlstra wrote:
> Hi all,
> 
> So after much delay this is hopefully the final version of the EEVDF patches.
> They've been sitting in my git tree for ever it seems, and people have been
> testing it and sending fixes.
> 
> I've spend the last two days testing and fixing cfs-bandwidth, and as far
> as I know that was the very last issue holding it back.
> 
> These patches apply on top of queue.git sched/dl-server, which I plan on merging
> in tip/sched/core once -rc1 drops.
> 
> I'm hoping to then merge all this (+- the DVFS clock patch) right before -rc2.
> 
> 
> Aside from a ton of bug fixes -- thanks all! -- new in this version is:
> 
>   - split up the huge delay-dequeue patch
>   - tested/fixed cfs-bandwidth
>   - PLACE_REL_DEADLINE -- preserve the relative deadline when migrating
>   - SCHED_BATCH is equivalent to RESPECT_SLICE
>   - propagate min_slice up cgroups
>   - CLOCK_THREAD_DVFS_ID
> 

The latest tip/sched/core at commit

aef6987d89544d63a47753cf3741cabff0b5574c

crashes very early on my Juno r2 board (arm64). The trace is below:

[    0.049599] ------------[ cut here ]------------
[    0.054279] kernel BUG at kernel/sched/deadline.c:63!
[    0.059401] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
[    0.066285] Modules linked in:
[    0.069382] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.11.0-rc1-g55404cef33db #1070
[    0.077855] Hardware name: ARM Juno development board (r2) (DT)
[    0.083856] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.090919] pc : enqueue_dl_entity+0x53c/0x540
[    0.095434] lr : dl_server_start+0xb8/0x10c
[    0.099679] sp : ffffffc081ca3c30
[    0.103034] x29: ffffffc081ca3c40 x28: 0000000000000001 x27: 0000000000000002
[    0.110281] x26: 00000000000b71b0 x25: 0000000000000000 x24: 0000000000000001
[    0.117525] x23: ffffff897ef21140 x22: 0000000000000000 x21: 0000000000000000
[    0.124770] x20: ffffff897ef21040 x19: ffffff897ef219a8 x18: ffffffc080d0ad00
[    0.132015] x17: 000000000000002f x16: 0000000000000000 x15: ffffffc081ca8000
[    0.139260] x14: 00000000016ef200 x13: 00000000000e6667 x12: 0000000000000001
[    0.146505] x11: 000000003b9aca00 x10: 0000000002faf080 x9 : 0000000000000030
[    0.153749] x8 : 0000000000000071 x7 : 000000002cf93d25 x6 : 000000002cf93d25
[    0.160994] x5 : ffffffc081e04938 x4 : ffffffc081ca3d40 x3 : 0000000000000001
[    0.168238] x2 : 000000003b9aca00 x1 : 0000000000000001 x0 : ffffff897ef21040
[    0.175483] Call trace:
[    0.177958]  enqueue_dl_entity+0x53c/0x540
[    0.182117]  dl_server_start+0xb8/0x10c
[    0.186010]  enqueue_task_fair+0x5c8/0x6ac
[    0.190165]  enqueue_task+0x54/0x1e8
[    0.193793]  wake_up_new_task+0x250/0x39c
[    0.197862]  kernel_clone+0x140/0x2f0
[    0.201578]  user_mode_thread+0x4c/0x58
[    0.205468]  rest_init+0x24/0xd8
[    0.208743]  start_kernel+0x2bc/0x2fc
[    0.212460]  __primary_switched+0x80/0x88
[    0.216535] Code: b85fc3a8 7100051f 54fff8e9 17ffffce (d4210000)
[    0.222711] ---[ end trace 0000000000000000 ]---
[    0.227391] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.234187] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

I'm not an expert on the DL server, so I have no idea where the problem
could be. If you know where to look off the top of your head, so much the
better. If not, I'll do a bisection later.

Hongyan
