Message-ID: <c6a673d4-ee16-4458-bf68-8f75d5062984@arm.com>
Date: Tue, 20 Aug 2024 17:43:32 +0100
From: Hongyan Xia <hongyan.xia2@....com>
To: Peter Zijlstra <peterz@...radead.org>, mingo@...hat.com,
juri.lelli@...hat.com, vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
vschneid@...hat.com, linux-kernel@...r.kernel.org
Cc: kprateek.nayak@....com, wuyun.abel@...edance.com,
youssefesmat@...omium.org, tglx@...utronix.de, efault@....de
Subject: Re: [PATCH 00/24] Complete EEVDF
Hi Peter,
On 27/07/2024 11:27, Peter Zijlstra wrote:
> Hi all,
>
> So after much delay this is hopefully the final version of the EEVDF patches.
> They've been sitting in my git tree for ever it seems, and people have been
> testing it and sending fixes.
>
> I've spend the last two days testing and fixing cfs-bandwidth, and as far
> as I know that was the very last issue holding it back.
>
> These patches apply on top of queue.git sched/dl-server, which I plan on merging
> in tip/sched/core once -rc1 drops.
>
> I'm hoping to then merge all this (+- the DVFS clock patch) right before -rc2.
>
>
> Aside from a ton of bug fixes -- thanks all! -- new in this version is:
>
> - split up the huge delay-dequeue patch
> - tested/fixed cfs-bandwidth
> - PLACE_REL_DEADLINE -- preserve the relative deadline when migrating
> - SCHED_BATCH is equivalent to RESPECT_SLICE
> - propagate min_slice up cgroups
> - CLOCK_THREAD_DVFS_ID
>
The latest tip/sched/core at commit
aef6987d89544d63a47753cf3741cabff0b5574c
crashes very early during boot on my Juno r2 board (arm64). The trace is below:
[ 0.049599] ------------[ cut here ]------------
[ 0.054279] kernel BUG at kernel/sched/deadline.c:63!
[ 0.059401] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
[ 0.066285] Modules linked in:
[ 0.069382] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.11.0-rc1-g55404cef33db #1070
[ 0.077855] Hardware name: ARM Juno development board (r2) (DT)
[ 0.083856] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.090919] pc : enqueue_dl_entity+0x53c/0x540
[ 0.095434] lr : dl_server_start+0xb8/0x10c
[ 0.099679] sp : ffffffc081ca3c30
[ 0.103034] x29: ffffffc081ca3c40 x28: 0000000000000001 x27: 0000000000000002
[ 0.110281] x26: 00000000000b71b0 x25: 0000000000000000 x24: 0000000000000001
[ 0.117525] x23: ffffff897ef21140 x22: 0000000000000000 x21: 0000000000000000
[ 0.124770] x20: ffffff897ef21040 x19: ffffff897ef219a8 x18: ffffffc080d0ad00
[ 0.132015] x17: 000000000000002f x16: 0000000000000000 x15: ffffffc081ca8000
[ 0.139260] x14: 00000000016ef200 x13: 00000000000e6667 x12: 0000000000000001
[ 0.146505] x11: 000000003b9aca00 x10: 0000000002faf080 x9 : 0000000000000030
[ 0.153749] x8 : 0000000000000071 x7 : 000000002cf93d25 x6 : 000000002cf93d25
[ 0.160994] x5 : ffffffc081e04938 x4 : ffffffc081ca3d40 x3 : 0000000000000001
[ 0.168238] x2 : 000000003b9aca00 x1 : 0000000000000001 x0 : ffffff897ef21040
[ 0.175483] Call trace:
[ 0.177958] enqueue_dl_entity+0x53c/0x540
[ 0.182117] dl_server_start+0xb8/0x10c
[ 0.186010] enqueue_task_fair+0x5c8/0x6ac
[ 0.190165] enqueue_task+0x54/0x1e8
[ 0.193793] wake_up_new_task+0x250/0x39c
[ 0.197862] kernel_clone+0x140/0x2f0
[ 0.201578] user_mode_thread+0x4c/0x58
[ 0.205468] rest_init+0x24/0xd8
[ 0.208743] start_kernel+0x2bc/0x2fc
[ 0.212460] __primary_switched+0x80/0x88
[ 0.216535] Code: b85fc3a8 7100051f 54fff8e9 17ffffce (d4210000)
[ 0.222711] ---[ end trace 0000000000000000 ]---
[ 0.227391] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.234187] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
I'm not an expert on the DL server, so I have no idea where the problem could
be. If you know where to look off the top of your head, so much the better.
If not, I'll do some bisection later.
Hongyan