Message-ID: <16f96a109bec0b5849793c8fb90bd6b63a2eb62f.camel@codethink.co.uk>
Date: Thu, 28 Nov 2024 11:32:10 +0100
From: Marcel Ziswiler <marcel.ziswiler@...ethink.co.uk>
To: Peter Zijlstra <peterz@...radead.org>, mingo@...hat.com,
juri.lelli@...hat.com, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, vschneid@...hat.com, linux-kernel@...r.kernel.org
Cc: kprateek.nayak@....com, wuyun.abel@...edance.com,
youssefesmat@...omium.org, tglx@...utronix.de, efault@....de
Subject: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF
Hi all,
On Sat, 2024-07-27 at 12:27 +0200, Peter Zijlstra wrote:
> Hi all,
>
> So after much delay this is hopefully the final version of the EEVDF patches.
> They've been sitting in my git tree for ever it seems, and people have been
> testing it and sending fixes.
>
> I've spend the last two days testing and fixing cfs-bandwidth, and as far
> as I know that was the very last issue holding it back.
>
> These patches apply on top of queue.git sched/dl-server, which I plan on merging
> in tip/sched/core once -rc1 drops.
>
> I'm hoping to then merge all this (+- the DVFS clock patch) right before -rc2.
>
>
> Aside from a ton of bug fixes -- thanks all! -- new in this version is:
>
> - split up the huge delay-dequeue patch
> - tested/fixed cfs-bandwidth
> - PLACE_REL_DEADLINE -- preserve the relative deadline when migrating
> - SCHED_BATCH is equivalent to RESPECT_SLICE
> - propagate min_slice up cgroups
> - CLOCK_THREAD_DVFS_ID
We found that the following 7 commits from this patch set lead to crashes in enqueue_dl_entity():
54a58a787791 sched/fair: Implement DELAY_ZERO
152e11f6df29 sched/fair: Implement delayed dequeue
e1459a50ba31 sched: Teach dequeue_task() about special task states
a1c446611e31 sched,freezer: Mark TASK_FROZEN special
781773e3b680 sched/fair: Implement ENQUEUE_DELAYED
f12e148892ed sched/fair: Prepare pick_next_task() for delayed dequeue
2e0199df252a sched/fair: Prepare exit/cleanup paths for delayed_dequeue
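For context: the delayed-dequeue behaviour at the heart of these commits appears to be gated behind scheduler feature flags. Below is a minimal sketch, assuming the v6.12 naming in kernel/sched/features.h; we have not yet tested whether disabling them avoids the warning, so please treat this purely as our reading of the code:

	/*
	 * Sketch (our assumption, not verified): the delayed-dequeue commits
	 * above introduce these feature flags, enabled by default, in
	 * kernel/sched/features.h:
	 */
	SCHED_FEAT(DELAY_DEQUEUE, true)	/* keep non-eligible tasks enqueued until picked/woken */
	SCHED_FEAT(DELAY_ZERO, true)	/* clip negative lag to zero on dequeue/wakeup */

	/*
	 * If useful for narrowing this down, they should be toggleable at
	 * runtime via /sys/kernel/debug/sched/features
	 * (e.g. writing "NO_DELAY_DEQUEUE").
	 */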
With these commits applied we get the following crash dump (this is running v6.12.1):
[ 14.652856] sched: DL replenish lagged too much
[ 16.572706] ------------[ cut here ]------------
[ 16.573115] WARNING: CPU: 5 PID: 912 at kernel/sched/deadline.c:1995 enqueue_dl_entity+0x46c/0x55c
[ 16.573900] Modules linked in: overlay crct10dif_ce rk805_pwrkey snd_soc_es8316 pwm_fan phy_rockchip_naneng_combphy rockchip_saradc rtc_hym8563 industrialio_triggered_buffer kfifo_buf rockchip_thermal phy_rockchip_usbdp typec spi_rockchip_sfc snd_soc_rockchip_i2s_tdm hantro_vpu v4l2_vp9 v4l2_h264 v4l2_jpeg panthor v4l2_mem2mem rockchipdrm drm_gpuvm drm_exec drm_shmem_helper analogix_dp gpu_sched dw_mipi_dsi dw_hdmi cec drm_display_helper snd_soc_audio_graph_card snd_soc_simple_card_utils drm_dma_helper drm_kms_helper cfg80211 rfkill pci_endpoint_test drm backlight dm_mod dax
[ 16.578350] CPU: 5 UID: 0 PID: 912 Comm: job10 Not tainted 6.12.1-dirty #15
[ 16.578956] Hardware name: radxa Radxa ROCK 5B/Radxa ROCK 5B, BIOS 2024.10-rc3 10/01/2024
[ 16.579667] pstate: 204000c9 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 16.580273] pc : enqueue_dl_entity+0x46c/0x55c
[ 16.580661] lr : dl_server_start+0x44/0x12c
[ 16.581028] sp : ffff80008002bc00
[ 16.581318] x29: ffff80008002bc00 x28: dead000000000122 x27: 0000000000000000
[ 16.581941] x26: 0000000000000001 x25: 0000000000000000 x24: 0000000000000009
[ 16.582563] x23: ffff33c976db0e40 x22: 0000000000000001 x21: 00000000002dc6c0
[ 16.583186] x20: 0000000000000001 x19: ffff33c976db17a8 x18: 0000000000000000
[ 16.583808] x17: ffff5dd9779ac000 x16: ffff800080028000 x15: 11c3485b851e0698
[ 16.584430] x14: 11b4b257e4156000 x13: 0000000000000255 x12: 0000000000000000
[ 16.585053] x11: ffff33c976db0ec0 x10: 0000000000000000 x9 : 0000000000000009
[ 16.585674] x8 : 0000000000000005 x7 : ffff33c976db19a0 x6 : ffff33c78258b440
[ 16.586296] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 16.586918] x2 : 0000000000000001 x1 : 0000000000000001 x0 : ffff33c798e112f0
[ 16.587540] Call trace:
[ 16.587754] enqueue_dl_entity+0x46c/0x55c
[ 16.588113] dl_server_start+0x44/0x12c
[ 16.588449] enqueue_task_fair+0x124/0x49c
[ 16.588807] enqueue_task+0x3c/0xe0
[ 16.589113] ttwu_do_activate.isra.0+0x6c/0x208
[ 16.589511] try_to_wake_up+0x1d0/0x61c
[ 16.589847] wake_up_process+0x18/0x24
[ 16.590175] kick_pool+0x84/0x150
[ 16.590467] __queue_work+0x2f4/0x544
[ 16.590788] delayed_work_timer_fn+0x1c/0x28
[ 16.591161] call_timer_fn+0x34/0x1ac
[ 16.591481] __run_timer_base+0x20c/0x314
[ 16.591832] run_timer_softirq+0x3c/0x78
[ 16.592176] handle_softirqs+0x124/0x35c
[ 16.592520] __do_softirq+0x14/0x20
[ 16.592827] ____do_softirq+0x10/0x1c
[ 16.593148] call_on_irq_stack+0x24/0x4c
[ 16.593490] do_softirq_own_stack+0x1c/0x2c
[ 16.593857] irq_exit_rcu+0x8c/0xc0
[ 16.594163] el0_interrupt+0x48/0xbc
[ 16.594477] __el0_irq_handler_common+0x18/0x24
[ 16.594874] el0t_64_irq_handler+0x10/0x1c
[ 16.595232] el0t_64_irq+0x190/0x194
[ 16.595545] ---[ end trace 0000000000000000 ]---
[ 16.595950] ------------[ cut here ]------------
It looks like the kernel is trying to enqueue a deadline entity that is already enqueued. The full serial console log is available at [1].
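For reference, here is a minimal sketch of the check we believe is firing, assuming that deadline.c:1995 in our v6.12.1 build corresponds to the sanity check at the top of enqueue_dl_entity() (please correct us if that line actually maps elsewhere):

	/*
	 * kernel/sched/deadline.c (sketch, v6.12): an entity counts as
	 * enqueued once its rb_node sits in the dl rb-tree.
	 */
	static inline int on_dl_rq(struct sched_dl_entity *dl_se)
	{
		return !RB_EMPTY_NODE(&dl_se->rb_node);
	}

	static void enqueue_dl_entity(struct sched_dl_entity *dl_se, int flags)
	{
		/*
		 * Fires when the entity is already on the dl runqueue, which
		 * matches the dl_server_start() -> enqueue_dl_entity() path
		 * in the backtrace above.
		 */
		WARN_ON_ONCE(on_dl_rq(dl_se));

		/* ... actual enqueue follows ... */
	}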
We are running the exact same scheduler stress test on both an Intel NUC farm and a Radxa ROCK 5B board farm.
So far we have not seen this on amd64, but it crashes consistently and reproducibly on aarch64.
We have not yet had time to put together a non-proprietary reproduction case, but I wanted to report our current
findings and ask the community for any feedback/suggestions.
Thanks!
Cheers
Marcel
[1] https://hastebin.skyra.pw/hoqesigaye.yaml