linux-kernel - Re: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0b386131-1511-4764-8ebf-1a38fce791a0@arm.com>
Date: Tue, 10 Dec 2024 08:45:56 +0000
From: Luis Machado <luis.machado@....com>
To: Marcel Ziswiler <marcel.ziswiler@...ethink.co.uk>,
 Peter Zijlstra <peterz@...radead.org>, mingo@...hat.com,
 juri.lelli@...hat.com, vincent.guittot@...aro.org, dietmar.eggemann@....com,
 rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
 vschneid@...hat.com, linux-kernel@...r.kernel.org
Cc: kprateek.nayak@....com, wuyun.abel@...edance.com,
 youssefesmat@...omium.org, tglx@...utronix.de, efault@....de
Subject: Re: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF

On 11/28/24 10:32, Marcel Ziswiler wrote:
> Hi all,
> 
> On Sat, 2024-07-27 at 12:27 +0200, Peter Zijlstra wrote:
>> Hi all,
>>
>> So after much delay this is hopefully the final version of the EEVDF patches.
>> They've been sitting in my git tree for ever it seems, and people have been
>> testing it and sending fixes.
>>
>> I've spend the last two days testing and fixing cfs-bandwidth, and as far
>> as I know that was the very last issue holding it back.
>>
>> These patches apply on top of queue.git sched/dl-server, which I plan on merging
>> in tip/sched/core once -rc1 drops.
>>
>> I'm hoping to then merge all this (+- the DVFS clock patch) right before -rc2.
>>
>>
>> Aside from a ton of bug fixes -- thanks all! -- new in this version is:
>>
>>  - split up the huge delay-dequeue patch
>>  - tested/fixed cfs-bandwidth
>>  - PLACE_REL_DEADLINE -- preserve the relative deadline when migrating
>>  - SCHED_BATCH is equivalent to RESPECT_SLICE
>>  - propagate min_slice up cgroups
>>  - CLOCK_THREAD_DVFS_ID
> 
> We found the following 7 commits from this patch set to crash in enqueue_dl_entity():
> 
> 54a58a787791 sched/fair: Implement DELAY_ZERO
> 152e11f6df29 sched/fair: Implement delayed dequeue
> e1459a50ba31 sched: Teach dequeue_task() about special task states
> a1c446611e31 sched,freezer: Mark TASK_FROZEN special
> 781773e3b680 sched/fair: Implement ENQUEUE_DELAYED
> f12e148892ed sched/fair: Prepare pick_next_task() for delayed dequeue
> 2e0199df252a sched/fair: Prepare exit/cleanup paths for delayed_dequeue
> 
> Resulting in the following crash dump (this is running v6.12.1):
> 
> [   14.652856] sched: DL replenish lagged too much
> [   16.572706] ------------[ cut here ]------------
> [   16.573115] WARNING: CPU: 5 PID: 912 at kernel/sched/deadline.c:1995 enqueue_dl_entity+0x46c/0x55c
> [   16.573900] Modules linked in: overlay crct10dif_ce rk805_pwrkey snd_soc_es8316 pwm_fan
> phy_rockchip_naneng_combphy rockchip_saradc rtc_hym8563 industrialio_trigg
> ered_buffer kfifo_buf rockchip_thermal phy_rockchip_usbdp typec spi_rockchip_sfc snd_soc_rockchip_i2s_tdm
> hantro_vpu v4l2_vp9 v4l2_h264 v4l2_jpeg panthor v4l2_mem2me
> m rockchipdrm drm_gpuvm drm_exec drm_shmem_helper analogix_dp gpu_sched dw_mipi_dsi dw_hdmi cec
> drm_display_helper snd_soc_audio_graph_card snd_soc_simple_card_utils
>  drm_dma_helper drm_kms_helper cfg80211 rfkill pci_endpoint_test drm backlight dm_mod dax
> [   16.578350] CPU: 5 UID: 0 PID: 912 Comm: job10 Not tainted 6.12.1-dirty #15
> [   16.578956] Hardware name: radxa Radxa ROCK 5B/Radxa ROCK 5B, BIOS 2024.10-rc3 10/01/2024
> [   16.579667] pstate: 204000c9 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   16.580273] pc : enqueue_dl_entity+0x46c/0x55c
> [   16.580661] lr : dl_server_start+0x44/0x12c
> [   16.581028] sp : ffff80008002bc00
> [   16.581318] x29: ffff80008002bc00 x28: dead000000000122 x27: 0000000000000000
> [   16.581941] x26: 0000000000000001 x25: 0000000000000000 x24: 0000000000000009
> [   16.582563] x23: ffff33c976db0e40 x22: 0000000000000001 x21: 00000000002dc6c0
> [   16.583186] x20: 0000000000000001 x19: ffff33c976db17a8 x18: 0000000000000000
> [   16.583808] x17: ffff5dd9779ac000 x16: ffff800080028000 x15: 11c3485b851e0698
> [   16.584430] x14: 11b4b257e4156000 x13: 0000000000000255 x12: 0000000000000000
> [   16.585053] x11: ffff33c976db0ec0 x10: 0000000000000000 x9 : 0000000000000009
> [   16.585674] x8 : 0000000000000005 x7 : ffff33c976db19a0 x6 : ffff33c78258b440
> [   16.586296] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
> [   16.586918] x2 : 0000000000000001 x1 : 0000000000000001 x0 : ffff33c798e112f0
> [   16.587540] Call trace:
> [   16.587754]  enqueue_dl_entity+0x46c/0x55c
> [   16.588113]  dl_server_start+0x44/0x12c
> [   16.588449]  enqueue_task_fair+0x124/0x49c
> [   16.588807]  enqueue_task+0x3c/0xe0
> [   16.589113]  ttwu_do_activate.isra.0+0x6c/0x208
> [   16.589511]  try_to_wake_up+0x1d0/0x61c
> [   16.589847]  wake_up_process+0x18/0x24
> [   16.590175]  kick_pool+0x84/0x150
> [   16.590467]  __queue_work+0x2f4/0x544
> [   16.590788]  delayed_work_timer_fn+0x1c/0x28
> [   16.591161]  call_timer_fn+0x34/0x1ac
> [   16.591481]  __run_timer_base+0x20c/0x314
> [   16.591832]  run_timer_softirq+0x3c/0x78
> [   16.592176]  handle_softirqs+0x124/0x35c
> [   16.592520]  __do_softirq+0x14/0x20
> [   16.592827]  ____do_softirq+0x10/0x1c
> [   16.593148]  call_on_irq_stack+0x24/0x4c
> [   16.593490]  do_softirq_own_stack+0x1c/0x2c
> [   16.593857]  irq_exit_rcu+0x8c/0xc0
> [   16.594163]  el0_interrupt+0x48/0xbc
> [   16.594477]  __el0_irq_handler_common+0x18/0x24
> [   16.594874]  el0t_64_irq_handler+0x10/0x1c
> [   16.595232]  el0t_64_irq+0x190/0x194
> [   16.595545] ---[ end trace 0000000000000000 ]---
> [   16.595950] ------------[ cut here ]------------


Random piece of data, but I also had some difficulty making things boot on Android when trying
Vincent's nr_running accounting series due to a very similar crash/stack trace. Though what I
saw went a bit further and actually crashed within task_contending, called from
enqueue_dl_entity. Possibly crashed in one of the inlined functions.

Even though the kernel was 6.8 and it was backports, it seems awfully similar to the above.

> 
> It looks like it is trying to enqueue an already queued deadline task. Full serial console log available [1].
> 
> We are running the exact same scheduler stress test both on Intel NUCs as well as RADXA ROCK 5B board farms.
> While so far we have not seen this on amd64 it crashes consistently/reproducible on aarch64.
> 
> We haven't had time to do a non-proprietary reproduction case as of yet but I wanted to report our current
> findings asking for any feedback/suggestions from the community.
> 
> Thanks!
> 
> Cheers
> 
> Marcel
> 
> [1] https://hastebin.skyra.pw/hoqesigaye.yaml
>