Message-ID: <20250418031550.GA1516180@bytedance>
Date: Fri, 18 Apr 2025 11:15:50 +0800
From: Aaron Lu <ziqianlu@...edance.com>
To: Florian Bezdeka <florian.bezdeka@...mens.com>
Cc: Valentin Schneider <vschneid@...hat.com>,
Ben Segall <bsegall@...gle.com>,
K Prateek Nayak <kprateek.nayak@....com>,
Peter Zijlstra <peterz@...radead.org>,
Josh Don <joshdon@...gle.com>, Ingo Molnar <mingo@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Xi Wang <xii@...gle.com>, linux-kernel@...r.kernel.org,
Juri Lelli <juri.lelli@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Mel Gorman <mgorman@...e.de>,
Chengming Zhou <chengming.zhou@...ux.dev>,
Chuyi Zhou <zhouchuyi@...edance.com>,
Jan Kiszka <jan.kiszka@...mens.com>
Subject: Re: [RFC PATCH v2 7/7] sched/fair: alternative way of accounting
throttle time
Hi Florian,
On Thu, Apr 17, 2025 at 04:06:16PM +0200, Florian Bezdeka wrote:
> Hi Aaron,
>
> On Wed, 2025-04-09 at 20:07 +0800, Aaron Lu wrote:
> > @@ -5889,27 +5943,21 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)
> > cfs_rq->throttled_clock_pelt_time += rq_clock_pelt(rq) -
> > cfs_rq->throttled_clock_pelt;
> >
> > - if (cfs_rq->throttled_clock_self) {
> > - u64 delta = rq_clock(rq) - cfs_rq->throttled_clock_self;
> > -
> > - cfs_rq->throttled_clock_self = 0;
> > -
> > - if (WARN_ON_ONCE((s64)delta < 0))
> > - delta = 0;
> > -
> > - cfs_rq->throttled_clock_self_time += delta;
> > - }
> > + if (cfs_rq->throttled_clock_self)
> > + account_cfs_rq_throttle_self(cfs_rq);
> >
> > /* Re-enqueue the tasks that have been throttled at this level. */
> > list_for_each_entry_safe(p, tmp, &cfs_rq->throttled_limbo_list, throttle_node) {
> > list_del_init(&p->throttle_node);
> > - enqueue_task_fair(rq_of(cfs_rq), p, ENQUEUE_WAKEUP);
> > + enqueue_task_fair(rq_of(cfs_rq), p, ENQUEUE_WAKEUP | ENQUEUE_THROTTLE);
> > }
> >
> > /* Add cfs_rq with load or one or more already running entities to the list */
> > if (!cfs_rq_is_decayed(cfs_rq))
> > list_add_leaf_cfs_rq(cfs_rq);
> >
> > + WARN_ON_ONCE(cfs_rq->h_nr_throttled);
> > +
> > return 0;
> > }
> >
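(For reference, account_cfs_rq_throttle_self() is essentially the removed
block above moved into a helper; a rough sketch reconstructed from the
deleted lines, not necessarily the exact definition in the patch:

	static void account_cfs_rq_throttle_self(struct cfs_rq *cfs_rq)
	{
		u64 delta = rq_clock(rq_of(cfs_rq)) - cfs_rq->throttled_clock_self;

		cfs_rq->throttled_clock_self = 0;

		if (WARN_ON_ONCE((s64)delta < 0))
			delta = 0;

		cfs_rq->throttled_clock_self_time += delta;
	}
)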
>
> I got this warning while testing in our virtual environment:
Thanks for the report.
>
> Any idea?
>
Most likely the accounting of h_nr_throttled is incorrect somewhere.
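If it reproduces easily, one way to narrow it down would be a debug
check at the site(s) where h_nr_throttled is decremented, so an
underflow is caught where it happens rather than only showing up as a
leftover count at unthrottle time. Something like this (illustrative
only, the helper name is made up):

	static inline void dec_h_nr_throttled(struct cfs_rq *cfs_rq, unsigned int n)
	{
		/* catch an underflow at its source */
		WARN_ON_ONCE(cfs_rq->h_nr_throttled < n);
		cfs_rq->h_nr_throttled -= n;
	}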
> [ 26.639641] ------------[ cut here ]------------
> [ 26.639644] WARNING: CPU: 5 PID: 0 at kernel/sched/fair.c:5967 tg_unthrottle_up+0x1a6/0x3d0
The line number doesn't match the code though; the warning below should
be at line 5959:
WARN_ON_ONCE(cfs_rq->h_nr_throttled);
> [ 26.639653] Modules linked in: veth xt_nat nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink xfrm_user xfrm_algo br_netfilter bridge stp llc xt_recent rfkill ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt vsock_loopback vmw_vsock_virtio_transport_common ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog vmw_vsock_vmci_transport xt_comment vsock nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables intel_rapl_msr intel_rapl_common nfnetlink binfmt_misc intel_uncore_frequency_common isst_if_mbox_msr isst_if_common skx_edac_common nfit libnvdimm ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel snd_pcm crypto_simd cryptd snd_timer rapl snd soundcore vmw_balloon vmwgfx pcspkr drm_ttm_helper ttm drm_client_lib button ac drm_kms_helper sg vmw_vmci evdev joydev serio_raw drm loop efi_pstore configfs efivarfs ip_tables x_tables autofs4 overlay nls_ascii nls_cp437 vfat fat ext4 crc16 mbcache jbd2 squashfs dm_verity dm_bufio reed_solomon dm_mod
> [ 26.639715] sd_mod ata_generic mptspi mptscsih ata_piix mptbase libata scsi_transport_spi psmouse scsi_mod vmxnet3 i2c_piix4 i2c_smbus scsi_common
> [ 26.639726] CPU: 5 UID: 0 PID: 0 Comm: swapper/5 Not tainted 6.14.2-CFSfixes #1
6.14.2-CFSfixes seems to be a backported kernel?
Do you also see this warning when using this series on top of the
aforementioned base commit 6432e163ba1b ("sched/isolation: Make use of
more than one housekeeping cpu")? Just want to make sure it's not a
problem caused by the backport.
Thanks,
Aaron
> [ 26.639729] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.24224532.B64.2408191458 08/19/2024
> [ 26.639731] RIP: 0010:tg_unthrottle_up+0x1a6/0x3d0
> [ 26.639735] Code: 00 00 48 39 ca 74 14 48 8b 52 10 49 8b 8e 58 01 00 00 48 39 8a 28 01 00 00 74 24 41 8b 86 68 01 00 00 85 c0 0f 84 8d fe ff ff <0f> 0b e9 86 fe ff ff 49 8b 9e 38 01 00 00 41 8b 86 40 01 00 00 48
> [ 26.639737] RSP: 0000:ffffa5df8029cec8 EFLAGS: 00010002
> [ 26.639739] RAX: 0000000000000001 RBX: ffff981c6fcb6a80 RCX: ffff981943752e40
> [ 26.639741] RDX: 0000000000000005 RSI: ffff981c6fcb6a80 RDI: ffff981943752d00
> [ 26.639742] RBP: ffff9819607dc708 R08: ffff981c6fcb6a80 R09: 0000000000000000
> [ 26.639744] R10: 0000000000000001 R11: ffff981969936a10 R12: ffff9819607dc708
> [ 26.639745] R13: ffff9819607dc9d8 R14: ffff9819607dc800 R15: ffffffffad913fb0
> [ 26.639747] FS: 0000000000000000(0000) GS:ffff981c6fc80000(0000) knlGS:0000000000000000
> [ 26.639749] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 26.639750] CR2: 00007ff1292dc44c CR3: 000000015350e006 CR4: 00000000007706f0
> [ 26.639779] PKRU: 55555554
> [ 26.639781] Call Trace:
> [ 26.639783] <IRQ>
> [ 26.639787] ? __pfx_tg_unthrottle_up+0x10/0x10
> [ 26.639790] ? __pfx_tg_nop+0x10/0x10
> [ 26.639793] walk_tg_tree_from+0x58/0xb0
> [ 26.639797] unthrottle_cfs_rq+0xf0/0x360
> [ 26.639800] ? sched_clock_cpu+0xf/0x190
> [ 26.639808] __cfsb_csd_unthrottle+0x11c/0x170
> [ 26.639812] ? __pfx___cfsb_csd_unthrottle+0x10/0x10
> [ 26.639816] __flush_smp_call_function_queue+0x103/0x410
> [ 26.639822] __sysvec_call_function_single+0x1c/0xb0
> [ 26.639826] sysvec_call_function_single+0x6c/0x90
> [ 26.639832] </IRQ>
> [ 26.639833] <TASK>
> [ 26.639834] asm_sysvec_call_function_single+0x1a/0x20
> [ 26.639840] RIP: 0010:pv_native_safe_halt+0xf/0x20
> [ 26.639844] Code: 22 d7 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 66 90 0f 00 2d 45 c1 13 00 fb f4 <c3> cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90
> [ 26.639846] RSP: 0000:ffffa5df80117ed8 EFLAGS: 00000242
> [ 26.639848] RAX: 0000000000000005 RBX: ffff981940804000 RCX: ffff9819a9df7000
> [ 26.639849] RDX: 0000000000000005 RSI: 0000000000000005 RDI: 000000000005c514
> [ 26.639851] RBP: 0000000000000005 R08: 0000000000000000 R09: 0000000000000001
> [ 26.639852] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
> [ 26.639853] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [ 26.639858] default_idle+0x9/0x20
> [ 26.639861] default_idle_call+0x30/0x100
> [ 26.639863] do_idle+0x1fd/0x240
> [ 26.639869] cpu_startup_entry+0x29/0x30
> [ 26.639872] start_secondary+0x11e/0x140
> [ 26.639875] common_startup_64+0x13e/0x141
> [ 26.639881] </TASK>
> [ 26.639882] ---[ end trace 0000000000000000 ]---