Message-ID: <87776d335eec8fe02b29d96818fd5c2dde5ed7af.camel@siemens.com>
Date: Tue, 22 Apr 2025 17:03:19 +0200
From: Florian Bezdeka <florian.bezdeka@...mens.com>
To: Aaron Lu <ziqianlu@...edance.com>
Cc: Valentin Schneider <vschneid@...hat.com>, Ben Segall
 <bsegall@...gle.com>, K Prateek Nayak <kprateek.nayak@....com>, Peter
 Zijlstra <peterz@...radead.org>, Josh Don <joshdon@...gle.com>, Ingo
 Molnar <mingo@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>, Xi
 Wang <xii@...gle.com>, linux-kernel@...r.kernel.org, Juri Lelli
 <juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>,
 Steven Rostedt <rostedt@...dmis.org>, Mel Gorman <mgorman@...e.de>,
 Chengming Zhou <chengming.zhou@...ux.dev>, Chuyi Zhou
 <zhouchuyi@...edance.com>, Jan Kiszka <jan.kiszka@...mens.com>
Subject: Re: [RFC PATCH v2 7/7] sched/fair: alternative way of accounting
 throttle time

On Fri, 2025-04-18 at 11:15 +0800, Aaron Lu wrote:
> Hi Florian,
> 
> On Thu, Apr 17, 2025 at 04:06:16PM +0200, Florian Bezdeka wrote:
> > Hi Aaron,
> > 
> > On Wed, 2025-04-09 at 20:07 +0800, Aaron Lu wrote:
> > > @@ -5889,27 +5943,21 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)
> > >  	cfs_rq->throttled_clock_pelt_time += rq_clock_pelt(rq) -
> > >  		cfs_rq->throttled_clock_pelt;
> > >  
> > > -	if (cfs_rq->throttled_clock_self) {
> > > -		u64 delta = rq_clock(rq) - cfs_rq->throttled_clock_self;
> > > -
> > > -		cfs_rq->throttled_clock_self = 0;
> > > -
> > > -		if (WARN_ON_ONCE((s64)delta < 0))
> > > -			delta = 0;
> > > -
> > > -		cfs_rq->throttled_clock_self_time += delta;
> > > -	}
> > > +	if (cfs_rq->throttled_clock_self)
> > > +		account_cfs_rq_throttle_self(cfs_rq);
> > >  
> > >  	/* Re-enqueue the tasks that have been throttled at this level. */
> > >  	list_for_each_entry_safe(p, tmp, &cfs_rq->throttled_limbo_list, throttle_node) {
> > >  		list_del_init(&p->throttle_node);
> > > -		enqueue_task_fair(rq_of(cfs_rq), p, ENQUEUE_WAKEUP);
> > > +		enqueue_task_fair(rq_of(cfs_rq), p, ENQUEUE_WAKEUP | ENQUEUE_THROTTLE);
> > >  	}
> > >  
> > >  	/* Add cfs_rq with load or one or more already running entities to the list */
> > >  	if (!cfs_rq_is_decayed(cfs_rq))
> > >  		list_add_leaf_cfs_rq(cfs_rq);
> > >  
> > > +	WARN_ON_ONCE(cfs_rq->h_nr_throttled);
> > > +
> > >  	return 0;
> > >  }
> > >  
> > 
> > I got this warning while testing in our virtual environment:
> 
> Thanks for the report.
> 
> > 
> > Any idea?
> > 
> 
> Most likely the accounting of h_nr_throttled is incorrect somewhere.
> 
> > [   26.639641] ------------[ cut here ]------------
> > [   26.639644] WARNING: CPU: 5 PID: 0 at kernel/sched/fair.c:5967 tg_unthrottle_up+0x1a6/0x3d0
> 
> The line doesn't match the code though, the below warning should be at
> line 5959:
> WARN_ON_ONCE(cfs_rq->h_nr_throttled);

See below.

> 
> > [   26.639653] Modules linked in: veth xt_nat nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink xfrm_user xfrm_algo br_netfilter bridge stp llc xt_recent rfkill ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt vsock_loopback vmw_vsock_virtio_transport_common ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog vmw_vsock_vmci_transport xt_comment vsock nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables intel_rapl_msr intel_rapl_common nfnetlink binfmt_misc intel_uncore_frequency_common isst_if_mbox_msr isst_if_common skx_edac_common nfit libnvdimm ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel snd_pcm crypto_simd cryptd snd_timer rapl snd soundcore vmw_balloon vmwgfx pcspkr drm_ttm_helper ttm drm_client_lib button ac drm_kms_helper sg vmw_vmci evdev joydev serio_raw drm loop efi_pstore configfs efivarfs ip_tables x_tables autofs4 overlay nls_ascii nls_cp437 vfat fat ext4 crc16 mbcache jbd2 squashfs dm_verity dm_bufio reed_solomon dm_mod
> > [   26.639715]  sd_mod ata_generic mptspi mptscsih ata_piix mptbase libata scsi_transport_spi psmouse scsi_mod vmxnet3 i2c_piix4 i2c_smbus scsi_common
> > [   26.639726] CPU: 5 UID: 0 PID: 0 Comm: swapper/5 Not tainted 6.14.2-CFSfixes #1
> 
> 6.14.2-CFSfixes seems to be a backported kernel?
> Do you also see this warning when using this series on top of the said
> base commit 6432e163ba1b("sched/isolation: Make use of more than one
> housekeeping cpu")? Just want to make sure it's not a problem due to
> backport.

Right, I should have mentioned that crucial detail. Sorry.

I ported your series to 6.14.2 because we did/do not trust anything
newer yet for testing. The problematic workload was not available in
our lab at that time, so we had to be very careful about deployed
kernel versions.

I'm attaching the backported patches now, so you can compare / review
if you like. Spoiler: The only differences are line numbers ;-)


Best regards,
Florian


View attachment "0001-sched-fair-Add-related-data-structure-for-task-based.patch" of type "text/x-patch" (2702 bytes)

View attachment "0002-sched-fair-Handle-throttle-path-for-task-based-throt.patch" of type "text/x-patch" (11529 bytes)

View attachment "0003-sched-fair-Handle-unthrottle-path-for-task-based-thr.patch" of type "text/x-patch" (8385 bytes)

View attachment "0004-sched-fair-Take-care-of-group-affinity-sched_class-c.patch" of type "text/x-patch" (2528 bytes)

View attachment "0005-sched-fair-get-rid-of-throttled_lb_pair.patch" of type "text/x-patch" (2795 bytes)

View attachment "0006-sched-fair-fix-h_nr_runnable-accounting-with-per-tas.patch" of type "text/x-patch" (1228 bytes)

View attachment "0007-sched-fair-alternative-way-of-accounting-throttle-ti.patch" of type "text/x-patch" (11369 bytes)
