lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 18 Jul 2023 18:08:50 +0800
From:   Hao Jia <jiahao.os@...edance.com>
To:     Tejun Heo <tj@...nel.org>
Cc:     lizefan.x@...edance.com, hannes@...xchg.org, mkoutny@...e.com,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [External] Re: [PATCH] cgroup/rstat: record the cumulative
 per-cpu time of cgroup and its descendants


Hello,
On 2023/7/18 Tejun Heo wrote:

> On Mon, Jul 17, 2023 at 05:36:12PM +0800, Hao Jia wrote:
>> Now the member variable bstat of the structure cgroup_rstat_cpu
> 
> You said "now" indicating that the behavior has changed recently but I don't
> see what changed there. Can you elaborate?

Thank you for your review, sorry for confusing you with my expression, 
it is true that it has not changed, bstat has always recorded the 
per-cpu time of cgroup itself.

> 
>> records the per-cpu time of the cgroup itself, but does not
>> include the per-cpu time of its descendants. The per-cpu time
> 
> It does. The per-cpu delta is added to its parent and then that will in turn
> be used to propagate to its parent.
>Yes, so only cgrp->bstat contains the cpu time of the cgroup and its 
descendants. rstatc->bstat only records the per-cpu time of cgroup itself.

> 
> I think you're misunderstanding how the code works. Can you please double
> check?

--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -361,11 +361,15 @@ static void cgroup_base_stat_flush(struct cgroup 
*cgrp, int cpu)
  	cgroup_base_stat_sub(&delta, &rstatc->last_bstat);
  	cgroup_base_stat_add(&cgrp->bstat, &delta);
  	cgroup_base_stat_add(&rstatc->last_bstat, &delta);

We add the current cgroup's per-cpu delta to its per-cpu variable 
(cumul_bstat).

+	cgroup_base_stat_add(&rstatc->cumul_bstat, &delta);

  	/* propagate global delta to parent (unless that's root) */
  	if (cgroup_parent(parent)) {
  		delta = cgrp->bstat;
+		rstatc = cgroup_rstat_cpu(parent, cpu);
+
  		cgroup_base_stat_sub(&delta, &cgrp->last_bstat);

Since we hold the global cgroup_rstat_lock and disable interrupts in 
cgroup_base_stat_flush() and we only update cgrp->bstat here.
So the global delta at this time is equal to the increment of this 
cgroup and its descendants on this cpu (unless root cgroup), so we can 
add the global delta to the cumul_bstat of its parent and propagate it 
upward.

+		cgroup_base_stat_add(&rstatc->cumul_bstat, &delta);
  		cgroup_base_stat_add(&parent->bstat, &delta);
  		cgroup_base_stat_add(&cgrp->last_bstat, &delta);
  	}


I wrote a kernel module to verify and test the above kernel code,
The kernel module adds a cpu.stat_percpu_all file for each cgroup, which 
can output the per-cpu time information of the cgroup and its 
descendants calculated in two ways:
1. Cumulatively add the per-cpu bstat of cgroup and each of its descendants.
2. Directly output @cumul_bstat read in the kernel (patch has been applied).
When the child cgroup is not removed, the results calculated by the two 
methods should be equal.

NOTE: We need to flush the data before "cat cpu.stat_percpu_all", such 
as "cat cpu.stat".

Code link:
https://github.com/jiaozhouxiaojia/cgv2-stat-percpu_test/tree/main

The testing procedure is in README.

I installed the 6.5.0-rc1 kernel on my machine (an Intel Xeon(R) 
Platinum 8260 machine 96 CPUs in total.) and did some tests.
So far I haven't found anything unusual, if I'm wrong, please point it 
out, it's very helpful to me, thanks again!

Thanks,
Hao

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ