Message-ID: <19251b0c-a6b6-ba5b-3b7b-620416165121@redhat.com>
Date: Thu, 2 Nov 2023 15:07:14 -0400
From: Waiman Long <longman@...hat.com>
To: Yosry Ahmed <yosryahmed@...gle.com>
Cc: Tejun Heo <tj@...nel.org>, Zefan Li <lizefan.x@...edance.com>,
 Johannes Weiner <hannes@...xchg.org>, cgroups@...r.kernel.org,
 linux-kernel@...r.kernel.org, Joe Mario <jmario@...hat.com>,
 Sebastian Jug <sejug@...hat.com>
Subject: Re: [PATCH v2] cgroup/rstat: Reduce cpu_lock hold time in
 cgroup_rstat_flush_locked()

On 11/2/23 00:35, Yosry Ahmed wrote:
> On Wed, Nov 1, 2023 at 5:53 PM Waiman Long <longman@...hat.com> wrote:
>> When cgroup_rstat_updated() isn't being called concurrently with
>> cgroup_rstat_flush_locked(), its run time is pretty short. When
>> both are called concurrently, the cgroup_rstat_updated() run time
>> can spike to a pretty high value due to high cpu_lock hold time in
>> cgroup_rstat_flush_locked(). This can be problematic if the task calling
>> cgroup_rstat_updated() is a realtime task running on an isolated CPU
>> with a strict latency requirement. The cgroup_rstat_updated() call can
>> happens when there is a page fault even though the task is running in
> s/happens/happen
>
>> user space most of the time.
>>
>> The percpu cpu_lock is used to protect the update tree -
>> updated_next and updated_children. This protection is only needed
>> when cgroup_rstat_cpu_pop_updated() is being called. The subsequent
>> flushing operation which can take a much longer time does not need
>> that protection.
> nit: add: as it is already protected by cgroup_rstat_lock.
>
>> To reduce the cpu_lock hold time, we need to perform all the
>> cgroup_rstat_cpu_pop_updated() calls up front with the lock
>> released afterward before doing any flushing. This patch adds a new
>> cgroup_rstat_updated_list() function to return a singly linked list of
>> cgroups to be flushed.
>>
>> By adding some instrumentation code to measure the maximum elapsed times
>> of the new cgroup_rstat_updated_list() function and each cpu iteration of
>> cgroup_rstat_updated_locked() around the old cpu_lock lock/unlock pair
>> on a 2-socket x86-64 server running parallel kernel build, the maximum
>> elapsed times are 27us and 88us respectively. The maximum cpu_lock hold
>> time is now reduced to about 30% of the original.
>>
>> Below were the run time distribution of cgroup_rstat_updated_list()
>> during the same period:
>>
>>       Run time             Count
>>       --------             -----
>>          t <= 1us     12,574,302
>>    1us < t <= 5us      2,127,482
>>    5us < t <= 10us         8,445
>>   10us < t <= 20us         6,425
>>   20us < t <= 30us            50
>>
>> Signed-off-by: Waiman Long <longman@...hat.com>
> LGTM with some nits.
>
> Reviewed-by: Yosry Ahmed <yosryahmed@...gle.com>
>
>> ---
>>  include/linux/cgroup-defs.h |  6 +++++
>>  kernel/cgroup/rstat.c       | 45 ++++++++++++++++++++++++-------------
>>  2 files changed, 36 insertions(+), 15 deletions(-)
>>
>> diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
>> index 265da00a1a8b..daaf6d4eb8b6 100644
>> --- a/include/linux/cgroup-defs.h
>> +++ b/include/linux/cgroup-defs.h
>> @@ -491,6 +491,12 @@ struct cgroup {
>>         struct cgroup_rstat_cpu __percpu *rstat_cpu;
>>         struct list_head rstat_css_list;
>>
>> +       /*
>> +        * A singly-linked list of cgroup structures to be rstat flushed.
>> +        * Protected by cgroup_rstat_lock.
> Do you think we should mention that this is a scratch area for
> cgroup_rstat_flush_locked()? IOW, this field will be invalid or may
> contain garbage otherwise.

I can certainly add that into the comment.

>
> It might be also useful to mention that the scope of usage for this is
> for each percpu flushing iteration. The cgroup_rstat_lock can be
> dropped between percpu flushing iterations, so different flushers can
> reuse this field safely because it is re-initialized in every
> iteration and only used there.
>
>> +        */
>> +       struct cgroup *rstat_flush_next;
>> +
>>         /* cgroup basic resource statistics */
>>         struct cgroup_base_stat last_bstat;
>>         struct cgroup_base_stat bstat;
>> diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
>> index d80d7a608141..a86d40ed8bda 100644
>> --- a/kernel/cgroup/rstat.c
>> +++ b/kernel/cgroup/rstat.c
>> @@ -145,6 +145,34 @@ static struct cgroup *cgroup_rstat_cpu_pop_updated(struct cgroup *pos,
>>         return pos;
>>  }
>>
>> +/*
>> + * Return a list of updated cgroups to be flushed
>> + */
> Why not just on a single line?
> /* Return a list of updated cgroups to be flushed */

Yes, it can be compressed into a one liner.

Thanks for the review and suggestion.

Cheers,
Longman
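
For readers following the thread, the locking change described in the commit message can be illustrated with a small userspace sketch. This is not the kernel code: struct node, node_updated(), updated_list() and flush_all() are hypothetical stand-ins, a flat list replaces the updated_next/updated_children tree, and pthread mutexes stand in for the percpu cpu_lock and for cgroup_rstat_lock. It only shows the ordering: pop everything under the short lock first, drop that lock, then do the slow flush.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cgroup; not the kernel definition. */
struct node {
	atomic_long pending;		/* delta not yet flushed */
	long total;			/* flushed total, protected by flush_lock */
	bool on_list;			/* protected by cpu_lock */
	struct node *updated_next;	/* updated list link, protected by cpu_lock */
	struct node *flush_next;	/* scratch link, valid only during one flush */
};

static pthread_mutex_t cpu_lock = PTHREAD_MUTEX_INITIALIZER;	/* short hold */
static pthread_mutex_t flush_lock = PTHREAD_MUTEX_INITIALIZER;	/* long hold */
static struct node *updated_head;

/* Updater side: only the list linkage is done under cpu_lock. */
void node_updated(struct node *n, long delta)
{
	atomic_fetch_add(&n->pending, delta);
	pthread_mutex_lock(&cpu_lock);
	if (!n->on_list) {
		n->on_list = true;
		n->updated_next = updated_head;
		updated_head = n;
	}
	pthread_mutex_unlock(&cpu_lock);
}

/* Pop every updated node up front and chain them through flush_next. */
static struct node *updated_list(void)
{
	struct node *head = NULL;

	pthread_mutex_lock(&cpu_lock);
	while (updated_head) {
		struct node *n = updated_head;

		updated_head = n->updated_next;
		n->updated_next = NULL;
		n->on_list = false;
		n->flush_next = head;	/* re-initialized on every flush */
		head = n;
	}
	pthread_mutex_unlock(&cpu_lock);
	return head;
}

/* Flusher side: the slow work runs after cpu_lock has been released. */
int flush_all(void)
{
	int flushed = 0;

	pthread_mutex_lock(&flush_lock);
	for (struct node *n = updated_list(); n; n = n->flush_next) {
		n->total += atomic_exchange(&n->pending, 0);
		flushed++;
	}
	pthread_mutex_unlock(&flush_lock);
	return flushed;
}
```

In this sketch an updater blocked on cpu_lock waits only for the list detach, never for the per-node flush work, which is the effect the patch is after. The flush_next field is left stale after a flush, which mirrors the review comment that the field is a scratch area that is only meaningful within one flushing iteration.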