lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <hjxarrg2jy6cyy5hptjjbkop76jmb6mjdcazlcyqe6nnaoo3l7@7amn6gdssmeg>
Date: Mon, 8 Dec 2025 19:35:30 +0100
From: Michal Koutný <mkoutny@...e.com>
To: Shakeel Butt <shakeel.butt@...ux.dev>
Cc: Tejun Heo <tj@...nel.org>, Johannes Weiner <hannes@...xchg.org>, 
	"Paul E . McKenney" <paulmck@...nel.org>, JP Kobryn <inwardvessel@...il.com>, linux-mm@...ck.org, 
	cgroups@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Meta kernel team <kernel-team@...a.com>
Subject: Re: [PATCH] cgroup: rstat: force flush on css exit

Hi Shakeel.

On Thu, Dec 04, 2025 at 01:06:00PM -0800, Shakeel Butt <shakeel.butt@...ux.dev> wrote:
> Cuurently the rstat update side is lockless and transfers the css of
> cgroup whose stats has been updated through lockless list (llist). There
> is an expected race where rstat updater skips adding css to the llist
> because it was already in the list but the flusher might not see those
> updates done by the skipped updater.

Notice that there's css_rstat_flush() in
css_free_rwork_fn()/css_rstat_exit().

> Usually the subsequent updater will take care of such situation but what
> if the skipped updater was the last updater before the cgroup is removed
> by the user. In that case stat updates by the skipped updater will be
> lost. To avoid that let's always flush the stats of the offlined cgroup.

Are you sure here that this is the different cause of the loss than the
other with unlocked cmpxchg you posted later?

> @@ -283,6 +283,16 @@ static struct cgroup_subsys_state *css_rstat_updated_list(
>  
>  	css_process_update_tree(root->ss, cpu);
>  
> +	/*
> +	 * We allow race between rstat updater and flusher which can cause a
> +	 * scenario where the updater skips adding the css to the list but the
> +	 * flusher might not see updater's updates. Usually the subsequent
> +	 * updater would take care of that but what if that was the last updater
> +	 * on that CPU before getting removed. Handle that scenario here.
> +	 */
> +	if (!css_is_online(root))
> +		__css_process_update_tree(root, cpu);
> +

I'm thinking about this approach:

@@ -482,6 +484,15 @@ void css_rstat_exit(struct cgroup_subsys_state *css)
        if (!css->rstat_cpu)
                return;

+       /*
+        * We allow race between rstat updater and flusher which can cause a
+        * scenario where the updater skips adding the css to the list but the
+        * flusher might not see updater's updates. Usually the subsequent
+        * updater would take care of that but what if that was the last updater
+        * on that CPU before getting removed. Handle that scenario here.
+        */
+       for_each_possible_cpu(cpu)
+               css_rstat_updated(css, cpu);
        css_rstat_flush(css);

        /* sanity check */

because that moves the special treating from relatively commonn
css_rstat_updated_list() to only cgroup_exit().

(I didn't check this wouldn't break anything.)

Thanks,
Michal

Download attachment "signature.asc" of type "application/pgp-signature" (266 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ