Message-ID: <b3eb050d-9451-4b60-b06c-ace7dab57497@embeddedor.com>
Date: Sat, 30 Aug 2025 15:30:11 +0200
From: "Gustavo A. R. Silva" <gustavo@...eddedor.com>
To: Tejun Heo <tj@...nel.org>, Johannes Weiner <hannes@...xchg.org>,
 Michal Koutný <mkoutny@...e.com>
Cc: cgroups@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
 linux-hardening@...r.kernel.org, "Gustavo A. R. Silva"
 <gustavoars@...nel.org>
Subject: [RFC] cgroup: Avoid thousands of -Wflex-array-member-not-at-end
 warnings

Hi all,

I'm working on enabling -Wflex-array-member-not-at-end in mainline, and
I ran into thousands (yes, 14722 to be precise) of these warnings caused
by an instance of `struct cgroup` in the middle of `struct cgroup_root`.
See below:

struct cgroup_root {
	...
        /*
         * The root cgroup. The containing cgroup_root will be destroyed on its
         * release. cgrp->ancestors[0] will be used overflowing into the
         * following field. cgrp_ancestor_storage must immediately follow.
         */
        struct cgroup cgrp;

        /* must follow cgrp for cgrp->ancestors[0], see above */
        struct cgroup *cgrp_ancestor_storage;
	...
};

Based on the comments above, it seems that the original code expects
cgrp->ancestors[0] and cgrp_ancestor_storage to share the same address in
memory.

However, when I look at the pahole output, I see that these two members
are actually misaligned by 56 bytes. See below:

struct cgroup_root {
	...

	/* --- cacheline 1 boundary (64 bytes) --- */
	struct cgroup              cgrp __attribute__((__aligned__(64))); /*    64  2112 */

	/* XXX last struct has 56 bytes of padding */

	/* --- cacheline 34 boundary (2176 bytes) --- */
	struct cgroup *            cgrp_ancestor_storage; /*  2176     8 */

	...

	/* size: 6400, cachelines: 100, members: 11 */
	/* sum members: 6336, holes: 1, sum holes: 16 */
	/* padding: 48 */
	/* paddings: 1, sum paddings: 56 */
	/* forced alignments: 1, forced holes: 1, sum forced holes: 16 */
} __attribute__((__aligned__(64)));

This is because struct cgroup has some trailing padding after the
flexible-array member `ancestors`, due to its alignment to 64 bytes. See below:

struct cgroup {
	...

	struct cgroup *            ancestors[];          /*  2056     0 */

	/* size: 2112, cachelines: 33, members: 43 */
	/* sum members: 2000, holes: 3, sum holes: 56 */
	/* padding: 56 */
	/* paddings: 2, sum paddings: 8 */
	/* forced alignments: 1 */
} __attribute__((__aligned__(64)));

The offset of `ancestors` is 2056, but sizeof(struct cgroup) == 2112 due
to the 56 bytes of trailing padding. This looks buggy.

So, one solution for this is to use the TRAILING_OVERLAP() helper and
move these members to the end of `struct cgroup_root`. With this, the
misalignment disappears (together with the 14722 warnings :) ), and
both cgrp->ancestors[0] and cgrp_ancestor_storage now share the same
address in memory. See below:

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 539c64eeef38..901a46f70a02 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -630,16 +630,6 @@ struct cgroup_root {
         struct list_head root_list;
         struct rcu_head rcu;    /* Must be near the top */

-       /*
-        * The root cgroup. The containing cgroup_root will be destroyed on its
-        * release. cgrp->ancestors[0] will be used overflowing into the
-        * following field. cgrp_ancestor_storage must immediately follow.
-        */
-       struct cgroup cgrp;
-
-       /* must follow cgrp for cgrp->ancestors[0], see above */
-       struct cgroup *cgrp_ancestor_storage;
-
         /* Number of cgroups in the hierarchy, used only for /proc/cgroups */
         atomic_t nr_cgrps;

@@ -651,6 +641,18 @@ struct cgroup_root {

         /* The name for this hierarchy - may be empty */
         char name[MAX_CGROUP_ROOT_NAMELEN];
+
+       /*
+        * The root cgroup. The containing cgroup_root will be destroyed on its
+        * release. cgrp->ancestors[0] will be used overflowing into the
+        * following field. cgrp_ancestor_storage must immediately follow.
+        *
+        * Must be last -- ends in a flexible-array member.
+        */
+       TRAILING_OVERLAP(struct cgroup, cgrp, ancestors,
+               /* must follow cgrp for cgrp->ancestors[0], see above */
+               struct cgroup *cgrp_ancestor_storage;
+       );
  };

However, this causes the size of struct cgroup_root to increase from 6400
bytes to 16384 bytes, because struct cgroup ends up aligned to the page
size (4096 bytes). See below:

struct cgroup_root {
	struct kernfs_root *       kf_root;              /*     0     8 */
	unsigned int               subsys_mask;          /*     8     4 */
	int                        hierarchy_id;         /*    12     4 */
	struct list_head           root_list;            /*    16    16 */
	struct callback_head       rcu __attribute__((__aligned__(8))); /*    32    16 */
	atomic_t                   nr_cgrps;             /*    48     4 */
	unsigned int               flags;                /*    52     4 */
	char                       name[64];             /*    56    64 */
	/* --- cacheline 1 boundary (64 bytes) was 56 bytes ago --- */
	char                       release_agent_path[4096]; /*   120  4096 */

	/* XXX 3976 bytes hole, try to pack */

	/* --- cacheline 128 boundary (8192 bytes) --- */
	union {
		struct cgroup      cgrp __attribute__((__aligned__(4096))); /*  8192  8192 */
		struct {
			unsigned char __offset_to_ancestors[5784]; /*  8192  5784 */
			/* --- cacheline 218 boundary (13952 bytes) was 24 bytes ago --- */
			struct cgroup * cgrp_ancestor_storage; /* 13976     8 */
		};                                       /*  8192  5792 */
	} __attribute__((__aligned__(4096)));            /*  8192  8192 */

	/* size: 16384, cachelines: 256, members: 10 */
	/* sum members: 12408, holes: 1, sum holes: 3976 */
	/* forced alignments: 2, forced holes: 1, sum forced holes: 3976 */
} __attribute__((__aligned__(4096)));

I've also tried the struct_group_tagged()/container_of() technique:

https://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux.git/commit/?h=testing/wfamnae-next20250723&id=03da6b0772af1a62778400f26fe57796fe1ebf27

but cgroup_root grows to 20K in that case.

So, I guess my question here is: what do you think?

Thanks!
-Gustavo
