Message-ID: <ac686bdb-470b-07ab-2ef9-3d47fd06e6cd@gmail.com>
Date: Wed, 5 Jan 2022 10:32:48 -0800
From: Florian Fainelli <f.fainelli@...il.com>
To: Vladimir Oltean <vladimir.oltean@....com>, netdev@...r.kernel.org
Cc: "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Andrew Lunn <andrew@...n.ch>,
Vivien Didelot <vivien.didelot@...il.com>
Subject: Re: [PATCH v2 net-next 4/7] net: dsa: merge all bools of struct dsa_switch into a single u32

On 1/5/22 5:21 AM, Vladimir Oltean wrote:
> struct dsa_switch has 9 boolean properties, many of which are in fact
> set by drivers for custom behavior (vlan_filtering_is_global,
> needs_standalone_vlan_filtering, etc.). The binary layout of the
> structure could be improved. For example, the "bool setup" at the
> beginning introduces a gratuitous 7-byte hole in the first cache line.
>
> The change merges all boolean properties into bitfields of a u32, and
> places that u32 in the first cache line of the structure, since many
> bools are accessed from the data path (untag_bridge_pvid, vlan_filtering,
> vlan_filtering_is_global).
>
> We place this u32 after the existing ds->index, which is also 4 bytes in
> size. As a positive side effect, ds->tagger_data now fits into the first
> cache line too, because 4 bytes are saved.
>
> Before:
>
> pahole -C dsa_switch net/dsa/slave.o
> struct dsa_switch {
>         bool                          setup;                                /*     0     1 */
>
>         /* XXX 7 bytes hole, try to pack */
>
>         struct device *               dev;                                  /*     8     8 */
>         struct dsa_switch_tree *      dst;                                  /*    16     8 */
>         unsigned int                  index;                                /*    24     4 */
>
>         /* XXX 4 bytes hole, try to pack */
>
>         struct notifier_block         nb;                                   /*    32    24 */
>
>         /* XXX last struct has 4 bytes of padding */
>
>         void *                        priv;                                 /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         void *                        tagger_data;                          /*    64     8 */
>         struct dsa_chip_data *        cd;                                   /*    72     8 */
>         const struct dsa_switch_ops * ops;                                  /*    80     8 */
>         u32                           phys_mii_mask;                        /*    88     4 */
>
>         /* XXX 4 bytes hole, try to pack */
>
>         struct mii_bus *              slave_mii_bus;                        /*    96     8 */
>         unsigned int                  ageing_time_min;                      /*   104     4 */
>         unsigned int                  ageing_time_max;                      /*   108     4 */
>         struct dsa_8021q_context *    tag_8021q_ctx;                        /*   112     8 */
>         struct devlink *              devlink;                              /*   120     8 */
>         /* --- cacheline 2 boundary (128 bytes) --- */
>         unsigned int                  num_tx_queues;                        /*   128     4 */
>         bool                          vlan_filtering_is_global;             /*   132     1 */
>         bool                          needs_standalone_vlan_filtering;      /*   133     1 */
>         bool                          configure_vlan_while_not_filtering;   /*   134     1 */
>         bool                          untag_bridge_pvid;                    /*   135     1 */
>         bool                          assisted_learning_on_cpu_port;        /*   136     1 */
>         bool                          vlan_filtering;                       /*   137     1 */
>         bool                          pcs_poll;                             /*   138     1 */
>         bool                          mtu_enforcement_ingress;              /*   139     1 */
>         unsigned int                  num_lag_ids;                          /*   140     4 */
>         unsigned int                  max_num_bridges;                      /*   144     4 */
>
>         /* XXX 4 bytes hole, try to pack */
>
>         size_t                        num_ports;                            /*   152     8 */
>
>         /* size: 160, cachelines: 3, members: 27 */
>         /* sum members: 141, holes: 4, sum holes: 19 */
>         /* paddings: 1, sum paddings: 4 */
>         /* last cacheline: 32 bytes */
> };
>
> After:
>
> pahole -C dsa_switch net/dsa/slave.o
> struct dsa_switch {
>         struct device *               dev;                                  /*     0     8 */
>         struct dsa_switch_tree *      dst;                                  /*     8     8 */
>         unsigned int                  index;                                /*    16     4 */
>         u32                           setup:1;                              /*    20: 0  4 */
>         u32                           vlan_filtering_is_global:1;           /*    20: 1  4 */
>         u32                           needs_standalone_vlan_filtering:1;    /*    20: 2  4 */
>         u32                           configure_vlan_while_not_filtering:1; /*    20: 3  4 */
>         u32                           untag_bridge_pvid:1;                  /*    20: 4  4 */
>         u32                           assisted_learning_on_cpu_port:1;      /*    20: 5  4 */
>         u32                           vlan_filtering:1;                     /*    20: 6  4 */
>         u32                           pcs_poll:1;                           /*    20: 7  4 */
>         u32                           mtu_enforcement_ingress:1;            /*    20: 8  4 */
>
>         /* XXX 23 bits hole, try to pack */
>
>         struct notifier_block         nb;                                   /*    24    24 */
>
>         /* XXX last struct has 4 bytes of padding */
>
>         void *                        priv;                                 /*    48     8 */
>         void *                        tagger_data;                          /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         struct dsa_chip_data *        cd;                                   /*    64     8 */
>         const struct dsa_switch_ops * ops;                                  /*    72     8 */
>         u32                           phys_mii_mask;                        /*    80     4 */
>
>         /* XXX 4 bytes hole, try to pack */
>
>         struct mii_bus *              slave_mii_bus;                        /*    88     8 */
>         unsigned int                  ageing_time_min;                      /*    96     4 */
>         unsigned int                  ageing_time_max;                      /*   100     4 */
>         struct dsa_8021q_context *    tag_8021q_ctx;                        /*   104     8 */
>         struct devlink *              devlink;                              /*   112     8 */
>         unsigned int                  num_tx_queues;                        /*   120     4 */
>         unsigned int                  num_lag_ids;                          /*   124     4 */
>         /* --- cacheline 2 boundary (128 bytes) --- */
>         unsigned int                  max_num_bridges;                      /*   128     4 */
>
>         /* XXX 4 bytes hole, try to pack */
>
>         size_t                        num_ports;                            /*   136     8 */
>
>         /* size: 144, cachelines: 3, members: 27 */
>         /* sum members: 132, holes: 2, sum holes: 8 */
>         /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 23 bits */
>         /* paddings: 1, sum paddings: 4 */
>         /* last cacheline: 16 bytes */
> };
>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
Reviewed-by: Florian Fainelli <f.fainelli@...il.com>
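
For anyone who wants to reproduce the effect outside the kernel tree, here
is a minimal userspace sketch of the same idea. It is not part of the patch:
demo_before, demo_after, demo_bytes_saved() and demo_flags_roundtrip() are
hypothetical names, only a few of the fields are kept, and uint32_t stands
in for the kernel's u32. Exact offsets depend on the ABI, so the sketch only
relies on the reordered struct being smaller, not on specific numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cut-down "before" layout: a leading bool forces padding
 * in front of the first pointer, and the remaining bools sit far away
 * from the fields used on the data path. */
struct demo_before {
	bool setup;
	void *dev;
	void *dst;
	unsigned int index;
	void *priv;
	bool vlan_filtering;
	bool pcs_poll;
	unsigned int num_lag_ids;
};

/* Hypothetical cut-down "after" layout: every flag becomes a 1-bit
 * field of a single 32-bit word placed right after index, which fills
 * the slot that would otherwise be padding before the next pointer. */
struct demo_after {
	void *dev;
	void *dst;
	unsigned int index;
	uint32_t setup:1;
	uint32_t vlan_filtering:1;
	uint32_t pcs_poll:1;
	void *priv;
	unsigned int num_lag_ids;
};

/* Bytes saved by the reordering on the ABI this is compiled for. */
size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_before) - sizeof(struct demo_after);
}

/* The flags still behave like booleans for plain reads and writes. */
bool demo_flags_roundtrip(void)
{
	struct demo_after ds = { 0 };

	ds.vlan_filtering = true;
	return ds.vlan_filtering && !ds.setup && !ds.pcs_poll;
}
```

Running pahole -C demo_before and pahole -C demo_after on the resulting
object file should show the same kind of hole disappearing as in the output
quoted above. One thing the sketch does not show: bitfield members cannot
have their address taken, so any code doing &ds->some_flag would need to be
reworked.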
--
Florian