Message-ID: <6e8e18e4-3517-4c6c-8457-a4278b906f5d@hartkopp.net>
Date: Mon, 15 Sep 2025 19:41:56 +0200
From: Oliver Hartkopp <socketcan@...tkopp.net>
To: Vincent Mailhol <mailhol@...nel.org>,
 Marc Kleine-Budde <mkl@...gutronix.de>
Cc: linux-can@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/3] can: raw: use bitfields to store flags in struct
 raw_sock



On 15.09.25 12:47, Vincent Mailhol wrote:
> On 15/09/2025 at 19:16, Oliver Hartkopp wrote:
>> On 15.09.25 11:23, Vincent Mailhol wrote:
>>> The loopback, recv_own_msgs, fd_frames and xl_frames fields of struct
>>> raw_sock just need to store one bit of information.
>>>
>>> Declare all those members as bitfields of type unsigned int with a
>>> width of one bit.
>>>
>>> Add a temporary variable to raw_setsockopt() and raw_getsockopt() to
>>> convert between the stored bits and the socket interface.
>>>
>>> This reduces struct raw_sock by eight bytes.
>>>
>>> Statistics before:
>>>
>>>     $ pahole --class_name=raw_sock net/can/raw.o
>>>     struct raw_sock {
>>>         struct sock                sk __attribute__((__aligned__(8))); /*     0   776 */
>>>
>>>         /* XXX last struct has 1 bit hole */
>>>
>>>         /* --- cacheline 12 boundary (768 bytes) was 8 bytes ago --- */
>>>         int                        bound;                /*   776     4 */
>>>         int                        ifindex;              /*   780     4 */
>>>         struct net_device *        dev;                  /*   784     8 */
>>>         netdevice_tracker          dev_tracker;          /*   792     0 */
>>>         struct list_head           notifier;             /*   792    16 */
>>>         int                        loopback;             /*   808     4 */
>>>         int                        recv_own_msgs;        /*   812     4 */
>>>         int                        fd_frames;            /*   816     4 */
>>>         int                        xl_frames;            /*   820     4 */
>>>         struct can_raw_vcid_options raw_vcid_opts;       /*   824     4 */
>>>         canid_t                    tx_vcid_shifted;      /*   828     4 */
>>>         /* --- cacheline 13 boundary (832 bytes) --- */
>>>         canid_t                    rx_vcid_shifted;      /*   832     4 */
>>>         canid_t                    rx_vcid_mask_shifted; /*   836     4 */
>>>         int                        join_filters;         /*   840     4 */
>>>         int                        count;                /*   844     4 */
>>>         struct can_filter          dfilter;              /*   848     8 */
>>>         struct can_filter *        filter;               /*   856     8 */
>>>         can_err_mask_t             err_mask;             /*   864     4 */
>>>
>>>         /* XXX 4 bytes hole, try to pack */
>>>
>>>         struct uniqframe *         uniq;                 /*   872     8 */
>>>
>>>         /* size: 880, cachelines: 14, members: 20 */
>>>         /* sum members: 876, holes: 1, sum holes: 4 */
>>>         /* member types with bit holes: 1, total: 1 */
>>>         /* forced alignments: 1 */
>>>         /* last cacheline: 48 bytes */
>>>     } __attribute__((__aligned__(8)));
>>>
>>> ...and after:
>>>
>>>     $ pahole --class_name=raw_sock net/can/raw.o
>>>     struct raw_sock {
>>>         struct sock                sk __attribute__((__aligned__(8))); /*     0   776 */
>>>
>>>         /* XXX last struct has 1 bit hole */
>>>
>>>         /* --- cacheline 12 boundary (768 bytes) was 8 bytes ago --- */
>>>         int                        bound;                /*   776     4 */
>>>         int                        ifindex;              /*   780     4 */
>>>         struct net_device *        dev;                  /*   784     8 */
>>>         netdevice_tracker          dev_tracker;          /*   792     0 */
>>>         struct list_head           notifier;             /*   792    16 */
>>>         unsigned int               loopback:1;           /*   808: 0  4 */
>>>         unsigned int               recv_own_msgs:1;      /*   808: 1  4 */
>>>         unsigned int               fd_frames:1;          /*   808: 2  4 */
>>>         unsigned int               xl_frames:1;          /*   808: 3  4 */
>>
>> This means that the former data structures (int) are not copied but bits are set
>> (shifted, ANDed, ORed, etc) right?
>>
>> So what's the difference in the code the CPU has to process for this
>> improvement? Is implementing this bitmap more efficient or similar to copy the
>> (unsigned ints) as-is?
> 
> It will indeed have to add a couple assembly instructions. But this is peanuts.
> In the best case, the out of order execution might very well optimize this so
> that not even a CPU tick is wasted. In the worst case, it is a couple CPU ticks.
> 
> On the other hand, reducing the size by 16 bytes lowers the risk of a
> cache miss. And removing one cache miss outweighs by an order of magnitude the
> penalty of adding a couple assembly instructions.
> 
> Well, I did not benchmark it, but this is a commonly accepted trade-off.

Ok.
Most accesses to those values, such as ro->fd_frames, are reads anyway,
which might add an additional AND operation with a constant value.

Therefore the extra work for writing the bits is not in the hot path: the
ro->fd_frames = !!flag operation is executed at socket creation time only.
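
A minimal user-space sketch of the pattern discussed above, not the kernel code from the patch: the struct names flags_as_int and flags_as_bits and the helpers set_fd_frames() and get_fd_frames() are hypothetical stand-ins for the four flag members and for the temporary-variable handling in raw_setsockopt() and raw_getsockopt(). A temporary int is presumably needed there because C does not allow taking the address of a bitfield, so the option value cannot be copied into the member directly.

```c
#include <assert.h>

/* Hypothetical stand-in: the four flags stored as full ints. */
struct flags_as_int {
	int loopback;
	int recv_own_msgs;
	int fd_frames;
	int xl_frames;
};

/* Hypothetical stand-in: the same flags as one-bit bitfields. */
struct flags_as_bits {
	unsigned int loopback:1;
	unsigned int recv_own_msgs:1;
	unsigned int fd_frames:1;
	unsigned int xl_frames:1;
};

/* The exact layout is implementation-defined, but the bitfield
 * variant is expected to be smaller on mainstream ABIs.
 */
static_assert(sizeof(struct flags_as_bits) < sizeof(struct flags_as_int),
	      "bitfield variant should be smaller");

/* Write path: copy the option value into a temporary int first, then
 * normalize it to 0 or 1 with !! before storing it in the bitfield.
 * The compiler turns the store into a read-modify-write of the
 * containing word.
 */
void set_fd_frames(struct flags_as_bits *ro, const int *optval)
{
	int flag = *optval;

	ro->fd_frames = !!flag;
}

/* Read path: the compiler extracts the bit (typically a shift and/or
 * an AND with a constant) and widens it to the int the caller expects.
 */
int get_fd_frames(const struct flags_as_bits *ro)
{
	int flag = ro->fd_frames;

	return flag;
}
```

The !! matters once the member is one bit wide: an option value of 42 assigned directly to fd_frames would keep only its lowest bit and store 0, while !!42 stores 1.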

Generally it is interesting that the compiler can handle bits in this way.

Acked-by: Oliver Hartkopp <socketcan@...tkopp.net>

Thanks!

Oliver

> 
> Yours sincerely,
> Vincent Mailhol
> 

