netdev - Re: [PATCH bpf] bpf: avoid false sharing of map refcount with max

Message-ID: <721cc2e2-e03c-5763-9e6d-bffc4a6b772a@iogearbox.net>
Date:   Tue, 9 Jan 2018 17:40:19 +0100
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Edward Cree <ecree@...arflare.com>, ast@...com
Cc:     netdev@...r.kernel.org
Subject: Re: [PATCH bpf] bpf: avoid false sharing of map refcount with
 max_entries

On 01/09/2018 05:23 PM, Edward Cree wrote:
> On 09/01/18 12:17, Daniel Borkmann wrote:
>> In addition to commit b2157399cc98 ("bpf: prevent out-of-bounds
>> speculation") also change the layout of struct bpf_map such that
>> false sharing of fast-path members like max_entries is avoided
>> when the maps reference counter is altered. Therefore enforce
>> them to be placed into separate cachelines.
>>
>> pahole dump after change:
>>
>>   struct bpf_map {
>>         const struct bpf_map_ops  * ops;                 /*     0     8 */
>>         struct bpf_map *           inner_map_meta;       /*     8     8 */
>>         void *                     security;             /*    16     8 */
>>         enum bpf_map_type          map_type;             /*    24     4 */
>>         u32                        key_size;             /*    28     4 */
>>         u32                        value_size;           /*    32     4 */
>>         u32                        max_entries;          /*    36     4 */
>>         u32                        map_flags;            /*    40     4 */
>>         u32                        pages;                /*    44     4 */
>>         u32                        id;                   /*    48     4 */
>>         int                        numa_node;            /*    52     4 */
>>         bool                       unpriv_array;         /*    56     1 */
>>
>>         /* XXX 7 bytes hole, try to pack */
>>
>>         /* --- cacheline 1 boundary (64 bytes) --- */
>>         struct user_struct *       user;                 /*    64     8 */
>>         atomic_t                   refcnt;               /*    72     4 */
>>         atomic_t                   usercnt;              /*    76     4 */
>>         struct work_struct         work;                 /*    80    32 */
>>         char                       name[16];             /*   112    16 */
>>         /* --- cacheline 2 boundary (128 bytes) --- */
>>
>>         /* size: 128, cachelines: 2, members: 17 */
>>         /* sum members: 121, holes: 1, sum holes: 7 */
>>   };
>>
>> Now all entries in the first cacheline are read only throughout
>> the life time of the map, set up once during map creation. Overall
>> struct size and number of cachelines doesn't change from the
>> reordering. struct bpf_map is usually first member and embedded
>> in map structs in specific map implementations, so also avoid those
>> members to sit at the end where it could potentially share the
>> cacheline with first map values e.g. in the array since remote
>> CPUs could trigger map updates just as well for those (easily
>> dirtying members like max_entries intentionally as well) while
>> having subsequent values in cache.
>>
>> Quoting from Goolge's Project Zero blog [1]:
> typo "Goolge".

Sigh, thanks for catching! Alexei, let me know if you need a resend, or
if you would rather just amend the commit message and fix up the typo.
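
For readers following the layout argument in the quoted patch description, here is a minimal userspace sketch of the same idea: keep the fields that are only read after creation in one cacheline, and force the frequently written counters onto the next one. This is not the kernel code; struct demo_map, DEMO_CACHELINE and DEMO_LINE_OF are hypothetical names, the 64-byte line size is an assumption, and C11 alignas stands in for whatever annotation the kernel uses.

```c
#include <assert.h>   /* static_assert */
#include <stdalign.h> /* alignas */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Assumed cacheline size; real hardware may differ. */
#define DEMO_CACHELINE 64

/* Hypothetical, simplified stand-in for a map header. */
struct demo_map {
	/* Set once at creation, then only read on the fast path. */
	const void *ops;
	uint32_t key_size;
	uint32_t value_size;
	uint32_t max_entries;
	uint32_t map_flags;

	/*
	 * Written whenever a reference is taken or dropped, so start a
	 * new cacheline here to keep writes away from the fields above.
	 */
	alignas(DEMO_CACHELINE) int refcnt;
	int usercnt;
	char name[16];
};

/* Index of the cacheline a member starts in, relative to the struct. */
#define DEMO_LINE_OF(member) \
	(offsetof(struct demo_map, member) / DEMO_CACHELINE)

/* Compile-time checks: the build fails if the layout regresses. */
static_assert(DEMO_LINE_OF(max_entries) == 0,
	      "read-only fields stay in the first cacheline");
static_assert(DEMO_LINE_OF(refcnt) == 1,
	      "refcnt starts the second cacheline");
static_assert(DEMO_LINE_OF(max_entries) != DEMO_LINE_OF(usercnt),
	      "counters do not share a line with max_entries");
static_assert(sizeof(struct demo_map) % DEMO_CACHELINE == 0,
	      "struct ends on a cacheline boundary");
```

Compiling the file with a C11 compiler is the check in itself. Because the struct size is a multiple of the line size, data placed directly after an embedded demo_map would also start on a fresh cacheline, which is the second concern raised in the patch description.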