Message-ID: <721cc2e2-e03c-5763-9e6d-bffc4a6b772a@iogearbox.net>
Date: Tue, 9 Jan 2018 17:40:19 +0100
From: Daniel Borkmann <daniel@...earbox.net>
To: Edward Cree <ecree@...arflare.com>, ast@...com
Cc: netdev@...r.kernel.org
Subject: Re: [PATCH bpf] bpf: avoid false sharing of map refcount with max_entries
On 01/09/2018 05:23 PM, Edward Cree wrote:
> On 09/01/18 12:17, Daniel Borkmann wrote:
>> In addition to commit b2157399cc98 ("bpf: prevent out-of-bounds
>> speculation") also change the layout of struct bpf_map such that
>> false sharing of fast-path members like max_entries is avoided
>> when the map's reference counter is altered. Therefore, force
>> them into separate cachelines.
>>
>> pahole dump after change:
>>
>> struct bpf_map {
>>     const struct bpf_map_ops * ops;      /*     0     8 */
>>     struct bpf_map * inner_map_meta;     /*     8     8 */
>>     void * security;                     /*    16     8 */
>>     enum bpf_map_type map_type;          /*    24     4 */
>>     u32 key_size;                        /*    28     4 */
>>     u32 value_size;                      /*    32     4 */
>>     u32 max_entries;                     /*    36     4 */
>>     u32 map_flags;                       /*    40     4 */
>>     u32 pages;                           /*    44     4 */
>>     u32 id;                              /*    48     4 */
>>     int numa_node;                       /*    52     4 */
>>     bool unpriv_array;                   /*    56     1 */
>>
>>     /* XXX 7 bytes hole, try to pack */
>>
>>     /* --- cacheline 1 boundary (64 bytes) --- */
>>     struct user_struct * user;           /*    64     8 */
>>     atomic_t refcnt;                     /*    72     4 */
>>     atomic_t usercnt;                    /*    76     4 */
>>     struct work_struct work;             /*    80    32 */
>>     char name[16];                       /*   112    16 */
>>     /* --- cacheline 2 boundary (128 bytes) --- */
>>
>>     /* size: 128, cachelines: 2, members: 17 */
>>     /* sum members: 121, holes: 1, sum holes: 7 */
>> };
>>
>> Now all entries in the first cacheline are read-only throughout
>> the lifetime of the map, set up once during map creation. The
>> overall struct size and number of cachelines do not change with
>> the reordering. struct bpf_map is usually the first member embedded
>> in the map structs of specific map implementations, so also avoid
>> placing those members at the end, where they could share a
>> cacheline with the first map values, e.g. in the array: remote
>> CPUs could trigger map updates for those just as well (easily and
>> intentionally dirtying members like max_entries) while having
>> subsequent values in cache.
>>
>> Quoting from Goolge's Project Zero blog [1]:
> typo "Goolge".
Sigh, thanks for catching! Alexei, let me know whether you need a resend or
would rather just amend the message and fix up the typo.
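
For readers following the layout change quoted above, here is a minimal
userspace sketch of the same idea: read-mostly fields stay in the first
cacheline, and the frequently written counters are pushed to the start of
the next one. Everything in it is an assumption for illustration only:
struct demo_map, DEMO_CACHELINE and demo_same_line() are hypothetical
names, the 64-byte line size is taken from the pahole dump, and C11
alignas stands in for whatever annotation the actual patch uses.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cachelines, matching the pahole dump above. */
#define DEMO_CACHELINE 64

/* Hypothetical, trimmed-down stand-in for struct bpf_map. */
struct demo_map {
    /* Read-mostly fields, set up once at creation time. */
    const void *ops;
    uint32_t key_size;
    uint32_t value_size;
    uint32_t max_entries;

    /* Frequently written fields; alignas forces refcnt onto a new cacheline. */
    alignas(DEMO_CACHELINE) int refcnt;
    int usercnt;
};

/* Returns true if two byte offsets fall into the same cacheline. */
static bool demo_same_line(size_t a, size_t b)
{
    return a / DEMO_CACHELINE == b / DEMO_CACHELINE;
}

/* Compile-time check: the written counter starts exactly on a line boundary. */
static_assert(offsetof(struct demo_map, refcnt) % DEMO_CACHELINE == 0,
              "refcnt must start a cacheline");
```

Calling demo_same_line() with offsetof() values for max_entries and refcnt
reports that the two no longer share a line, which is the property the
commit message is after; pahole on a real build remains the authoritative
check.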