Message-ID: <51893658-b798-c8da-fc9f-3e65f771eaec@solarflare.com>
Date: Tue, 9 Jan 2018 16:23:08 +0000
From: Edward Cree <ecree@...arflare.com>
To: Daniel Borkmann <daniel@...earbox.net>, <ast@...com>
CC: <netdev@...r.kernel.org>
Subject: Re: [PATCH bpf] bpf: avoid false sharing of map refcount with max_entries
On 09/01/18 12:17, Daniel Borkmann wrote:
> In addition to commit b2157399cc98 ("bpf: prevent out-of-bounds
> speculation") also change the layout of struct bpf_map such that
> false sharing of fast-path members like max_entries is avoided
> when the maps reference counter is altered. Therefore enforce
> them to be placed into separate cachelines.
>
> pahole dump after change:
>
> struct bpf_map {
>         const struct bpf_map_ops * ops;                  /*     0     8 */
>         struct bpf_map *           inner_map_meta;       /*     8     8 */
>         void *                     security;             /*    16     8 */
>         enum bpf_map_type          map_type;             /*    24     4 */
>         u32                        key_size;             /*    28     4 */
>         u32                        value_size;           /*    32     4 */
>         u32                        max_entries;          /*    36     4 */
>         u32                        map_flags;            /*    40     4 */
>         u32                        pages;                /*    44     4 */
>         u32                        id;                   /*    48     4 */
>         int                        numa_node;            /*    52     4 */
>         bool                       unpriv_array;         /*    56     1 */
>
>         /* XXX 7 bytes hole, try to pack */
>
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         struct user_struct *       user;                 /*    64     8 */
>         atomic_t                   refcnt;               /*    72     4 */
>         atomic_t                   usercnt;              /*    76     4 */
>         struct work_struct         work;                 /*    80    32 */
>         char                       name[16];             /*   112    16 */
>         /* --- cacheline 2 boundary (128 bytes) --- */
>
>         /* size: 128, cachelines: 2, members: 17 */
>         /* sum members: 121, holes: 1, sum holes: 7 */
> };
>
> Now all entries in the first cacheline are read only throughout
> the life time of the map, set up once during map creation. Overall
> struct size and number of cachelines doesn't change from the
> reordering. struct bpf_map is usually first member and embedded
> in map structs in specific map implementations, so also avoid those
> members to sit at the end where it could potentially share the
> cacheline with first map values e.g. in the array since remote
> CPUs could trigger map updates just as well for those (easily
> dirtying members like max_entries intentionally as well) while
> having subsequent values in cache.
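
For illustration, the layout idea described above can be sketched in plain
userspace C11. This is only a sketch under stated assumptions: struct
demo_map, DEMO_CACHELINE and demo_cacheline_of() are hypothetical names, the
64-byte line size is taken from the pahole dump, and in-kernel code would
normally rely on the ____cacheline_aligned annotation rather than alignas.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignas */
#include <stdatomic.h> /* atomic_int */
#include <stddef.h>    /* offsetof, size_t */
#include <stdint.h>    /* uint32_t */

/* Assumed cacheline size; matches the 64-byte boundary in the pahole dump. */
#define DEMO_CACHELINE 64

/* Hypothetical stand-in for struct bpf_map, not the kernel definition. */
struct demo_map {
	/* Read-only after creation: these stay together in cacheline 0. */
	uint32_t key_size;
	uint32_t value_size;
	uint32_t max_entries;

	/* Frequently written counters are forced onto the next cacheline. */
	alignas(DEMO_CACHELINE) atomic_int refcnt;
	atomic_int usercnt;
};

/* Index of the cacheline that contains a given byte offset. */
static inline size_t demo_cacheline_of(size_t offset)
{
	return offset / DEMO_CACHELINE;
}

/* Fail the build if the hot counter ever shares a line with max_entries. */
static_assert(offsetof(struct demo_map, max_entries) / DEMO_CACHELINE !=
	      offsetof(struct demo_map, refcnt) / DEMO_CACHELINE,
	      "max_entries and refcnt must not share a cacheline");
```

Because alignas also raises the alignment of the whole struct, its size is
rounded up to a multiple of the line size, so whatever is placed directly
after an embedded struct demo_map starts on a fresh cacheline as well.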
>
> Quoting from Goolge's Project Zero blog [1]:
Typo: "Goolge" should be "Google".