Message-ID: <20190131183531.d466egde46lywzwa@ast-mbp.dhcp.thefacebook.com>
Date: Thu, 31 Jan 2019 10:35:33 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Martynas Pumputis <m@...bda.lt>
Cc: netdev@...r.kernel.org, ast@...nel.org, daniel@...earbox.net
Subject: Re: [PATCH bpf-next] bpf: add optional memory accounting for maps

On Wed, Jan 30, 2019 at 03:02:51PM +0100, Martynas Pumputis wrote:
> Previously, memory allocated for a map was not accounted. Therefore,
> this memory could not be taken into consideration by the cgroups
> memory controller.
>
> This patch introduces the "BPF_F_ACCOUNT_MEM" flag which enables
> the memory accounting for a map, and it can be set during
> the map creation ("BPF_MAP_CREATE") in "map_flags".
>
> When enabled, we account only the amount of memory that is charged
> against the "RLIMIT_MEMLOCK" limit.
>
> To validate the change, first we create the memory cgroup "test-map":
>
> # mkdir /sys/fs/cgroup/memory/test-map
>
> And then we run the following program against the cgroup:
>
> $ cat test_map.c
> <..>
> int main() {
>     usleep(3 * 1000000);
>     assert(bpf_create_map(BPF_MAP_TYPE_HASH, 8, 16, 65536, 0) > 0);
>     usleep(3 * 1000000);
> }
> # cgexec -g memory:test-map ./test_map &
> # cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
> 397312
> 258048
>
> <after 3 sec the map has been created>
>
> # bpftool map list
> 19: hash flags 0x0
>     key 8B value 16B max_entries 65536 memlock 5771264B
> # cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
> 401408
> 262144
>
> As we can see, the memory allocated for the map is not accounted, as
> 397312B + 5771264B > 401408B.
>
> Next, we enable the accounting and re-run the test:
>
> $ cat test_map.c
> <..>
> int main() {
>     usleep(3 * 1000000);
>     assert(bpf_create_map(BPF_MAP_TYPE_HASH, 8, 16, 65536, BPF_F_ACCOUNT_MEM) > 0);
>     usleep(3 * 1000000);
> }
> # cgexec -g memory:test-map ./test_map &
> # cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
> 450560
> 307200
>
> <after 3 sec the map has been created>
>
> # bpftool map list
> 20: hash flags 0x80
>     key 8B value 16B max_entries 65536 memlock 5771264B
> # cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
> 6221824
> 6078464
>
> This time, the memory (including kmem) is accounted, as
> 450560B + 5771264B <= 6221824B
>
> Signed-off-by: Martynas Pumputis <m@...bda.lt>
...
> @@ -49,7 +51,9 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
>
>      err = -ENOMEM;
>
> -    m->flush_list = alloc_percpu(struct list_head);
> +    if (account_mem)
> +        gfp |= __GFP_ACCOUNT;
> +    m->flush_list = alloc_percpu_gfp(struct list_head, gfp);

I think it's better to account this memory by default.
An extra flag during map creation is not needed.
There are already nokmem and nosocket memcg boot options
(cgroup.memory=nokmem and cgroup.memory=nosocket).
We can add one more to turn off accounting of bpf map memory.
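
To make the boot-option idea concrete, here is a rough user-space sketch of how one more opt-out token could sit next to nokmem and nosocket. It is only an illustration under assumptions: the "nobpf" token, struct memcg_opts, parse_memcg_opts() and bpf_map_should_account() are hypothetical names made up for this example, not existing kernel symbols or options.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical set of accounting opt-outs, one flag per boot token. */
struct memcg_opts {
    bool nosocket;
    bool nokmem;
    bool nobpf;    /* hypothetical token, not an existing kernel option */
};

/*
 * Parse a comma-separated option string such as "nokmem,nobpf".
 * Unknown and empty tokens are ignored. The buffer is modified in place.
 */
static void parse_memcg_opts(char *s, struct memcg_opts *o)
{
    while (s != NULL && *s != '\0') {
        char *comma = strchr(s, ',');

        if (comma != NULL)
            *comma = '\0';

        if (strcmp(s, "nosocket") == 0)
            o->nosocket = true;
        else if (strcmp(s, "nokmem") == 0)
            o->nokmem = true;
        else if (strcmp(s, "nobpf") == 0)
            o->nobpf = true;

        s = (comma != NULL) ? comma + 1 : NULL;
    }
}

/* Accounting is on by default and only turned off by the opt-out token. */
static bool bpf_map_should_account(const struct memcg_opts *o)
{
    return !o->nobpf;
}
```

With something like this, map allocation code would not need a per-map flag: it would add __GFP_ACCOUNT whenever bpf_map_should_account() returns true, and an admin who does not want the accounting would opt out once at boot.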