Date: Tue, 11 Dec 2018 19:17:28 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Daniel Borkmann <daniel@...earbox.net>
Cc: netdev@...r.kernel.org, sandipan@...ux.ibm.com,
mdroth@...ux.vnet.ibm.com
Subject: Re: [PATCH bpf v2] bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K
On Tue, Dec 11, 2018 at 12:14:12PM +0100, Daniel Borkmann wrote:
> Michael and Sandipan report:
>
> Commit ede95a63b5 introduced a bpf_jit_limit tuneable to limit BPF
> JIT allocations. At compile time it defaults to PAGE_SIZE * 40000,
> and is adjusted again at init time if MODULES_VADDR is defined.
>
> For ppc64 kernels, MODULES_VADDR isn't defined, so we're stuck with
> the compile-time default at boot-time, which is 0x9c400000 when
> using 64K page size. This overflows the signed 32-bit bpf_jit_limit
> value:
>
> root@...ntu:/tmp# cat /proc/sys/net/core/bpf_jit_limit
> -1673527296
>
> and can cause various unexpected failures throughout the network
> stack. In one case `strace dhclient eth0` reported:
>
> setsockopt(5, SOL_SOCKET, SO_ATTACH_FILTER, {len=11, filter=0x105dd27f8},
> 16) = -1 ENOTSUPP (Unknown error 524)
>
> and similar failures can be seen with tools like tcpdump. This doesn't
> always reproduce however, and I'm not sure why. The more consistent
> failure I've seen is an Ubuntu 18.04 KVM guest booted on a POWER9
> host would time out on systemd/netplan configuring a virtio-net NIC
> with no noticeable errors in the logs.
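
The numbers in the report above can be checked by hand. Below is a minimal
userspace sketch, not kernel code: PAGE_SIZE_64K, OLD_DEFAULT and
as_signed32() are names made up for this illustration. It assumes 64K pages
and the old PAGE_SIZE * 40000 default from the quoted text, and shows how
that value looks once it is squeezed into a signed 32-bit integer.

```c
#include <stdint.h>

/* Assumed from the report: 64K pages and the old compile-time default. */
#define PAGE_SIZE_64K  UINT64_C(65536)
#define OLD_DEFAULT    (PAGE_SIZE_64K * 40000)  /* 2621440000 == 0x9c400000 */

/*
 * Hypothetical helper: interpret the low 32 bits of v as a two's-complement
 * signed value, which is how a signed 32-bit sysctl would display it.
 * The conversion is done arithmetically to avoid relying on
 * implementation-defined narrowing casts.
 */
static int32_t as_signed32(uint64_t v)
{
    uint32_t low = (uint32_t)v;

    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    return (int32_t)((int64_t)low - (INT64_C(1) << 32));
}
```

Calling as_signed32(OLD_DEFAULT) yields -1673527296, the same value shown
by the cat of /proc/sys/net/core/bpf_jit_limit in the report.
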
>
> Given this, and given that in the near future some architectures like
> arm64 will have a custom area for BPF JIT image allocations, we should
> get rid of the BPF_JIT_LIMIT_DEFAULT fallback / default entirely. For
> 4.21, we have an overridable bpf_jit_alloc_exec() and bpf_jit_free_exec(),
> so add another overridable bpf_jit_alloc_exec_limit() helper function
> which returns the possible size of the memory area for deriving the
> default heuristic in bpf_jit_charge_init().
>
> Like bpf_jit_alloc_exec() and bpf_jit_free_exec(), the new
> bpf_jit_alloc_exec_limit() assumes that module_alloc() is the default
> JIT memory provider, and therefore in case archs implement their custom
> module_alloc() we use MODULES_{END,_VADDR} for limits and otherwise for
> vmalloc_exec() cases like on ppc64 we use VMALLOC_{END,_START}.
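
For readers following along, the idea of deriving the default from the size
of the allocation area can be sketched in userspace as below. This is only
an illustration under stated assumptions: region_size(), round_up_pow2() and
derive_jit_limit() are hypothetical names, the region bounds are passed in
rather than taken from MODULES_* or VMALLOC_*, and taking one quarter of the
region is an assumed heuristic that the quoted text does not spell out. The
result is clamped to LONG_MAX so that it fits a sysctl handled as long.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for the size reported by bpf_jit_alloc_exec_limit(). */
static uint64_t region_size(uint64_t start, uint64_t end)
{
    return end - start;
}

/* Round v up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t v, uint64_t align)
{
    return (v + align - 1) & ~(align - 1);
}

/*
 * Assumed heuristic: a quarter of the region, rounded up to the page size,
 * clamped to LONG_MAX so the value stays positive in a signed long.
 */
static long derive_jit_limit(uint64_t start, uint64_t end, uint64_t page_size)
{
    uint64_t limit = round_up_pow2(region_size(start, end) >> 2, page_size);

    if (limit > (uint64_t)LONG_MAX)
        limit = (uint64_t)LONG_MAX;
    return (long)limit;
}
```

With a hypothetical 1 GiB region and 64K pages, derive_jit_limit() returns
268435456 (256 MiB). Because of the clamp, the result is never negative,
however large the region is.
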
>
> Additionally, for archs supporting large page sizes, we should change
> the sysctl to be handled as long so that we do not run into sysctl
> restrictions in the future.
>
> Fixes: ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict unpriv allocations")
> Reported-by: Sandipan Das <sandipan@...ux.ibm.com>
> Reported-by: Michael Roth <mdroth@...ux.vnet.ibm.com>
> Signed-off-by: Daniel Borkmann <daniel@...earbox.net>

Applied, thanks.