netdev - Re: [PATCH net-next 1/2] bpf: improve htab inlining for future 32 bit jits

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <8c2b13a7-45d3-77c2-1ff4-25c183ed82ab@fb.com>
Date:   Fri, 18 Aug 2017 17:24:13 -0700
From:   Alexei Starovoitov <ast@...com>
To:     Daniel Borkmann <daniel@...earbox.net>, <davem@...emloft.net>
CC:     <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next 1/2] bpf: improve htab inlining for future 32 bit
 jits

On 8/18/17 5:21 PM, Daniel Borkmann wrote:
> On 08/19/2017 02:00 AM, Alexei Starovoitov wrote:
>> On 8/18/17 4:51 PM, Daniel Borkmann wrote:
>>> Lets future proof htab lookup inlining, commit 9015d2f59535 ("bpf:
>>> inline htab_map_lookup_elem()") was making the assumption that a
>>> direct call emission to __htab_map_lookup_elem() will always work
>>> out for JITs. This is currently true since all JITs we have are
>>> for 64 bit archs, but in case of 32 bit JITs like upcoming arm32,
>>> we get a NULL pointer dereference when executing the call to
>>> __htab_map_lookup_elem() since passed arguments are of a different
>>> size (unsigned long vs. u64 for pointers) than what we do out of
>>> BPF. Thus, lets do a proper BPF_CALL_2() declaration such that we
>>> don't need to make any such assumptions.
>>>
>>> Reported-by: Shubham Bansal <illusionist.neo@...il.com>
>>> Signed-off-by: Daniel Borkmann <daniel@...earbox.net>
>>
>> assuming on 64-bit archs the should be no perf difference
>> and only increase in .text, since __htab_map_lookup_elem
>> is now force inlined into a bunch of places?
>> I guess that's ok, but kinda sux for 64-bit archs to pay
>> such penalty because of 32-bit archs.
>
> Yeah true, text bumps from 11k to 13k, doesn't pay off.
>
>> May be drop always_inline and do such thing conditionally
>> on 32-bit archs only?
>
> I will guard with this instead:
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 4f6e7eb..e42c096 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -4160,7 +4160,11 @@ static int fixup_bpf_calls(struct
> bpf_verifier_env *env)
>                         continue;
>                 }
>
> -               if (ebpf_jit_enabled() && insn->imm ==
> BPF_FUNC_map_lookup_elem) {
> +               /* BPF_EMIT_CALL() assumptions in some of the
> map_gen_lookup
> +                * handlers are currently limited to 64 bit only.
> +                */
> +               if (ebpf_jit_enabled() && BITS_PER_LONG == 64 &&
> +                   insn->imm == BPF_FUNC_map_lookup_elem) {
>                         map_ptr = env->insn_aux_data[i + delta].map_ptr;
>                         if (map_ptr == BPF_MAP_PTR_POISON ||
>                             !map_ptr->ops->map_gen_lookup)

sure. looks good to me for now. We probably need to first measure
the perf gains out of inlining on 32-bit arm to go next step.
Thanks!