Message-ID: <c08a5ce5-cc82-97ae-40cd-8f8bdd8a5668@fb.com>
Date: Wed, 9 Oct 2019 17:46:16 +0000
From: Alexei Starovoitov <ast@...com>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>,
Alexei Starovoitov <ast@...nel.org>
CC: "David S. Miller" <davem@...emloft.net>,
Daniel Borkmann <daniel@...earbox.net>,
"x86@...nel.org" <x86@...nel.org>,
Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH bpf-next 07/10] bpf: add support for BTF pointers to x86
JIT

On 10/9/19 10:38 AM, Andrii Nakryiko wrote:
> On Fri, Oct 4, 2019 at 10:04 PM Alexei Starovoitov <ast@...nel.org> wrote:
>>
>> A pointer to a BTF object is a pointer to a kernel object or NULL.
>> Such pointers can only be used by BPF_LDX instructions.
>> The verifier changed their opcode from LDX|MEM|size
>> to LDX|PROBE_MEM|size to make JITing easier.
>> The number of entries in the extable is the number of BPF_LDX insns
>> that access kernel memory via a "pointer to BTF type".
>> Only these load instructions can fault.
>> Since the x86 extable is relative, it has to be allocated in the same
>> memory region as the JITed code.
>> Allocate it prior to the last pass of JITing and let the last pass populate it.
>> A pointer to the extable in bpf_prog_aux is necessary to make page fault
>> handling fast.
>> Page fault handling is done in two steps:
>> 1. bpf_prog_kallsyms_find() finds the BPF program that page faulted.
>> It's done by walking the rb tree.
>> 2. Then the extable for the given BPF program is binary searched.
>> This process is similar to how page faulting is done for kernel modules.
>> The exception handler skips over the faulting x86 instruction and
>> initializes the destination register with zero. This mimics the exact
>> behavior of bpf_probe_read (when probe_kernel_read faults, dest is zeroed).
>>
>> JITs for other architectures can add support in a similar way.
>> Until then they will reject the unknown opcode and fall back to the interpreter.
>>
>> Signed-off-by: Alexei Starovoitov <ast@...nel.org>
>> ---
>> arch/x86/net/bpf_jit_comp.c | 96 +++++++++++++++++++++++++++++++++++--
>> include/linux/bpf.h | 3 ++
>> include/linux/extable.h | 10 ++++
>> kernel/bpf/core.c | 20 +++++++-
>> kernel/bpf/verifier.c | 1 +
>> kernel/extable.c | 2 +
>> 6 files changed, 127 insertions(+), 5 deletions(-)
>>
>
> This is surprisingly easy to follow :) Looks good overall, just one
> concern about 32-bit distance between ex_handler_bpf and BPF jitted
> program below. And I agree with Eric, probably need to ensure proper
> alignment for exception_table_entry array.
already fixed.
> [...]
>
>> @@ -805,6 +835,48 @@ stx: if (is_imm8(insn->off))
>> else
>> EMIT1_off32(add_2reg(0x80, src_reg, dst_reg),
>> insn->off);
>> + if (BPF_MODE(insn->code) == BPF_PROBE_MEM) {
>> + struct exception_table_entry *ex;
>> + u8 *_insn = image + proglen;
>> + s64 delta;
>> +
>> + if (!bpf_prog->aux->extable)
>> + break;
>> +
>> + if (excnt >= bpf_prog->aux->num_exentries) {
>> + pr_err("ex gen bug\n");
>
> This should never happen, right? BUG()?
Absolutely not. No BUG()s in the kernel for things like this.
If the kernel can continue, it should.
>> + return -EFAULT;
>> + }
>> + ex = &bpf_prog->aux->extable[excnt++];
>> +
>> + delta = _insn - (u8 *)&ex->insn;
>> + if (!is_simm32(delta)) {
>> + pr_err("extable->insn doesn't fit into 32-bit\n");
>> + return -EFAULT;
>> + }
>> + ex->insn = delta;
>> +
>> + delta = (u8 *)ex_handler_bpf - (u8 *)&ex->handler;
>
> how likely is it that the global ex_handler_bpf will be close enough to
> the dynamically allocated exception_table_entry array?
99.9%, since we rely on that in other places in the JIT.
See BPF_CALL, for example.
But I'd like to keep the check, just in case.
Same as in BPF_CALL.