Message-ID: <58F550DC.3050200@iogearbox.net>
Date: Tue, 18 Apr 2017 01:33:48 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>,
David Miller <davem@...emloft.net>
CC: brouer@...hat.com, kubakici@...pl, netdev@...r.kernel.org,
xdp-newbies@...r.kernel.org
Subject: Re: [PATCH v3 net-next RFC] Generic XDP
On 04/18/2017 01:04 AM, Alexei Starovoitov wrote:
> On Mon, Apr 17, 2017 at 03:49:55PM -0400, David Miller wrote:
>> From: Jesper Dangaard Brouer <brouer@...hat.com>
>> Date: Sun, 16 Apr 2017 22:26:01 +0200
>>
>>> The bpf tail-call use-case is a very good example of why the
>>> verifier cannot deduce the needed HEADROOM upfront.
>>
>> This brings up a very interesting question for me.
>>
>> I notice that tail calls are implemented by JITs largely by skipping
>> over the prologue of that destination program.
>>
>> However, many JITs preload cached SKB values into fixed registers in
>> the prologue. But they only do this if the program being JITed needs
>> those values.
>>
>> So how can it work properly if a program that does not need the SKB
>> values tail calls into one that does?
>
> For x86 JIT it's fine, since caching of skb values is not part of the prologue:
>    emit_prologue(&prog);
>    if (seen_ld_abs)
>            emit_load_skb_data_hlen(&prog);
> and tail_call jumps into the next program as:
>    EMIT4(0x48, 0x83, 0xC0, PROLOGUE_SIZE);   /* add rax, prologue_size */
>    EMIT2(0xFF, 0xE0);                        /* jmp rax */
> whereas inside emit_prologue() we have:
>    BUILD_BUG_ON(cnt != PROLOGUE_SIZE);
>
> arm64 has similar prologue skipping code and it's even
> simpler than x86, since it doesn't try to optimize LD_ABS/IND in assembler
> and instead calls into bpf_load_pointer() from generated code,
> so no caching of skb values at all.
>
> s390 jit has partial skipping of prologue, since a bunch
> of registers are saved/restored during tail_call and it looks fine
> to me as well.

And ppc64 unwinds/tears down the stack of the prog before
jumping into the other program. Thus, no skipping of the other's
prologue; looks fine, too.
> It's very hard to extend test_bpf.ko with tail_calls, since maps need
> to be allocated and populated with file descriptors, which is
> not feasible to do from a .ko. Instead we need a user space based test
> for it. We've started building one in
> tools/testing/selftests/bpf/test_progs.c; many more tests need to be
> added. Thorough testing of tail_calls is on the todo list.