Message-ID: <2a049a97-d012-1d98-5308-93a91d6b3055@iogearbox.net>
Date: Fri, 1 Mar 2019 20:51:09 +0100
From: Daniel Borkmann <daniel@...earbox.net>
To: Yonghong Song <yhs@...com>, Alexei Starovoitov <ast@...com>
Cc: "bpf@...r.kernel.org" <bpf@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"joe@...d.net.nz" <joe@...d.net.nz>,
"john.fastabend@...il.com" <john.fastabend@...il.com>,
"tgraf@...g.ch" <tgraf@...g.ch>, Andrii Nakryiko <andriin@...com>,
"jakub.kicinski@...ronome.com" <jakub.kicinski@...ronome.com>,
"lmb@...udflare.com" <lmb@...udflare.com>
Subject: Re: [PATCH bpf-next v2 1/7] bpf: implement lookup-free direct value
access
On 03/01/2019 06:18 PM, Yonghong Song wrote:
> On 2/28/19 3:18 PM, Daniel Borkmann wrote:
>> This generic extension to BPF maps allows for directly loading an
>> address residing inside a BPF map value as a single BPF ldimm64
>> instruction.
>>
>> The idea is similar to what BPF_PSEUDO_MAP_FD does today, which
>> is a special src_reg flag for the ldimm64 instruction indicating
>> that the first part of the double insn's imm field holds a file
>> descriptor, which the verifier then replaces with the full 64-bit
>> address of the map, spread across both imm parts.
>>
>> For the newly added BPF_PSEUDO_MAP_VALUE src_reg flag, the idea
>> is similar: the first part of the double insn's imm field is
>> again a file descriptor corresponding to the map, and the second
>> part of the imm field is an offset. The verifier will then replace
>> both imm parts with an address that points into the BPF map value
>> for maps that support this operation. BPF_PSEUDO_MAP_VALUE is a
>> distinct flag since with BPF_PSEUDO_MAP_FD alone we could not
>> distinguish a load of the map's value at offset 0 from a load of
>> the map pointer itself.
>>
>> This allows for efficiently retrieving the address of a map value
>> memory area without having to issue a helper call, which would
>> require preparing registers according to the calling convention,
>> without needing the extra NULL test, and without having to add the
>> offset to the value base pointer in an additional instruction.
>>
>> The verifier then treats the destination register as PTR_TO_MAP_VALUE
>> with a constant reg->off taken from the user-passed offset in the
>> second imm field, and guarantees that this is within the bounds of
>> the map value. Any subsequent operations are then treated as normal
>> map value accesses without anything else needed for verification.
>>
>> The two map operations for direct value access have been added to
>> the array map for now. In the future, other types could be supported
>> as well, depending on the use case. The main use case for this
>> commit is to allow BPF loader support for global variables that
>> reside in .data/.rodata/.bss sections, such that we can directly
>> load their addresses with minimal additional infrastructure
>> required. Loader support has been added in subsequent commits for
>> the libbpf library.
>
> Patch version #1 provided a way to replace the load with an
> immediate (presumably read-only data). This would be good for
> use cases like the following:
>
> if (static_variable_kernel_version == V1) {
>         /* code here will work for kernel V1 */
>         ... access helpers available for V1 ...
> } else if (static_variable_kernel_version == V2) {
>         /* code here will work for kernel V2 */
>         ... access helpers available for V2 ...
> }
>
> The approach here does not replace the map value access with values
> from, e.g., a read-only section, for which libbpf could provide an
> interface to fill in data from user space.
>
> This may require a little more analysis, e.g.,
>
>         ptr = ld_imm64 from a readonly section
>         ...
>         *(u32 *)ptr;
>         *(u64 *)(ptr + 8);
>         ...
>
> Do you think we could do this in the kernel verifier, or should we
> push the whole read-only handling into user space?
And in your case static_variable_kernel_version would be determined
at runtime, for example, where you would then want to eliminate all
the other branches, right? Meaning, you'd need a way to turn this into
an imm load such that the verifier will detect these dead branches and
patch them out, which it should already be able to do. How would you
mark special vars like static_variable_kernel_version so that they get
different treatment from the rest, via some sort of builtin?
Potentially one could get away with doing this from the loader side if
it's simple enough, though one thing that would be good to avoid is
duplicating all the complex branch fixup logic etc. that we already
have in the kernel. Are you thinking of marking these via BTF in some
way such that the loader does inline replacement?
Thanks,
Daniel