Message-ID: <a5a84482-13ef-47d8-bf07-8017060a5d64@linux.dev>
Date: Sun, 26 Nov 2023 21:53:04 -0800
From: Yonghong Song <yonghong.song@...ux.dev>
To: Eduard Zingerman <eddyz87@...il.com>, Daniel Xu <dxu@...uu.xyz>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>,
Shuah Khan <shuah@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>, Alexei Starovoitov <ast@...nel.org>,
Steffen Klassert <steffen.klassert@...unet.com>, antony.antony@...unet.com,
Mykola Lysenko <mykolal@...com>, Martin KaFai Lau <martin.lau@...ux.dev>,
Song Liu <song@...nel.org>, John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...gle.com>,
Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
bpf <bpf@...r.kernel.org>,
"open list:KERNEL SELFTEST FRAMEWORK" <linux-kselftest@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>, devel@...ux-ipsec.org,
Network Development <netdev@...r.kernel.org>
Subject: Re: [PATCH ipsec-next v1 6/7] bpf: selftests: test_tunnel: Disable CO-RE relocations
On 11/27/23 12:44 AM, Yonghong Song wrote:
>
> On 11/26/23 8:52 PM, Eduard Zingerman wrote:
>> On Sun, 2023-11-26 at 18:04 -0600, Daniel Xu wrote:
>> [...]
>>>> Tbh I'm not sure. This test passes with preserve_static_offset
>>>> because it suppresses preserve_access_index. In general clang
>>>> translates bitfield access to a set of IR statements like:
>>>>
>>>> C:
>>>> struct foo {
>>>> unsigned _;
>>>> unsigned a:1;
>>>> ...
>>>> };
>>>> ... foo->a ...
>>>>
>>>> IR:
>>>> %a = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
>>>> %bf.load = load i8, ptr %a, align 4
>>>> %bf.clear = and i8 %bf.load, 1
>>>> %bf.cast = zext i8 %bf.clear to i32
>>>>
>>>> With preserve_static_offset the getelementptr+load are replaced by a
>>>> single statement which is preserved as-is till code generation,
>>>> thus load with align 4 is preserved.
>>>>
>>>> On the other hand, I'm not sure that clang guarantees that the loads
>>>> or stores used for bitfield access will always be aligned according
>>>> to verifier expectations.
>>>>
>>>> I think we should check if there are some clang knobs that prevent
>>>> generation of unaligned memory access. I'll take a look.
>>> Is there a reason to prefer fixing this in the compiler? I'm not
>>> opposed to it, but the downside of a compiler fix is that it takes
>>> years to propagate and sprinkles ifdefs into the code.
>>>
>>> Would it be possible to have an analogue of BPF_CORE_READ_BITFIELD()?
>> Well, the contraption below passes verification, tunnel selftest
>> appears to work. I might have messed up some shifts in the macro,
>> though.
>
> I didn't test it. But at a high level, it should work.
>
>>
>> Still, if clang were to pick an unlucky BYTE_{OFFSET,SIZE} for a
>> particular field, the access might be unaligned.
>
> clang should pick a sensible BYTE_SIZE/BYTE_OFFSET to meet the
> alignment requirement. This is also required for BPF_CORE_READ_BITFIELD.
>
>>
>> ---
>>
>> diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
>> b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
>> index 3065a716544d..41cd913ac7ff 100644
>> --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
>> +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
>> @@ -9,6 +9,7 @@
>> #include "vmlinux.h"
>> #include <bpf/bpf_helpers.h>
>> #include <bpf/bpf_endian.h>
>> +#include <bpf/bpf_core_read.h>
>> #include "bpf_kfuncs.h"
>> #include "bpf_tracing_net.h"
>> @@ -144,6 +145,38 @@ int ip6gretap_get_tunnel(struct __sk_buff *skb)
>> return TC_ACT_OK;
>> }
>> +#define BPF_CORE_WRITE_BITFIELD(s, field, new_val) ({ \
>> + void *p = (void *)s + __CORE_RELO(s, field, BYTE_OFFSET); \
>> + unsigned byte_size = __CORE_RELO(s, field, BYTE_SIZE); \
>> + unsigned lshift = __CORE_RELO(s, field, LSHIFT_U64); \
>> + unsigned rshift = __CORE_RELO(s, field, RSHIFT_U64); \
>> + unsigned bit_size = (rshift - lshift); \
>> + unsigned long long nval, val, hi, lo; \
>> + \
>> + asm volatile("" : "=r"(p) : "0"(p)); \
>
> Use asm volatile("" : "+r"(p)) ?
>
>> + \
>> + switch (byte_size) { \
>> + case 1: val = *(unsigned char *)p; break; \
>> + case 2: val = *(unsigned short *)p; break; \
>> + case 4: val = *(unsigned int *)p; break; \
>> + case 8: val = *(unsigned long long *)p; break; \
>> + } \
>> + hi = val >> (bit_size + rshift); \
>> + hi <<= bit_size + rshift; \
>> + lo = val << (bit_size + lshift); \
>> + lo >>= bit_size + lshift; \
>> + nval = new_val; \
>> + nval <<= lshift; \
>> + nval >>= rshift; \
>> + val = hi | nval | lo; \
>> + switch (byte_size) { \
>> + case 1: *(unsigned char *)p = val; break; \
>> + case 2: *(unsigned short *)p = val; break; \
>> + case 4: *(unsigned int *)p = val; break; \
>> + case 8: *(unsigned long long *)p = val; break; \
>> + } \
>> +})
>
> I think this should go in the libbpf public header files, but I'm not
> sure where. bpf_core_read.h, even though it is a CO-RE write?
>
> But on the other hand, this is a bitfield write to a uapi struct, so
> strictly speaking a CO-RE write is unnecessary here. It would be
> great if we could relieve users from dealing with such unnecessary
> CO-RE writes. In that sense, for this particular case, I would
> prefer rewriting the code to use byte-level stores...
or using preserve_static_offset, to make it clear that the intent is to undo the bitfield CO-RE relocation ...
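
To illustrate what a byte-level store could look like, here is a minimal userspace sketch. It is not taken from the patch or from libbpf: the helper name write_bits_in_byte() is hypothetical, and the byte offset, bit shift and width of the field are assumed to be known constants for the fixed uapi layout (they depend on the ABI's bitfield layout). The assert() is only there for the userspace illustration and would not be used in a BPF program.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libbpf or the kernel): replace `width`
 * bits, starting at bit `shift`, of the single byte at base + byte_off
 * with the low `width` bits of new_val.
 *
 * Only one 1-byte load and one 1-byte store are issued, so the access
 * cannot be misaligned.
 */
static inline void write_bits_in_byte(void *base, unsigned int byte_off,
				      unsigned int shift, unsigned int width,
				      uint8_t new_val)
{
	uint8_t *p = (uint8_t *)base + byte_off;
	uint8_t mask;

	/* The field must fit entirely inside one byte. */
	assert(width >= 1 && shift + width <= 8);

	/* Bits covered by the field, e.g. shift=4, width=2 gives 0x30. */
	mask = (uint8_t)(((1u << width) - 1u) << shift);

	/* Keep the neighbouring bits, then merge in the new field value. */
	*p = (uint8_t)((*p & ~mask) | ((new_val << shift) & mask));
}
```

For example, write_bits_in_byte(buf, 2, 4, 2, 0x3) sets bits 4-5 of buf[2] and leaves every other bit of the buffer untouched. A field that straddles a byte boundary would need a wider access or two such stores, which this sketch does not cover.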
[...]