Message-ID: <a66c937f-94c0-eaf8-5b37-8587d66c0c62@fb.com>
Date: Mon, 1 Jul 2019 17:40:18 +0000
From: Yonghong Song <yhs@...com>
To: Stanislav Fomichev <sdf@...ichev.me>,
Andrii Nakryiko <andrii.nakryiko@...il.com>
CC: Stanislav Fomichev <sdf@...gle.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"ast@...nel.org" <ast@...nel.org>,
"daniel@...earbox.net" <daniel@...earbox.net>,
"Andrii Nakryiko" <andriin@...com>,
kernel test robot <rong.a.chen@...el.com>
Subject: Re: [PATCH bpf-next 1/2] bpf: allow wide (u64) aligned stores for
some fields of bpf_sock_addr
On 7/1/19 9:04 AM, Stanislav Fomichev wrote:
> On 07/01, Andrii Nakryiko wrote:
>> On Sat, Jun 29, 2019 at 10:53 PM Yonghong Song <yhs@...com> wrote:
>>>
>>>
>>>
>>> On 6/28/19 4:10 PM, Stanislav Fomichev wrote:
>>>> Since commit cd17d7770578 ("bpf/tools: sync bpf.h") clang decided
>>>> that it can do a single u64 store into user_ip6[2] instead of two
>>>> separate u32 ones:
>>>>
>>>> # 17: (18) r2 = 0x100000000000000
>>>> # ; ctx->user_ip6[2] = bpf_htonl(DST_REWRITE_IP6_2);
>>>> # 19: (7b) *(u64 *)(r1 +16) = r2
>>>> # invalid bpf_context access off=16 size=8
>>>>
>>>> From the compiler point of view it does look like a correct thing
>>>> to do, so let's support it on the kernel side.
>>>>
>>>> Credit to Andrii Nakryiko for a proper implementation of
>>>> bpf_ctx_wide_store_ok.
>>>>
>>>> Cc: Andrii Nakryiko <andriin@...com>
>>>> Cc: Yonghong Song <yhs@...com>
>>>> Fixes: cd17d7770578 ("bpf/tools: sync bpf.h")
>>>> Reported-by: kernel test robot <rong.a.chen@...el.com>
>>>> Signed-off-by: Stanislav Fomichev <sdf@...gle.com>
>>>
>>> The change looks good to me with the following nits:
>>> 1. could you add a cover letter for the patch set?
>>> typically, when a series has more than one patch,
>>> it is good practice to include a cover letter.
>>> See bpf_devel_QA.rst.
>>> 2. with this change, the comments in uapi bpf.h
>>> are not accurate any more.
>>> __u32 user_ip6[4]; /* Allows 1,2,4-byte read an 4-byte write.
>>> * Stored in network byte order.
>>> */
>>> __u32 msg_src_ip6[4]; /* Allows 1,2,4-byte read an 4-byte write.
>>> * Stored in network byte order.
>>> */
>>> now for stores, aligned 8-byte write is permitted.
>>> could you update this as well?
>>>
>>> From the typical usage pattern, I did not see a need
>>> for 8-byte read of user_ip6 and msg_src_ip6 yet. So let
>>> us just deal with write for now.
>>
>> But I guess it's still possible for clang to optimize two consecutive
>> 4-byte reads into single 8-byte read in some circumstances? If that's
>> the case, maybe it's a good idea to have corresponding read checks as
>> well?
> I guess clang can do those kinds of optimizations. I can put it on my
> todo and address later (or when we actually see it out in the wild).
Okay, I found a Facebook internal app that tries to read the 4 bytes
and compare them to a predefined loopback address. We may need to handle
the read case as well. But this can be a follow-up after an actual tryout.
>
>> But overall this looks good to me:
>>
>> Acked-by: Andrii Nakryiko <andriin@...com>
> Thanks for the review!
>
>>>
>>> With the above two nits,
>>> Acked-by: Yonghong Song <yhs@...com>
>>>
>>>> ---
>>>> include/linux/filter.h | 6 ++++++
>>>> net/core/filter.c | 22 ++++++++++++++--------
>>>> 2 files changed, 20 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/include/linux/filter.h b/include/linux/filter.h
>>>> index 340f7d648974..3901007e36f1 100644
>>>> --- a/include/linux/filter.h
>>>> +++ b/include/linux/filter.h
>>>> @@ -746,6 +746,12 @@ bpf_ctx_narrow_access_ok(u32 off, u32 size, u32 size_default)
>>>> return size <= size_default && (size & (size - 1)) == 0;
>>>> }
>>>>
>>>> +#define bpf_ctx_wide_store_ok(off, size, type, field) \
>>>> + (size == sizeof(__u64) && \
>>>> + off >= offsetof(type, field) && \
>>>> + off + sizeof(__u64) <= offsetofend(type, field) && \
>>>> + off % sizeof(__u64) == 0)
>>>> +
>>>> #define bpf_classic_proglen(fprog) (fprog->len * sizeof(fprog->filter[0]))
>>>>
>>>> static inline void bpf_prog_lock_ro(struct bpf_prog *fp)
>>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>>> index dc8534be12fc..5d33f2146dab 100644
>>>> --- a/net/core/filter.c
>>>> +++ b/net/core/filter.c
>>>> @@ -6849,6 +6849,16 @@ static bool sock_addr_is_valid_access(int off, int size,
>>>> if (!bpf_ctx_narrow_access_ok(off, size, size_default))
>>>> return false;
>>>> } else {
>>>> + if (bpf_ctx_wide_store_ok(off, size,
>>>> + struct bpf_sock_addr,
>>>> + user_ip6))
>>>> + return true;
>>>> +
>>>> + if (bpf_ctx_wide_store_ok(off, size,
>>>> + struct bpf_sock_addr,
>>>> + msg_src_ip6))
>>>> + return true;
>>>> +
>>>> if (size != size_default)
>>>> return false;
>>>> }
>>>> @@ -7689,9 +7699,6 @@ static u32 xdp_convert_ctx_access(enum bpf_access_type type,
>>>> /* SOCK_ADDR_STORE_NESTED_FIELD_OFF() has semantic similar to
>>>> * SOCK_ADDR_LOAD_NESTED_FIELD_SIZE_OFF() but for store operation.
>>>> *
>>>> - * It doesn't support SIZE argument though since narrow stores are not
>>>> - * supported for now.
>>>> - *
>>>> * In addition it uses Temporary Field TF (member of struct S) as the 3rd
>>>> * "register" since two registers available in convert_ctx_access are not
>>>> * enough: we can't override neither SRC, since it contains value to store, nor
>>>> @@ -7699,7 +7706,7 @@ static u32 xdp_convert_ctx_access(enum bpf_access_type type,
>>>> * instructions. But we need a temporary place to save pointer to nested
>>>> * structure whose field we want to store to.
>>>> */
>>>> -#define SOCK_ADDR_STORE_NESTED_FIELD_OFF(S, NS, F, NF, OFF, TF) \
>>>> +#define SOCK_ADDR_STORE_NESTED_FIELD_OFF(S, NS, F, NF, SIZE, OFF, TF) \
>>>> do { \
>>>> int tmp_reg = BPF_REG_9; \
>>>> if (si->src_reg == tmp_reg || si->dst_reg == tmp_reg) \
>>>> @@ -7710,8 +7717,7 @@ static u32 xdp_convert_ctx_access(enum bpf_access_type type,
>>>> offsetof(S, TF)); \
>>>> *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(S, F), tmp_reg, \
>>>> si->dst_reg, offsetof(S, F)); \
>>>> - *insn++ = BPF_STX_MEM( \
>>>> - BPF_FIELD_SIZEOF(NS, NF), tmp_reg, si->src_reg, \
>>>> + *insn++ = BPF_STX_MEM(SIZE, tmp_reg, si->src_reg, \
>>>> bpf_target_off(NS, NF, FIELD_SIZEOF(NS, NF), \
>>>> target_size) \
>>>> + OFF); \
>>>> @@ -7723,8 +7729,8 @@ static u32 xdp_convert_ctx_access(enum bpf_access_type type,
>>>> TF) \
>>>> do { \
>>>> if (type == BPF_WRITE) { \
>>>> - SOCK_ADDR_STORE_NESTED_FIELD_OFF(S, NS, F, NF, OFF, \
>>>> - TF); \
>>>> + SOCK_ADDR_STORE_NESTED_FIELD_OFF(S, NS, F, NF, SIZE, \
>>>> + OFF, TF); \
>>>> } else { \
>>>> SOCK_ADDR_LOAD_NESTED_FIELD_SIZE_OFF( \
>>>> S, NS, F, NF, SIZE, OFF); \
>>>>
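
For anyone following the offset arithmetic, the check done by
bpf_ctx_wide_store_ok can be tried out in userspace. This is only a
minimal sketch, not the kernel code: struct sock_addr_demo is a
hypothetical stand-in that mirrors just the leading fields of
struct bpf_sock_addr (so user_ip6 starts at offset 8), offsetofend is
redefined locally, and the helper names user_ip6_wide_store_ok and
wide_store_demo are made up for illustration. The sketch takes unsigned
arguments and assumes the caller has already rejected out-of-range
offsets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the first fields of struct bpf_sock_addr.
 * Only the layout matters here: user_ip6 covers bytes 8..23. */
struct sock_addr_demo {
	uint32_t user_family;
	uint32_t user_ip4;
	uint32_t user_ip6[4];
	uint32_t user_port;
};

/* Local userspace equivalent of the kernel helper: offset of the first
 * byte past the given field. */
#define offsetofend(type, field) \
	(offsetof(type, field) + sizeof(((type *)0)->field))

/* Same conditions as the macro in the patch, with the arguments
 * parenthesized: the store must be exactly 8 bytes, lie entirely inside
 * the field, and start at an 8-byte aligned context offset. */
#define bpf_ctx_wide_store_ok(off, size, type, field) \
	((size) == sizeof(uint64_t) && \
	 (off) >= offsetof(type, field) && \
	 (off) + sizeof(uint64_t) <= offsetofend(type, field) && \
	 (off) % sizeof(uint64_t) == 0)

/* Illustrative wrapper for the user_ip6 field of the stand-in struct. */
bool user_ip6_wide_store_ok(uint32_t off, uint32_t size)
{
	return bpf_ctx_wide_store_ok(off, size, struct sock_addr_demo,
				     user_ip6);
}

/* Walks through the interesting offsets. */
void wide_store_demo(void)
{
	/* The store from the report: *(u64 *)(r1 + 16), i.e. user_ip6[2..3]. */
	assert(user_ip6_wide_store_ok(16, 8));
	/* user_ip6[0..1] at offset 8 is aligned and inside the field. */
	assert(user_ip6_wide_store_ok(8, 8));
	/* user_ip6[1..2] at offset 12 is inside the field but not aligned. */
	assert(!user_ip6_wide_store_ok(12, 8));
	/* Offset 20 would spill over into user_port. */
	assert(!user_ip6_wide_store_ok(20, 8));
	/* A 4-byte store is not a wide store; the size_default check
	 * in the existing code handles it. */
	assert(!user_ip6_wide_store_ok(16, 4));
}
```

With this layout only offsets 8 and 16 pass for user_ip6, which matches
the two aligned u64 halves of the address.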