Message-ID: <389e52fe-13e9-4ded-bfb0-fcffea9b1cbf@huawei.com>
Date: Sat, 26 Aug 2023 19:54:02 +0800
From: "liujian (CE)" <liujian56@...wei.com>
To: John Fastabend <john.fastabend@...il.com>, Jakub Sitnicki
<jakub@...udflare.com>
CC: <ast@...nel.org>, <daniel@...earbox.net>, <andrii@...nel.org>,
<martin.lau@...ux.dev>, <song@...nel.org>, <yonghong.song@...ux.dev>,
<kpsingh@...nel.org>, <sdf@...gle.com>, <haoluo@...gle.com>,
<jolsa@...nel.org>, <davem@...emloft.net>, <edumazet@...gle.com>,
<kuba@...nel.org>, <pabeni@...hat.com>, <dsahern@...nel.org>,
<netdev@...r.kernel.org>, <bpf@...r.kernel.org>
Subject: Re: [PATCH bpf-next v3 1/7] bpf, sockmap: add BPF_F_PERMANENT flag
for skmsg redirect
On 2023/8/26 9:32, John Fastabend wrote:
> Jakub Sitnicki wrote:
>> On Thu, Aug 24, 2023 at 10:39 PM +08, Liu Jian wrote:
>>> If the sockmap msg redirection function is used only to forward packets
>>> and no other operation, the execution result of the BPF_SK_MSG_VERDICT
>>> program is the same each time. In this case, the BPF program only needs to
>>> be run once. Add BPF_F_PERMANENT flag to bpf_msg_redirect_map() and
>>> bpf_msg_redirect_hash() to implement this ability.
>>>
>>> Then we can enable this function in the bpf program as follows:
>>> bpf_msg_redirect_hash(xx, xx, xx, BPF_F_INGRESS | BPF_F_PERMANENT);
>>>
>>> Test results using netperf TCP_STREAM mode:
>>> for i in 1 64 128 512 1k 2k 32k 64k 100k 500k 1m; do
>>> netperf -T 1,2 -t TCP_STREAM -H 127.0.0.1 -l 20 -- -m $i -s 100m,100m -S 100m,100m
>>> done
>>>
>>> before:
>>> 3.84 246.52 496.89 1885.03 3415.29 6375.03 40749.09 48764.40 51611.34 55678.26 55992.78
>>> after:
>>> 4.43 279.20 555.82 2080.79 3870.70 7105.44 41836.41 49709.75 51861.56 55211.00 54566.85
>>>
>>> Signed-off-by: Liu Jian <liujian56@...wei.com>
>
> [...]
>
>>> /* BPF_FUNC_skb_set_tunnel_key and BPF_FUNC_skb_get_tunnel_key flags. */
>>> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
>>> index a29508e1ff35..df1443cf5fbd 100644
>>> --- a/net/core/skmsg.c
>>> +++ b/net/core/skmsg.c
>>> @@ -885,6 +885,11 @@ int sk_psock_msg_verdict(struct sock *sk, struct sk_psock *psock,
>>>  			goto out;
>>>  		}
>>>  		psock->redir_ingress = sk_msg_to_ingress(msg);
>>> +		if (!msg->apply_bytes && !msg->cork_bytes)
>>> +			psock->redir_permanent =
>>> +				msg->flags & BPF_F_PERMANENT;
>>> +		else
>>> +			psock->redir_permanent = false;
>>
>> Above can be rewritten as:
>>
>>   psock->redir_permanent = !msg->apply_bytes &&
>>                            !msg->cork_bytes &&
>>                            (msg->flags & BPF_F_PERMANENT);
>>
>> But as I wrote earlier, I don't think it's a good idea to ignore the
>> flag. We can detect this conflict at the time the bpf_msg_sk_redirect_*
>> helper is called and return an error.
>>
>> Naturally that means that the bpf_msg_{cork,apply}_bytes helpers need
>> to be adjusted to return an error if BPF_F_PERMANENT has been set.
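
To check that I understand this alternative, here is a rough userspace
model of the error-returning variant. It is only a sketch: every model_*
name is made up for illustration, MODEL_F_PERMANENT is a placeholder bit
rather than the value from the patch, and the real helpers do not
currently behave this way.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder bit for illustration only; not taken from the patch. */
#define MODEL_F_PERMANENT (1ULL << 1)

/* Hypothetical stand-in for the few sk_msg fields discussed here. */
struct model_msg {
	uint32_t apply_bytes;
	uint32_t cork_bytes;
	uint64_t flags;
};

/* Redirect helper rejects the flag if a byte count is already set. */
static int model_msg_redirect(struct model_msg *msg, uint64_t flags)
{
	if ((flags & MODEL_F_PERMANENT) &&
	    (msg->apply_bytes || msg->cork_bytes))
		return -EINVAL;
	msg->flags = flags;
	return 0;
}

/* Byte-count helper rejects the call once the flag has been set. */
static int model_msg_apply_bytes(struct model_msg *msg, uint32_t bytes)
{
	if (msg->flags & MODEL_F_PERMANENT)
		return -EINVAL;
	msg->apply_bytes = bytes;
	return 0;
}
```

In this model whichever call comes second fails, so both helpers need
the extra check (a cork_bytes helper would need the same one).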
>
> So far we've not really done much to protect a user from doing
> rather silly things. The following will all do something without
> errors,
>
>   bpf_msg_apply_bytes()
>   bpf_msg_apply_bytes() <- resets apply bytes
>
>   bpf_msg_cork_bytes()
>   bpf_msg_cork_bytes()  <- resets cork bytes
>
> also,
>
>   bpf_msg_redirect(..., BPF_F_INGRESS);
>   bpf_msg_redirect(..., 0); <- resets sk_redir and flags
>
> maybe there is some valid reason to even do the above if further parsing
> identifies some reason to redirect to an alert socket or something.
>
> My original thinking was in the interest of not having a bunch of
> extra checks for performance reasons we shouldn't add guard rails
> unless something really unexpected might happen like a kernel
> panic or what not.
>
> This does feel a bit different though because before we
> didn't have calls that could impact other calls. My best idea
> is to just create a precedence and follow it. I would propose,
>
> 'If BPF_F_PERMANENT is set apply_bytes and cork_bytes are
> ignored.'
>
I think that's better.
Either low or high priority is fine with me, but I would prefer that
BPF_F_PERMANENT have the lower priority: BPF_F_PERMANENT is only a
performance optimization, whereas apply_bytes or cork_bytes may be part
of the user's program logic.
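
As a minimal sketch of what "low priority" would mean, here is a
userspace model of the rule. The lowprio_* names and the
LOWPRIO_F_PERMANENT value are placeholders I made up for illustration;
only the condition itself mirrors the hunk quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit for illustration only; not taken from the patch. */
#define LOWPRIO_F_PERMANENT (1ULL << 1)

/* Hypothetical stand-in for the few sk_msg fields involved. */
struct lowprio_msg {
	uint32_t apply_bytes;
	uint32_t cork_bytes;
	uint64_t flags;
};

/* Low-priority rule: the flag only takes effect when neither
 * apply_bytes nor cork_bytes is set; otherwise it is dropped
 * and the program keeps running for each message. */
static bool lowprio_redir_permanent(const struct lowprio_msg *msg)
{
	return !msg->apply_bytes && !msg->cork_bytes &&
	       (msg->flags & LOWPRIO_F_PERMANENT) != 0;
}
```

With this rule a program that relies on apply_bytes or cork_bytes keeps
its current behaviour even if it also passes the flag.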
> The other direction (what is above?) has a bit of an inconsistency
> where these two flows are different?
>
>   bpf_msg_apply_bytes()
>   bpf_msg_redirect(..., BPF_F_PERMANENT)
>
> and
>
>   bpf_msg_redirect(..., BPF_F_PERMANENT)
>   bpf_msg_apply_bytes()
>
> It would be best if order of operations doesn't change the
> outcome because that starts to get really hard to reason about.
>
> This avoids having to add checks all over the place and then
> if users want we could give some mechanisms to read apply
> and cork bytes so people could write macros over those if
> they really want the hard error.
>
> WDYT?
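
On the ordering point, a small userspace model may help. If the helpers
only record state and the conflict is resolved once at verdict time, the
call order cannot change the result, whichever precedence we pick. This
is a sketch under assumptions: the ord_* names, the enum, and the
ORD_F_PERMANENT value are all hypothetical and not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit for illustration only; not taken from the patch. */
#define ORD_F_PERMANENT (1ULL << 1)

/* Hypothetical stand-in for the few sk_msg fields involved. */
struct ord_msg {
	uint32_t apply_bytes;
	uint64_t flags;
};

/* The two precedence rules discussed in this thread. */
enum ord_prio {
	ORD_PERMANENT_WINS,	/* flag set => apply_bytes is ignored */
	ORD_BYTES_WIN,		/* apply_bytes set => flag is ignored */
};

/* Helpers only record what the program asked for; no cross-checks. */
static void ord_apply_bytes(struct ord_msg *msg, uint32_t bytes)
{
	msg->apply_bytes = bytes;
}

static void ord_redirect(struct ord_msg *msg, uint64_t flags)
{
	msg->flags = flags;
}

/* The conflict is resolved in exactly one place, at verdict time. */
static bool ord_verdict_permanent(const struct ord_msg *msg,
				  enum ord_prio prio)
{
	bool want = (msg->flags & ORD_F_PERMANENT) != 0;

	if (prio == ORD_PERMANENT_WINS)
		return want;
	return want && !msg->apply_bytes;
}
```

Both call orders from the example above leave the same state behind, so
the verdict is the same for either rule.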
>
> [...]
>
> Thanks!