[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <690b450c-5024-8e02-dbc7-3d1b974af835@huawei.com>
Date: Fri, 28 Oct 2022 09:23:51 +0800
From: shaozhengchao <shaozhengchao@...wei.com>
To: Stanislav Fomichev <sdf@...gle.com>
CC: <bpf@...r.kernel.org>, <netdev@...r.kernel.org>,
<davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
<pabeni@...hat.com>, <ast@...nel.org>, <daniel@...earbox.net>,
<andrii@...nel.org>, <martin.lau@...ux.dev>, <song@...nel.org>,
<yhs@...com>, <john.fastabend@...il.com>, <kpsingh@...nel.org>,
<haoluo@...gle.com>, <jolsa@...nel.org>, <oss@....io>,
<weiyongjun1@...wei.com>, <yuehaibing@...wei.com>
Subject: Re: [PATCH bpf-next] bpf: fix issue that packet only contains l2 is
dropped
On 2022/10/28 0:37, Stanislav Fomichev wrote:
> On Thu, Oct 27, 2022 at 4:58 AM shaozhengchao <shaozhengchao@...wei.com> wrote:
>>
>>
>>
>> On 2022/10/25 1:13, Stanislav Fomichev wrote:
>>> On Sat, Oct 22, 2022 at 4:36 AM shaozhengchao <shaozhengchao@...wei.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2022/10/22 2:16, Stanislav Fomichev wrote:
>>>>> On Fri, Oct 21, 2022 at 12:25 AM shaozhengchao <shaozhengchao@...wei.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2022/10/21 1:45, Stanislav Fomichev wrote:
>>>>>>> On Wed, Oct 19, 2022 at 6:47 PM shaozhengchao <shaozhengchao@...wei.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2022/10/18 0:36, Stanislav Fomichev wrote:
>>>>>>>>> On Sat, Oct 15, 2022 at 2:16 AM Zhengchao Shao <shaozhengchao@...wei.com> wrote:
>>>>>>>>>>
>>>>>>>>>> As [0] see, bpf_prog_test_run_skb() should allow user space to forward
>>>>>>>>>> 14-bytes packet via BPF_PROG_RUN instead of dropping packet directly.
>>>>>>>>>> So fix it.
>>>>>>>>>>
>>>>>>>>>> 0: https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0
>>>>>>>>>>
>>>>>>>>>> Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len")
>>>>>>>>>> Signed-off-by: Zhengchao Shao <shaozhengchao@...wei.com>
>>>>>>>>>> ---
>>>>>>>>>> net/bpf/test_run.c | 6 +++---
>>>>>>>>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
>>>>>>>>>> index 13d578ce2a09..aa1b49f19ca3 100644
>>>>>>>>>> --- a/net/bpf/test_run.c
>>>>>>>>>> +++ b/net/bpf/test_run.c
>>>>>>>>>> @@ -979,9 +979,6 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb)
>>>>>>>>>> {
>>>>>>>>>> struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb;
>>>>>>>>>>
>>>>>>>>>> - if (!skb->len)
>>>>>>>>>> - return -EINVAL;
>>>>>>>>>> -
>>>>>>>>>> if (!__skb)
>>>>>>>>>> return 0;
>>>>>>>>>>
>>>>>>>>>> @@ -1102,6 +1099,9 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
>>>>>>>>>> if (IS_ERR(data))
>>>>>>>>>> return PTR_ERR(data);
>>>>>>>>>>
>>>>>>>>>> + if (size == ETH_HLEN)
>>>>>>>>>> + is_l2 = true;
>>>>>>>>>> +
>>>>>>>>>
>>>>>>>>> Don't think this will work? That is_l2 is there to expose proper l2/l3
>>>>>>>>> skb for specific hooks; we can't suddenly start exposing l2 headers to
>>>>>>>>> the hooks that don't expect it.
>>>>>>>>> Does it make sense to start with a small reproducer that triggers the
>>>>>>>>> issue first? We can have a couple of cases for
>>>>>>>>> len=0/ETH_HLEN-1/ETH_HLEN+1 and trigger them from the bpf program that
>>>>>>>>> redirects to different devices (to trigger dev_is_mac_header_xmit).
>>>>>>>>>
>>>>>>>>>
>>>>>>>> Hi Stanislav:
>>>>>>>> Thank you for your review. Is_l2 is the flag of a specific
>>>>>>>> hook. Therefore, do you mean that if skb->len is equal to 0, just
>>>>>>>> add the length back?
>>>>>>>
>>>>>>> Not sure I understand your question. All I'm saying is - you can't
>>>>>>> flip that flag arbitrarily. This flag depends on the attach point that
>>>>>>> you're running the prog against. Some attach points expect packets
>>>>>>> with l2, some expect packets without l2.
>>>>>>>
>>>>>>> What about starting with a small reproducer? Does it make sense to
>>>>>>> create a small selftest that adds net namespace + fq_codel +
>>>>>>> bpf_prog_test run and do redirect ingress/egress with len
>>>>>>> 0/1...tcphdr? Because I'm not sure I 100% understand whether it's only
>>>>>>> len=0 that's problematic or some other combination as well?
>>>>>>>
>>>>>> yes, only skb->len = 0 will cause null-ptr-deref issue.
>>>>>> The following is the process of triggering the problem:
>>>>>> enqueue a skb:
>>>>>> fq_codel_enqueue()
>>>>>> ...
>>>>>> idx = fq_codel_classify() --->if idx != 0
>>>>>> flow = &q->flows[idx];
>>>>>> flow_queue_add(flow, skb); --->add skb to flow[idex]
>>>>>> q->backlogs[idx] += qdisc_pkt_len(skb); --->backlogs = 0
>>>>>> ...
>>>>>> fq_codel_drop() --->set sch->limit = 0, always
>>>>>> drop packets
>>>>>> ...
>>>>>> idx = i --->becuase backlogs in every
>>>>>> flows is 0, so idx = 0
>>>>>> ...
>>>>>> flow = &q->flows[idx]; --->get idx=0 flow
>>>>>> ...
>>>>>> dequeue_head()
>>>>>> skb = flow->head; --->flow->head = NULL
>>>>>> flow->head = skb->next; --->cause null-ptr-deref
>>>>>> So, if skb->len !=0,fq_codel_drop() could get the correct idx, and
>>>>>> then skb!=NULL, it will be OK.
>>>>>> Maybe, I will fix it in fq_codel.
>>>>>
>>>>> I think the consensus here is that the stack, in general, doesn't
>>>>> expect the packets like this. So there are probably more broken things
>>>>> besides fq_codel. Thus, it's better if we remove the ability to
>>>>> generate them from the bpf side instead of fixing the individual users
>>>>> like fq_codel.
>>>>>
>>>>>> But, as I know, skb->len = 0 is just invalid packet. I prefer to add the
>>>>>> length back, like bellow:
>>>>>> if (is_l2 || !skb->len)
>>>>>> __skb_push(skb, hh_len);
>>>>>> is it OK?
>>>>>
>>>>> Probably not?
>>>>>
>>>>> Looking at the original syzkaller report, prog_type is
>>>>> BPF_PROG_TYPE_LWT_XMIT which does expect a packet without l2 header.
>>>>> Can we do something like:
>>>>>
>>>>> if (!is_l2 && !skb->len) {
>>>>> // append some dummy byte to the skb ?
>>>>> }
>>>>>
>>>>>
>>>> I pad one byte, and test OK.
>>>> if (!is_l2 && !skb->len)
>>>> __skb_push(skb, 1);
>>>>
>>>> Does it look OK to you?
>>>
>>> Nope, this will eat a byte out of the l2 header. We need to skb_put
>>> and make sure we allocate enough to make that skb_put succeed.
>>>
>>> But stepping back a bit: it feels like it's all unnecessary? The only
>>> valid use-case of this is probing for the BPF_PROG_TEST_RUN as cilium
>>> does. This is mostly about testing, so fixing it in the users seems
>>> fair? No real production code is expected to generate these zero-len
>>> packets. Or are we concerned that this will leak into stable kernels?
>>>
>>> I feel like we are trying to add more complexity here for no apparent reason.
>>>
>> I agree with you. users should make sure the correct skb len and
>> configurations are passed into kernel. Incorrect configurations should
>> be discarded to ensure kernel stability.
>>
>> Lorenz, Can you modify the user-mode test code?
>
> Lorenz already fixed it for Cilium. I think the discussion here is
> around other potential users out there.
> Let's wait for them to appear if it is indeed a problem?
Currently, Lorenz only does the workaround. Maybe Lorenz still expects
to change the value back to 14 bytes?
Powered by blists - more mailing lists