Message-ID: <CAMDZJNXY249r_SBuSjCwkAf-xGF98-5EPN41d23Jix0fTawZTw@mail.gmail.com>
Date:   Sat, 11 Dec 2021 08:37:35 +0800
From:   Tonghao Zhang <xiangxia.m.yue@...il.com>
To:     Daniel Borkmann <daniel@...earbox.net>
Cc:     John Fastabend <john.fastabend@...il.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        KP Singh <kpsingh@...nel.org>,
        Eric Dumazet <edumazet@...gle.com>,
        Antoine Tenart <atenart@...nel.org>,
        Alexander Lobakin <alexandr.lobakin@...el.com>,
        Wei Wang <weiwan@...gle.com>, Arnd Bergmann <arnd@...db.de>
Subject: Re: [net v5 2/3] net: sched: add check tc_skip_classify in sch egress

On Sat, Dec 11, 2021 at 4:11 AM Daniel Borkmann <daniel@...earbox.net> wrote:
>
> On 12/10/21 8:54 PM, Tonghao Zhang wrote:
> > On Sat, Dec 11, 2021 at 1:46 AM Tonghao Zhang <xiangxia.m.yue@...il.com> wrote:
> >> On Sat, Dec 11, 2021 at 1:37 AM Tonghao Zhang <xiangxia.m.yue@...il.com> wrote:
> >>> On Sat, Dec 11, 2021 at 12:43 AM John Fastabend
> >>> <john.fastabend@...il.com> wrote:
> >>>> xiangxia.m.yue@ wrote:
> >>>>> From: Tonghao Zhang <xiangxia.m.yue@...il.com>
> >>>>>
> >>>>> Try to resolve the issues as below:
> >>>>> * We look up filters and only then check the tc_skip_classify
> >>>>>    flag in the net sched layer, even when the skb does not want
> >>>>>    to be classified. That can consume a lot of CPU cycles. This
> >>>>>    patch is useful when there are many filters with different
> >>>>>    prios. With ~5 prios in production, we see a ~1% improvement.
> >>>>>
> >>>>>    Rules as below:
> >>>>>    $ for id in $(seq 1 5); do
> >>>>>    $       tc filter add ... egress prio $id ... action mirred egress redirect dev ifb0
> >>>>>    $ done
> >>>>>
> >>>>> * bpf_redirect may be invoked in egress path. If we don't
> >>>>>    check the flags and then return immediately, the packets
> >>>>>    will loopback.
> >>>>
> >>>> This would be the naive case right? Meaning the BPF program is
> >>>> doing a redirect without any logic or is buggy?
> >>>>
> >>>> Can you map out how this happens for me, I'm not fully sure I
> >>>> understand the exact concern. Is it possible for BPF programs
> >>>> that used to see packets no longer see the packet as expected?
> >>>>
> >>>> Is this the path you are talking about?
> >>> Hi John
> >>> Tx ethx -> __dev_queue_xmit -> sch_handle_egress
> >>> ->  execute BPF program on ethx with bpf_redirect(ifb0) ->
> >>> -> ifb_xmit -> ifb_ri_tasklet -> dev_queue_xmit -> __dev_queue_xmit
> >>> The packets loop back, which means bpf_redirect() doesn't work with
> >>> the ifb netdev, right?
> >>> So in sch_handle_egress I add the skb_skip_tc_classify() check.
>
> But why would you do that? Usage like this is just broken by design..
As I understand it, we can redirect packets to a target device either at
ingress or at *egress*.

See commit 3896d655f4d491c67d669a15f275a39f713410f8, whose message says:
Allow eBPF programs attached to classifier/actions to call
bpf_clone_redirect(skb, ifindex, flags) helper which will mirror or
redirect the packet by dynamic ifindex selection from within the
program to a target device either at ingress or at egress. Can be used
for various scenarios, for example, to load balance skbs into veths,
split parts of the traffic to local taps, etc.

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=3896d655f4d491c67d669a15f275a39f713410f8
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=27b29f63058d26c6c1742f1993338280d5a41dc6

But at egress, bpf_redirect() does not work with ifb.
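
To make the loop concrete, here is a toy userspace model of the path
(not kernel code, only a sketch). The names toy_skb, toy_xmit,
toy_ifb_xmit and MAX_IFB_PASSES are made up for illustration. The model
assumes that ifb marks the skb with tc_skip_classify before re-injecting
it, and that the egress program always redirects to ifb0:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct sk_buff: only the flag discussed in this thread,
 * plus two counters used to observe the model. All names are hypothetical. */
struct toy_skb {
    bool tc_skip_classify;  /* assumed to be set by ifb before re-injection */
    int  egress_hook_runs;  /* times the egress "BPF program" ran */
    int  ifb_passes;        /* times the packet went through ifb */
};

enum { MAX_IFB_PASSES = 8 }; /* arbitrary cap so the model terminates */

static bool toy_xmit(struct toy_skb *skb, bool check_skip);

/* Models ifb_xmit -> ifb_ri_tasklet -> dev_queue_xmit. */
static bool toy_ifb_xmit(struct toy_skb *skb, bool check_skip)
{
    if (++skb->ifb_passes > MAX_IFB_PASSES)
        return false;               /* give up: the packet is looping */
    skb->tc_skip_classify = true;   /* mark, then send back to the device */
    return toy_xmit(skb, check_skip);
}

/* Models __dev_queue_xmit -> sch_handle_egress on the original device. */
static bool toy_xmit(struct toy_skb *skb, bool check_skip)
{
    assert(skb != NULL);
    if (check_skip && skb->tc_skip_classify)
        return true;                /* skip the hook; packet leaves the device */
    skb->egress_hook_runs++;        /* program runs and redirects to ifb0 */
    return toy_ifb_xmit(skb, check_skip);
}
```

In this model, calling toy_xmit() on a zeroed toy_skb with
check_skip=false keeps bouncing between the hook and ifb until the cap is
hit and returns false. With check_skip=true the hook runs once, the
packet passes through ifb once, and the call returns true.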
> If you need to loop anything back to RX, just use bpf_redirect() with
We do not use it to loop packets back; the flags argument of bpf_redirect() is 0. For example:

tc filter add dev veth1 \
egress bpf direct-action obj test_bpf_redirect_ifb.o sec redirect_ifb
https://patchwork.kernel.org/project/netdevbpf/patch/20211208145459.9590-4-xiangxia.m.yue@gmail.com/
> BPF_F_INGRESS? What is the concrete/actual rationale for ifb here?
We load-balance packets across different ifb netdevices at egress. On
ifb, we install filters and rate-limit policers.




-- 
Best regards, Tonghao
