lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM_iQpVm_GdVT7MiuVGpzvx9zEsXKjZer5yF8Vh8c3EKVBM3-Q@mail.gmail.com>
Date:   Mon, 15 Feb 2021 20:06:32 -0800
From:   Cong Wang <xiyou.wangcong@...il.com>
To:     John Fastabend <john.fastabend@...il.com>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>, duanxiongchun@...edance.com,
        Dongdong Wang <wangdongdong.6@...edance.com>,
        jiang.wang@...edance.com, Cong Wang <cong.wang@...edance.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Jakub Sitnicki <jakub@...udflare.com>,
        Lorenz Bauer <lmb@...udflare.com>
Subject: Re: [Patch bpf-next v3 4/5] skmsg: use skb ext instead of TCP_SKB_CB

On Mon, Feb 15, 2021 at 5:50 PM John Fastabend <john.fastabend@...il.com> wrote:
>
> Cong Wang wrote:
> > On Mon, Feb 15, 2021 at 4:54 PM John Fastabend <john.fastabend@...il.com> wrote:
> > >
> > > Cong Wang wrote:
> > > > On Mon, Feb 15, 2021 at 3:57 PM John Fastabend <john.fastabend@...il.com> wrote:
> > > > >
> > > > > For TCP case we can continue to use CB and not pay the price. For UDP
> > > > > and AF_UNIX we can do the extra alloc.
> > > >
> > > > I see your point, but specializing TCP case does not give much benefit
> > > > here, the skmsg code would have to check skb->protocol etc. to decide
> > > > whether to use TCP_SKB_CB() or skb_ext:
> > > >
> > > > if (skb->protocol == ...)
> > > >   TCP_SKB_CB(skb) = ...;
> > > > else
> > > >   ext = skb_ext_find(skb);
> > > >
> > > > which looks ugly to me. And I doubt skb->protocol alone is sufficient to
> > > > distinguish TCP, so we may end up having more checks above.
> > > >
> > > > So do you really want to trade code readability with an extra alloc?
> > >
> > > Above is ugly. So I look at where the patch replaces things,
> > >
> > > sk_psock_tls_strp_read(), this is TLS specific read hook so can't really
> > > work in generic case anyways.
> > >
> > > sk_psock_strp_read(), will you have UDP, AF_UNIX stream parsers? Do these
> > > even work outside TCP cases.
> > >
> > > For these ones: sk_psock_verdict_apply(), sk_psock_verdict_recv(),
> > > sk_psock_backlog(), can't we just do some refactoring around their
> > > hook points so we know the context. For example sk_psock_tls_verdict_apply
> > > is calling sk_psock_skb_redirect(). Why not have a sk_psock_unix_redirect()
> > > and a sk_psock_udp_redirect(). There are likely some optimizations we can
> > > deploy this way. We've already don this for tls and sk_msg types for example.
> > >
> > > Then the helpers will know their types by program type, just use the right
> > > variants.
> > >
> > > So not suggestiong if/else the checks so much as having per type hooks.
> > >
> >
> > Hmm, but sk_psock_backlog() is still the only one that handles all three
> > above cases, right? It uses TCP_SKB_CB() too and more importantly it
> > is also why we can't use a per-cpu struct here (see bpf_redirect_info).
>
> Right, but the workqueue is created at init time where we will know the
> socket type based on the program/map types so can build the redirect
> backlog queue there based on the type needed. I also have a patch in

Hmm? How could a socket type match the skb type when we redirect
across-protocol?

In my use case, I want to redirect an AF_UNIX skb to a UDP socket,
clearly checking the UDP socket workqueue can't find out it is an
AF_UNIX skb. It has to be a per-skb check.

> mind that would do more specific TCP things in that code anyways. I
> can flush it out this week if anyone cares. The idea is we are wasting
> lots of cycles using skb_send_sock_locked when we can just inject
> the packet directlyy into the tcp stack.

Actually I did try the same, it clearly doesn't work for cross-protocol.

Anyway, please let me know what your suggestion for skb ext here?

It looks like we either have some ugly packet type checks, or just
use the skb ext. I don't see any other way yet, I also explored the
struct sk_buff again and still can not find anything we can reuse.
(_skb_refdst can only be reused very briefly with
tcp_skb_tsorted_save().)

Therefore, I believe using skb ext is still the best solution here.

>
> Also on the original patch here, we can't just kfree_skb() on alloc
> errors because that will look like a data drop. Errors need to be
> handled gracefully without dropping data. At least in the TCP case,
> but probably also in UDP and AF_UNIX cases as well. Original code
> was pretty loose in this regard, but it caused users to write bug
> reports and then I've been fixing most of them. If you see more
> cases let me know.

What's your suggestion here? Return -EAGAIN? But it requires
the caller put it in a loop to be graceful, but we can't do it in, for
example, sk_psock_tls_strp_read().

Thanks.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ