Message-ID: <20170419172420.43333be1@redhat.com>
Date:   Wed, 19 Apr 2017 17:24:20 +0200
From:   Jesper Dangaard Brouer <brouer@...hat.com>
To:     Daniel Borkmann <daniel@...earbox.net>
Cc:     John Fastabend <john.fastabend@...il.com>,
        Daniel Borkmann <borkmann@...earbox.net>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Alexei Starovoitov <ast@...com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "xdp-newbies@...r.kernel.org" <xdp-newbies@...r.kernel.org>,
        brouer@...hat.com
Subject: Re: XDP question: best API for returning/setting egress port?

On Wed, 19 Apr 2017 14:33:27 +0200
Daniel Borkmann <daniel@...earbox.net> wrote:

> On 04/19/2017 02:00 PM, Jesper Dangaard Brouer wrote:
> > On Tue, 18 Apr 2017 13:54:45 -0700
> > John Fastabend <john.fastabend@...il.com> wrote:  
> >> On 17-04-18 12:58 PM, Jesper Dangaard Brouer wrote:  
> >>>
> >>> As I argued in NetConf presentation[1] (from slide #9) we need a port
> >>> mapping table (instead of using ifindex'es).  Both for supporting
> >>> other "port" types than net_devices (think sockets), and for
> >>> sandboxing what XDP can bypass.
> >>>
> >>> I want to create a new XDP action called XDP_REDIRECT, that instruct
> >>> XDP to send the xdp_buff to another "port" (get translated into a
> >>> net_device, or something else depending on internal port type).
> >>>
> >>> Looking at the userspace/eBPF interface, I'm wondering what is the
> >>> best API for "returning" this port number from eBPF?
> >>>
> >>> The options I see is:
> >>>
> >>> 1) Split-up the u32 action code, and e.g let the high-16-bit be the
> >>>     port number and lower-16bit the (existing) action verdict.
> >>>
> >>>   Pros: Simple API
> >>>   Cons: Number of ports limited to 64K
> >>>
> >>> 2) Extend both xdp_buff + xdp_md to contain a (u32) port number, allow
> >>>     eBPF to update xdp_md->port.
> >>>
> >>>   Pros: Larger number of ports.
> >>>   Cons: This require some ebpf translation steps between xdp_buff <-> xdp_md.
> >>>         (see xdp_convert_ctx_access)
> >>>
> >>> 3) Extend only xdp_buff and create bpf_helper that set port in xdp_buff.
> >>>
> >>>   Pros: Hides impl details, and allows helper to give eBPF code feedback
> >>>         (on e.g. if port doesn't exist any longer)
> >>>   Cons: Helper function call likely slower?  
> >>
> >> How about doing this the same way redirect is done in the tc case? I have this
> >> patch under test,
> >>
> >>   https://github.com/jrfastab/linux/commit/e78f5425d5e3c305b4170ddd85c61c2e15359fee  
> >
> > I have been looking at this approach, which is close to option #3 above.
> >
> > The problem with your implementation that you use a per-cpu store.
> > This creates the problem of storing state between packets. First packet
> > can call helper bpf_xdp_redirect() setting an ifindex, but program can
> > still return XDP_PASS.  Next packet can call XDP_REDIRECT and use the
> > ifindex set from the first packet.  IMHO this is a problematic API to
> > expose.
> >
> > I do see that the TC interface that uses the same approach, via helper
> > bpf_redirect().  Maybe it have the same API problem?  Looking at
> > sch_handle_ingress() I don't see this is handled (e.g. by always
> > clearing this_cpu_ptr(redirect_info)->ifindex = 0).  
> 
> It's cleared in {skb,xdp}_do_redirect() right after fetching the
> ifindex. I think this approach is just fine. The example described
> above is a misuse of the API by a buggy program calling bpf_xdp_redirect()
> and returning XDP_PASS while another time it returns XDP_REDIRECT
> without the bpf_xdp_redirect() helper, sounds very exotic, but it's
> as buggy as, say, a program doing the csum update wrong, a program
> writing the wrong data to the packet, doing adjust head on the wrong
> header offset, jumping into the wrong tail call entry and other things.

For TC I guess it is fine to keep it as is, because it is needed to
avoid extending the skb.  IMHO for XDP I see no reason to keep a
per-cpu store (which will also be slower); simply updating
xdp_buff.port should be sufficient (and it is only relevant for this
packet).
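
To make the difference concrete, here is a toy userspace model of the
two approaches.  It is only a sketch: every toy_* name is made up for
illustration and none of this is kernel code.  It assumes the driver
zeroes the per-packet port field for each new packet.

```c
#include <stdint.h>

/* Toy model only: none of these names are kernel APIs. */

/* Per-packet state, as in option 3 (port lives in the buffer). */
struct toy_buff {
    uint32_t port;
};

/* Stands in for the per-CPU store used by the helper approach. */
static uint32_t toy_percpu_ifindex;

/* Per-CPU variant: the helper writes state that outlives the packet. */
static void toy_redirect_percpu(uint32_t ifindex)
{
    toy_percpu_ifindex = ifindex;
}

/* Consumer side: fetch the ifindex, then clear it. */
static uint32_t toy_do_redirect_percpu(void)
{
    uint32_t ifindex = toy_percpu_ifindex;

    toy_percpu_ifindex = 0;
    return ifindex;
}

/* Per-packet variant: the helper only touches this packet's buffer. */
static void toy_redirect_buff(struct toy_buff *xdp, uint32_t port)
{
    xdp->port = port;
}

static uint32_t toy_do_redirect_buff(const struct toy_buff *xdp)
{
    return xdp->port;
}

/*
 * Packet 1 sets target 7 but returns a PASS verdict, so nothing
 * consumes (or clears) the store.  Packet 2 returns a REDIRECT verdict
 * without calling the helper.  Returns the target used for packet 2.
 */
static uint32_t toy_scenario_percpu(void)
{
    toy_percpu_ifindex = 0;
    toy_redirect_percpu(7);          /* packet 1 */
    return toy_do_redirect_percpu(); /* packet 2 sees the stale value */
}

/* Same sequence with per-packet state (assumed zeroed per packet). */
static uint32_t toy_scenario_buff(void)
{
    struct toy_buff pkt1 = { 0 }, pkt2 = { 0 };

    toy_redirect_buff(&pkt1, 7);        /* packet 1 */
    return toy_do_redirect_buff(&pkt2); /* packet 2 has no target set */
}
```

In this model toy_scenario_percpu() hands packet 2 the target left over
from packet 1, while toy_scenario_buff() does not.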

As noted in option #3, my concern is that a helper function call will
be slower than simply returning the needed port info.

Maybe some bpf experts can tell me if such helper call could be
optimized out with some bpf magic?

> I think encoding this into an action code is rather limiting, f.e.
> where would we place a flags argument if needed in future? Would
> that mean, we need a XDP_REDIRECT2 return code that also allows for
> encoding flags?

Nope, it will be extensible.

We can start with:

 struct xdp_ret {
     union {
         __u32 act;
         struct {
             __u16 action;
             __u16 port;
         };
     };
 };

And later change it to:

 struct xdp_ret {
     union {
         __u32 act;
         struct {
             __u8  action;
             __u8  flags;
             __u16 port;
         };
     };
 };

This works as long as the action values do not go above 255.  I would
prefer that we start with the latter, else people would argue that we
need to extend the structure like:

 struct xdp_ret {
     union {
         __u32 act;
         struct {
             union {
                 __u16 action;
                 struct {
                     __u8 action2;
                     __u8 flags;
                 };
             };
             __u16 port;
         };
     };
 };
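
As a rough sketch of how the action/flags/port layout could be packed
into and unpacked from the u32 return value: the union overlay above
only puts the action in the low bits on little-endian hosts, so this
version uses shifts and masks instead.  The toy_* names and the exact
bit positions are assumptions for illustration, not an existing API.

```c
#include <stdint.h>

/*
 * Hypothetical layout of the 32-bit return value:
 *   bits  0-7   action
 *   bits  8-15  flags
 *   bits 16-31  port
 */
#define TOY_ACTION_MASK 0xffu
#define TOY_FLAGS_SHIFT 8
#define TOY_FLAGS_MASK  0xffu
#define TOY_PORT_SHIFT  16

/* Build the return value from its three fields. */
static inline uint32_t toy_xdp_ret(uint8_t action, uint8_t flags,
                                   uint16_t port)
{
    return (uint32_t)action |
           ((uint32_t)flags << TOY_FLAGS_SHIFT) |
           ((uint32_t)port << TOY_PORT_SHIFT);
}

/* Extract the individual fields again on the receiving side. */
static inline uint8_t toy_ret_action(uint32_t act)
{
    return (uint8_t)(act & TOY_ACTION_MASK);
}

static inline uint8_t toy_ret_flags(uint32_t act)
{
    return (uint8_t)((act >> TOY_FLAGS_SHIFT) & TOY_FLAGS_MASK);
}

static inline uint16_t toy_ret_port(uint32_t act)
{
    return (uint16_t)(act >> TOY_PORT_SHIFT);
}
```

With flags and port both zero the packed value equals the plain action
code, so in this sketch a program that only returns a bare verdict
would produce the same u32 as before.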


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
