Message-ID: <CAOftzPjodTDZHg+Y7ayR-JX7LMZ3PXfYWBgWDonXJj_1mhZaqA@mail.gmail.com>
Date: Thu, 13 Sep 2018 14:17:17 -0700
From: Joe Stringer <joe@...d.net.nz>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Joe Stringer <joe@...d.net.nz>, daniel@...earbox.net,
netdev <netdev@...r.kernel.org>, ast@...nel.org,
john fastabend <john.fastabend@...il.com>, tgraf@...g.ch,
Martin KaFai Lau <kafai@...com>,
Nitin Hande <nitin.hande@...il.com>, mauricio.vasquez@...ito.it
Subject: Re: [PATCH bpf-next 07/11] bpf: Add helper to retrieve socket in BPF
On Thu, 13 Sep 2018 at 14:02, Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
>
> On Thu, Sep 13, 2018 at 01:55:01PM -0700, Joe Stringer wrote:
> > On Thu, 13 Sep 2018 at 12:06, Alexei Starovoitov
> > <alexei.starovoitov@...il.com> wrote:
> > >
> > > On Wed, Sep 12, 2018 at 5:06 PM, Alexei Starovoitov
> > > <alexei.starovoitov@...il.com> wrote:
> > > > On Tue, Sep 11, 2018 at 05:36:36PM -0700, Joe Stringer wrote:
> > > >> This patch adds new BPF helper functions, bpf_sk_lookup_tcp() and
> > > >> bpf_sk_lookup_udp() which allows BPF programs to find out if there is a
> > > >> socket listening on this host, and returns a socket pointer which the
> > > >> BPF program can then access to determine, for instance, whether to
> > > >> forward or drop traffic. bpf_sk_lookup_xxx() may take a reference on the
> > > >> socket, so when a BPF program makes use of this function, it must
> > > >> subsequently pass the returned pointer into the newly added sk_release()
> > > >> to return the reference.
> > > >>
> > > >> By way of example, the following pseudocode would filter inbound
> > > >> connections at XDP if there is no corresponding service listening for
> > > >> the traffic:
> > > >>
> > > >> struct bpf_sock_tuple tuple;
> > > >> struct bpf_sock_ops *sk;
> > > >>
> > > >> populate_tuple(ctx, &tuple); // Extract the 5tuple from the packet
> > > >> sk = bpf_sk_lookup_tcp(ctx, &tuple, sizeof tuple, netns, 0);
> > > > ...
> > > >> +struct bpf_sock_tuple {
> > > >> + union {
> > > >> + __be32 ipv6[4];
> > > >> + __be32 ipv4;
> > > >> + } saddr;
> > > >> + union {
> > > >> + __be32 ipv6[4];
> > > >> + __be32 ipv4;
> > > >> + } daddr;
> > > >> + __be16 sport;
> > > >> + __be16 dport;
> > > >> + __u8 family;
> > > >> +};
> > > >
> > > > since we can pass ptr_to_packet into map lookup and other helpers now,
> > > > can you move 'family' out of bpf_sock_tuple and combine with netns_id arg?
> > > > then progs wouldn't need to copy bytes from the packet into tuple
> > > > to do a lookup.
> >
> > If I follow, you're proposing that users should be able to pass a
> > pointer to the source address field of the L3 header, and assuming
> > that the L3 header ends with saddr+daddr (no options/extheaders), and
> > is immediately followed by the sport/dport then a packet pointer
> > should work for performing socket lookup. Then it is up to the BPF
> > program writer to ensure that this is the case, or otherwise fall back
> > to populating a copy of the sock tuple on the stack.
>
> yep.
>
> > > have been thinking more about it.
> > > since only ipv4 and ipv6 supported may be use size of bpf_sock_tuple
> > > to infer family inside the helper, so it doesn't need to be passed explicitly?
> >
> > Let me make sure I understand the proposal here.
> >
> > The current structure and function prototypes are:
> >
> > struct bpf_sock_tuple {
> > union {
> > __be32 ipv6[4];
> > __be32 ipv4;
> > } saddr;
> > union {
> > __be32 ipv6[4];
> > __be32 ipv4;
> > } daddr;
> > __be16 sport;
> > __be16 dport;
> > __u8 family;
> > };
> ...
> > You're proposing something like:
> >
> > struct bpf_sock_tuple4 {
> > __be32 saddr;
> > __be32 daddr;
> > __be16 sport;
> > __be16 dport;
> > __u8 family;
> > };
> >
> > struct bpf_sock_tuple6 {
> > __be32 saddr[4];
> > __be32 daddr[4];
> > __be16 sport;
> > __be16 dport;
> > __u8 family;
> > };
>
> I think the split is unnecessary.
> I'm proposing:
> struct bpf_sock_tuple {
> union {
> __be32 ipv6[4];
> __be32 ipv4;
> } saddr;
> union {
> __be32 ipv6[4];
> __be32 ipv4;
> } daddr;
> __be16 sport;
> __be16 dport;
> };
>
> that points directly into the packet (when ipv4 options are not there)
> and bpf_sk_lookup_tcp() uses 'size' argument to figure out ipv4/ipv6 family.

It needs to be subtly different; otherwise the 'sport'/'dport' offsets
would be wrong in the IPv4 case, because each address union is always
16 bytes wide:

$ cat foo.c
#include <linux/types.h>

struct bpf_sock_tuple {
        union {
                __be32 ipv6[4];
                __be32 ipv4;
        } saddr;
        union {
                __be32 ipv6[4];
                __be32 ipv4;
        } daddr;
        __be16 sport;
        __be16 dport;
};

int main(int argc, char *argv[]) {
        struct bpf_sock_tuple tuple;
        return 0;
}
$ gcc -g ./foo.c -o foo.o
$ pahole foo.o
struct bpf_sock_tuple {
        union {
                __be32 ipv6[4];          /* 16 */
                __be32 ipv4;             /* 4 */
        } saddr;                         /* 0 16 */
        union {
                __be32 ipv6[4];          /* 16 */
                __be32 ipv4;             /* 4 */
        } daddr;                         /* 16 16 */
        __be16 sport;                    /* 32 2 */
        __be16 dport;                    /* 34 2 */

        /* size: 36, cachelines: 1, members: 4 */
        /* last cacheline: 36 bytes */
};
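
The same mismatch can be checked at compile time. This is only a minimal
sketch, not kernel code: `be32`/`be16` stand in for `__be32`/`__be16`, and
`struct ipv4_wire_view` is a hypothetical name for the bytes found at the
source address of an IPv4 header without options, followed directly by the
L4 ports.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

/* Stand-ins for the kernel's __be32/__be16 (assumption, userspace only). */
typedef uint32_t be32;
typedef uint16_t be16;

/* Single-struct layout from the quoted proposal: each address is 16 bytes. */
struct tuple_union_members {
        union { be32 ipv6[4]; be32 ipv4; } saddr;
        union { be32 ipv6[4]; be32 ipv4; } daddr;
        be16 sport;
        be16 dport;
};

/* Hypothetical view of an option-less IPv4 packet, starting at saddr. */
struct ipv4_wire_view {
        be32 saddr;
        be32 daddr;
        be16 sport;
        be16 dport;
};

/* In the struct, daddr and the ports sit after 16-byte address slots... */
static_assert(offsetof(struct tuple_union_members, daddr) == 16, "daddr");
static_assert(offsetof(struct tuple_union_members, sport) == 32, "sport");

/* ...but in the packet they follow 4-byte IPv4 addresses. */
static_assert(offsetof(struct ipv4_wire_view, daddr) == 4, "daddr");
static_assert(offsetof(struct ipv4_wire_view, sport) == 8, "sport");

/* Distance in bytes between where the struct expects 'sport' and where
 * it actually is when the pointer refers to IPv4 packet data. */
size_t sport_offset_mismatch(void)
{
        return offsetof(struct tuple_union_members, sport) -
               offsetof(struct ipv4_wire_view, sport);
}
```

So a packet pointer interpreted through the single struct would read the
IPv4 ports 24 bytes past their real location.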
---
We could take my definitions above and do the following if we want to
try to type the helper definition:

union bpf_sock_tuple {
        struct bpf_sock_tuple4 t4;
        struct bpf_sock_tuple6 t6;
};
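
As a rough sketch of how the 'size' argument could then select the family:
the code below assumes the 'family' member is dropped from both per-family
structs, as discussed earlier in this thread. The `example_*` names, the
`FAMILY_*` constants and `family_from_size()` are hypothetical and only
illustrate the idea; this is not the helper's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-family tuples without 'family' (assumption); uint32_t/uint16_t
 * stand in for __be32/__be16 so this builds without kernel headers. */
struct example_tuple4 {
        uint32_t saddr;
        uint32_t daddr;
        uint16_t sport;
        uint16_t dport;
};

struct example_tuple6 {
        uint32_t saddr[4];
        uint32_t daddr[4];
        uint16_t sport;
        uint16_t dport;
};

/* Both views start at offset 0, so a pointer to the source address in
 * the packet can be read through whichever member matches the family. */
union example_tuple {
        struct example_tuple4 t4;
        struct example_tuple6 t6;
};

/* Hypothetical family codes, not the kernel's AF_* values. */
enum { FAMILY_UNKNOWN = 0, FAMILY_V4 = 4, FAMILY_V6 = 6 };

/* Infer the address family from the size passed by the caller. */
int family_from_size(size_t size)
{
        if (size == sizeof(struct example_tuple4))
                return FAMILY_V4;
        if (size == sizeof(struct example_tuple6))
                return FAMILY_V6;
        return FAMILY_UNKNOWN;
}
```

A caller would pass sizeof(tuple->t4) or sizeof(tuple->t6) as the size, and
any other value could be rejected.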