Message-ID: <20231018170045.8620-1-kuniyu@amazon.com>
Date: Wed, 18 Oct 2023 10:00:45 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <martin.lau@...ux.dev>
CC: <andrii@...nel.org>, <ast@...nel.org>, <bpf@...r.kernel.org>,
<daniel@...earbox.net>, <davem@...emloft.net>, <dsahern@...nel.org>,
<edumazet@...gle.com>, <haoluo@...gle.com>, <john.fastabend@...il.com>,
<jolsa@...nel.org>, <kpsingh@...nel.org>, <kuba@...nel.org>,
<kuni1840@...il.com>, <kuniyu@...zon.com>, <mykolal@...com>,
<netdev@...r.kernel.org>, <pabeni@...hat.com>, <sdf@...gle.com>,
<song@...nel.org>, <yonghong.song@...ux.dev>
Subject: Re: [PATCH v1 bpf-next 05/11] bpf: tcp: Add SYN Cookie generation SOCK_OPS hook.
From: Martin KaFai Lau <martin.lau@...ux.dev>
Date: Tue, 17 Oct 2023 17:54:53 -0700
> On 10/13/23 3:04 PM, Kuniyuki Iwashima wrote:
> > This patch adds a new SOCK_OPS hook to generate arbitrary SYN Cookie.
> >
> > When the kernel sends SYN Cookie to a client, the hook is invoked with
> > bpf_sock_ops.op == BPF_SOCK_OPS_GEN_SYNCOOKIE_CB if the listener has
> > BPF_SOCK_OPS_SYNCOOKIE_CB_FLAG set by bpf_sock_ops_cb_flags_set().
> >
> > The BPF program can access the following information to encode into
> > ISN:
> >
> > bpf_sock_ops.sk : 4-tuple
> > bpf_sock_ops.skb : TCP header
> > bpf_sock_ops.args[0] : MSS
> >
> > The program must encode MSS and set it to bpf_sock_ops.replylong[0],
> > which will be looped back to the paired hook added in the following
> > patch.
> >
> > Note that we do not call tcp_synq_overflow() so that the BPF program
> > can set its own expiration period.
> >
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.com>
> > ---
> > include/uapi/linux/bpf.h | 18 +++++++++++++++-
> > net/ipv4/tcp_input.c | 38 +++++++++++++++++++++++++++++++++-
> > tools/include/uapi/linux/bpf.h | 18 +++++++++++++++-
> > 3 files changed, 71 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 7ba61b75bc0e..d3cc530613c0 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -6738,8 +6738,17 @@ enum {
> > * options first before the BPF program does.
> > */
> > BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG = (1<<6),
> > + /* Call bpf when the kernel generates SYN Cookie (ISN) for SYN+ACK.
> > + *
> > + * The bpf prog will be called to encode MSS into SYN Cookie with
> > + * sock_ops->op == BPF_SOCK_OPS_GEN_SYNCOOKIE_CB.
> > + *
> > + * Please refer to the comment in BPF_SOCK_OPS_GEN_SYNCOOKIE_CB for
> > + * input and output.
> > + */
> > + BPF_SOCK_OPS_SYNCOOKIE_CB_FLAG = (1<<7),
> > /* Mask of all currently supported cb flags */
> > - BPF_SOCK_OPS_ALL_CB_FLAGS = 0x7F,
> > + BPF_SOCK_OPS_ALL_CB_FLAGS = 0xFF,
> > };
> >
> > /* List of known BPF sock_ops operators.
> > @@ -6852,6 +6861,13 @@ enum {
> > * by the kernel or the
> > * earlier bpf-progs.
> > */
> > + BPF_SOCK_OPS_GEN_SYNCOOKIE_CB, /* Generate SYN Cookie (ISN of
> > + * SYN+ACK).
> > + *
> > + * args[0]: MSS
> > + *
> > + * replylong[0]: ISN
> > + */
> > };
> >
> > /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
> > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> > index 584825ddd0a0..c86a737e4fe6 100644
> > --- a/net/ipv4/tcp_input.c
> > +++ b/net/ipv4/tcp_input.c
> > @@ -6966,6 +6966,37 @@ u16 tcp_get_syncookie_mss(struct request_sock_ops *rsk_ops,
> > }
> > EXPORT_SYMBOL_GPL(tcp_get_syncookie_mss);
> >
> > +#if IS_ENABLED(CONFIG_CGROUP_BPF) && IS_ENABLED(CONFIG_SYN_COOKIES)
> > +static int bpf_skops_cookie_init_sequence(struct sock *sk, struct request_sock *req,
> > + struct sk_buff *skb, __u32 *isn)
> > +{
> > + struct bpf_sock_ops_kern sock_ops;
> > + int ret;
> > +
> > + memset(&sock_ops, 0, offsetof(struct bpf_sock_ops_kern, temp));
> > +
> > + sock_ops.op = BPF_SOCK_OPS_GEN_SYNCOOKIE_CB;
> > + sock_ops.sk = req_to_sk(req);
> > + sock_ops.args[0] = req->mss;
> > +
> > + bpf_skops_init_skb(&sock_ops, skb, tcp_hdrlen(skb));
> > +
> > + ret = BPF_CGROUP_RUN_PROG_SOCK_OPS_SK(&sock_ops, sk);
> > + if (ret)
> > + return ret;
> > +
> > + *isn = sock_ops.replylong[0];
>
> sock_ops.{replylong,reply} cannot be used. afaik, no existing sockops hook
> relies on {replylong,reply}. It is a union of args[4]. There could be a few
> skops bpf in the same cgrp and each of them will be run one after another. (eg.
> two skops progs want to generate cookie).
Ah, I missed that case. Looking at bpf_prog_run_array_cg(), multiple
SOCK_OPS progs can be attached and args[] is reused across them. So we
cannot use replylong[] as the output interface from the bpf prog.
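
To illustrate the clobbering, here is a minimal userspace sketch. It is
not kernel code: struct skops_model, prog_a(), prog_b() and run_progs()
are hypothetical names, and only the args/reply/replylong union is
modelled. It assumes two progs are run back to back on the same context,
as happens with several SOCK_OPS progs in one cgroup.

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* Simplified model (assumption): only the union shared by args[],
 * reply and replylong[] is reproduced here.
 */
struct skops_model {
	union {
		uint32_t args[4];
		uint32_t reply;
		uint32_t replylong[4];
	};
};

typedef int (*skops_prog_t)(struct skops_model *ctx);

/* What each hypothetical prog read from args[0]. */
static uint32_t prog_a_seen_mss, prog_b_seen_mss;

/* Hypothetical prog A: reads MSS, then writes its "ISN" to replylong[0]. */
static int prog_a(struct skops_model *ctx)
{
	prog_a_seen_mss = ctx->args[0];
	ctx->replylong[0] = 0xAAAA0000u | (ctx->args[0] & 0xFFFFu);
	return 1;
}

/* Hypothetical prog B: same logic, but args[0] now holds prog A's output. */
static int prog_b(struct skops_model *ctx)
{
	prog_b_seen_mss = ctx->args[0];
	ctx->replylong[0] = 0xBBBB0000u | (ctx->args[0] & 0xFFFFu);
	return 1;
}

/* Run both progs one after another on a single shared context. */
static uint32_t run_progs(uint32_t mss)
{
	skops_prog_t progs[] = { prog_a, prog_b };
	struct skops_model ctx;
	size_t i;

	memset(&ctx, 0, sizeof(ctx));
	ctx.args[0] = mss;

	for (i = 0; i < sizeof(progs) / sizeof(progs[0]); i++)
		progs[i](&ctx);

	return ctx.replylong[0];
}
```

With run_progs(1460), prog A reads 1460 from args[0], but prog B reads
prog A's reply instead of the MSS, and only prog B's value is left in
replylong[0] for the caller.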
>
> I don't prefer to extend the uapi 'struct bpf_sock_ops' and then the
> sock_ops_convert_ctx_access(). Adding member to the kernel 'struct
> bpf_sock_addr_kern' could still be considered if it is really needed.
>
> One option is to add kfunc to allow the bpf prog to directly update the value of
> the kernel obj (e.g. tcp_rsk(req)->snt_isn here).
Yes, we need to set snt_isn, mss, sack_ok, etc. based on _CB (if we
continue with SOCK_OPS).
>
> Also, we need to allow a bpf prog to selectively generate custom cookie for one
> SYN but fall-through to the kernel cookie for another SYN.
Initially I implemented the fallback, but the validation hook looked a
bit ugly (because of the reqsk allocation), so I removed the fallback flow.
Also, I thought it could be done with other hooks so that such a SYN would
be distributed to another listener.
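
For reference, the generation side of such a fallback could look roughly
like the userspace sketch below. All names (struct syn_model,
custom_cookie_model(), kernel_cookie_model(), pick_isn()) and the cookie
values are hypothetical placeholders, not kernel APIs or the real cookie
algorithm. It covers only the choice of ISN, not the validation side.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the fields a prog might look at for one SYN. */
struct syn_model {
	uint16_t dport;
	uint16_t mss;
};

/* Placeholder for the kernel's own cookie; NOT the real algorithm. */
static uint32_t kernel_cookie_model(const struct syn_model *syn)
{
	return 0x4B000000u | syn->mss;
}

/* Hypothetical prog: generates a custom cookie only for port 8080 and
 * reports whether it handled this SYN.
 */
static bool custom_cookie_model(const struct syn_model *syn, uint32_t *isn)
{
	if (syn->dport != 8080)
		return false;

	*isn = 0xC0000000u | syn->mss;
	return true;
}

/* Use the custom cookie if the prog produced one, else fall through. */
static uint32_t pick_isn(const struct syn_model *syn, bool *custom)
{
	uint32_t isn = 0;

	*custom = custom_cookie_model(syn, &isn);
	if (!*custom)
		isn = kernel_cookie_model(syn);

	return isn;
}
```

The caller learns through *custom which path produced the ISN for a
given SYN.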