Message-ID: <20230525143508.GA21064@fastly.com>
Date:   Thu, 25 May 2023 07:35:08 -0700
From:   Joe Damato <jdamato@...tly.com>
To:     Yonghong Song <yhs@...a.com>
Cc:     bpf@...r.kernel.org, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org, ast@...nel.org, edumazet@...gle.com,
        martin.lau@...ux.dev, song@...nel.org, john.fastabend@...il.com,
        kpsingh@...nel.org, sdf@...gle.com, jolsa@...nel.org,
        haoluo@...gle.com
Subject: Re: [PATCH bpf-next] bpf: Export rx queue info for reuseport ebpf prog

On Wed, May 24, 2023 at 10:26:32PM -0700, Yonghong Song wrote:
> 
> 
> On 5/24/23 8:37 PM, Joe Damato wrote:
> >BPF_PROG_TYPE_SK_REUSEPORT / sk_reuseport ebpf programs do not have
> >access to the queue_mapping or napi_id of the incoming skb. Having
> >this information can help ebpf progs determine which listen socket to
> >select.
> >
> >This patch exposes both queue_mapping and napi_id so that
> >sk_reuseport ebpf programs can use this information to direct incoming
> >connections to the correct listen socket in the SOCKMAP.
> >
> >For example:
> >
> >A multi-threaded userland program with several threads accepting client
> >connections via a reuseport listen socket group might want to direct
> >incoming connections from specific receive queues (or NAPI IDs) to specific
> >listen sockets to maximize locality or for use with epoll busy-poll.
> >
> >Signed-off-by: Joe Damato <jdamato@...tly.com>
> >---
> >  include/uapi/linux/bpf.h |  2 ++
> >  net/core/filter.c        | 10 ++++++++++
> >  2 files changed, 12 insertions(+)
> >
> >diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> >index 9273c654743c..31560b506535 100644
> >--- a/include/uapi/linux/bpf.h
> >+++ b/include/uapi/linux/bpf.h
> >@@ -6286,6 +6286,8 @@ struct sk_reuseport_md {
> >  	 */
> >  	__u32 eth_protocol;
> >  	__u32 ip_protocol;	/* IP protocol. e.g. IPPROTO_TCP, IPPROTO_UDP */
> >+	__u32 rx_queue_mapping; /* Rx queue associated with the skb */
> >+	__u32 napi_id;          /* napi id associated with the skb */
> >  	__u32 bind_inany;	/* Is sock bound to an INANY address? */
> >  	__u32 hash;		/* A hash of the packet 4 tuples */
> 
> This won't work. You will need to append to the end of the data structure
> to keep it backward compatible.
> 
> Also, recent kernels have a kfunc, bpf_cast_to_kern_ctx(), which converts
> a ctx to a kernel ctx; you can then use tracing-style code to
> access those fields. In this particular case, you can do
> 
>    struct sk_reuseport_kern *kctx = bpf_cast_to_kern_ctx(ctx);
> 
> We have
> 
> struct sk_reuseport_kern {
>         struct sk_buff *skb;
>         struct sock *sk;
>         struct sock *selected_sk;
>         struct sock *migrating_sk;
>         void *data_end;
>         u32 hash;
>         u32 reuseport_id;
>         bool bind_inany;
> };
> 
> Through the sk and skb pointers, you should be able to access the fields
> presented in this patch. You can access more fields too.
> 
> So with bpf_cast_to_kern_ctx(), there is no need for more uapi changes.
> Please give it a try.

Thanks! I was looking at an LTS kernel tree that didn't have
bpf_cast_to_kern_ctx; this is very helpful and definitely a better way to
go.
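
For anyone finding this thread later, here is a minimal sketch of the
lookup, modeled in plain C so it can be compiled and tested outside the
kernel. The two structs are cut-down stand-ins for the kernel's
struct sk_buff and struct sk_reuseport_kern, and pick_listener() and
nr_listeners are hypothetical names, not part of any patch. In an actual
sk_reuseport program the types would come from vmlinux.h, kctx would be
the result of bpf_cast_to_kern_ctx(ctx), and the returned index would be
used as the key for bpf_sk_select_reuseport(). It assumes listen sockets
are indexed 0..nr_listeners-1 and that sk_buff::queue_mapping holds the
recorded rx queue plus one (0 meaning none recorded), which is how
skb_record_rx_queue() stores it.

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-in for the kernel's struct sk_buff: only the two fields the
 * original patch wanted to expose. Real programs get this from vmlinux.h. */
struct sk_buff {
	uint16_t queue_mapping;	/* rx queue + 1, or 0 if none was recorded */
	unsigned int napi_id;	/* NAPI ID that delivered the skb */
};

/* Stand-in for struct sk_reuseport_kern; the other members quoted
 * earlier in this thread are omitted because the sketch does not use them. */
struct sk_reuseport_kern {
	struct sk_buff *skb;
};

/*
 * Hypothetical helper: map the skb's rx queue to a listen socket index.
 * Returns -1 when there is nothing to go on, in which case a real
 * program would fall back to the default reuseport selection.
 */
static inline int pick_listener(const struct sk_reuseport_kern *kctx,
				uint32_t nr_listeners)
{
	uint16_t qm;

	/* Defensive checks: no context, no skb, or no listeners. */
	if (!kctx || !kctx->skb || nr_listeners == 0)
		return -1;

	qm = kctx->skb->queue_mapping;
	if (qm == 0)		/* driver did not record an rx queue */
		return -1;

	/* Undo the +1 offset, then fold queues onto the listener array. */
	return (int)(((uint32_t)qm - 1) % nr_listeners);
}
```

The same shape would work with napi_id as the key instead, which may be
the better fit when the goal is epoll busy-poll rather than plain queue
locality.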

Sorry for the noise.
