[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <987c3ca8-156b-47ed-b0b6-ed6d7d54d168@linutronix.de>
Date: Fri, 5 Jul 2024 13:12:04 +0200
From: Florian Kauer <florian.kauer@...utronix.de>
To: Toke Høiland-Jørgensen <toke@...hat.com>,
ast@...nel.org, daniel@...earbox.net, john.fastabend@...il.com
Cc: davem@...emloft.net, kuba@...nel.org, hawk@...nel.org,
edumazet@...gle.com, pabeni@...hat.com, andrii@...nel.org,
martin.lau@...ux.dev, eddyz87@...il.com, song@...nel.org,
yonghong.song@...ux.dev, kpsingh@...nel.org, sdf@...gle.com,
haoluo@...gle.com, jolsa@...nel.org, netdev@...r.kernel.org,
bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
xdp-newbies@...r.kernel.org
Subject: Re: [PATCH] bpf: provide map key to BPF program after redirect
On 7/5/24 13:01, Toke Høiland-Jørgensen wrote:
> Florian Kauer <florian.kauer@...utronix.de> writes:
>
>> Both DEVMAP as well as CPUMAP provide the possibility
>> to attach BPF programs to their entries that will be
>> executed after a redirect was performed.
>>
>> With BPF_F_BROADCAST it is in also possible to execute
>> BPF programs for multiple clones of the same XDP frame
>> which is, for example, useful for establishing redundant
>> traffic paths by setting, for example, different VLAN tags
>> for the replicated XDP frames.
>>
>> Currently, this program itself has no information about
>> the map entry that led to its execution. While egress_ifindex
>> can be used to get this information indirectly and can
>> be used for path dependent processing of the replicated frames,
>> it does not work if multiple entries share the same egress_ifindex.
>>
>> Therefore, extend the xdp_md struct with a map_key
>> that contains the key of the associated map entry
>> after performing a redirect.
>>
>> See
>> https://lore.kernel.org/xdp-newbies/5eb6070c-a12e-4d4c-a9f0-a6a6fafa41d1@linutronix.de/T/#u
>> for the discussion that led to this patch.
>>
>> Signed-off-by: Florian Kauer <florian.kauer@...utronix.de>
>> ---
>> include/net/xdp.h | 3 +++
>> include/uapi/linux/bpf.h | 2 ++
>> kernel/bpf/devmap.c | 6 +++++-
>> net/core/filter.c | 18 ++++++++++++++++++
>> 4 files changed, 28 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/net/xdp.h b/include/net/xdp.h
>> index e6770dd40c91..e70f4dfea1a2 100644
>> --- a/include/net/xdp.h
>> +++ b/include/net/xdp.h
>> @@ -86,6 +86,7 @@ struct xdp_buff {
>> struct xdp_txq_info *txq;
>> u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/
>> u32 flags; /* supported values defined in xdp_buff_flags */
>> + u64 map_key; /* set during redirect via a map */
>> };
>>
>> static __always_inline bool xdp_buff_has_frags(struct xdp_buff *xdp)
>> @@ -175,6 +176,7 @@ struct xdp_frame {
>> struct net_device *dev_rx; /* used by cpumap */
>> u32 frame_sz;
>> u32 flags; /* supported values defined in xdp_buff_flags */
>> + u64 map_key; /* set during redirect via a map */
>> };
>
> struct xdp_frame is size constrained, so we shouldn't be using precious
> space on this. Besides, it's not information that should be carried
> along with the packet after transmission. So let's put it into struct
> xdp_txq_info and read it from there the same way we do for egress_ifindex :)
Very reasonable, but do you really mean struct xdp_frame or xdp_buff?
Only the latter has the xdp_txq_info?
>
>> static __always_inline bool xdp_frame_has_frags(struct xdp_frame *frame)
>> @@ -257,6 +259,7 @@ void xdp_convert_frame_to_buff(struct xdp_frame *frame, struct xdp_buff *xdp)
>> xdp->data_meta = frame->data - frame->metasize;
>> xdp->frame_sz = frame->frame_sz;
>> xdp->flags = frame->flags;
>> + xdp->map_key = frame->map_key;
>> }
>>
>> static inline
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 35bcf52dbc65..7dbb0f2a236c 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -6455,6 +6455,8 @@ struct xdp_md {
>> __u32 rx_queue_index; /* rxq->queue_index */
>>
>> __u32 egress_ifindex; /* txq->dev->ifindex */
>> +
>> + __u64 map_key; /* set during redirect via a map in xdp_buff */
>> };
>
> Maybe make the comment a bit easier to understand? Something like "key
> of devmap/cpumap entry that is executing"?
Agreed, thanks!
>
> -Toke
>
Powered by blists - more mailing lists