Message-ID: <87le7ndo4z.fsf@toke.dk>
Date: Wed, 14 Feb 2024 17:08:44 +0100
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Jesper Dangaard Brouer <hawk@...nel.org>, bpf@...r.kernel.org,
netdev@...r.kernel.org, Björn Töpel
<bjorn@...nel.org>, "David S. Miller"
<davem@...emloft.net>, Alexei Starovoitov <ast@...nel.org>, Andrii
Nakryiko <andrii@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, Eric
Dumazet <edumazet@...gle.com>, Hao Luo <haoluo@...gle.com>, Jakub Kicinski
<kuba@...nel.org>, Jiri Olsa <jolsa@...nel.org>, John Fastabend
<john.fastabend@...il.com>, Jonathan Lemon <jonathan.lemon@...il.com>, KP
Singh <kpsingh@...nel.org>, Maciej Fijalkowski
<maciej.fijalkowski@...el.com>, Magnus Karlsson
<magnus.karlsson@...el.com>, Martin KaFai Lau <martin.lau@...ux.dev>,
Paolo Abeni <pabeni@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Song Liu <song@...nel.org>, Stanislav Fomichev <sdf@...gle.com>, Thomas
Gleixner <tglx@...utronix.de>, Yonghong Song <yonghong.song@...ux.dev>
Subject: Re: [PATCH RFC net-next 1/2] net: Reference bpf_redirect_info via
task_struct on PREEMPT_RT.

Sebastian Andrzej Siewior <bigeasy@...utronix.de> writes:

> On 2024-02-14 14:23:10 [+0100], Toke Høiland-Jørgensen wrote:
>> Sebastian Andrzej Siewior <bigeasy@...utronix.de> writes:
>>
>> > On 2024-02-13 21:50:51 [+0100], Jesper Dangaard Brouer wrote:
>> >> I generally like the idea around bpf_xdp_storage.
>> >>
>> >> I only skimmed the code, but noticed some extra if-statements (for
>> >> !NULL). I don't think they will make a difference, but I know Toke wants
>> >> me to test it...
>> >
>> > I've been looking at the assembly for the return value of
>> > bpf_redirect_info() and there is a NULL pointer check. I hoped it was
>> > obvious to be non-NULL because it is a static struct.
>> >
>> > Should this become a problem I could add
>> > "__attribute__((returns_nonnull))" to the declaration of the function
>> > which will optimize the NULL check away.
>>
>> If we know the function will never return NULL (I was wondering about
>> that, actually), why have the check in the C code at all? Couldn't we just
>> omit it entirely instead of relying on the compiler to optimise it out?
>
> The !RT version does:
> | static inline struct bpf_redirect_info *xdp_storage_get_ri(void)
> | {
> | 	return this_cpu_ptr(&bpf_redirect_info);
> | }
>
> which is static and can't be NULL (unless by mysterious ways the per-CPU
> offset + bpf_redirect_info offset is NULL). Maybe I can put this in
> this_cpu_ptr()… Let me think about it.
>
> For RT I have:
> | static inline struct bpf_xdp_storage *xdp_storage_get(void)
> | {
> | 	struct bpf_xdp_storage *xdp_store = current->bpf_xdp_storage;
> |
> | 	WARN_ON_ONCE(!xdp_store);
> | 	return xdp_store;
> | }
> |
> | static inline struct bpf_redirect_info *xdp_storage_get_ri(void)
> | {
> | 	struct bpf_xdp_storage *xdp_store = xdp_storage_get();
> |
> | 	if (!xdp_store)
> | 		return NULL;
> | 	return &xdp_store->ri;
> | }
>
> so if current->bpf_xdp_storage is NULL then we get a warning and a NULL
> pointer. This *should* not happen due to xdp_storage_set() which
> assigns the pointer. However if I missed a spot then there is the check
> which aborts further processing.
>
> During testing I forgot a spot in egress and the test module. You could
> argue that the warning is enough since it should pop up in testing and
> not production because the code is always missed and not by chance (go
> boom, send a report). I *think* I covered all spots, at least the test
> suite didn't point anything out to me.

Well, I would prefer if we could make sure we covered everything and not
have this odd failure mode where redirect just mysteriously stops
working. At the very least, if we keep the check we should have a
WARN_ON in there to make it really obvious that something needs to be
fixed.
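
To illustrate what I mean by keeping the warning right next to the check,
here is a rough userspace model, not kernel code. The quoted RT helper
already warns in xdp_storage_get(); the sketch just folds the warning and
the NULL return into one place so the two cannot drift apart. It relies on
the fact that the kernel's WARN_ON_ONCE() evaluates to its condition. The
names warn_on_once(), warn_count and current_xdp_storage are hypothetical
stand-ins (for WARN_ON_ONCE() and current->bpf_xdp_storage), and the struct
layouts are reduced to a single made-up field.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Reduced stand-ins for the kernel structures; the field is illustrative. */
struct bpf_redirect_info { int tgt_index; };
struct bpf_xdp_storage { struct bpf_redirect_info ri; };

/* Hypothetical stand-in for current->bpf_xdp_storage. */
static struct bpf_xdp_storage *current_xdp_storage;

/* Counts how many warnings were actually emitted (for demonstration). */
static int warn_count;

/*
 * Hypothetical userspace model of WARN_ON_ONCE(): report the first time
 * the condition is true, and always return the condition so that it can
 * be used directly inside an if ().
 */
static bool warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* The check and the warning live in the same statement. */
static struct bpf_redirect_info *xdp_storage_get_ri(void)
{
	static bool warned;
	struct bpf_xdp_storage *xdp_store = current_xdp_storage;

	if (warn_on_once(!xdp_store, &warned, "bpf_xdp_storage not set"))
		return NULL;
	return &xdp_store->ri;
}
```

With this shape a missed xdp_storage_set() call site still returns NULL to
the caller, but it cannot do so silently.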

This brings me to another thing I was going to point out separately, but
may as well mention it here: It would be good if we could keep the
difference between the RT and !RT versions as small as possible to avoid
having subtle bugs that only appear in one configuration.

I agree with Jesper that the concept of a stack-allocated "run context"
for the XDP program makes sense in general (and I have some vague ideas
about other things that may be useful to stick in there). So I'm
wondering if it makes sense to do that even in the !RT case? We can't
stick the pointer to it into 'current' when running in softirq, but we
could change the per-cpu variable to just be a pointer that gets
populated by xdp_storage_set()?

I'm not really sure if this would be performance neutral (it's just
moving around a few bits of memory, but we do gain an extra pointer
deref), but it should be simple enough to benchmark.
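
Roughly what I have in mind for the !RT side, as a userspace model rather
than kernel code: the per-CPU variable becomes a pointer, and the run
context lives on the caller's stack. Everything here is an assumption for
the sake of illustration: the array xdp_storage_ptr[] stands in for a
per-CPU pointer, the explicit cpu argument stands in for this_cpu_ptr(),
the xdp_storage_set() signature (returning the previous pointer) and
run_xdp_prog() are made up, and the structs are reduced to one field.

```c
#include <stddef.h>

#define NR_CPUS 4	/* arbitrary size for the model */

/* Reduced stand-ins for the kernel structures; the field is illustrative. */
struct bpf_redirect_info { int tgt_index; };
struct bpf_xdp_storage { struct bpf_redirect_info ri; };

/* Models a per-CPU *pointer* instead of a per-CPU struct. */
static struct bpf_xdp_storage *xdp_storage_ptr[NR_CPUS];

/* Install a run context for this CPU and return the previous one. */
static struct bpf_xdp_storage *xdp_storage_set(int cpu,
					       struct bpf_xdp_storage *ctx)
{
	struct bpf_xdp_storage *old = xdp_storage_ptr[cpu];

	xdp_storage_ptr[cpu] = ctx;
	return old;
}

/* This is where the extra pointer dereference comes from. */
static struct bpf_redirect_info *xdp_storage_get_ri(int cpu)
{
	struct bpf_xdp_storage *xdp_store = xdp_storage_ptr[cpu];

	return xdp_store ? &xdp_store->ri : NULL;
}

/* Hypothetical caller: the context is stack-allocated for one run. */
static int run_xdp_prog(int cpu, int ifindex)
{
	struct bpf_xdp_storage ctx = { .ri = { .tgt_index = 0 } };
	struct bpf_xdp_storage *old = xdp_storage_set(cpu, &ctx);
	struct bpf_redirect_info *ri = xdp_storage_get_ri(cpu);
	int ret;

	ri->tgt_index = ifindex;	/* what a redirect helper might record */
	ret = ri->tgt_index;

	xdp_storage_set(cpu, old);	/* restore before ctx goes out of scope */
	return ret;
}
```

The getter then has the same shape in both configurations, with only the
place where the pointer is stored differing between RT and !RT.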

> I was unsure if I need something around net_tx_action() due to
> TC_ACT_REDIRECT (I think qdisc) but this seems to be handled by
> sch_handle_egress().

Yup, I believe you're correct.

-Toke