[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87fru7ody3.fsf@toke.dk>
Date: Fri, 24 May 2024 09:40:20 +0200
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Martin KaFai Lau <martin.lau@...ux.dev>, Amery Hung <ameryhung@...il.com>
Cc: netdev@...r.kernel.org, bpf@...r.kernel.org, yangpeihao@...u.edu.cn,
daniel@...earbox.net, andrii@...nel.org, martin.lau@...nel.org,
sinquersw@...il.com, jhs@...atatu.com, jiri@...nulli.us, sdf@...gle.com,
xiyou.wangcong@...il.com, yepeilin.cs@...il.com
Subject: Re: [RFC PATCH v8 18/20] selftests: Add a bpf fq qdisc to selftest
Martin KaFai Lau <martin.lau@...ux.dev> writes:
> [ ... ]
>
>> +SEC("struct_ops/bpf_fq_dequeue")
>> +struct sk_buff *BPF_PROG(bpf_fq_dequeue, struct Qdisc *sch)
>> +{
>> + struct dequeue_nonprio_ctx cb_ctx = {};
>> + struct sk_buff *skb = NULL;
>> +
>> + skb = fq_dequeue_prio();
>> + if (skb) {
>> + bpf_skb_set_dev(skb, sch);
>> + return skb;
>> + }
>> +
>> + ktime_cache = dequeue_now = bpf_ktime_get_ns();
>> + fq_check_throttled();
>> + bpf_loop(q_plimit, fq_dequeue_nonprio_flows, &cb_ctx, 0);
>> +
>> + skb = get_stashed_skb();
>> +
>> + if (skb) {
>> + bpf_skb_set_dev(skb, sch);
>> + return skb;
>> + }
>> +
>> + if (cb_ctx.expire)
>> + bpf_qdisc_watchdog_schedule(sch, cb_ctx.expire, q_timer_slack);
>> +
>> + return NULL;
>> +}
>
> The enqueue and dequeue are using the bpf map (e.g. arraymap) or global var
> (also an arraymap). Potentially, the map can be shared by different qdisc
> instances (sch) and they could be attached to different net devices also. Not
> sure if there is potentail issue? e.g. the bpf_fq_reset below.
> or a bpf prog dequeue a skb with a different skb->dev.
I think behaviour like this is potentially quite interesting and will
allow some neat optimisations (skipping a redirect to a different
interface and just directly enqueueing it to a different place comes to
mind). However, as you point out it may lead to weird things like a
mismatched skb->dev, so if we allow this we should make sure that the
kernel will disallow (or fix) such behaviour. I'm not sure how difficult
that is; for instance, I *think* the mismatched skb->dev can lead to
bugs, but I'm not quite sure.
So maybe it's better to disallow "crossing over" like this, and relax
that restriction later if/when we have a concrete use case?
-Toke
Powered by blists - more mailing lists