lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CANn89iKYvgikRmHc5MOS-eEBwy9-4+ODf5HNYnL2NUcpGBUXTQ@mail.gmail.com>
Date: Wed, 12 Nov 2025 04:21:57 -0800
From: Eric Dumazet <edumazet@...gle.com>
To: Jamal Hadi Salim <jhs@...atatu.com>
Cc: Victor Nogueira <victor@...atatu.com>, 
	syzbot ci <syzbot+ci51c71986dfbbfee2@...kaller.appspotmail.com>, 
	Daniel Borkmann <daniel@...earbox.net>, davem@...emloft.net, eric.dumazet@...il.com, 
	horms@...nel.org, jiri@...nulli.us, kuba@...nel.org, kuniyu@...gle.com, 
	netdev@...r.kernel.org, pabeni@...hat.com, toke@...hat.com, 
	willemb@...gle.com, xiyou.wangcong@...il.com, syzbot@...ts.linux.dev, 
	syzkaller-bugs@...glegroups.com, Alexei Starovoitov <ast@...nel.org>
Subject: Re: [syzbot ci] Re: net_sched: speedup qdisc dequeue

On Tue, Nov 11, 2025 at 1:34 PM Jamal Hadi Salim <jhs@...atatu.com> wrote:
>
> On Tue, Nov 11, 2025 at 4:04 PM Victor Nogueira <victor@...atatu.com> wrote:
> >
> > Hi Eric,
> >
> > On Tue, Nov 11, 2025 at 4:44 PM Eric Dumazet <edumazet@...gle.com> wrote:
> > >
> > > On Tue, Nov 11, 2025 at 11:23 AM Eric Dumazet <edumazet@...gle.com> wrote:
> > > >
> > > > On Tue, Nov 11, 2025 at 8:28 AM Eric Dumazet <edumazet@...gle.com> wrote:
> > > > >
> > > > > On Tue, Nov 11, 2025 at 6:09 AM syzbot ci
> > > > > <syzbot+ci51c71986dfbbfee2@...kaller.appspotmail.com> wrote:
> > > > > >
> > > > > > syzbot ci has tested the following series
> > > > > >
> > > > > > [v2] net_sched: speedup qdisc dequeue
> > > > > > [...]
> > > > > > and found the following issue:
> > > > > > WARNING in sk_skb_reason_drop
> > > > > >
> > > > > > Full report is available here:
> > > > > > https://ci.syzbot.org/series/a9dbee91-6b1f-4ab9-b55d-43f7f50de064
> > > > > >
> > > > > > ***
> > > > > >
> > > > > > WARNING in sk_skb_reason_drop
> > > > > > [...]
> > > > struct bpf_skb_data_end {
> > > >   struct qdisc_skb_cb qdisc_cb;
> > > >   void *data_meta;
> > > >    void *data_end;
> > > > };
> > > >
> > > > So anytime BPF calls bpf_compute_data_pointers(), it overwrites
> > > > tc_skb_cb(skb)->drop_reason,
> > > > because offsetof(   ..., data_meta) == offsetof(... drop_reason)
> > > >
> > > > CC Victor and Daniel
> > >
> > > Quick and dirty patch to save/restore the space.
> > >
> > > diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
> > > index 7fbe42f0e5c2b7aca0a28c34cd801c3a767c804e..004d8fe2f29d89bd7df82d90b7a1e2881f7a463b
> > > 100644
> > > --- a/net/sched/cls_bpf.c
> > > +++ b/net/sched/cls_bpf.c
> > > @@ -82,11 +82,16 @@ TC_INDIRECT_SCOPE int cls_bpf_classify(struct sk_buff *skb,
> > >                                        const struct tcf_proto *tp,
> > >                                        struct tcf_result *res)
> > >  {
> > > +       struct bpf_skb_data_end *cb = (struct bpf_skb_data_end *)skb->cb;
> > >         struct cls_bpf_head *head = rcu_dereference_bh(tp->root);
> > >         bool at_ingress = skb_at_tc_ingress(skb);
> > >         struct cls_bpf_prog *prog;
> > > +       void *save[2];
> > >         int ret = -1;
> > >
> > > +       save[0] = cb->data_meta;
> > > +       save[1] = cb->data_end;
> > > +
> > >         list_for_each_entry_rcu(prog, &head->plist, link) {
> > >                 int filter_res;
> > >
> > > @@ -133,7 +138,8 @@ TC_INDIRECT_SCOPE int cls_bpf_classify(struct sk_buff *skb,
> > >
> > >                 break;
> > >         }
> > > -
> > > +       cb->data_meta = save[0];
> > > +       cb->data_end = save[1];
> > >         return ret;
> > >  }
> >
> > I think you are on the right track.
> > Maybe we can create helper functions for this.
> > Something like bpf_compute_and_save_data_end [1] and
> > and bpf_restore_data_end [2], but for data_meta as well.
> > Also, I think we might have the same issue in tcf_bpf_act [3].
> >
> > [1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/tree/include/linux/filter.h#n907
> > [2] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/tree/include/linux/filter.h#n917
> > [3] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/tree/net/sched/act_bpf.c#n50
> >
>
> Digging a bit - when you send the fixes, this overwritting i believe
> was introduced in:
> commit db58ba45920255e967cc1d62a430cebd634b5046

I will test the following :

 include/linux/filter.h |   20 ++++++++++++++++++++
 net/sched/act_bpf.c    |    7 +++----
 net/sched/cls_bpf.c    |    6 ++----
 3 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index a104b39942305af245b9f2938a0acf7d7ab33c23..03e7516c61872c1aa98e0be743abb96d496e49c3
100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -901,6 +901,26 @@ static inline void
bpf_compute_data_pointers(struct sk_buff *skb)
        cb->data_end  = skb->data + skb_headlen(skb);
 }

+static inline int bpf_prog_run_data_pointers(
+       const struct bpf_prog *prog,
+       struct sk_buff *skb)
+{
+       struct bpf_skb_data_end *cb = (struct bpf_skb_data_end *)skb->cb;
+       void *save_data_meta, *save_data_end;
+       int res;
+
+       save_data_meta = cb->data_meta;
+       save_data_end = cb->data_end;
+
+       bpf_compute_data_pointers(skb);
+       res = bpf_prog_run(prog, skb);
+
+       cb->data_meta = save_data_meta;
+       cb->data_end = save_data_end;
+
+       return res;
+}
+
 /* Similar to bpf_compute_data_pointers(), except that save orginal
  * data in cb->data and cb->meta_data for restore.
  */
diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c
index 396b576390d00aad56bca6a18b7796e5324c0aef..3f5a5dc55c29433525b319f1307725d7feb015c6
100644
--- a/net/sched/act_bpf.c
+++ b/net/sched/act_bpf.c
@@ -47,13 +47,12 @@ TC_INDIRECT_SCOPE int tcf_bpf_act(struct sk_buff *skb,
        filter = rcu_dereference(prog->filter);
        if (at_ingress) {
                __skb_push(skb, skb->mac_len);
-               bpf_compute_data_pointers(skb);
-               filter_res = bpf_prog_run(filter, skb);
+               filter_res = bpf_prog_run_data_pointers(filter, skb);
                __skb_pull(skb, skb->mac_len);
        } else {
-               bpf_compute_data_pointers(skb);
-               filter_res = bpf_prog_run(filter, skb);
+               filter_res = bpf_prog_run_data_pointers(filter, skb);
        }
+
        if (unlikely(!skb->tstamp && skb->tstamp_type))
                skb->tstamp_type = SKB_CLOCK_REALTIME;
        if (skb_sk_is_prefetched(skb) && filter_res != TC_ACT_OK)
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 7fbe42f0e5c2b7aca0a28c34cd801c3a767c804e..a32754a2658bb7d21e8ceb62c67d6684ed4f9fcc
100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -97,12 +97,10 @@ TC_INDIRECT_SCOPE int cls_bpf_classify(struct sk_buff *skb,
                } else if (at_ingress) {
                        /* It is safe to push/pull even if skb_shared() */
                        __skb_push(skb, skb->mac_len);
-                       bpf_compute_data_pointers(skb);
-                       filter_res = bpf_prog_run(prog->filter, skb);
+                       filter_res =
bpf_prog_run_data_pointers(prog->filter, skb);
                        __skb_pull(skb, skb->mac_len);
                } else {
-                       bpf_compute_data_pointers(skb);
-                       filter_res = bpf_prog_run(prog->filter, skb);
+                       filter_res =
bpf_prog_run_data_pointers(prog->filter, skb);
                }
                if (unlikely(!skb->tstamp && skb->tstamp_type))
                        skb->tstamp_type = SKB_CLOCK_REALTIME;

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ