[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZZVaIOay_IqSDabg@nanopsycho>
Date: Wed, 3 Jan 2024 13:59:12 +0100
From: Jiri Pirko <jiri@...nulli.us>
To: Jamal Hadi Salim <jhs@...atatu.com>
Cc: Victor Nogueira <victor@...atatu.com>, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
xiyou.wangcong@...il.com, idosch@...sch.org, mleitner@...hat.com,
vladbu@...dia.com, paulb@...dia.com, pctammela@...atatu.com,
netdev@...r.kernel.org, kernel@...atatu.com,
syzbot+84339b9e7330daae4d66@...kaller.appspotmail.com,
syzbot+806b0572c8d06b66b234@...kaller.appspotmail.com,
syzbot+0039110f932d438130f9@...kaller.appspotmail.com
Subject: Re: [PATCH net-next v2 1/1] net/sched: We should only add
appropriate qdiscs blocks to ports' xarray
Tue, Jan 02, 2024 at 06:06:00PM CET, jhs@...atatu.com wrote:
>On Tue, Jan 2, 2024 at 10:54 AM Jiri Pirko <jiri@...nulli.us> wrote:
>>
>> Tue, Jan 02, 2024 at 03:52:01PM CET, jhs@...atatu.com wrote:
>> >On Tue, Jan 2, 2024 at 9:29 AM Jiri Pirko <jiri@...nulli.us> wrote:
>> >>
>> >> Tue, Jan 02, 2024 at 03:06:28PM CET, jhs@...atatu.com wrote:
>> >> >On Tue, Jan 2, 2024 at 4:59 AM Jiri Pirko <jiri@...nulli.us> wrote:
>> >> >>
>> >> >> The patch subject should briefly describe the nature of the change. Not
>> >> >> what "we" should or should not do.
>> >> >>
>> >> >>
>> >> >> Sun, Dec 31, 2023 at 06:23:20PM CET, victor@...atatu.com wrote:
>> >> >> >We should only add qdiscs to the blocks ports' xarray in ingress that
>> >> >> >support ingress_block_set/get or in egress that support
>> >> >> >egress_block_set/get.
>> >> >>
>> >> >> Tell the codebase what to do, be imperative. Please read again:
>> >> >> https://www.kernel.org/doc/html/v6.6/process/submitting-patches.html#describe-your-changes
>> >> >>
>> >> >
>> >> >We need another rule in the doc on nit-picking which states that we
>> >> >need to make progress at some point. We made many changes to this
>> >> >patchset based on your suggestions for no other reason other that we
>> >> >can progress the discussion. This is a patch that fixes a bug of which
>> >> >there are multiple syzbot reports and consumers of the API(last one
>> >> >just reported from the MTCP people). There's some sense of urgency to
>> >> >apply this patch before the original goes into net. More importantly:
>> >> >This patch fixes the issue and follows the same common check which was
>> >> >already being done in the committed patchset to check if the qdisc
>> >> >supports the block set/get operations.
>> >> >
>> >> >There are about 3 ways to do this check, you objected to the original,
>> >> >we picked something that works fine, and now you are picking a
>> >> >different way with tcf_block. I dont see how tcf_block check would
>> >> >help or solve this problem at all given this is a qdisc issue not a
>> >> >class issue. What am I missing?
>> >>
>> >> Perhaps I got something wrong, but I thought that the issue is
>> >> cl_ops->tcf_block being null for some qdiscs, isn't it?
>> >>
>> >
>> >We attach these ports/netdevs only on capable qdiscs i.e ones that
>> >have in/egress_block_set/get() - which happen to be ingress and
>> >clsact only.
>> >The problem was we were blindly assuming that presence of
>> >cl->tcf_block() implies presence of in/egress_block_set/get(). The
>> >earlier patches surrounded this code with attribute checks and so it
>> >worked there.
>>
>> Syskaller report says:
>>
>> KASAN: null-ptr-deref in range [0x0000000000000048-0x000000000000004f]
>> CPU: 1 PID: 5061 Comm: syz-executor323 Not tainted 6.7.0-rc6-syzkaller-01658-gc2b2ee36250d #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
>> RIP: 0010:qdisc_block_add_dev net/sched/sch_api.c:1190 [inline]
>>
>> Line 1190 is:
>> block = cl_ops->tcf_block(sch, TC_H_MIN_INGRESS, NULL);
>>
>> So the cl_ops->tcf_block == NULL
>>
>> Why can't you just check it? Why do you want to check in/egress_block_set/get()
>> instead? I don't follow :/
>>
>
>Does it make sense to add to the port xarray just because we have
>cl_ops->tcf_block()? There are many qdiscs which have
>cl_ops->tcf_block() (example htb) but cant be used in the block add
>syntax (see question further below on tdc test).
The whole block usage in qdiscs other than ingress and clsact seems odd
to me to be honest. What's the reason for that?.
>--
>$sudo tc qdisc add dev lo egress_block 21 handle 1: root htb
>Error: Egress block sharing is not supported.
>---
>
>Did you look at the other syzbot reports?
Yeah. The block usage in other qdiscs looks very odd.
>
>> Btw, the checks in __qdisc_destroy() do also look wrong.
>
>Now I am not following, please explain. The same code structure check
>is used in fill_qdisc
>(https://elixir.bootlin.com/linux/v6.7-rc8/source/net/sched/sch_api.c#L940)
>for example to pull the block info, is that wrong?
There, you don't call tcf_block() at all, so how is that relevant?
>
>> >
>> >BTW: Do you have an example of a test case where we can test the class
>> >grafting (eg using htb with tcf_block)? It doesnt have any impact on
>> >this patcheset here but we want to add it as a regression checker on
>> >tdc in the future if someone makes a change.
>
>An answer to this will help.
Answer is "no".
>
>cheers,
>jamal
>
>
>> >cheers,
>> >jamal
>> >
>> >> >
>> >> >cheers,
>> >> >jamal
>> >> >
>> >> >> >
>> >> >> >Fixes: 913b47d3424e ("net/sched: Introduce tc block netdev tracking infra")
>> >> >> >Signed-off-by: Victor Nogueira <victor@...atatu.com>
>> >> >> >Reviewed-by: Jamal Hadi Salim <jhs@...atatu.com>
>> >> >> >Reported-by: Ido Schimmel <idosch@...dia.com>
>> >> >> >Closes: https://lore.kernel.org/all/ZY1hBb8GFwycfgvd@shredder/
>> >> >> >Tested-by: Ido Schimmel <idosch@...dia.com>
>> >> >> >Reported-and-tested-by: syzbot+84339b9e7330daae4d66@...kaller.appspotmail.com
>> >> >> >Closes: https://lore.kernel.org/all/0000000000007c85f5060dcc3a28@google.com/
>> >> >> >Reported-and-tested-by: syzbot+806b0572c8d06b66b234@...kaller.appspotmail.com
>> >> >> >Closes: https://lore.kernel.org/all/00000000000082f2f2060dcc3a92@google.com/
>> >> >> >Reported-and-tested-by: syzbot+0039110f932d438130f9@...kaller.appspotmail.com
>> >> >> >Closes: https://lore.kernel.org/all/0000000000007fbc8c060dcc3a5c@google.com/
>> >> >> >---
>> >> >> >v1 -> v2:
>> >> >> >
>> >> >> >- Remove newline between fixes tag and Signed-off-by tag
>> >> >> >- Add Ido's Reported-by and Tested-by tags
>> >> >> >- Add syzbot's Reported-and-tested-by tags
>> >> >> >
>> >> >> > net/sched/sch_api.c | 34 ++++++++++++++++++++--------------
>> >> >> > 1 file changed, 20 insertions(+), 14 deletions(-)
>> >> >> >
>> >> >> >diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
>> >> >> >index 299086bb6205..426be81276f1 100644
>> >> >> >--- a/net/sched/sch_api.c
>> >> >> >+++ b/net/sched/sch_api.c
>> >> >> >@@ -1187,23 +1187,29 @@ static int qdisc_block_add_dev(struct Qdisc *sch, struct net_device *dev,
>> >> >> > struct tcf_block *block;
>> >> >> > int err;
>> >> >> >
>> >> >>
>> >> >> Why don't you just check cl_ops->tcf_block ?
>> >> >> In fact, there could be a helper to do it for you either call the op or
>> >> >> return NULL in case it is not defined.
>> >> >>
>> >> >>
>> >> >> >- block = cl_ops->tcf_block(sch, TC_H_MIN_INGRESS, NULL);
>> >> >> >- if (block) {
>> >> >> >- err = xa_insert(&block->ports, dev->ifindex, dev, GFP_KERNEL);
>> >> >> >- if (err) {
>> >> >> >- NL_SET_ERR_MSG(extack,
>> >> >> >- "ingress block dev insert failed");
>> >> >> >- return err;
>> >> >> >+ if (sch->ops->ingress_block_get) {
>> >> >> >+ block = cl_ops->tcf_block(sch, TC_H_MIN_INGRESS, NULL);
>> >> >> >+ if (block) {
>> >> >> >+ err = xa_insert(&block->ports, dev->ifindex, dev,
>> >> >> >+ GFP_KERNEL);
>> >> >> >+ if (err) {
>> >> >> >+ NL_SET_ERR_MSG(extack,
>> >> >> >+ "ingress block dev insert failed");
>> >> >> >+ return err;
>> >> >> >+ }
>> >> >> > }
>> >> >> > }
>> >> >> >
>> >> >> >- block = cl_ops->tcf_block(sch, TC_H_MIN_EGRESS, NULL);
>> >> >> >- if (block) {
>> >> >> >- err = xa_insert(&block->ports, dev->ifindex, dev, GFP_KERNEL);
>> >> >> >- if (err) {
>> >> >> >- NL_SET_ERR_MSG(extack,
>> >> >> >- "Egress block dev insert failed");
>> >> >> >- goto err_out;
>> >> >> >+ if (sch->ops->egress_block_get) {
>> >> >> >+ block = cl_ops->tcf_block(sch, TC_H_MIN_EGRESS, NULL);
>> >> >> >+ if (block) {
>> >> >> >+ err = xa_insert(&block->ports, dev->ifindex, dev,
>> >> >> >+ GFP_KERNEL);
>> >> >> >+ if (err) {
>> >> >> >+ NL_SET_ERR_MSG(extack,
>> >> >> >+ "Egress block dev insert failed");
>> >> >> >+ goto err_out;
>> >> >> >+ }
>> >> >> > }
>> >> >> > }
>> >> >> >
>> >> >> >--
>> >> >> >2.25.1
>> >> >> >
Powered by blists - more mailing lists