netdev - Re: [PATCH rcu/dev 1/3] net: Use call_rcu_flush() for qdisc_free

Message-Id: <DCDBA54C-C35B-497D-BB39-224C88B94660@joelfernandes.org>
Date:   Thu, 17 Nov 2022 16:58:26 -0500
From:   Joel Fernandes <joel@...lfernandes.org>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     linux-kernel@...r.kernel.org, Cong Wang <xiyou.wangcong@...il.com>,
        David Ahern <dsahern@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        Jakub Kicinski <kuba@...nel.org>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Jiri Pirko <jiri@...nulli.us>, netdev@...r.kernel.org,
        Paolo Abeni <pabeni@...hat.com>, rcu@...r.kernel.org,
        rostedt@...dmis.org, paulmck@...nel.org, fweisbec@...il.com
Subject: Re: [PATCH rcu/dev 1/3] net: Use call_rcu_flush() for qdisc_free_cb



> On Nov 17, 2022, at 4:44 PM, Eric Dumazet <edumazet@...gle.com> wrote:
> 
> On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google)
> <joel@...lfernandes.org> wrote:
>> 
>> In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY
>> causes a networking test to fail in the teardown phase.
>> 
>> The failure happens during: ip netns del <name>
>> 
>> Using ftrace, I found the callbacks it was queuing which this series fixes. Use
>> call_rcu_flush() to revert to the old behavior. With that, the test passes.
>> 
>> Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>
>> ---
>> net/sched/sch_generic.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>> index a9aadc4e6858..63fbf640d3b2 100644
>> --- a/net/sched/sch_generic.c
>> +++ b/net/sched/sch_generic.c
>> @@ -1067,7 +1067,7 @@ static void qdisc_destroy(struct Qdisc *qdisc)
>> 
>>        trace_qdisc_destroy(qdisc);
>> 
>> -       call_rcu(&qdisc->rcu, qdisc_free_cb);
>> +       call_rcu_flush(&qdisc->rcu, qdisc_free_cb);
>> }
> 
> I took a look at this one.
> 
> qdisc_free_cb() is essentially freeing : Some per-cpu memory, and the
> 'struct Qdisc'
> 
> I do not see why we need to force a flush for this (small ?) piece of memory.

I’ll try dropping that and rerunning the test, and get back to you. It could be that this call_rcu_flush() is compensating for a different callback. I am pretty sure that at one point, dropping this patch made the test fail most of the time. Now it passes 100% of the time.

I’ll also attempt to collect a complete trace; maybe I’ll learn some networking code in the process.

Thanks!
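
For readers following the thread, here is a toy userspace sketch of the distinction under discussion: a lazily queued callback may sit pending for a while, whereas a "flush" style enqueue runs pending callbacks promptly. This is not kernel code. Every name in it (toy_queue, toy_call_lazy, toy_call_flush, toy_count_cb) is hypothetical, and the flush-when-full policy is an assumption made only to keep the model small. It does not reflect how CONFIG_RCU_LAZY or call_rcu_flush() are actually implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names are kernel APIs. */
#define TOY_MAX_PENDING 16

typedef void (*toy_cb_t)(void *arg);

struct toy_queue {
	toy_cb_t cbs[TOY_MAX_PENDING];
	void *args[TOY_MAX_PENDING];
	size_t pending;
};

/* Run every queued callback in FIFO order, then empty the queue. */
static void toy_flush(struct toy_queue *q)
{
	for (size_t i = 0; i < q->pending; i++)
		q->cbs[i](q->args[i]);
	q->pending = 0;
}

/*
 * Lazy enqueue: the callback just sits in the queue. In this model the
 * only thing that forces it to run is the queue filling up (an assumed
 * policy, standing in for a real timer or memory-pressure trigger).
 */
static void toy_call_lazy(struct toy_queue *q, toy_cb_t cb, void *arg)
{
	if (q->pending == TOY_MAX_PENDING)
		toy_flush(q);
	q->cbs[q->pending] = cb;
	q->args[q->pending] = arg;
	q->pending++;
}

/*
 * Flush enqueue: queue the callback, then run everything pending right
 * away, including callbacks that were queued lazily earlier.
 */
static void toy_call_flush(struct toy_queue *q, toy_cb_t cb, void *arg)
{
	toy_call_lazy(q, cb, arg);
	toy_flush(q);
}

/* Example callback: counts how many times it has been invoked. */
static void toy_count_cb(void *arg)
{
	(*(int *)arg)++;
}
```

In this model, a callback queued with toy_call_lazy() stays pending until something forces a flush, and a later toy_call_flush() also runs the callbacks queued before it. That side effect is one way a flush at one call site could mask a delay that really belongs to a different callback, which is the possibility raised above.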