Message-Id: <CC3744B1-20A7-4628-873A-2551938009D4@joelfernandes.org>
Date:   Thu, 17 Nov 2022 16:16:43 -0500
From:   Joel Fernandes <joel@...lfernandes.org>
To:     paulmck@...nel.org
Cc:     Eric Dumazet <edumazet@...gle.com>, linux-kernel@...r.kernel.org,
        Cong Wang <xiyou.wangcong@...il.com>,
        David Ahern <dsahern@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        Jakub Kicinski <kuba@...nel.org>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Jiri Pirko <jiri@...nulli.us>, netdev@...r.kernel.org,
        Paolo Abeni <pabeni@...hat.com>, rcu@...r.kernel.org,
        rostedt@...dmis.org, fweisbec@...il.com, jiejiang@...gle.com,
        Thomas Glexiner <tglx@...utronix.de>
Subject: Re: [PATCH rcu/dev 3/3] net: Use call_rcu_flush() for dst_destroy_rcu



> On Nov 17, 2022, at 2:29 PM, Paul E. McKenney <paulmck@...nel.org> wrote:
> 
> On Thu, Nov 17, 2022 at 05:40:40PM +0000, Joel Fernandes wrote:
>>> On Thu, Nov 17, 2022 at 5:38 PM Joel Fernandes <joel@...lfernandes.org> wrote:
>>> 
>>> On Thu, Nov 17, 2022 at 5:17 PM Eric Dumazet <edumazet@...gle.com> wrote:
>>>> 
>>>> On Thu, Nov 17, 2022 at 7:58 AM Joel Fernandes <joel@...lfernandes.org> wrote:
>>>>> 
>>>>> Hello Eric,
>>>>> 
>>>>> On Wed, Nov 16, 2022 at 07:44:41PM -0800, Eric Dumazet wrote:
>>>>>> On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google)
>>>>>> <joel@...lfernandes.org> wrote:
>>>>>>> 
>>>>>>> In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY
>>>>>>> causes a networking test to fail in the teardown phase.
>>>>>>> 
>>>>>>> The failure happens during: ip netns del <name>
>>>>>> 
>>>>>> And ? What happens then next ?
>>>>> 
>>>>> The test is doing the 'ip netns del <name>' and then polling for the
>>>>> disappearance of a network interface name for up to 5 seconds. I believe it is
>>>>> using netlink to get a table of interfaces. That polling is timing out.
>>>>> 
>>>>> Here are some more details from the test's owner (copy-pasted from another
>>>>> bug report):
>>>>> In the cleanup, we remove the netns, which causes the veth pair to be
>>>>> removed automatically, so we poll to check whether the veth in the root
>>>>> netns still exists to know whether the cleanup is done.
>>>>> 
>>>>> Here is a public link to the code that is failing (it's in Go):
>>>>> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/network/virtualnet/env/env.go;drc=6c2841d6cc3eadd23e07912ec331943ee33d7de8;l=161
>>>>> 
>>>>> Here is a public link to the line of code in the actual test leading up to the above
>>>>> path (this is the test that is run:
>>>>> network.RoutingFallthrough.ipv4_only_primary) :
>>>>> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/bundles/cros/network/routing_fallthrough.go;drc=8fbf2c53960bc8917a6a01fda5405cad7c17201e;l=52
>>>>> 
>>>>>>> Using ftrace, I found the callbacks it was queuing which this series fixes. Use
>>>>>>> call_rcu_flush() to revert to the old behavior. With that, the test passes.
>>>>>> 
>>>>>> What is this test about ? What barrier was used to make it not flaky ?
>>>>> 
>>>>> I provided the links above, let me know if you have any questions.
>>>>> 
>>>>>> Was it depending on some undocumented RCU behavior ?
>>>>> 
>>>>> This is a new RCU feature posted here for significant power-savings on
>>>>> battery-powered devices:
>>>>> https://lore.kernel.org/rcu/20221017140726.GG5600@paulmck-ThinkPad-P17-Gen-1/T/#m7a54809b8903b41538850194d67eb34f203c752a
>>>>> 
>>>>> There is also an LPC presentation about the same, I can dig the link if you
>>>>> are interested.
>>>>> 
>>>>>> Maybe adding a sysctl to force the flush would be better for functional tests ?
>>>>>> 
>>>>>> I would rather change the test(s), than adding call_rcu_flush(),
>>>>>> adding merge conflicts to future backports.
>>>>> 
>>>>> I am not too sure about that, I think a user might expect the network
>>>>> interface to disappear from the networking tables quickly enough without
>>>>> dealing with barriers or kernel internals. However, I added the author of the
>>>>> test to this email in the hope that he can provide his point of view as well.
>>>>> 
>>>>> The general approach we are taking with this sort of thing is to use
>>>>> call_rcu_flush() which is basically the same as call_rcu() for systems with
>>>>> CONFIG_RCU_LAZY=n. You can see some examples of that in the patch series link
>>>>> above. Just to note, CONFIG_RCU_LAZY depends on CONFIG_RCU_NOCB_CPU so it's only
>>>>> Android and ChromeOS that are using it. I am adding Jie to share any input,
>>>>> he is from the networking team and knows this test well.
>>>>> 
>>>>> 
>>>> 
>>>> I do not know what is this RCU_LAZY thing, but IMO this should be opt-in
>>> 
>>> You should read the links I sent you. We did already try opt-in,
>>> Thomas Gleixner made a point at LPC that we should not add new APIs
>>> for this purpose and confuse kernel developers.
>>> 
>>>> For instance, only kfree_rcu() should use it.
>>> 
>>> No. Most of the call_rcu() usages are for freeing memory, so the
>>> consensus is we should apply this as opt out and fix issues along the
>>> way. We already did a lot of research/diligence on seeing which users
>>> need conversion.
>>> 
>>>> We can not review hundreds of call_rcu() call sites and decide if
>>>> adding arbitrary delays could hurt.
>>> 
>>> That work has already been done as much as possible, please read the
>>> links I sent.
>> 
>> Also just to add, this test is a bit of a weird corner case, as anyone
>> expecting a quick response from call_rcu() is broken by design.
>> However, for these callbacks, it does not matter much which API they
>> use, as they are too infrequent to affect power savings.
> 
> The "broken by design" is a bit strong.  Some of those call_rcu()
> invocations have been around for the better part of 20 years, after all.
> 
> That aside, I do hope that we can arrive at something that will enhance
> battery lifetime while avoiding unnecessary disruption.  But we are
> unlikely to be able to completely avoid disruption.  As this email
> thread illustrates.  ;-)

Another approach for these 3 patches would be to keep the call_rcu()
calls but add an rcu_barrier() after them. I think people running
'ip netns del' should not have to wait a long time for their RCU
callbacks to run and remove user-visible state. But I would need
suggestions from networking experts on which of these 3 callbacks to do
this for, or whether to do it for all of them.
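
To illustrate the call_rcu() plus rcu_barrier() idea, here is a toy
userspace model in C. It is not kernel code: every toy_* name is
hypothetical, the "lazy" queue is a plain array, and the barrier simply
runs whatever is queued. It only shows the ordering guarantee I am
after, namely that the user-visible state is gone once the barrier
returns.

```c
/* Toy userspace model, NOT kernel code: all toy_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CBS 16

typedef void (*toy_cb_t)(void *arg);

struct toy_cb {
    toy_cb_t func;
    void *arg;
};

static struct toy_cb toy_queue[TOY_MAX_CBS];
static size_t toy_pending;

/* Models a lazy call_rcu(): the callback is only queued, so it may sit
 * in the queue for a long time before anything runs it. */
static int toy_call_rcu_lazy(toy_cb_t func, void *arg)
{
    assert(func != NULL);
    if (toy_pending == TOY_MAX_CBS)
        return -1; /* toy queue is full */
    toy_queue[toy_pending].func = func;
    toy_queue[toy_pending].arg = arg;
    toy_pending++;
    return 0;
}

/* Models rcu_barrier(): returns only after every queued callback has run. */
static void toy_rcu_barrier(void)
{
    for (size_t i = 0; i < toy_pending; i++)
        toy_queue[i].func(toy_queue[i].arg);
    toy_pending = 0;
}

/* Stand-in for user-visible state, e.g. an interface listed in a table. */
struct toy_netdev {
    int visible;
};

/* Callback that removes the user-visible state. */
static void toy_netdev_release(void *arg)
{
    struct toy_netdev *dev = arg;

    dev->visible = 0;
}
```

In this model, a toy_netdev queued with toy_call_rcu_lazy() stays
visible until toy_rcu_barrier() is called, and is guaranteed to be gone
when that call returns.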

Alternatively, we could patch just the test, using a new knob that does
an rcu_barrier(). But I dislike that, as it does not fix the problem
for all users; the ip utilities would probably need a patch as well.
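
For reference, the kind of polling the test does (wait up to a deadline
for an interface name to disappear) can be sketched in C as below. This
is only a sketch under assumptions: wait_until_ifname_gone() is a
hypothetical helper, it uses if_nametoindex() rather than the netlink
dump I believe the real Go test uses, and it tracks elapsed time
approximately by summing the sleep intervals.

```c
/* Sketch only: wait_until_ifname_gone() is a hypothetical helper, not the
 * ChromeOS test's code. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <net/if.h>
#include <time.h>

/* Returns 0 once no interface called `name` exists, -1 on timeout. */
static int wait_until_ifname_gone(const char *name, long timeout_ms,
                                  long interval_ms)
{
    struct timespec delay;
    long waited_ms = 0;

    assert(name != NULL);
    if (interval_ms <= 0)
        interval_ms = 1; /* avoid spinning forever without progress */
    delay.tv_sec = interval_ms / 1000;
    delay.tv_nsec = (interval_ms % 1000) * 1000000L;

    for (;;) {
        /* if_nametoindex() returns 0 when the name does not resolve. */
        if (if_nametoindex(name) == 0)
            return 0;
        if (waited_ms >= timeout_ms)
            return -1;
        nanosleep(&delay, NULL);
        waited_ms += interval_ms;
    }
}
```

A caller matching the test's 5 second budget would use something like
wait_until_ifname_gone(name, 5000, 100). With lazy callbacks, a loop
like this can hit the timeout even though the teardown eventually
completes.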

Thanks,

- Joel 


> 
>                            Thanx, Paul
