Date:   Tue, 26 Mar 2019 09:39:31 -0600
From:   David Ahern <dsahern@...il.com>
To:     Dmitry Safonov <dima@...sta.com>, linux-kernel@...r.kernel.org
Cc:     Alexander Duyck <alexander.h.duyck@...ux.intel.com>,
        Alexey Kuznetsov <kuznet@....inr.ac.ru>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        Ido Schimmel <idosch@...lanox.com>, netdev@...r.kernel.org
Subject: Re: [RFC 4/4] net/ipv4/fib: Don't synchronise_rcu() every 512Kb

On 3/26/19 9:30 AM, Dmitry Safonov wrote:
> Fib trie has a hard-coded sync_pages limit that triggers synchronize_rcu().
> The limit is 128 pages, or 512KB in the common case of 4KB pages.
> 
> Unfortunately, at Arista we have use cases with full-view software
> forwarding. At a scale of 100K or more routes, even on 2-core boxes,
> the hard-coded limit starts to hurt badly: the lockup detector
> notices that rtnl_lock is held for seconds.
> The first reason is the previously broken MAX_WORK, which didn't limit
> pending balancing work. While fixing it, I noticed that the bottleneck
> is actually the number of synchronize_rcu() calls.
> 
> I tried to fix it with a patch that decrements the number of tnodes in
> an RCU callback, but it didn't affect performance much.
> 
> One possible way to "fix" it is to provide another sysctl to control
> sync_pages, but in my view that's nasty: it exposes another
> implementation detail to user space.

Well, a sysctl for that was accepted last week. ;-)

commit 9ab948a91b2c2abc8e82845c0e61f4b1683e3a4f
Author: David Ahern <dsahern@...il.com>
Date:   Wed Mar 20 09:18:59 2019 -0700

    ipv4: Allow amount of dirty memory from fib resizing to be controllable


Can you check how that change (it should backport easily) affects your
test case? From my perspective, 16MB was the sweet spot.
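
For reference, here is a rough sketch of the arithmetic behind the numbers in this thread: the hard-coded 128-page limit versus a 16MB threshold. The sysctl name net.ipv4.fib_sync_mem is an assumption on my part about what the commit above exposes (it is not named in this message), and the 4KB page size is the common case mentioned in the quoted text, so check both against your own tree before relying on them.

```python
# Sketch only: compares the hard-coded limit with a 16 MiB threshold.
# ASSUMPTIONS: 4 KiB pages; the sysctl is named net.ipv4.fib_sync_mem
# (not stated in this thread, verify in your kernel's ip-sysctl docs).

PAGE_SIZE = 4096  # bytes, assumed common-case page size


def pages_to_bytes(pages, page_size=PAGE_SIZE):
    """Convert a page count into bytes."""
    return pages * page_size


def sysctl_line(limit_bytes, name="net.ipv4.fib_sync_mem"):
    """Format a sysctl.conf-style line for the (assumed) knob."""
    return f"{name} = {limit_bytes}"


old_limit = pages_to_bytes(128)   # the hard-coded sync_pages limit
new_limit = 16 * 1024 * 1024      # the 16MB value suggested above

# How much more dirty memory may accumulate before an RCU sync is forced.
ratio = new_limit // old_limit

print(f"old limit: {old_limit} bytes")
print(f"new limit: {new_limit} bytes ({ratio}x larger)")
print(sysctl_line(new_limit))
```

A larger threshold means fewer forced synchronize_rcu() calls while the trie is being resized, at the cost of more memory waiting to be freed.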
