Message-ID: <81A3C9D37E72666E+13e8b5ef-8499-45e6-9442-060ef88152d8@uniontech.com>
Date: Wed, 25 Sep 2024 15:45:20 +0800
From: yushengjin <yushengjin@...ontech.com>
To: Stephen Hemminger <stephen@...workplumber.org>,
 Eric Dumazet <edumazet@...gle.com>
Cc: pablo@...filter.org, kadlec@...filter.org, roopa@...dia.com,
 razor@...ckwall.org, davem@...emloft.net, kuba@...nel.org,
 pabeni@...hat.com, netfilter-devel@...r.kernel.org, coreteam@...filter.org,
 bridge@...ts.linux.dev, netdev@...r.kernel.org,
 linux-kernel@...r.kernel.org, gouhao@...ontech.com
Subject: Re: [PATCH v3] net/bridge: Optimizing read-write locks in ebtables.c


On 25/9/2024 at 12:40 AM, Stephen Hemminger wrote:
> On Tue, 24 Sep 2024 15:46:17 +0200
> Eric Dumazet <edumazet@...gle.com> wrote:
>
>> On Tue, Sep 24, 2024 at 3:33 PM Stephen Hemminger
>> <stephen@...workplumber.org> wrote:
>>> On Tue, 24 Sep 2024 17:09:06 +0800
>>> yushengjin <yushengjin@...ontech.com> wrote:
>>>   
>>>> When conducting WRK testing, the CPU usage rate of the testing machine was
>>>> 100%. forwarding through a bridge, if the network load is too high, it may
>>>> cause abnormal load on the ebt_do_table of the kernel ebtable module, leading
>>>> to excessive soft interrupts and sometimes even directly causing CPU soft
>>>> deadlocks.
>>>>
>>>> After analysis, it was found that the code of ebtables had not been optimized
>>>> for a long time, and the read-write locks inside still existed. However, other
>>>> arp/ip/ip6 tables had already been optimized a lot, and performance bottlenecks
>>>> in read-write locks had been discovered a long time ago.
>>>>
>>>> Ref link: https://lore.kernel.org/lkml/20090428092411.5331c4a1@nehalam/
>>>>
>>>> So I referred to arp/ip/ip6 modification methods to optimize the read-write
>>>> lock in ebtables.c.
>>> What about doing RCU instead, faster and safer.
>> Safer ? How so ?
>>
>> Stephen, we have used this stuff already in other netfilter components
>> since 2011
>>
>> No performance issue at all.
>>
> I was thinking that lockdep and analysis tools do better job looking at RCU.
> Most likely, the number of users of ebtables was small enough that nobody looked
> hard at it until now.

Even though ebtables has few users, the problems are still serious.
The data below was collected on an ARM Kunpeng-920 machine (96 CPUs). When I
run only the wrk test, the system's softirq usage quickly rises to about 25%:

02:50:07 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
02:50:25 PM  all    0.00    0.00    0.05    0.00    0.72   23.20    0.00    0.00    0.00   76.03
02:50:26 PM  all    0.00    0.00    0.08    0.00    0.72   24.53    0.00    0.00    0.00   74.67
02:50:27 PM  all    0.01    0.00    0.13    0.00    0.75   24.89    0.00    0.00    0.00   74.23

If ebtables queries, updates, and other operations are run continuously at
the same time, softirq usage rises further, to about 50%:

02:52:23 PM  all    0.00    0.00    1.18    0.00    0.54   48.91    0.00    0.00    0.00   49.36
02:52:24 PM  all    0.00    0.00    1.19    0.00    0.43   48.23    0.00    0.00    0.00   50.15
02:52:25 PM  all    0.00    0.00    1.20    0.00    0.50   48.29    0.00    0.00    0.00   50.01
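
For a quick sanity check of the "25%" and "50%" figures, the %soft samples from the two runs above can be averaged. This is only a small illustrative sketch: the list names are hypothetical, the values are copied by hand from the output above, and three one-second samples per run are not a rigorous measurement.

```python
# %soft samples copied by hand from the two runs quoted above.
wrk_only = [23.20, 24.53, 24.89]           # wrk traffic alone
wrk_plus_ebtables = [48.91, 48.23, 48.29]  # wrk plus continuous ebtables queries/updates


def mean(samples):
    """Arithmetic mean of a non-empty list of percentages."""
    return sum(samples) / len(samples)


baseline = mean(wrk_only)
contended = mean(wrk_plus_ebtables)

print(f"wrk only:        {baseline:.2f}% soft")
print(f"wrk + ebtables:  {contended:.2f}% soft")
print(f"increase factor: {contended / baseline:.2f}x")
```

On these samples the average softirq share roughly doubles once ebtables operations run alongside the traffic.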

More seriously, a soft lockup may occur:

Message from syslogd@...alhost at Sep 25 14:52:22 ...
  kernel:watchdog: BUG: soft lockup - CPU#88 stuck for 23s! [ebtables:3896]

So I think the soft lockup is even less acceptable than the performance loss.


