Message-ID: <80515DEDE931DC2A+7fa48c7a-2955-4afb-821f-a0108a72009f@uniontech.com>
Date: Mon, 23 Sep 2024 18:05:51 +0800
From: yushengjin <yushengjin@...ontech.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: pablo@...filter.org, kadlec@...filter.org, roopa@...dia.com,
 razor@...ckwall.org, davem@...emloft.net, kuba@...nel.org,
 pabeni@...hat.com, netfilter-devel@...r.kernel.org, coreteam@...filter.org,
 bridge@...ts.linux.dev, netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net/bridge: Optimizing read-write locks in ebtables.c


On 23/9/2024 17:29, Eric Dumazet wrote:
> On Mon, Sep 23, 2024 at 11:16 AM yushengjin <yushengjin@...ontech.com> wrote:
>> During wrk testing, the CPU usage of the test machine reached 100%. When
>> forwarding through a bridge under heavy network load, ebt_do_table() in the
>> kernel ebtables module becomes abnormally loaded, leading to excessive soft
>> interrupts and sometimes even CPU soft lockups.
>>
>> Analysis showed that the ebtables code has not been optimized for a long
>> time and still uses a read-write lock, whereas the arp/ip/ip6 tables were
>> optimized long ago, after the performance bottleneck of read-write locks
>> had been identified.
>>
>> Ref link: https://lore.kernel.org/lkml/20090428092411.5331c4a1@nehalam/
>>
>> So I followed the arp/ip/ip6 approach to optimize the read-write lock in
>> ebtables.c.
>>
>> Test method:
>> 1) Create bridges on the test machine:
>> ``` bash
>> brctl addbr br-a
>> brctl addbr br-b
>> brctl addif br-a enp1s0f0 enp1s0f1
>> brctl addif br-b enp130s0f0 enp130s0f1
>> ifconfig br-a up
>> ifconfig br-b up
>> ```
>> 2) Testing with another machine:
>> ``` bash
>> ulimit -n 2048
>> ./wrk -t48 -c2000 -d6000 -R10000 -s request.lua http://4.4.4.2:80/4k.html &
>> ./wrk -t48 -c2000 -d6000 -R10000 -s request.lua http://5.5.5.2:80/4k.html &
>> ```
>>
>> Signed-off-by: yushengjin <yushengjin@...ontech.com>
>> ---
>>   include/linux/netfilter_bridge/ebtables.h |  47 +++++++-
>>   net/bridge/netfilter/ebtables.c           | 132 ++++++++++++++++------
>>   2 files changed, 145 insertions(+), 34 deletions(-)
>>
>> diff --git a/include/linux/netfilter_bridge/ebtables.h b/include/linux/netfilter_bridge/ebtables.h
>> index fd533552a062..dd52dea20fb8 100644
>> --- a/include/linux/netfilter_bridge/ebtables.h
>> +++ b/include/linux/netfilter_bridge/ebtables.h
>> @@ -93,7 +93,6 @@ struct ebt_table {
>>          char name[EBT_TABLE_MAXNAMELEN];
>>          struct ebt_replace_kernel *table;
>>          unsigned int valid_hooks;
>> -       rwlock_t lock;
>>          /* the data used by the kernel */
>>          struct ebt_table_info *private;
>>          struct nf_hook_ops *ops;
>> @@ -124,4 +123,50 @@ static inline bool ebt_invalid_target(int target)
>>
>>   int ebt_register_template(const struct ebt_table *t, int(*table_init)(struct net *net));
>>   void ebt_unregister_template(const struct ebt_table *t);
>> +
>> +/**
>> + * ebt_recseq - recursive seqcount for netfilter use
>> + *
>> + * Packet processing changes the seqcount only if no recursion happened
>> + * get_counters() can use read_seqcount_begin()/read_seqcount_retry(),
>> + * because we use the normal seqcount convention :
>> + * Low order bit set to 1 if a writer is active.
>> + */
>> +DECLARE_PER_CPU(seqcount_t, ebt_recseq);
>> +
>> +/**
>> + * ebt_write_recseq_begin - start of a write section
>> + *
>> + * Begin packet processing : all readers must wait the end
>> + * 1) Must be called with preemption disabled
>> + * 2) softirqs must be disabled too (or we should use this_cpu_add())
>> + * Returns :
>> + *  1 if no recursion on this cpu
>> + *  0 if recursion detected
>> + */
>> +static inline unsigned int ebt_write_recseq_begin(void)
>> +{
>> +       unsigned int addend;
>> +
>> +       addend = (__this_cpu_read(ebt_recseq.sequence) + 1) & 1;
>> +
>> +       __this_cpu_add(ebt_recseq.sequence, addend);
>> +       smp_mb();
>> +
>> +       return addend;
>> +}
>> +
>> +/**
>> + * ebt_write_recseq_end - end of a write section
>> + * @addend: return value from previous ebt_write_recseq_begin()
>> + *
>> + * End packet processing : all readers can proceed
>> + * 1) Must be called with preemption disabled
>> + * 2) softirqs must be disabled too (or we should use this_cpu_add())
>> + */
>> +static inline void ebt_write_recseq_end(unsigned int addend)
>> +{
>> +       smp_wmb();
>> +       __this_cpu_add(ebt_recseq.sequence, addend);
>> +}
> Why not reusing xt_recseq, xt_write_recseq_begin(), xt_write_recseq_end(),
> instead of copy/pasting them ?
>
> This was added in
>
> commit 7f5c6d4f665bb57a19a34ce1fb16cc708c04f219    netfilter: get rid
> of atomic ops in fast path
They use different seqcounts; I was worried that sharing xt_recseq might
have side effects.
>
> If this is an include mess, just move them in a separate include file.

Can I move ebt_write_recseq_begin() and ebt_write_recseq_end() to
include/linux/netfilter/x_tables.h?

Or add a parameter to xt_write_recseq_begin() and xt_write_recseq_end() to
select between xt_recseq and ebt_recseq?


