Message-ID: <CANn89i+4wbef3k6at_Kf+8MBmU4HhE9nxMRvROR_OxsZptffjA@mail.gmail.com>
Date: Mon, 23 Sep 2024 11:29:54 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: yushengjin <yushengjin@...ontech.com>
Cc: pablo@...filter.org, kadlec@...filter.org, roopa@...dia.com,
razor@...ckwall.org, davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
netfilter-devel@...r.kernel.org, coreteam@...filter.org,
bridge@...ts.linux.dev, netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net/bridge: Optimizing read-write locks in ebtables.c
On Mon, Sep 23, 2024 at 11:16 AM yushengjin <yushengjin@...ontech.com> wrote:
>
> When running wrk benchmark tests, CPU usage on the test machine reached
> 100%. When forwarding through a bridge under heavy network load,
> ebt_do_table() in the kernel ebtables module can become a hotspot,
> leading to excessive softirq time and sometimes even CPU soft lockups.
>
> After analysis, it turned out that the ebtables code has not been
> optimized for a long time and still takes a global read-write lock in
> the packet path. The arp/ip/ip6 tables dropped their read-write locks
> long ago, after the locks were identified as a performance bottleneck.
>
> Ref link: https://lore.kernel.org/lkml/20090428092411.5331c4a1@nehalam/
>
> So, following the approach already used by the arp/ip/ip6 tables, this
> patch replaces the read-write lock in ebtables.c with a per-cpu
> recursive seqcount.
>
> Test method:
> 1) Create bridges on the test machine:
> ``` bash
> brctl addbr br-a
> brctl addbr br-b
> brctl addif br-a enp1s0f0 enp1s0f1
> brctl addif br-b enp130s0f0 enp130s0f1
> ifconfig br-a up
> ifconfig br-b up
> ```
> 2) Drive load from another machine:
> ``` bash
> ulimit -n 2048
> ./wrk -t48 -c2000 -d6000 -R10000 -s request.lua http://4.4.4.2:80/4k.html &
> ./wrk -t48 -c2000 -d6000 -R10000 -s request.lua http://5.5.5.2:80/4k.html &
> ```
>
> Signed-off-by: yushengjin <yushengjin@...ontech.com>
> ---
> include/linux/netfilter_bridge/ebtables.h | 47 +++++++-
> net/bridge/netfilter/ebtables.c | 132 ++++++++++++++++------
> 2 files changed, 145 insertions(+), 34 deletions(-)
>
> diff --git a/include/linux/netfilter_bridge/ebtables.h b/include/linux/netfilter_bridge/ebtables.h
> index fd533552a062..dd52dea20fb8 100644
> --- a/include/linux/netfilter_bridge/ebtables.h
> +++ b/include/linux/netfilter_bridge/ebtables.h
> @@ -93,7 +93,6 @@ struct ebt_table {
> char name[EBT_TABLE_MAXNAMELEN];
> struct ebt_replace_kernel *table;
> unsigned int valid_hooks;
> - rwlock_t lock;
> /* the data used by the kernel */
> struct ebt_table_info *private;
> struct nf_hook_ops *ops;
> @@ -124,4 +123,50 @@ static inline bool ebt_invalid_target(int target)
>
> int ebt_register_template(const struct ebt_table *t, int(*table_init)(struct net *net));
> void ebt_unregister_template(const struct ebt_table *t);
> +
> +/**
> + * ebt_recseq - recursive seqcount for netfilter use
> + *
> + * Packet processing changes the seqcount only if no recursion happened
> + * get_counters() can use read_seqcount_begin()/read_seqcount_retry(),
> + * because we use the normal seqcount convention :
> + * Low order bit set to 1 if a writer is active.
> + */
> +DECLARE_PER_CPU(seqcount_t, ebt_recseq);
> +
> +/**
> + * ebt_write_recseq_begin - start of a write section
> + *
> + * Begin packet processing : all readers must wait the end
> + * 1) Must be called with preemption disabled
> + * 2) softirqs must be disabled too (or we should use this_cpu_add())
> + * Returns :
> + * 1 if no recursion on this cpu
> + * 0 if recursion detected
> + */
> +static inline unsigned int ebt_write_recseq_begin(void)
> +{
> + unsigned int addend;
> +
> + addend = (__this_cpu_read(ebt_recseq.sequence) + 1) & 1;
> +
> + __this_cpu_add(ebt_recseq.sequence, addend);
> + smp_mb();
> +
> + return addend;
> +}
> +
> +/**
> + * ebt_write_recseq_end - end of a write section
> + * @addend: return value from previous ebt_write_recseq_begin()
> + *
> + * End packet processing : all readers can proceed
> + * 1) Must be called with preemption disabled
> + * 2) softirqs must be disabled too (or we should use this_cpu_add())
> + */
> +static inline void ebt_write_recseq_end(unsigned int addend)
> +{
> + smp_wmb();
> + __this_cpu_add(ebt_recseq.sequence, addend);
> +}
Why not reuse xt_recseq, xt_write_recseq_begin() and xt_write_recseq_end()
instead of copy/pasting them?

These were added in commit 7f5c6d4f665bb57a19a34ce1fb16cc708c04f219
("netfilter: get rid of atomic ops in fast path").

If this is an include mess, just move them into a separate include file.
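
For illustration, a minimal, untested sketch of what the hook could look
like when reusing the existing x_tables helpers (assuming the include does
not create a header cycle; the rule traversal and verdict handling are
elided):

``` c
#include <linux/netfilter/x_tables.h>	/* xt_write_recseq_begin()/end() */

unsigned int ebt_do_table(void *priv, struct sk_buff *skb,
			  const struct nf_hook_state *state)
{
	struct ebt_table *table = priv;
	const struct ebt_table_info *private;
	unsigned int addend;

	/* Writer side: bump this cpu's seqcount so that counter readers
	 * retry instead of blocking on a shared lock. */
	local_bh_disable();
	addend = xt_write_recseq_begin();
	private = READ_ONCE(table->private);

	/* ... walk private->entries, update per-cpu counters, compute
	 * the verdict ... */

	xt_write_recseq_end(addend);
	local_bh_enable();

	return NF_ACCEPT;	/* placeholder; real code returns the verdict */
}
```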
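
The read side would then follow the convention described in the comment
quoted above (low order bit set while a writer is active), like
get_counters() in ip_tables. Another untested sketch, reusing xt_recseq
and the existing ebtables COUNTER_BASE/ADD_COUNTER definitions;
get_counters_cpu() is a made-up helper name:

``` c
/* Snapshot one cpu's counters locklessly, retrying while that cpu
 * has a writer active on its recursive seqcount. */
static void get_counters_cpu(const struct ebt_counter *oldcounters,
			     struct ebt_counter *counters,
			     unsigned int nentries, unsigned int cpu)
{
	const struct ebt_counter *base = COUNTER_BASE(oldcounters, nentries, cpu);
	seqcount_t *s = &per_cpu(xt_recseq, cpu);
	unsigned int i;

	for (i = 0; i < nentries; i++) {
		unsigned int start;
		u64 bcnt, pcnt;

		do {
			start = read_seqcount_begin(s);
			bcnt = base[i].bcnt;
			pcnt = base[i].pcnt;
		} while (read_seqcount_retry(s, start));

		ADD_COUNTER(counters[i], bcnt, pcnt);
	}
}
```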