[<prev] [next>] [day] [month] [year] [list]
Message-ID: <CANn89iLzm3UhzXP+R5cwVnwvSYic2EzKQui9tgJyji-SQh+MPQ@mail.gmail.com>
Date: Wed, 6 Apr 2022 20:14:19 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: lixue liang <lixue.liang5086@...il.com>,
Daniel Borkmann <daniel@...earbox.net>
Cc: David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Pablo Neira Ayuso <pablo@...filter.org>, rsanger@...d.net.nz,
Yajun Deng <yajun.deng@...ux.dev>,
jiapeng.chong@...ux.alibaba.com, netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH] af_packet: fix efficiency issues in packet_read_pending
On Wed, Apr 6, 2022 at 8:02 PM lixue liang <lixue.liang5086@...il.com> wrote:
>
> Hi, Eric Dumazet.The pending_refcnt will indeed increase and decrease on different cpus,
> but the caller of packet_read_pending in af_packet.c only needs to determine whether there is
> a pending packet, not the sum of pending packets on each cpu .
>
> In the numa of 128 cpus, if the maximum distance of node is 100 (numactl -H, the result is as shown below),
> even if the first cpu has a pending packet, the packet_read_pending of tpacket_destruct_skb still needs to
> additionally traverse the remaining 127 cpus the pending packet, which leads to additional consumption
> of more cpu time, and the return refcnt of packet_read_pending is the sum of the pending packets that the
> caller does not need.
>
> please correct me, thanks.
Please do not send HTML messages, they won't be delivered to the list.
No, the intent of the code is to sum all cpu variables.
Your change basically will break many cases, as I explained.
If for some reason CPU 0 did a single packet_inc_pending() but the
corresponding packet_dec_pending() happened on CPU 1,
we do not want packet_read_pending() to return 1, but 0.
If having per-cpu variable is too costly, we need to revisit the whole thing.
(ie possibly revert b013840810c221f2b0cf641d01531526052dc1fb packet:
use percpu mmap tx frame pending refcount)
>
>
>
> 2022年4月6日 23:20,Eric Dumazet <edumazet@...gle.com> 写道:
>
> On Wed, Apr 6, 2022 at 4:49 AM lianglixue <lixue.liang5086@...il.com> wrote:
>
>
> In packet_read_pengding, even if the pending_refcnt of the first CPU
> is not 0, the pending_refcnt of all CPUs will be traversed,
> and the long delay of cross-cpu access in NUMA significantly reduces
> the performance of packet sending; especially in tpacket_destruct_skb.
>
> When pending_refcnt is not 0, it returns without counting the number of
> all pending packets, which significantly reduces the traversal time.
>
>
> Can you describe the use case ?
>
> You are changing the slow path.
>
> Perhaps you do not use the interface in the most efficient way.
>
> Your patch is wrong, pending_refcnt increments and decrements can
> happen on different cpus.
>
>
>
> Signed-off-by: lianglixue <lianglixue@...atwall.com.cn>
> ---
> net/packet/af_packet.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index c39c09899fd0..c04f49e44a33 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -1210,17 +1210,18 @@ static void packet_dec_pending(struct packet_ring_buffer *rb)
>
> static unsigned int packet_read_pending(const struct packet_ring_buffer *rb)
> {
> - unsigned int refcnt = 0;
> int cpu;
>
> /* We don't use pending refcount in rx_ring. */
> if (rb->pending_refcnt == NULL)
> return 0;
>
> - for_each_possible_cpu(cpu)
> - refcnt += *per_cpu_ptr(rb->pending_refcnt, cpu);
> + for_each_possible_cpu(cpu) {
> + if (*per_cpu_ptr(rb->pending_refcnt, cpu) != 0)
> + return 1;
> + }
>
> - return refcnt;
> + return 0;
> }
>
> static int packet_alloc_pending(struct packet_sock *po)
> --
> 2.27.0
>
>
Powered by blists - more mailing lists