Message-ID: <CANn89iKZ7LE6y-c=E5uQRtMuf2vg2h479SoxEwN5jNFJ+FgGtA@mail.gmail.com>
Date: Fri, 30 Jun 2023 12:49:15 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Jian Wen <wenjianhn@...il.com>
Cc: davem@...emloft.net, Jian Wen <wenjian1@...omi.com>, netdev@...r.kernel.org
Subject: Re: [PATCH net-next] tcp: add a scheduling point in established_get_first()

On Fri, Jun 30, 2023 at 9:18 AM Jian Wen <wenjianhn@...il.com> wrote:
>
> Kubernetes[1] is going to stick with /proc/net/tcp for a while.
>
> This commit reduces the scheduling latency introduced by established_get_first(),
> similar to commit acffb584cda7 ("net: diag: add a scheduling point in inet_diag_dump_icsk()").
>
> In our environment, the scheduling latency affects:
> 1. the performance of latency-sensitive services like Redis
> 2. the delay of synchronize_net(), which is called with RTNL locked
>    12 times when Dockerd is deleting a container
>
> [1] https://github.com/google/cadvisor/blob/v0.47.2/container/libcontainer/handler.go#L130
>
> Signed-off-by: Jian Wen <wenjian1@...omi.com>
> ---
>  net/ipv4/tcp_ipv4.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index fd365de4d5ff..3271848e9c9a 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -57,6 +57,7 @@
>  #include <linux/init.h>
>  #include <linux/times.h>
>  #include <linux/slab.h>
> +#include <linux/sched.h>
>
>  #include <net/net_namespace.h>
>  #include <net/icmp.h>
> @@ -2456,6 +2457,7 @@ static void *established_get_first(struct seq_file *seq)
>                                 return sk;
>                 }
>                 spin_unlock_bh(lock);
> +               cond_resched();
>         }
>
>         return NULL;
> --
> 2.25.1
>
Hi Jian, thanks for your patch.

A few points:

- Note that net-next is currently closed (merge window).

- Also, the /proc interface does not hold RTNL. I am not sure why the
changelog mentions RTNL and not the other kernel mutexes that would
also be impacted by the long duration of established_get_first().

- Shouldn't the cond_resched() be done even if all buckets are empty?

- Using inet_diag, Kubernetes could list both IPv4 and IPv6 sockets in
one dump, and benefit from a more modern interface (with cond_resched()
already there).
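
To illustrate the point about empty buckets, here is a minimal userspace
sketch, not the kernel code. It assumes the bucket walk skips empty
buckets with an early "continue", so a scheduling point placed only
after the unlock is never reached on a long run of empty buckets.
struct bucket, get_first(), maybe_resched() and yield_points are
hypothetical names made up for this example, and sched_yield() is only
an analogy for cond_resched().

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for one hash bucket (not the kernel structure). */
struct bucket {
	const int *items;	/* entries in this bucket, may be NULL */
	size_t count;		/* 0 means the bucket is empty */
};

/* Counts how many times the walk offered to give up the CPU. */
static unsigned long yield_points;

/* Userspace analogy for cond_resched(); an assumption for illustration. */
static void maybe_resched(void)
{
	yield_points++;
	sched_yield();
}

/*
 * Return the first entry found at or after bucket *pos, or NULL if all
 * remaining buckets are empty. *pos is left at the bucket where the
 * walk stopped.
 */
static const int *get_first(const struct bucket *tbl, size_t nbuckets,
			    size_t *pos)
{
	assert(tbl != NULL && pos != NULL);

	for (; *pos < nbuckets; ++*pos) {
		const struct bucket *b = &tbl[*pos];

		/*
		 * Scheduling point at the top of the loop body, before
		 * the empty-bucket shortcut, so it is hit once per
		 * bucket even when every bucket is empty.
		 */
		maybe_resched();

		if (b->count == 0)
			continue;	/* empty bucket: nothing to lock or scan */

		return &b->items[0];
	}

	return NULL;
}
```

Walking a table of N empty buckets with this sketch hits the scheduling
point N times. With the call placed after the per-bucket work, as in
the patch above, the same walk would hit it zero times under the stated
assumption.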
