Message-ID: <ZlbfmAy5_mh9QxIS@TONYMAC-ALIBABA.local>
Date: Wed, 29 May 2024 15:56:08 +0800
From: Tony Lu <tonylu@...ux.alibaba.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Jason Xing <kerneljasonxing@...il.com>, Kevin Yang <yyd@...gle.com>,
Paolo Abeni <pabeni@...hat.com>, David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org
Subject: Re: [PATCH net-next 0/2] tcp: add sysctl_tcp_rto_min_us
On Wed, May 29, 2024 at 09:39:02AM +0200, Eric Dumazet wrote:
> On Wed, May 29, 2024 at 9:00 AM Jason Xing <kerneljasonxing@...il.com> wrote:
> >
> > On Wed, May 29, 2024 at 2:43 PM Jason Xing <kerneljasonxing@...il.com> wrote:
> > >
> > > Hello Kevin,
> > >
> > > On Wed, May 29, 2024 at 1:13 AM Kevin Yang <yyd@...gle.com> wrote:
> > > >
> > > > Adding a sysctl knob to allow user to specify a default
> > > > rto_min at socket init time.
> > >
> > > I wonder what the advantage of this new sysctl knob is since we have
> > > had BPF or something like that to tweak the rto min already?
> > >
> > > There are so many places/parameters of the TCP stack that can be
> > > exposed to the user side and adjusted by new sysctls...
> > >
> > > Thanks,
> > > Jason
> > >
> > > >
> > > > After this patch series, the rto_min will have multiple sources:
> > > > route option has the highest precedence, followed by the
> > > > TCP_BPF_RTO_MIN socket option, followed by this new
> > > > tcp_rto_min_us sysctl.
> > > >
> > > > Kevin Yang (2):
> > > > tcp: derive delack_max with tcp_rto_min helper
> > > > tcp: add sysctl_tcp_rto_min_us
> > > >
> > > > Documentation/networking/ip-sysctl.rst | 13 +++++++++++++
> > > > include/net/netns/ipv4.h | 1 +
> > > > net/ipv4/sysctl_net_ipv4.c | 8 ++++++++
> > > > net/ipv4/tcp.c | 3 ++-
> > > > net/ipv4/tcp_ipv4.c | 1 +
> > > > net/ipv4/tcp_output.c | 11 ++---------
> > > > 6 files changed, 27 insertions(+), 10 deletions(-)
> > > >
> > > > --
> > > > 2.45.1.288.g0e0cd299f1-goog
> > > >
> > > >
> >
> > Oh, I think you should have added Paolo as well.
> >
> > +Paolo Abeni
>
> Many cloud customers do not have any BPF expertise.
> If they use existing BPF programs (added by a product), they might not
> have the ability to change it.
+1. eBPF is still not easy to write, debug, and manage.
Sysctls are easy to use: just put them into /etc/sysctl.conf and save that
into the users' customized images or templates. AFAIK, most OS distros have
no standard toolkit for handling eBPF.
>
> We tried advising them to use route attributes, after
> commit bbf80d713fe75cfbecda26e7c03a9a8d22af2f4f ("tcp: derive
> delack_max from rto_min")
>
> Alas, dhcpd was adding its own routes, without the "rto_min 5"
> attribute, then systemd came...
> Lots of frustration, lots of wasted time, for something that has been
> used for more than a decade
> in Google DC.
>
> With a sysctl, we could have saved months of SWE, and helped our
> customers sooner.
>
> Reviewed-by: Eric Dumazet <edumazet@...gle.com>
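
For reference, the precedence described in the cover letter (route option
first, then the TCP_BPF_RTO_MIN socket option, then the new tcp_rto_min_us
sysctl) can be sketched roughly as below. This is only a simplified
illustration, not the code from the patch: the struct, its field names, and
the "0 means not set" convention are hypothetical and chosen for the example.

```c
#include <stdint.h>

/* Hypothetical, simplified model; these are NOT the kernel's structures. */
struct rto_min_sources {
	uint32_t route_us;   /* per-route rto_min, 0 if the route sets none */
	uint32_t bpf_us;     /* value set via TCP_BPF_RTO_MIN, 0 if unset */
	uint32_t sysctl_us;  /* tcp_rto_min_us default taken at socket init */
};

/* Pick the effective rto_min: route, then BPF socket option, then sysctl. */
static uint32_t effective_rto_min_us(const struct rto_min_sources *s)
{
	if (s->route_us)
		return s->route_us;
	if (s->bpf_us)
		return s->bpf_us;
	return s->sysctl_us;
}
```

With this ordering, a route that carries its own rto_min still wins, while
sockets on routes without the attribute (such as the dhcpd-added routes
mentioned above) fall back to the BPF value or, failing that, the sysctl
default.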