lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADVnQy=_x2TJjSZB_zLuvHyNHiWXM1mS_1GG8sDHUTjjh=ga9w@mail.gmail.com>
Date: Tue, 13 Aug 2024 11:23:02 -0400
From: Neal Cardwell <ncardwell@...gle.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: Subash Abhinov Kasiviswanathan <quic_subashab@...cinc.com>, edumazet@...gle.com, soheil@...gle.com, 
	yyd@...gle.com, ycheng@...gle.com, davem@...emloft.net, kuba@...nel.org, 
	netdev@...r.kernel.org, dsahern@...nel.org, 
	Sean Tranchetti <quic_stranche@...cinc.com>
Subject: Re: [PATCH net] tcp: Update window clamping condition

On Tue, Aug 13, 2024 at 5:27 AM Paolo Abeni <pabeni@...hat.com> wrote:
>
> On 8/9/24 01:06, Subash Abhinov Kasiviswanathan wrote:
> > This patch is based on the discussions between Neal Cardwell and
> > Eric Dumazet in the link
> > https://lore.kernel.org/netdev/20240726204105.1466841-1-quic_subashab@quicinc.com/
> >
> > It was correctly pointed out that tp->window_clamp would not be
> > updated in cases where net.ipv4.tcp_moderate_rcvbuf=0 or if
> > (copied <= tp->rcvq_space.space). While it is expected for most
> > setups to leave the sysctl enabled, the latter condition may
> > not end up hitting depending on the TCP receive queue size and
> > the pattern of arriving data.
> >
> > The updated check should be hit only on initial MSS update from
> > TCP_MIN_MSS to measured MSS value and subsequently if there was
> > an update to a larger value.
> >
> > Fixes: 05f76b2d634e ("tcp: Adjust clamping window for applications specifying SO_RCVBUF")
> > Signed-off-by: Sean Tranchetti <quic_stranche@...cinc.com>
> > Signed-off-by: Subash Abhinov Kasiviswanathan <quic_subashab@...cinc.com>
> > ---

Acked-by: Neal Cardwell <ncardwell@...gle.com>

> >   net/ipv4/tcp_input.c | 28 ++++++++++++----------------
> >   1 file changed, 12 insertions(+), 16 deletions(-)
> >
> > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> > index e2b9583ed96a..e37488d3453f 100644
> > --- a/net/ipv4/tcp_input.c
> > +++ b/net/ipv4/tcp_input.c
> > @@ -238,9 +238,14 @@ static void tcp_measure_rcv_mss(struct sock *sk, const struct sk_buff *skb)
> >                */
> >               if (unlikely(len != icsk->icsk_ack.rcv_mss)) {
> >                       u64 val = (u64)skb->len << TCP_RMEM_TO_WIN_SCALE;
> > +                     u8 old_ratio = tcp_sk(sk)->scaling_ratio;
> >
> >                       do_div(val, skb->truesize);
> >                       tcp_sk(sk)->scaling_ratio = val ? val : 1;
> > +
> > +                     if (old_ratio != tcp_sk(sk)->scaling_ratio)
>
> Should we do this only for sk->sk_userlocks & SOCK_RCVBUF_LOCK ?

No, I'm pretty sure all TCP sockets need to do this, regardless of
their (or sk->sk_userlocks & SOCK_RCVBUF_LOCK) status, because no
matter whether sk->sk_rcvbuf is autotuned or locked to a fixed value,
every socket ideally should have an up-to-date tp->scaling_ratio value
so that it can accurately translate the sk->sk_rcvbuf into a receive
window.

This is basically what I was arguing here:
  https://lore.kernel.org/netdev/CADVnQynKT7QEhm1WksrNQv3BbYhTd=wWaxueybPBQDPtXbJu-A@mail.gmail.com/

> I think that explicitly checking for an ratio increase would be safer:
> even if len increased I guess the ratio could decrease in some edge
> scenarios.

I agree that the ratio could decrease in some edge scenarios (e.g., if
traffic shifts to arriving via NIC with a less space-efficient buffer
allocation strategy). But, in such scenarios, wouldn't it be better to
have the window clamp to adjust to the correct value?

FWIW, this current version of the patch looks good to me. :-)

Thanks!

neal

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ