Message-ID: <CANn89iKPqdBWQMQMuYXDo=SBi7gjQgnBMFFnHw0BZK328HKFwA@mail.gmail.com>
Date: Thu, 16 May 2024 09:57:38 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: "Subash Abhinov Kasiviswanathan (KS)" <quic_subashab@...cinc.com>
Cc: soheil@...gle.com, ncardwell@...gle.com, yyd@...gle.com, ycheng@...gle.com,
quic_stranche@...cinc.com, davem@...emloft.net, kuba@...nel.org,
netdev@...r.kernel.org
Subject: Re: Potential impact of commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale")
On Thu, May 16, 2024 at 9:16 AM Subash Abhinov Kasiviswanathan (KS)
<quic_subashab@...cinc.com> wrote:
>
> On 5/15/2024 11:36 PM, Eric Dumazet wrote:
> > On Thu, May 16, 2024 at 4:32 AM Subash Abhinov Kasiviswanathan (KS)
> > <quic_subashab@...cinc.com> wrote:
> >>
> >> On 5/15/2024 1:10 AM, Eric Dumazet wrote:
> >>> On Wed, May 15, 2024 at 6:47 AM Subash Abhinov Kasiviswanathan (KS)
> >>> <quic_subashab@...cinc.com> wrote:
> >>>>
> >>>> We recently noticed that a device running a 6.6.17 kernel (A) had
> >>>> a slower single-stream download speed than a device running a
> >>>> 6.1.57 kernel (B). The test here is iperf3 over mobile radio with a
> >>>> 4M window size, downloading from a third-party server.
> >>>
> >
> > DRS is historically sensitive to initial conditions.
> >
> > tcp_rmem[1] seems too big here for DRS to kick in smoothly.
> >
> > I would use 0.5 MB perhaps; this will also use less memory for
> > local (small rtt) connections.
> I tried 0.5MB for rmem[1] and I see the same behavior: the receiver
> window does not scale beyond half of what is specified on iperf3 and
> does not match the download speed of B.
What do you mean by "specified on iperf3"?
We cannot guarantee any stable performance for applications setting SO_RCVBUF,
because the memory overhead varies from one kernel version to another.
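
Roughly speaking (this is a simplified userspace sketch, not the actual
kernel code, and it ignores details such as the kernel doubling the value
passed to SO_RCVBUF), since dfa2f0483360 the usable receive window is
derived from the buffer size scaled by the measured payload/truesize
ratio of incoming skbs:

    #include <stdio.h>

    /* Approximate the post-dfa2f0483360 behaviour: the window a receiver
     * can advertise is the socket buffer scaled by an estimate of how much
     * of each skb's truesize is actual payload.
     */
    static long win_from_space(long space, int skb_len, int skb_truesize)
    {
            /* per-socket ratio estimated from observed payload/truesize */
            long scaling_ratio = (256L * skb_len) / skb_truesize;

            return (space * scaling_ratio) >> 8;
    }

    int main(void)
    {
            /* 4 MB buffer, ~1.5K packets carried in ~2.5K-truesize skbs */
            printf("usable window: %ld bytes\n",
                   win_from_space(4L << 20, 1500, 2560));
            return 0;
    }

With a ~1.7x truesize overhead the usable window tops out around 60% of
the configured buffer, which would be consistent with the window not
growing much past half of what iperf3 asked for.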
>
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c?h=v6.6.17#n385
> >
> > Hmm... rmnet_map_deaggregate() looks very strange.
> >
> > I also do not understand why this NIC driver uses gro_cells, which was
> > designed for virtual drivers like tunnels.
> >
> > The ca32fb034c19e00c changelog is sparse; it does not explain why
> > standard GRO could not be used directly.
> >
> rmnet doesn't directly interface with HW. It is a virtual driver which
> attaches over real hardware drivers like MHI (PCIe), QMI_WWAN (USB) and
> IPA to expose networking across different mobile APNs.
>
> As rmnet didn't have its own NAPI struct, I added support for GRO using
> gro_cells. I tried disabling GRO and I don't see a difference in
> download speeds or the receiver window either.
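
For reference, the usual gro_cells pattern in a virtual driver looks
roughly like the sketch below; the demo_* names and the private struct
are illustrative, only the gro_cells API itself is real.

    #include <linux/netdevice.h>
    #include <net/gro_cells.h>

    /* illustrative private data; rmnet's real layout differs */
    struct demo_priv {
            struct gro_cells gro_cells;
    };

    static int demo_init(struct net_device *dev)
    {
            struct demo_priv *p = netdev_priv(dev);

            /* allocates one NAPI context per CPU for this device */
            return gro_cells_init(&p->gro_cells, dev);
    }

    static void demo_rx(struct net_device *dev, struct sk_buff *skb)
    {
            struct demo_priv *p = netdev_priv(dev);

            /* queues the skb to the per-CPU cell, whose NAPI poll feeds
             * it into the normal GRO path
             */
            gro_cells_receive(&p->gro_cells, skb);
    }

    static void demo_uninit(struct net_device *dev)
    {
            struct demo_priv *p = netdev_priv(dev);

            gro_cells_destroy(&p->gro_cells);
    }

Since gro_cells_init() gives the device its own per-CPU NAPI contexts,
packets handed to gro_cells_receive() still go through GRO even though
the driver has no NAPI instance of its own.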
>
> >>
> >> From netif_receive_skb_entry tracing, I see that the truesize is around
> >> ~2.5K for ~1.5K packets.
> >
> > This is a bit strange; this does not match:
> >
> >> ESTAB 4324072 0 192.0.0.2:42278 223.62.236.10:5215
> >> ino:129232 sk:3218 fwmark:0xc0078 <->
> >> skmem:(r4511016,
> >
> > -> 4324072 bytes of payload, using 4511016 bytes of memory
> I probably need to dig into this further. If the memory usage here were
> in line with the actual size-to-truesize ratio, would it cause the
> receiver window to grow?
>
> Only explicitly increasing the window size to 16M in iperf3 matches the
> download speed of B, which suggests that the sender is unable to scale
> throughput in the 4M case due to the limited receiver window advertised
> by A for the RTT in this specific configuration.
What happens if you leave autotuning enabled?
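
For what it's worth, the two overhead figures quoted above do not tell
the same story; a quick back-of-the-envelope check (plain arithmetic,
approximate values taken from the tracing and the ss output above):

    #include <stdio.h>

    int main(void)
    {
            /* from the ss output above */
            double payload = 4324072.0;   /* bytes of TCP payload queued   */
            double skmem_r = 4511016.0;   /* truesize charged to the socket */

            /* approximate values from the netif_receive_skb tracing above */
            double skb_len      = 1500.0;
            double skb_truesize = 2560.0;

            printf("socket-level overhead: %.2f\n", skmem_r / payload);      /* ~1.04 */
            printf("per-skb overhead:      %.2f\n", skb_truesize / skb_len); /* ~1.71 */
            return 0;
    }

A ~1.04 socket-level ratio versus a ~1.7 per-skb ratio is presumably the
mismatch pointed out above; it would be worth checking which of the two
the socket actually sees in steady state.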